Stochastic Function Oracles and Their Use in Convergent Optimization Algorithms

Abstract

(Approved for Public Release)Stochastic optimization is the driving force behind most of the modern foundations of AI,such as machine learning and reinforcement learning. Standard approach to optimizationin these domains is the now ubiquitous stochastic gradient method, however, this methodrequires hand-tuning and is generally not robust. There is a significant ongoing effort indeveloping robust and self-tuning algorithms for stochastic optimization. While the usualassumption is that an unbiased stochastic estimate of thegradient is available, in order todevelop more practical and widely applicable algorithms, a broader collection of settingsshould be considered. There is a plethora of practical applications where simply relying onunbiased gradient estimates is inapplicable or insufficient. For example, in expected riskminimization domain or in reinforcement learning the objective function is defined over adistribution over data or over simulation roll-outs, yet individual samples from these distributionsmay be corrupted either by malicious attacks or hardware/software malfunctions. Inthese cases, function or gradient samples can no longer be treated as unbiased estimates anddifferent (robust) estimation procedures need to be employed and appropriately analyzed. In addition, a wide range of applications fall in the domain of derivative free optimizationwhere no direct gradient can be extracted from the function. These applications includereinforcement learning, hyperparameter tuning in deep neural networks, and expected risk(as opposed to loss) minimization. Stochastic gradient approximations can still be computedbased on function value samples, however, such computations are often costly and alwayslead to biased gradient estimates, thus care has to be taken in their development and use. Whether the estimates arebiased or unbiased speaks to the accuracy of these estimates,which in turn drives the accuracy with which an algorithm can approximate the optimalsolution. On the other hand, what drives the efficiency of the algorithm is the reliabilityof these estimates, which means how likely these estimates are to be close to the true gradient.The relationship between estimate reliability and algorithmic efficiency is complexand algorithm dependent, but there are known results from prior work that establish thisrelationship in severalcases. Finally, the cost of stochastic estimates is an integral part ofthe algorithmic efficiency that depends on how estimates areconstructed and thus needs tobe carefully analyzed. In summary to use stochastic estimates in an optimization algorithm one needs to studytheir theoretical and practical properties, such as cost, accuracy and reliability. Thus thegoal of this project is to establish a general framework for analyzing and comparing stochasticfunction information, specifically, zeroth, first and second order oracles that are implementablein a particular setting. The settings we consider will include adversarially corrupteddata in machine learning, computational failures in distributed environments, simulationbased optimization and reinforcement learning and optimizationof nonstandard losses inmachine learning. We will analyze the oracles, and the trade-offs of their cost, reliability andaccuracy for different applications listed above. We will then produce a clear comparison ofadvantages and disadvantages of different methods that deliver the oracles. PI Scheinberg has extensive experience in establishing theoretical framework for inexactfunction oracles and for stochastic optimization. Her work on derivative free optimizationhas been awarded the Lagrange Prize and her contributions tooptimization for machinelearning and stochastic optimization has been awarded the Farkas Prize. Approved for Public Release.

Document Details

Document Type: DoD Grant Award
Publication Date: Jan 13, 2025
Source ID: N000142512088

Entities

People

Katya Scheinberg

Organizations

Georgia Tech Research Corporation
Office of Naval Research
United States Navy

Stochastic Function Oracles and Their Use in Convergent Optimization Algorithms

Abstract

Document Details

Entities

People

Organizations

Tags

Fields of Study

Readers

Technology Areas