Stochastic Function Oracles and Their Use in Convergent Optimization Algorithms
Abstract
Project Summary/Abstract (Approved for Public Release)Stochastic optimization is the driving force behind most of the modern foundat,ions of AI, such as machine learning and reinforcement learning. Standard approach to optimization in these domains is the now ubiqu,itous stochastic gradient method, however, this methodrequires hand-tuning and is generally not robust. There is a significant ongoi,ng effort in developing robust and self-tuning algorithms for stochastic optimization. While the usual assumption is that an unbiase,d stochastic estimate of the gradient is available, in order todevelop more practical and widely applicable algorithms, a broader co,llection of settings should be considered. There is a plethora of practical applications where simply relying on unbiased gradient e,stimates is inapplicable or insufficient. For example, in expected riskminimization domain or in reinforcement learning the objectiv,e function is defined over a distribution over data or over simulation roll-outs, yet individual samples from these distributions ma,y be corrupted either by malicious attacks or hardware/software malfunctions. In these cases, function or gradient samples can no lo,nger be treated as unbiased estimates and different (robust) estimation procedures need to be employed and appropriately analyzed.In, addition a wide range of applications fall in the domain of derivative free optimization where no direct gradient can be extracted,from the function. These applications include reinforcement learning, hyperparameter tuning in deep neural networks, and expected ri,sk(as opposed to loss) minimization. Stochastic gradient approximations can still be computed based on function value samples, howev,er, such computations are o,and use.Whether the estimates are biased or unbiased speaks to the accuracy of these estimates, which in turn drives the accuracy wi,th which an algorithm can approximate the optimal solution. On the other hand, what drives the efficiency of the algorithm is the re,stimate reliability and algorithmic efficiency is complex and algorithm dependent, but there are known results from prior work that,establish thisrelationship in several cases. Finally, the cost of stochastic estimates is an integral part of the algorithmic effici,ency that depends on how estimates are constructed and thus needs to be carefully analyzed.In summary to use stochastic estimates in, an optimization algorithm one needs to study their theoretical and practical properties, such as cost, accuracy and reliability. Th,us the goal of this project is to establish a general framework for analyzing and comparing stochasticfunction information, specific,ally, zeroth, first and second order oracles that are implementable in a particular setting. The settings we consider will include a,dversarially corrupted data in machine learning, computational failures in distributed environments, simulationbased optimization an,d reinforcement learning and optimization of nonstandard losses in machine learning. We will analyze the oracles, and the trade-offs, of their cost, reliability and accuracy for different applications listed above. We will then produce a clear comparison ofadvantag,es and disadvantages of different methods that deliver the oracles.PI Scheinberg has extensive experience in establishing theoretica,l framework for inexact function oracles and for stochastic optimization. Her work on derivative free optimization has been awarded,the Lagrange Prize and her contributions to optimization for machinelearning and stochastic optimization has been awarded the Farkas, Prize.
Document Details
- Document Type
- DoD Grant Award
- Publication Date
- Feb 08, 2022
- Source ID
- N000142212154
Entities
People
- Katya Scheinberg
Organizations
- Cornell University
- Office of Naval Research
- United States Navy