Collaborative Proposal: Primal-Dual Algorithms for Minimax Problems with Applications to Distributionally Robust Learning (White paper tracking number: 23-000005119)

Abstract

Minimax problems arise frequently in many key settings in today#s big data world. Examples include unsupervised and supervised learning problems where the aim is to learn a robust predictive model from data. Other applications in minimax setting include resource allocation problems and distributionally robust optimization (DRO). First-order primal-dual (FODP) methods can efficiently tackle with these large-scale minimax problems; indeed, FOPD methods, their incremental and stochastic variants that rely on inexact gradientinformation have been very popular in practice due to their favorable scalability properties to large datasets and model dimensions. That said, these effective algorithms come with several challenges. First, most of the existing first-order (FO) methods for solving non-convex minimax problems require the knowledge of global Lipschitz constants; however, in many practical settings these constants may not be readily available or using the global constants may lead to conservative step sizes, causing slow convergence. For non-convex minimax problems, efficient FOPD methods with backtracking step-size search techniques that can exploit the local Lipschitzconstants and/or the block-coordinate structure of the problem are missing from the literature. Second, for non-convex minimax problems, existing stochastic FOPD algorithms output a random solution with performance guarantees that only hold in expectation; a drawback is that the iterates can be arbitrarily far away from the desired solution with a positive probability. For any given p in (0, 1), guarantees on the iterates being close to the solution with probability p, and risk-averse guarantees that can quantify the deviations from the average performance are both missing. Third, existing stochastic FOPD algorithms can only deal with a very limited class of non-convex and non-smooth losses. This is a significant disadvantage because more general non-smooth, non-convex losses arise frequently in various applications due to statistical reasons or when non-smooth activation functions, such as ReLU, are used in deep learning. Fourth, for minimax problems in finite-sum form, existing incremental FOPD methods are not very efficient; indeed, there is room for further improvement in terms of the worst-case complexity of existing methods. To the best of our knowledge, no cyclic incremental method that can be faster than the simple gradient descent ascent (GDA) method is known in the related literature. Forboth deterministic and stochastic minimax problems with some structured non-convexity, we propose to develop a backtracking step-size search procedure that can support randomized block coordinate (RBC) updates. If successful, the project will develop the first backtracking step-size search technique for nonconvex mini-max problems that can also efficiently deal with problems having block-coordinate structure through employing RBC updates. We also propose a research agenda that, if successful, will develop the first high-probability and risk-averse guarantees in the context of stochastic FOPD methods for non-convex minimax problems, which can also dealwith time-varying/online problems. Moreover, we propose to study distributional tail properties of the FOPD iterates which would provide us with some insight on the generalization bounds for our algorithms, i.e., bounds on how well the output would be able to perform on the yet unseen data. For DRO problems, we propose novel FO algorithms for handling a general family of non-smooth and non-convex losses (which includes non-smooth weakly convex loss functions as a subclass) and provide new analysis techniques to study their convergence properties. Finally, for minimax problems in finite-sum form, we propose the double incremental aggregated gradient descent ascent method which can potentially be faster than GDA-type methods in both convex and non-convex settings. This abstract is Approved for Public Release.

Document Details

Document Type
DoD Grant Award
Publication Date
Nov 09, 2024
Source ID
N000142412666

Entities

People

  • Necdet Aybat

Organizations

  • Office of Naval Research
  • Pennsylvania State University
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning.
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms