Going beyond a black-box optimization view of deep learning

Abstract

The project proposes various directions to better understand deep learning, and presents concrete settings whose study involves opening the black box assumed in the usual optimization view. To retain mathematical tractability, these settings are subcases of deep learning that still contain interesting and nontrivial learning problems, yet allow mathematical analysis. All settings involve analysis of the trajectory followed in the loss landscape. The first setting is linear nets (i.e., nets that lack any nonlinearities). These appear trivial at first sight, but multilayer versions present a challenge to the analysis of optimization, since the loss function is still quite nonconvex. Here the project analyses how gradient descent can solve classic matrix completion better (i.e., with fewer revealed entries) than the classic convex programming method. The second setting is infinitely wide nets, which end up corresponding to regression with respect to a fixed kernel (the Neural Tangent Kernel). Again, one might imagine that this model must be trivial, but it is not, and these kernels do very well on small-data tasks. The project seeks to understand how this happens. Finally, the analysis of the trajectory yields surprising new insights into current deep architectures, specifically the possibility of attaining state-of-the-art performance using step sizes that increase exponentially, again something never seen in a black-box view of optimization. The goal here is to design new efficient training procedures from such insights.
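As an illustration of the first setting only, the following is a minimal sketch of matrix completion by plain gradient descent on a multilayer linear net, i.e., a product of square matrices fitted to the revealed entries. It is not the project's own code or analysis: the function names (`make_low_rank_problem`, `deep_linear_completion`), the problem size, depth, step size, and initialization scale are all assumptions chosen for readability.

```python
import numpy as np


def make_low_rank_problem(n=10, rank=1, observed_fraction=0.5, seed=0):
    """Build a random low-rank target and a 0/1 mask of revealed entries (toy data)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n, rank))
    v = rng.standard_normal((n, rank))
    target = u @ v.T
    target /= np.linalg.norm(target, 2)  # scale to spectral norm 1
    mask = (rng.random((n, n)) < observed_fraction).astype(float)
    return target, mask


def _product(mats, n):
    """Multiply a list of n-by-n matrices left to right (identity if empty)."""
    out = np.eye(n)
    for m in mats:
        out = out @ m
    return out


def deep_linear_completion(observed, mask, depth=3, lr=0.1, steps=2000,
                           init_scale=0.05, seed=0):
    """Fit W = F[0] @ ... @ F[depth-1] to the revealed entries by gradient descent.

    Loss: 0.5 * || mask * (W - observed) ||_F^2. All hyperparameters are
    illustrative assumptions, not values taken from the project.
    """
    rng = np.random.default_rng(seed)
    n = observed.shape[0]
    # Small random initialization of every layer.
    factors = [init_scale * rng.standard_normal((n, n)) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        w = _product(factors, n)
        residual = mask * (w - observed)  # error on revealed entries only
        losses.append(0.5 * np.sum(residual ** 2))
        # Gradient w.r.t. layer i: (layers before i)^T @ residual @ (layers after i)^T
        grads = []
        for i in range(depth):
            left = _product(factors[:i], n)
            right = _product(factors[i + 1:], n)
            grads.append(left.T @ residual @ right.T)
        for f, g in zip(factors, grads):
            f -= lr * g
    return _product(factors, n), losses


if __name__ == "__main__":
    target, mask = make_low_rank_problem()
    estimate, losses = deep_linear_completion(target * mask, mask)
    hidden = 1.0 - mask
    print("loss on revealed entries:", losses[0], "->", losses[-1])
    print("error on hidden entries:", np.linalg.norm(hidden * (estimate - target)))
```

The sketch only shows the mechanics of training a linear net on revealed entries; whether and when this recovers the hidden entries with fewer observations than convex programming is the question the project studies, and nothing here should be read as a result.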

Document Details

Document Type
DoD Grant Award
Publication Date
May 08, 2020
Source ID
N000142012338

Entities

People

  • Sanjeev Arora

Organizations

  • Office of Naval Research
  • Trustees of Princeton University
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Operations Research
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks