Going beyond a black-box optimization view of deep learning

Abstract

The project proposes several directions for better understanding deep learning and presents concrete settings whose study requires opening the black box assumed in the usual optimization view. To retain mathematical tractability, these settings are subcases of deep learning that still contain interesting, nontrivial learning problems yet allow mathematical analysis. All settings involve analysis of the trajectory followed in the loss landscape. The first is linear nets (i.e., nets that lack any nonlinearities). These appear trivial at first sight, but multilayer versions present a challenge to the analysis of optimization, since the loss function is still quite nonconvex. Here the project analyses how gradient descent can solve classic matrix completion better (i.e., with fewer revealed entries) than the classic convex programming method. The second setting is infinitely wide nets, which end up corresponding to regression with respect to a fixed kernel (the Neural Tangent Kernel). Again, one might imagine that this model must be trivial, but it is not: these kernels do very well on small-data tasks. The project seeks to understand how this happens. Finally, the analysis of the trajectory yields surprising new insights into current deep architectures, specifically the possibility of attaining state-of-the-art performance using step sizes that increase exponentially, something not previously seen in a black-box view of optimization. The goal here is to design new, efficient training procedures from such insights.
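To make the last idea concrete, the following is a minimal sketch of what a step-size schedule that increases exponentially could look like. The abstract does not specify the schedule's form or constants, so the function name `exponential_step_sizes`, the parameters `base_lr` and `growth_rate`, and the values used are all hypothetical choices for illustration, not the project's method.

```python
def exponential_step_sizes(base_lr, growth_rate, num_steps):
    """Return step sizes eta_t = base_lr * (1 + growth_rate) ** t.

    All names and the geometric form are illustrative assumptions;
    the source only states that step sizes "increase exponentially".
    """
    if base_lr <= 0 or growth_rate <= 0:
        raise ValueError("base_lr and growth_rate must be positive")
    # Each step size is the previous one multiplied by (1 + growth_rate).
    return [base_lr * (1.0 + growth_rate) ** t for t in range(num_steps)]


# Hypothetical values chosen only to show the shape of the schedule.
schedule = exponential_step_sizes(base_lr=0.1, growth_rate=0.05, num_steps=5)
for step, eta in enumerate(schedule):
    print(f"step {step}: step size {eta:.6f}")
```

In an actual training loop, a schedule like this would supply the step size for each gradient update; whether training remains stable under such growth depends on the architecture, which is the question the project studies.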

Document Details

Document Type: DoD Grant Award
Publication Date: May 08, 2020
Source ID: N000142012338

Entities

People

Sanjeev Arora

Organizations

Office of Naval Research
Trustees of Princeton University
United States Navy
