Going beyond a black-box optimization view of deep learning
Abstract
The project proposes several directions for better understanding deep learning, and presents concrete settings whose study requires opening the black box assumed in the usual optimization view. To retain mathematical tractability, these settings are subcases of deep learning that still contain interesting, nontrivial learning problems. All of them involve analyzing the trajectory followed in the loss landscape. The first setting is linear nets (i.e., nets that lack any nonlinearities). These appear trivial at first sight, but multilayer versions are a challenge for the analysis of optimization, since the loss function is still highly nonconvex. Here the project analyzes how gradient descent can solve classic matrix completion better (i.e., with fewer revealed entries) than the classic convex programming method. The second setting is infinitely wide nets, which end up corresponding to regression with respect to a fixed kernel (the Neural Tangent Kernel). Again, one might expect this model to be trivial, but it is not: these kernels do very well on small-data tasks, and the project seeks to understand how this happens. Finally, the analysis of the trajectory yields surprising new insights into current deep architectures, specifically the possibility of attaining state-of-the-art performance using step sizes that increase exponentially, something never seen before in a black-box view of optimization. The goal here is to design new, efficient training procedures from such insights.
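As a rough illustration of the linear-net setting described in the abstract, the sketch below runs plain gradient descent on a product of square matrices, fitting only the revealed entries of a low-rank target. It is not the project's method or code: the function names (`chain_product`, `deep_linear_completion`), the depth, the step size, the initialization scale, and the synthetic rank-1 target are all assumptions chosen for a small self-contained demo.

```python
import numpy as np


def chain_product(factors, n):
    """Return factors[-1] @ ... @ factors[0] (the identity if the list is empty)."""
    out = np.eye(n)
    for W in factors:
        out = W @ out
    return out


def deep_linear_completion(observed, mask, depth=3, lr=0.05, steps=3000,
                           init_scale=0.1, seed=0):
    """Fit a product of `depth` square matrices to the revealed entries.

    Hypothetical helper for illustration only. Returns the completed matrix
    and the loss on revealed entries recorded at the start of each step.
    """
    n = observed.shape[0]
    rng = np.random.default_rng(seed)
    # Small random initialization (an assumption of this sketch).
    factors = [init_scale * rng.standard_normal((n, n)) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        product = chain_product(factors, n)
        residual = mask * (product - observed)  # error on revealed entries only
        losses.append(0.5 * np.sum(residual ** 2))
        grads = []
        for i in range(depth):
            left = chain_product(factors[i + 1:], n)   # layers applied after layer i
            right = chain_product(factors[:i], n)      # layers applied before layer i
            grads.append(left.T @ residual @ right.T)  # gradient w.r.t. factors[i]
        factors = [W - lr * g for W, g in zip(factors, grads)]
    return chain_product(factors, n), losses


# Synthetic demo: a rank-1 target with spectral norm 1, about 60% of entries revealed.
rng = np.random.default_rng(1)
n = 8
u, v = rng.standard_normal(n), rng.standard_normal(n)
target = np.outer(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
mask = (rng.random((n, n)) < 0.6).astype(float)
completed, losses = deep_linear_completion(target * mask, mask)
print("loss on revealed entries: first step", losses[0], "last step", losses[-1])
```

The loss is nonconvex in the factors even though the network has no nonlinearities, which is the point the abstract makes about multilayer linear nets. How well the unrevealed entries are recovered depends on depth, initialization, and the number of revealed entries, and this sketch makes no claim about that.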
Document Details
- Document Type: DoD Grant Award
- Publication Date: May 08, 2020
- Source ID: N000142012338
Entities
People
- Sanjeev Arora
Organizations
- Office of Naval Research
- Trustees of Princeton University
- United States Navy