Mixed-Precision Algorithms for Training Deep Neural Networks 23-000005145
Abstract
Deep neural networks (DNNs) are increasingly used in various science and engineering applications, such as surrogate modeling, numerical solution of high-dimensional nonlinear PDEs, data-driven modeling, and generative modeling in Bayesian inference. DNNs have universal approximation properties and often outperform other function approximators in high dimensions. However, their full potential can only be realized when a suitable architecture is chosen and their weights are trained well through optimization, which is challenging and computationally costly. This project develops mixed-precision approaches for designing and training DNNs, seeking to seize the opportunities created by recent advances in mixed-precision (MP) hardware. The project's three main thrusts will, respectively, develop training algorithms for continuous-time deep learning models, develop separable least-squares solvers for training low-precision DNN feature extractors, and develop derivative-free optimization methods that are highly parallel and robust. The project will validate and compare the algorithms on a variety of test bed problems. A cross-cut effort provides efficient open-source implementations to ensure the reproducibility of our findings and enable future research. This summary is approved for public release.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Mar 15, 2024
- Source ID: N000142412221
Entities
People
- Lars Ruthotto
Organizations
- Emory University
- Office of Naval Research
- United States Navy