Mixed-Precision Algorithms for Training Deep Neural Networks 23-000005145

Abstract

Deep neural networks (DNNs) are increasingly used in science and engineering applications such as surrogate modeling, numerical solution of high-dimensional nonlinear PDEs, data-driven modeling, and generative modeling in Bayesian inference. DNNs have universal approximation properties and often outperform other function approximators in high dimensions. However, their full potential can only be realized when a suitable architecture is chosen and their weights are trained well through optimization, which is challenging and computationally costly. This project develops mixed-precision approaches for designing and training DNNs. In doing so, we seek to seize the opportunities created by recent advances in mixed-precision (MP) hardware. The project's three main thrusts will, respectively, develop training algorithms for continuous-time deep learning models, separable least-squares solvers for training low-precision DNN feature extractors, and derivative-free optimization methods that are highly parallel and robust. The project will validate and compare the algorithms on a variety of test bed problems. A cross-cutting effort provides efficient open-source implementations to ensure the reproducibility of our findings and enable future research. This summary is approved for public release.

Document Details

Document Type
DoD Grant Award
Publication Date
Mar 15, 2024
Source ID
N000142412221

Entities

People

  • Lars Ruthotto

Organizations

  • Emory University
  • Office of Naval Research
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Defense Technology Research and Development
  • Neural Network Machine Learning
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks