Characterizing Convolutional Neural Network Early Learning and Accelerating Non-Adaptive, First-Order Methods with Localized Lagrangian Restricted Memory Level Bundling

Abstract

This dissertation studies the underlying optimization problem encountered during the early-learning stages of convolutional neural networks and introduces a training algorithm competitive with existing state-of-the-art methods. A Design of Experiments method is introduced to systematically measure empirical second-order Lipschitz upper bound and region size estimates for local regions of the convolutional neural network loss surfaces encountered during early learning. This method demonstrates that architecture choices can significantly affect the local loss surfaces traversed during training. A Design of Experiments method is also used to study how convolutional neural network architecture hyperparameters affect the ability of different optimization routines to train effectively and find solutions that generalize well during early learning, demonstrating a relationship between routine selection and network architecture. Finally, a method to accelerate the early learning of non-adaptive, first-order optimization routines is developed. The method decomposes the neural network training problem into a series of unconstrained optimization problems within localized trailing Euclidean trust regions and allows non-adaptive methods to achieve training results that are competitive with adaptive methods.
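
As a rough illustration of what an empirical Lipschitz estimate over a local Euclidean region can look like, the sketch below samples pairs of points in a ball and records the largest observed ratio of gradient change to distance. It is not the dissertation's Design of Experiments procedure: the function names, the pairwise-sampling scheme, and the quadratic test problem are all assumptions made for illustration, and the quantity estimated here is the Lipschitz constant of the gradient, which may differ from the bound the dissertation measures.

```python
import numpy as np


def sample_in_ball(center, radius, rng):
    """Draw one point uniformly from the Euclidean ball around `center`."""
    direction = rng.standard_normal(center.shape)
    direction /= np.linalg.norm(direction)
    # Scaling by u**(1/d) makes the sample uniform over the ball's volume.
    scale = radius * rng.uniform() ** (1.0 / center.size)
    return center + scale * direction


def estimate_gradient_lipschitz(grad_fn, center, radius, n_pairs=200, seed=0):
    """Largest observed ||g(x) - g(y)|| / ||x - y|| over sampled pairs.

    Hypothetical helper: an empirical estimate that can only
    underestimate the true local Lipschitz constant of the gradient.
    """
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_pairs):
        x = sample_in_ball(center, radius, rng)
        y = sample_in_ball(center, radius, rng)
        gap = np.linalg.norm(x - y)
        if gap == 0.0:
            continue  # skip degenerate pairs
        best = max(best, np.linalg.norm(grad_fn(x) - grad_fn(y)) / gap)
    return best


# Assumed toy loss: f(w) = 0.5 * w^T A w, whose gradient A w has
# Lipschitz constant equal to the largest eigenvalue of A (here 9).
A = np.diag([1.0, 4.0, 9.0])
estimate = estimate_gradient_lipschitz(lambda w: A @ w, np.zeros(3), radius=0.5)
print(f"empirical estimate: {estimate:.3f} (true constant: 9.0)")
```

For a neural network, `grad_fn` would be replaced by a mini-batch gradient evaluated at perturbed weights, and the ball would be centered on the current iterate; both substitutions are assumptions of this sketch rather than details taken from the abstract.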

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2021
Accession Number
AD1151637

Entities

People

  • Benjamin O. Morris

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computational Science
  • Computer Vision
  • Convolutional Neural Networks
  • Engineering
  • Experimental Design
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Network Architecture
  • Neural Networks
  • Probabilistic Models
  • Statistical Analysis

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks