Fast, Distributed Algorithms in Deep Networks

Abstract

In this project we demonstrate two different approaches to speeding up the training of neural nets. First, even before training, we demonstrate an informed way of initializing parameters closer to their final, trained values. Second, we introduce a new training algorithm that scales linearly when parallelized, allowing for substantially decreased training times on large datasets. Neural nets are famously unintuitive, and as such, parameters are typically assigned randomly and then adjusted during training. However, by using a cosine activation function, a layer of neurons can be made to approximate the implicit feature space of a kernel. Intuition about kernel selection can therefore guide initial parameter assignments even before any data are observed. We implement this approach and show that it can greatly speed up training, often approaching the final accuracy after only one training iteration. Our second contribution is the application of the alternating direction method of multipliers (ADMM) to neural nets. Conventional gradient-based optimization methods for neural nets scale poorly, a problem that is hard to avoid with extremely large datasets. The proposed method avoids many of the conditions that typically make gradient-based methods slow, allowing for efficient computation without specialized hardware. Our implementation demonstrates strong scalability, with linear speedups even up to thousands of cores. We show that for large problems, our approach can converge faster than GPU-based implementations of standard algorithms.
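
The abstract does not spell out how a cosine layer is tied to a kernel, so the following is only a minimal sketch of one standard construction that fits the description: random Fourier features for a Gaussian (RBF) kernel. The choice of kernel, the parameter gamma, and the names make_cosine_layer, cosine_layer, and rbf_kernel are assumptions made for illustration and are not taken from the report. The idea is that weights drawn from the kernel's spectral density, with uniformly random phase biases, give a layer whose outputs have inner products close to the kernel value.

```python
import math
import random


def make_cosine_layer(input_dim, num_units, gamma, seed=0):
    """Sample weights and biases for a cosine layer (hypothetical helper).

    Assumes the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2), whose
    spectral density is Gaussian with variance 2 * gamma per dimension.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
               for _ in range(num_units)]
    biases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_units)]
    return weights, biases


def cosine_layer(x, weights, biases):
    """Apply the layer: z_j = sqrt(2 / D) * cos(w_j . x + b_j)."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, biases)]


def rbf_kernel(x, y, gamma):
    """Exact kernel value, used only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


# Compare the layer's inner product with the exact kernel on two points.
weights, biases = make_cosine_layer(input_dim=3, num_units=5000, gamma=0.5)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
zx = cosine_layer(x, weights, biases)
zy = cosine_layer(y, weights, biases)
approx = sum(a * b for a, b in zip(zx, zy))
exact = rbf_kernel(x, y, gamma=0.5)
print(f"approx={approx:.3f} exact={exact:.3f}")
```

Under these assumptions, picking gamma plays the role of kernel selection: it fixes the distribution of the initial weights before any training data are seen, and the approximation error shrinks roughly as one over the square root of the number of units.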

Document Details

Document Type
Technical Report
Publication Date
May 11, 2016
Accession Number
AD1013468

Entities

People

  • Ryan J. Burmeister

Organizations

  • United States Naval Academy

Tags

Communities of Interest

  • Autonomy
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Computations
  • Computer Science
  • Computers
  • Data Sets
  • Information Science
  • Iterations
  • Kernel Functions
  • Machine Learning
  • Neural Networks
  • Optimization
  • Supervised Machine Learning
  • Three Dimensional
  • Training
  • Two Dimensional
  • United States Naval Academy

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Distributed Systems and Data Platform Development
  • Operations Research

Technology Areas

  • Space