Distributed, Efficient Algorithms for Deep Network Training without Pretraining

Abstract

With the emergence of large datasets from scientific, web-based (Twitter, social networks), and business applications, it has become commonplace to solve model fitting and deep learning problems with millions or even billions of unknowns. However, the traditional backpropagation algorithm for training neural nets has not scaled well to very large networks, requiring the use of human-designed autoencoders or other forms of pretraining to draw useful contributions from the first several layers. In addition, backpropagation can be slow to converge on large networks due to saturation effects and poor conditioning. Therefore, new algorithms for training deep networks that require neither the supervision of pretraining nor the computational burden of backpropagation would greatly enhance the abilities of an already-successful approach.

This proposal outlines a challenging algorithm development plan that revises model fitting and the deep learning pipeline from algorithms to applications. Rather than attacking deep learning immediately, we begin by studying a broad class of model fitting problems for machine learning and distributed computing. Our research focuses on self-adaptive algorithms: methods that tune their own parameters at runtime to achieve optimal performance with no user oversight. These methods will be immediately applicable to a wide range of model fitting problems in machine learning, including distributed support vector machines, logistic regression, decision trees, and more.

Once we have developed self-adaptive methods for general model fitting, we will specialize these methods to problems in deep learning. Most methods in deep learning require “pretraining” strategies: greedy heuristic methods that define the network behavior (by setting weights) before global model fitting happens. The splitting methods we propose for deep learning have the unique property that they generate features without pretraining or other forms of user supervision. Our goal is to use this powerful framework to generate deep features for use in other areas of machine learning, including reinforcement learning. Because our methods do not require customized pretraining strategies, we can apply them to a wide range of difficult problems beyond the scope of simple neural networks. In particular, we are interested in leveraging deep learning for value function approximation in reinforcement learning.

By automating the machine learning pipeline and generating features without user supervision, our work will allow scientists and military personnel to leverage deep learning advances without the huge computing resources or deep learning expertise required for traditional methods: training deep networks in less time, with fewer nodes, and without expertise in model fitting.

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 12, 2016
Source ID
N000141512676

Entities

People

  • Thomas Goldstein

Organizations

  • Office of Naval Research
  • United States Navy
  • University of Maryland

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks