Linear Layers and Partial Weight Reinitialization for Accelerating Neural Network Convergence
Abstract
We present two new approaches for accelerating the training of a neural network: 1) self-pruning using collapsible linear layers, and 2) mid-training weight reinitialization. By following each nonlinear layer with linear layers, then folding these linear layers into the subsequent nonlinear layers after training, we reproduce the benefits of overparameterizing the network and then pruning individual elements after training. We also periodically reinitialize the weights of nonlinear elements that do not improve the network's performance during training, freezing the retained weights for several epochs to force the reinitialized weights to accommodate information already learned. Both methods demonstrate substantial gains: the resulting models are simpler than those attained by standard pruning and initialization methods, require fewer computations to train, and are more accurate than networks trained with those methods.
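The folding step rests on a standard identity: two stacked affine maps with no nonlinearity between them compose into a single affine map, W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2). The sketch below illustrates only that identity in plain Python. The helper names (`matmul`, `matvec`, `linear`, `fold_linear`) and the small integer weights are hypothetical and are not taken from the report, which may fold layers differently in detail.

```python
# Minimal sketch: collapsing two stacked linear (affine) layers into one.
# All names and values are illustrative assumptions, not the report's code.

def matmul(a, b):
    """Matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(w, x):
    """Matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def add(u, v):
    """Element-wise vector sum."""
    return [a + b for a, b in zip(u, v)]

def linear(w, b, x):
    """Affine layer y = W x + b with no activation."""
    return add(matvec(w, x), b)

def fold_linear(w1, b1, w2, b2):
    """Collapse y = W2 (W1 x + b1) + b2 into a single layer y = W x + b."""
    return matmul(w2, w1), add(matvec(w2, b1), b2)

# Hypothetical 2 -> 3 -> 2 stack: the wide middle layer exists only in training.
w1 = [[1, 2], [0, 1], [3, -1]]   # 3x2
b1 = [1, 0, -2]
w2 = [[2, 0, 1], [-1, 1, 0]]     # 2x3
b2 = [0, 5]
x = [4, -3]

stacked = linear(w2, b2, linear(w1, b1, x))   # two layers applied in sequence
w, b = fold_linear(w1, b1, w2, b2)            # one 2x2 layer after folding
folded = linear(w, b, x)

print(stacked, folded)  # both print [11, 3]
```

The folded layer has 2x2 weights instead of 3x2 plus 2x3, so inference cost depends only on the input and output widths, not on the width of the intermediate linear layer used during training.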
Document Details
- Document Type: Technical Report
- Publication Date: Aug 01, 2018
- Accession Number: AD1059350
Entities
People
- John S. Hyatt
- Michael S. Lee
Organizations
- United States Army Research Laboratory