Using a Double-Well Oscillator to Train Binary Neural Network Layers
Abstract
Certain processors designed specifically for neural networks use low-precision weights and activations, which can considerably reduce the power required to compute a neural network. Although such a network can still generalize from the data to determine relevant features, the loss of precision sometimes results in a considerable loss of test accuracy. In this paper, we explore using a dynamical system that introduces transient chaos to the loss function, which helps train binary network layers. By implementing theoretical stochastic rounding probabilities on the MNIST data set, we improved the test error to state of the art for a binary network. We also show that adding a dynamical equation to the loss function of a network can effectively binarize the network.
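The report's own equations are not reproduced in this record, so the following is only a minimal sketch of the two ideas the abstract names, under stated assumptions. It assumes a quartic double-well potential (w^2 - 1)^2 as the term added to the loss, and a clipped linear ("hard sigmoid") probability for stochastic rounding, which is a common choice for binary networks but not necessarily the one the authors used. It does not model the oscillator dynamics or the transient chaos, and all function names are hypothetical.

```python
import numpy as np

def double_well_penalty(w):
    """Assumed penalty: quartic double well with minima at w = -1 and w = +1."""
    return np.sum((w ** 2 - 1.0) ** 2)

def double_well_grad(w):
    """Gradient of the assumed penalty: d/dw (w^2 - 1)^2 = 4 w (w^2 - 1)."""
    return 4.0 * w * (w ** 2 - 1.0)

def stochastic_binarize(w, rng):
    """Round each weight to +1 with probability clip((w + 1) / 2, 0, 1), else to -1.

    The probability rule is an assumption, not taken from the report.
    """
    p = np.clip((w + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

# Gradient descent on the penalty alone pulls real-valued weights toward -1 or +1.
# In a real network this gradient would be added to the gradient of the task loss.
weights = np.linspace(-1.5, 1.5, 8)   # eight starting values, none exactly zero
initial_signs = np.sign(weights)
learning_rate = 0.05                  # illustrative value only
for _ in range(500):
    weights = weights - learning_rate * double_well_grad(weights)

rng = np.random.default_rng(0)
binary_weights = stochastic_binarize(weights, rng)
```

Once the penalty has driven the weights close to the two wells, the rounding probabilities sit near 0 or 1, so the stochastic step changes very little. That is one way to read the abstract's claim that an added dynamical term can binarize a network by itself.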
Document Details
- Document Type: Technical Report
- Publication Date: Aug 15, 2017
- Accession Number: AD1059408
Entities
People
- Aron Wing
- Rose Rustowicz
Organizations
- Rome Laboratory