Hypercolumn Sparsification for Low-Power Convolutional Neural Networks

Abstract

We provide here a novel method, called hypercolumn sparsification, to achieve high recognition performance for convolutional neural networks (CNNs) despite low-precision weights and activities during both training and test phases. This method is applicable to any CNN architecture that operates on signal patterns (e.g., audio, image, video) to extract information such as class membership. It operates on the stack of feature maps in each of the cascading feature matching and pooling layers through the processing hierarchy of the CNN by an explicit competitive process ( k -WTA, winner take all) that generates a sparse feature vector at each spatial location. This principle is inspired by local brain circuits, where neurons tuned to respond to different patterns in the incoming signals from an upstream region inhibit each other using interneurons, such that only the ones that are maximally activated survive the quenching threshold. We show this process of sparsification is critical for probabilistic learning of low-precision weights and bias terms, thereby making pattern recognition amenable for energy-efficient hardware implementations. Further, we show that hypercolumn sparsification could lead to more data-efficient learning as well as having an emergent property of significantly pruning down the number of connections in the network. A theoretical account and empirical analysis are provided to understand these effects better.

Document Details

Document Type
Pub Defense Publication
Publication Date
Mar 26, 2019
Source ID
10.1145/3304104

Entities

People

  • David W. Payton
  • Narayan Srinivasa
  • Nigel D. Stepp
  • Praveen K Pilly
  • Yannis Liapis

Organizations

  • Defense Advanced Research Projects Agency
  • HRL Laboratories

Tags

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning.

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks