An Information Theoretic Interpretation to Deep Neural Networks

Abstract

With the unprecedented performance achieved by deep learning, it is commonly believed that deep neural networks (DNNs) attempt to extract informative features for learning tasks. To formalize this intuition, we apply local information geometric analysis and establish an information-theoretic framework for feature selection, which demonstrates the information-theoretic optimality of DNN features. Moreover, we conduct a quantitative analysis to characterize the impact of network structure on the feature extraction process of DNNs. Our investigation naturally leads to a performance metric for evaluating the effectiveness of extracted features, called the H-score, which illustrates the connection between the practical training process of DNNs and the information-theoretic framework. Finally, we validate our theoretical results with experiments on synthesized data and the ImageNet dataset.
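
The abstract names the H-score but does not define it. As a rough illustration only, the sketch below assumes a form commonly used in related work: the trace of the (pseudo-)inverse feature covariance times the covariance of the class-conditional feature means. The function name `h_score`, its signature, and the omission of any constant scaling factor are assumptions made for this example and are not taken from this record.

```python
import numpy as np


def h_score(features, labels):
    """Illustrative H-score under an ASSUMED definition:
    tr( pinv(cov(f(X))) @ cov(E[f(X) | Y]) ).

    features: array of shape (n_samples, n_dims)
    labels:   array of shape (n_samples,) with discrete class labels
    """
    f = np.asarray(features, dtype=float)
    y = np.asarray(labels)

    # Center the features so that covariances reduce to second moments.
    f = f - f.mean(axis=0, keepdims=True)
    n = f.shape[0]
    cov_f = f.T @ f / n

    # Replace each sample by the mean feature vector of its class,
    # which is the empirical estimate of E[f(X) | Y].
    g = np.zeros_like(f)
    for c in np.unique(y):
        mask = y == c
        g[mask] = f[mask].mean(axis=0)
    cov_g = g.T @ g / n

    # The pseudo-inverse guards against a singular feature covariance.
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))
```

Under this assumed definition, features whose class-conditional means all coincide score zero, and the score grows as the class means separate relative to the overall feature spread.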

Document Details

Document Type
Pub Defense Publication
Publication Date
Jan 17, 2022
Source ID
10.3390/e24010135

Entities

People

  • Gregory Wayne Wornell
  • Lizhong Zheng
  • Shao-Lun Huang
  • Xiangxiang Xu

Organizations

  • National Natural Science Foundation of China
  • National Science Foundation
  • Office of Naval Research

Tags

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks