An information-theoretic approach to improving the robustness of deep learning architectures

Abstract

Deep neural networks (DNNs) have been shown to outperform "flat" learning structures on a variety of applications. There has been great progress recently on scaling DNNs to large-scale applications and reliably training them. Far less developed is the ability to interpret and analyze the fundamental performance limits of DNNs. Signal processing and information theory have traditionally played an important role in characterizing optimal performance through quantities like the Cramér-Rao bound, the Bayes error, and different divergence measures; however, the utility of traditional statistical signal processing strategies is limited because they require complete knowledge of the data distribution and an analytical characterization of the algorithms that process it. This is not realistic for DNNs.
The principal goal of this proposal is to investigate non-parametric extensions of fundamental principles from statistical signal processing (SSP) that allow us to better understand data and the complex algorithms that process it. A new and extensible framework is proposed for constructing non-parametric estimators of fundamental quantities in signal processing that does not require density estimation or direct integration. The key innovation is the proposed development and analysis of a set of "data-driven" basis functions that can be estimated directly from data without requiring density estimation or direct integration. Linear combinations of these basis functions will allow scientists to develop non-parametric estimators of new and existing SSP quantities (e.g., the Cramér-Rao bound, Bayes error, divergence measures, Fisher information, and entropy). The new signal processing theory will inform analysis of DNNs for an application of particular interest to the PI: developing low-power instantiations of DNNs while preserving their rich expressive power.
The anticipated outcomes of this work are (1) new SSP theory for analysis of complex systems such as DNNs; (2) algorithms that help machine learning scientists analyze the performance limits of DNNs; and (3) cost functions for training DNNs that allow the scientist to trade off model complexity against model performance in low-power applications.
If successful, the resulting work could have a significant impact on Naval applications where increased autonomy in low-power systems would greatly improve the efficiency of existing operations. For example, improving the capabilities of low-cost sensors by integrating DNNs that convert the raw data streams to relevant information streams can lead to enhanced surveillance capabilities, and developing low-power wearable sensors for soldier health monitoring can improve the safety of Naval personnel.
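
To make the idea of estimating an SSP quantity directly from data more concrete, the following is a minimal sketch of one classical, well-known technique: bounding the two-class Bayes error from the leave-one-out 1-nearest-neighbor error rate using the asymptotic Cover-Hart inequality. It is not the basis-function framework proposed in this abstract, whose details are not given here; it only illustrates an estimator that needs neither density estimation nor integration. The function names `loo_1nn_error` and `bayes_error_bounds` and the synthetic Gaussian data are assumptions made for illustration.

```python
import numpy as np


def loo_1nn_error(X, y):
    """Leave-one-out 1-nearest-neighbor error rate (Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # A sample may not be its own neighbor.
    np.fill_diagonal(d2, np.inf)
    nearest = d2.argmin(axis=1)
    return float(np.mean(y[nearest] != y))


def bayes_error_bounds(r_nn):
    """Two-class Cover-Hart bounds on the Bayes error R*.

    Asymptotically R* <= R_nn <= 2 R* (1 - R*), which rearranges to
    (1 - sqrt(1 - 2 R_nn)) / 2 <= R* <= R_nn.
    The bounds hold in the large-sample limit; with finite data they
    are only approximate.
    """
    # Clip to the valid two-class range so the square root stays real.
    r = min(max(float(r_nn), 0.0), 0.5)
    lower = 0.5 * (1.0 - np.sqrt(1.0 - 2.0 * r))
    return float(lower), r


# Illustrative synthetic data (an assumption, not from the source):
# two unit-variance Gaussian classes in 2-D with shifted means.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
               rng.normal(2.0, 1.0, (200, 2))])
y = np.repeat([0, 1], 200)

r_nn = loo_1nn_error(X, y)
lower, upper = bayes_error_bounds(r_nn)
print(f"1-NN error: {r_nn:.3f}; Bayes error roughly in [{lower:.3f}, {upper:.3f}]")
```

The pairwise-distance step uses memory quadratic in the number of samples, so this sketch is suited only to small data sets; a tree-based neighbor search would be the usual substitute at scale.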

Document Details

Document Type: DoD Grant Award
Publication Date: Sep 01, 2017
Source ID: N000141712826

Entities

People

Visar Berisha

Organizations

Arizona State University
Office of Naval Research
United States Navy
