Interpretable Convolutional Neural Networks and Adversarial Examples: FY19 Line-Supported Program

Abstract

Convolutional neural networks (CNNs) can achieve remarkable accuracy on many computer-vision and image-recognition problems, but their prediction mechanism is difficult or impossible to understand, and normal training produces models that are highly susceptible to adversarial examples: inputs designed by an attacker to be misclassified despite being visually indistinguishable from ordinary images of the correct class. These shortcomings have sparked great interest in explaining CNNs and in training CNNs that are resistant to adversarial inputs. In this report, we show that one can apply robust optimization techniques to an interpretable CNN and train models that are both interpretable and robust.

Document Details

Document Type
Technical Report
Publication Date
Dec 18, 2020
Accession Number
AD1147899

Entities

People

  • J. K. Su
  • N. Kaushik

Organizations

  • MIT Lincoln Laboratory

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Computer Science
  • Computer Vision
  • Computers
  • Convolutional Neural Networks
  • Data Mining
  • Detection
  • Dimensionality Reduction
  • Image Recognition
  • Information Processing
  • Information Systems
  • Machine Learning
  • Neural Networks
  • Pattern Recognition
  • Signal Processing

Fields of Study

  • Computer science

Readers

  • Educational Psychology
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks