Advanced Data Visualizations for Robust Deep Machine Learning

Abstract

Progress in machine learning has led to impressive advances in Artificial Intelligence over just the last few years, with computers now able to outperform humans on a surprising variety of tasks. Amidst this excitement, however, are warning signs. Machine-learning-based systems routinely make unexpected and unexplainable errors. Small noise patterns that are imperceptible to humans, whether accidental or purposefully designed by an adversary, can confuse computer vision algorithms. Either way, the consequences can be catastrophic, especially as AI is used in medical, legal, military, and other high-stakes applications. These failures share a common cause: machine learning, including deep learning with deep neural networks, relies on fitting complex mathematical models to training data. These networks are very powerful, but when training sets are small or biased, as often happens in practice, the networks can overfit the training data. Moreover, given the complexity and black-box nature of these models, it is usually difficult to debug or fix a failure. Although much work is under way to build better algorithms, fixing them will be a long-term effort. We propose a fundamentally different approach built on the hypothesis that instead of fixing the black box, we need to make it more transparent. We will develop advanced techniques that allow both students and machine learning practitioners to visualize what is learned by deep networks and how different parameters of the learning (the size and sampling of the dataset, the source dataset used for transfer learning, the network parameters, and so on) affect the learned representation and the process by which it is learned. Our overall goal is to develop practical visualization tools that help machine learning to be applied effectively to challenging but critical classification problems such as those encountered by the Navy.
We address four specific challenges: (1) limited training datasets, (2) lack of explainability and debuggability, (3) adversarial inputs, and (4) a shortage of machine learning expertise in the workforce. We do so through three integrated research threads. Thread 1 is a survey of existing work and a formal user-needs study to characterize the problems, applications, and datasets of importance to the Navy and to identify potential visualization solutions. Thread 2 will develop advanced visualizations to reveal what is being learned by deep models, including how the learning evolves over time and how dataset properties and algorithm parameters impact the classifier’s accuracy, generality, and robustness to adversarial examples. Thread 3 will incorporate these visualizations into courses and educational materials, both to evaluate the techniques with students and to educate and recruit the next generation of Navy machine learning experts. A major focus of the project is to increase and diversify the talent pool of graduating students who have expertise in machine learning, and in particular to identify, educate, and recruit high-performing students to participate in the research and to consider a career in research with the Navy or elsewhere in Government. The PIs are an experienced team with joint expertise in computer vision, machine learning, and visualization, and with deep existing collaborations with Crane and with the Navy in general.

Document Details

Document Type
DoD Grant Award
Publication Date
Jul 29, 2020
Source ID
N001741910010

Entities

People

  • David J. Crandall

Organizations

  • Indiana University
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Artificial Intelligence
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Neural Networks