Advanced Data Visualizations for Robust Deep Machine Learning
Abstract
Progress in machine learning has led to impressive advances in Artificial Intelligence over just the last few years, with computers now able to outperform humans on a surprising variety of tasks. Amidst this excitement, however, are warning signs. Machine-learning-based systems routinely make unexpected and unexplainable errors. Small noise patterns that are imperceptible to humans, whether accidental or purposefully designed by an adversary, can confuse computer vision algorithms. Either way, the consequences can be catastrophic, especially as AI is used in medical, legal, military, and other high-stakes applications. These failures share a common cause: machine learning, including deep learning with deep neural networks, relies on fitting complex mathematical models to training data. These networks are very powerful, but when training sets are small or biased, as often happens in practice, they can overfit the training data. Moreover, given the complexity and black-box nature of these models, it is usually difficult to debug or fix a failure. Although much current work aims to build better algorithms, fixing them will be a long-term effort. We propose a fundamentally different approach, built on the hypothesis that instead of fixing the black box, we need to make it more transparent. We will develop advanced techniques that allow both students and machine learning practitioners to visualize what deep networks learn and how different parameters of the learning (the size and sampling of the dataset, the source dataset used for transfer learning, the network parameters, and so on) affect the learned representation and the process by which it is learned. Our overall goal is to develop practical visualization tools that help machine learning be applied effectively to challenging but critical classification problems such as those encountered by the Navy.
We address four specific challenges: (1) limited training datasets, (2) lack of explainability and debuggability, (3) adversarial inputs, and (4) a shortage of machine learning expertise in the workforce. We address them through three integrated research threads. Thread 1 is a survey of existing work and a formal user-needs study to characterize the problems, applications, and datasets of importance to the Navy and to identify potential visualization solutions. Thread 2 will develop advanced visualizations to reveal what is being learned by deep models, including how the learning evolves over time and how dataset properties and algorithm parameters impact the classifier’s accuracy, generality, and robustness to adversarial examples. Thread 3 will incorporate these visualizations into courses and educational materials, both to evaluate the techniques with students and to educate and recruit the next generation of Navy machine learning experts. A major focus of the project is to increase and diversify the talent pool of graduating students who have expertise in machine learning, and in particular to identify, educate, and recruit high-performing students to participate in the research and to consider careers in research with the Navy and other Government organizations. The PIs are an experienced team with joint expertise in computer vision, machine learning, and visualization, and with deep existing collaborations with Crane and the Navy in general.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jul 29, 2020
- Source ID: N001741910010
Entities
People
- David J. Crandall
Organizations
- Indiana University
- United States Navy