Reducing the Burden of Massive Training Data for Deep Learning

Abstract

Over the past decade, the amount of effort in deep learning (machine learning with many-layered neural networks) has increased exponentially due to the impressive performance of these networks on historically difficult problems, such as computer vision, natural language understanding, and decision-making. Correspondingly, the importance of deep learning to the Navy has become increasingly clear in the past several years. This performance is generally achieved by supervised learning, in which the model trains on large, balanced, labeled datasets. Studies have shown that more training data allows networks to reach higher accuracy and generalize better, so labeled datasets now typically contain thousands to millions of labeled images. Consequently, research in deep learning has exploded following the creation of large benchmark datasets, such as ImageNet and the Enron dataset. While raw data is often plentiful for real-world applications, labeling it is hard. Manually labeling thousands of data samples is labor intensive and can be a barrier in practice. Furthermore, in numerous fields (such as medicine, defense, and other scientific fields), often only a limited number of experts can correctly classify the data. Therefore, the goal of this project was to achieve high performance without manually labeling massive amounts of data (i.e., learning with limited labeled data).

Document Details

Document Type
Technical Report
Publication Date
Aug 30, 2021
Accession Number
AD1149304

Entities

People

  • Leslie N. Smith

Organizations

  • United States Naval Research Laboratory

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Accuracy
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computer Languages
  • Computer Vision
  • Computers
  • Deep Learning
  • Dimensionality Reduction
  • Electromagnetic Spectra
  • Hyperspectral Imagery
  • Information Systems
  • Machine Learning
  • Neural Networks
  • Pattern Recognition
  • Recurrent Neural Networks
  • Semi-Supervised Learning
  • Supervised Machine Learning
  • Unsupervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Economics
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks