The Effect of Training Data Set Composition on the Performance of a Neural Image Caption Generator

Abstract

This research seeks to determine how many images of a particular object a training data set must contain for caption quality to saturate in neural image caption generators. Understanding the relationship between caption quality and the size and composition of training data sets could improve efficiency in model training and lead to the development of optimized data sets for different tasks. We hypothesize that increasing the exposure of a neural network to an object will improve its performance up to a point, after which caption quality will saturate, and that this saturation point may vary with the object's visual homogeneity. We trained several image captioning models, using the existing Neuraltalk2 code, on subsets of the Microsoft Common Objects in Context data set, each of which contained a precise number of images of selected common object categories (e.g., cat and pizza). Performance at different levels of exposure to the selected objects was compared using two automated scoring metrics: the Metric for Evaluation of Translation with Explicit Ordering (METEOR) and Consensus-Based Image Description Evaluation (CIDEr). The data indicate that increasing the number of images of a particular object in the training data set improved performance up to 1,500 images, but not beyond that.
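
The subset construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record layout (pairs of image id and category set), the function name build_subset, the toy data, and the choice to keep every image that lacks the target category are all assumptions made for illustration. The abstract states only that each subset contained a precise number of images of the selected categories.

```python
import random


def build_subset(images, target, n_target, seed=0):
    """Return a training subset with exactly n_target images of `target`.

    images: list of (image_id, set_of_category_names) pairs.
            This layout is hypothetical, not the COCO annotation format.
    Assumption: every image without the target category is kept unchanged.
    """
    with_target = [img for img in images if target in img[1]]
    without_target = [img for img in images if target not in img[1]]
    if n_target > len(with_target):
        raise ValueError("not enough images of the target category")
    # Fixed seed so the same subset is drawn on every run.
    rng = random.Random(seed)
    chosen = rng.sample(with_target, n_target)
    return without_target + chosen


# Toy data: 30 images, 10 each of cat, pizza, and dog.
images = [
    (i, {"cat"} if i % 3 == 0 else {"pizza"} if i % 3 == 1 else {"dog"})
    for i in range(30)
]
subset = build_subset(images, "cat", 4)
cat_count = sum(1 for _, cats in subset if "cat" in cats)
```

Repeating the call with increasing values of n_target would give a series of training sets that differ only in exposure to the target object, which is the kind of comparison the abstract describes.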

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2017
Accession Number
AD1039145

Entities

People

  • Abigail Wilson
  • Adrienne Raglin

Organizations

  • United States Army Research Laboratory

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Computer Languages
  • Computer Programs
  • Computer Science
  • Computers
  • Convolutional Neural Networks
  • Data Science
  • Data Sets
  • Generators
  • Information Science
  • Machine Learning
  • Military Research
  • Neural Networks
  • Recurrent Neural Networks
  • Saturation
  • Test And Evaluation

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks