Extending Generation and Evaluation and Metrics (GEM) to Grounded Natural Language Generation (NLG) Systems and Evaluating their Descriptive Texts Derived from Image Sequences

Abstract

We present, for consideration in a future Generation and Evaluation and Metrics (GEM) challenge, a graduated, task-based approach to evaluating grounded natural language generation (NLG) systems that generate descriptive texts from sequences of input images. We start by characterizing grounded NLG tasks that generate descriptive texts at increasing levels of complexity, then step through examples of these levels, pairing image sequences and facet targets (input) with their derived descriptive texts (output) from our human-authored data set. To evaluate whether a grounded NLG system is "good enough" for users' needs, we first ask whether the user can recover the images the system used to derive descriptive texts at the relevant, graduated level of complexity. The texts judged adequate in this image-selection task are then analyzed for their semantic facet units (SFUs), which form the basis for scoring descriptive texts generated by other grounded NLG systems. Together, the image-selection task and SFU scoring constitute the evaluation we are piloting for grounded, data-to-text NLG systems.

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 2021
Accession Number: AD1149441

Entities

People

  • Clare R. Voss
  • Stephanie M. Lukin

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Abstracts
  • Automated Text Summarization
  • Cognitive Science
  • Commerce
  • Communities
  • Computational Linguistics
  • Computer Vision
  • Data Sets
  • Demographic Cohorts
  • Images
  • Instructions
  • Language
  • Linguistics
  • Military Research
  • Natural Language Processing
  • Natural Languages
  • Sequences
  • Standards
  • Test And Evaluation
  • Video
  • Video Clips

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Computer Vision