Toward Low-Cost Automated Evaluation Metrics for Internet of Things Dialogues

Abstract

We analyze a corpus of system-user dialogues in the Internet of Things domain. Our corpus is automatically, semi-automatically, and manually annotated with a variety of features at both the utterance level and the full-dialogue level. The corpus also includes human ratings of dialogue quality collected via crowdsourcing. We calculate correlations between features and human ratings to identify which features are strongly associated with human perceptions of dialogue quality in this domain. We also perform linear regression and derive a variety of dialogue quality evaluation functions. These evaluation functions are then applied to a held-out portion of our corpus, and are shown to be highly predictive of human ratings and to outperform standard reward-based evaluation functions.
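
The workflow summarized in the abstract (correlate a feature with human ratings, fit a linear evaluation function, then check it on held-out dialogues) can be illustrated with a minimal sketch. This is not the report's implementation: the feature (a count of system misunderstandings per dialogue), the toy numbers, and the helper names pearson and fit_line are all hypothetical, and only a single feature is used for brevity.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: (misunderstanding count, mean human rating on a 1-5 scale).
train = [(0, 5.0), (1, 4.0), (2, 3.0), (3, 2.0)]
held_out = [(1, 4.5), (2, 3.5), (4, 1.0)]

train_x = [x for x, _ in train]
train_y = [y for _, y in train]

# Step 1: how strongly is the feature associated with human ratings?
train_r = pearson(train_x, train_y)

# Step 2: derive a linear evaluation function from the training portion.
slope, intercept = fit_line(train_x, train_y)

# Step 3: apply the function to held-out dialogues and compare with human ratings.
predictions = [slope * x + intercept for x, _ in held_out]
held_out_r = pearson(predictions, [y for _, y in held_out])

print(f"feature/rating correlation (train): {train_r:.3f}")
print(f"evaluation function: rating = {slope:.2f} * feature + {intercept:.2f}")
print(f"prediction/rating correlation (held-out): {held_out_r:.3f}")
```

A real evaluation function would combine many features in a multiple regression. The single-feature case is shown only to keep the three steps visible.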

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2018
Accession Number
AD1159898

Entities

People

  • Carla Gordon
  • David R Traum
  • Heesik Jeon
  • Hyungtak Choi
  • Jill Boberg
  • Kallirroi Georgila

Organizations

  • Samsung Electronics
  • University of Southern California

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Automated Speech Recognition
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Computer Science
  • Data Sets
  • Dialogue Systems
  • Electronics
  • Human-Computer Interaction
  • Internet Of Things
  • Language
  • Ontologies
  • Simulations
  • Standards

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computational Modeling and Simulation
  • Speech Processing/Speech Recognition

Technology Areas

  • 5G