Toward Low-Cost Automated Evaluation Metrics for Internet of Things Dialogues

Abstract

We analyze a corpus of system-user dialogues in the Internet of Things domain. Our corpus is automatically, semi-automatically, and manually annotated with a variety of features at both the utterance level and the full-dialogue level. The corpus also includes human ratings of dialogue quality collected via crowdsourcing. We calculate correlations between features and human ratings to identify which features are highly associated with human perceptions of dialogue quality in this domain. We also perform linear regression and derive a variety of dialogue quality evaluation functions. These evaluation functions are then applied to a held-out portion of our corpus, where they are shown to be highly predictive of human ratings and to outperform standard reward-based evaluation functions.
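The pipeline described in the abstract (correlate features with ratings, fit a linear regression, score held-out dialogues) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature names (`num_turns`, `num_errors`), the toy noise-free data, and the helper functions are hypothetical and are not taken from the report, whose actual features and evaluation functions are not given in this record.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # prepend a bias column
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def predict(weights, row):
    """Apply a fitted evaluation function to one dialogue's features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))

# Hypothetical toy data: (num_turns, num_errors) per dialogue, with
# made-up ratings generated as 5 - 0.1 * num_turns - 0.5 * num_errors.
train_rows = [(4, 0), (6, 1), (8, 0), (10, 2), (5, 1), (12, 3)]
train_ratings = [4.6, 3.9, 4.2, 3.0, 4.0, 2.3]
held_out_rows = [(7, 1), (9, 2), (11, 0)]
held_out_ratings = [3.8, 3.1, 3.9]

# Step 1: correlation of each feature with the human ratings.
for name, col in zip(("num_turns", "num_errors"), zip(*train_rows)):
    print(name, round(pearson(col, train_ratings), 3))

# Step 2: derive an evaluation function by linear regression.
weights = fit_linear(train_rows, train_ratings)

# Step 3: check how well it predicts ratings on the held-out portion.
predictions = [predict(weights, r) for r in held_out_rows]
print("held-out correlation:", pearson(predictions, held_out_ratings))
```

Because the toy ratings are an exact linear function of the two features, the fit recovers them perfectly; real crowdsourced ratings would be noisy, and the held-out correlation would be the quantity of interest.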

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2018
Accession Number: AD1159898

Entities

People

Carla Gordon
David R Traum
Heesik Jeon
Hyungtak Choi
Jill Boberg
Kallirroi Georgila

Organizations

Samsung Electronics
University of Southern California
