A Semi-automated Evaluation Metric for Dialogue Model Coherence

Abstract

We propose a new metric, Voted Appropriateness, which can be used to automatically evaluate dialogue policy decisions once some wizard data has been collected. We show that this metric outperforms a previously proposed metric, Weak Agreement. We also present a taxonomy of dialogue model evaluation schemas and situate our new metric within it.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2016
Accession Number
AD1157726

Entities

People

  • David R Traum
  • Sudeep Gandhe

Organizations

  • University of Southern California

Tags

Communities of Interest

  • Autonomy
  • Biomedical

DTIC Thesaurus Topics

  • Agreements
  • Automatic
  • Cognitive Systems Engineering
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Dialogue Systems
  • Electronic Mail
  • Judgment
  • Language
  • Learning
  • Linguistics
  • Machine Learning
  • Machine Translation
  • Natural Language Processing
  • Natural Languages
  • Reinforcement Learning
  • Simulations
  • Taxonomy

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Computer Vision
  • Instructional Design and Training Evaluation