Evaluation of Machine Translation and its Evaluation

Abstract

Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and their average, the F-measure. The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, this standard measure has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved.
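As a rough illustration of the measures named in the abstract, the following minimal sketch computes unigram precision, recall, and F-measure for one candidate translation against one reference. It is not the report's software. The function name `unigram_prf`, whitespace tokenization, clipped (multiset) matching, and the use of the harmonic mean for the F-measure are all assumptions made for this example.

```python
from collections import Counter


def unigram_prf(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) over unigrams.

    Hypothetical helper for illustration only: tokens are split on
    whitespace, and each reference token can be matched at most once.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())

    # Multiset intersection: a word counts as matched min(cand, ref) times.
    matches = sum((cand_counts & ref_counts).values())

    cand_len = sum(cand_counts.values())
    ref_len = sum(ref_counts.values())

    precision = matches / cand_len if cand_len else 0.0
    recall = matches / ref_len if ref_len else 0.0

    # Assumption: F-measure taken as the harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    p, r, f = unigram_prf("the cat sat on the mat", "the cat is on the mat")
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

In this example five of the six candidate words also occur in the six-word reference, so precision, recall, and F-measure all come out at 5/6.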

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA453509

Entities

People

  • I. D. Melamed
  • Joseph P. Turian
  • Luke Shen

Organizations

  • New York University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Computational Linguistics
  • Computational Science
  • Data Analysis
  • Information Retrieval
  • Information Science
  • Language
  • Linguistics
  • Machine Translation
  • Natural Language Processing
  • New York
  • Standards
  • Test And Evaluation
  • Translations

Readers

  • Computer Programming and Software Development
  • Nuclear Civil Defense
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Translation