Normalization for Automated Metrics: English and Arabic Speech Translation

Abstract

The Defense Advanced Research Projects Agency (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program has experimented with applying automated metrics to speech translation dialogues. For translations into English, BLEU, TER, and METEOR scores correlate well with human judgments, but scores for translation into Arabic correlate with human judgments less strongly. This paper provides evidence to support the hypothesis that automated measures of Arabic are lower due to variation and inflection in Arabic by demonstrating that normalization operations improve correlation between BLEU scores and Likert-type judgments of semantic adequacy as well as between BLEU scores and human judgments of the successful transfer of the meaning of individual content words from English to Arabic.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2009
Accession Number: AD1125098

Entities

People

Alan Rubenstein
Beatrice Oshika
Christy Doran
Dan Parvaz
Gregory A. Sanders
John Aberdeen
Sherri Condon

Organizations

MITRE Corporation
