Domain Tuning of Bilingual Lexicons for MT

Abstract

Our overall objective is to translate a domain-specific document in a foreign language (in this case, Chinese) into English. Using automatically induced domain-specific, comparable documents and language-independent clustering, we apply domain-tuning techniques to a bilingual lexicon for downstream translation of the input document into English. We describe our domain-tuning technique and demonstrate its effectiveness by comparing our results to a manually constructed domain-specific vocabulary. Our coverage/accuracy experiments indicate that domain-tuned lexicons achieve 88% precision and 66% recall. We also ran a Bleu experiment to compare our domain-tuned version to its un-tuned counterpart in an IBM-style MT system. Our domain-tuned lexicons improved the Bleu scores: 9.4% higher than a system trained on a uniformly-weighted dictionary and 275% higher than a system trained on no dictionary at all.
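
As an illustration of how the precision and recall figures in the abstract are conventionally defined, the following is a minimal sketch that scores a set of induced domain terms against a manually constructed reference vocabulary. It is not the report's evaluation code: the function name `precision_recall` and the toy term sets are hypothetical, chosen only for this example, and the report may match entries at a different granularity (for instance, translation pairs rather than single terms).

```python
def precision_recall(induced, gold):
    """Return (precision, recall) of an induced term set against a gold set.

    precision = |induced ∩ gold| / |induced|
    recall    = |induced ∩ gold| / |gold|
    """
    induced, gold = set(induced), set(gold)
    overlap = induced & gold
    precision = len(overlap) / len(induced) if induced else 0.0
    recall = len(overlap) / len(gold) if gold else 0.0
    return precision, recall


# Toy data invented for this sketch; not taken from the report.
induced_terms = {"sarin", "nerve agent", "stockpile", "treaty"}
gold_terms = {"sarin", "nerve agent", "stockpile",
              "mustard gas", "precursor", "inspection"}

p, r = precision_recall(induced_terms, gold_terms)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.50
```

In this toy case three of the four induced terms appear in the reference vocabulary (precision 0.75), while three of the six reference terms are recovered (recall 0.50).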

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2003
Accession Number
ADA455197

Entities

People

  • Bonnie J. Dorr
  • Necip F. Ayan
  • Okan Kolak

Organizations

  • University of Maryland

Tags

Communities of Interest

  • C4I

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Chemical Weapons
  • Classification
  • Computational Linguistics
  • Computer Science
  • Dictionaries
  • English Language
  • Foreign Languages
  • Information Retrieval
  • Information Science
  • Language
  • Linguistics
  • Machine Translation
  • Natural Language Processing
  • Natural Languages
  • Precision

Readers

  • Computational Linguistics