Name Matching Between Roman and Chinese Scripts

Abstract

There are generally many ways to transliterate a name from one language script into another. The resulting ambiguity can make it very difficult to "untransliterate" a name by reverse engineering the process. In this paper, we present a highly successful cross-script name matching system that was developed by combining the creativity of human intuition with the power of machine learning. Our system correctly determines whether a name in Chinese script and a name in Roman script match with an F-score of 96 percent. In addition, for name pairs that satisfy a computational test, the F-score is 98 percent.
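
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F-score is typically computed for a match / no-match decision task. The abstract does not say which F-score variant the report uses, so the balanced F1 default (beta = 1) is an assumption. The function name `f_score` and the counts in the usage example are hypothetical and are not taken from the report.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """Return (precision, recall, F_beta) for binary match decisions.

    true_pos:  pairs labelled "match" that really match
    false_pos: pairs labelled "match" that do not match
    false_neg: matching pairs the system failed to label "match"
    beta:      weight of recall relative to precision (1.0 gives F1;
               assumed here, the abstract does not name the variant)
    """
    predicted = true_pos + false_pos   # everything the system called a match
    actual = true_pos + false_neg      # everything that truly is a match
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical counts chosen for illustration only (not data from the report).
precision, recall, f1 = f_score(true_pos=96, false_pos=4, false_neg=4)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With beta = 1 the score is the harmonic mean of precision and recall, so it is high only when the system both finds most true matches and rarely reports false ones.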

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2007
Accession Number: AD1107372

Entities

People

  • Alan Rubenstein
  • Alex Yeh
  • Ken Samuel
  • Sherri Condon

Organizations

  • MITRE Corporation

Tags

Communities of Interest

  • Autonomy
  • Biomedical
  • C4I

DTIC Thesaurus Topics

  • Algorithms
  • Boundaries
  • Computational Linguistics
  • Computational Science
  • Corporations
  • Data Science
  • Data Set
  • Databases
  • Digital Data
  • Experimental Design
  • Information Processing
  • Information Science
  • Language
  • Linguistics
  • Machine Learning
  • Natural Language Computing
  • Natural Language Processing
  • Personality
  • Phonemes
  • Reverse Engineering
  • Speech

Fields of Study

  • Computer science

Readers

  • Computational Linguistics

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks