The Multilingual Entity Task (MET) Overview

Abstract

In November 1995, the Message Understanding Conference-6 (MUC-6) evaluation of named entity identification demonstrated that systems are approaching human performance on English language texts [10]. Informal and anonymous, the MET provided a new opportunity to assess progress on the same task in Spanish, Japanese, and Chinese. Preliminary results indicate that MET systems in all three languages performed comparably to those of the MUC-6 evaluation in English. Based upon the Named Entity Task Guidelines [11], the task was to locate and tag, using SGML, named entity expressions (people, organizations, and locations), time expressions (time and date), and numeric expressions (percentage and money) in Spanish texts from Agence France Presse, in Japanese texts from Kyodo newswire, or in Chinese texts from Xinhua newswire. Across languages, the keywords "press conference" retrieved a rich subcorpus of texts covering a wide spectrum of topics. The frequency and types of expressions vary across the three language sets [2] [8] [9]. The original task guidelines were modified so that the core guidelines were language independent, with language-specific rules appended.
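As an illustration of the tagging task described in the abstract, the following minimal sketch wraps already-identified expressions in SGML elements. It assumes the ENAMEX, TIMEX, and NUMEX element names with a TYPE attribute, as used in the MUC-6 named entity markup; the function names, the category keys, and the sample sentence are hypothetical and are not taken from the report. The sketch covers only the output format, not the identification step that MET systems performed.

```python
# Sketch only: maps the seven categories named in the abstract to
# MUC-6 style SGML elements (assumed here; not quoted from the report).
TAGS = {
    "person": ("ENAMEX", "PERSON"),
    "organization": ("ENAMEX", "ORGANIZATION"),
    "location": ("ENAMEX", "LOCATION"),
    "date": ("TIMEX", "DATE"),
    "time": ("TIMEX", "TIME"),
    "money": ("NUMEX", "MONEY"),
    "percentage": ("NUMEX", "PERCENT"),
}


def find_span(text, phrase, category):
    """Return (start, end, category) for the first occurrence of phrase."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"phrase not found: {phrase!r}")
    return (start, start + len(phrase), category)


def tag_text(text, spans):
    """Insert SGML tags around non-overlapping (start, end, category) spans."""
    pieces = []
    pos = 0
    for start, end, category in sorted(spans):
        if start < pos:
            raise ValueError("overlapping spans are not supported")
        element, type_value = TAGS[category]
        pieces.append(text[pos:start])  # untagged text before the span
        pieces.append(
            f'<{element} TYPE="{type_value}">{text[start:end]}</{element}>'
        )
        pos = end
    pieces.append(text[pos:])  # untagged text after the last span
    return "".join(pieces)


# Hypothetical sample sentence, invented for illustration.
sample = "Xinhua reported on May 1 that sales rose 5%."
spans = [
    find_span(sample, "Xinhua", "organization"),
    find_span(sample, "May 1", "date"),
    find_span(sample, "5%", "percentage"),
]
tagged = tag_text(sample, spans)
print(tagged)
```

Keeping the category-to-element mapping in one table reflects the abstract's point that the core guidelines were language independent: under this assumption, only the identification step would differ between Spanish, Japanese, and Chinese.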

Document Details

Document Type: Technical Report
Publication Date: May 01, 1996
Accession Number: ADA631522

Entities

People

  • Mary E. Okurowski
  • Nancy Chinchor
  • Roberta Merchant

Organizations

  • United States Department of Defense

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accuracy
  • Acquisition
  • Consistency
  • Department Of Defense
  • English Language
  • Information Operations
  • Language
  • Motor Skills
  • Named Entity Recognition
  • New Mexico
  • Software Development
  • Software Testing
  • Test And Evaluation
  • Text Processing
  • Training

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Library and Information Science