The Multilingual Entity Task (MET) Overview
Abstract
In November 1995, the Message Understanding Conference-6 (MUC-6) evaluation of named entity identification demonstrated that systems are approaching human performance on English language texts [10]. Informal and anonymous, the MET provided a new opportunity to assess progress on the same task in Spanish, Japanese, and Chinese. Preliminary results indicate that MET systems in all three languages performed comparably to those of the MUC-6 evaluation in English. Based upon the Named Entity Task Guidelines [11], the task was to locate and tag with SGML three kinds of expressions: named entity expressions (people, organizations, and locations), time expressions (time and date), and numeric expressions (percentage and money). The texts were Spanish texts from Agence France Presse, Japanese texts from Kyodo newswire, or Chinese texts from Xinhua newswire. Across languages, the keywords "press conference" retrieved a rich subcorpus of texts covering a wide spectrum of topics. The frequency and types of expressions vary across the three language sets [2] [8] [9]. The original task guidelines were modified so that the core guidelines were language independent, with language-specific rules appended.
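As an illustration of the tagging task described in the abstract, the following minimal sketch wraps already-identified expressions in SGML elements. The element names (ENAMEX, TIMEX, NUMEX) and TYPE values follow the MUC-6 Named Entity convention; the helper functions `tag_entities` and `span_of`, the sample sentence, and the company name in it are hypothetical and are not taken from the report or from any MET system.

```python
# Element names and TYPE values follow the MUC-6 Named Entity convention.
# Everything else here (helpers, sample sentence) is an illustrative assumption.
CATEGORY_TO_ELEMENT = {
    "PERSON": "ENAMEX",
    "ORGANIZATION": "ENAMEX",
    "LOCATION": "ENAMEX",
    "DATE": "TIMEX",
    "TIME": "TIMEX",
    "MONEY": "NUMEX",
    "PERCENT": "NUMEX",
}


def span_of(text, phrase, category):
    """Return (start, end, category) for the first occurrence of phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase), category)


def tag_entities(text, spans):
    """Wrap each (start, end, category) span in its SGML element.

    Offsets are character positions, end-exclusive; spans must not overlap.
    """
    pieces = []
    cursor = 0
    for start, end, category in sorted(spans):
        if start < cursor:
            raise ValueError("overlapping spans are not supported")
        element = CATEGORY_TO_ELEMENT[category]
        pieces.append(text[cursor:start])  # untagged text before the span
        pieces.append(
            f'<{element} TYPE="{category}">{text[start:end]}</{element}>'
        )
        cursor = end
    pieces.append(text[cursor:])  # untagged text after the last span
    return "".join(pieces)


# Hypothetical sentence; a real system would have to find these spans itself.
sentence = "Acme Corp. said on Monday that profits rose 5%."
spans = [
    span_of(sentence, "Acme Corp.", "ORGANIZATION"),
    span_of(sentence, "Monday", "DATE"),
    span_of(sentence, "5%", "PERCENT"),
]
tagged = tag_entities(sentence, spans)
print(tagged)
```

The sketch covers only the output format. Locating the expressions, which is the part of the task that MET evaluated, is assumed to have been done already.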
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 1996
- Accession Number: ADA631522
Entities
People
- Mary E. Okurowski
- Nancy Chinchor
- Roberta Merchant
Organizations
- United States Department of Defense