Approaches in MET (Multi-Lingual Entity Task)

Abstract

BBN and FinCEN participated jointly in the Spanish language task for MET. BBN also participated in Chinese. We fielded two approaches. The first approach is pattern based, with the architecture shown in Figure 1, and was applied to both Chinese and Spanish. The same algorithms (rectangles in the figure) were used in both languages; the only component difference was the New Mexico State University segmenter, used to find word boundaries in Chinese. The components common to both languages are the message reader, which handles the input format and SGML conventions via a declarative format description; the part-of-speech tagger (BBN POST); a lexical pattern matcher driven by knowledge bases of patterns and lexicons specific to each language; and the SGML annotation generator. Although not shown in Figure 1, an alias prediction algorithm was also shared by both languages, using patterns unique to each language.
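The shared-component design described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical and is not the BBN system: the greedy longest-match segmenter merely stands in for the New Mexico State University segmenter, a plain lexicon lookup stands in for the pattern matcher and its knowledge bases, and the part-of-speech tagging and alias prediction steps are omitted. The `ENAMEX` tag is assumed here to follow MUC-style named entity markup. The point is only that one set of algorithms can serve both languages when the language-specific parts (tokenization and lexicon) are passed in as data.

```python
from typing import Callable, Dict, List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Span = Tuple[int, int, str]
Lexicon = Dict[Tuple[str, ...], str]


def whitespace_tokenize(text: str) -> List[str]:
    """Tokenizer for a language that already marks word boundaries."""
    return text.split()


def make_greedy_segmenter(vocab: Set[str], max_len: int = 4) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a word segmenter: greedy longest match
    against a vocabulary, falling back to single characters."""
    def segment(text: str) -> List[str]:
        tokens, i = [], 0
        while i < len(text):
            for size in range(min(max_len, len(text) - i), 0, -1):
                piece = text[i:i + size]
                if size == 1 or piece in vocab:
                    tokens.append(piece)
                    i += size
                    break
        return tokens
    return segment


def match_lexicon(tokens: List[str], lexicon: Lexicon) -> List[Span]:
    """Language-independent matcher: longest lexicon entry starting at each token."""
    longest = max((len(key) for key in lexicon), default=0)
    spans, i = [], 0
    while i < len(tokens):
        for size in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + size])
            if key in lexicon:
                spans.append((i, i + size, lexicon[key]))
                i += size
                break
        else:
            i += 1  # no entry starts here
    return spans


def annotate(tokens: List[str], spans: List[Span], joiner: str) -> str:
    """Wrap matched spans in SGML tags (tag name is an assumption)."""
    starts = {start: (end, etype) for start, end, etype in spans}
    out, i = [], 0
    while i < len(tokens):
        if i in starts:
            end, etype = starts[i]
            out.append(f'<ENAMEX TYPE="{etype}">{joiner.join(tokens[i:end])}</ENAMEX>')
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return joiner.join(out)


def run_pipeline(text: str, tokenize: Callable[[str], List[str]],
                 lexicon: Lexicon, joiner: str) -> str:
    """Same algorithms for every language; only the arguments differ."""
    tokens = tokenize(text)
    return annotate(tokens, match_lexicon(tokens, lexicon), joiner)


if __name__ == "__main__":
    spanish = run_pipeline(
        "El Banco de España anunció cambios",
        whitespace_tokenize,
        {("Banco", "de", "España"): "ORGANIZATION"},
        " ",
    )
    chinese = run_pipeline(
        "中国银行在北京",
        make_greedy_segmenter({"中国银行", "北京"}),
        {("中国银行",): "ORGANIZATION", ("北京",): "LOCATION"},
        "",
    )
    print(spanish)
    print(chinese)
```

In this sketch the Chinese configuration differs from the Spanish one only in the tokenizer and the lexicon supplied to `run_pipeline`, mirroring the abstract's statement that the segmenter was the only component difference between the two languages.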

Document Details

Document Type: Technical Report
Publication Date: May 01, 1996
Accession Number: ADA631523

Entities

People

  • Damaris Ayuso
  • Daniel Bikel
  • Erik Peterson
  • Patrick Jost
  • Ralph Weischedel
  • Tasha Hall

Organizations

  • BBN Technologies

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Automated Speech Recognition
  • Boundaries
  • Governments
  • Hidden Markov Models
  • Language
  • Linguistics
  • Markov Models
  • Models
  • New Mexico
  • Probability
  • Recognition
  • Spanish Language
  • Training
  • United States
  • United States Government

Readers

  • Computational Linguistics
  • Life Cycle Cost Analysis