CRL/NMSU and Brandeis: Description of the MucBruce System as Used for MUC-4

Abstract

Through their involvement in the Tipster project, the Computing Research Laboratory at New Mexico State University and the Computer Science Department at Brandeis University are developing a method for identifying articles of interest and for extracting and storing specific kinds of information from large volumes of Japanese and English text. We intend the method to be general and extensible: the techniques involved are not explicitly tied to these two languages or to a particular subject area. Development for Tipster has been under way since September, 1992. The system we used for the MUC-4 tests implements only some of the features we plan to include in our final Tipster system. It relies heavily on statistics and on context-free text marking to generate templates. Some more detailed parsing has been added for a limited lexicon, but the lack of fuller coverage places an inherent limit on its performance. Most of the information in our MUC templates is obtained by probing the text surrounding words that are 'significant' for the template type being generated, in order to find appropriately tagged fillers for the template fields.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1992
Accession Number
ADA461003

Entities

People

  • James Pustejovsky
  • Jim Cowie
  • Louise Guthrie
  • Scott Waterman
  • Yorick Wilks

Organizations

  • New Mexico State University

Tags

Communities of Interest

  • Weapons Technologies

DTIC Thesaurus Topics

  • Acquisition
  • Cognitive Science
  • Computational Linguistics
  • Computer Science
  • Computers
  • Dictionaries
  • El Salvador
  • Governments
  • Language
  • Linguistics
  • Machine Translation
  • New Mexico
  • Probabilistic Models
  • Recognition
  • Semantic Models
  • Terrorists
  • Word Lists

Readers

  • Computational Linguistics
  • Systems Analysis and Design
  • Technical Research and Report Writing