CRL/NMSU and Brandeis: Description of the MucBruce System as Used for MUC-4

Abstract

Through their involvement in the Tipster project, the Computing Research Laboratory at New Mexico State University and the Computer Science Department at Brandeis University are developing a method for identifying articles of interest and for extracting and storing specific kinds of information from large volumes of Japanese and English text. We intend the method to be general and extensible: the techniques involved are not explicitly tied to these two languages or to a particular subject area. Development for Tipster has been going on since September 1992. The system used for the MUC-4 tests implements only some of the features planned for the final Tipster system. It relies heavily on statistics and on context-free text marking to generate templates. Some more detailed parsing has been added for a limited lexicon, but the lack of fuller coverage places an inherent limit on its performance. Most of the information in our MUC templates is obtained by probing the text surrounding words that are 'significant' for the template type being generated, in order to find appropriately tagged fillers for the template fields.
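
The probing step described in the abstract can be illustrated with a minimal Python sketch. It is not the MucBruce implementation: the `Token` class, the `TRIGGERS` and `FIELD_TAGS` tables, the tag names, the `probe` function, and the window size are all hypothetical and chosen only for illustration. The sketch assumes the text has already been marked with semantic tags, looks for a trigger word for the given template type, and fills each field with the nearest token carrying the expected tag.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    tag: str  # hypothetical semantic tag, e.g. "ORG", "PLACE", or "O" for untagged


# Hypothetical trigger words for each template type.
TRIGGERS = {"bombing": {"bomb", "bombed", "exploded"}}

# Hypothetical mapping from template field to the tag its filler must carry.
FIELD_TAGS = {"bombing": {"perpetrator": "ORG", "target": "PLACE"}}


def probe(tokens, template_type, window=5):
    """Fill a template from tagged tokens near the first trigger word found.

    Returns a dict for the template, or None if no trigger word occurs.
    """
    triggers = TRIGGERS[template_type]
    wanted = FIELD_TAGS[template_type]
    for i, tok in enumerate(tokens):
        if tok.text.lower() not in triggers:
            continue
        template = {"type": template_type}
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Visit positions nearest the trigger first so the closest filler wins.
        neighbours = sorted(range(lo, hi), key=lambda j: abs(j - i))
        for field, tag in wanted.items():
            for j in neighbours:
                if tokens[j].tag == tag:
                    template[field] = tokens[j].text
                    break
        return template
    return None


if __name__ == "__main__":
    sentence = [
        Token("GroupX", "ORG"),
        Token("guerrillas", "O"),
        Token("bombed", "O"),
        Token("the", "O"),
        Token("embassy", "PLACE"),
    ]
    print(probe(sentence, "bombing"))
```

A field with no suitably tagged token inside the window is simply left out of the template, which mirrors the limit that incomplete lexical coverage places on this kind of approach.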

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 1992
Accession Number: ADA461003

Entities

People

James Pustejovsky
Jim Cowie
Louise Guthrie
Scott Waterman
Yorick Wilks

Organizations

New Mexico State University
