BBN: Description of the PLUM System as Used for MUC-3

Abstract

Traditional approaches to the problem of extracting data from texts have emphasized handcrafted linguistic knowledge. In contrast, BBN's PLUM system (Probabilistic Language Understanding Model) was developed as part of a DARPA-funded research effort on integrating probabilistic language models with more traditional linguistic techniques. Our research and development goals are:

  • more rapid development of new applications,
  • the ability to train (and re-train) systems based on user markings of correct and incorrect output,
  • more accurate selection among interpretations when more than one is found, and
  • more robust partial interpretation when no complete interpretation can be found.

We have previously performed experiments on components of the system with texts from the Wall Street Journal; however, the MUC-3 task is the first end-to-end application of PLUM. All components except parsing were developed in the last 5 months and therefore cannot be considered fully mature. The parsing component, the MIT Fast Parser [4], originated outside BBN and has a more extensive history prior to MUC-3. A central assumption of our approach is that in processing unrestricted text for data extraction, a non-trivial amount of the text will not be understood. As a result, all components of PLUM are designed to operate on partially understood input, taking advantage of information when it is available and not failing when it is unavailable.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1991
Accession Number
ADA460678

Entities

People

  • Damaris Ayuso
  • Jeff Palmucci
  • Ralph Weischedel
  • Robert Ingria
  • Sean Boisen

Organizations

  • BBN Technologies

Tags

Communities of Interest

  • Counter IED
  • Weapons Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Databases
  • El Salvador
  • Generators
  • Hidden Markov Models
  • Language
  • Machine Learning
  • Markov Models
  • Models
  • Probabilistic Models
  • Probability
  • Semantics
  • Template Patterns
  • Terrorists
  • Text Processing
  • United States Government
  • USSR

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development
  • Theoretical Analysis