Algorithms that Learn to Extract Information - BBN: Tipster Phase 3

Abstract

All of BBN's research under the TIPSTER III program has focused on doing extraction by applying statistical models trained on annotated data, rather than by using programs that execute hand-written rules. Within the context of MUC-7, the SIFT system for extraction of template entities (TE) and template relations (TR) used a novel, integrated syntactic/semantic language model to extract sentence-level information, and then synthesized information across sentences using, in part, a trained model for cross-sentence relations. At the named entity (NE) level as well, in both MET-1 and MUC-7, BBN employed a trained, HMM-based model. The results in these TIPSTER evaluations are evidence that such trained systems, even at their current level of development, can perform roughly on a par with those based on rules hand-tailored by experts. In addition, such trained systems have some significant advantages. They can be easily ported to new domains by simply annotating fresh data. The complex interactions that make rule-based systems difficult to develop and maintain can here be learned automatically from the training data. We believe that improved and extended versions of such trained models have the potential for significant further progress toward practical systems for information extraction.

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 1998
Accession Number
ADA631521

Entities

People

  • Heidi Fox
  • Lance Ramshaw
  • Michael Crystal
  • Ralph Weischedel
  • Rebecca Stone
  • Richard Schwartz
  • Scott R. Miller

Organizations

  • BBN Technologies

Tags

Communities of Interest

  • Space
  • Weapons Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Software
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Governments
  • Language
  • Linguistics
  • Materials
  • Named Entity Recognition
  • Natural Language Processing
  • Probabilistic Models
  • Probability
  • Space Systems
  • Template Patterns
  • Test And Evaluation
  • Training

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation