Tasks, Domains, and Languages for Information Extraction

Abstract

The information extraction tasks for the ARPA TIPSTER program center on automatically filling object-oriented data structures, called templates, with information extracted from free text in news stories (for discussion of templates and objects, see "Template Design for Information Extraction" in this volume). Given a text as input, a TIPSTER system first detects whether the text contains relevant information. If it does, the system extracts specific instances of the generic types of information that correspond to each slot in the template and outputs that information by filling the template slots in an appropriate data representation. The filled slots are then scored by an automatic scoring program against templates produced by human analysts, which serve as answer keys. Human analysts also prepared development-set templates for each domain, which served as training models for system developers (for discussion of the data preparation effort, see "Corpora and Data Preparation for Information Extraction" in this volume).
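As a rough illustration of the fill-then-score idea described in the abstract, the sketch below compares a system-filled template against an analyst answer key. It is a minimal sketch under stated assumptions, not the TIPSTER scoring program: the `Template` class, the `score_slots` function, the slot names, and the exact-match precision and recall measures are all hypothetical, and real templates are object-oriented structures rather than the flat slot dictionary assumed here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical flattened template: slot name -> filled value."""
    slots: dict[str, str] = field(default_factory=dict)


def score_slots(response: Template, key: Template) -> dict[str, float]:
    """Score system-filled slots against an analyst answer key.

    Assumption: a slot counts as correct only on an exact string match.
    """
    correct = sum(
        1
        for name, value in response.slots.items()
        if key.slots.get(name) == value
    )
    # Precision: share of the system's fills that match the key.
    precision = correct / len(response.slots) if response.slots else 0.0
    # Recall: share of the key's slots that the system filled correctly.
    recall = correct / len(key.slots) if key.slots else 0.0
    return {"correct": correct, "precision": precision, "recall": recall}


# Illustrative data only; these slot names and values are invented.
key = Template({"company": "Acme Corp", "partner": "Beta Ltd", "country": "Japan"})
response = Template({"company": "Acme Corp", "partner": "Gamma Inc"})
result = score_slots(response, key)
```

In this invented example the system fills two slots and gets one right, so the sketch reports one correct fill, with precision computed over the system's fills and recall computed over the answer key's slots.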

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 1993
Accession Number: ADA630822

Entities

People

Boyan Onyshkevych
Lynn Carlson
Mary E. Okurowski

Organizations

United States Department of Defense
