PROTEUS and PUNDIT: Research in Text Understanding.

Abstract

We are developing systems that analyze short narrative messages in a limited domain and extract the information they contain. These systems are initially being applied to messages describing equipment failure. This work is a joint effort of New York University and the System Development Corp. for the DARPA Strategic Computing Program. Our aim is to create a system reliable enough for use in an operational environment. This is a formidable task, both because the texts are unedited (and so contain various errors) and because the complexity of any real domain precludes us from assembling a complete collection of the relationships and domain knowledge relevant to understanding texts in the domain. A number of laboratory prototypes have been developed for the analysis of short narratives. None of the systems we know about, however, is reliable enough for use in an operational environment (the possible exceptions are expectation-driven systems, which simply ignore anything deviating from their built-in expectations). Reported success rates are typically 75-80% of sentences correctly analyzed, with many erroneous analyses passing through the system undetected; this is not acceptable for most applications. We see the central task of the work described below as the construction of a substantially more reliable system for narrative analysis.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1986
Accession Number
ADA174471

Entities

People

  • Lynette Hirschman
  • Ralph David Grishman

Organizations

  • New York University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Air Compressors
  • Analyzers
  • Command Centers
  • Computational Linguistics
  • Computer Programming
  • Computer Science
  • Computers
  • Grammars
  • Language
  • Linguistics
  • Lisp Programming Language
  • Message Processing
  • Natural Languages
  • New York
  • Semantics
  • Symbolic Programming
  • Teamwork

Readers

  • Parallel and Distributed Computing
  • Systems Analysis and Design
  • Technical Research and Report Writing