CRL/Brandeis: The DIDEROT System

Abstract

Diderot is an information extraction system built at CRL and Brandeis University over the past two years as part of our efforts in the Tipster project. The same overall system architecture has been used for English and Japanese and for the micro-electronics and joint venture domains. The history of the system is discussed and the operation of its major components is described. A summary of scores at the 24-month workshop is given. Because of the emphasis on different languages and different subject areas, the research has focused on the development of general-purpose, reusable techniques. The CRL/Brandeis group has implemented statistical methods for focusing on the relevant parts of texts, and programs which recognize and mark names of people, places, and organizations, as well as dates. The actual analysis of the critical parts of the texts is carried out by a parser controlled by lexical structures for the 'key' words in the text. To extend the system's coverage of English and Japanese, some of the content of these lexical structures was derived from machine-readable dictionaries and then enhanced with information extracted from corpora.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1993
Accession Number
ADA461001

Entities

People

  • J. Wang
  • James Pustejovsky
  • Jim Cowie
  • Louise Guthrie
  • Rong Wang
  • Scott Waterman
  • Takahiro Wakao
  • William Ogden
  • Yorick Wilks

Organizations

  • New Mexico State University

Tags

Communities of Interest

  • Advanced Electronics

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Computer Programming
  • Computer Science
  • Dictionaries
  • Electronics
  • Errors
  • Language
  • Lisp Programming Language
  • Machines
  • Operating Systems
  • Precision
  • Programming Languages
  • Recognition
  • Standards
  • Symbolic Programming
  • Word Lists

Readers

  • Computational Linguistics
  • Technical Research and Report Writing

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation
  • Microelectronics