MAXIMUM-DEPTH INDEXING FOR COMPUTER RETRIEVAL OF ENGLISH LANGUAGE DATA

Abstract

One of the simplest and yet most powerful methods for organizing natural language data for deep retrieval is the complete index. Such an index completely characterizes a text corpus for computer retrieval operations without prohibitive cost in space or time. A general-purpose indexer programmed for the IBM 7090 is described and discussed. This system, starting with unedited English text, produces an index of all or any subset of words in that text. For each word indexed, the volume, chapter, paragraph and sentence numbers of each of its occurrences in the text are cited. Words with the same root, such as farmer and farming, are cross-referenced to each other. Words that are almost precisely synonymous, such as Britain and England, are also cross-referenced. Uses of the index for finding information relevant to answering English questions are briefly described. (Author)
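The indexing scheme summarized in the abstract can be illustrated with a minimal modern sketch. This is not the IBM 7090 program from the report: the function names (`build_index`, `crude_root`, `cross_references`), the nested-dictionary corpus layout, the suffix list, and the sentence-splitting rule are all assumptions made for illustration, since the abstract does not give the report's actual rules for root analysis or synonym detection.

```python
import re
from collections import defaultdict

# Hypothetical suffix list; the report's real root-finding rules are not stated in the abstract.
SUFFIXES = ("ing", "ers", "er", "s")


def crude_root(word):
    """Strip one suffix from the assumed list; a stand-in for real root analysis."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(corpus):
    """Map each word to its (volume, chapter, paragraph, sentence) occurrences.

    corpus is assumed to look like {volume: {chapter: [paragraph_text, ...]}}.
    """
    index = defaultdict(list)
    for volume, chapters in corpus.items():
        for chapter, paragraphs in chapters.items():
            for p_num, paragraph in enumerate(paragraphs, start=1):
                # Naive sentence split on whitespace after ., ! or ? (an assumption).
                sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
                for s_num, sentence in enumerate(sentences, start=1):
                    for word in re.findall(r"[a-z]+", sentence.lower()):
                        index[word].append((volume, chapter, p_num, s_num))
    return dict(index)


def cross_references(index, synonyms=()):
    """Link words sharing a crude root, plus caller-supplied synonym pairs."""
    groups = defaultdict(set)
    for word in index:
        groups[crude_root(word)].add(word)
    refs = {word: set() for word in index}
    for members in groups.values():
        for word in members:
            refs[word] |= members - {word}
    for a, b in synonyms:
        if a in index and b in index:
            refs[a].add(b)
            refs[b].add(a)
    return refs


corpus = {1: {1: ["The farmer left Britain. Farming is hard.", "England has farms."]}}
index = build_index(corpus)
refs = cross_references(index, synonyms=[("britain", "england")])
```

Here the synonym pairs are supplied by hand; how the original system identified near-synonyms is not described in the abstract. A lookup such as `index["farming"]` returns the list of location tuples for that word, and `refs["farmer"]` gives the words cross-referenced to it.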

Document Details

Document Type
Technical Report
Publication Date
Apr 10, 1962
Accession Number
AD0275814

Entities

People

  • Keren L. McConlogue
  • Robert F. Simmons

Organizations

  • System Development Corporation

Tags

DTIC Thesaurus Topics

  • English Language
  • Language
  • Natural Languages

Readers

  • Artificial Intelligence
  • Library and Information Science
  • Systems Analysis and Design

Technology Areas

  • Space