Regeneration of Information Rather Than Information Retrieval. 'Concept Creation Method'.

Abstract

Several approaches were considered for the isolation and/or selection of 'concepts' from documents. Two of these approaches were adopted. The first involved the computation of association factors for word adjacency, sentence association, and paragraph cluster. These factors were based upon the number of joint appearances compared to the number of individual appearances of the words involved. The second approach involved the formation of statistical phrases: strings of important words bounded on the ends by punctuation or by words from the unimportant word list. Using two of the associative measures, sentence association and paragraph cluster, attempts at retrieval were made along the lines of the earlier manual extracts and 'concepts'. Section two describes an independent experiment on new materials as a check on the effectiveness of the computer programs. The tests indicate that automatic extraction is feasible; however, the programs need more refinement. In particular, the PHRASES-SEARCH system needs improvement to cope with natural language input.
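
The two adopted approaches can be illustrated with a minimal Python sketch. It is not the report's program: the abstract does not give the exact association formula, so the ratio used here (joint appearances divided by the number of sentences containing either word) is an assumption, and the names UNIMPORTANT, statistical_phrases, and sentence_association, as well as the short word list, are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for the report's unimportant word list (not given in the abstract).
UNIMPORTANT = {"the", "a", "an", "of", "and", "is", "are", "in", "to", "by", "for", "from"}


def statistical_phrases(text, unimportant=UNIMPORTANT):
    """Return strings of important words bounded by punctuation or unimportant words."""
    phrases = []
    current = []
    # Each token is either a run of letters or a single non-letter, non-space character.
    for token in re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text.lower()):
        if token.isalpha() and token not in unimportant:
            current.append(token)
        else:
            # Punctuation or an unimportant word closes the current phrase.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


def sentence_association(sentences, unimportant=UNIMPORTANT):
    """Assumed factor for a word pair: joint / (n_a + n_b - joint), counted per sentence."""
    single = Counter()  # number of sentences in which each word appears
    joint = Counter()   # number of sentences in which both words of a pair appear
    for sentence in sentences:
        words = sorted({w for w in re.findall(r"[a-z]+", sentence.lower())
                        if w not in unimportant})
        single.update(words)
        joint.update(combinations(words, 2))
    return {
        pair: n / (single[pair[0]] + single[pair[1]] - n)
        for pair, n in joint.items()
    }


if __name__ == "__main__":
    print(statistical_phrases(
        "Automatic extraction of concepts is feasible, and retrieval tests follow."))
    factors = sentence_association([
        "information retrieval systems",
        "information retrieval",
        "information systems design",
    ])
    for pair, value in sorted(factors.items()):
        print(pair, round(value, 3))
```

Under the same assumption, the paragraph cluster measure could be obtained by passing paragraphs instead of sentences as the counting unit, and word adjacency by counting only neighbouring word pairs.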

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1974
Accession Number
ADA003617

Entities

People

  • Claude E. Stanley Jr.
  • Craig Prentiss
  • Jack Belzer

Organizations

  • University of Pittsburgh

Tags

DTIC Thesaurus Topics

  • Automatic
  • Computations
  • Computer Programs
  • Computers
  • Extraction
  • Information Retrieval
  • Language
  • Materials
  • Natural Languages
  • Word Lists

Readers

  • Computational Linguistics
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Information Retrieval