SOME THEORETICAL ASPECTS OF THE IMPROVEMENT OF DOCUMENT SCREENING BY ASSOCIATIVE TRANSFORMATIONS.

Abstract

With respect to document storage and retrieval, one can think of associative techniques as either (1) those which improve the file, or (2) those which improve the search query. The objective of both is to improve the search outcome. In this study, techniques which improve the file have been considered. The file to which they can be applied is one expressible as a matrix of zeros and ones. Two kinds of files have been considered. In the first kind, it is possible to define a 'correct' indexing, and in the second kind, whether an index term should or should not be selected is partly a matter of opinion. For the first kind of file, mathematical expressions have been found for the amount that can be gained by using associative techniques. An experiment was performed on a small Patent Office file to simulate the second situation. Actual indexings of zero or one were replaced by a weighted average of three associative estimates of the relatedness of the term to the document. Retrieval was shown to be improved in the sense that fewer documents had to be examined in order to discover documents which had been judged relevant in previous Patent Office searches. Greater inconsistency of indexing was introduced into the file by randomization, and the searching experiment was repeated. Again, the associative techniques proved to be superior. (Author)
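
The file transformation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which three associative estimates the report uses or how they are weighted, so everything here is an assumption: the sketch substitutes a single hypothetical co-occurrence estimate, blends it with the actual zero-or-one indexing using a hypothetical weight (weight_actual), and uses an invented four-document file. It is not the authors' method, only an example of replacing binary indexings with graded scores.

```python
# Hypothetical file: rows are documents, columns are index terms (1 = indexed).
file_matrix = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],  # indexed with term 0 only; term 1 may have been missed
    [0, 0, 1],
]


def term_associations(matrix):
    """assoc[i][j] = share of documents indexed with term i that also carry term j.

    This co-occurrence measure is an assumption, standing in for the
    report's unspecified associative estimates.
    """
    n_terms = len(matrix[0])
    assoc = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        docs_with_i = [row for row in matrix if row[i] == 1]
        if not docs_with_i:
            continue
        for j in range(n_terms):
            assoc[i][j] = sum(row[j] for row in docs_with_i) / len(docs_with_i)
    return assoc


def associative_transform(matrix, weight_actual=0.5):
    """Replace each 0/1 indexing with a blend of itself and an associative estimate."""
    assoc = term_associations(matrix)
    n_terms = len(matrix[0])
    transformed = []
    for row in matrix:
        new_row = []
        for j in range(n_terms):
            # Estimate term j from the other terms actually indexed on this document.
            others = [i for i in range(n_terms) if i != j and row[i] == 1]
            if others:
                estimate = sum(assoc[i][j] for i in others) / len(others)
            else:
                estimate = 0.0
            new_row.append(weight_actual * row[j] + (1 - weight_actual) * estimate)
        transformed.append(new_row)
    return transformed


transformed = associative_transform(file_matrix)

# Screening order for a one-term query on term 1: highest score first.
ranking = sorted(range(len(transformed)), key=lambda d: -transformed[d][1])
```

In this toy file, the third document is not indexed with term 1 but shares term 0 with documents that are, so it receives a nonzero score for term 1 and is examined before the fourth document, which shares no terms with them. This mirrors the abstract's criterion that fewer documents need to be examined to reach the relevant ones.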

Document Details

Document Type
Technical Report
Publication Date
Nov 30, 1965
Accession Number
AD0628191

Entities

People

  • Donald T. Searls
  • Edward C. Bryant
  • Robert H. Shumway

Organizations

  • Westat

Tags

DTIC Thesaurus Topics

  • Directories
  • Index Terms
  • Indexes
  • Patent Office

Readers

  • Computational Linguistics
  • Computer Science
  • Regression Analysis