AUTOMATIC CLASSIFICATION FOR THE ASTIA MATHEMATICS COLLECTION.

Abstract

Information Retrieval of documents based on a series of word descriptors which characterize each document requires the formation of a search description. The user of the library can be greatly aided in the search if he knows which descriptors do not occur together in any document in the file and which descriptors have some relatively high probability of co-occurring within a description. This thesis investigates this absence and probable presence of co-occurrence between pairs of descriptors in the approximately 25,000-document Mathematics section of the ASTIA library. This absence and probable presence is revealed by an Exclusive and an Inclusive Stratification program whose results are presented as a two-level descriptor classification system. The results of this classification are measured by a figure which states the probability that the file contains documents each of which is formed by choosing any two descriptors from an Inclusive Group. Another measure presented is the reduction in the number of possible descriptions of any length. (Author)
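The pairwise co-occurrence idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's Exclusive or Inclusive Stratification program, whose details the abstract does not give. It only shows how descriptor pairs might be separated into those that never co-occur and those that do, assuming each document is represented as a set of descriptors. The function names and the toy file below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_cooccurrence(documents):
    """Count, for each descriptor pair, the number of documents containing both."""
    counts = Counter()
    for descriptors in documents:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(descriptors)), 2):
            counts[pair] += 1
    return counts


def split_pairs(documents):
    """Return (pairs that never co-occur, {pair: co-occurrence count})."""
    vocabulary = sorted({d for doc in documents for d in doc})
    counts = pair_cooccurrence(documents)
    all_pairs = list(combinations(vocabulary, 2))
    never = [p for p in all_pairs if counts[p] == 0]
    observed = {p: counts[p] for p in all_pairs if counts[p] > 0}
    return never, observed


# Hypothetical toy file of three documents (not data from the ASTIA collection).
toy_file = [
    {"probability", "statistics"},
    {"probability", "statistics", "sampling"},
    {"topology", "algebra"},
]

never, observed = split_pairs(toy_file)
for pair, n in sorted(observed.items()):
    print(pair, n)
print(len(never), "pairs never co-occur")
```

In this sketch a searcher could skip any query that combines a pair listed in `never`, since no document in the file carries both descriptors.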

Document Details

Document Type
Technical Report
Publication Date
May 01, 1964
Accession Number
AD0610313

Entities

People

  • Barry Zimmerman

Organizations

  • University of Pennsylvania

Tags

DTIC Thesaurus Topics

  • Automatic
  • Classification
  • Information Retrieval
  • Mathematics
  • Probability
  • Stratification

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Library and Information Science
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Information Retrieval