AUTOMATIC CLASSIFICATION FOR THE ASTIA MATHEMATICS COLLECTION.
Abstract
Information Retrieval of documents based on a series of word descriptors which characterize each document requires the formation of a search description. The user of the library can be greatly aided in the search if he knows which descriptors do not occur together in any document in the file and which descriptors have some relatively high probability of co-occurring within a description. This thesis investigates this absence and probable presence of co-occurrence between pairs of descriptors in the approximately 25,000-document Mathematics section of the ASTIA library. This absence and probable presence is revealed by an Exclusive and an Inclusive Stratification program whose results are presented as a two-level descriptor classification system. The results of this classification are measured by a figure which states the probability that the file contains documents each of which is formed by choosing any two descriptors from an Inclusive Group. Another measure presented is the reduction in the number of possible descriptions of any length. (Author)
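The pairwise analysis described in the abstract can be illustrated with a minimal sketch. This is not the thesis's Exclusive or Inclusive Stratification program, whose details the abstract does not give. The function names, the toy documents, the threshold, and the co-occurrence measure (documents containing both descriptors divided by documents containing either) are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered descriptor pair, the documents containing both."""
    counts = Counter()
    for descriptors in documents:
        for pair in combinations(sorted(set(descriptors)), 2):
            counts[pair] += 1
    return counts


def split_pairs(documents, threshold):
    """Return (never_cooccur, likely_cooccur) lists of descriptor pairs.

    The 'likely' test is an assumed measure, not the thesis's: the share of
    documents containing both descriptors among those containing either one.
    """
    counts = cooccurrence_counts(documents)
    doc_freq = Counter(d for descriptors in documents for d in set(descriptors))
    never, likely = [], []
    for a, b in combinations(sorted(doc_freq), 2):
        both = counts[(a, b)]
        if both == 0:
            never.append((a, b))
        else:
            either = doc_freq[a] + doc_freq[b] - both
            if both / either >= threshold:
                likely.append((a, b))
    return never, likely


# Hypothetical toy file of four documents, each a set of descriptors.
documents = [
    {"algebra", "matrices"},
    {"algebra", "matrices", "groups"},
    {"probability", "statistics"},
    {"probability", "statistics", "matrices"},
]
never, likely = split_pairs(documents, threshold=0.5)
print("never co-occur:", never)
print("likely co-occur:", likely)
```

A searcher could use the first list to rule out descriptor combinations that match no document, and the second to suggest descriptors that tend to accompany one already chosen.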
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 1964
- Accession Number: AD0610313
Entities
People
- Barry Zimmerman
Organizations
- University of Pennsylvania