COMPUTER CLASSIFICATION OF INTELLIGENCE-TYPE DOCUMENTS.
Abstract
A computer classification technique was successfully tested on intelligence-type documents. Results of experiments on technical data bases in the English and German languages are also reported. Because the technique is statistical rather than syntactical, it can classify documents in any language without requiring translation. In addition to the usual tests on sample and control data bases, a successful test was performed on an additional data base that had not been used to generate the classification statistics. The statistical technique is based upon multiple discriminant functions, which can classify into any number of categories; the technique also provides for classification to several levels of detail. A user may select any set of subject categories suiting his need and provide a set of sample documents for each category. A subset of words to form the classification basis is selected from the sample in accordance with their statistical properties. Classification applications are not limited to document retrieval, but may include document routing, screening, or disseminating functions. (Author)
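The workflow the abstract outlines (user-chosen categories with sample documents, statistical selection of a word subset, then one discriminant function per category) can be illustrated with a minimal Python sketch. It is not the report's implementation. The abstract does not give the selection statistic or the form of the discriminant functions, so the sketch assumes a between-category to within-category variance ratio for word selection and linear discriminant functions with a pooled diagonal covariance and equal priors. The function names, the toy documents, and the category labels are all hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Whitespace tokenization only: no syntax or translation is involved.
    return text.lower().split()


def word_freqs(doc, vocab):
    # Relative frequency of each vocabulary word within one document.
    counts = Counter(tokenize(doc))
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]


def select_words(samples, k):
    # ASSUMPTION: rank words by between-category variance divided by
    # within-category variance; the report's actual criterion is not stated.
    words = sorted({w for docs in samples.values() for d in docs for w in tokenize(d)})
    vecs = {c: [word_freqs(d, words) for d in docs] for c, docs in samples.items()}
    scored = []
    for j, w in enumerate(words):
        means = {c: sum(v[j] for v in vs) / len(vs) for c, vs in vecs.items()}
        grand = sum(means.values()) / len(means)
        between = sum((m - grand) ** 2 for m in means.values())
        within = sum((v[j] - means[c]) ** 2 for c, vs in vecs.items() for v in vs)
        scored.append((between / (within + 1e-9), w))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [w for _, w in scored[:k]]


def train(samples, vocab):
    # Category mean vectors plus one pooled variance per selected word.
    vecs = {c: [word_freqs(d, vocab) for d in docs] for c, docs in samples.items()}
    means = {c: [sum(col) / len(vs) for col in zip(*vs)] for c, vs in vecs.items()}
    n_docs = sum(len(vs) for vs in vecs.values())
    dof = max(n_docs - len(vecs), 1)
    var = []
    for j in range(len(vocab)):
        ss = sum((v[j] - means[c][j]) ** 2 for c, vs in vecs.items() for v in vs)
        var.append(ss / dof + 1e-6)  # small floor avoids division by zero
    return means, var


def classify(doc, vocab, means, var):
    # One linear discriminant function per category; the largest value wins.
    x = word_freqs(doc, vocab)

    def score(mu):
        return sum(m * xi / s - 0.5 * m * m / s for m, xi, s in zip(mu, x, var))

    return max(sorted(means), key=lambda c: score(means[c]))


# Hypothetical sample documents for two user-chosen categories.
samples = {
    "aviation": ["the aircraft flew over the runway",
                 "pilot landed the aircraft on the runway"],
    "naval": ["the ship left the harbor",
              "sailors docked the ship in the harbor"],
}
vocab = select_words(samples, 4)
means, var = train(samples, vocab)
print(vocab)
print(classify("an aircraft approached the runway", vocab, means, var))
```

Because the sketch relies only on word counts, the same code could be run on German sample documents without modification, which mirrors the language-independence claim in the abstract. Adding a category requires only another entry in `samples`.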
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 1967
- Accession Number: AD0820801
Entities
People
- John H. Williams Jr.
- Mathew P. Perriens
Organizations
- International Business Machines Corporation (Armonk, NY)