RESULTS OF CLASSIFYING DOCUMENTS WITH MULTIPLE DISCRIMINANT FUNCTIONS.
Abstract
An important but frequently underemphasized step in the classification process is the selection of attributes. In classification problems with mutually exclusive assignment, a set of attributes is selected to represent each category. For information retrieval applications, the assumption of mutually exclusive categories may not hold, so the problem of selecting measurable attributes to represent the categories becomes more acute. Discriminant analysis appears to offer a solution not only to the attribute selection problem but also to the document relevance problem. In the selection phase, it provides a method of selecting the set of attributes whose ratio of among-category variance to within-category variance is largest. In the actual classification process, a distance measure can then be employed to determine the degree of relevance of a given document with respect to each category. Classification experiments were conducted on 794 Solid State abstracts. Classification accuracies of up to 90 percent were achieved using the discriminant procedures.
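The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the report's procedure: it scores each attribute separately by its among-category to within-category sum-of-squares ratio (a univariate simplification of multiple discriminant analysis) and uses squared Euclidean distance to category centroids as a stand-in for the report's distance measure, which the abstract does not specify. The toy documents, category names, attribute names, and function names are all hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (category, {attribute: count}) pairs.
DOCS = [
    ("semiconductors", {"diode": 4, "lattice": 0, "the": 9}),
    ("semiconductors", {"diode": 5, "lattice": 1, "the": 10}),
    ("crystallography", {"diode": 0, "lattice": 6, "the": 10}),
    ("crystallography", {"diode": 1, "lattice": 5, "the": 9}),
]

def variance_ratio(docs, attr):
    """Among-category / within-category sum of squares for one attribute."""
    groups = {}
    for category, features in docs:
        groups.setdefault(category, []).append(features[attr])
    grand_mean = mean(v for values in groups.values() for v in values)
    among = sum(len(values) * (mean(values) - grand_mean) ** 2
                for values in groups.values())
    within = sum((v - mean(values)) ** 2
                 for values in groups.values() for v in values)
    return among / within if within > 0 else float("inf")

def select_attributes(docs, k):
    """Selection phase: keep the k attributes with the largest ratio."""
    attrs = docs[0][1].keys()
    return sorted(attrs, key=lambda a: variance_ratio(docs, a), reverse=True)[:k]

def centroids(docs, attrs):
    """Mean value of each selected attribute within each category."""
    groups = {}
    for category, features in docs:
        groups.setdefault(category, []).append(features)
    return {
        category: {a: mean(f[a] for f in members) for a in attrs}
        for category, members in groups.items()
    }

def distances(document, cents):
    """Classification phase: squared distance from a document to each centroid.

    A smaller distance is read as a higher degree of relevance, so a document
    can be close to more than one category at once.
    """
    return {
        category: sum((document[a] - value) ** 2 for a, value in center.items())
        for category, center in cents.items()
    }

selected = select_attributes(DOCS, k=2)
cents = centroids(DOCS, selected)
new_doc = {"diode": 3, "lattice": 1, "the": 12}
scores = distances(new_doc, cents)
best = min(scores, key=scores.get)
```

Because the output is a distance to every category rather than a single label, the same scores could be thresholded to assign a document to several categories, which matches the abstract's point that categories need not be mutually exclusive.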
Document Details
- Document Type: Technical Report
- Publication Date: Mar 15, 1965
- Accession Number: AD0612272
Entities
People
- J. H. Williams
Organizations
- International Business Machines Corporation (Armonk, NY)