RESULTS OF CLASSIFYING DOCUMENTS WITH MULTIPLE DISCRIMINANT FUNCTIONS.

Abstract

An important but frequently underemphasized step in the classification process is the selection of attributes. In classification problems with mutually exclusive assignment, a set of attributes is selected to represent each category. For information retrieval applications, the assumption of mutually exclusive categories may not hold, so the problem of selecting measurable attributes to represent the categories becomes more acute. Discriminant analysis appears to offer a solution not only to the attribute-selection problem but also to the document relevance problem. In the selection phase, it provides a method of selecting the set of attributes whose ratio of among-category variance to within-category variance is largest. In the actual classification process, a distance measure can then be employed to determine the degree of relevance of a given document with respect to each category. Classification experiments were conducted on 794 Solid State abstracts. Classification accuracies of up to 90 percent were achieved using the discriminant procedures.
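
The two phases described in the abstract can be illustrated with a minimal sketch. It is not the report's procedure: it scores each attribute on its own by the ratio of among-category to within-category sum of squares (full discriminant analysis maximizes this ratio over linear combinations of attributes), and it assumes Euclidean distance to category centroids because the abstract does not name the distance measure. All function names and the toy term counts are hypothetical.

```python
from collections import defaultdict


def variance_ratio(values, labels):
    """Among-category / within-category sum of squares for one attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    grand_mean = sum(values) / len(values)
    among = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2
                 for g in groups.values() for v in g)
    return among / within if within > 0 else float("inf")


def select_attributes(rows, labels, k):
    """Indices of the k attributes with the largest variance ratio."""
    n_attrs = len(rows[0])
    scores = [variance_ratio([r[j] for r in rows], labels)
              for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)[:k]


def centroids(rows, labels, attrs):
    """Mean vector of each category over the selected attributes."""
    sums, counts = {}, defaultdict(int)
    for r, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(attrs))
        for i, j in enumerate(attrs):
            acc[i] += r[j]
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def distances(row, cents, attrs):
    """Euclidean distance (an assumption) from one document to each centroid."""
    return {y: sum((row[j] - c[i]) ** 2 for i, j in enumerate(attrs)) ** 0.5
            for y, c in cents.items()}


# Hypothetical toy data: four documents, three term-count attributes.
rows = [[5, 1, 2], [6, 0, 2], [1, 1, 2], [0, 0, 3]]
labels = ["A", "A", "B", "B"]

attrs = select_attributes(rows, labels, k=2)
cents = centroids(rows, labels, attrs)
dists = distances([4, 1, 2], cents, attrs)
nearest = min(dists, key=dists.get)
```

Returning a distance for every category, rather than a single label, matches the abstract's point that categories need not be mutually exclusive: a document can be treated as relevant to each category whose distance falls under a chosen threshold.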

Document Details

Document Type
Technical Report
Publication Date
Mar 15, 1965
Accession Number
AD0612272

Entities

People

  • J. H. Williams

Organizations

  • International Business Machines Corporation (Armonk, NY)

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Accuracy
  • Classification
  • Computing-Related Activities
  • Data Science
  • Discriminant Analysis
  • Information Retrieval
  • Information Science
  • Mathematics
  • Standards

Readers

  • Regression Analysis
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms