BREAKING THE COST BARRIER IN AUTOMATIC CLASSIFICATION

Abstract

A low-cost automatic classification method is reported that uses computer time in proportion to N log N, where N is the number of information items and the base of the logarithm is a parameter. Some barriers besides cost are treated briefly in the opening section, including types of intellectual resistance to the idea of doing classification by content-word similarity. The second section explains the basic processes of document grouping by similarity, and discusses the advantages of the reported method over methods commonly experimented with. The operation of an iterative procedure using word profiles to progressively improve the grouping of content-word lists is described. Then some possible applications aside from document classification are enumerated. The final section begins by presenting theoretical underpinnings that explain the form taken by the components of the method. An account of the struggle to make the method work is sketched, followed by a cycle-by-cycle description of a feasibility demonstration. The conclusion states that mere cheapness is not enough and analyzes what researchers and developers might have to do before user acceptance of automatic classification can be assured.
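The abstract gives no algorithmic detail, so the following is only a rough illustration of what an iterative, profile-based regrouping of content-word lists might look like. It is a generic reassignment loop, not the report's method: the function names, the scoring rule, and the stopping rule are all assumptions made for this sketch, and it makes no attempt to reproduce the N log N cost behavior.

```python
from collections import Counter

# Hypothetical sketch: every name and the scoring rule below are assumptions
# for illustration, not taken from the report.

def build_profiles(docs, assignment, n_groups):
    """Build one word profile per group: for each word, the number of
    documents in that group whose content-word list contains it."""
    profiles = [Counter() for _ in range(n_groups)]
    for words, group in zip(docs, assignment):
        profiles[group].update(set(words))
    return profiles

def score(words, profile):
    """Share of the profile's total weight carried by the document's words."""
    total = sum(profile.values())
    if total == 0:
        return 0.0
    return sum(profile[w] for w in set(words)) / total

def regroup(docs, assignment, n_groups, max_cycles=10):
    """Repeat: rebuild profiles, then move each document to its
    best-scoring group. Stop when a cycle changes nothing."""
    assignment = list(assignment)
    for _ in range(max_cycles):
        profiles = build_profiles(docs, assignment, n_groups)
        new_assignment = [
            max(range(n_groups), key=lambda g: score(words, profiles[g]))
            for words in docs
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment

if __name__ == "__main__":
    docs = [["cat", "dog"], ["dog", "pet"], ["stock", "bond"], ["bond", "market"]]
    print(regroup(docs, [0, 0, 0, 1], n_groups=2))
```

Under these assumptions, each cycle costs time proportional to the number of documents times the number of groups. A loop of this kind can also stall on a poor starting assignment, which fits the abstract's remark that making the method work was a struggle.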


Document Details

Document Type: Technical Report
Publication Date: Jul 01, 1966
Accession Number: AD0636837

Entities

People

L. B. Doyle

Organizations

System Development Corporation
