CIRC II Data Base Classification

Abstract

This report describes the development of a classification system for the CIRC II Data Base. Ninety-eight CIRC II classes are designed to partition the documents of this data base. The software which assigns these classes to incoming documents utilizes a sequential classification algorithm. In this approach, only as much of each document is read as is needed to accurately assign one or more classes, together with a confidence probability for each assigned class. In this way, a compromise is obtained between efficiency and accuracy. A number of parameters are available in this software to effect this trade-off. Additional software has been developed to analyze sample documents to define the CIRC II classes, producing keywords and frequency distributions over the classes. This software provides flexibility for the classification system, as a class can be added or deleted, a class can be modified by submitting additional documents, or the keyword selection criterion can be altered. A number of experiments were conducted using this classification system on CIRC II documents. It was shown that satisfactory classification could be achieved, and that a stable set of keywords and frequency distributions could be obtained. (Author)
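
The sequential approach summarized in the abstract can be illustrated with a minimal sketch. This is not the report's software: the Bayesian update rule, the function name `classify_sequential`, and the parameters `threshold`, `report_min`, and `floor` are all assumptions chosen for illustration. The sketch reads a document word by word, updates a probability for each class from per-keyword frequency distributions, and stops as soon as one class is confident enough.

```python
import math

def classify_sequential(words, keyword_probs, priors,
                        threshold=0.95, report_min=0.2, floor=1e-6):
    """Hypothetical sketch: read words until one class reaches `threshold`.

    keyword_probs maps keyword -> {class: P(keyword | class)} (assumed form).
    Returns ([(class, confidence), ...], number_of_words_read).
    """
    classes = list(priors)
    log_post = {c: math.log(priors[c]) for c in classes}
    posterior = dict(priors)
    words_read = 0
    for word in words:
        words_read += 1
        if word not in keyword_probs:
            continue  # non-keywords carry no evidence in this sketch
        for c in classes:
            # floor avoids log(0) for classes that never saw this keyword
            log_post[c] += math.log(keyword_probs[word].get(c, floor))
        # Normalize in log space for numerical stability.
        peak = max(log_post.values())
        total = sum(math.exp(v - peak) for v in log_post.values())
        posterior = {c: math.exp(log_post[c] - peak) / total for c in classes}
        if max(posterior.values()) >= threshold:
            break  # confident enough: stop reading the document early
    assigned = sorted(((c, p) for c, p in posterior.items() if p >= report_min),
                      key=lambda cp: cp[1], reverse=True)
    return assigned, words_read

# Toy data, invented for illustration only.
keyword_probs = {
    "orbit": {"space": 0.9, "chemistry": 0.1},
    "acid": {"space": 0.1, "chemistry": 0.9},
}
priors = {"space": 0.5, "chemistry": 0.5}
assigned, words_read = classify_sequential(
    ["the", "orbit", "orbit", "acid", "orbit"], keyword_probs, priors)
print(assigned, words_read)
```

In this sketch, raising `threshold` makes the classifier read more of each document before committing (more accurate, less efficient), while lowering it does the opposite, which is one way the efficiency and accuracy compromise described above might be exposed as a parameter.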

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 1977
Accession Number
ADA042268

Entities

People

  • Anthony E. Petrarca
  • Barry J. Brinkman
  • Laurel G. Crawford
  • Lee J. White
  • Sanjay Mittal

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Biomedical
  • C4I
  • Energy and Power Technologies
  • Human Systems
  • Space
  • Weapons Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Satellites
  • Chemistry
  • Computer Programming
  • Computer Programs
  • Computers
  • Construction
  • Databases
  • Economics
  • Engineers
  • Fluid Mechanics
  • Information Science
  • Materials
  • Materials Science
  • Medical Personnel
  • Physics Laboratories
  • Soil Science

Fields of Study

  • Computer science
  • Engineering

Readers

  • Library and Information Science
  • Neural Network Machine Learning
  • Software Engineering