PROGRAM DOCUMENTATION FOR MARK I STATISTICAL ASSOCIATION PROCEDURES FOR MESSAGE CONTENT ANALYSIS

Abstract

A statistical method for automatic document retrieval and message content analysis is described. The method involves building a matrix for the corpus based on word co-occurrences within sentences; this matrix is then normalized in order to eliminate what are considered to be extraneous factors. The normalized matrix is used by the retrieval algorithm to expand a set of query terms to include terms associated with them; the new set, in turn, is used to select documents from the corpus. All of these operations are performed on an IBM 7090 computer. This report gives a detailed description of the computer programs involved. (Author)
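
The pipeline outlined in the abstract (sentence-level co-occurrence matrix, normalization, query expansion, document selection) can be illustrated with a minimal modern sketch. This is not the report's IBM 7090 program. The abstract does not give the normalization formula, so the sketch assumes a cosine-style scaling by term frequency, and the function names, the threshold parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def build_cooccurrence(corpus):
    """Count how often each pair of terms shares a sentence.

    corpus maps a document id to a list of sentences; each sentence
    is a list of terms.
    """
    pair_counts = Counter()
    term_counts = Counter()
    for sentences in corpus.values():
        for sentence in sentences:
            terms = sorted(set(sentence))  # each term counted once per sentence
            term_counts.update(terms)
            pair_counts.update(combinations(terms, 2))
    return pair_counts, term_counts


def normalize(pair_counts, term_counts):
    """ASSUMPTION: divide each count by sqrt(f_a * f_b) to damp the effect
    of raw term frequency. The report's actual normalization is not stated
    in the abstract."""
    return {
        (a, b): n / sqrt(term_counts[a] * term_counts[b])
        for (a, b), n in pair_counts.items()
    }


def expand_query(query, assoc, threshold):
    """Add every term whose association with a query term meets the
    (hypothetical) threshold."""
    expanded = set(query)
    for (a, b), score in assoc.items():
        if score >= threshold:
            if a in query:
                expanded.add(b)
            if b in query:
                expanded.add(a)
    return expanded


def select_documents(corpus, terms):
    """Return ids of documents containing at least one of the terms."""
    return sorted(
        doc_id
        for doc_id, sentences in corpus.items()
        if any(terms & set(sentence) for sentence in sentences)
    )


# Toy corpus, invented for illustration only.
corpus = {
    "d1": [["radar", "signal"], ["radar", "antenna"]],
    "d2": [["signal", "noise"]],
    "d3": [["antenna", "mast"]],
    "d4": [["budget", "report"]],
}
pair_counts, term_counts = build_cooccurrence(corpus)
assoc = normalize(pair_counts, term_counts)
expanded = expand_query({"radar"}, assoc, threshold=0.4)
selected = select_documents(corpus, expanded)
```

In this toy run the single query term "radar" picks up "signal" and "antenna" through co-occurrence, so documents d2 and d3 are selected even though neither contains "radar" itself.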

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1963
Accession Number
AD0427004

Entities

People

  • J. B. H. Baker
  • J. Spiegel
  • R. Vicksell

Organizations

  • MITRE Corporation

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Automatic
  • Computer Program Documentation
  • Computer Programs
  • Computers
  • Computing Devices

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Systems Analysis and Design