IMPLEMENTATION OF DOCUMENT FORMAT RECOGNITION

Abstract

The design and construction of an experimental model of an automatic system for the recognition of the page format of foreign and domestic journals are described. The system consisted of a laboratory page reader, a PDP-8 control computer, and a CDC 3200 general-purpose computer. Operationally, such a system would be used in information retrieval and language translation applications. During the initial phases, the format recognition and analysis programs were designed and coded under the assumption that the page reader would be able to recognize characters on the journal page. When it became apparent that the flying spot scanner in the laboratory page reader had insufficient resolution for the simple separation of characters, these programs were revised to permit an investigation of the feasibility of document format recognition without character recognition. While it was possible to perform rudimentary format recognition, such as distinguishing between text and graphics, the system performance was far below that possible with character recognition. It is recommended that further work in this area be directed toward the improvement of the page reader.

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 1966
Accession Number
AD0803633

Entities

People

  • J. Stoddard
  • M. B. Robinson
  • M. Blitz
  • R. Sallen
  • R. Sanders

Organizations

  • Sylvania Electric Products

Tags

DTIC Thesaurus Topics

  • Automatic
  • Character Recognition
  • Computers
  • Construction
  • Domestic
  • Flying Spot Scanners
  • Graphics
  • Information Retrieval
  • Language
  • Language Translation
  • Personality
  • Recognition
  • Scanners
  • Translations

Fields of Study

  • Computer science

Readers

  • Computer Science/Computer Engineering/Data Science/Digital Signal Processing
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation