Automatic Typeset Input Technique Evaluation

Abstract

The need to design multi-font print readers is becoming critical for the input conversion activities of the Air Force Foreign Technology Division Machine Translation facilities. This effort evaluated existing optical character recognition (OCR) capabilities against the full requirements of a Russian typeset print reader. Four pages of original scientific Russian text were used as the data base. The contractor demonstrated that the scanning and conversion of Russian text by OCR is feasible and potentially economical. A second objective was to identify the problem areas and compile solutions to them. Analysis of the scans of the four samples indicates that secondary recognition and increased digital resolution will be most effective in reducing the total error rate. An ultimate total error rate of less than 0.5% appears achievable. (Author)

Document Details

Document Type: Technical Report
Publication Date: Dec 01, 1971
Accession Number: AD0739873

Entities

People

  • A. Schapira
  • David Bantz
  • W. Clarkson

Tags

DTIC Thesaurus Topics

  • Air Force
  • Automatic
  • Buildings And Structures
  • Character Recognition
  • Contractors
  • Conversion
  • Databases
  • Foreign Technology
  • Identification
  • Machine Translation
  • Optical Character Recognition
  • Pattern Recognition
  • Personality
  • Recognition
  • Scanning
  • Test And Evaluation

Readers

  • Computer Programming and Software Development
  • Computer Science/Computer Engineering/Data Science/Digital Signal Processing
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation