PRINT READER OUTPUT CORRECTION STUDY.

Abstract

This study resulted in the development of a basic model for assisting an optical character recognition unit in deciding character identity by context-dependent factors. A programmed method was designed to introduce controlled errors into a textual data base to simulate the characteristics of output from an optical character recognition device. Two error types, rejects and best guesses, were created in the alphabetic words of this English textual material. A basic model was designed, implemented, and evaluated for effectively correcting these errors. Correction techniques used in this model are based not on full dictionary lookup but on n-gram occurrence lists, common word dictionaries, environmental dictionaries, and character confusion tables. (Author)
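
The approach the abstract describes can be illustrated with a minimal Python sketch. It is not the report's implementation: the confusion pairs, the `*` reject marker, the error rates, the use of trigrams, and every function name below are assumptions chosen for illustration, and the environmental dictionaries are left out. The sketch injects rejects and best-guess substitutions into lowercase words, then corrects a word by generating candidates from a confusion table and accepting one found in a common word list or, failing that, one whose trigrams all appear in an occurrence list.

```python
import random
import string
from itertools import product

# Hypothetical confusion table: true character -> character the reader might guess.
CONFUSIONS = {"o": "c", "c": "o", "l": "i", "i": "l", "e": "c", "h": "b", "b": "h"}
REJECT = "*"  # hypothetical marker for a character the reader could not classify

# Reverse view: observed character -> set of characters it may really have been.
REVERSE = {}
for true_ch, seen_ch in CONFUSIONS.items():
    REVERSE.setdefault(seen_ch, set()).add(true_ch)


def inject_errors(word, rng, p_reject=0.05, p_guess=0.05):
    """Simulate reader output: each letter may become a reject or a wrong best guess."""
    out = []
    for ch in word:
        r = rng.random()
        if r < p_reject:
            out.append(REJECT)
        elif r < p_reject + p_guess and ch in CONFUSIONS:
            out.append(CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)


def trigrams(word):
    """Return the set of character trigrams of a word, padded with spaces."""
    padded = f" {word} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_trigram_list(words):
    """Collect every trigram that occurs in a sample of known-good words."""
    seen = set()
    for w in words:
        seen |= trigrams(w)
    return seen


def options(ch):
    """Characters that an observed character could stand for."""
    if ch == REJECT:
        return list(string.ascii_lowercase)
    return sorted({ch} | REVERSE.get(ch, set()))


def correct(word, common_words, trigram_list):
    """Pick the closest candidate that is a common word, else one with legal trigrams."""
    if REJECT not in word and word in common_words:
        return word
    candidates = ["".join(p) for p in product(*(options(ch) for ch in word))]

    def cost(cand):
        # Fewest changed positions first; alphabetical order breaks ties.
        return (sum(a != b for a, b in zip(cand, word)), cand)

    in_dictionary = [c for c in candidates if c in common_words]
    if in_dictionary:
        return min(in_dictionary, key=cost)
    legal = [c for c in candidates if trigrams(c) <= trigram_list]
    if legal:
        return min(legal, key=cost)
    return word  # nothing plausible found; leave the word as read


if __name__ == "__main__":
    common = {"the", "clock", "table"}
    tri = build_trigram_list(common | {"then"})
    rng = random.Random(0)
    noisy = [inject_errors(w, rng, 0.2, 0.2) for w in ["the", "clock", "table"]]
    print(noisy, [correct(w, common, tri) for w in noisy])
```

Enumerating every candidate is workable only for short words with few rejects, since each reject multiplies the candidate count by 26; a practical system would need to prune that search.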

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 1967
Accession Number
AD0819851

Entities

People

  • Jay Ogg
  • Kenneth A. Garrison
  • Marihelen Jones
  • Nicholas Jacobs

Organizations

  • International Business Machines Corporation (Armonk, NY)

Tags

DTIC Thesaurus Topics

  • Character Recognition
  • Databases
  • Dictionaries
  • Identification
  • Identities
  • Materials
  • Neurobehavioral Manifestations
  • Optical Character Recognition
  • Pattern Recognition
  • Personality
  • Recognition

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Computer Science/Computer Engineering/Data Science/Digital Signal Processing