Handwriting Identification, Matching, and Indexing in Noisy Document Images

Abstract

Throughout history, handwriting has been the primary means of recording information that is preserved across both time and space. With the coming of the electronic document era, we are challenged with making an enormous number of handwritten documents available for electronic access. Though many handwritten documents contain only handwriting, a growing number mix handwriting with printed text, noise, and background patterns. This mixture of handwriting with other components presents a great challenge for making an original document electronically accessible. Many handwritten documents include a special background pattern, rule lines, which are printed on the paper to guide writing. After digitization, rule lines touch the text and cause problems for further document image analysis if they are not detected and removed. In this dissertation, we present a rule line detection algorithm based on hidden Markov model (HMM) decoding, achieving both high detection accuracy and a low false alarm rate. After detection, line removal is performed by line width thresholding.
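
The two steps named in the abstract, HMM decoding followed by line width thresholding, can be illustrated with a small sketch. This is not the dissertation's algorithm: the two-state model (background row vs. rule-line row), the binary per-row observation, all probabilities, and the names `viterbi`, `detect_rule_rows`, and `remove_thin_lines` are assumptions chosen for illustration only. The image is assumed to be a binary grid with 1 for ink and 0 for paper, and rule lines are assumed to be horizontal.

```python
import math

def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely state sequence for a discrete-observation HMM."""
    n_states = len(log_start)
    score = [log_start[s] + log_emit[s][obs[0]] for s in range(n_states)]
    back = []
    for o in obs[1:]:
        prev, ptr, score = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[s][o])
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def detect_rule_rows(image, dark_ratio=0.5):
    """Label each row 0 (background) or 1 (rule line).

    Observation per row: 1 if at least `dark_ratio` of its pixels are ink.
    All probabilities below are illustrative assumptions, not trained values.
    """
    obs = [1 if sum(row) / len(row) >= dark_ratio else 0 for row in image]
    log = math.log
    log_start = [log(0.9), log(0.1)]
    log_trans = [[log(0.9), log(0.1)],   # background -> background / line
                 [log(0.3), log(0.7)]]   # line -> background / line
    log_emit = [[log(0.99), log(0.01)],  # background rows are rarely dark
                [log(0.2), log(0.8)]]    # line rows are usually dark
    return viterbi(obs, log_start, log_trans, log_emit)

def remove_thin_lines(image, is_rule_row, max_width=2):
    """Erase vertical ink runs no taller than `max_width` inside rule rows.

    Taller runs are kept, on the assumption that they belong to text
    strokes crossing the rule line.
    """
    out = [row[:] for row in image]
    height, width = len(image), len(image[0])
    for x in range(width):
        y = 0
        while y < height:
            if image[y][x] == 0:
                y += 1
                continue
            start = y
            while y < height and image[y][x] == 1:
                y += 1
            run = range(start, y)
            if len(run) <= max_width and all(is_rule_row[r] for r in run):
                for r in run:
                    out[r][x] = 0
    return out

# Toy page: one rule line (row 3) crossed by a vertical stroke (column 2).
image = [[0] * 8 for _ in range(7)]
image[3] = [1] * 8
for r in range(1, 6):
    image[r][2] = 1

rule_rows = detect_rule_rows(image)
cleaned = remove_thin_lines(image, rule_rows)
```

In this toy page the stroke in column 2 forms a vertical run taller than `max_width`, so it is left in place while the remaining pixels of the detected rule row are erased. Real scans would need skew handling and parameters estimated from data, which this sketch omits.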

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA447910

Entities

People

  • Yefeng Zheng

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence Software
  • Character Recognition
  • Computer Languages
  • Computer Programs
  • Computer Vision
  • Computers
  • Databases
  • Detection
  • Feature Extraction
  • Information Retrieval
  • Information Science
  • Network Science
  • Pattern Recognition
  • Probability Distributions
  • Random Variables
  • Supervised Machine Learning
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Computer Science/Computer Engineering/Data Science/Digital Signal Processing.
  • Military History of the United States in the 20th Century.
  • Speech Processing/Speech Recognition.

Technology Areas

  • Microelectronics
  • Space
  • Space - Space Objects