Handwriting Identification, Matching, and Indexing in Noisy Document Images
Abstract
Throughout history, handwriting has been the primary means of recording information that is preserved across both time and space. With the coming of the electronic document era, we are challenged with making an enormous number of handwritten documents available for electronic access. Though many handwritten documents contain only handwriting, a growing number mix handwriting with printed text, noise, and background patterns. This mixture of handwriting with other components makes it difficult to render an original document electronically accessible. Many handwritten documents carry a special background pattern, rule lines, which are printed on the paper to guide writing. After digitization, rule lines touch text and cause problems for further document image analysis if they are not detected and removed. In this dissertation, we present a rule line detection algorithm based on hidden Markov model (HMM) decoding, achieving both high detection accuracy and a low false alarm rate. After detection, line removal is performed by line width thresholding.
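The removal step can be illustrated with a minimal sketch. The abstract says only that removal uses line width thresholding, so everything below is an assumption rather than the dissertation's implementation: the function name `remove_rule_line`, the binary list-of-lists image format (1 = ink), the `top`/`bottom` rows standing in for a detected rule line, and the `max_width` parameter are all hypothetical. The idea shown is that, in each column, a vertical ink run through the detected line is erased only when it is no taller than the threshold, so taller runs (likely text strokes crossing the line) are kept.

```python
def remove_rule_line(image, top, bottom, max_width):
    """Erase a detected rule line from a binary image (1 = ink, 0 = background).

    Hypothetical sketch: rows top..bottom (inclusive) are assumed to come from
    a separate detection step. A column's ink is erased only if the vertical
    ink run passing through the line band is at most max_width pixels tall.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    result = [row[:] for row in image]  # leave the input untouched

    for x in range(width):
        # Ink pixels of this column that fall inside the line band.
        band = [y for y in range(top, bottom + 1) if image[y][x]]
        if not band:
            continue

        # Extend the run upward and downward beyond the band.
        y0, y1 = band[0], band[-1]
        while y0 > 0 and image[y0 - 1][x]:
            y0 -= 1
        while y1 < height - 1 and image[y1 + 1][x]:
            y1 += 1

        # Short run: treat as bare rule line and erase it.
        # Tall run: assume a text stroke crosses the line here and keep it.
        if y1 - y0 + 1 <= max_width:
            for y in range(y0, y1 + 1):
                result[y][x] = 0

    return result
```

With a one-pixel rule line crossed by a taller vertical stroke, a threshold slightly above the line width removes the line everywhere except where the stroke passes through it. Choosing the threshold from the measured line width is left open here, since the abstract does not say how it is set.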
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2006
- Accession Number: ADA447910
Entities
People
- Yefeng Zheng
Organizations
- University of Maryland