Adaptive Hindi OCR Using Generalized Hausdorff Image Comparison

Abstract

In this paper, we present an adaptive Hindi OCR system using generalized Hausdorff image comparison, implemented as part of a rapidly retargetable language tool report. The system includes script identification, character segmentation, training sample creation, and character recognition. The OCR design (completed in one month) was applied to a complete Hindi-English bilingual dictionary (1083 pages) and to a collection of ideal images extracted from Hindi documents in PDF format. Experimental results show that recognition accuracy can reach 88% for noisy images and 95% for ideal images, both at the character level. The presented method can also be extended to design OCR systems for other scripts.
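
For readers unfamiliar with the comparison measure named above, the following is a minimal sketch of a rank-based (partial) Hausdorff distance between two binary character images. It is an illustration only: the abstract does not give the report's exact formulation, so the function names, the `fraction` parameter, and the choice of Euclidean distance are all assumptions rather than the authors' implementation.

```python
import math


def foreground_points(image):
    """Return (row, col) coordinates of nonzero pixels in a 2D list of 0/1 values."""
    return [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]


def directed_partial_hausdorff(a_points, b_points, fraction=1.0):
    """Ranked directed distance from A to B (hypothetical helper).

    For each point in A, take the distance to its nearest point in B, sort
    those distances, and return the one at the given quantile. With
    fraction=1.0 this is the classical directed Hausdorff distance (the max).
    """
    if not a_points or not b_points:
        raise ValueError("both point sets must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    nearest = sorted(min(math.dist(a, b) for b in b_points) for a in a_points)
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]


def generalized_hausdorff(a_points, b_points, fraction=1.0):
    """Symmetric version: the larger of the two directed ranked distances."""
    return max(
        directed_partial_hausdorff(a_points, b_points, fraction),
        directed_partial_hausdorff(b_points, a_points, fraction),
    )


# Toy example: a vertical stroke, and the same stroke with one stray noise pixel.
template = [[0, 0, 1, 0, 0] for _ in range(5)]
noisy = [row[:] for row in template]
noisy[2][4] = 1  # isolated noise pixel two columns away from the stroke

t_pts = foreground_points(template)
n_pts = foreground_points(noisy)

classical = generalized_hausdorff(t_pts, n_pts, fraction=1.0)
ranked = generalized_hausdorff(t_pts, n_pts, fraction=0.8)
```

In this toy case the classical distance is dominated by the single stray pixel, while the ranked variant ignores the worst-matching 20% of points. That tolerance to outliers is the usual motivation for using a generalized Hausdorff measure on noisy scans.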


Document Details

Document Type: Technical Report
Publication Date: Aug 19, 2003
Accession Number: ADA455170

Entities

People

David S. Doermann
Huanfeng Ma

Organizations

University of Maryland
