A Point Matching Algorithm for Automatic Groundtruth Generation

Abstract

Geometric groundtruth at the character, word, and line levels is crucial for developing and evaluating optical character recognition (OCR) algorithms. Kanungo and Haralick proposed a closed-loop methodology for generating character-level groundtruth for rescanned images. In this paper, we present a robust version of their methodology. We group the feature points and apply a feature point registration algorithm to the grouped point set to estimate the transformation. The Euclidean distance between character centroids is used as the error metric. We report experiments on the University of Washington data set.
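The error metric named in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the estimated transformation, so the six-parameter affine model below is an assumption, and the names `apply_affine` and `mean_centroid_error` are hypothetical helpers, not taken from the report. The sketch maps ideal character centroids through an estimated transformation and measures the mean Euclidean distance to the reference centroids.

```python
import math


def apply_affine(params, point):
    """Map (x, y) through an ASSUMED affine model:
    x' = a*x + b*y + tx,  y' = c*x + d*y + ty.
    """
    a, b, tx, c, d, ty = params
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def mean_centroid_error(predicted, reference):
    """Mean Euclidean distance between corresponding character centroids."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("centroid lists must be non-empty and equal in length")
    total = 0.0
    for (px, py), (rx, ry) in zip(predicted, reference):
        total += math.hypot(px - rx, py - ry)
    return total / len(predicted)


# Illustrative values only: two ideal character centroids and a rescanned
# page that is shifted by (3, 4) pixels relative to the ideal image.
ideal_centroids = [(5.0, 5.0), (25.0, 5.0)]
true_shift = (1.0, 0.0, 3.0, 0.0, 1.0, 4.0)
reference_centroids = [apply_affine(true_shift, p) for p in ideal_centroids]

# A poor estimate (identity) versus the exact transformation.
identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
error_identity = mean_centroid_error(
    [apply_affine(identity, p) for p in ideal_centroids], reference_centroids
)
error_exact = mean_centroid_error(
    [apply_affine(true_shift, p) for p in ideal_centroids], reference_centroids
)
```

With these illustrative inputs, the identity estimate leaves every centroid 5 pixels from its reference position, while the exact transformation gives zero error. A smaller mean distance indicates that the estimated transformation places the generated groundtruth closer to the characters in the rescanned image.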

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2001
Accession Number
ADA458743

Entities

People

  • Doe-wan Kim
  • Tapas Kanungo

Organizations

  • University of Maryland

Tags

Communities of Interest

  • C4I

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Automatic
  • Cell Size
  • Character Recognition
  • Data Sets
  • Demographic Cohorts
  • Feature Extraction
  • Information Operations
  • Iterations
  • Language
  • Mathematics
  • Optical Character Recognition
  • Personality
  • Translations
  • Universities

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computer Vision