Automatic Modeling and Localization for Object Recognition

Abstract

Accurately estimating an object's pose (its location and orientation) in an image is important for practical applications of object recognition. Recognition algorithms often trade accuracy of the pose estimate for efficiency, which usually results in brittle and inaccurate recognition. One solution is object localization: a local search for the object's true pose given a rough initial estimate. Localization is made difficult by the unfavorable characteristics of real images, such as noise, clutter, occlusion, and missing data. In this thesis, we present novel algorithms for localizing 3D objects in 3D range-image data (3D-3D localization) and for localizing 3D objects in 2D intensity-image data (3D-2D localization). Our localization algorithms use robust statistical techniques to reduce their sensitivity to the noise, clutter, missing data, and occlusion that are common in real images. Our localization results demonstrate that our algorithms can accurately determine the pose in noisy, cluttered images despite significant errors in the initial pose estimate.

Acquiring accurate object models that facilitate localization is also of great practical importance for object recognition. In the past, models for recognition and localization were typically created by hand using computer-aided design (CAD) tools. Manual modeling is expensive and limited in accuracy. In this thesis, we present novel algorithms that automatically construct object-localization models from many images of the object. We present a consensus-search approach to determine which parts of the image data warrant inclusion in the model. Using this approach, our modeling algorithms are relatively insensitive to the imperfections and noise typical of real image data. Our results demonstrate that our modeling algorithms can construct very accurate geometric models from rather noisy input data.
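
To make the idea of 3D-3D localization concrete, the following is a minimal sketch of a local pose search that starts from a rough initial estimate and down-weights poorly matching points. It is not the thesis's algorithm: the nearest-neighbor matching, the Cauchy-style weights, the `scale` parameter, and the function names `weighted_rigid_fit` and `robust_localize` are all assumptions chosen for illustration.

```python
import numpy as np


def weighted_rigid_fit(src, dst, w):
    """Weighted least-squares rotation R and translation t so that R @ src[i] + t ~ dst[i]."""
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(axis=0)  # weighted centroid of the source points
    mu_d = (w[:, None] * dst).sum(axis=0)  # weighted centroid of the target points
    # Weighted cross-covariance between the centered point sets.
    H = (w[:, None] * (src - mu_s)).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t


def robust_localize(model, scene, R0, t0, scale=0.05, iters=50):
    """Refine a rough pose (R0, t0) of `model` in `scene` (hypothetical helper).

    model: (N, 3) model points; scene: (M, 3) range-data points.
    `scale` is an assumed residual scale beyond which matches count as outliers.
    """
    R, t = R0, t0
    for _ in range(iters):
        moved = model @ R.T + t
        # Brute-force squared distances from every moved model point to every scene point.
        d2 = ((moved[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        r2 = d2[np.arange(len(model)), nearest]
        # Cauchy-style weights: large residuals (occlusion, clutter) get little influence.
        w = 1.0 / (1.0 + r2 / scale**2)
        R, t = weighted_rigid_fit(model, scene[nearest], w)
    return R, t
```

Calling `robust_localize(model, scene, np.eye(3), np.zeros(3))` refines an identity initial guess. The brute-force distance matrix keeps the sketch short; a spatial index such as a k-d tree would be the usual choice for realistic point counts.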

Document Details

Document Type
Technical Report
Publication Date
Oct 25, 1996
Accession Number
ADA461112

Entities

People

  • Mark D. Wheeler

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies
  • Engineered Resilient Systems
  • Sensors

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Computational Fluid Dynamics
  • Computational Science
  • Computer Graphics
  • Computer Science
  • Computer Vision
  • Computer-Aided Design
  • Coordinate Systems
  • Geometry
  • Grids
  • Image Processing
  • Military Research
  • Object Recognition
  • Pattern Recognition
  • Recognition
  • Three Dimensional
  • Trees (Data Structures)

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Computer Vision.