Learning Distance Functions for Exemplar-Based Object Recognition

Abstract

This thesis investigates an exemplar-based approach to object recognition that learns, on an image-by-image basis, the relative importance of patch-based features for determining similarity. We represent images as sets of patch-based features. To find the distance between two images, we first find, for each patch, its nearest patch in the other image and compute their inter-patch distance. The weighted sum of these inter-patch distances is defined to be the distance between the two images. The main contribution of this thesis is a method for learning a set-to-set distance function specific to each training image, together with a demonstration of the use of these functions for image browsing, retrieval, and classification. The goal of the learning algorithm is to assign a non-negative weight to each patch-based feature of the image such that the most useful patches are assigned large weights and irrelevant or confounding patches are given zero weight. We formulate this as a large-margin optimization and discuss two versions: a "focal" version that learns weights for each image separately, and a "global" version that jointly learns the weights for all training images. In the focal version, the distance functions learned for the training images are not directly comparable to one another and are most directly applicable to in-sample tasks such as image browsing, though with heuristics or additional learning they can also be used for image retrieval or classification. The global approach, by contrast, learns distance functions that are globally consistent and can be used directly for image retrieval and classification. Using geometric blur and simple color features, we show that both versions perform as well as or better than existing algorithms on the Caltech 101 object recognition benchmark. The global version achieves the best results: a 63.2% mean recognition rate when trained with fifteen images per category and 66.6% when trained with twenty.
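The image-to-image distance described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the use of Euclidean distance between patch descriptors, and the toy two-dimensional "patches" are all assumptions made for illustration, and the weights are simply given here rather than learned by the large-margin optimization.

```python
import math


def patch_distance(p, q):
    """Euclidean distance between two patch descriptors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_patch_distances(focal_patches, other_patches):
    """For each focal patch, the distance to its nearest patch in the other image."""
    return [
        min(patch_distance(p, q) for q in other_patches)
        for p in focal_patches
    ]


def image_distance(focal_patches, weights, other_patches):
    """Weighted sum of nearest-patch distances from the focal image to another image.

    `weights` holds one non-negative weight per focal patch; a zero weight
    means the patch is ignored.
    """
    if len(weights) != len(focal_patches):
        raise ValueError("need exactly one weight per focal patch")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    distances = nearest_patch_distances(focal_patches, other_patches)
    return sum(w * d for w, d in zip(weights, distances))


# Toy data (hypothetical): three focal patches, two patches in the other image.
focal = [(0.0, 0.0), (4.0, 0.0), (9.0, 9.0)]
weights = [1.0, 0.5, 0.0]  # the third patch is treated as irrelevant
candidate = [(0.0, 3.0), (4.0, 4.0)]

example_distance = image_distance(focal, weights, candidate)
print(example_distance)
```

Because the weights belong to the focal image and the nearest-patch search runs from the focal image to the other image, a distance computed this way is generally not symmetric: swapping the roles of the two images uses a different set of patches and weights.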

Document Details

Document Type
Technical Report
Publication Date
Aug 08, 2007
Accession Number
ADA637151

Entities

People

  • Andrea L. Frome

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Air Platforms
  • Autonomy
  • Energy and Power Technologies
  • Materials and Manufacturing Processes
  • Weapons Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Software
  • Birds
  • Computational Science
  • Computer Languages
  • Computer Science
  • Computer Vision
  • Databases
  • Dimensionality Reduction
  • Distance Learning
  • Electrical Engineering
  • Image Processing
  • Machine Learning
  • Object Recognition
  • Recognition
  • Supervised Machine Learning
  • Three Dimensional

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Neural Network Machine Learning
  • Regression Analysis