A Detailed Look at Scale and Translation Invariance in a Hierarchical Neural Model of Visual Object Recognition
Abstract
The HMAX model has recently been proposed by Riesenhuber & Poggio [15] as a hierarchical model of position- and size-invariant object recognition in visual cortex. It has also successfully modeled a number of other properties of the ventral visual stream (the visual pathway thought to be crucial for object recognition in cortex), particularly those of view-tuned neurons in macaque inferotemporal cortex, the brain area at the top of the ventral stream. The original modeling study [15] used only paperclip stimuli, as in the corresponding physiology experiment [8], and did not systematically explore how the invariance properties of model units depend on model parameters. In this study, we aimed at a deeper understanding of the inner workings of HMAX and of its performance for various parameter settings and natural stimulus classes. We systematically examined HMAX responses for different stimulus sizes and positions and found that model units' responses depend on stimulus position, a dependence for which we offer a quantitative description. Scale invariance properties were found to depend on the particular stimulus class used. Moreover, a given view-tuned unit can exhibit substantially different invariance ranges when mapped with different probe stimuli. This has potentially interesting ramifications for experimental studies, in which the receptive field of a neuron and its scale invariance properties are usually mapped with probe objects of only a single type.
Document Details
- Document Type: Technical Report
- Publication Date: Aug 01, 2002
- Accession Number: ADA459489
Entities
People
- Maximilian Riesenhuber
- Robert Schneider
Organizations
- Massachusetts Institute of Technology