Generalization of Figure-Ground Segmentation from Binocular to Monocular Vision in an Embodied Biological Brain Model
Abstract
Humans have the remarkable ability to generalize from binocular to monocular figure-ground segmentation of complex scenes. This is clearly evident any time we look at a photograph or computer monitor, or simply close one eye. We hypothesized that this skill is due to the ability of our brains to use rich embodied signals, such as disparity, to train up depth perception when only the information from one eye is available. To test this hypothesis we enhanced our virtual robot, Emer, who is already capable of performing robust, state-of-the-art, invariant 3D object recognition, with the ability to learn figure-ground segmentation, allowing him to recognize objects against complex backgrounds. Continued development of this skill holds great promise for efforts, like Emer, that aim to create an Artificial General Intelligence (AGI). For example, it promises to unlock vast sets of training data, such as Google Images, which have previously been inaccessible to AGI models due to their lack of embodied, deep learning. More immediately practical implications, such as achieving human performance on the Caltech101 object recognition dataset, are discussed.
Document Details
- Document Type: Technical Report
- Publication Date: Aug 01, 2011
- Accession Number: ADA557818
Entities
People
- Brian Mingus
- Dean Wyatte
- Kenneth W Latimer
- Randall C. O'Reilly
- Seth Herd
- Trent Kriete