Monocular Depth Perception and Robotic Grasping of Novel Objects

Abstract

Perceiving the 3D shape of the environment is a basic ability for a robot. We present an algorithm to convert standard digital pictures into 3D models. This is a challenging problem, since an image is formed by projecting the 3D scene onto two dimensions, which loses the depth information. We take a supervised learning approach to this problem and use a Markov Random Field (MRF) to model the scene depth as a function of the image features. We show that, even on unstructured scenes from a large variety of environments, our algorithm is frequently able to recover accurate 3D models. We then apply our methods to two robotics applications: (1) obstacle avoidance for autonomously driving a small electric car, and (2) robot manipulation, where we develop vision-based learning algorithms for grasping novel objects. This enables our robot to perform tasks such as opening new doors, clearing cluttered tables, and unloading items from a dishwasher.
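
As a rough illustration of the idea in the abstract (depth modeled by an MRF as a function of image features), the following is a minimal sketch and not the report's actual model. It assumes a Gaussian MRF on a 4-connected grid of image patches, a linear (ridge regression) data term, and a quadratic smoothness term between neighboring patches. The function names, the `smooth` weight, and the feature layout are all hypothetical choices made for this example.

```python
import numpy as np

def grid_laplacian(h, w):
    """Graph Laplacian of a 4-connected h x w grid (dense; small examples only)."""
    n = h * w
    L = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            # Link each patch to its right and lower neighbor, if present.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    L[i, i] += 1
                    L[j, j] += 1
                    L[i, j] -= 1
                    L[j, i] -= 1
    return L

def fit_unary(features, depths, ridge=1e-6):
    """Supervised step: ridge regression from patch features to known depths."""
    A = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(A, features.T @ depths)

def map_depth(features, theta, h, w, smooth=1.0):
    """MAP depth under the assumed Gaussian MRF.

    Minimizes sum_i (d_i - x_i . theta)^2 + smooth * sum_{i~j} (d_i - d_j)^2,
    whose minimizer solves (I + smooth * L) d = X theta.
    """
    unary = features @ theta
    A = np.eye(h * w) + smooth * grid_laplacian(h, w)
    return np.linalg.solve(A, unary).reshape(h, w)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h, w = 4, 5
    feats = rng.normal(size=(h * w, 3))           # synthetic per-patch features
    depths = feats @ np.array([1.0, -2.0, 0.5])   # synthetic ground-truth depths
    theta = fit_unary(feats, depths)
    print(map_depth(feats, theta, h, w, smooth=2.0))
```

In this sketch, setting `smooth` to zero reduces the estimate to the per-patch regression output, while larger values pull neighboring patches toward similar depths.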

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 2009
Accession Number: ADA630917

Entities

People

  • Ashutosh Saxena

Organizations

  • Stanford University

Tags

Communities of Interest

  • Autonomy
  • Sensors

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Collision Avoidance
  • Computational Science
  • Computer Stereo Vision
  • Computer Vision
  • Geometry
  • Grids
  • Laser Rangefinding
  • Linear Programming
  • Machine Learning
  • Probabilistic Models
  • Range Finding
  • Robots
  • Supervised Machine Learning
  • Three Dimensional

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Robotics and Automation
  • Vision Science/Vision Psychology/Cognitive Neuroscience

Technology Areas

  • AI & ML
  • AI & ML - Autonomous Systems
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Autonomy