THIS GRANT IS A CONTINUATION OF N00014-14-1-0156 Infinite Latent CRFs for Robotic Perception and Planning Using Human Context

Abstract

To operate successfully, a robot must be able to sense and interpret its surroundings. Our environments have three components: objects, humans, and spatial layout, and these are tightly related to one another. Being able to reason about these components and the interplay between them is critical for a robot to perform tasks. One way to do so is through Conditional Random Fields (CRFs), which have been a workhorse of machine learning. They have been successfully applied to model some aspects of an environment for perception (such as in several previous works by the PI). While CRFs have been quite successful in modeling the environment because of their conditional independence properties, they have two significant limitations that stem from their fixed graph structure. First, the types of relationships between the nodes (i.e., the potentials on the edges) may not be known in advance. Second, in certain cases some nodes may always be latent (hidden), during training as well as at test time. We propose to develop Infinite Latent Conditional Random Fields (ILCRF), for which the graph structure does not need to be specified in advance. The (latent) nodes as well as the potentials on the edges will be generated in an unsupervised manner using a generative model. This would make CRF-style algorithms considerably more powerful, because the modeling is no longer limited to a particular graph structure; instead, the graph structure is discovered from the data in an unsupervised manner. We then propose to apply the ILCRF model to represent intuitive physics and human context in the environment as hidden factors.
As an example, consider an office environment where a keyboard is found below a monitor not because of some special relation between the keyboard and the monitor, but because of their affordances: the keyboard needs to be touched by hand, and the monitor needs to be seen. We will then test our model on several robotics and vision applications, including scene parsing, human activity detection, and a robot retrieving objects, arranging a disorganized house, manipulating objects, and responding to human activities.

Document Details

Document Type
DoD Grant Award
Publication Date
Jun 03, 2016
Source ID
N000141612121

Entities

People

  • Ashutosh Saxena

Organizations

  • Cornell University
  • Office of Naval Research
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Autonomous Systems
  • AI & ML - Machine Learning Algorithms
  • Autonomy