Feedback-Enabled Joint Reasoning over Uncertain Sub-Components of Perception

Abstract

Motivation. We are witnessing an explosion of visual content. Photo-sharing websites like Flickr and Facebook now host 6 billion and 90 billion photos, respectively. Last year an estimated 1 billion (camera-equipped) mobile phones were sold worldwide. Every day users share 200 million more images on Facebook. Every minute users upload 3 days' worth of video to YouTube. A recent World Economic Forum report and a New York Times article declared data to be a new class of economic asset, like currency or gold. This data revolution presents both an opportunity and a challenge. Extracting value from this asset will require converting meaningless data into perceptual understanding. This is challenging but has the potential to fundamentally change the way we live, from self-driving cars bringing mobility to the visually impaired, to in-home robots caring for the elderly and physically impaired, to augmented reality with Google-Glass-like wearable computing units.

The past two decades have witnessed significant progress in machine perception: today, there are commercial systems for face detection (Face.com, Apple iPhoto), speech recognition (Siri), handwriting recognition (Microsoft OneNote), and pedestrian detection (Mobileye). Despite progress in these narrow-domain tasks, we are still far from the grand goal of a holistic intelligent system that understands the scene behind the sensors, i.e., is able to invert an image, video, depth, or any other sensor data to infer all scene properties (Who, what, where, doing what?) such as:

  • 3D Scene Layout: Where is the ground and what are the vertical surfaces?
  • Object Layout: Which objects are present and what is their extent in 3D?
  • Object Attributes: Is the person smiling? Is the road sign big?
  • Activity Layout: What is the pose of each object and what is each actor doing?
  • Intentions, Threats, Future Predictions: Is the person paying attention? About to get hurt?

Document Details

Document Type
DoD Grant Award
Publication Date
Jan 04, 2017
Source ID
N000141712173

Entities

People

  • Dhruv Batra

Organizations

  • Georgia Tech Research Corporation
  • Office of Naval Research
  • United States Navy

Tags

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computer Vision
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - Autonomous Systems
  • Autonomy