Inferring the Why in Images

Abstract

Humans have the remarkable capability to infer the motivations of other people's actions, likely due to cognitive skills known in psychophysics as the theory of mind. In this paper, we strive to build a computational model that predicts the motivation behind the actions of people from images. To our knowledge, this challenging problem has not yet been extensively explored in computer vision. We present a novel learning-based framework that uses high-level visual recognition to infer why people are performing actions in images. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their own experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from automatically inferring motivation, our results suggest that transferring knowledge from language into vision can help machines understand why a person might be performing an action in an image.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2014
Accession Number
ADA612444

Entities

People

  • Antonio Torralba
  • Carl Vondrick
  • Hamed Pirsiavash

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Computer Languages
  • Computer Vision
  • Computers
  • Convolutional Neural Networks
  • Detectors
  • Information Science
  • Language
  • Learning
  • Machine Learning
  • Motivation
  • Natural Languages
  • Neural Networks
  • Psychological Theory
  • Recognition

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Systems Analysis and Design
  • Vision Science/Vision Psychology/Cognitive Neuroscience

Technology Areas

  • AI & ML