3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image (Open Access)

Abstract

We present a unified framework for detecting and classifying human interactions in unconstrained user-generated images. Unlike previous approaches that directly map people/face locations in 2D image space into features for classification, we first estimate camera viewpoint and people positions in 3D space and then extract spatial configuration features from explicit 3D people positions. This approach has several advantages. First, it can accurately estimate relative distances and orientations between people in 3D. Second, it encodes spatial arrangements of people into a richer set of shape descriptors than is afforded in 2D. Our 3D shape descriptors are invariant to the camera pose variations often seen in web images and videos. The proposed approach also estimates camera pose and uses it to capture the intent of the photo. To achieve accurate 3D people layout estimation, we develop an algorithm that robustly fuses semantic constraints about human interpositions into a linear camera model. This enables our model to handle large variations in people's sizes, heights (e.g., due to age), and poses. An accurate 3D layout also allows us to construct features informed by proxemics that improve our semantic classification. To characterize the human interaction space, we introduce visual proxemes: a set of prototypical patterns that represent commonly occurring social interactions in events. We train a discriminative classifier that classifies 3D arrangements of people into visual proxemes, and we quantitatively evaluate its performance on a large, challenging dataset.
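
The core idea of lifting 2D detections into 3D before measuring distances between people can be illustrated with a minimal sketch. This is not the report's algorithm: it assumes a plain pinhole camera with a known focal length, a single assumed real-world height for every person, and no interposition constraints, which the report's linear camera model adds in order to cope with children, seated people, and similar cases. The function names, parameters, and default height below are hypothetical.

```python
import math

def backproject(detections, focal_px, cx, assumed_height_m=1.7):
    """Place each detection on the ground plane as (x, z) in metres.

    detections: list of (u_center_px, pixel_height) pairs.
    focal_px and cx (principal point column) are assumed known here.
    assumed_height_m is an illustrative constant, not a value from the report.
    """
    points = []
    for u, h_px in detections:
        z = focal_px * assumed_height_m / h_px   # depth from apparent size
        x = (u - cx) * z / focal_px              # lateral offset at that depth
        points.append((x, z))
    return points

def pairwise_distances(points):
    """Return {(i, j): Euclidean ground-plane distance} for all pairs i < j."""
    dists = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dz = points[i][1] - points[j][1]
            dists[(i, j)] = math.hypot(dx, dz)
    return dists
```

Distances computed this way depend on where people stand relative to each other, not on where they fall in the image, which is the property the abstract attributes to 3D features. The fixed-height assumption is exactly what breaks down for mixed-age groups, which is why the report fuses additional semantic constraints into the camera model.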

Document Details

Document Type
Technical Report
Publication Date
Oct 03, 2013
Accession Number
AD1037203

Entities

People

  • Hui Cheng
  • Ishani Chakraborty
  • Omar Javed

Organizations

  • SRI International

Tags

Communities of Interest

  • Air Platforms

DTIC Thesaurus Topics

  • Algorithms
  • Anomaly Detection
  • Cameras
  • Change Detection
  • Classification
  • Computer Vision
  • Coordinate Systems
  • Detection
  • Detectors
  • Estimators
  • Event Detection
  • Ground Level
  • High Angles
  • Low Angles
  • Pattern Recognition
  • Recognition
  • Shape

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Geodesy
  • Neural Network Machine Learning

Technology Areas

  • Space