Actor-Centric Tubelets for Real-Time Activity Detection in Extended Videos

Abstract

We address the problem of detecting human and vehicle activities in long, untrimmed surveillance videos that capture a large field of view. Most existing activity detection approaches are designed for recognizing atomic human actions performed in the foreground. Therefore, they are not suitable for detecting activities in extended videos, which contain multiple actors performing co-occurring, complex activities with extreme spatio-temporal scale variations. In this paper, we propose a modular, actor-centric framework for real-time activity detection in extended videos. In particular, we decompose an extended video into a collection of smaller actor-centric tubelets of interest. Each tubelet is a video sub-volume associated with an actor and includes adaptive visual context for recognizing the actor's activities. Once these tubelets are extracted via an object-detection-based approach, we are able to detect activities in each tubelet by focusing on the actor situated in its foreground. To accurately detect the activities of a tubelet's actor, we take into account the interactions with other detected actors and objects within the tubelet. We encode such interactions with a dynamic visual spatio-temporal graph and process it with a Graph Neural Network that yields context-aware actor representations. We validate our activity detection framework on the MEVA (Multiview Extended Video with Activities) dataset and the ActEV 2021 Sequestered Data Leaderboard and demonstrate its effectiveness in terms of speed and performance.
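
As an illustration of the tubelet idea described in the abstract, the following is a minimal sketch of how a crop window for one actor might be derived from that actor's per-frame detection boxes. It is not the report's method: the function name `tubelet_region`, the `context` padding ratio, and the rule of padding in proportion to the actor's extent are all assumptions made for this example, since the abstract does not specify how the adaptive visual context is chosen.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def tubelet_region(track: List[Box], frame_w: int, frame_h: int,
                   context: float = 0.5) -> Box:
    """Return one fixed crop window covering an actor's whole track.

    Hypothetical rule (an assumption, not taken from the report): take the
    union of the per-frame boxes, enlarge each side by `context` times the
    union's width or height, then clip the result to the frame.
    """
    if not track:
        raise ValueError("track must contain at least one box")

    # Union of all boxes in the track.
    x1 = min(b[0] for b in track)
    y1 = min(b[1] for b in track)
    x2 = max(b[2] for b in track)
    y2 = max(b[3] for b in track)

    # Padding scales with the actor's extent, so small and large actors
    # receive proportionally similar amounts of surrounding context.
    pad_x = context * (x2 - x1)
    pad_y = context * (y2 - y1)

    # Clip to the frame boundaries.
    return (max(0.0, x1 - pad_x), max(0.0, y1 - pad_y),
            min(float(frame_w), x2 + pad_x), min(float(frame_h), y2 + pad_y))


# A two-frame track of one pedestrian in a 1920x1080 frame.
region = tubelet_region([(100, 100, 140, 200), (110, 100, 150, 200)], 1920, 1080)
```

Cropping every frame of the track to `region` would give a video sub-volume centered on the actor, which is the kind of input the abstract describes passing to the per-tubelet activity detector.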

Document Details

Document Type
Technical Report
Publication Date
Jan 04, 2022
Accession Number
AD1185764

Entities

People

  • Effrosyni Mavroudi
  • Prashast Bindal
  • Rene Vidal

Organizations

  • Johns Hopkins University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computer Vision
  • Computers
  • Data Science
  • Deep Learning
  • Detection
  • Detectors
  • False Alarms
  • Feature Extraction
  • Image Processing
  • Machine Learning
  • Neural Networks
  • Pattern Recognition
  • Recognition
  • Vehicle Tracks
  • Vehicles

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Computer Vision.

Technology Areas

  • AI & ML