Dance with Flow: Two-in-One Stream Action Detection

Abstract

The goal of this paper is to detect the spatio-temporal extent of an action. The two-stream detection network based on RGB and flow provides state-of-the-art accuracy at the expense of a large model size and heavy computation. We propose to embed RGB and optical flow into a single two-in-one stream network with new layers. A motion condition layer extracts motion information from flow images, which the motion modulation layer uses to generate transformation parameters for modulating the low-level RGB features. The method is easily embedded in existing appearance- or two-stream action detection networks and trained end-to-end. Experiments demonstrate that leveraging the motion condition to modulate RGB features improves detection accuracy. With only half the computation and parameters of state-of-the-art two-stream methods, our two-in-one stream still achieves impressive results on UCF101-24, UCFSports and J-HMDB.
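
The modulation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the "transformation parameters" are an element-wise scale and shift applied to the RGB features, and it replaces the learned layers with a fixed toy operation. The names `conv1x1`, `motion_condition`, `motion_modulation` and `modulate`, and all weights, are hypothetical.

```python
from typing import List, Tuple

Feature = List[List[float]]  # a single-channel H x W feature map


def conv1x1(feature: Feature, weight: float, bias: float) -> Feature:
    """Toy stand-in for a learned layer: one-channel 1x1 convolution (assumption)."""
    return [[weight * v + bias for v in row] for row in feature]


def motion_condition(flow: Feature) -> Feature:
    """Extract a motion condition from a flow image (here: 1x1 conv + ReLU)."""
    return [[max(0.0, v) for v in row] for row in conv1x1(flow, 0.5, 0.0)]


def motion_modulation(condition: Feature) -> Tuple[Feature, Feature]:
    """Map the motion condition to transformation parameters.

    Assumed form: a scale and a shift with the same shape as the RGB features.
    The scale has bias 1.0, so zero motion yields an identity transform.
    """
    scale = conv1x1(condition, 1.0, 1.0)
    shift = conv1x1(condition, 0.1, 0.0)
    return scale, shift


def modulate(rgb_feat: Feature, scale: Feature, shift: Feature) -> Feature:
    """Apply the element-wise affine transform: scale * rgb_feat + shift."""
    return [
        [s * r + b for r, s, b in zip(rgb_row, scale_row, shift_row)]
        for rgb_row, scale_row, shift_row in zip(rgb_feat, scale, shift)
    ]


# Toy inputs: a 2 x 2 flow image and a 2 x 2 low-level RGB feature map.
flow = [[0.0, 2.0], [-2.0, 4.0]]
rgb_feat = [[1.0, 1.0], [1.0, 1.0]]

scale, shift = motion_modulation(motion_condition(flow))
modulated = modulate(rgb_feat, scale, shift)
```

In this sketch, positions with no positive motion response leave the RGB feature unchanged, while positions with stronger motion are scaled and shifted more. A single network then processes the modulated features, which is consistent with the abstract's claim of roughly half the computation and parameters of a two-stream model.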

Document Details

Document Type
Technical Report
Publication Date
Jun 16, 2019
Accession Number
AD1152143

Entities

People

  • Cees G. M. Snoek
  • Jiaojiao Zhao

Organizations

  • University of Amsterdam

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Accuracy
  • Artificial Intelligence Software
  • Classification
  • Commerce
  • Computations
  • Computer Languages
  • Computer Vision
  • Computing System Architectures
  • Convolution
  • Deep Learning
  • Detection
  • Detectors
  • Efficiency
  • Images
  • Machine Learning
  • Neural Networks
  • Recognition

Fields of Study

  • Computer science

Readers

  • Combustion and Flow Dynamics
  • Computer Vision
  • Neural Network Machine Learning