Excitation Backprop for RNNs

Abstract

Deep models are state-of-the-art for many vision tasks, including video action recognition and video captioning. Models are trained to caption or classify activity in videos, but little is known about the evidence used to make such decisions. Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. However, such studies are relatively lacking for models of spatiotemporal visual content such as videos. In this work, we devise a formulation that simultaneously grounds evidence in space and time, in a single pass, using top-down saliency. We visualize the spatiotemporal cues that contribute to a deep model's classification/captioning output using the model's internal representation. Based on these spatiotemporal cues, we are able to localize segments within a video that correspond to a specific action, or a phrase from a caption, without explicitly optimizing/training for these tasks.
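
The abstract does not spell out the saliency computation, so the following is only a minimal sketch of the general idea, assuming the standard excitation backprop rule for a single linear layer: top-down probability is passed from output neurons to input neurons in proportion to each input's excitatory (non-negative weight) contribution. The function names `excitation_backprop_linear` and `frame_scores`, the array shapes, and the per-frame summation used for temporal scoring are illustrative assumptions, not the report's implementation.

```python
import numpy as np


def excitation_backprop_linear(activations, weights, top_down):
    """Pass top-down probability through one linear layer (illustrative).

    activations: (n_in,)  non-negative inputs to the layer
    weights:     (n_out, n_in) layer weights
    top_down:    (n_out,) probabilities assigned to the output neurons
    returns:     (n_in,)  probabilities assigned to the input neurons
    """
    # Keep only excitatory connections (non-negative weights).
    positive_weights = np.clip(weights, 0.0, None)
    # Contribution of each input to each output: a_j * w_ij.
    contrib = positive_weights * activations[None, :]
    # Normalize per output neuron so each row is a conditional distribution.
    norm = contrib.sum(axis=1, keepdims=True)
    safe_norm = np.where(norm > 0, norm, 1.0)  # avoid division by zero
    conditional = contrib / safe_norm
    # Marginalize over output neurons.
    return conditional.T @ top_down


def frame_scores(saliency):
    """Assumed temporal scoring: sum a (T, H, W) saliency volume per frame."""
    return saliency.reshape(saliency.shape[0], -1).sum(axis=1)


if __name__ == "__main__":
    a = np.array([1.0, 2.0, 0.0])
    w = np.array([[1.0, -1.0, 3.0],
                  [2.0, 1.0, 0.0]])
    p_out = np.array([0.5, 0.5])
    print(excitation_backprop_linear(a, w, p_out))
```

Under this rule the probability mass is conserved whenever every output neuron receives some excitatory input, which is what lets per-frame sums of the resulting map be compared across time.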

Document Details

Document Type
Technical Report
Publication Date
Jun 18, 2018
Accession Number
AD1136771

Entities

People

  • Andrea Zunino
  • Donghyun Kim
  • Jianming Zhang
  • Sarah A. Bargal
  • Stan Sclaroff
  • Vittorio Murino

Organizations

  • Adobe
  • Boston University
  • Istituto Italiano di Tecnologia

Tags

Communities of Interest

  • Autonomy
  • Human Systems

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Computer Languages
  • Computer Science
  • Computer Vision
  • Computers
  • Convolutional Neural Networks
  • Data Mining
  • Detection
  • Image Recognition
  • Information Science
  • Machine Learning
  • Natural Language Processing
  • Network Science
  • Neural Networks
  • Pattern Recognition
  • Probability
  • Probability Distributions
  • Recurrent Neural Networks
  • Video Frames

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Neural Network Machine Learning

Technology Areas

  • Space