Multimodal Sparse Coding for Event Detection

Abstract

Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared representations are applied to multimedia event detection (MED) and evaluated against their unimodal counterparts, as well as against other feature learning methods such as the sparse autoencoder and the restricted Boltzmann machine (RBM). We report the cross-validated classification accuracy and mean average precision of the MED system trained on features learned in our unimodal and multimodal settings on the TRECVID MED 2014 dataset.

Document Details

Document Type: Technical Report
Publication Date: Oct 13, 2015
Accession Number: AD1034547

Entities

People

  • Douglas E. Sturim
  • H. T. Kung
  • Kevin Brady
  • Miriam Cha
  • William M. Campbell
  • Youngjune L. Gwon

Organizations

  • MIT Lincoln Laboratory

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Accuracy
  • Artificial Intelligence Software
  • Automatic Gain Control
  • Coding
  • Computer Programming
  • Deep Learning
  • Dimensionality Reduction
  • Event Detection
  • Information Science
  • Machine Learning
  • Neural Networks
  • Preprocessing
  • United States Government
  • Unsupervised Machine Learning
  • Video Frames

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Computer Vision