Adaptive Feature Aggregation for Video Object Detection

Abstract

Object detection, a fundamental research topic in computer vision, faces new challenges in video-related tasks. Objects in videos are blurred, occluded, or out of focus more frequently than in still images. Existing video object detectors rely on feature aggregation and enhancement. However, most of them do not account for the diversity of object movements or the quality of the aggregated context features, so they cannot produce comparable results on blurred or crowded videos. In this paper, we propose an adaptive feature aggregation method for video object detection to address these problems. We introduce an adaptive quality-similarity weight, together with a sparse and dense temporal aggregation policy, into our model. Compared with both image-based and video-based baselines on the ImageNet and VIRAT datasets, our method consistently achieves better performance.
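
The abstract names two ideas without giving formulas: a quality-similarity weight and a sparse-plus-dense choice of frames to aggregate. The sketch below is only an illustration of how such a scheme could look, not the report's actual formulation. Everything in it is an assumption made for the example: the function names, the use of cosine similarity, the way quality and similarity are combined (quality times the exponential of similarity), and the window and stride parameters.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def aggregation_weights(ref, supports, qualities):
    """Hypothetical quality-similarity weights, normalized to sum to 1.

    Assumption: each support frame is scored as quality * exp(similarity
    to the reference frame). The report may combine the terms differently.
    """
    raw = [math.exp(cosine(ref, f)) * q for f, q in zip(supports, qualities)]
    total = sum(raw)
    if total == 0:
        # Every frame has zero quality: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def aggregate(ref, supports, qualities):
    """Weighted sum of the support features, one value per feature dimension."""
    weights = aggregation_weights(ref, supports, qualities)
    return [
        sum(w * f[d] for w, f in zip(weights, supports))
        for d in range(len(ref))
    ]


def support_indices(t, num_frames, dense_radius=1, sparse_stride=4, sparse_count=2):
    """Hypothetical sparse-plus-dense frame selection around frame t.

    Dense: every frame within dense_radius of t (including t itself).
    Sparse: frames at multiples of sparse_stride on either side of t.
    Indices outside [0, num_frames) are dropped.
    """
    chosen = set(range(t - dense_radius, t + dense_radius + 1))
    for k in range(1, sparse_count + 1):
        chosen.add(t - k * sparse_stride)
        chosen.add(t + k * sparse_stride)
    return sorted(i for i in chosen if 0 <= i < num_frames)
```

Under these assumptions, a support frame that resembles the reference frame and has a high quality score contributes most to the aggregated feature, while a frame with zero quality is ignored. The dense indices capture short-range motion, and the sparse ones reach further in time at low cost.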

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2020
Accession Number
AD1154807

Entities

People

  • Alexander G. Hauptmann
  • Guoliang Kang
  • Lijun Yu
  • Wenhe Liu
  • Yijun Qian

Organizations

  • Carnegie Mellon University

Tags

DTIC Thesaurus Topics

  • Big Data
  • Classification
  • Computer Vision
  • Computers
  • Detection
  • Event Detection
  • Feature Extraction
  • Image Processing
  • Image Recognition
  • Information Processing
  • Information Systems
  • Pattern Recognition
  • Precision
  • Recognition
  • Surveillance
  • Vehicles
  • Video
  • Video Clips

Fields of Study

  • Computer science

Readers

  • Image Processing and Computer Vision
  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms