Improved Robustness and Efficiency for Automatic Visual Site Monitoring

Abstract

Knowing who people are, where they are, what they are doing, and how they interact with other people and things is valuable from commercial, security, and space utilization perspectives. Video sensors backed by computer vision algorithms are a natural way to gather this data. Unfortunately, key technical issues persist in extracting features and models that are simultaneously efficient to compute and robust to issues such as adverse lighting conditions, distracting background motions, appearance changes over time, and occlusions. In this thesis, we present a set of techniques and model enhancements to better handle these problems, focusing on contributions in four areas. First, we improve background subtraction so it can better handle temporally irregular dynamic textures. This allows us to achieve a 5.5% drop in false positive rate on the Wallflower waving trees video. Second, we adapt the Dalal and Triggs Histogram of Oriented Gradients pedestrian detector to work on large-scale scenes with dense crowds and harsh lighting conditions: challenges which prevent us from easily using a background subtraction solution. These scenes contain hundreds of simultaneously visible people. To make using the algorithm computationally feasible, we have produced a novel implementation that runs on commodity graphics hardware and is up to 76× faster than our CPU-only implementation. We demonstrate the utility of this detector by modeling scene-level activities with a Hierarchical Dirichlet Process. Third, we show how one can improve the quality of pedestrian silhouettes for recognizing individual people. We combine general appearance information from a large population of pedestrians with semi-periodic shape information from individual silhouette sequences. Finally, we show how one can combine a variety of detection and tracking techniques to robustly handle a variety of event detection scenarios such as theft and left-luggage detection.

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2009
Accession Number
ADA607566

Entities

People

  • Gerald E. Dalley

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Biomedical
  • Energy and Power Technologies
  • Ground and Sea Platforms
  • Human Systems
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computational Science
  • Computer Languages
  • Computer Programs
  • Computer Vision
  • Data Mining
  • Detectors
  • Information Processing
  • Information Science
  • Kalman Filters
  • Machine Learning
  • Mathematical Filters
  • Monte Carlo Method
  • Natural Language Processing
  • Network Science
  • Ontologies

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Computer Vision.

Technology Areas

  • AI & ML
  • Space
  • Space - Space Objects