Streaming Inference for Dependent Data: Making Inferences On-the-Fly from Large Complex Data Sources

Abstract

Advances in surveillance technology, such as wearable cameras and a range of other new active and passive sensors, have led to an onslaught of increasingly complex data streams. The challenge of deriving intelligence from such massive, distributed, and diverse data sources, which often provide observations without bound, is one of the largest issues faced by DoD today. To garner the full potential of the data, it is crucial to capture the intricate dependencies within and between the data streams. In such cases, the Bayesian framework is appealing because it enables flexible generative modeling of various dependencies, in addition to being able to (1) cope with noisy and incomplete data sources, (2) integrate information from multiple sensing modalities, and (3) coherently propagate and output measures of uncertainty. Considering Bayesian nonparametric models additionally allows automatic adaptation of our world model with new data, a crucial feature in a complex streaming data scenario. An increasingly rich, structured set of Bayesian and Bayesian nonparametric models has been proposed to capture relational and dynamic dependencies. Unfortunately, the associated inference tools have not advanced at the same pace. Algorithms notoriously scale poorly to large datasets, especially in the presence of complex dependencies, and typically one is restricted to analyzing a fixed, small batch of data. Intelligence analysts, on the other hand, need to cope with large, streaming data, and as a result have been forced to throw away important structure and consider simpler, but more scalable, methods. We instead propose algorithms for making inferences on-the-fly, including providing associated measures of uncertainty, based on complex, heterogeneous sensing technologies providing unbounded streams of observations.
We consider the running example of activity recognition and anomaly detection, and provide a road map for devising efficient algorithms for Bayesian inference that preserve the important temporal and relational modeling structure critical to the data being analyzed. This road map includes scientific developments along three major research threads:

1. Streaming Bayesian Nonparametric Inference: We develop sequential algorithms for Bayesian approximate inference in models whose complexity can adapt on-the-fly, and provide bounds on the approximation errors.
2. Large-Scale Activity Recognition from Dependent Data Sources: To jointly learn model parameters and make inferences from sequences of observations at scale, we harness information decay to consider only local temporal and relational structure.
3. Which Model to Use? To automate the daunting task of model selection, we propose a context-free grammar on the product space of relational and temporal structures. We consider this task within the context of available computing resources and constraints.

Our proposed systems provide parallel and distributed implementations and enable inferences to be made on-the-fly and in the cloud from richer, more expressive models for the observed data streams. As such, this research is an important step towards increasing the impact of data collected in surveillance tasks: we are arming analysts with tools to effectively extract information from networks of advanced sensors developed and deployed over recent years. Through the proposed technology, we believe we can positively impact the goal of Information Dominance in creating an integrated information and decision-making space.

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 12, 2016
Source ID
N000141512380

Entities

People

  • Emily B. Fox

Organizations

  • Office of Naval Research
  • United States Navy
  • University of Washington

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Statistical inference

Technology Areas

  • AI & ML
  • Space