Adaptive, Hands-Off Stream Mining

Abstract

Sensor devices and embedded processors are becoming ubiquitous, especially in measurement and monitoring applications. Automatically discovering patterns and trends in the large volumes of data they produce is of paramount importance. Their relatively limited resources (CPU, memory, communication bandwidth and power) pose some interesting challenges. We need languages that are both powerful and concise to represent the important features of the data, and that (a) adapt to and handle arbitrary periodic components, including bursts, and (b) require little memory and a single pass over the data. This allows sensors to automatically (a) discover interesting patterns and trends in the data, and (b) perform outlier detection to alert users. For example, a sensor should be able to discover that the hourly phone call volume so far follows a daily and a weekly periodicity, with bursts roughly every year, which a human might recognize as, e.g., the Mother's Day surge. When possible and if desired, the user can then issue explicit queries to further investigate the reported patterns. In this work we propose AWSOM (Arbitrary Window Stream mOdeling Method), which allows sensors operating in remote or hostile environments to discover patterns efficiently and effectively, with practically no user intervention. Our algorithms require limited resources and can thus be incorporated in individual sensors, possibly alongside a distributed query processing engine [CCC+02, BGS01, MSHR02]. Updates are performed in constant time, using sub-linear (in fact, logarithmic) space. Existing state-of-the-art forecasting methods (AR, SARIMA, GARCH, etc.) fall short on one or more of these requirements. To the best of our knowledge, AWSOM is the first method that has all the above characteristics.
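
As a rough illustration of the resource claims in the abstract (a single pass, little memory, cheap updates), the sketch below maintains an incremental Haar wavelet summary of a stream. It is not the AWSOM algorithm and is not taken from the report; the class name `StreamingHaar` and its methods are hypothetical, chosen only for this example. It keeps at most one pending value per resolution level, so its state grows logarithmically with the number of values seen, and each update does a constant amount of work on average (amortized, which is a weaker guarantee than the one stated in the abstract).

```python
import math


class StreamingHaar:
    """Illustrative single-pass Haar summary (hypothetical, not from the report).

    pending[l] holds the one unpaired scaling coefficient at level l, or None.
    The list therefore has about log2(count) entries.
    """

    def __init__(self):
        self.pending = []
        self.count = 0

    def update(self, x):
        """Consume one value; return the (level, detail) coefficients it completes."""
        self.count += 1
        emitted = []
        carry = float(x)
        level = 0
        while True:
            if level == len(self.pending):
                self.pending.append(None)
            left = self.pending[level]
            if left is None:
                # No partner yet at this level: park the value and stop.
                self.pending[level] = carry
                break
            # A pair is complete: emit its detail and push the average upward.
            self.pending[level] = None
            detail = (left - carry) / math.sqrt(2)
            carry = (left + carry) / math.sqrt(2)
            emitted.append((level + 1, detail))
            level += 1
        return emitted


if __name__ == "__main__":
    summary = StreamingHaar()
    for value in [1, 2, 3, 4]:
        print(value, summary.update(value))
```

A downstream model could consume the emitted detail coefficients as they appear, for example to flag values that deviate strongly from what earlier coefficients at the same level suggest. How AWSOM itself models the data is described in the full report, not in this sketch.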

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 2002
Accession Number
ADA461108

Entities

People

  • Anthony Brockwell
  • Christos Faloutsos
  • Spiros Papadimitriou

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Anomaly Detection
  • Change Detection
  • Computers
  • Data Mining
  • Data Sets
  • Detection
  • Detectors
  • Dimensionality Reduction
  • Equations
  • Frequency
  • Information Science
  • Maximum Likelihood Estimation
  • Measurement
  • Network Science
  • Signal Processing
  • Statistics

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Parallel and Distributed Computing
  • Systems Analysis and Design

Technology Areas

  • Space