T-Cube: A Data Structure for Fast Extraction of Time Series from Large Datasets

Abstract

This report introduces a data structure called T-Cube, designed to dramatically improve response time to ad-hoc time series queries against large datasets. We have tested T-Cube on both synthetic and real-world data (emergency room patient visits, pharmacy sales) containing millions of records. The results indicate that T-Cube responds to complex queries 1,000 times faster than state-of-the-art commercial time series extraction tools. This speedup has two main benefits: (1) it enables massive-scale statistical mining of large collections of time series data, and (2) it allows its users to perform many complex ad-hoc queries without inconvenient delays. These benefits have already been found useful in applications related to monitoring the safety of food and agriculture, in detecting emerging patterns of failures in maintenance and supply management systems, and in the original application domain, bio-surveillance.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2007
Accession Number
ADA471457

Entities

People

  • Andrew W. Moore
  • Artur Dubrawski
  • Maheshkumar Sabhnani

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Age Groups
  • Air Force
  • Computer Science
  • Computers
  • Data Mining
  • Data Sets
  • Databases
  • Emergencies
  • Extraction
  • Health
  • Health Services
  • Information Science
  • Machine Learning
  • Operating Systems
  • Public Health
  • Statistical Analysis
  • Surveillance

Fields of Study

  • Computer science

Readers

  • Educational Psychology
  • Industrial Economics
  • Neural Network Machine Learning