An Unsupervised Algorithm for Segmenting Categorical Timeseries into Episodes

Abstract

This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. The algorithm also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.
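
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation. The function names (`ngram_stats`, `boundary_entropy`, `standardize`, `voting_experts`), the `window` and `threshold` parameters, the z-scoring of statistics by n-gram length, the restriction to interior split points, and the local-maximum rule for choosing boundaries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2
from statistics import mean, pstdev


def ngram_stats(seq, max_len):
    """Count every n-gram up to max_len and the symbols that follow each one."""
    freq = Counter()
    followers = defaultdict(Counter)
    for i in range(len(seq)):
        for k in range(1, max_len + 1):
            if i + k > len(seq):
                break
            gram = seq[i:i + k]
            freq[gram] += 1
            if i + k < len(seq):
                followers[gram][seq[i + k]] += 1
    return freq, followers


def boundary_entropy(next_counts):
    """Entropy (bits) of the distribution of symbols following an n-gram."""
    total = sum(next_counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in next_counts.values())


def standardize(values):
    """Z-score each value against n-grams of the same length (assumed step)."""
    by_len = defaultdict(list)
    for gram, v in values.items():
        by_len[len(gram)].append(v)
    params = {k: (mean(v), pstdev(v)) for k, v in by_len.items()}
    out = {}
    for gram, v in values.items():
        mu, sd = params[len(gram)]
        out[gram] = (v - mu) / sd if sd > 0 else 0.0
    return out


def voting_experts(seq, window=3, threshold=1):
    """Segment seq by letting two experts vote on a split in each window."""
    freq, followers = ngram_stats(seq, window)
    z_freq = standardize(freq)
    z_ent = standardize({g: boundary_entropy(followers[g]) for g in freq})

    votes = [0] * (len(seq) + 1)
    splits = range(1, window)  # interior split points only (a simplification)
    for start in range(len(seq) - window + 1):
        win = seq[start:start + window]
        # Frequency expert: prefer the split whose two parts are most frequent.
        best_f = max(splits, key=lambda k: z_freq[win[:k]] + z_freq[win[k:]])
        # Entropy expert: prefer the split after the least predictable prefix.
        best_e = max(splits, key=lambda k: z_ent[win[:k]])
        votes[start + best_f] += 1
        votes[start + best_e] += 1

    # Keep positions whose vote count exceeds the threshold and is a local peak.
    cuts = [p for p in range(1, len(seq))
            if votes[p] > threshold
            and votes[p] > votes[p - 1]
            and votes[p] >= votes[p + 1]]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Calling `voting_experts("thecatsatonthematthecatatethemat" * 3)` returns a list of substrings that concatenate back to the input. Which boundaries are found depends on the assumed window size and threshold, and a toy input this short gives the statistics little to work with.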

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2002
Accession Number
ADA461169

Entities

People

  • Brent Heeringa
  • Niall Adams
  • Paul Cohen

Organizations

  • University of Massachusetts Amherst

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Boundaries
  • Computer Science
  • Computer Vision
  • Frequency
  • Information Retrieval
  • Information Science
  • Language
  • Mathematics
  • Natural Languages
  • Numbers
  • Personality
  • Precision
  • Random Variables
  • Sequences
  • Standards
  • Statistical Samples

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Mathematical Modeling and Probability Theory
  • Personnel Management and Statistics in the Military and Department of Defense

Technology Areas

  • AI & ML
  • Autonomy