Sequential Learning with Very Non-uniformly Sampled Time Series

Abstract

Time series classification applies supervised machine learning to temporally ordered data in order to label new sequences. It has grown in popularity as access to time series data has increased in recent years, with applications ranging from audio recordings and medical signals to weather prediction. Most approaches assume that the data are sampled uniformly, or nearly uniformly, in time. However, there are important applications where this is not the case. This project examined a very non-uniformly sampled time series dataset with a three-class classification task. The dataset was also quite large and required very high-dimensional features. These considerations encouraged the use of sequential learning techniques, that is, machine learning models that take sequences of data as input or produce them as output. The goal of the project was to identify pre-processing techniques and sequence-generation approaches that help with this classification task. If successful, the results could offer insight into similar sequential learning problems. The data were first standardized over the entire dataset. The data as given contained large gaps in time with no samples, called dead zones, which were artificially filled in by interpolation and zero-mean padding. A relative time encoding feature was also created to help the predictor interpret the amount of time between bursts of data. Decimation was performed so that a window kept the same sequence length while representing a longer duration of time. A jointly optimal predictor was found at (D, N, P, S) = (8, 644616, 250, S/8), where D is the decimation factor, N is the number of sequences used in training, P is the window length, and S is the stride.
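
The pre-processing steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the report's implementation: the function names (standardize, relative_time, decimate, make_windows), the toy data, and the order in which the steps are combined are all assumptions made for illustration. In particular, decimation is shown here as plain subsampling with no anti-aliasing filter, and dead-zone filling is left out because the abstract does not give enough detail to reproduce it.

```python
import numpy as np

def standardize(x):
    """Scale each feature to zero mean and unit variance over the whole array."""
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # avoid dividing by zero for constant features
    return (x - mu) / sigma

def relative_time(t):
    """Time elapsed since the previous sample (0 for the first sample)."""
    return np.diff(t, prepend=t[0])

def decimate(x, d):
    """Keep every d-th row (assumption: plain subsampling, no low-pass filter)."""
    return x[::d]

def make_windows(x, p, s):
    """Cut x into windows of length p, advancing s rows between windows."""
    starts = range(0, len(x) - p + 1, s)
    if len(starts) == 0:
        return np.empty((0, p) + x.shape[1:])
    return np.stack([x[i:i + p] for i in starts])

# Toy data (hypothetical): 8 irregularly spaced samples with 2 features each.
t = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 13.0, 30.0])
x = np.arange(16, dtype=float).reshape(8, 2)

z = standardize(x)                     # standardize over the entire dataset
dt = relative_time(t)                  # large values mark gaps between bursts
features = np.column_stack([z, dt])    # append the time encoding as an extra feature
decimated = decimate(features, d=2)    # D = 2 here; the abstract reports D = 8
windows = make_windows(decimated, p=2, s=1)  # P = 2, S = 1 for the toy example
```

With a decimation factor D, a window of P rows covers roughly D times as many original samples as it would without decimation, which is the trade-off the abstract describes: the sequence length stays fixed while the time span it represents grows.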

Document Details

Document Type
Technical Report
Publication Date
Apr 12, 2022
Accession Number
AD1167434

Entities

People

  • Michael Gregg

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Autonomy
  • Biomedical

DTIC Thesaurus Topics

  • Abstracts
  • Accuracy
  • Air Force
  • Air Force Facilities
  • Air Force Research Laboratories
  • Classification
  • Coding
  • Data Processing
  • Deep Learning
  • Dimensionality Reduction
  • Governments
  • Image Processing
  • Learning
  • Machine Learning
  • Notation
  • Sequences
  • Standardization
  • Supervised Machine Learning
  • Test Sets
  • Training
  • United States
  • Universities
  • Weather Forecasting

Fields of Study

  • Computer science

Readers

  • Computer Programming and Software Development.
  • Regression Analysis.
  • Statistical inference.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks