Learning in the context of distribution drift

Abstract

The increasing ubiquity of data and its ever-increasing use to deliver tangible value raise the need for ever more effective technologies for data analysis. Many online data sources are subject to distribution drift: the frequency of different factors and the relationships between them change over time. This is problematic for machine learning because almost all algorithms assume that distributions are constant. This project investigates new technologies for learning in the context of distribution drift, guided by the insight that different subgroups will change in different ways, at different speeds and at different times. The results are leading towards robust and reliable data analytics, able to make more effective use of big data under real-world conditions of change. The key developments in this project have been the creation of:

  • a sound and applicable theoretical framework for analyzing concept drift,
  • efficient and effective techniques for analyzing, understanding and describing concept drift observed in real-world data,
  • efficient and effective algorithms for learning from time-varying data sequences,
  • efficient and effective algorithms for classifying high-dimensional data,
  • efficient and effective algorithms for handling ordinal data, and
  • efficient and effective algorithms for learning in the context of concept drift.

These new algorithms and techniques greatly improve the community's capacity to learn under the demanding circumstances of concept drift.

Document Details

Document Type: Technical Report
Publication Date: May 09, 2017
Accession Number: AD1037488

Entities

People

  • Geoff Webb

Organizations

  • Monash University

Tags

Communities of Interest

  • Biomedical
  • Space

DTIC Thesaurus Topics

  • Accuracy
  • Air Force
  • Air Force Research Laboratories
  • Bayesian Networks
  • Big Data
  • Climate Change
  • Computations
  • Data Mining
  • Distance Learning
  • Information Science
  • Machine Learning
  • Models
  • Probability
  • Probability Distributions
  • Public Health
  • Standards
  • Training

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Maritime Combat Support and Expeditionary Logistics
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms