Robust Order Statistics Based Ensembles for Distributed Data Mining

Abstract

This chapter is rooted in the ensemble framework and shows how order statistics can be used in the design of a "meta-learner" that examines the outputs of multiple distributed classifiers and provides a final decision. Order statistics is one of the key tools of robust statistics, tailored to handling data with outliers. In a distributed data mining scenario in which there is wide variability among the individual classifiers because of the underlying quality of the local data that they examine, a meta-learner should be able to tolerate a few outlier classifier results. The robust properties of order statistics based approaches such as median filtering and m-estimators (Arnold, Balakrishnan, and Nagaraja 1992) have been observed in many disciplines. Thus they are an obvious candidate for meta-learning in such environments.
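
To illustrate the idea of tolerating an outlier classifier, the following is a minimal sketch of a median-based combiner set against a simple averaging combiner. It is not taken from the report: the function name `combine`, the score values, and the two-class setup are all assumptions chosen for illustration.

```python
from statistics import mean, median


def combine(outputs, reducer):
    """Reduce per-classifier class scores class by class, then pick the top class.

    `outputs` is a list with one row per classifier; each row holds one
    score per class. `reducer` maps a list of scores to a single score.
    """
    n_classes = len(outputs[0])
    combined = [reducer([row[c] for row in outputs]) for c in range(n_classes)]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Hypothetical scores from five distributed classifiers for classes 0 and 1.
# The last classifier is assumed to have seen poor local data and acts as
# an outlier that is fully confident in the wrong class.
outputs = [
    [0.60, 0.40],
    [0.65, 0.35],
    [0.62, 0.38],
    [0.58, 0.42],
    [0.00, 1.00],
]

median_label, median_scores = combine(outputs, median)
mean_label, mean_scores = combine(outputs, mean)

print("median combiner:", median_label, median_scores)
print("mean combiner:  ", mean_label, mean_scores)
```

With these made-up scores, the single outlier is enough to flip the averaged decision to class 1, while the median combiner still agrees with the four consistent classifiers and selects class 0.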

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2001
Accession Number
ADA395854

Entities

People

  • Joydeep Ghosh
  • Kagan Tumer

Organizations

  • University of Texas at Austin

Tags

Communities of Interest

  • Human Systems
  • Materials and Manufacturing Processes
  • Sensors
  • Space

DTIC Thesaurus Topics

  • Abstracts
  • Data Mining
  • Data Sets
  • Databases
  • Distribution Functions
  • Gaussian Distributions
  • Health Care
  • Information Processing
  • Information Science
  • Machine Learning
  • Nervous System
  • Order Statistics
  • Pattern Recognition
  • Probability
  • Probability Density Functions
  • Recognition
  • Statistics

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Neural Network Machine Learning.
  • Statistical inference.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks