Robust Order Statistics Based Ensembles for Distributed Data Mining

Abstract

This chapter builds on the ensemble framework and shows how order statistics can be used to design a "meta-learner" that examines the outputs of multiple distributed classifiers and produces a final decision. Order statistics are a key tool of robust statistics, which is tailored to handling data with outliers. In a distributed data mining scenario, the individual classifiers can vary widely because of differences in the quality of the local data they examine, so a meta-learner should be able to tolerate a few outlier classifier results. The robustness of order statistics based approaches such as median filtering and m-estimators (Arnold, Balakrishnan, and Nagaraja 1992) has been observed in many disciplines, which makes them a natural candidate for meta-learning in such environments.
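
To make the idea of tolerating an outlier classifier concrete, the following is a minimal sketch of a median-based combiner, one simple order statistics rule. It is an illustration under stated assumptions, not the report's actual method: the function names `median_combine` and `mean_combine` and the example scores are hypothetical, and each classifier is assumed to emit one score per class.

```python
from statistics import mean, median


def _combine(outputs, reducer):
    """Reduce per-class scores across classifiers, then pick the top class.

    outputs: list of score lists, one list per classifier, one score per class.
    reducer: function mapping an iterable of scores to a single score.
    """
    n_classes = len(outputs[0])
    combined = [reducer([clf[c] for clf in outputs]) for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: combined[c])
    return label, combined


def median_combine(outputs):
    """Order statistics combiner: per-class median of the classifier scores."""
    return _combine(outputs, median)


def mean_combine(outputs):
    """Baseline for comparison: per-class average of the classifier scores."""
    return _combine(outputs, mean)


# Hypothetical scores: three classifiers mildly favor class 0,
# while a fourth (trained on poor local data) strongly favors class 1.
outputs = [
    [0.60, 0.40],
    [0.55, 0.45],
    [0.60, 0.40],
    [0.00, 1.00],  # outlier classifier
]

median_label, _ = median_combine(outputs)
mean_label, _ = mean_combine(outputs)
print("median picks class", median_label)  # class 0
print("mean picks class", mean_label)      # class 1
```

In this example the single extreme classifier pulls the averaged scores toward class 1, while the per-class median still follows the majority and selects class 0.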

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2001
Accession Number: ADA395854

Entities

People

Joydeep Ghosh
Kagan Tumer

Organizations

University of Texas at Austin
