Robust Order Statistics Based Ensembles for Distributed Data Mining
Abstract
This chapter is rooted in the ensemble framework and shows how order statistics can be used to design a "meta-learner" that examines the outputs of multiple distributed classifiers and provides a final decision. Order statistics are one of the key tools of robust statistics, which is tailored to handling data with outliers. In a distributed data mining scenario, the individual classifiers can vary widely because of the underlying quality of the local data that they examine, so a meta-learner should be able to tolerate a few outlier classifier results. The robust properties of order statistics based approaches such as median filtering and m-estimators (Arnold, Balakrishnan, and Nagaraja 1992) have been observed in many disciplines. They are therefore a natural candidate for meta-learning in such environments.
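To make the idea concrete, the following is a minimal sketch of a median-based combiner, one simple order statistic rule, compared against plain averaging. It is an illustration only and not the report's own implementation: the function names `median_combine` and `mean_combine`, the score layout (one list of per-class scores per classifier), and the example numbers are all assumptions made for this sketch.

```python
from statistics import mean, median


def _combine(outputs, reducer):
    """Combine per-class scores across classifiers with the given reducer.

    outputs: list with one entry per classifier; each entry is a list of
    scores, one per class (hypothetical layout chosen for this sketch).
    Returns (index of the winning class, combined per-class scores).
    """
    n_classes = len(outputs[0])
    combined = [reducer([clf[c] for clf in outputs]) for c in range(n_classes)]
    predicted = max(range(n_classes), key=lambda c: combined[c])
    return predicted, combined


def median_combine(outputs):
    # Order statistic combiner: per-class median of the classifier scores.
    return _combine(outputs, median)


def mean_combine(outputs):
    # Baseline for comparison: per-class arithmetic mean.
    return _combine(outputs, mean)


# Illustrative scores (made up): three classifiers mildly favor class 0,
# while two outlier classifiers assign all of their score to class 1.
scores = [
    [0.60, 0.40],
    [0.55, 0.45],
    [0.60, 0.40],
    [0.00, 1.00],
    [0.00, 1.00],
]

median_class, median_scores = median_combine(scores)
mean_class, mean_scores = mean_combine(scores)
print("median combiner picks class", median_class, median_scores)
print("mean combiner picks class", mean_class, mean_scores)
```

In this made-up example the extreme scores of the two outlier classifiers pull the average toward class 1, whereas the per-class median still follows the three classifiers that agree. The median only tolerates outliers while they remain a minority of the ensemble.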
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2001
- Accession Number: ADA395854
Entities
People
- Joydeep Ghosh
- Kagan Tumer
Organizations
- University of Texas at Austin