Ensembles and Their Applications
Abstract
Classifiers are powerful tools that, given a set of predictor variables, predict the category to which an observation belongs. Linear discriminant analysis, k-nearest neighbor, and classification trees are a few examples. An ensemble applies a classifier to a data set repeatedly and combines the resulting predictions, with the aim of achieving a lower error rate than a single classifier. Ensembles were the focus of my research. I discussed three accepted, yet still developing, ensemble methods: Bagging, Arc-Boosting, and Ada-Boosting. I then demonstrated that they usually achieve a lower error rate than a single classification tree, and that the k-nearest neighbor classifier rarely benefits from an ensemble. Thorough study of the three methods led to the exploration of a new ensemble method, which reduced the error rate on all of the sampled data sets but did not produce results competitive with the other ensemble methods. Most importantly, predictions on the Landsat Imagery Satellite data improved dramatically with all three ensembles when a classification tree was the classifier. Some error rates improved by as much as 25%, while others saw smaller reductions. Nearly all of the data sets showed at least a small reduction in error rate, provided the single classifier performed slightly better than guessing.
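As a rough illustration of the bagging idea named in the abstract, the sketch below trains several simple classifiers on bootstrap resamples of the data and combines them by majority vote. It is not the report's implementation: one-split decision stumps stand in for full classification trees to keep the example short, and every function name and parameter here (`fit_stump`, `bagging_fit`, `n_models`, `seed`) is hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision stump (a stand-in for a classification tree).

    Tries every feature and every observed value as a threshold and keeps
    the split with the fewest training errors.
    Returns (feature_index, threshold, left_label, right_label).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # An empty right side falls back to the left-side label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    """Predict the label of one observation with a fitted stump."""
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


def bagging_fit(X, y, n_models=25, seed=0):
    """Fit n_models stumps, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Sample n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]
```

Boosting methods differ from this sketch in that each round reweights or resamples the training data according to the errors of earlier rounds, rather than drawing uniform bootstrap samples.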
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 2000
- Accession Number: ADA387205
Entities
People
- Matt C. Dixon
Organizations
- Air Force Institute of Technology