Randomized Ensemble Methods for Classification Trees

Abstract

We propose two methods of constructing ensembles of classifiers. One method directly injects randomness into classification tree algorithms by choosing a split randomly at each node, with probabilities proportional to the measure of goodness for a split. We combine this method with a stopping rule that uses permutation of the outputs. The other method perturbs the output and constructs a classifier using the perturbed data. In both methods, the final classifier is given by an unweighted vote of the individual classifiers. These methods are compared with bagging, AdaBoost, and random forests on thirteen commonly used data sets. The results show that our methods perform better than bagging, and comparably to AdaBoost and random forests on average. Additional computation shows that our perturbation method could improve its performance by perturbing both the inputs and the outputs, and by combining a sufficiently large number of trees. Plots of strength and correlation show an interesting relationship. We also explore combining sampling subsets of the training set with our proposed methods. The results of a few trials show that the performance of our proposed methods could be improved by sampling subsets of the training set.
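
The three building blocks named in the abstract (a split drawn with probability proportional to its goodness, perturbed outputs, and an unweighted vote) can be illustrated with a minimal Python sketch. This is not the report's implementation. The function names, the representation of candidate splits as (split, goodness) pairs, and in particular the perturbation scheme (relabeling each case with a fixed probability `flip_prob`) are assumptions made only for illustration; the abstract does not specify how the outputs are perturbed.

```python
import random
from collections import Counter


def choose_split(candidates, rng):
    """Draw one split with probability proportional to its goodness.

    `candidates` is a list of (split, goodness) pairs with goodness >= 0.
    This representation is hypothetical, not taken from the report.
    """
    splits = [split for split, _ in candidates]
    weights = [goodness for _, goodness in candidates]
    if sum(weights) <= 0:
        # No split has positive goodness: fall back to a uniform draw.
        return rng.choice(splits)
    return rng.choices(splits, weights=weights, k=1)[0]


def perturb_outputs(labels, classes, flip_prob, rng):
    """Relabel each case, with probability flip_prob, as a different class.

    Assumed scheme for illustration only; the replacement class is drawn
    uniformly from the classes other than the original label.
    """
    perturbed = []
    for label in labels:
        if rng.random() < flip_prob:
            others = [c for c in classes if c != label]
            perturbed.append(rng.choice(others))
        else:
            perturbed.append(label)
    return perturbed


def unweighted_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]
```

In use, each tree in the ensemble would call `choose_split` at every node (or be trained on labels from `perturb_outputs`), and the ensemble prediction for a case would be `unweighted_vote` applied to the list of per-tree predictions for that case.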

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2002
Accession Number
ADA407091

Entities

People

  • Izumi Kobayashi

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Energy and Power Technologies
  • Human Systems
  • Space

DTIC Thesaurus Topics

  • Air Force
  • Algorithms
  • Computations
  • Computer Science
  • Data Mining
  • Data Science
  • Data Sets
  • Databases
  • Information Science
  • Machine Learning
  • Network Science
  • Neural Networks
  • Operations Research
  • Pattern Recognition
  • Probability
  • Supervised Machine Learning
  • Training

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Regression Analysis