Sensitivity Analysis of the Topology of Classification Trees
Abstract
Classification trees are among the most widely used techniques in classification. It is well known that classification trees are unstable in their topology, even though they are robust with respect to misclassification rate. This thesis defines a measure for comparing the topology of two trees and uses it to study how a tree's topology changes when the dependent (y) variable or the independent (x) variables are perturbed. This allows us to examine the "robustness" of tree topology under perturbation and to compare it with the robustness of the misclassification rate under the same perturbations. We show that, for many data sets, tree topology can change significantly even under small perturbations. This suggests that even small measurement errors in the variables can greatly affect tree topology. Because data are often measured with error, splitting rules in trees may not be suitable for making policy decisions. Using the proposed measure, we also show that tree topology changes faster than the misclassification rate does under mild perturbations. This finding formalizes the idea that tree models are more stable in terms of misclassification rate than in terms of topology.
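The kind of experiment the abstract describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the thesis's topology measure, tree-growing algorithm, or perturbation scheme, so the greedy Gini tree (`fit_tree`), the node-by-node `topology_distance`, and the label-flipping perturbation `flip_labels` are hypothetical stand-ins, not the author's method.

```python
import random

def gini(labels):
    """Gini impurity for binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_tree(X, y, depth=2):
    """Greedy binary tree (illustrative, not the thesis's algorithm).
    Leaf: ("leaf", label). Node: ("split", feature, threshold, left, right)."""
    majority = int(sum(y) * 2 >= len(y))
    if depth == 0 or len(set(y)) == 1:
        return ("leaf", majority)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # no feature has two distinct values
        return ("leaf", majority)
    _, j, t, left, right = best
    return ("split", j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1))

def predict(tree, row):
    while tree[0] == "split":
        _, j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree[1]

def misclassification_rate(tree, X, y):
    return sum(predict(tree, r) != yi for r, yi in zip(X, y)) / len(y)

def count_splits(tree):
    return 0 if tree[0] == "leaf" else 1 + count_splits(tree[3]) + count_splits(tree[4])

def topology_distance(a, b):
    """Hypothetical measure: number of split nodes that are unmatched or
    that split on a different feature. Thresholds are ignored."""
    if a[0] == "leaf" or b[0] == "leaf":
        return count_splits(a) + count_splits(b)
    return (int(a[1] != b[1])
            + topology_distance(a[3], b[3])
            + topology_distance(a[4], b[4]))

def flip_labels(y, fraction, rng):
    """Perturb the dependent variable by flipping a fraction of the labels."""
    y = list(y)
    for i in rng.sample(range(len(y)), int(fraction * len(y))):
        y[i] = 1 - y[i]
    return y

def demo(seed=0, n=200, fraction=0.05):
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [int(a + b > 1.0) for a, b in X]  # synthetic labels, assumed for the demo
    base = fit_tree(X, y)
    perturbed = fit_tree(X, flip_labels(y, fraction, rng))
    print("topology distance:", topology_distance(base, perturbed))
    print("error, base tree:", misclassification_rate(base, X, y))
    print("error, perturbed tree:", misclassification_rate(perturbed, X, y))

if __name__ == "__main__":
    demo()
```

Running `demo()` with several seeds and perturbation fractions gives a rough sense of how often the structure of the refitted tree differs while its error rate on the original labels stays close to that of the base tree. The sketch only shows the shape of such a comparison and makes no claim about the results reported in the thesis.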
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 1999
- Accession Number: ADA372965
Entities
People
- Izumi Kobayashi
Organizations
- Naval Postgraduate School