Sensitivity Analysis of the Topology of Classification Trees

Abstract

Classification trees are among the most widely used techniques in classification. It is well known that their topology is unstable, in contrast to the robustness of their misclassification rate. This thesis defines a measure that compares the topology of two trees and studies how a tree's topology changes when the dependent (y) variable or the independent (x) variables are perturbed. This allows us to examine the "robustness" of tree topology under perturbation and to compare it with the robustness of the misclassification rate under the same perturbations. We show that, in many data sets, the tree topology can change significantly even under small perturbations. This suggests that even small measurement errors in the variables can greatly affect the tree topology. Because data are often measured with error, the splitting rules in trees may not be suitable for use in making policy decisions. Using the proposed measure, we show that tree topology changes faster than the misclassification rate does under mild perturbations. This finding formalizes the concept that tree models are more stable in terms of misclassification rate than in terms of topology.
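The kind of experiment the abstract describes can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the greedy Gini tree (`fit_tree`), the synthetic data, the label-flipping perturbation (`flip_labels`), and in particular `topology_distance`, a stand-in dissimilarity that compares which variable is split on at each node position. The thesis defines its own topology measure, which this record does not reproduce.

```python
import random

def gini(labels):
    """Gini impurity for binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Greedy binary tree: ("leaf", label) or (feature, threshold, left, right)."""
    majority = int(sum(y) * 2 >= len(y))
    if depth == max_depth or len(set(y)) == 1 or len(y) < 2 * min_leaf:
        return ("leaf", majority)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return ("leaf", majority)
    _, j, t = best
    lx = [row for row in X if row[j] <= t]
    ly = [yi for row, yi in zip(X, y) if row[j] <= t]
    rx = [row for row in X if row[j] > t]
    ry = [yi for row, yi in zip(X, y) if row[j] > t]
    return (j, t,
            fit_tree(lx, ly, depth + 1, max_depth, min_leaf),
            fit_tree(rx, ry, depth + 1, max_depth, min_leaf))

def predict(tree, row):
    """Follow the splits down to a leaf and return its label."""
    while tree[0] != "leaf":
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree[1]

def error_rate(tree, X, y):
    """Fraction of rows the tree misclassifies."""
    return sum(predict(tree, r) != yi for r, yi in zip(X, y)) / len(y)

def structure(tree, path=""):
    """Set of (node position, split variable) pairs; thresholds are ignored."""
    if tree[0] == "leaf":
        return set()
    j, _, left, right = tree
    return {(path, j)} | structure(left, path + "L") | structure(right, path + "R")

def topology_distance(a, b):
    """Hypothetical measure: 1 - Jaccard similarity of the two structures."""
    sa, sb = structure(a), structure(b)
    if not sa and not sb:
        return 0.0
    return 1 - len(sa & sb) / len(sa | sb)

def flip_labels(y, fraction, rng):
    """Perturb y by flipping a given fraction of the binary labels."""
    y = list(y)
    for i in rng.sample(range(len(y)), int(fraction * len(y))):
        y[i] = 1 - y[i]
    return y

rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]  # synthetic data
y = [int(a + b > 1) for a, b in X]
base = fit_tree(X, y)
for frac in (0.0, 0.02, 0.05, 0.10):
    perturbed = fit_tree(X, flip_labels(y, frac, rng))
    change_in_error = abs(error_rate(perturbed, X, y) - error_rate(base, X, y))
    print(frac, topology_distance(base, perturbed), change_in_error)
```

Each printed row pairs a perturbation level with the topology change and the change in misclassification rate on the unperturbed labels, which is the comparison the abstract draws. A perturbation of the x variables could be sketched the same way by adding noise to the rows of `X` before refitting.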

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1999
Accession Number
ADA372965

Entities

People

  • Izumi Kobayashi

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Classification
  • Computer Programs
  • Computer Science
  • Computers
  • Contrast
  • Data Analysis
  • Data Sets
  • Information Science
  • Measurement
  • New York
  • Perturbations
  • Probability
  • Programming Languages
  • Sensitivity
  • Simulations
  • Splitting

Fields of Study

  • Computer science

Readers

  • Control Systems Engineering
  • Neural Network Machine Learning