Bootstrap Calibration, Model Selection and Tree-Structured Methods

Abstract

Several problems in variable selection and decision trees were solved. For linear regression models with an increasing number of covariates, a method based on ordering the covariates by their t-statistics is shown to be asymptotically consistent as the sample size increases. This result holds in the fixed-design setting as well as with random covariates. A new unbiased method of split selection for classification trees was developed and implemented in computer software. The method is unbiased in the sense that, when all the covariates are unrelated to the response variable, each covariate has an equal chance of being selected to split a node. No previous algorithm has this property. Bootstrap calibration plays a critical role in the algorithm. Empirical evaluations show that the algorithm is as accurate as the best classifiers from the statistical and computer science literature. It has the additional benefit of being one of the fastest algorithms.
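The idea of ordering covariates by their t-statistics can be illustrated with a minimal sketch. This is not the report's procedure: the abstract does not state how many of the ordered covariates are retained or how the cutoff is chosen, so the code below covers only the ordering step. The function names `t_statistics` and `order_by_t`, the simulated data, and the coefficient values are all assumptions made for illustration.

```python
import numpy as np

def t_statistics(X, y):
    """Ordinary least-squares t-statistics for each column of X.

    An intercept column is added internally; its t-statistic is dropped.
    """
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)  # least-squares coefficients
    resid = y - Z @ beta
    sigma2 = resid @ resid / (n - p - 1)          # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(Z.T @ Z)         # covariance of the estimates
    se = np.sqrt(np.diag(cov))                    # standard errors
    return (beta / se)[1:]                        # drop the intercept term

def order_by_t(X, y):
    """Return covariate indices sorted by decreasing absolute t-statistic."""
    t = t_statistics(X, y)
    return np.argsort(-np.abs(t))

# Hypothetical data: only covariates 2 and 4 affect the response.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 6))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + rng.normal(size=n)

ranking = order_by_t(X, y)
print(ranking)  # the two active covariates should lead the ordering
```

With a strong signal and a sample of this size, the two active covariates appear first in the ordering. Choosing where to cut the ordered list is the model selection step, which this sketch leaves open.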

Document Details

Document Type
Technical Report
Publication Date
Apr 06, 1998
Accession Number
ADA344443

Entities

People

  • Loh Wei-yin

Organizations

  • University of Wisconsin–Madison

Tags

Communities of Interest

  • Energy and Power Technologies
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Application Software
  • Artificial Intelligence Software
  • Calibration
  • Classification
  • Computer Programs
  • Computer Science
  • Computers
  • Data Analysis
  • Data Science
  • Information Science
  • Machine Learning
  • Network Science
  • Neural Networks
  • Simulations
  • Statistics

Fields of Study

  • Mathematics

Readers

  • Finite Element Method (FEM) for solving Partial Differential Equations (PDEs)
  • Regression Analysis