Wrappers for Performance Enhancement and Oblivious Decision Graphs

Abstract

In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs). For accuracy estimation, we investigate cross-validation and the .632 bootstrap. We show examples where they fail and conduct a large scale study comparing them. We conclude that repeated runs of five-fold cross-validation give a good tradeo between bias and variance for the problem of model selection used in later chapters. We define the wrapper approach and use it for feature subset selection and parameter tuning. We relate definitions of feature relevancy to the set of optimal features, which is defined with respect to both a concept and an induction algorithm. The wrapper approach requires a search space, operators, a search engine, and an evaluation function. We investigate all of them in detail and introduce compound operators for feature subset selection. Finally, we abstract the search problem into search with probabilistic estimates. We introduce decision tables with a default majority rule (DTMs) to test the conjecture that feature subset selection is a very powerful bias. The accuracy of induced DTMs is surprisingly powerful, and we concluded that this bias is extremely important for many real-world datasets. We show that the resulting decision tables are very small and can be succinctly displayed. We study properties of oblivious read-once decision graphs (OODGs) and show that they do not suffer from some inherent limitations of decision trees.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 1995
Accession Number: ADA601756

Entities

People

Ron Kohavi

Organizations

Carnegie Mellon University

Wrappers for Performance Enhancement and Oblivious Decision Graphs

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers

Technology Areas