Issues in Scaling Up Machine Learning

Abstract

This grant investigates issues in improving the accuracy of machine learning systems. The classic machine learning paradigm for prediction has been to learn a set of decision structures or models from a training set and select one for prediction on unseen test data. Rather than select a single model from the set, the focus of this project's research has been to combine the predictions of the learned models to form an improved estimate. The two fronts of this research are regression and classification. In the realm of regression, the task is to predict a single continuous value for an example. The majority of research in this area has focused on simple linear combinations of the learned models, in which each model's prediction is multiplied by a weight. These weights may range from highly regularized to completely unconstrained. A set of weights is considered highly regularized if they are all positive, they sum to one, or they are uniform. Completely unconstrained weights have no restrictions and may be derived by methods like ordinary least squares regression. The degree of regularization required depends on the particular regression problem. The project has developed a technique called PCR*, which automatically estimates the appropriate degree of regularization for a given data set. The basic idea is to use the eigenstructure of the model predictions on the training data to derive a continuum of possible weight sets ranging from highly regularized to completely unconstrained. Cross-validation is used to estimate which weight set is most appropriate.
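
The weight-selection idea in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: it assumes a principal components regression on the matrix of model predictions, uses an uncentered singular value decomposition to obtain the eigenstructure, and scores each candidate by summed squared error over cross-validation folds. The names `pcr_weights`, `choose_k_by_cv`, `preds`, and `n_folds` are hypothetical, and the data is synthetic.

```python
import numpy as np

def pcr_weights(preds, y, k):
    """Combination weights built from the first k principal components.

    preds: (n_examples, n_models) matrix of model predictions (assumed layout).
    Small k gives a heavily constrained weight set; k == n_models gives
    the ordinary least squares weights.
    """
    u, s, vt = np.linalg.svd(preds, full_matrices=False)
    # Regress y on the top-k components, then map back to per-model weights.
    gamma = (u[:, :k].T @ y) / s[:k]
    return vt[:k].T @ gamma

def choose_k_by_cv(preds, y, n_folds=5, seed=0):
    """Pick the number of components with the lowest cross-validated error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    n_models = preds.shape[1]
    errors = np.zeros(n_models)
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        for k in range(1, n_models + 1):
            w = pcr_weights(preds[train], y[train], k)
            residual = preds[held_out] @ w - y[held_out]
            errors[k - 1] += np.sum(residual ** 2)
    return int(np.argmin(errors)) + 1

# Synthetic stand-in for four learned models: each predicts y plus noise.
rng = np.random.default_rng(1)
y = rng.normal(size=200)
preds = y[:, None] + 0.3 * rng.normal(size=(200, 4))

k = choose_k_by_cv(preds, y)
w = pcr_weights(preds, y, k)
combined = preds @ w  # combined estimate for each example
```

Under these assumptions, sweeping `k` from 1 to the number of models traces a continuum of weight sets, and the fold loop plays the role of the cross-validation step that selects among them.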

Document Details

Document Type: Technical Report
Publication Date: Mar 13, 1997
Accession Number: ADA337740

Entities

People

Michael Pazzani

Organizations

University of California, Irvine
