A Subset Selection Procedure for Regression Variables

Abstract

Given a regression model with p independent variables, several methods are available for selecting a subset of size t < p which gives an adequate description of the dependent variable. By using the capabilities of the computer, one can now determine the subset corresponding to the largest sample multiple correlation coefficient or, equivalently, the smallest residual mean square. Because of sampling variation, however, there is no guarantee that this subset also has the smallest expected residual mean square. A procedure is presented to determine a collection of subsets, each of given size t, having the property that the probability of including the subset corresponding to the smallest value of the expected residual mean square is bounded below by some prespecified constant, 1 - alpha. An example using real data is examined to illustrate the technique. (Author)
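The exhaustive search mentioned in the abstract (finding the size-t subset with the smallest sample residual mean square) can be sketched as follows. This is only an illustration of that first step, not the report's procedure: it does not construct the collection of subsets carrying the 1 - alpha guarantee. The function names and the toy data are assumptions made for this example, and the fit uses plain normal equations, which is adequate only for small, well-conditioned problems.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is nonsingular (hypothetical helper for this sketch).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def residual_mean_square(rows, y, subset):
    """Fit y on an intercept plus the chosen columns; return RSS / (n - t - 1)."""
    design = [[1.0] + [row[j] for j in subset] for row in rows]
    k = len(design[0])  # t + 1 coefficients, including the intercept
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
              for d, yi in zip(design, y))
    return rss / (len(y) - k)


def rank_subsets(rows, y, t):
    """Return (residual mean square, subset) pairs for all size-t subsets, best first."""
    p = len(rows[0])
    scored = [(residual_mean_square(rows, y, s), s)
              for s in combinations(range(p), t)]
    return sorted(scored)


# Made-up toy data (not from the report): y depends exactly on columns 0 and 2.
rows = [[1, 3, 1], [2, 1, 0], [3, 4, 1], [4, 1, 0], [5, 5, 1], [6, 9, 0]]
y = [1 + 2 * r[0] + 3 * r[2] for r in rows]
ranking = rank_subsets(rows, y, t=2)
```

Here `ranking[0]` holds the subset with the smallest sample residual mean square. With real, noisy data that sample winner need not be the subset with the smallest expected residual mean square, which is the gap the report's procedure is meant to address.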

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 1973
Accession Number
AD0757430

Entities

People

  • George P. McCabe Jr.
  • James N. Arvesen

Organizations

  • Purdue University

Tags

DTIC Thesaurus Topics

  • Coefficients
  • Computers
  • Guarantees
  • Probability
  • Residuals
  • Sampling

Fields of Study

  • Mathematics

Readers

  • Computational Modeling and Simulation
  • Linear Algebra