A Comparison of Variable Selection Criteria for Multiple Linear Regression: A Second Simulation Study

Abstract

This thesis implements a variable selection method proposed by Alan J. Miller and extends Ross J. Hansen's 1988 thesis research by comparing Miller's method with the three methods Hansen examined: Minimum MSE, Minimum Sp, and Minimum Cp. Response surface methodology is employed with two performance measures: the percentage of correct variables in a model and the Theoretical Mean Squared Error of Prediction (TMSEP). Each technique is applied to generated data with known multicollinearities, variances, random predictors, and sample sizes, and both performance measures are computed for the models selected under each technique. A full factorial design is set up for each performance measure to study the effectiveness of each variable selection technique with respect to the known data characteristics. Equations are generated that relate these data characteristics to each combination of performance measure and selection method. A graphical analysis of variance is performed to summarize each technique's performance. Miller's method is shown to be the best overall technique for selecting models with the highest percentage of correct variables. Minimum MSE, followed closely by Minimum Sp, selected models with the least TMSEP.... Keywords: Statistics, Regression analysis, Least squares method, Subset selection.
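As a rough illustration of the three criteria named in the abstract, the sketch below scores every non-empty predictor subset of a small data set and returns the subset that minimizes each criterion. It assumes the commonly cited textbook forms: residual mean square MSE_p = RSS_p / (n - p), Hocking's S_p = MSE_p / (n - p - 1), and Mallows' C_p = RSS_p / s² - (n - 2p), where p counts the intercept and s² is the residual mean square of the full model. The thesis may define these differently, and Miller's method is not reproduced here. The function names (`rss`, `score_subsets`, `pick`) and the simulated data are hypothetical and exist only for this example.

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def score_subsets(X, y):
    """Compute MSE, Sp, and Cp for every non-empty subset of the columns of X."""
    n, k = X.shape
    # Error-variance estimate from the full model (k predictors plus intercept).
    sigma2_full = rss(X, y, range(k)) / (n - k - 1)
    rows = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            p = size + 1  # number of fitted parameters, intercept included
            r = rss(X, y, cols)
            mse = r / (n - p)
            rows.append({
                "cols": cols,
                "p": p,
                "mse": mse,
                "sp": mse / (n - p - 1),
                "cp": r / sigma2_full - (n - 2 * p),
            })
    return rows


def pick(rows, key):
    """Return the column subset that minimizes the named criterion."""
    return min(rows, key=lambda row: row[key])["cols"]


# Illustrative data only: columns 0 and 2 carry signal, the rest are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=40)

rows = score_subsets(X, y)
for key in ("mse", "sp", "cp"):
    print(key, pick(rows, key))
```

Exhaustive enumeration is only practical for a handful of candidate predictors, since the number of subsets doubles with each added column. Under these definitions the full model always has C_p equal to its own parameter count, which makes a convenient sanity check.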

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 1993
Accession Number
ADA262512

Entities

People

  • David P. Woollard

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Analysis Of Variance
  • Computational Science
  • Computer Programming
  • Computer Programs
  • Computers
  • Data Science
  • Data Sets
  • Equations
  • Factorial Design
  • Information Science
  • Least Squares Method
  • Mathematical Models
  • Regression Analysis
  • Simulations
  • Statistical Analysis
  • Statistics

Readers

  • Regression Analysis.