Biased and Unbiased Cross-Validation in Density Estimation
Abstract
Nonparametric density estimation requires the specification of smoothing parameters. The demands of statistical objectivity make it highly desirable to base the choice on properties of the data set. This paper introduces some biased cross-validation criteria for selection of smoothing parameters for kernel and histogram density estimators. These criteria are obtained by estimating L2 norms of derivatives of the unknown density and provide slightly biased estimates of the average squared L2 error, or mean integrated squared error. These criteria are roughly the analog of the generalized cross-validation procedure for orthogonal series density estimators. The authors present the relationship of the biased cross-validation procedure to the least squares cross-validation procedure, which provides unbiased estimates of the mean integrated squared error. Both methods are shown to be based on U-statistics. The two methods are compared through theoretical calculation of the noise in the cross-validation functions and corresponding cross-validated smoothing parameters, by Monte Carlo simulation, and by example. Surprisingly large gains in asymptotic efficiency are observed between biased and unbiased cross-validation when the underlying density is sufficiently smooth. Reliability of cross-validation for finite samples is discussed.
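As an illustration of the two criteria the abstract contrasts, the following is a minimal sketch for a one-dimensional kernel estimator. It assumes a Gaussian kernel, which the abstract does not specify, and uses the standard closed-form expressions that follow from that choice. The function names (ucv, bcv, select_bandwidth) and the search grid are hypothetical and are not taken from the report.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)


def _scaled_pair_diffs(x, h):
    """Return (x_i - x_j) / h for all pairs i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    return (x[i] - x[j]) / h


def ucv(x, h):
    """Least squares (unbiased) cross-validation score, Gaussian kernel assumed.

    Computes integral(fhat^2) - (2/n) * sum_i fhat_{-i}(x_i) in closed form.
    """
    n = np.asarray(x).size
    d2 = _scaled_pair_diffs(x, h) ** 2
    # Integral of the squared estimate: diagonal term plus pairwise terms.
    integral_fhat_sq = (1.0 / (2.0 * n * h * SQRT_PI)
                        + np.sum(np.exp(-d2 / 4.0)) / (n * n * h * SQRT_PI))
    # Average of the leave-one-out estimates evaluated at the data points.
    loo_mean = (2.0 * np.sum(np.exp(-d2 / 2.0))
                / (n * (n - 1) * h * np.sqrt(2.0 * np.pi)))
    return integral_fhat_sq - 2.0 * loo_mean


def bcv(x, h):
    """Biased cross-validation score, Gaussian kernel assumed.

    Plugs an estimate of the L2 norm of f'' (diagonal terms removed)
    into the asymptotic mean integrated squared error.
    """
    n = np.asarray(x).size
    d2 = _scaled_pair_diffs(x, h) ** 2
    pair_sum = np.sum((d2 * d2 - 12.0 * d2 + 12.0) * np.exp(-d2 / 4.0))
    return (1.0 / (2.0 * n * h * SQRT_PI)
            + pair_sum / (64.0 * n * n * h * SQRT_PI))


def select_bandwidth(x, criterion, grid):
    """Return the grid value of h that minimizes the given criterion."""
    grid = np.asarray(grid, dtype=float)
    scores = np.array([criterion(x, h) for h in grid])
    return float(grid[np.argmin(scores)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)      # arbitrary seed, for illustration only
    sample = rng.standard_normal(200)   # synthetic data, not from the report
    grid = np.linspace(0.05, 1.5, 150)  # hypothetical bounded search grid
    print("UCV bandwidth:", select_bandwidth(sample, ucv, grid))
    print("BCV bandwidth:", select_bandwidth(sample, bcv, grid))
```

Both scores are sums over pairs of observations, which reflects the U-statistic structure mentioned in the abstract. Under the Gaussian-kernel assumption the biased score tends to zero as h grows without bound, so the search is restricted to a bounded grid and the result is best read as a local minimizer.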
Document Details
- Document Type: Technical Report
- Publication Date: Apr 01, 1986
- Accession Number: ADA166905
Entities
People
- David W. Scott
- George R. Terrell
Organizations
- Stanford University