Date of Award
Doctor of Philosophy
Dr. Joshua Naranjo
Dr. Joseph McKean
Dr. Hyun Bin Kang
Dr. Bradford Dykes
PRESS statistic, influential observations, diagnostic analysis, mean square error
The most widely used statistic, R², has a fundamental weakness in model building: it favors adding more predictors to the model, because R² never decreases when a predictor is added. In effect, the additional predictors start fitting the noise in the data. Other criteria for selecting a regression model, such as adjusted R², AIC, SBC, and Mallows's Cp, do not guarantee that the selected model will also make better predictions of future values. To guard against such overfitting, data scientists withhold a percentage of the data for validation purposes. The PRESS statistic does something similar by withholding each observation in turn when calculating its own predicted value. In this paper, we investigated the properties of the PRESS statistic and explored how it performs compared to other criteria in model selection. We also derived estimators of the parameters of interest in linear regression that are based on PRESS, while maintaining desirable statistical properties of estimators such as unbiasedness. A diagnostic statistic that measures the impact of deleting a single observation on the estimate of the MSE is also presented.
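As an illustration of the leave-one-out idea described above, the following is a minimal sketch of the PRESS statistic for ordinary least squares; it is not taken from the dissertation. It assumes a model with an intercept and a full-rank design matrix, and uses the standard identity PRESS = Σ (e_i / (1 − h_ii))², where e_i is the ordinary residual and h_ii the leverage of observation i. The function names `press_statistic` and `press_brute_force` and the synthetic data are hypothetical, chosen only for this example.

```python
import numpy as np


def press_statistic(X, y):
    """PRESS for OLS with an intercept, via the leverage shortcut."""
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column (assumed to have full column rank).
    Xd = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    # Leverages h_ii are the squared row norms of Q in the reduced QR of Xd.
    Q, _ = np.linalg.qr(Xd)
    h = np.sum(Q ** 2, axis=1)
    # Ordinary least-squares residuals from the full-data fit.
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    # Each leave-one-out prediction error equals e_i / (1 - h_ii).
    return float(np.sum((resid / (1.0 - h)) ** 2))


def press_brute_force(X, y):
    """PRESS computed literally: refit the model n times, omitting one row each time."""
    y = np.asarray(y, dtype=float)
    Xd = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)
        # Prediction error for the withheld observation.
        total += (y[i] - Xd[i] @ beta_i) ** 2
    return float(total)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -1.0]) + rng.normal(size=30)

print(press_statistic(X_demo, y_demo))
print(press_brute_force(X_demo, y_demo))
```

The two functions should agree up to floating-point error; the shortcut version needs only one fit, which is why PRESS is inexpensive to compute for linear regression.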
Alcantara, Ida Marie, "Statistical Properties and Applications of Press Statistic" (2020). Dissertations. 3589.