In Chapter 6, Linear Regression Analysis, and Chapter 7, Logistic Regression Model, we focused on the linear and logistic regression models. In the discussion of model selection for the linear regression model, we found that a covariate is either selected or not, depending on its associated p-value. However, a rejected covariate is given no further consideration once its p-value exceeds the threshold. A covariate may thus be discarded even if it has some influence on the regressand. The final model may also overfit the data, and this problem needs to be addressed.
We will first consider fitting a polynomial regression model, without the technical details, and see how higher-order polynomials give a very good fit to the observed data, but at a price. A more general framework of B-splines is considered next. This approach leads us to the smoothing spline models, which are actually ridge regression models...
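To make the price of a higher-order polynomial concrete, the following is a minimal sketch in Python with NumPy rather than the chapter's own example. The data are simulated, and the sine-shaped signal, noise level, sample sizes, seed, and the degrees 1, 3, and 9 are all assumptions chosen purely for illustration. The sketch fits polynomials by least squares and reports the mean squared error on the fitting data and on a fresh sample from the same process.

```python
import numpy as np

# Fixed seed so the illustration is reproducible (the value is arbitrary).
rng = np.random.default_rng(0)

def make_data(n):
    """Simulate n noisy observations of an assumed sine-shaped signal."""
    x = np.sort(rng.uniform(-1.0, 1.0, n))
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def mse(coefs, x, y):
    """Mean squared error of the fitted polynomial on the pairs (x, y)."""
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

x_train, y_train = make_data(30)    # data used for fitting
x_test, y_test = make_data(200)     # fresh data from the same process

results = {}
for degree in (1, 3, 9):
    # Ordinary least-squares polynomial fit of the given degree.
    coefs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coefs, x_train, y_train),
                       mse(coefs, x_test, y_test))
    print(f"degree {degree}: train MSE = {results[degree][0]:.4f}, "
          f"test MSE = {results[degree][1]:.4f}")
```

Because the lower-degree models are nested in the higher-degree ones, the error on the fitting data cannot increase with the degree. The error on the fresh sample carries no such guarantee, and comparing the two columns is one way to see when the extra flexibility stops paying off.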