The regularization techniques applied above will also work for classification problems, both binomial and multinomial. So, before concluding this chapter, let's apply some sample code to a logistic regression problem, specifically the breast cancer data from the prior chapter. As in regression with a quantitative response, regularization can be an important technique for working with high-dimensional data sets.
Recall that, in the breast cancer data we analyzed, the probability of a tumor being malignant can be denoted as follows in a logistic function:
$$P(\text{malignant}) = \frac{1}{1 + e^{-(B_0 + B_1 X_1 + \dots + B_n X_n)}}$$
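The linear component is easiest to see when the function is rewritten in terms of the log odds, as in the first line below. The second line is a sketch of how a penalty could then attach to the coefficients, assuming the elastic-net form of the penalty. The symbols λ (penalty strength), α (the mix between L1 and L2), N (the number of observations), and ℓ (the log-likelihood) are notation assumed here for illustration, not taken from the text above:

```latex
% Log odds: the logistic function is linear in the coefficients on this scale
\log\frac{P(\text{malignant})}{1 - P(\text{malignant})}
  = B_0 + B_1 X_1 + \dots + B_n X_n

% Assumed elastic-net objective: negative log-likelihood plus a penalty
% on B_1, ..., B_n (the intercept B_0 is left unpenalized)
\min_{B_0, \dots, B_n} \;
  -\frac{1}{N}\,\ell(B_0, \dots, B_n)
  + \lambda \left[ \alpha \sum_{j=1}^{n} |B_j|
  + \frac{1 - \alpha}{2} \sum_{j=1}^{n} B_j^2 \right]
```

Under this assumed form, α = 1 keeps only the L1 (lasso) term and α = 0 keeps only the L2 (ridge) term.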
Since the function contains a linear component, L1 and L2 regularization can be applied to its coefficients. To demonstrate this, let's load and prepare the breast cancer data as we did in the previous chapter:
> library(MASS)
> biopsy$ID = NULL
> names(biopsy) = c("thick", "u.size", "u.shape", "adhsn", "s.size", "nucl", "chrom...