So far, we have seen only the simple case in which both the response and the predictor variables are continuous. Now, let's generalize the model a bit and enter a discrete predictor into it. Take the usair data and add x5 (precipitation: average number of wet days per year) as a predictor with three categories (low, middle, and high levels of precipitation), using 30 and 45 as the cut-points. The research question is how these precipitation groups are associated with the SO2 concentration. The association is not necessarily linear, as the following plot shows:
> plot(y ~ x5, data = usair, cex.lab = 1.5)
> abline(lm(y ~ x5, data = usair), col = 'red', lwd = 2.5, lty = 1)
> abline(lm(y ~ x5, data = usair[usair$x5 <= 45, ]),
+   col = 'red', lwd = 2.5, lty = 3)
> abline(lm(y ~ x5, data = usair[usair$x5 >= 30, ]),
+   col = 'red', lwd = 2.5, lty = 2)
> abline(v = c(30, 45), col = 'blue', lwd = 2.5)
> legend('topleft', lty = c(1, 3, 2, 1), lwd = rep(2...
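
The three-category version of x5 described in the text could be built with base R's cut(). The following is only a minimal sketch under stated assumptions: the intervals are taken as closed on the left (so a value of exactly 30 counts as middle and exactly 45 as high; the text does not say which side the cut-points belong to), the input values are made up for illustration rather than taken from usair, and the name x5_3 is hypothetical.

```r
# Illustrative values only: these are NOT taken from the usair data
x5 <- c(12, 29, 30, 44, 45, 60)

# Assumption: left-closed intervals, i.e. [-Inf, 30), [30, 45), [45, Inf)
x5_3 <- cut(x5,
            breaks = c(-Inf, 30, 45, Inf),
            labels = c("low", "middle", "high"),
            right  = FALSE)

# Count how many observations fall into each precipitation group
table(x5_3)
```

With the real data, the same call applied to usair$x5 would give a factor that can be entered on the right-hand side of an lm() formula in place of the continuous x5.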