To consolidate what was covered in the previous section, we explore a more complex dataset with linear regression. In this recipe, we demonstrate how to apply linear regression to analyze the Survey of Labor and Income Dynamics (SLID) dataset.
Check whether the car library is installed and loaded, as it is required to access the SLID dataset.
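The check described above can be sketched as follows. This is a minimal sketch, not the recipe's own code: the CRAN mirror URL is an assumption, and it relies on the fact that, as far as we are aware, recent releases of car ship the SLID data frame in the companion carData package, which is attached together with car.

```r
# Install car only if it is not already available.
# Assumption: the cloud.r-project.org CRAN mirror is reachable.
if (!requireNamespace("car", quietly = TRUE)) {
  install.packages("car", repos = "https://cloud.r-project.org")
}

# Attach the package; recent versions also attach carData, where SLID lives.
library(car)

# Stop early if the SLID data frame cannot be found on the search path.
stopifnot(exists("SLID"))
```

If the final line raises no error, the dataset is ready for the steps that follow.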
Follow these steps to perform linear regression on SLID data:
You can use the str function to get an overview of the data:
> str(SLID)
'data.frame': 7425 obs. of 5 variables:
 $ wages    : num 10.6 11 NA 17.8 NA ...
 $ education: num 15 13.2 16 14 8 16 12 14.5 15 10 ...
 $ age      : int 40 19 49 46 71 50 70 42 31 56 ...
 $ sex      : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 1 1 1 2 1 ...
 $ language : Factor w/ 3 levels "English","French",..: 1 1 3 3 1 1 1 1 1 1 ...
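The str output shows NA entries in wages. As a small sketch that is not part of the original recipe, the missing values in each column can be counted before fitting a model, since lm drops incomplete rows under R's default na.action setting. It assumes SLID is available through the car library as described earlier.

```r
library(car)  # makes SLID available (via carData in recent versions)

# Count the missing values in each column of SLID.
na_per_column <- colSums(is.na(SLID))
print(na_per_column)

# Count the rows that have no missing value in any column.
n_complete <- sum(complete.cases(SLID))
print(n_complete)
```

Comparing n_complete with the total number of rows gives a rough idea of how many observations a model using every variable would actually be fitted on.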
First, we visualize the variable wages against language, age, education, and sex:
> par(mfrow=c(2,2)) >...
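One way to draw the four panels is sketched below. It is a hedged illustration rather than the recipe's exact commands: it assumes the formula interface of plot and that SLID is available through the car library. The name old_par is introduced here only for illustration.

```r
library(car)  # makes SLID available (via carData in recent versions)

# Arrange the next four plots in a 2 x 2 grid and remember the old settings.
old_par <- par(mfrow = c(2, 2))

# Factor predictors (language, sex) produce box plots;
# numeric predictors (age, education) produce scatter plots.
plot(wages ~ language,  data = SLID)
plot(wages ~ age,       data = SLID)
plot(wages ~ education, data = SLID)
plot(wages ~ sex,       data = SLID)

# Restore the previous graphics settings.
par(old_par)
```

Using the data argument keeps the axis labels short (wages, age, and so on) instead of the longer SLID$wages form.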