Hands-On Ensemble Learning with R
In the previous section, the housing data went through extensive analytical pre-processing, and we are now ready to analyze it further. We begin with visualization. Because the dataset has many variables, viewing the plots on the R graphics device is cumbersome. As in earlier chapters, where we visualized random forests and other large, complex structures, we will open a PDF device and store the graphs in it. In the housing dataset, the main variable of interest is the housing price, so we first designate SalePrice as the output variable. We need to visualize the data in a way that reveals the relationship between the numerous variables and SalePrice. The independent variables can be either numeric or categorical. If a variable is numeric, a scatterplot will indicate the kind of relationship between the variable and the SalePrice regressand. If the independent variable is categorical (a factor), we will visualize the boxplot at each level of the...
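The plotting strategy described above can be sketched in a few lines of base R. This is a minimal illustration, not the book's own code. The data frame `housing`, its columns other than `SalePrice`, the output file name, and all values are hypothetical stand-ins for the pre-processed housing data. The sketch assumes the boxplots show `SalePrice` at each level of the factor.

```r
# Toy stand-in for the pre-processed housing data.
# All values and all column names except SalePrice are made up for illustration.
set.seed(1)
housing <- data.frame(
  SalePrice    = round(runif(50, 100000, 400000)),
  GrLivArea    = round(runif(50, 800, 3000)),
  LotArea      = round(runif(50, 5000, 20000)),
  Neighborhood = factor(sample(c("A", "B", "C"), 50, replace = TRUE))
)

# Hypothetical output location; a temporary directory keeps the sketch side-effect free.
out_file <- file.path(tempdir(), "housing_visualization.pdf")

# Every column except the output variable is treated as an independent variable.
predictors <- setdiff(names(housing), "SalePrice")

# Open the PDF device; each plot below becomes one page of the file.
pdf(out_file)
for (v in predictors) {
  if (is.numeric(housing[[v]])) {
    # Numeric predictor: scatterplot against SalePrice.
    plot(housing[[v]], housing$SalePrice,
         xlab = v, ylab = "SalePrice",
         main = paste("SalePrice vs", v))
  } else if (is.factor(housing[[v]])) {
    # Factor predictor: one boxplot of SalePrice per level.
    boxplot(split(housing$SalePrice, housing[[v]]),
            xlab = v, ylab = "SalePrice",
            main = paste("SalePrice by", v))
  }
}
# Close the device so the file is flushed to disk.
dev.off()
```

Writing to a PDF device gives each variable its own page, so the number of variables no longer limits what can be inspected. For the real dataset, replace the toy `housing` data frame with the pre-processed data and point `out_file` at a permanent path.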