Mastering Machine Learning with R - Second Edition
We will once again use the wine dataset from Chapter 8, Cluster Analysis. Recall that it consists of 13 numeric features and a response with three possible classes of wine. Our task is to predict those classes. I will add one twist: artificially increasing the number of observations. The reasons are twofold. First, I want to fully demonstrate the resampling capabilities of the mlr package; second, I wish to cover a synthetic sampling technique. We used upsampling in the prior section, so a synthetic technique is in order.
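To make the idea of synthetic sampling concrete before turning to the data, here is a minimal base R sketch. It assumes a SMOTE-style approach, in which each new observation is placed at a random point on the line segment between a real observation and one of its nearest neighbors. The text has not yet named the technique it will use, so treat this as an illustration of the general idea rather than the method applied later. The function `make_synthetic()` and the `toy` matrix are hypothetical names invented for this example.

```r
# Sketch of synthetic sampling by interpolation (SMOTE-style idea).
# make_synthetic() and toy are hypothetical names used only for illustration.
make_synthetic <- function(x, n_new, k = 3) {
  x <- as.matrix(x)
  stopifnot(nrow(x) > k)
  d <- as.matrix(dist(x))   # pairwise Euclidean distances
  diag(d) <- Inf            # a point is never its own neighbor
  out <- matrix(NA_real_, nrow = n_new, ncol = ncol(x),
                dimnames = list(NULL, colnames(x)))
  for (i in seq_len(n_new)) {
    base <- sample.int(nrow(x), 1)         # pick a real observation at random
    nbrs <- order(d[base, ])[seq_len(k)]   # indices of its k nearest neighbors
    nb   <- nbrs[sample.int(k, 1)]         # choose one of those neighbors
    gap  <- runif(1)                       # random position along the segment
    out[i, ] <- x[base, ] + gap * (x[nb, ] - x[base, ])
  }
  out
}

set.seed(123)
toy <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("f1", "f2")))
synthetic <- make_synthetic(toy, n_new = 5)
dim(synthetic)   # 5 rows, 2 columns
```

Because every synthetic row is an interpolation between two real rows, the new points stay inside the range of the observed data. Upsampling, by contrast, only repeats existing rows.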
Our first task is to load the package libraries and bring in the data:
> library(mlr)
> library(ggplot2)
> library(HDclassif)
> library(DMwR)
> library(reshape2)
> library(corrplot)
> data(wine)
> table(wine$class)

 1  2  3
59 71 48
We have 178 observations, and the response labels are numeric (1, 2, and 3). Let's more than double...