There are a number of approaches to learning in multiclass problems. Techniques such as random forest and discriminant analysis handle multiple classes natively, while other techniques and/or packages do not, for example, generalized linear models fit with glm() in base R. As of this writing, the caretEnsemble package, unfortunately, does not work with multiclass problems. However, the Machine Learning in R (mlr) package supports both multiple classes and ensemble methods. If you are familiar with scikit-learn for Python, you could say that mlr endeavors to provide the same functionality for R. The mlr and caret-based packages are quickly turning into my favorites for almost any business problem. I intend to demonstrate how powerful mlr is on a multiclass problem, then conclude by showing how to build an ensemble on the Pima data.
For the multiclass problem, we will look at how to tune a random forest and then examine how to take a GLM and turn it into a multiclass learner...
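To make the idea of turning a GLM into a multiclass learner concrete before we bring in mlr, here is a minimal base R sketch. It assumes a one-vs-rest strategy (one binomial glm() per class, with the highest predicted probability winning) and uses the built-in iris data purely as a stand-in; neither the strategy nor the dataset is necessarily what we will use later in the chapter.

```r
# One-vs-rest sketch in base R (illustrative only; this is not the mlr workflow).
# Assumptions: iris as a stand-in dataset, one binomial GLM per class.
data(iris)
classes <- levels(iris$Species)

# Fit one binomial GLM per class: "this class" (1) versus "all others" (0).
fits <- lapply(classes, function(cl) {
  d <- iris[, 1:4]
  d$y <- as.integer(iris$Species == cl)
  # Well-separated classes can trigger convergence warnings; silenced here.
  suppressWarnings(glm(y ~ ., data = d, family = binomial))
})

# Score every row with each model; the result is a 150 x 3 probability matrix.
probs <- sapply(fits, function(f) {
  predict(f, newdata = iris[, 1:4], type = "response")
})
colnames(probs) <- classes

# Assign each row to the class whose model gives the highest probability.
pred <- factor(classes[max.col(probs, ties.method = "first")], levels = classes)

# In-sample confusion matrix (optimistic; a real evaluation needs held-out data).
print(table(predicted = pred, actual = iris$Species))
```

The per-class probabilities come from separate models, so they do not sum to one across a row; only their ranking is used. A wrapper in a package such as mlr takes care of this bookkeeping, along with resampling and tuning.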