The focus of this chapter is on combining the results from different models in order to produce a single model that will outperform individual models on their own. Bagging is essentially an intuitive procedure for combining multiple models trained on the same dataset, by using majority voting for classification models and average value for regression models. We'll present this procedure for the classification case, and later show how this is easily extended to handle regression models.
Note
Bagging procedure for binary classification
Inputs:
data: The input data frame containing the input features and a column with the binary output label.
M: An integer, representing the number of models that we want to train.
Output:
models: A set of Μ trained binary classifier models.
Method:
1. Create a random sample of size n, where n is the number of observations in the original dataset, with replacement. This means that some of the observations from the original training set will be repeated...