Bagging can be applied to both classification and regression. Boosting is another technique from the ensemble family, but the underlying principle of the two is quite different. In bagging, each model is trained independently and the results are aggregated at the end; it is a parallel operation. Boosting works sequentially instead: each model is trained in turn, and information about the errors it leaves behind is passed on to the next model, which tries to correct them:
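The contrast above can be sketched with scikit-learn, which ships both styles of ensemble. This is an illustrative comparison on synthetic data (the dataset and parameters here are assumptions, not from the text): the bagging model averages independently trained trees, while the boosting model builds trees one after another.

```python
# Sketch: bagging (parallel, aggregated) vs. boosting (sequential).
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data, purely for illustration.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# Bagging: each tree is trained independently on a bootstrap sample,
# and the final prediction is the average over all trees.
bagging = BaggingRegressor(DecisionTreeRegressor(), n_estimators=50, random_state=0)
bagging.fit(X, y)

# Boosting: trees are trained one after another, each fitting the
# errors left by the ensemble built so far.
boosting = GradientBoostingRegressor(n_estimators=50, random_state=0)
boosting.fit(X, y)

print(bagging.score(X, y), boosting.score(X, y))
```

Either approach can be tuned further; the point here is only the parallel-versus-sequential structure of the two ensembles.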
To explain gradient boosting, we will follow the approach of Ben Gorman, a data scientist who has explained it in a mathematical yet simple way. Suppose we have nine training examples in which we must predict a person's age from three features: whether they like gardening, play video games, or surf the internet. The data is as follows:
To build this model, the objective is to minimize the mean squared error.
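One round of this procedure can be sketched by hand: start from the constant prediction that minimizes the mean squared error (the mean age), compute the residuals, and fit a weak learner to them. The numbers below are made up for illustration and are not the table from the text.

```python
# Sketch of a single gradient-boosting round under squared-error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Illustrative features: likes_gardening, plays_video_games, surfs_internet.
X = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 0],
              [0, 1, 0], [1, 1, 1], [0, 0, 1]])
y = np.array([25.0, 13.0, 35.0, 14.0, 68.0, 20.0])  # ages (made up)

# Step 1: the constant that minimizes MSE is the mean of the targets.
f0 = y.mean()

# Step 2: for squared error, the negative gradient is the residual y - f0.
residuals = y - f0

# Step 3: fit a weak learner (a shallow tree) to the residuals and add
# its shrunken predictions to the current model.
tree = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
learning_rate = 0.1
f1 = f0 + learning_rate * tree.predict(X)

# Each round reduces the squared error a little.
print(np.mean((y - f0) ** 2), np.mean((y - f1) ** 2))
```

Repeating steps 2 and 3, with each new tree fit to the residuals of the model so far, is exactly the sequential flow described above.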