Building non-correlated ensembles
– Song, Kaggle Winner
The winning models of Kaggle competitions are rarely individual models; they are almost always ensembles. By ensembles, I do not mean boosting or bagging models such as random forests or XGBoost, which aggregate many copies of the same base learner, but ensembles in the broader sense: combinations of distinct models, which may themselves include XGBoost, random forests, and others.
In this section, we will combine machine learning models into non-correlated ensembles to gain accuracy and reduce overfitting.
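As a preview of the approach, here is a minimal sketch of such a non-correlated ensemble using scikit-learn's `VotingClassifier`. The particular member models (logistic regression and a random forest) are illustrative choices, not a prescription; an XGBoost classifier could be added to the list in the same way.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Combine distinct model families; because they make
# different kinds of errors, their votes are less correlated
# than copies of a single base learner would be.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=10000)),
        ("rf", RandomForestClassifier(random_state=2)),
    ],
    voting="hard",  # majority vote over the predicted classes
)

scores = cross_val_score(ensemble, X, y, cv=5)
print(scores.mean())
```

The key idea is that the ensemble's cross-validation score depends not only on how strong each member is, but on how differently the members err, which is what the rest of this section explores.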
Range of models
The Wisconsin Breast Cancer dataset is used to predict whether a patient has breast cancer. It has 569 rows and 30 columns, and can be viewed at https://scikit...
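The dataset ships with scikit-learn, so it can be loaded directly without downloading anything:

```python
from sklearn.datasets import load_breast_cancer

# Load the features X and the binary target y (malignant vs. benign).
X, y = load_breast_cancer(return_X_y=True)
print(X.shape)  # (569, 30)
```

The 569 rows and 30 columns reported above match the shape of the feature matrix.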