Preparing data and base models
Before introducing and applying XGBoost hyperparameters, let's prepare by doing the following:
Getting the heart disease dataset
Scoring a baseline XGBoost model
Combining GridSearchCV and RandomizedSearchCV to form one powerful function
Good preparation is essential for gaining accuracy, consistency, and speed when fine-tuning hyperparameters.
The heart disease dataset
The dataset used throughout this chapter is the heart disease dataset originally presented in Chapter 2, Decision Trees in Depth. We have chosen the same dataset so that more time goes to hyperparameter fine-tuning and less to data analysis. Let's begin the process:
Go to https://github.com/PacktPublishing/Hands-On-Gradient-Boosting-with-XGBoost-and-Scikit-learn/tree/master/Chapter06 to load heart_disease.csv into a DataFrame and display the...
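The loading step can be sketched as follows. This is a minimal illustration, not the book's own listing. It assumes heart_disease.csv has already been downloaded from the Chapter06 folder into the current working directory. The helper name load_csv_preview and its n_rows parameter are hypothetical, and previewing the first five rows with head() is an assumption about what the step displays.

```python
from pathlib import Path

import pandas as pd


def load_csv_preview(csv_path, n_rows=5):
    """Read a CSV file into a DataFrame and return it with its first n_rows rows.

    Hypothetical helper for illustration; it is not part of pandas or XGBoost.
    """
    df = pd.read_csv(csv_path)
    return df, df.head(n_rows)


if __name__ == "__main__":
    # Assumption: the file was saved next to this script or notebook.
    csv_path = Path("heart_disease.csv")
    if csv_path.exists():
        df, preview = load_csv_preview(csv_path)
        print(preview)   # first rows of the dataset
        print(df.shape)  # (number of rows, number of columns)
    else:
        print(f"{csv_path} not found; download it from the Chapter06 folder first.")
```

Returning the full DataFrame together with the preview keeps the data available for the later modeling steps while still giving a quick visual check that the file was parsed as expected.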