Chapter 1: Machine Learning Landscape
Welcome to Hands-On Gradient Boosting with XGBoost and Scikit-Learn, a book that will teach you the foundations, tips, and tricks of XGBoost, widely regarded as one of the best machine learning algorithms for making predictions from tabular data.
The focus of this book is XGBoost, also known as Extreme Gradient Boosting. The structure, function, and raw power of XGBoost will be fleshed out in increasing detail in each chapter. The chapters unfold to tell an incredible story: the story of XGBoost. By the end of this book, you will be an expert in leveraging XGBoost to make predictions from real data.
In the first chapter, XGBoost is presented in a sneak preview. It makes a guest appearance in the larger context of machine learning regression and classification to set the stage for what's to come.
This chapter focuses on preparing data for machine learning, a process also known as data wrangling. You will learn to use efficient Python code to load data, describe data, handle null values, transform data into numerical columns, split data into training and test sets, build machine learning models, and implement cross-validation. You will also compare linear regression and logistic regression models with XGBoost.
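As a rough preview, the wrangling steps listed above can be sketched in a few lines of pandas and scikit-learn. The dataset here is synthetic and the column names (temp, humidity, season, rentals) are hypothetical stand-ins chosen for illustration; they are not the data used later in the chapter. Treat it as a minimal sketch of the workflow under those assumptions, not a finished pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Load data: a synthetic, hypothetical dataset stands in for a real file
# that would normally be read with pd.read_csv.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "temp": rng.normal(20, 5, n),
    "humidity": rng.uniform(0.2, 0.9, n),
    "season": rng.choice(["spring", "summer", "fall", "winter"], n),
})
df["rentals"] = 50 + 10 * df["temp"] - 40 * df["humidity"] + rng.normal(0, 5, n)
df.loc[::25, "humidity"] = np.nan  # inject a few null values on purpose

# Describe data: first rows, column types, and summary statistics.
print(df.head())
df.info()
print(df.describe())

# Handle null values: count them per column, then fill with the median.
# (Imputing before the split is a simplification for this sketch.)
print(df.isna().sum())
df["humidity"] = df["humidity"].fillna(df["humidity"].median())

# Transform data into numerical columns: one-hot encode the text column.
df = pd.get_dummies(df, columns=["season"], dtype=int)

# Split data into predictor columns and target, then training and test sets.
X = df.drop(columns="rentals")
y = df["rentals"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Build a machine learning model and score it on the held-out test set.
model = LinearRegression()
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)
print("Test R^2:", round(r2, 3))

# Implement cross-validation: 5 folds, scored by root mean squared error.
# scikit-learn returns negated errors, so flip the sign.
cv_scores = cross_val_score(
    LinearRegression(), X, y, cv=5, scoring="neg_root_mean_squared_error"
)
rmse = -cv_scores
print("RMSE per fold:", np.round(rmse, 2))
print("Mean RMSE:", round(rmse.mean(), 2))
```

Because XGBoost ships a scikit-learn compatible wrapper, replacing LinearRegression() with XGBRegressor() from the xgboost package should fit into the same split and cross-validation code, assuming xgboost is installed.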
The concepts and libraries presented in this chapter are used throughout the book.
This chapter consists of the following topics:
Previewing XGBoost
Wrangling data
Predicting regression
Predicting classification