By now, we should be quite familiar with machine learning and data science in Java: we have covered both supervised and unsupervised learning and have also applied machine learning to textual data.
In this chapter, we continue with supervised machine learning and discuss a library that gives state-of-the-art performance in many supervised tasks: XGBoost, short for Extreme Gradient Boosting. We will revisit familiar problems, such as predicting whether a URL ranks on the first page or not, performance prediction, and ranking for the search engine, but this time we will use XGBoost to solve them.
The outline of this chapter is as follows:
- Gradient Boosting Machines and XGBoost
- Installing XGBoost
- XGBoost for classification
- XGBoost for regression
- XGBoost for learning to rank
By the end of this chapter, you will have learned how to build XGBoost from source and use it to solve data science problems.