Hands-On Ensemble Learning with Python

By: George Kyriakides, Konstantinos G. Margaritis

Overview of this book

Ensembling is a technique for combining two or more similar or dissimilar machine learning algorithms to create a model that delivers superior predictive power. This book demonstrates how you can combine a variety of weak algorithms into a strong predictive model. With its hands-on approach, you'll get up to speed with both the basic theory and the application of different ensemble learning techniques. Using examples and real-world datasets, you'll be able to produce better machine learning models for supervised learning problems such as classification and regression. You'll then go on to apply ensemble learning to unsupervised tasks such as clustering. As you progress, the chapters cover different machine learning algorithms that are widely used in practice for prediction and classification. You'll also get to grips with Python libraries such as scikit-learn and Keras for implementing different ensemble models. By the end of this book, you will be well-versed in ensemble learning, with the skills you need to decide which ensemble method suits which problem and to implement it successfully in real-world scenarios.
Table of Contents (20 chapters)

Section 1: Introduction and Required Software Tools
Section 2: Non-Generative Methods
Section 3: Generative Methods
Section 4: Clustering
Section 5: Real World Applications

What this book covers

Chapter 1, A Machine Learning Refresher, presents an overview of machine learning, including basic concepts such as training/test sets, performance measures, supervised and unsupervised learning, machine learning algorithms, and benchmark datasets.

Chapter 2, Getting Started with Ensemble Learning, introduces the concept of ensemble learning, highlighting the problems that it solves as well as the problems that it poses.

Chapter 3, Voting, introduces the simplest ensemble learning technique, voting, and explains the difference between hard and soft voting. You will learn how to implement a custom voting classifier, as well as how to use scikit-learn's implementation of hard/soft voting.
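
To make the hard/soft distinction concrete, here is a minimal sketch in plain Python. It is not the book's implementation: the function names `hard_vote` and `soft_vote` and the probability values are hypothetical, chosen only to show a case where the two schemes disagree.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Return the label chosen by the largest number of base classifiers."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(predicted_probas):
    """Average the per-class probabilities and return the class with the largest mean."""
    n_models = len(predicted_probas)
    n_classes = len(predicted_probas[0])
    means = [
        sum(probas[c] for probas in predicted_probas) / n_models
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=means.__getitem__)


# Hypothetical class probabilities from three base classifiers for one instance.
probas = [
    [0.55, 0.45],  # weakly prefers class 0
    [0.60, 0.40],  # weakly prefers class 0
    [0.10, 0.90],  # strongly prefers class 1
]

# Each classifier's hard prediction is its most probable class.
labels = [max(range(len(p)), key=p.__getitem__) for p in probas]

hard = hard_vote(labels)  # counts votes only
soft = soft_vote(probas)  # takes each classifier's confidence into account
```

In this made-up case two classifiers weakly prefer class 0, so hard voting returns 0, while the third classifier's high confidence pulls the averaged probabilities toward class 1, so soft voting returns 1.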

Chapter 4, Stacking, covers meta-learning (stacking), a more advanced ensemble learning method. After reading this chapter, you will be able to implement a stacking classifier in Python for use with scikit-learn classifiers.

Chapter 5, Bagging, introduces bootstrap resampling and the first generative ensemble learning technique, bagging. Furthermore, this chapter guides you through the process of implementing the technique in Python, as well as how to use the scikit-learn implementation.
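
As a rough illustration of bootstrap resampling and bagging, the sketch below trains several copies of a toy base learner on bootstrap samples and combines them by majority vote. Everything here is an assumption made for illustration: the nearest-centroid base learner, the helper names, and the tiny one-dimensional dataset are not taken from the book.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices uniformly at random, with replacement."""
    return [rng.randrange(n) for _ in range(n)]


def fit_centroids(xs, ys):
    """Toy base learner: store the mean feature value of each class."""
    sums, counts = {}, Counter(ys)
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def bagging_fit(xs, ys, n_estimators, seed=0):
    """Train one base learner per bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = bootstrap_indices(len(xs), rng)
        models.append(fit_centroids([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Aggregate the base learners' predictions by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical one-dimensional dataset with two well-separated classes.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

models = bagging_fit(xs, ys, n_estimators=11)
pred_low = bagging_predict(models, 0.9)
pred_high = bagging_predict(models, 3.3)
```

Because each bootstrap sample is drawn with replacement, some instances appear more than once and others are left out, so each base learner sees a slightly different view of the data.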

Chapter 6, Boosting, touches on more advanced subjects in ensemble learning. This chapter explains how popular boosting algorithms work and are implemented. Furthermore, it presents XGBoost, a highly successful distributed boosting library.

Chapter 7, Random Forests, goes through the process of creating random decision trees by subsampling the instances and features of a dataset. Moreover, this chapter explains how to utilize an ensemble of random trees to create a random forest. Finally, this chapter presents scikit-learn's implementations and how to use them.
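
For a sense of what using scikit-learn's random forest might look like, here is a small sketch on synthetic data. The dataset and all parameter values are assumptions chosen for illustration, not the book's own example. Note that scikit-learn draws the random feature subset at each split rather than once per tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data (an assumption; the book uses its own datasets).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

forest = RandomForestClassifier(
    n_estimators=50,      # number of trees in the ensemble
    bootstrap=True,       # each tree is fit on a bootstrap sample of the instances
    max_features="sqrt",  # each split considers a random subset of the features
    random_state=0,
)
forest.fit(X_train, y_train)

# Mean accuracy of the forest on the held-out split.
accuracy = forest.score(X_test, y_test)
```

The `bootstrap` and `max_features` parameters correspond to the two sources of randomness mentioned above: subsampling the instances and subsampling the features.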

Chapter 8, Clustering, introduces the possibility of using ensembles for unsupervised learning tasks, such as clustering. Furthermore, the OpenEnsembles Python library is introduced, along with guidance on using it.

Chapter 9, Classifying Fraudulent Transactions, presents an application for the classification of a real-world dataset, using ensemble learning techniques presented in earlier chapters. The dataset concerns fraudulent credit card transactions.

Chapter 10, Predicting Bitcoin Prices, presents an application for the regression of a real-world dataset, using ensemble learning techniques presented in earlier chapters. The dataset concerns the price of the popular cryptocurrency Bitcoin.

Chapter 11, Evaluating Sentiment on Twitter, presents an application for evaluating the sentiment of various tweets using a real-world dataset.

Chapter 12, Recommending Movies with Keras, presents the process of creating a recommender system using ensembles of neural networks.

Chapter 13, Clustering World Happiness, presents the process of using an ensemble learning approach to cluster data from the World Happiness Report 2018.