Book Image

Hands-On Ensemble Learning with Python

By : George Kyriakides, Konstantinos G. Margaritis
Book Image

Hands-On Ensemble Learning with Python

By: George Kyriakides, Konstantinos G. Margaritis

Overview of this book

Ensembling is a technique of combining two or more similar or dissimilar machine learning algorithms to create a model that delivers superior predictive power. This book will demonstrate how you can use a variety of weak algorithms to make a strong predictive model. With its hands-on approach, you'll not only get up to speed with the basic theory but also the application of different ensemble learning techniques. Using examples and real-world datasets, you'll be able to produce better machine learning models to solve supervised learning problems such as classification and regression. In addition to this, you'll go on to leverage ensemble learning techniques such as clustering to produce unsupervised machine learning models. As you progress, the chapters will cover different machine learning algorithms that are widely used in the practical world to make predictions and classifications. You'll even get to grips with the use of Python libraries such as scikit-learn and Keras for implementing different ensemble models. By the end of this book, you will be well-versed in ensemble learning, and have the skills you need to understand which ensemble method is required for which problem, and successfully implement them in real-world scenarios.
Table of Contents (20 chapters)
Free Chapter
1
Section 1: Introduction and Required Software Tools
4
Section 2: Non-Generative Methods
7
Section 3: Generative Methods
11
Section 4: Clustering
13
Section 5: Real World Applications

Using OpenEnsembles

OpenEnsembles is a Python library that is dedicated to ensemble methods for clustering. In this section, we will present its usage and utilize it in order to cluster some of our example datasets. In order to install the library, the pip install openensembles command must be executed in the Terminal. Although it leverages scikit-learn, its interface is different. One major difference is that data must be passed as a data class, implemented by OpenEnsembles. The constructor has two input parameters: a pandas DataFrame which contains the data, and a list which contains the feature names:

# --- SECTION 1 ---
# Libraries and data loading
import openensembles as oe
import pandas as pd
import sklearn.metrics

from sklearn.datasets import load_breast_cancer

bc = load_breast_cancer()

# --- SECTION 2 ---
# Create the data object
cluster_data = oe.data(pd.DataFrame(bc.data), bc...