Live decisioning customer values

Let's assume we have the following scenario: we have a list of customers to call in order to sell them our product. Each phone call costs money in call center personnel salaries, so we want to reduce these costs as much as possible. We have certain information about each customer that could help us determine whether they are likely to buy. After every call, we can update our model. The main goal is to call only the most promising customers and to improve our insights into which customers are more likely to pay for our product.

In this recipe, we will approach this with active learning, a strategy where we actively decide what to explore (and learn) next. Our model will help decide whom to call. Because we will update our model after each query (phone call), we will use online learning models.

Getting ready

We'll prepare for our recipe by downloading our dataset and installing a few libraries.

Again, we will get the data from OpenML:

!pip install -q openml

import openml

dataset = openml.datasets.get_dataset(1461)
X, y, categorical_indicator, _ = dataset.get_data(
    dataset_format='DataFrame',
    target=dataset.default_target_attribute
)
categorical_features = X.columns[categorical_indicator]
numeric_features = X.columns[
    [not i for i in categorical_indicator]
]

This dataset is called bank-marketing, and you can see a description on OpenML at https://www.openml.org/d/1461.

Each row describes a single person, with numerical and categorical features that tell us about demographics and customer history.
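
If you want a quick feel for the data before modeling, a few standard pandas calls are enough. This is an optional sketch that only uses the X, y, and categorical_features objects we just created:

print(X.shape, y.shape)                 # number of rows and features
print(X.dtypes)                         # categorical versus numeric columns
print(y.value_counts())                 # the class balance is quite skewed
print(categorical_features.tolist())    # names of the categorical columns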

To model the likelihood of customers signing up for our product, we will use the scikit-multiflow package that specializes in online models. We will also use the category_encoders package again:

!pip install scikit-multiflow category_encoders

With these two libraries in place, we can start the recipe.

How to do it...

We need to implement an exploration strategy and a model that is continuously updated. As our model, we use the Hoeffding Tree, an incremental decision tree designed for streaming data. At every step, we estimate the uncertainties of the predictions and, based on these, return a candidate to call next.

As always, we will need to define a few preprocessing steps:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer
import category_encoders as ce

ordinal_encoder = ce.OrdinalEncoder(
    cols=None,  # encode all categorical features it encounters
    handle_missing='return_nan',
    handle_unknown='ignore'
).fit(X)

preprocessor = ColumnTransformer(
    transformers=[
        ('cat', ordinal_encoder, categorical_features),
        ('num', FunctionTransformer(validate=False), numeric_features)
    ])
preprocessor = preprocessor.fit(X)

Then we come to our active learning approach itself. This is inspired by modAL.models.ActiveLearner:

import random

import numpy as np
import pandas as pd
from skmultiflow.trees.hoeffding_tree import HoeffdingTreeClassifier
from sklearn.metrics import roc_auc_score


class ActivePipeline:
    def __init__(self, model, preprocessor, class_weights, sampling=False):
        self.model = model
        self.preprocessor = preprocessor
        self.class_weights = class_weights
        # if True, candidates are drawn at random in proportion to their
        # weights; otherwise the top-k candidates are chosen greedily
        self.sampling = sampling

    @staticmethod
    def values(X):
        if isinstance(X, (np.ndarray, np.int64)):
            return X
        return X.values

    def preprocess(self, X):
        # the bank-marketing dataset's anonymized column names
        X_ = pd.DataFrame(
            data=self.values(X),
            columns=[
                'V1', 'V2', 'V3', 'V4',
                'V5', 'V6', 'V7', 'V8',
                'V9', 'V10', 'V11', 'V12',
                'V13', 'V14', 'V15', 'V16'
            ])
        return self.preprocessor.transform(X_)

    def fit(self, X, ys):
        self.model.fit(self.preprocess(X), self.values(ys))

    def update(self, X, ys):
        # weight the new samples by their class to counter the imbalance
        if isinstance(ys, (int, float)):
            weight = [self.class_weights[ys]]
        else:
            weight = [self.class_weights[y] for y in ys]

        self.model.partial_fit(
            self.preprocess(X),
            self.values(ys),
            sample_weight=weight
        )

    def predict(self, X):
        return self.model.predict(
            self.preprocess(X)
        )

    def predict_proba(self, X):
        return self.model.predict_proba(
            self.preprocess(X)
        )

    @staticmethod
    def entropy(preds):
        # Shannon entropy, normalized so a uniform distribution gives 1
        return -np.sum(
            preds * np.log(preds + 1e-15)
        ) / np.log(preds.size)

    def max_margin_uncertainty(self, X, method: str = 'entropy',
                               exploitation: float = 0.9,
                               favor_class: int = 1, k: int = 1):
        '''similar to modAL.uncertainty.margin_uncertainty
        '''
        probs = self.predict_proba(X)
        if method == 'margin':
            # a small margin between the two classes means high uncertainty
            uncertainties = 1.0 - np.abs(probs[:, 2] - probs[:, 1])
        elif method == 'entropy':
            uncertainties = np.apply_along_axis(
                self.entropy, 1, probs[:, (1, 2)]
            )
        else:
            raise ValueError('method not implemented!')

        if favor_class is None:
            weights = uncertainties
        else:
            # blend uncertainty (exploration) with the predicted
            # probability of the preferred class (exploitation)
            weights = ((1.0 - exploitation) * uncertainties
                       + exploitation * probs[:, favor_class])

        if self.sampling:
            ind = random.choices(
                range(len(uncertainties)), weights, k=k
            )
        else:
            ind = np.argsort(weights, axis=0)[::-1][:k]
        return ind, np.mean(uncertainties[ind])

    def score(self, X, y, scale=True):
        # use the predicted probability of class 2 (a sign-up) as the score
        probs = self.predict_proba(X)[:, 2]
        if scale:
            probs = np.clip(probs - np.mean(probs) + 0.5, 0, 1)
        return roc_auc_score(y, probs)

Again, we create a scikit-learn-compatible class. It basically holds a machine learning model and a data preprocessor. We implement fit() and predict(), but also score() to measure model performance. We also implement an update() method that calls partial_fit() of the machine learning model. Calling partial_fit() instead of fit() considerably speeds up the computations, because we don't have to start from scratch every time we get new data.

Here's how to create the active learning pipeline (class_weights is computed in the Class weighting section later in this recipe):

active_pipeline = ActivePipeline(
    HoeffdingTreeClassifier(),
    preprocessor,
    class_weights.to_dict()
)
active_pipeline.model.classes = [0, 1, 2]
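
As a quick sanity check, we can train the pipeline on a small labeled seed batch and score it against held-out customers. This is only an illustrative sketch: the split into X_seed, X_pool, and X_test, and the cast of the labels to integers (so that they match the class_weights keys and the classes we just set), are assumptions for this example rather than part of the recipe:

from sklearn.model_selection import train_test_split

# hypothetical split: a held-out test set, a small labeled seed batch,
# and a pool of customers we could still call
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y.astype(int), test_size=0.2, random_state=42
)
X_seed, X_pool, y_seed, y_pool = train_test_split(
    X_rest, y_rest, train_size=100, random_state=42
)

active_pipeline.fit(X_seed, y_seed)
print('ROC AUC after the seed batch:', active_pipeline.score(X_test, y_test))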

We can run different simulations on our dataset with this setup. For example, we can compare a more exploratory setting (an exploitation value of 0.5) against pure exploitation (1.0), or against no learning at all after the first batch. We basically go through a loop:

  • Via active_pipeline.max_margin_uncertainty(), we present candidate data to the active pipeline and get back a number of data points, selected by combining uncertainty and target predictions according to the chosen method.
  • Once we have the actual outcomes for these data points, we update the model by calling active_pipeline.update().

You can see an example of this in the notebook on GitHub.
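
For reference, a minimal version of such a simulation loop could look like the following sketch. It reuses the X_pool, y_pool, X_test, and y_test splits from the earlier sketch, and the parameter values (exploitation, favor_class, the call budget) are just illustrative choices:

import numpy as np

n_calls = 200  # our budget of phone calls for this simulation
scores = []
for _ in range(n_calls):
    # ask the pipeline which customer looks most promising to call next,
    # favoring class 2 (the customers predicted to sign up)
    ind, uncertainty = active_pipeline.max_margin_uncertainty(
        X_pool, method='entropy', exploitation=0.9, favor_class=2, k=1
    )
    ind = np.asarray(ind)
    # 'make the call': reveal the true outcome and update the model
    active_pipeline.update(X_pool.iloc[ind], y_pool.iloc[ind])
    # remove the queried customer from the pool and track performance
    X_pool = X_pool.drop(X_pool.index[ind])
    y_pool = y_pool.drop(y_pool.index[ind])
    scores.append(active_pipeline.score(X_test, y_test))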

We can see that the curious (exploration-heavy) scheme wins out after the first few examples. Exploitation is actually the least successful scheme, and by not updating the model at all, performance deteriorates over time.

This is an ideal scenario for active learning or reinforcement learning because, not unlike in reinforcement learning, uncertainty can serve as an additional criterion, apart from the expected return from a customer. Over time, this entropy-reduction-seeking behavior diminishes as the model's understanding of the customers improves.

How it works...

It's worth delving a bit more into a few of the concepts and strategies employed in this recipe.

Active learning

Active learning means that we can actively query for more information; in other words, exploration is part of our strategy. This can be useful in scenarios where we have to actively decide what to learn next, and where what we learn influences not only how much and how well our model learns, but also how much return on investment we can get.

Hoeffding Tree

The Hoeffding Tree (also known as the Very Fast Decision Tree, VFDT for short) was introduced in 2000 by Pedro Domingos and Geoff Hulten (Mining high-speed data streams). It is an incrementally growing decision tree for streamed data. Tree nodes are expanded based on the Hoeffding bound (or additive Chernoff bound). It was theoretically shown that, given sufficient training data, a model learned by a Hoeffding Tree converges very closely to the one built by a non-incremental learner.

The Hoeffding bound is defined as follows:

ε = √(R² ln(1/δ) / (2N))

where R is the range of the random variable, δ is the confidence parameter (the bound holds with probability 1 − δ), and N is the number of observations. In the tree, this bound determines how many examples are needed before the best split attribute can be chosen with confidence.

It's important to note that the Hoeffding Tree doesn't deal with data distributions that change over time.
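
To make the incremental behavior concrete, here is a small, self-contained sketch that is independent of our customer data. It trains a Hoeffding Tree on a synthetic stream from scikit-multiflow's SEA generator and evaluates it prequentially (each batch is used for testing first, then for training); the generator and batch size are arbitrary choices for illustration:

import numpy as np
from skmultiflow.data import SEAGenerator
from skmultiflow.trees.hoeffding_tree import HoeffdingTreeClassifier

stream = SEAGenerator(random_state=1)
tree = HoeffdingTreeClassifier()

correct, seen = 0, 0
while seen < 5000 and stream.has_more_samples():
    X_batch, y_batch = stream.next_sample(100)
    if seen > 0:
        # test on the new batch before training on it
        correct += np.sum(tree.predict(X_batch) == y_batch)
    tree.partial_fit(X_batch, y_batch)
    seen += len(y_batch)

print('prequential accuracy: {:.3f}'.format(correct / (seen - 100)))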

Class weighting

Since we are dealing with an imbalanced dataset, let's use class weights. This effectively means upweighting the minority class (customers who sign up) and downweighting the majority class (customers who don't sign up).

The formula for the class weights is as follows: for each class c, the weight is w_c = n_samples / (n_classes × n_samples_c); with our two classes, this becomes w_c = N / (2 × N_c).

Similarly, in Python, we can write the following:

class_weights = len(X) / (y.astype(int).value_counts() * 2)

We can then use these class weights to weight the samples when updating the model, as we did in the update() method of our ActivePipeline class.
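
This is the same as scikit-learn's 'balanced' class-weight heuristic, n_samples / (n_classes * n_samples_per_class), which you can use as a cross-check. A small sketch, assuming the X and y objects from earlier in the recipe:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_int = y.astype(int)
balanced = compute_class_weight(
    class_weight='balanced', classes=np.unique(y_int), y=y_int
)
# should print the same values as class_weights above
print(dict(zip(np.unique(y_int), balanced)))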

We'll close the recipe with a few more pointers.

See also

Only a few models in scikit-learn allow incremental or online learning. Refer to the list at https://scikit-learn.org/stable/modules/computing.html.

A few linear models include the partial_fit() method. The scikit-multiflow library specializes in incremental and online/streaming models: https://scikit-multiflow.github.io/

You can find more resources and ideas regarding active learning from a recent review that concentrates on biomedical image processing (Samuel Budd and others, A Survey on Active Learning and Human-in-the-Loop Deep Learning for Medical Image Analysis, 2019; https://arxiv.org/abs/1910.02923).

Our approach is inspired by the modAL Python active learning package, which you can find at https://modal-python.readthedocs.io/. We recommend you check it out if you are interested in active learning approaches. A few more Python packages for active learning are available as well.

One of the main decisions in active learning is the trade-off between exploration and exploitation. You can find out more about this in a paper called Exploration versus exploitation in active learning: a Bayesian approach: http://www.vincentlemaire-labs.fr/publis/ijcnn_2_2010_camera_ready.pdf