Interpretable Machine Learning with Python

By: Serg Masís

Overview of this book

Do you want to gain a deeper understanding of your models and better mitigate the risks of poor predictions through machine learning interpretation? If so, then Interpretable Machine Learning with Python deserves a place on your bookshelf. We'll start with the fundamentals of interpretability and its relevance in business, and explore its key aspects and challenges. As you progress through the chapters, you'll focus on how white-box models work, compare them to black-box and glass-box models, and examine their trade-offs. You'll also get up to speed with a vast array of interpretation methods, also known as Explainable AI (XAI) methods, and learn how to apply them to different use cases, be it classification or regression, on tabular, time-series, image, or text data. In addition to the step-by-step code, this book will also help you interpret model outcomes using examples. You'll get hands-on with tuning models and training data for interpretability by reducing complexity, mitigating bias, placing guardrails, and enhancing reliability. The methods you'll explore here range from state-of-the-art feature selection and dataset debiasing methods to monotonic constraints and adversarial retraining. By the end of this book, you'll be able to understand ML models better and enhance them through interpretability tuning.
Table of Contents (19 chapters)
Section 1: Introduction to Machine Learning Interpretation
Section 2: Mastering Interpretation Methods
Section 3: Tuning for Interpretability

What is machine learning interpretation?

To interpret something is to explain the meaning of it. In the context of machine learning, that something is an algorithm. More specifically, that algorithm is a mathematical one that takes input data and produces an output, much like with any formula.

Let's examine the most basic of models, simple linear regression, illustrated in the following formula:

$$\hat{y} = \beta_0 + \beta_1 x_1$$

Once fitted to the data, the meaning of this model is that the predictions $\hat{y}$ are a weighted sum of the features with the coefficients $\beta$. In this case, there's only one feature or predictor variable, $x_1$, and the $y$ variable is typically called the response or target variable. A simple linear regression formula single-handedly explains the transformation that is performed on the input data $x_1$ to produce the output $\hat{y}$. The following example illustrates this concept in further detail.
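
To make the weighted-sum idea concrete, here is a minimal sketch that fits a simple linear regression to made-up numbers using the closed-form least-squares solution. The data and the names x1, y, beta_0, and beta_1 are assumptions chosen for illustration; they are not values from this chapter's dataset.

```python
import numpy as np

# Hypothetical toy data: one feature x1 and a target y that is roughly 2 + 3 * x1.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 7.9, 11.2, 13.8, 17.0])

# Least squares for a single feature:
#   beta_1 = sum((x1 - mean(x1)) * (y - mean(y))) / sum((x1 - mean(x1)) ** 2)
#   beta_0 = mean(y) - beta_1 * mean(x1)
beta_1 = np.sum((x1 - x1.mean()) * (y - y.mean())) / np.sum((x1 - x1.mean()) ** 2)
beta_0 = y.mean() - beta_1 * x1.mean()

# Each prediction is the intercept plus the feature weighted by its coefficient.
y_hat = beta_0 + beta_1 * x1
print(f"y_hat = {beta_0:.3f} + {beta_1:.3f} * x1")
```

Because the whole model is two numbers, reading the intercept and the coefficient is enough to know how any input is turned into an output.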

Understanding a simple weight prediction model

If you go to this web page maintained by the University of California, you can find a link to download a dataset of synthetic records of the weights and heights of 18-year-olds. We won't use the entire dataset, only the sample table of records on the web page itself. We scrape the table from the web page and fit a linear regression model to the data. The model uses the height to predict the weight.

In other words, $x_1 = \text{height}$ and $y = \text{weight}$, so the formula for the linear regression model would be as follows:

$$\text{weight} = \beta_0 + \beta_1 \, \text{height}$$

You can find the code for this example here:

To run this example, you need to install the following libraries:

  • requests to fetch the web page
  • bs4 (Beautiful Soup) to scrape the table from the web page
  • pandas to load the table into a dataframe
  • sklearn (scikit-learn) to fit the linear regression model and calculate its error
  • matplotlib to visualize the model
  • scipy to test the correlation

You should load all of them first, as follows:

import math
import requests
from bs4 import BeautifulSoup
import pandas as pd
from sklearn import linear_model
from sklearn.metrics import mean_absolute_error
import matplotlib.pyplot as plt
from scipy.stats import pearsonr

Once the libraries are all loaded, you use requests to fetch the contents of the web page, like this:

url = '...'  # address of the University of California web page described earlier
page = requests.get(url)

Then, take these contents and scrape out just the contents of the table with BeautifulSoup, as follows:

soup = BeautifulSoup(page.content, 'html.parser')
tbl = soup.find("table",{"class":"wikitable"})

pandas can turn the raw HyperText Markup Language (HTML) contents of the table into a dataframe, as illustrated here:

height_weight_df = pd.read_html(str(tbl))[0]\
                        [['Height(Inches)','Weight(Pounds)']]

And voilà! We now have a dataframe with Height(Inches) in one column and Weight(Pounds) in another. As a sanity check, we can then count the number of records and confirm that it matches the number of rows in the sample table. The code is shown here:

num_records = height_weight_df.shape[0]

Now that we have confirmed that we have the data, we must transform it so that it conforms to the model's specifications. sklearn needs it as two-dimensional NumPy arrays, so we must first extract the Height(Inches) and Weight(Pounds) pandas Series. Then, we turn them into NumPy arrays and, finally, reshape them into (num_records, 1) dimensions. The following commands perform all the necessary transformation operations:

x = height_weight_df['Height(Inches)'].values.\
                                       reshape(num_records, 1)
y = height_weight_df['Weight(Pounds)'].values.\
                                       reshape(num_records, 1)
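
As a rough illustration of what the reshape step does, the following sketch uses three made-up heights in place of the scraped column. The array contents and the names heights and x_toy are hypothetical.

```python
import numpy as np

# Hypothetical heights standing in for height_weight_df['Height(Inches)'].values
heights = np.array([65.8, 71.5, 69.4])

print(heights.shape)   # (3,)   one-dimensional: three values, no feature axis
x_toy = heights.reshape(len(heights), 1)
print(x_toy.shape)     # (3, 1) two-dimensional: three samples, one feature

# reshape(-1, 1) is an equivalent shorthand that lets NumPy infer the row count.
same_shape = heights.reshape(-1, 1).shape == x_toy.shape
```

The second axis is the feature axis, which is why a single predictor still has to be passed as a column rather than as a flat array.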

Then, you initialize the scikit-learn LinearRegression model and fit it with the training data, as follows:

model = linear_model.LinearRegression()
_ = model.fit(x, y)

To output the fitted linear regression model formula in scikit-learn, you must extract the intercept and coefficients. This is the formula that explains how it makes predictions:

print("ŷ = " + str(model.intercept_[0]) + " + " +\
                          str(model.coef_.T[0][0]) + " x1")

The following is the output:

ŷ = -106.02770644878132 + 3.432676129271629 x1

This tells us that, on average, for every additional inch of height, there are 3.4 additional pounds of weight.
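
One way to read the coefficient is to plug two heights that differ by one inch into the printed formula. The sketch below does that with the intercept and slope copied from the output above; the helper predict_weight is a hypothetical name introduced only for this illustration.

```python
# Intercept and slope copied from the printed formula.
intercept = -106.02770644878132
slope = 3.432676129271629

def predict_weight(height_inches):
    """Return the predicted weight in pounds for a height in inches."""
    return intercept + slope * height_inches

# Two hypothetical subjects whose heights differ by exactly one inch.
w70 = predict_weight(70)
w71 = predict_weight(71)
print(round(w70, 1), round(w71, 1), round(w71 - w70, 2))
```

The difference between the two predictions equals the slope, which is what "pounds per additional inch" means in practice.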

However, explaining how the model works is only one side of the story. The model isn't perfect because the actual outcomes and the predicted outcomes are not the same for the training data. The difference between the two is the error, also called the residual.

There are many ways of understanding an error in a model. You can use an error function such as mean_absolute_error to measure the deviation between the predicted values and the actual values, as illustrated in the following code snippet:

y_pred = model.predict(x)
mae = mean_absolute_error(y, y_pred)
print(mae)

The mean absolute error printed here is the number of pounds by which, on average, the prediction differs from the actual amount, but this single number might not be intuitive or informative. Visualizing the linear regression model can shed some light on how accurate these predictions truly are.
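
For intuition about what this metric averages, here is a small sketch that computes the mean absolute error by hand with NumPy. The actual and predicted weights are made-up numbers, not results from the model above.

```python
import numpy as np

# Hypothetical actual and predicted weights in pounds.
actual = np.array([120.0, 135.0, 150.0, 128.0])
predicted = np.array([126.0, 131.0, 141.0, 129.0])

# Residuals are the signed errors; the MAE averages their absolute values.
residuals = actual - predicted
mae_by_hand = np.mean(np.abs(residuals))
largest_error = np.max(np.abs(residuals))
print(mae_by_hand, largest_error)
```

In this toy case the largest individual error is almost twice the average, which is the kind of detail a single summary number hides.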

This can be done by using a matplotlib scatterplot and overlaying the linear model (in blue) and the mean absolute error (as two parallel bands in gray), as shown in the following code snippet:

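plt.scatter(x, y, color='black')                  # actual observations
plt.plot(x, y_pred, color='blue', linewidth=3)    # fitted regression line
plt.plot(x, y_pred + mae, color='lightgray')      # upper mean absolute error band
plt.plot(x, y_pred - mae, color='lightgray')      # lower mean absolute error band
plt.xlabel('Height(Inches)')
plt.ylabel('Weight(Pounds)')
plt.show()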

If you run the preceding snippet, the plot shown here in Figure 1.1 is what you get as the output:

Figure 1.1 – Linear regression model to predict weight based on height

As you can appreciate from the plot in Figure 1.1, there are many times in which the actuals are considerably further from the prediction than the mean absolute error suggests. Yet the mean absolute error can fool you into thinking that the error is always close to that average. This is why it is essential to visualize the error of the model to understand its distribution. Judging from this graph, we can tell that there are no red flags that stand out about this distribution, such as residuals being more spread out for one range of heights than for others. Since it is more or less equally spread out, we say it's homoscedastic. In the case of linear regression, this is one of many model assumptions you should test for, along with linearity, normality, independence, and lack of multicollinearity (if there's more than one feature). These assumptions ensure that you are using the right model for the job. In other words, the relationship between height and weight can be explained with a linear model, and it is a good idea to do so, statistically speaking.
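
A quick, informal way to back up the visual impression is to compare the spread of the residuals for shorter and taller subjects. The sketch below does this on synthetic data generated with constant-variance noise; the sample size, noise level, seed, and variable names are all assumptions, and this rough comparison is not a substitute for a formal test such as Breusch-Pagan.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: heights in inches, weights with constant-variance noise.
heights = rng.uniform(63, 75, size=200)
weights = -106.0 + 3.43 * heights + rng.normal(0, 10, size=200)

# Fit a straight line (np.polyfit returns the slope first) and compute residuals.
slope, intercept = np.polyfit(heights, weights, deg=1)
residuals = weights - (intercept + slope * heights)

# Compare the residual spread below and above the median height.
median_height = np.median(heights)
std_short = residuals[heights <= median_height].std()
std_tall = residuals[heights > median_height].std()
spread_ratio = std_tall / std_short
print(f"ratio of residual spreads (tall / short): {spread_ratio:.2f}")
```

A ratio near 1 is consistent with homoscedasticity, while a ratio far from 1 would suggest that the errors fan out for one range of heights.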

With this model, we are trying to establish a linear relationship between height and weight. This association is called a linear correlation. One way to measure this relationship's strength is with Pearson's correlation coefficient. This statistical method measures the association between two variables using their covariance divided by the product of their standard deviations. It is a number between -1 and 1, whereby the closer the number is to zero, the weaker the association is. If the number is positive, there is a positive association, and if it's negative, there is a negative one. In Python, you can compute Pearson's correlation coefficient with the pearsonr function from scipy, as illustrated here:

corr, pval = pearsonr(x[:,0], y[:,0])
print(corr)

The printed coefficient is positive, which is no surprise because as height increases, weight also tends to increase, but it is also closer to 1 than to 0, denoting that the two variables are strongly correlated. The second number produced by the pearsonr function is the p-value for testing non-correlation. If we test that it's less than an error level of 5%, we can say there's sufficient evidence of this correlation, as illustrated here:

print(pval < 0.05)  # True when the p-value is below the 5% error level

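The definition of Pearson's correlation coefficient can also be checked by hand. The following sketch, which uses five made-up height and weight pairs and hypothetical variable names, divides the covariance by the product of the standard deviations and compares the result with NumPy's np.corrcoef.

```python
import numpy as np

# Hypothetical heights (inches) and weights (pounds).
h = np.array([64.0, 66.0, 68.0, 70.0, 72.0])
w = np.array([118.0, 131.0, 127.0, 140.0, 149.0])

# Covariance divided by the product of the standard deviations.
# Both use the population form (divide by n), so the factors cancel consistently.
cov_hw = np.mean((h - h.mean()) * (w - w.mean()))
r_by_hand = cov_hw / (h.std() * w.std())

# np.corrcoef returns the correlation matrix; the off-diagonal entry is r.
r_numpy = np.corrcoef(h, w)[0, 1]
print(round(r_by_hand, 4), round(r_numpy, 4))
```

The two values agree, and the sign of the covariance alone determines whether the association is positive or negative.
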
Understanding how a model performs and in which circumstances can help us explain why it makes certain predictions, and when it cannot. Let's imagine we are asked to explain why someone who is 71 inches tall was predicted to have a weight of about 138 pounds but instead weighed 18 pounds more. Judging from what we know about the model, this margin of error is not unusual even though it's not ideal. However, there are many circumstances in which we cannot expect this model to be reliable. What if we were asked to predict the weight of a person who is 56 inches tall with the help of this model? Could we assure the same level of accuracy? Definitely not, because we fit the model on the data of subjects no shorter than 63 inches. Ditto if we were asked to predict the weight of a 9-year-old, because the training data was for 18-year-olds.
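
One simple guardrail that follows from this reasoning is to check whether a new input falls inside the range of values the model was trained on before trusting its prediction. The sketch below assumes a handful of made-up training heights and a hypothetical helper named in_training_range; the real range would come from the scraped table.

```python
import numpy as np

# Hypothetical training heights in inches.
train_heights = np.array([63.5, 65.2, 67.9, 70.1, 73.4])

def in_training_range(height_inches, train=train_heights):
    """Return True if the height lies within the range seen during training."""
    return bool(train.min() <= height_inches <= train.max())

print(in_training_range(71))  # True: inside the observed range
print(in_training_range(56))  # False: the model would be extrapolating
```

A range check like this only covers the features in the model. It would not catch the 9-year-old case, because age is not an input at all.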

Despite the acceptable results, this weight prediction model was not a realistic example. If you wanted to be more accurate but—more importantly—faithful to what can really impact the weight of an individual, you would need to add more variables. You can add—say—gender, age, diet, and activity level. This is where it gets interesting because you have to make sure it is fair to include them, or not to include them. For instance, if gender were included yet most of our dataset was composed of males, how could you ensure accuracy for females? This is what is called selection bias. And what if weight had more to do with lifestyle choices and circumstances such as poverty and pregnancy than gender? If these variables aren't included, this is called omitted variable bias. And then, does it make sense to include the sensitive gender variable at the risk of adding bias to the model?
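
Omitted variable bias can be shown with a small simulation. In the sketch below, weight is generated from both height and an activity score that is correlated with height, and the height coefficient is then estimated with and without the activity variable. The data-generating process, the coefficients, the seed, and all names are assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical process: weight depends on height AND activity,
# and activity is itself correlated with height.
height = rng.normal(68, 3, size=n)
activity = 0.5 * (height - 68) + rng.normal(0, 1, size=n)
weight = -100 + 3.0 * height - 4.0 * activity + rng.normal(0, 5, size=n)

# Model A omits activity; model B includes it. A column of ones is the intercept.
X_a = np.column_stack([np.ones(n), height])
X_b = np.column_stack([np.ones(n), height, activity])
coef_a, *_ = np.linalg.lstsq(X_a, weight, rcond=None)
coef_b, *_ = np.linalg.lstsq(X_b, weight, rcond=None)

print(f"height coefficient without activity: {coef_a[1]:.2f}")
print(f"height coefficient with activity:    {coef_b[1]:.2f}")
```

When activity is left out, the height coefficient absorbs part of its effect and lands far from the value of 3.0 used to generate the data, so any explanation based on that coefficient would be misleading.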

Once you have multiple features that you have vetted for fairness, you can find out and explain which features impact model performance. We call this feature importance. However, as we add more variables, we increase the complexity of the model. Paradoxically, this is a problem for interpretation, and we will explore this in further detail in the following chapters. For now, the key takeaway should be that model interpretation has a lot to do with explaining the following:

  1. Can we explain that predictions were made fairly?
  2. Can we trace the predictions reliably back to something or someone?
  3. Can we explain how predictions were made? Can we explain how the model works?

And ultimately, the question we are trying to answer is this:

Can we trust the model?

The three main concepts of interpretable machine learning directly relate to the three preceding questions and have the acronym of FAT, which stands for fairness, accountability, and transparency. If you can explain that predictions were made without discernible bias, then there is fairness. If you can explain why it makes certain predictions, then there's accountability. And if you can explain how predictions were made and how the model works, then there's transparency. There are many ethical concerns associated with these concepts, as shown here in Figure 1.2:

Figure 1.2 – Three main concepts of Interpretable Machine Learning

Some researchers and companies have expanded FAT under a larger umbrella of ethical artificial intelligence (AI), thus turning FAT into FATE. Ethical AI is part of an even larger discussion of algorithmic and data governance. However, both concepts very much overlap since interpretable machine learning is how FAT principles and ethical concerns get implemented in machine learning. In this book, we will discuss ethics in this context. For instance, Chapter 13, Adversarial Robustness, relates to reliability, safety, and security. Chapter 11, Mitigating Bias and Causal Inference Methods, relates to fairness. That being said, interpretable machine learning can be leveraged with no ethical aim in mind, and also for unethical reasons.