Mission accomplished

The first part of the mission was to understand the risk factors for cardiovascular disease. According to the logistic regression model, the top four risk factors are systolic blood pressure (ap_hi), age, cholesterol, and weight, of which only age is non-modifiable. However, you also realized that systolic blood pressure (ap_hi) is not as meaningful on its own, since it relies on diastolic blood pressure (ap_lo) for interpretation; the same goes for weight and height. We learned that the interaction of features plays a crucial role in interpretation, and so does their relationship with each other and with the target variable, whether linear or monotonic. Furthermore, the data is only a representation of the truth, and that representation can be wrong. After all, we found anomalies that, left unchecked, can bias our model.
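As a quick recap of how such a ranking can be read off the model, here is a minimal sketch, assuming a pandas DataFrame X of features and a binary target y like the ones used in this chapter (the variable names are illustrative):

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Standardizing the features puts the coefficients on a comparable scale,
# so their absolute magnitudes can be ranked as rough importances.
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

log_model = LogisticRegression(max_iter=1000)
log_model.fit(X_std, y)

# The four largest absolute coefficients correspond to the top risk factors.
coefs = pd.Series(log_model.coef_[0], index=X.columns)
print(coefs.abs().sort_values(ascending=False).head(4))

Keep in mind that coefficient magnitude is only a crude importance measure; it says nothing about the feature interactions discussed above.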

Another source of bias is how the data was collected. After all, you might wonder why the model’s top features were all objective and examination features. Why isn’t smoking or drinking a larger factor? To verify whether sample bias was involved, you would have to compare your dataset with other, more trustworthy datasets to check whether it underrepresents drinkers and smokers. Or maybe the bias was introduced by the survey question, which asked whether respondents smoke now rather than whether they have ever smoked for an extended period.
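A first sanity check along these lines could look like the following sketch, which assumes the dataset lives in a pandas DataFrame df with binary smoke and alco columns; the reference prevalences are placeholders standing in for figures from a trustworthy external survey:

# Placeholder prevalences; substitute figures from a trusted survey.
reference = {"smoke": 0.25, "alco": 0.15}

for col, ref_rate in reference.items():
    sample_rate = df[col].mean()  # binary 0/1 columns, so mean = proportion
    print(f"{col}: sample={sample_rate:.1%} vs. reference={ref_rate:.1%}")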

Another type of bias that we could address is exclusion bias: our data might be missing information that explains the truth the model is trying to depict. For instance, we know through medical research that blood pressure issues such as isolated systolic hypertension, which increases CVD risk, are caused by underlying conditions such as diabetes, hyperthyroidism, arterial stiffness, and obesity, to name a few. Of these conditions, obesity is the only one we can derive from the data. If we want to interpret a model’s predictions well, we need to have all the relevant features; otherwise, there will be gaps we cannot explain. Maybe once we add them, they won’t make much of a difference, but that’s what the methods we will learn in Chapter 10, Feature Selection and Engineering for Interpretability, are for.
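Obesity, for example, can be derived from the existing weight and height columns via the body mass index. Here is a minimal sketch, assuming weight is recorded in kilograms and height in centimeters, as in this chapter's dataset:

# BMI = weight (kg) / height (m) squared; height here is in centimeters.
df["bmi"] = df["weight"] / (df["height"] / 100) ** 2

# A BMI of 30 or above is the standard clinical threshold for obesity.
df["obese"] = (df["bmi"] >= 30).astype(int)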

The second part of the mission was to be able to interpret individual model predictions, which we can do well enough by plotting decision regions. It’s a simple method, but it has many limitations, especially when there are more than a handful of features and they interact heavily with each other. Chapter 5, Local Model-Agnostic Interpretation Methods, and Chapter 6, Anchors and Counterfactual Explanations, will cover local interpretation methods in more detail. However, the decision region plot helps illustrate many of the concepts surrounding decision boundaries, which we will discuss in those chapters.
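To give a flavor of the method, here is a minimal sketch of a two-feature decision region plot built with matplotlib, reusing the df, X, y, scaler, and log_model names from the sketches above; the feature pair and the choice to hold the remaining features at their medians are illustrative:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Grid over two features of interest (ap_hi and age).
x0 = np.linspace(df["ap_hi"].min(), df["ap_hi"].max(), 200)
x1 = np.linspace(df["age"].min(), df["age"].max(), 200)
xx0, xx1 = np.meshgrid(x0, x1)

# Hold every other feature at its median; this illustrative choice hides
# interactions, which is one of the method's limitations.
grid = pd.DataFrame({c: np.full(xx0.size, df[c].median()) for c in X.columns})
grid["ap_hi"] = xx0.ravel()
grid["age"] = xx1.ravel()

# Shade predicted classes and overlay the observations being explained.
zz = log_model.predict(scaler.transform(grid)).reshape(xx0.shape)
plt.contourf(xx0, xx1, zz, alpha=0.3)
plt.scatter(df["ap_hi"], df["age"], c=y, s=2, alpha=0.5)
plt.xlabel("ap_hi")
plt.ylabel("age")
plt.show()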