Interpretable Machine Learning with Python - Second Edition

By: Serg Masís
Overview of this book

Interpretable Machine Learning with Python, Second Edition, brings to light the key concepts of interpreting machine learning models by analyzing real-world data, providing you with a wide range of skills and tools to decipher the results of even the most complex models. Build your interpretability toolkit with several use cases, from flight delay prediction to waste classification to COMPAS risk assessment scores. This book is full of useful techniques, matching each to the right use case. Learn methods ranging from traditional ones, such as feature importance and partial dependence plots, to integrated gradients for NLP interpretations and gradient-based attribution methods, such as saliency maps. In addition to the step-by-step code, you’ll get hands-on with tuning models and training data for interpretability by reducing complexity, mitigating bias, placing guardrails, and enhancing reliability. By the end of the book, you’ll be confident in tackling interpretability challenges with black-box models using tabular, language, image, and time series data.

Learning about evasion attacks

There are six broad categories of adversarial attacks:

  • Evasion: designing an input that causes a model to make an incorrect prediction, especially one that wouldn’t fool a human observer. It can be either targeted or untargeted, depending on whether the attacker intends to fool the model into misclassifying an input as a specific class (targeted) or as any incorrect class (untargeted). Attack methods can be white-box, if the attacker has full access to the model and its training dataset; black-box, with only inference access; or gray-box, which sits in between. Black-box methods are always model-agnostic, whereas white-box and gray-box methods may or may not be. A minimal evasion sketch follows this list.
  • Poisoning: injecting faulty training data or parameters into a model. This can come in many forms, depending on the attacker’s capabilities and access. For instance, for systems with user-generated data, the attacker may be capable of adding faulty data or labels (see the label-flipping sketch after this list). If they have more access...
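
To make the targeted/untargeted and white-box/black-box distinctions concrete, here is a minimal sketch of a one-step gradient evasion attack in the style of the Fast Gradient Sign Method (FGSM), written in PyTorch. This is an illustrative sketch, not the book’s implementation: the `fgsm_attack` function name, the `epsilon` budget, and the [0, 1] input range are assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y_true, epsilon=0.03, y_target=None):
    """Craft a one-step gradient-sign evasion example (white-box).

    Untargeted (y_target is None): step *up* the loss gradient for the
    true label, nudging the model toward any incorrect class.
    Targeted: step *down* the loss gradient for y_target, nudging the
    model toward that specific class.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss_label = y_true if y_target is None else y_target
    loss = F.cross_entropy(model(x_adv), loss_label)
    loss.backward()
    sign = 1.0 if y_target is None else -1.0  # raise vs. lower the loss
    # Take one small step in the direction of the gradient's sign,
    # then clip back to a valid input range.
    x_adv = x_adv + sign * epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Because the perturbation is computed from the model’s own gradients, this is a white-box attack; a black-box attacker with only inference access would instead estimate the gradient from queries, or craft the example on a substitute model and rely on transferability, which is why black-box methods are inherently model-agnostic.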
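
For the poisoning category, here is a minimal label-flipping sketch, assuming an attacker who can contribute labeled examples to a user-generated training pool. The `flip_labels` helper and its parameters are hypothetical, for illustration only.

```python
import numpy as np

def flip_labels(y_train, flip_fraction=0.05, target_class=1,
                new_class=0, seed=0):
    """Simulate data poisoning by flipping a fraction of labels.

    Relabels flip_fraction of the examples whose label is target_class
    as new_class before training, corrupting the decision boundary the
    model learns for that class.
    """
    rng = np.random.default_rng(seed)
    y_poisoned = y_train.copy()
    candidates = np.flatnonzero(y_train == target_class)
    n_flips = int(flip_fraction * len(candidates))
    flipped = rng.choice(candidates, size=n_flips, replace=False)
    y_poisoned[flipped] = new_class
    return y_poisoned
```

Even a small flip_fraction can measurably degrade accuracy on the targeted class, which is what makes this kind of attack attractive when the attacker controls only a sliver of the training data.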