Book Image

Interpretable Machine Learning with Python - Second Edition

By : Serg Masís
4 (4)
Book Image

Interpretable Machine Learning with Python - Second Edition

4 (4)
By: Serg Masís

Overview of this book

Interpretable Machine Learning with Python, Second Edition, brings to light the key concepts of interpreting machine learning models by analyzing real-world data, providing you with a wide range of skills and tools to decipher the results of even the most complex models. Build your interpretability toolkit with several use cases, from flight delay prediction to waste classification to COMPAS risk assessment scores. This book is full of useful techniques, introducing them to the right use case. Learn traditional methods, such as feature importance and partial dependence plots to integrated gradients for NLP interpretations and gradient-based attribution methods, such as saliency maps. In addition to the step-by-step code, you’ll get hands-on with tuning models and training data for interpretability by reducing complexity, mitigating bias, placing guardrails, and enhancing reliability. By the end of the book, you’ll be confident in tackling interpretability challenges with black-box models using tabular, language, image, and time series data.
Table of Contents (17 chapters)
15
Other Books You May Enjoy
16
Index

Considering feature engineering

Let’s assume that the non-profit has chosen to use the model whose features were selected with LASSO LARS with AIC (e-llarsic) but would like to evaluate whether you can improve it further. Now that you have removed over 300 features that might have only marginally improved predictive performance but mostly added noise, you are left with more relevant features. However, you also know that 8 features selected by e-llars produced the same amount of RMSE as the 111 features. This means that while there’s something in those extra features that improves profitability, it does not improve the RMSE.

From a feature selection standpoint, many things can be done to approach this problem. For instance, examine the overlap and difference of features between e-llarsic and e-llars, and do feature selection variations strictly on those features to see whether the RMSE dips on any combination while keeping or improving on current profitability. However...