Practical Deep Learning at Scale with MLflow

By: Yong Liu

Overview of this book

The book starts with an overview of the deep learning (DL) life cycle and the emerging Machine Learning Operations (MLOps) field, providing a clear picture of the four pillars of deep learning (data, model, code, and explainability) and the role of MLflow in these areas. From there onward, it guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. You’ll also tackle running DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tuning DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. As you progress, you’ll learn how to build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline for production using Ray Serve and AWS SageMaker, and finally create a DL explanation as a service (EaaS) using the popular Shapley Additive Explanations (SHAP) toolbox. By the end of this book, you’ll have built the foundation and gained the hands-on experience you need to develop a DL pipeline solution from initial offline experimentation to final deployment and production, all within a reproducible and open source framework.
Table of Contents (17 chapters)

Section 1 – Deep Learning Challenges and MLflow Prime
Section 2 – Tracking a Deep Learning Pipeline at Scale
Section 3 – Running Deep Learning Pipelines at Scale
Section 4 – Deploying a Deep Learning Pipeline at Scale
Section 5 – Deep Learning Model Explainability at Scale

What this book covers

Chapter 1, Deep Learning Life Cycle and MLOps Challenges, covers the five stages of the full DL life cycle and builds the first DL model in this book using a transfer learning approach for text sentiment classification. It also defines the concept of MLOps along with its three foundation layers and four pillars, and the roles of MLflow in these areas. An overview of the challenges in DL data, model, code, and explainability is also presented. This chapter is designed to bring everyone to the same foundational level and provides clarity and guidelines on the scope of the rest of the book.
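
The kind of transfer-learning setup that first model relies on can be sketched roughly as follows. This is only an illustrative sketch using the Hugging Face Transformers and Datasets libraries with a DistilBERT backbone and the public IMDb dataset, not necessarily the exact stack or dataset used in the chapter.

```python
# Illustrative transfer-learning sketch for text sentiment classification
# (assumed stack: Hugging Face Transformers + Datasets; the book's own
# example may use a different library and dataset).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")  # public sentiment dataset, used as a stand-in
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

encoded = dataset.map(tokenize, batched=True)

# Reuse the pretrained language model weights; only the new classification
# head is randomly initialized -- this is the transfer-learning step.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="outputs", num_train_epochs=1),
    train_dataset=encoded["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```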

Chapter 2, Getting Started with MLflow for Deep Learning, serves as an MLflow primer and a first hands-on learning module to quickly set up a local filesystem-based MLflow tracking server or interact with a remote managed MLflow tracking server in Databricks, and to perform a first DL experiment using MLflow autologging. It also explains foundational MLflow concepts through concrete examples, such as experiments, runs, the metadata of and relationship between experiments and runs, code tracking, model logging, and model flavors. Specifically, we underline that experiments should be first-class entities that can be used to bridge the gap between the offline and online production life cycles of DL models. This chapter builds the foundational knowledge of MLflow.
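
As a taste of what that first experiment looks like, here is a minimal sketch of pointing MLflow at a tracking server, naming an experiment, and enabling autologging; the tracking URI, experiment name, and logged values are placeholders.

```python
# Minimal sketch of a first MLflow-tracked DL experiment.
# Assumes a tracking server at http://localhost:5000 (use your Databricks
# workspace URI for the managed scenario); names and values are placeholders.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("dl_sentiment_classifier")  # experiments as first-class entities

mlflow.autolog()  # let supported DL frameworks log parameters, metrics, and models

with mlflow.start_run(run_name="first_dl_run") as run:
    # ... train the model here; autologging captures most of the tracking ...
    mlflow.log_param("learning_rate", 2e-5)   # manual logging still works alongside
    mlflow.log_metric("val_accuracy", 0.91)
    print(f"run_id: {run.info.run_id}")
```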

Chapter 3, Tracking Models, Parameters, and Metrics, covers the first in-depth learning module on tracking, using a fully fledged local MLflow tracking server. It starts with setting up this tracking server in Docker Desktop, with a MySQL backend store and a MinIO artifact store. Before implementing tracking, the chapter introduces an open provenance tracking framework based on the open provenance model vocabulary specification and presents six types of provenance questions that could be answered using MLflow. It then provides hands-on implementation examples of how to use MLflow's model-logging APIs and registry APIs to track model provenance, model metrics, and parameters, with or without autologging. Unlike typical MLflow API tutorials, which only provide guidance on using the APIs, this chapter focuses on how successfully MLflow can be used to answer the provenance questions. By the end of this chapter, we can answer four of the six provenance questions; the remaining two can only be answered once we have a multi-step pipeline or a deployment to production, which are covered in later chapters.
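
The provenance-oriented tracking the chapter builds toward centers on MLflow's model-logging and registry APIs; a rough sketch of the pattern, with placeholder parameter values and model names, looks like this.

```python
# Sketch of logging a model and registering it so provenance questions such as
# "which run produced this model version?" can be answered later.
# Assumes a full tracking server with a database backend (required by the
# model registry); the model, names, and values below are placeholders.
import mlflow
import mlflow.pytorch
import torch

mlflow.set_tracking_uri("http://localhost:5000")
model = torch.nn.Linear(4, 2)  # placeholder for the fine-tuned DL model

with mlflow.start_run() as run:
    mlflow.log_params({"epochs": 3, "lr": 2e-5})
    mlflow.log_metric("f1", 0.88)
    mlflow.pytorch.log_model(model, artifact_path="model")

# Registering the logged model creates a version that links back to run_id,
# which is exactly the provenance link the chapter cares about.
model_uri = f"runs:/{run.info.run_id}/model"
mlflow.register_model(model_uri, "nlp_sentiment_classifier")
```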

Chapter 4, Tracking Code and Data Versioning, covers the second in-depth learning module on MLflow tracking. It analyzes current practices around the usage of notebooks and pipelines in ML/DL projects. It recommends using VS Code notebooks and shows a concrete DL notebook example that can be run either interactively or non-interactively with MLflow tracking enabled. It also recommends using MLflow's MLproject to implement a multi-step DL pipeline through MLflow's entry points and pipeline chaining, and a three-step DL pipeline is created for DL model training and registration. In addition, it shows pipeline-level tracking and individual step tracking through parent-child nested runs in MLflow. Finally, it shows how to track publicly available and privately built Python libraries and data versioning in Delta Lake using MLflow.
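
The parent-child nesting that links a pipeline run to its step runs can be sketched as follows. In the chapter, the steps are MLproject entry points chained with mlflow.run(); this simplified sketch inlines the steps to show only the nesting mechanism, with placeholder step names and values.

```python
# Sketch of pipeline-level vs. step-level tracking with parent-child nested
# runs. In the chapter, each step is an MLproject entry point invoked via
# mlflow.run(); here the steps are inlined to show the nesting mechanism.
import mlflow

with mlflow.start_run(run_name="dl_pipeline"):                 # pipeline-level run
    with mlflow.start_run(run_name="download_data", nested=True):
        mlflow.log_param("dataset", "imdb_reviews")            # step-level tracking
    with mlflow.start_run(run_name="fine_tune", nested=True):
        mlflow.log_metric("val_accuracy", 0.90)
    with mlflow.start_run(run_name="register_model", nested=True):
        mlflow.log_param("registered_name", "nlp_sentiment_classifier")
```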

Chapter 5, Running DL Pipelines in Different Environments, covers how to run a DL pipeline in different execution environments. It starts with the scenarios and requirements for executing DL pipelines in different environments. It then shows how to use MLflow's command-line interface (CLI) to submit runs in four scenarios: running locally with local code, running locally with remote code in GitHub, running remotely in the cloud with local code, and running remotely in the cloud with remote code in GitHub. The flexibility and reproducibility that MLflow provides for executing a DL pipeline also supply the building blocks for continuous integration/continuous deployment (CI/CD) automation when needed.
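
The same four scenarios can also be expressed with MLflow's Python projects API, which mirrors the mlflow run CLI the chapter uses; the GitHub URL, entry-point name, and Databricks cluster spec file below are placeholders.

```python
# Sketch of the four execution scenarios via MLflow's projects API (the
# chapter uses the equivalent `mlflow run` CLI). The GitHub URL, entry-point
# name, and backend config file are placeholders.
import mlflow

# 1. Run locally with local code (an MLproject in the current directory).
mlflow.run(".", entry_point="main")

# 2. Run locally with remote code hosted in GitHub.
mlflow.run("https://github.com/your-org/your-dl-project", entry_point="main")

# 3./4. Run remotely in the cloud (for example, Databricks) with local or
# GitHub-hosted code by choosing a remote backend and a cluster spec.
mlflow.run(
    "https://github.com/your-org/your-dl-project",
    entry_point="main",
    backend="databricks",
    backend_config="cluster_spec.json",  # placeholder cluster definition
)
```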

Chapter 6, Running Hyperparameter Tuning at Scale, covers using MLflow to support HPO at scale with state-of-the-art HPO frameworks such as Ray Tune. It starts with a review of the types and challenges of DL pipeline hyperparameters. It then compares three HPO frameworks, Ray Tune, Optuna, and HyperOpt, analyzing their pros and cons and the maturity of their integration with MLflow. It then recommends and shows how to use Ray Tune with MLflow to run HPO for the DL model we have been working on throughout this book, and covers how to switch to other HPO search and scheduler algorithms such as Optuna and HyperBand. This enables us to produce high-performance DL models that meet business requirements in a cost-effective and scalable way.
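
A rough sketch of the Ray Tune plus MLflow pattern is shown below; the import paths for the MLflow callback and the scheduler follow the Ray 1.x layout of that era and may differ in newer Ray releases, and the training function, metric, and search space are placeholders.

```python
# Sketch of HPO with Ray Tune logging every trial to MLflow. Import paths for
# the MLflow integration vary across Ray versions; adjust as needed.
from ray import tune
from ray.tune.integration.mlflow import MLflowLoggerCallback
from ray.tune.schedulers import ASHAScheduler  # a HyperBand-style scheduler

def train_fn(config):
    # ... build and fine-tune the DL model using config["lr"], config["batch_size"] ...
    val_loss = 0.35  # placeholder; report the real validation loss here
    tune.report(val_loss=val_loss)

analysis = tune.run(
    train_fn,
    config={
        "lr": tune.loguniform(1e-5, 1e-3),
        "batch_size": tune.choice([16, 32, 64]),
    },
    num_samples=20,
    scheduler=ASHAScheduler(metric="val_loss", mode="min"),
    callbacks=[MLflowLoggerCallback(
        tracking_uri="http://localhost:5000",
        experiment_name="dl_hpo",
        save_artifact=True,
    )],
)
print(analysis.get_best_config(metric="val_loss", mode="min"))
```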

Chapter 7, Multi-Step Deep Learning Inference Pipeline, covers creating a multi-step inference pipeline using MLflow's custom Python model approach. It starts with an overview of four patterns of inference workflows in production, where a single trained model is usually not enough to meet the business application requirements and additional preprocessing and postprocessing steps are needed. It then presents a step-by-step guide to implementing a multi-step inference pipeline that wraps the previously fine-tuned DL sentiment model with language detection, caching, and additional model metadata. This inference pipeline is logged as a generic MLflow PyFunc model that can be loaded with the common MLflow PyFunc load API. Having an inference pipeline wrapped as an MLflow model opens the door to automation and consistent management of the model pipeline within the same MLflow framework.
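
The custom Python model approach boils down to subclassing mlflow.pyfunc.PythonModel; the sketch below shows the shape of such a wrapper, with the language-detection step reduced to a stub and the fine-tuned model URI left as a placeholder.

```python
# Sketch of wrapping a multi-step inference pipeline as a generic MLflow
# PyFunc model. The preprocessing stub and the fine-tuned model URI are
# placeholders; caching and richer metadata are omitted for brevity.
import mlflow
import mlflow.pyfunc

class InferencePipeline(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the previously fine-tuned sentiment model shipped as an artifact.
        self.model = mlflow.pyfunc.load_model(context.artifacts["finetuned_model"])

    def _detect_language(self, text):
        return "en"  # placeholder preprocessing step

    def predict(self, context, model_input):
        # Preprocess -> model inference -> postprocess (attach metadata).
        languages = [self._detect_language(t) for t in model_input["text"]]
        predictions = self.model.predict(model_input)
        return {"predictions": predictions, "languages": languages}

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="inference_pipeline",
        python_model=InferencePipeline(),
        artifacts={"finetuned_model": "runs:/<finetuned_run_id>/model"},
    )
```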

Chapter 8, Deploying a DL Inference Pipeline at Scale, covers deploying a DL inference pipeline to different host environments for production usage. It starts with an overview of the landscape of deployment and hosting environments, including batch inference and streaming inference at scale. It then describes the different deployment mechanisms, such as MLflow's built-in model serving tools, custom deployment plugins, and generic model serving frameworks such as Ray Serve. It shows examples of how to deploy a batch inference pipeline using MLflow's Spark user-defined function (UDF), and how to serve a DL inference pipeline as a local web service using either MLflow's built-in model serving tool or Ray Serve's MLflow deployment plugin, mlflow-ray-serve. It concludes with a complete step-by-step guide to deploying a DL inference pipeline to a managed AWS SageMaker instance for production usage.
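
Of those mechanisms, the batch path is the quickest to sketch: the logged PyFunc pipeline becomes a Spark UDF. The model URI, input path, and column name below are placeholders.

```python
# Sketch of batch inference with MLflow's Spark UDF, assuming a running Spark
# session and the PyFunc inference pipeline logged/registered earlier.
# The model URI, file paths, and column name are placeholders.
import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch_inference").getOrCreate()

# Wrap the logged PyFunc inference pipeline as a Spark UDF.
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri="models:/inference_pipeline/1")

df = spark.read.parquet("reviews.parquet")               # placeholder input data
scored = df.withColumn("prediction", predict_udf("text"))
scored.write.parquet("scored_reviews.parquet")
```

For the online path, the same logged model can be served locally with MLflow's built-in mlflow models serve command or through the mlflow-ray-serve plugin, as the chapter walks through, before moving on to the managed SageMaker deployment.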

Chapter 9, Fundamentals of Deep Learning Explainability, covers the foundational concepts of explainability and explores two popular explainability tools. It starts with an overview of the eight dimensions of explainability and explainable AI (XAI), then provides concrete learning examples exploring the usage of the SHAP and transformers-interpret toolboxes for an NLP sentiment pipeline. It emphasizes that explainability should be treated as a first-class artifact when developing a DL application, since there are increasing demands and expectations for model and data explanations across business applications and domains.
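
As a flavor of the SHAP exploration, the sketch below explains a sentiment prediction at the token level; it uses a generic pretrained Hugging Face pipeline as a stand-in for the fine-tuned model from earlier chapters.

```python
# Sketch of token-level SHAP explanations for an NLP sentiment pipeline.
# A generic pretrained pipeline stands in for the book's fine-tuned model.
import shap
from transformers import pipeline

classifier = pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)   # SHAP picks a text masker automatically
shap_values = explainer(["MLflow made tracking this experiment painless."])

# Visualize per-token contributions toward each sentiment class.
shap.plots.text(shap_values)
```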

Chapter 10, Implementing DL Explainability with MLflow, covers how to implement DL explainability with MLflow to provide explanation as a service (EaaS). It starts with an overview of MLflow's current capabilities for supporting explainers and explanations; notably, MLflow's existing SHAP integration does not support DL explainability at scale. The chapter therefore provides two generic implementation approaches, using MLflow's artifact-logging APIs and PyFunc APIs. Examples are provided for implementing SHAP explanations, logging SHAP values as a bar chart in the MLflow tracking server's artifact store. A SHAP explainer can also be logged as an MLflow Python model and then loaded either as a Spark UDF for batch explanation or as a web service for online EaaS. This provides maximum flexibility within a unified MLflow framework for implementing explainability.
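
A condensed sketch of the PyFunc route, wrapping a SHAP explainer so it can later be loaded as a Spark UDF or served as a web service, might look like the following; the underlying sentiment pipeline, names, and output format are placeholders.

```python
# Sketch of logging a SHAP explainer as a generic MLflow PyFunc model so it
# can be reused for batch explanation (Spark UDF) or online EaaS (web service).
# The sentiment pipeline and the output serialization are placeholders.
import mlflow
import mlflow.pyfunc
import pandas as pd

class SHAPExplainerModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        import shap
        from transformers import pipeline
        classifier = pipeline("sentiment-analysis", return_all_scores=True)
        self.explainer = shap.Explainer(classifier)

    def predict(self, context, model_input: pd.DataFrame):
        shap_values = self.explainer(model_input["text"].tolist())
        # Serialize the raw SHAP values so the result survives UDF/web transport.
        return pd.Series(
            [str(shap_values[i].values) for i in range(len(model_input))]
        )

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="shap_explainer",
        python_model=SHAPExplainerModel(),
    )

# Later, the logged explainer can be loaded with mlflow.pyfunc.load_model(),
# wrapped with mlflow.pyfunc.spark_udf() for batch explanation, or served
# behind a web endpoint for online EaaS.
```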