Practical Deep Learning at Scale with MLflow

By: Yong Liu
Overview of this book

The book starts with an overview of the deep learning (DL) life cycle and the emerging machine learning operations (MLOps) field, providing a clear picture of the four pillars of deep learning (data, model, code, and explainability) and the role of MLflow in these areas. From there onward, it guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. You'll also tackle running DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tuning DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. As you progress, you'll learn how to build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline to production using Ray Serve and AWS SageMaker, and finally create a DL explanation as a service (EaaS) using the popular SHapley Additive exPlanations (SHAP) toolbox. By the end of this book, you'll have built the foundation and gained the hands-on experience you need to develop a DL pipeline solution from initial offline experimentation to final deployment and production, all within a reproducible and open source framework.
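To give a flavor of the MLflow tracking pattern the book builds on, here is a minimal, illustrative sketch; the experiment name, parameter values, and metric numbers below are placeholders of our own, not examples taken from the book:

import mlflow

# Group runs under a named experiment (name is a placeholder)
mlflow.set_experiment("dl_sentiment_classifier")

with mlflow.start_run():
    # Log hyperparameters for this run (values are illustrative)
    mlflow.log_param("learning_rate", 1e-4)
    mlflow.log_param("backbone", "prajjwal1/bert-tiny")

    # Log a metric per epoch so runs can be compared in the MLflow UI
    for epoch, val_acc in enumerate([0.71, 0.83, 0.88]):  # placeholder values
        mlflow.log_metric("val_accuracy", val_acc, step=epoch)

Every run logged this way becomes comparable and reproducible in the MLflow tracking UI, which is the thread that ties the book's chapters together.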
Table of Contents (17 chapters)

Section 1 – Deep Learning Challenges and MLflow Prime
Section 2 – Tracking a Deep Learning Pipeline at Scale
Section 3 – Running Deep Learning Pipelines at Scale
Section 4 – Deploying a Deep Learning Pipeline at Scale
Section 5 – Deep Learning Model Explainability at Scale

Preface

From AlexNet, which won the large-scale ImageNet competition in 2012, to the BERT pre-trained language model, which topped many natural language processing (NLP) leaderboards in 2018, the revolution of modern deep learning (DL) in the broader artificial intelligence (AI) and machine learning (ML) community continues. Yet, the challenges of moving these DL models from offline experimentation to a production environment remain, largely due to the complexity of the DL life cycle and the lack of a unified open source framework to support its full development. This book will help you understand the big picture of DL full life cycle development and implement DL pipelines that can scale from a local offline experiment to a distributed environment and online production clouds, with an emphasis on hands-on, project-based learning that supports the end-to-end DL process using the popular open source MLflow framework.

The book starts with an overview of the DL full life cycle and the emerging machine learning operations (MLOps) field, providing a clear picture of the four pillars of DL (data, model, code, and explainability) and the role of MLflow in these areas. In the first chapter, you'll build a basic transfer learning-based NLP sentiment model using PyTorch Lightning Flash, which is then further developed, tuned, and deployed to production throughout the rest of the book. From there onward, the book guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. We'll run DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tune DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. We'll also build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline to production using Ray Serve and AWS SageMaker, and finally, provide DL explanation as a service (EaaS) using SHapley Additive exPlanations (SHAP) and MLflow integration.
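As a preview of the HPO workflow, here is a rough Ray Tune sketch using the pre-2.0 tune.run API; the objective function is a toy stand-in for a real DL training loop, and all names and values are illustrative rather than taken from the book:

from ray import tune

def train_fn(config):
    # Toy objective standing in for DL training; in practice this would
    # train the sentiment model and report a validation metric
    loss = (config["lr"] - 1e-3) ** 2
    tune.report(val_loss=loss)  # report the metric back to Ray Tune

analysis = tune.run(
    train_fn,
    config={"lr": tune.loguniform(1e-5, 1e-1)},  # search space
    num_samples=10,  # number of trials to sample
    metric="val_loss",
    mode="min",
)
print(analysis.best_config)  # best hyperparameters found

The same pattern extends to Optuna and HyperBand by plugging a search algorithm or trial scheduler into tune.run, and each trial's parameters and metrics can be logged to MLflow for side-by-side comparison.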

By the end of this book, you'll have the foundation and hands-on experience to build a DL pipeline from initial offline experimentation to final deployment and production, all within a reproducible and open source framework. Along the way, you'll also learn about the unique challenges of DL pipelines and how to overcome them with practical and scalable solutions, such as multi-core CPUs, graphics processing units (GPUs), distributed and parallel computing frameworks, and the cloud.