Machine Learning Engineering with MLflow

By: Natu Lauchande

Overview of this book

MLflow is a platform for the machine learning life cycle that enables structured development and iteration of machine learning models and a seamless transition into scalable production environments. This book will take you through the different features of MLflow and how you can implement them in your ML project. You will begin by framing an ML problem and then transform your solution with MLflow, adding a workbench environment, training infrastructure, data management, model management, experimentation, and state-of-the-art ML deployment techniques in the cloud and on premises. The book also explores techniques to scale up your workflow as well as performance monitoring techniques. As you progress, you'll discover how to create an operational dashboard to manage machine learning systems. Later, you will learn how to use MLflow in AutoML, anomaly detection, and deep learning contexts with the help of use cases. In addition, you will understand how to use machine learning platforms for local development as well as for cloud and managed environments. This book will also show you how to use MLflow from non-Python languages such as R and Java, and it covers approaches to extending MLflow with plugins. By the end of this machine learning book, you will be able to produce and deploy reliable machine learning algorithms using MLflow in multiple environments.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Share Your Thoughts

Section 1: Problem Framing and Introductions

Chapter 1: Introducing MLflow

Technical requirements

What is MLflow?

Getting started with MLflow

Exploring MLflow modules

Further reading

Chapter 2: Your Machine Learning Project

Technical requirements

Exploring the machine learning process

Framing the machine learning problem

Introducing the stock market prediction problem

Sentiment analysis of market influencers

Developing your machine learning baseline pipeline

Further reading

Section 2: Model Development and Experimentation

Chapter 3: Your Data Science Workbench

Technical requirements

Understanding the value of a data science workbench

Creating your own data science workbench

Using the workbench for stock prediction

Further reading

Chapter 4: Experiment Management in MLflow

Technical requirements

Getting started with the experiments module

Defining the experiment

Adding experiments

Comparing different models

Tuning your model with hyperparameter optimization

Further reading

Chapter 5: Managing Models with MLflow

Technical requirements

Understanding models in MLflow

Exploring model flavors in MLflow

Managing model signatures and schemas

Introducing Model Registry

Managing the model development life cycle

Further reading

Section 3: Machine Learning in Production

Chapter 6: Introducing ML Systems Architecture

Technical requirements

Understanding challenges with ML systems and projects

Surveying state-of-the-art ML platforms

Architecting the PsyStock ML platform

Further reading

Chapter 7: Data and Feature Management

Technical requirements

Structuring your data pipeline project

Acquiring stock data

Checking data quality

Generating a feature set and training data

Using a feature store

Further reading

Chapter 8: Training Models with MLflow

Technical requirements

Creating your training project with MLflow

Implementing the training job

Evaluating the model

Deploying the model in the Model Registry

Creating a Docker image for your training job

Further reading

Chapter 9: Deployment and Inference with MLflow

Technical requirements

Starting up a local model registry

Setting up a batch inference job

Creating an API process for inference

Deploying your models for batch scoring in Kubernetes

Making a cloud deployment with AWS SageMaker

Further reading

Section 4: Advanced Topics

Chapter 10: Scaling Up Your Machine Learning Workflow

Technical requirements

Developing models with a Databricks Community Edition environment

Integrating MLflow with Apache Spark

Integrating MLflow with NVIDIA RAPIDS (GPU)

Integrating MLflow with the Ray platform

Further reading

Chapter 11: Performance Monitoring

Technical requirements

Overview of performance monitoring for machine learning models

Monitoring data drift and model performance

Infrastructure monitoring and alerting

Further reading

Chapter 12: Advanced Topics with MLflow

Technical requirements

Exploring MLflow use cases with AutoML

Integrating MLflow with other languages

Understanding MLflow plugins

Further reading

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Implementing the training job

We will use the training data produced in the previous chapter. We assume that an independent job runs the data pipeline and writes its output to a specific folder. In the book's GitHub repository, you can look at the data at https://github.com/PacktPublishing/Machine-Learning-Engineering-with-MLflow/blob/master/Chapter08/psystock-training/data/training/data.csv.
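Because the training job depends on a file written by a separate job, it can help to fail early when that file is missing or empty. The following is a minimal sketch of such a check, not the code from the book's repository. The relative path mirrors the repository layout linked above; the function name `load_training_data` and the error messages are hypothetical.

```python
from pathlib import Path

import pandas as pd

# Relative path taken from the repository layout linked above.
TRAINING_DATA_PATH = Path("data/training/data.csv")


def load_training_data(path: Path = TRAINING_DATA_PATH) -> pd.DataFrame:
    """Read the CSV written by the upstream data job, failing early if it is unusable.

    Hypothetical helper: the name and the checks are illustrative only.
    """
    if not path.is_file():
        raise FileNotFoundError(
            f"No training data at {path}; has the data pipeline run?"
        )
    frame = pd.read_csv(path)
    if frame.empty:
        raise ValueError(f"Training data at {path} contains no rows")
    return frame
```

Calling `load_training_data()` from the project root would read the default path, while passing an explicit `Path` makes the function easy to exercise against a temporary file.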

We will now create a train_model.py file that loads the training data and fits a model. The script will also produce test predictions and persist them in the environment so that other steps of the workflow can use them to evaluate the model.

The file produced in this section is available at the following link:

https://github.com/PacktPublishing/Machine-Learning-Engineering-with-MLflow/blob/master/Chapter08/psystock-training/train_model.py

We will start by importing the relevant packages. In this case, we will need pandas to handle the data, xgboost...