Engineering MLOps

By: Emmanuel Raj

Overview of this book

Engineering MLOps presents comprehensive insights into MLOps, coupled with real-world examples in Azure, to help you write programs, train robust and scalable ML models, and build ML pipelines to train and deploy models securely in production. The book begins by familiarizing you with the MLOps workflow so you can start writing programs to train ML models. You'll then explore options for serializing and packaging ML models after training so that they can be deployed for machine learning inference, model interoperability, and end-to-end model traceability. You'll learn how to build ML pipelines, continuous integration and continuous delivery (CI/CD) pipelines, and monitoring pipelines to systematically build, deploy, monitor, and govern ML solutions for businesses and industries. Finally, you'll apply the knowledge you've gained to build real-world projects. By the end of this ML book, you'll have a 360-degree view of MLOps and be ready to implement MLOps in your organization.
Table of Contents (18 chapters)
1. Section 1: Framework for Building Machine Learning Models
7. Section 2: Deploying Machine Learning Models at Scale
13. Section 3: Monitoring Machine Learning Models in Production

Testing your ML solution by design

In addition to regular software development tests, such as unit, integration, system, and acceptance tests, ML solutions need further testing because data and ML models are involved, and both change dynamically over time. Here are some concepts for testing by design; applying them to your use cases helps you produce robust ML solutions.

Data testing

The goal of data testing is to ensure that the data is of high enough quality for ML model training. The better the quality of the data, the better the models trained for the given tasks. So how do we assess the quality of data? We can inspect the following five factors:

  • Accuracy
  • Completeness (no missing values)
  • Consistency (in terms of expected data format and volume)
  • Relevance (data should meet the intended need and requirements)
  • Timeliness (the latest or up-to-date data)
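
Several of these factors can be checked automatically before training starts. The following is a minimal sketch in plain Python, not a method taken from the book: the schema, column names, thresholds, and the check_data_quality function are all hypothetical and chosen only for illustration. Accuracy is approximated with a plausible-range check, and relevance is left out because it usually needs a human judgment about the intended use case.

```python
from datetime import datetime, timedelta

# Hypothetical schema (an assumption for illustration): column -> expected type.
EXPECTED_SCHEMA = {
    "temperature_c": float,
    "humidity": float,
    "recorded_at": datetime,
}


def check_data_quality(rows, now, min_rows=3,
                       max_age=timedelta(days=1),
                       valid_range=(-60.0, 60.0)):
    """Return a dict mapping each checked quality factor to True or False.

    rows is a list of dicts, one per record. All thresholds are
    illustrative defaults, not recommended values.
    """
    # Completeness: every expected column is present and not None.
    complete = all(
        row.get(col) is not None
        for row in rows
        for col in EXPECTED_SCHEMA
    )

    # Consistency: expected value types and a minimum batch volume.
    consistent = len(rows) >= min_rows and all(
        isinstance(row.get(col), expected_type)
        for row in rows
        for col, expected_type in EXPECTED_SCHEMA.items()
    )

    # Accuracy (proxy only): temperatures fall inside a plausible range.
    low, high = valid_range
    accurate = all(
        isinstance(row.get("temperature_c"), float)
        and low <= row["temperature_c"] <= high
        for row in rows
    )

    # Timeliness: the newest record is no older than max_age.
    timestamps = [
        row["recorded_at"] for row in rows
        if isinstance(row.get("recorded_at"), datetime)
    ]
    timely = bool(timestamps) and (now - max(timestamps)) <= max_age

    return {
        "accuracy": accurate,
        "completeness": complete,
        "consistency": consistent,
        "timeliness": timely,
    }


if __name__ == "__main__":
    reference_time = datetime(2024, 1, 2, 12, 0)
    sample = [
        {"temperature_c": 20.5, "humidity": 0.40,
         "recorded_at": datetime(2024, 1, 2, 9, 0)},
        {"temperature_c": 21.0, "humidity": 0.42,
         "recorded_at": datetime(2024, 1, 2, 10, 0)},
        {"temperature_c": 19.8, "humidity": 0.45,
         "recorded_at": datetime(2024, 1, 2, 11, 0)},
    ]
    print(check_data_quality(sample, reference_time))
```

A check like this could run as a gate at the start of a training pipeline, so that a failed factor stops the run before a model is trained on poor data.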

Based...