The Machine Learning Development Process

Machine Learning Engineering with Python - Second Edition

By : Andrew P. McMahon

Overview of this book

The Second Edition of Machine Learning Engineering with Python is the practical guide that MLOps and ML engineers need to build solutions to real-world problems. It will provide you with the skills you need to stay ahead in this rapidly evolving field. The book takes an examples-based approach to help you develop your skills and covers the technical concepts, implementation patterns, and development methodologies you need. You'll explore the key steps of the ML development lifecycle and create your own standardized "model factory" for training and retraining of models. You'll learn to employ concepts like CI/CD and how to detect different types of drift. Get hands-on with the latest in deployment architectures and discover methods for scaling up your solutions. This edition goes deeper in all aspects of ML engineering and MLOps, with emphasis on the latest open-source and cloud-based technologies. This includes a completely revamped approach to advanced pipelining and orchestration techniques. With a new chapter on deep learning, generative AI, and LLMOps, you will learn to use tools like LangChain, PyTorch, and Hugging Face to leverage LLMs for supercharged analysis. You will explore AI assistants like GitHub Copilot to become more productive, then dive deep into the engineering considerations of working with deep learning.

The Machine Learning Development Process

In this chapter, we will define how the work for any successful machine learning (ML) software engineering project can be divided up. In other words, we will answer the question of how you actually organize a successful ML project. We will not only discuss the process and workflow, but also set up the tools you will need for each stage of the process and highlight some important best practices with real ML code examples.

In this edition, there is more detail on an important data science and ML project management methodology: the Cross-Industry Standard Process for Data Mining (CRISP-DM). This includes a discussion of how the methodology compares to traditional Agile and Waterfall methodologies, along with some tips and tricks for applying it to your ML projects. There are also far more detailed examples to help you get up and running with continuous integration/continuous deployment (CI/CD) using GitHub Actions, including how to run ML-focused processes such as automated model validation. The advice on getting up and running in an Integrated Development Environment (IDE) has also been made more tool-agnostic, so that it applies whichever appropriate IDE you use.

As before, the chapter focuses heavily on a “four-step” methodology I propose, which encompasses a discover, play, develop, deploy workflow for your ML projects. This workflow will be compared with the CRISP-DM methodology, which is very popular in data science circles. We will also discuss the appropriate development tooling and its configuration and integration, cover version control strategies and their basic implementation, and set up CI/CD for your ML project. Then, we will introduce some potential execution environments as the target destinations for your ML solutions. By the end of this chapter, you will be set up for success in your Python ML engineering project. This is the foundation on which we will build everything in subsequent chapters.
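To make the idea of automated model validation in a CI/CD pipeline a little more concrete, here is a minimal sketch of the kind of check such a pipeline could run. It is not the implementation used later in the book. The metric names, the threshold values, and the function names `validate_metrics` and `run_gate` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical minimum scores; real thresholds depend on your project.
MIN_THRESHOLDS: Dict[str, float] = {"accuracy": 0.90, "f1": 0.85}


def validate_metrics(
    metrics: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Return a list of failure messages; an empty list means the model passes."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} is below minimum {minimum:.3f}")
    return failures


def run_gate(metrics: Dict[str, float]) -> int:
    """Print any failures and return a process-style exit code (0 = pass)."""
    failures = validate_metrics(metrics, MIN_THRESHOLDS)
    for failure in failures:
        print(f"VALIDATION FAILED - {failure}")
    return 1 if failures else 0


# Illustrative values only; in practice these would come from an evaluation run.
candidate_metrics = {"accuracy": 0.93, "f1": 0.81}
exit_code = run_gate(candidate_metrics)
print(f"exit code: {exit_code}")
```

In a CI job, a script like this would typically end by passing the returned code to `sys.exit`, since a non-zero exit status is what causes a pipeline step to fail and blocks the candidate model from progressing.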

As usual, we will conclude the chapter by summarizing the main points and highlighting what this means as we work through the rest of the book.

Finally, it is also important to note that although we will frame the discussion here in terms of ML challenges, most of what you will learn in this chapter can also be applied to other Python software engineering projects. My hope is that the investment in building out these foundational concepts in detail will be something you can leverage again and again in all of your work.

We will explore all of this in the following sections and subsections:

Setting up our tools
Concept to solution in four steps:
- Discover
- Play
- Develop
- Deploy

There is plenty of exciting stuff to get through and lots to learn – so let’s get started!
