Book Image

Machine Learning Engineering with Python - Second Edition

By : Andrew P. McMahon
2.5 (2)
Book Image

Machine Learning Engineering with Python - Second Edition

2.5 (2)
By: Andrew P. McMahon

Overview of this book

The Second Edition of Machine Learning Engineering with Python is the practical guide that MLOps and ML engineers need to build solutions to real-world problems. It will provide you with the skills you need to stay ahead in this rapidly evolving field. The book takes an examples-based approach to help you develop your skills and covers the technical concepts, implementation patterns, and development methodologies you need. You'll explore the key steps of the ML development lifecycle and create your own standardized "model factory" for training and retraining of models. You'll learn to employ concepts like CI/CD and how to detect different types of drift. Get hands-on with the latest in deployment architectures and discover methods for scaling up your solutions. This edition goes deeper in all aspects of ML engineering and MLOps, with emphasis on the latest open-source and cloud-based technologies. This includes a completely revamped approach to advanced pipelining and orchestration techniques. With a new chapter on deep learning, generative AI, and LLMOps, you will learn to use tools like LangChain, PyTorch, and Hugging Face to leverage LLMs for supercharged analysis. You will explore AI assistants like GitHub Copilot to become more productive, then dive deep into the engineering considerations of working with deep learning.
Table of Contents (12 chapters)
10
Other Books You May Enjoy
11
Index

Building an Extract, Transform, Machine Learning Use Case

Similar to Chapter 8, Building an Example ML Microservice, the aim of this chapter will be to try to crystallize a lot of the tools and techniques we have learned about throughout this book and apply them to a realistic scenario. This will be based on another use case introduced in Chapter 1, Introduction to ML Engineering, where we imagined the need to cluster taxi ride data on a scheduled basis. So that we can explore some of the other concepts introduced throughout the book, we will assume as well that for each taxi ride, there is also a series of textual data from a range of sources, such as traffic news sites and transcripts of calls between the taxi driver and the base, joined to the core ride information. We will then pass this data to a Large Language Model (LLM) for summarization. The result of this summarization can then be saved in the target data location alongside the basic ride date to provide important context...