Machine Learning Engineering with Python

By: Andrew P. McMahon

Overview of this book

Machine learning engineering is a thriving discipline at the interface of software development and machine learning. This book will help developers working with machine learning and Python to put their knowledge to work and create high-quality machine learning products and services. Machine Learning Engineering with Python takes a hands-on approach to help you get to grips with essential technical concepts, implementation patterns, and development methodologies to have you up and running in no time. You'll begin by understanding key steps of the machine learning development life cycle before moving on to practical illustrations and getting to grips with building and deploying robust machine learning solutions. As you advance, you'll explore how to create your own toolsets for training and deployment across all your projects in a consistent way. The book will also help you get hands-on with deployment architectures and discover methods for scaling up your solutions while building a solid understanding of how to use cloud-based tools effectively. Finally, you'll work through examples to help you solve typical business problems. By the end of this book, you'll be able to build end-to-end machine learning services using a variety of techniques and design your own processes for consistently performant machine learning engineering.
Table of Contents (13 chapters)
Section 1: What Is ML Engineering?
Section 2: ML Development and Deployment
Section 3: End-to-End Examples

Summary

In this chapter, we looked at how to take the ML solutions we built in the past few chapters and scale them up to larger data volumes or higher numbers of prediction requests. To do this, we focused mainly on Apache Spark, as this is the most popular general-purpose engine for distributed computing. During our discussion of Apache Spark, we revisited some coding patterns and syntax used earlier in this book. By doing so, we developed a more thorough understanding of how and why to do certain things when developing in PySpark. We discussed user-defined functions (UDFs) in detail and how they can be used to create massively scalable ML workflows.
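As a reminder of the UDF pattern mentioned above, here is a minimal sketch rather than the book's own code. The `score_row` function, its weights, and the column names `feature_a` and `feature_b` are hypothetical stand-ins for a trained model and its inputs. The PySpark imports sit inside the functions so that the scoring logic can be tested without a Spark installation.

```python
def score_row(feature_a: float, feature_b: float) -> float:
    """Hypothetical stand-in for a trained model's prediction logic."""
    return 0.5 * feature_a + 0.25 * feature_b


def add_predictions(df):
    """Return a Spark DataFrame with a 'prediction' column computed by a UDF.

    Assumes df has numeric columns 'feature_a' and 'feature_b'
    (hypothetical names chosen for this sketch).
    """
    # Imported lazily so score_row can be used without PySpark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    # Wrap the plain Python function so Spark applies it row by row
    # on the executors.
    score_udf = F.udf(score_row, DoubleType())
    return df.withColumn(
        "prediction", score_udf(F.col("feature_a"), F.col("feature_b"))
    )


def run_demo():
    """Run the sketch on a small local Spark session (requires PySpark)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]").appName("udf-sketch").getOrCreate()
    )
    df = spark.createDataFrame(
        [(1.0, 2.0), (3.0, 4.0)], ["feature_a", "feature_b"]
    )
    add_predictions(df).show()
    spark.stop()
```

With PySpark installed, calling `run_demo()` applies the function to a two-row DataFrame. A row-at-a-time UDF like this is the simplest form; for larger workloads, a vectorized pandas UDF is usually a faster choice because it processes batches of rows instead of single values.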

After this, we explored how to work with Spark on the cloud, specifically through the Elastic MapReduce (EMR) service provided by AWS. Then, we looked at some of the other ways we can scale our solutions; that is, through serverless architectures and horizontal scaling with containers. In the former...