Machine Learning on Kubernetes

By: Faisal Masood, Ross Brigoli

Overview of this book

MLOps is an emerging field that aims to bring repeatability, automation, and standardization of the software engineering domain to data science and machine learning engineering. By implementing MLOps with Kubernetes, data scientists, IT professionals, and data engineers can collaborate and build machine learning solutions that deliver business value for their organization. You'll begin by understanding the different components of a machine learning project. Then, you'll design and build a practical end-to-end machine learning project using open source software. As you progress, you'll understand the basics of MLOps and the value it can bring to machine learning projects. You will also gain experience in building, configuring, and using an open source, containerized machine learning platform. In later chapters, you will prepare data, build and deploy machine learning models, and automate workflow tasks using the same platform. Finally, the exercises in this book will help you get hands-on experience in Kubernetes and open source tools, such as JupyterHub, MLflow, and Airflow. By the end of this book, you'll have learned how to effectively build, train, and deploy a machine learning model using the machine learning platform you built.
Table of Contents (16 chapters)
Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes
Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform

Writing and running a Spark application from Jupyter Notebook

Before you run the following steps, make sure that you understand the components introduced in the previous section of this chapter and how they interact:

  1. Verify that the Spark operator pod is running with the following command:
    kubectl get pods -n ml-workshop | grep spark-operator

You should see the following response:

Figure 5.35 – Spark operator pod

  2. Verify that the JupyterHub pod is running with the following command:
    kubectl get pods -n ml-workshop | grep jupyterhub

You should see the following response:

Figure 5.36 – JupyterHub pod

  3. Before you start the notebook, delete the Spark cluster that you created in the previous sections by running the following command. Deleting it demonstrates that JupyterHub automatically creates a new Spark cluster instance for you:
    kubectl delete sparkcluster...
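
The pod checks in the first two steps can also be scripted, which is handy if you repeat them often. The following is a minimal sketch, not part of the book's platform: the function names `running_pods`, `get_pods`, and `check_platform` are hypothetical, and the parser assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE).

```python
import subprocess
from typing import List


def running_pods(kubectl_output: str, name_fragment: str) -> List[str]:
    """Return names of pods that contain name_fragment and report STATUS Running.

    Assumes the default `kubectl get pods` columns:
    NAME READY STATUS RESTARTS AGE
    """
    matches = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, status = fields[0], fields[2]
        if name_fragment in name and status == "Running":
            matches.append(name)
    return matches


def get_pods(namespace: str = "ml-workshop") -> str:
    """Run `kubectl get pods -n <namespace>` and return its text output."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace],
        capture_output=True,
        text=True,
        check=True,  # raise if kubectl exits with an error
    )
    return result.stdout


def check_platform(namespace: str = "ml-workshop") -> bool:
    """Print a status line per component; return True if both are running."""
    output = get_pods(namespace)
    all_ok = True
    for fragment in ("spark-operator", "jupyterhub"):
        pods = running_pods(output, fragment)
        if pods:
            print(f"{fragment}: running ({', '.join(pods)})")
        else:
            print(f"{fragment}: no Running pod found")
            all_ok = False
    return all_ok
```

Calling `check_platform()` on a machine where `kubectl` is configured for your cluster prints one line per component. The parsing is kept separate from the `kubectl` call so that it can be tried on saved output without a cluster.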