Machine Learning with Amazon SageMaker Cookbook
In this recipe, we will prepare a Dockerfile for the custom Python container image. We will make use of the train and serve scripts that we prepared in the previous recipes. After that, we will run the docker build command to prepare the image before pushing it to an Amazon ECR repository.
Tip
Wait! What's a Dockerfile? It's a text document containing the directives (commands) used to prepare and build a container image. This container image then serves as the blueprint when running containers. Feel free to check out https://docs.docker.com/engine/reference/builder/ for more information on Dockerfiles.
Make sure you have completed the Preparing and testing the serve script in Python recipe.
The initial steps in this recipe focus on preparing a Dockerfile. Let's get started:
Click the Dockerfile file in the file tree to open it in the Editor pane. Make sure that this is the same Dockerfile that's inside the ml-python directory:
Figure 2.55 – Opening the Dockerfile inside the ml-python directory
Here, we can see a Dockerfile inside the ml-python directory. Remember that we created an empty Dockerfile in the Setting up the Python and R experimentation environments recipe. Clicking it in the file tree should open an empty file in the Editor pane:

Figure 2.56 – Empty Dockerfile in the Editor pane
Here, we have an empty Dockerfile. In the next step, we will update this by adding three lines of code.
Update the Dockerfile with the following block of configuration code:
FROM arvslat/amazon-sagemaker-cookbook-python-base:1
COPY train /usr/local/bin/train
COPY serve /usr/local/bin/serve
Here, we build on top of an existing image called amazon-sagemaker-cookbook-python-base. This image already has a few prerequisites installed, including the Flask, pandas, and scikit-learn libraries, so you won't have to worry about getting the installation steps working properly in this recipe. For more details on this image, check out https://hub.docker.com/r/arvslat/amazon-sagemaker-cookbook-python-base:

Figure 2.57 – Docker Hub page for the base image
Here, we can see the Docker Hub page for the amazon-sagemaker-cookbook-python-base image.
Tip
You can access a working copy of this Dockerfile in the Machine Learning with Amazon SageMaker Cookbook GitHub repository: https://github.com/PacktPublishing/Machine-Learning-with-Amazon-SageMaker-Cookbook/blob/master/Chapter02/ml-python/serve.
With the Dockerfile ready, we will proceed with using the Terminal until the end of this recipe:

Figure 2.58 – New Terminal
Here, we can see how to create a new Terminal. Note that the Terminal pane is under the Editor pane in the AWS Cloud9 IDE.
Navigate to the ml-python directory containing our Dockerfile:
cd /home/ubuntu/environment/opt/ml-python
Set the image name and the tag:
IMAGE_NAME=chap02_python
TAG=1
Build the container image using the docker build command:
docker build --no-cache -t $IMAGE_NAME:$TAG .
The docker build command makes use of what is written inside our Dockerfile. We start with the image specified in the FROM directive and then copy the train and serve files into the container image.
Use the docker run command to test whether the train script works:
docker run --name pytrain --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG train
Let's quickly discuss some of the different options that were used in this command. The --rm flag makes Docker clean up the container after the container exits. The -v flag allows us to mount the /opt/ml directory from the host system to the /opt/ml directory of the container:

Figure 2.59 – Result of the docker run command (train)
Here, we can see the results after running the docker run command. It should show logs similar to what we had in the Preparing and testing the train script in Python recipe.
Use the docker run command to test whether the serve script works:
docker run --name pyserve --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG serve
After running this command, the Flask API server starts successfully. We should see logs similar to what we had in the Preparing and testing the serve script in Python recipe:

Figure 2.60 – Result of the docker run command (serve)
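Here, we can see that the API is running on port 8080. The base image includes an EXPOSE 8080 instruction, which documents that the container listens on this port. Since we did not publish the port to the host, we will reach the API through the container's IP address in the next steps.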

Figure 2.61 – New Terminal
As the API is running already in the first Terminal, we have created a new one.
Get the IP address of the running container and store it in the SERVE_IP variable:
SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')
Print the value of SERVE_IP:
echo $SERVE_IP
We should get an IP address that's equal to or similar to 172.17.0.2. Of course, we may get a different IP address value.
Test the ping endpoint URL using the curl command:
curl http://$SERVE_IP:8080/ping
We should get an OK after running this command.
Test the invocations endpoint URL using the curl command:
curl -d "1" -X POST http://$SERVE_IP:8080/invocations
We should get a value similar or close to 881.3428400857507 after invoking the invocations endpoint.
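The two curl checks can also be scripted. The following is a minimal sketch in Python using only the standard library; it is not part of the recipe, and the function names ping and invoke are hypothetical. It assumes the API answers GET /ping and POST /invocations on the base URL passed in, as the serve script in this chapter does.

```python
import urllib.request


def ping(base_url, timeout=5):
    """Send GET /ping and return the HTTP status code and the body text."""
    with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def invoke(base_url, payload, timeout=5):
    """POST the payload to /invocations and return the response body text."""
    request = urllib.request.Request(
        f"{base_url}/invocations",
        data=payload.encode("utf-8"),  # same raw body that curl -d "1" sends
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the container running, calling ping("http://172.17.0.2:8080") and invoke("http://172.17.0.2:8080", "1") would mirror the two curl commands, where 172.17.0.2 stands in for whatever value SERVE_IP holds on your machine.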
At this point, it is safe to say that the custom container image we have prepared in this recipe is ready. Now, let's see how this works!
In this recipe, we built a custom container image using the Dockerfile configuration we specified. When you have a Dockerfile, the standard set of steps would be to use the docker build command to build the Docker image, authenticate with ECR to gain the necessary permissions, use the docker tag command to tag the image appropriately, and use the docker push command to push the Docker image to the ECR repository.
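The build, authenticate, tag, and push sequence described above can be sketched as follows. This is an illustration rather than a step in this recipe: the helper name ecr_push_commands is hypothetical, the account ID, region, and repository name are placeholders, and the sketch assumes the ECR repository already exists and the AWS CLI is configured. The helper only assembles the commands; it does not run them.

```python
def ecr_push_commands(account_id, region, repository, image, tag):
    """Return the shell commands for pushing a local image to Amazon ECR.

    All values are supplied by the caller; nothing here is executed.
    """
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote = f"{registry}/{repository}:{tag}"
    return [
        # 1. Build the image from the Dockerfile in the current directory.
        f"docker build -t {image}:{tag} .",
        # 2. Authenticate the Docker client with the ECR registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # 3. Tag the local image with the repository URI.
        f"docker tag {image}:{tag} {remote}",
        # 4. Push the tagged image to the repository.
        f"docker push {remote}",
    ]


# Placeholder account ID, region, and repository name (assumptions).
for command in ecr_push_commands(
    "123456789012", "us-east-1", "chap02_python", "chap02_python", "1"
):
    print(command)
```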
Let's discuss what we have inside our Dockerfile. If this is your first time hearing about Dockerfiles, they are simply text files containing commands to build the image. In our Dockerfile, we did the following:
We used arvslat/amazon-sagemaker-cookbook-python-base as the base image. Check out https://hub.docker.com/repository/docker/arvslat/amazon-sagemaker-cookbook-python-base for more details about this image.
We copied the train and serve scripts to the /usr/local/bin directory inside the container image. These scripts are executed when we use docker run.
Using the arvslat/amazon-sagemaker-cookbook-python-base image as the base image allowed us to write a shorter Dockerfile that focuses only on copying the train and serve files to the directory inside the container image. The base image already has the flask, pandas, scikit-learn, and joblib packages, along with their prerequisites, installed so that we will not run into issues when building the custom container image. Here is a quick look at the Dockerfile used to build this base image:
FROM ubuntu:18.04
RUN apt-get -y update
RUN apt-get install -y python3.6
RUN apt-get install -y --no-install-recommends python3-pip
RUN apt-get install -y python3-setuptools
RUN ln -s /usr/bin/python3 /usr/bin/python && \
    ln -s /usr/bin/pip3 /usr/bin/pip
RUN pip install flask
RUN pip install pandas
RUN pip install scikit-learn
RUN pip install joblib
WORKDIR /usr/local/bin
EXPOSE 8080
In this Dockerfile, we can see that we are using ubuntu:18.04 as the base image. Note that we can use other base images as well, depending on the libraries and frameworks we want installed in the container image.
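When switching to a different base image, it can help to confirm that the packages the train and serve scripts rely on are still available. The following is a small sketch of such a check, not something the recipe itself runs; the helper name missing_modules is hypothetical, and the list of import names is taken from the packages installed in the base image (scikit-learn is imported as sklearn). It would need to be run with the Python interpreter inside the container.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names of the packages installed in the base image.
REQUIRED = ["flask", "pandas", "sklearn", "joblib"]

missing = missing_modules(REQUIRED)
print("missing:", missing if missing else "none")
```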
Once we have the container image built, the next step will be to test if the train and serve scripts will work inside the container once we use docker run. Getting the IP address of the running container may be the trickiest part, as shown in the following block of code:
SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')
We can divide this into the following parts:
docker network inspect bridge: This provides detailed information about the bridge network in JSON format. It should return an output with a structure similar to the following JSON value:
[
  {
    ...
    "Containers": {
      "1b6cf4a4b8fc5ea5...": {
        "Name": "pyserve",
        "EndpointID": "ecc78fb63c1ad32f0...",
        "MacAddress": "02:42:ac:11:00:02",
        "IPv4Address": "172.17.0.2/16",
        "IPv6Address": ""
      }
    },
    ...
  }
]
jq -r ".[0].Containers[].IPv4Address": This parses through the JSON response value from docker network inspect bridge. Piping this after the first command would yield an output similar to 172.17.0.2/16.
awk -F/ '{print $1}': This splits the result from the jq command using the / separator and returns the value before /. After getting the AA.BB.CC.DD/16 value from the previous command, we get AA.BB.CC.DD after using the awk command.
Once we have the IP address of the running container, we can call the /ping and /invocations endpoints, similar to how we did in the Preparing and testing the serve script in Python recipe.
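The same extraction can be sketched in Python, which makes each stage of the pipeline explicit. This is an illustration under assumptions, not the recipe's method: the function name container_ip is hypothetical, and the sample JSON below uses made-up container IDs. One difference from the jq expression is deliberate: .Containers[] returns the address of every container attached to the bridge network, while this sketch selects a single container by the name given with --name (pyserve in this recipe).

```python
import json


def container_ip(inspect_output, container_name):
    """Return the IPv4 address of a named container on the inspected network.

    inspect_output is the JSON text printed by `docker network inspect bridge`.
    """
    networks = json.loads(inspect_output)
    containers = networks[0]["Containers"]   # jq: .[0].Containers
    for details in containers.values():      # jq: []
        if details["Name"] == container_name:
            # Drop the /16 suffix, as awk -F/ '{print $1}' does.
            return details["IPv4Address"].split("/")[0]
    raise LookupError(f"no container named {container_name!r} on this network")


# Sample with the same shape as the inspect output (IDs are made up).
sample = """
[
  {
    "Name": "bridge",
    "Containers": {
      "aaaa1111": {"Name": "pyserve", "IPv4Address": "172.17.0.2/16"},
      "bbbb2222": {"Name": "other", "IPv4Address": "172.17.0.3/16"}
    }
  }
]
"""
print(container_ip(sample, "pyserve"))
```

Selecting by name keeps SERVE_IP to a single address even when other containers happen to be running on the same network.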
In the next recipes in this chapter, we will use this custom container image when we do training and deployment with the SageMaker Python SDK.