Azure Machine Learning Engineering

By : Sina Fakhraee, Balamurugan Balakreshnan, Megan Masanz

Overview of this book

Data scientists working on productionizing machine learning (ML) workloads face a breadth of challenges at every step owing to the countless factors involved in getting ML models deployed and running. This book offers solutions to common issues, detailed explanations of essential concepts, and step-by-step instructions to productionize ML workloads using the Azure Machine Learning service. You’ll see how data scientists and ML engineers working with Microsoft Azure can train and deploy ML models at scale by putting their knowledge to work with this practical guide. Throughout the book, you’ll learn how to train, register, and productionize ML models by making use of the power of the Azure Machine Learning service. You’ll get to grips with scoring models in real time and batch, explaining models to earn business trust, mitigating model bias, and developing solutions using an MLOps framework. By the end of this Azure Machine Learning book, you’ll be ready to build and deploy end-to-end ML solutions into a production system using the Azure Machine Learning service for real-time scenarios.
Table of Contents (17 chapters)
Part 1: Training and Tuning Models with the Azure Machine Learning Service
Part 2: Deploying and Explaining Models in AMLS
Part 3: Productionizing Your Workload with MLOps

Navigating AMLS

AMLS provides access to key resources for a data science team to leverage. In this section, you will learn how to navigate AMLS and explore the key components found within the studio. You will get a brief look at its capabilities, which we will cover in detail in the rest of the chapters.

Open a browser and go to https://portal.azure.com. Log in with your Azure AD credentials. Once logged into the portal, you will see several icons. Select the Resource group icon and click on the Azure Machine Learning resource.

In the Overview page, click on the Launch Studio button as seen in the following screenshot:

Figure 1.6 – Launch studio

Clicking on the icon shown in Figure 1.6 will open AMLS in a new window.

Launching the studio brings you to the main home page of AMLS. The UI includes functionality to match several personas, including no-code, low-code, and code-based ML. The main page has two sections: the left-hand menu pane and the right-hand workspace pane.

The AMLS workspace home screen is shown in Figure 1.7:

Figure 1.7 – AMLS workspace home screen

Now, let us understand the preceding screenshot in brief:

  • In section 1 of Figure 1.7, the left-hand menu pane is displayed. Clicking on any of the words in this pane will bring up a new right workspace pane, which includes sections 2 and 3 of the screen. We can select any of these keywords to quickly access key resources within our AMLS workspace. We will drill into these key resources as we begin exploring the AMLS workspace.
  • In section 2 of Figure 1.7, quick links are provided to key resources that we will leverage throughout this book, enabling AMLS users to create new items covering the varying personas supported.
  • As we continue to explore our environment and dig into creating assets within the AMLS workspace, both with code-based and low-code options, recent resources will begin to appear in section 3 of Figure 1.7, providing users with the ability to see recently leveraged resources, whether the compute, the code execution, the models created, or the datasets that are leveraged.

The home page provides quick access to the key resources found within your AMLS workspace. In addition to the quick links, you can scroll down to view the Documentation section, which collects material to help you get started and understand how best to leverage your AML environment.

The Documentation section, a hub for documentation resources, is displayed on the right pane of the AMLS home screen:

Figure 1.8 – Documentation

As shown in Figure 1.8, the AMLS home page provides you with a wealth of documentation resources to get you started. The links include training modules, tutorials, and even blogs regarding how to leverage AMLS.

On the top-right side of the page, there are several options available:

  • Notifications: The bell icon represents notifications, which display the messages that are generated as you leverage your AMLS workspace. These messages will contain information regarding the creation and deletion of resources, as well as information regarding the resources running within your workspace.
Figure 1.9 – Top-right options

  • Settings: The gear icon next to the bell opens the settings for your Azure portal. Clicking on the icon lets you set basic settings as shown in Figure 1.10:
Figure 1.10 – Settings for workspace customization

Within the Settings blade, options are available to change the background of the workspace UI with themes. There are light and dark shades available. Then, there is a section for changing the preferred language and formats. Check the Language dropdown for a list of languages – the list of languages will change as new languages are added to the workspace.

  • Help: The question mark icon provides helpful resources, from tutorials to placing support requests. This is where all the Help content is organized:
Figure 1.11 – Help for AMLS workspace support

Links are provided for tutorials on how to use the workspace and how to develop and deploy data science projects. Click on Launch guided tour to use the step-by-step guided tour.

To troubleshoot any issue with a workspace, click on Run workspace diagnostics and follow the instructions. The Help pane also includes the following sections:

  • Support: This section links to the form for creating a support ticket for technical issues, subscription core limits, and other Azure-related issues.
  • Resources: This is the section that provides links to the AML documentation, as well as a useful cheat sheet that is hosted on GitHub. A link to Microsoft’s Privacy and Terms is also available in this section.

Clicking on the smiley icon will bring up the Send us feedback section:

Figure 1.12 – Feedback page

Leveraging this section, an AMLS workspace user can provide feedback to the AMLS product team.

In the following screenshot, we can see the workspace selection menu:

Figure 1.13 – Workspace selection menu

When working with multiple workspaces on multiple projects, you may need to switch between AMLS workspaces, including workspaces in different Azure AD directories. To do so, select the subscription and workspace name as shown in Figure 1.13. Also note that, under the Resource Group section, a link opens a new tab in your browser and brings you directly to your resource group in the Azure portal. This is a nice feature, allowing you to quickly explore the Azure resources that are outside of the AMLS workspace but may be relevant to your workload in Azure. The workspace selection menu also lets you download the workspace config file, which holds the key information that enables authorized users to connect directly to the AMLS workspace through code with the Azure Machine Learning SDK for Python (AML SDK v2).
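
To make the config file less abstract, here is a minimal sketch of what reading it might look like. It assumes the downloaded file is a JSON document named config.json holding the keys subscription_id, resource_group, and workspace_name. The helper name load_workspace_config and every value shown are hypothetical placeholders, and the sketch writes its own sample file to a temporary folder so that it can run anywhere. With the SDK v2 installed, MLClient.from_config is the call that would normally read this file for you.

```python
import json
import tempfile
from pathlib import Path

# Keys assumed to be present in a downloaded workspace config file.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path):
    """Read a workspace config file and check that the expected keys exist."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config file is missing keys: {missing}")
    return config


# Placeholder values standing in for a real downloaded config.json.
sample = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "my-resource-group",
    "workspace_name": "my-amls-workspace",
}

with tempfile.TemporaryDirectory() as folder:
    config_path = Path(folder) / "config.json"
    config_path.write_text(json.dumps(sample), encoding="utf-8")
    workspace_config = load_workspace_config(config_path)

print(workspace_config["workspace_name"])
```

Keeping these three values in a file rather than in code means the same notebook can be pointed at a different workspace by swapping the file.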

Next, we will discuss the AMLS left-hand navigation menu shown in Figure 1.7 (1). This navigation menu will allow you to interact with your assets within your AML environment and is divided into three sections:

  • The Author section, which includes Notebooks, Automated ML, and Designer
  • The Assets section, which includes artifacts that will be created as part of your data science workload and will be explored in detail in upcoming chapters
  • The Manage section, which includes resources that will be leveraged as part of your data science workload

Let’s review the sections as follows:

  • Author is the section in which the data scientist selects the tool of choice for development:
    • Notebooks: This is a section within the Author portion of the menu that provides access to your files, as well as an AMLS workspace IDE, which is similar to a Jupyter notebook, but with a few extra features for data scientists to carry out feature engineering and modeling. Inside this IDE, with a notebook that has been created, users can select a version of Python kernel, connecting them to a Conda environment with a specified version of Python.
Figure 1.14 – Author menu items

Notebooks is an option within the Author section providing access to files, samples, file management, terminal access, and, as we will see later in this chapter in the Developing within AMLS section, a built-in IDE:

Figure 1.15 – Notebooks

We will highlight the different features found within the Notebooks selection:

  1. In section 1 of Figure 1.15, clicking on the Files label shows all the user directories within the collaborative AMLS workspace, in addition to files stored within those directories.
  2. In section 2 of Figure 1.15, clicking on the Samples label provides AML tutorials for getting the most out of AMLS.
  3. Additionally, there is the capability to leverage a terminal on your compute resource.

In this section, you can create new files by clicking on the + icon. Note that both files and folder directories can be uploaded as well as created. This allows you to easily upload data files in addition to code.

Create new file has options to name the file and select what type of file it is, such as a Jupyter notebook or Python. Typically, data scientists will create new Jupyter notebooks, but in addition to the .ipynb extension, the menu for File type includes Jupyter, Python, R, Shell, text, and other, in which you can provide your own file extension.

In the left-hand navigation menu of Figure 1.7, we saw Notebooks, which we briefly reviewed, as well as Automated ML and Designer. We will next provide a high-level overview of the Automated ML section.

  • Automated ML: This can also be selected from the Author section. Automated ML is a no-code tool in which you provide data, select an ML task type, and select a compute to create a model. We will go through this in more detail in later chapters, but at a high level, this option provides a guided, step-by-step walk-through to establish a model based on the dataset provided. You will be prompted to pick classification, regression, or time-series forecasting; natural language processing (multi-class or multi-label classification); or computer vision (including multi-class and multi-label classification, object detection, and instance segmentation) based on your data science workload.

Settings are available to stop a run once it passes a set duration, which limits unexpected costs, and to exclude algorithms. Automated ML selects a variety of algorithms and runs them against the dataset to find the best-performing model. It also includes model explainability, providing insight into which features are more or less important in determining the response variable. The time required depends on the dataset, as well as the compute resources allocated to the task. Automated ML uses an ad hoc compute, so when the experiment is submitted, it starts the compute and then runs the experiment. Model building runs inside an experiment as a job, which is saved as a snapshot for future analysis.

After the best model is built with Automated ML, AMLS provides a single-click deployment of that model as a REST API hosted in an Azure Container Instance (ACI) for development and test environments. AMLS can also support production workloads with a REST API deployment to Azure Kubernetes Service (AKS). With the CLI v2 or the SDK v2, AMLS supports endpoints that streamline the process of model deployment.
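
The idea of trying several algorithms, skipping excluded ones, and stopping at a time limit can be sketched in a few lines. This is a toy illustration only and not how Automated ML is implemented: the scores are made-up placeholders rather than results from real training jobs, and the names CANDIDATE_SCORES and pick_best_model are hypothetical.

```python
import time

# Made-up validation scores; in a real run these would come from training jobs.
CANDIDATE_SCORES = {
    "LogisticRegression": 0.81,
    "RandomForest": 0.86,
    "LightGBM": 0.89,
    "KNN": 0.78,
}


def pick_best_model(scores, blocked=(), timeout_seconds=60.0):
    """Return (name, score) of the best non-blocked candidate seen before the timeout."""
    deadline = time.monotonic() + timeout_seconds
    best = None
    for name, score in scores.items():
        if time.monotonic() > deadline:
            break  # stop early so the run cannot exceed its time budget
        if name in blocked:
            continue  # excluded algorithms are never evaluated
        if best is None or score > best[1]:
            best = (name, score)
    return best


best_model = pick_best_model(CANDIDATE_SCORES, blocked={"LightGBM"})
print(best_model)
```

With LightGBM excluded, the highest remaining placeholder score belongs to RandomForest, so that is the candidate returned.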

Clicking on Automated ML in the left-hand menu tab opens the ability to create a new Automated ML job:

Figure 1.16 – Automated ML screen with options

Now that we have seen the Notebooks and Automated ML sections, we will look at the Designer section for a low-code experience.

  • Designer: This section provides a low-code environment in which data scientists can drag and drop components to develop model training and validation. Designer has two sections: the menu on the left and the authoring section for development on the right. Once the model is built, an option to deploy it in various forms is provided.

Here is a sample experiment built with Designer:

Figure 1.17 – Designer sample

Designer provides options to model with several types of ML models, such as classification, regression, clustering, recommendation, computer vision, and text analytics.

Now that we have reviewed the sections for authoring a model – Notebooks, Automated ML, and Designer – we will explore the concept of assets in the AMLS Assets navigation section.

  • Assets is a section where all the experiment jobs and their artifacts are stored:
Figure 1.18 – Assets menu items

  • Data: This section will display the registered datasets used within the AMLS workspace under the Data assets tab. Datasets manage the versions created every time a new dataset is registered. Datasets can be created through the UI, SDK, or CLI. When a dataset is created, a data store (the resource hosting the data) is also provided:
    • Data assets: This displays a list of the datasets leveraged within the workspace:
Figure 1.19 – The Datasets display

Click on Data assets to see the list of all datasets used. The UI displaying datasets can be customized by adding and deleting columns in your view. In addition to registering datasets through the UI, you can archive a dataset by clicking on Archive. The data in a repository may change over time as applications add data, which is why each registration of a dataset is tracked as a new version.
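
The versioning behavior of registered datasets can be pictured with a small in-memory model. This is a conceptual sketch under simple assumptions, not the AMLS implementation: the class DataAssetRegistry, the dataset name, and the file paths are all hypothetical.

```python
from collections import defaultdict


class DataAssetRegistry:
    """Toy in-memory registry: each registration under a name gets the next version."""

    def __init__(self):
        self._versions = defaultdict(list)

    def register(self, name, path):
        """Record a new version for the named dataset and return its version number."""
        version = len(self._versions[name]) + 1
        self._versions[name].append({"version": version, "path": path})
        return version

    def latest(self, name):
        """Return the most recently registered version of the named dataset."""
        return self._versions[name][-1]


registry = DataAssetRegistry()
registry.register("titanic", "data/titanic_2022.csv")
second = registry.register("titanic", "data/titanic_2023.csv")
print(second, registry.latest("titanic")["path"])
```

Registering the same name twice yields versions 1 and 2, so earlier experiments can keep pointing at the exact data they were trained on.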

  • Datastores: Within the Data section of the left-hand menu pane, Datastores can also be selected. Data stores can be thought of as locations for retrieving data. Examples of data stores include Azure Blob storage, an Azure file share, Azure Data Lake Storage, or an Azure database, including SQL, PostgreSQL, and MySQL. All the credentials for connecting to a data store are associated with your AMLS workspace and stored in Azure Key Vault. During the AMLS workspace deployment, an Azure Blob storage account was created. This Azure Blob storage account is the default datastore for your AMLS workspace.
  • A registered dataset can be monitored with functionality that is currently in preview, which can be reviewed by clicking on the Dataset monitors (preview) label shown in Figure 1.19.
  • Jobs: The Jobs screen shows all the experiments, which are groups of jobs, and the execution of code within your AMLS workspace:
Figure 1.20 – The Experiments display

You can customize and reset the default view in the UI for jobs by adding or deleting columns, each of which shows a property of a given job.

Each experiment will display as blue text under Experiment as in Figure 1.20. Within the Jobs section, we can select multiple experiments and see charts on their performance.

  • Pipelines: A pipeline is a sequence of steps performed within the job of an experiment:
Figure 1.21 – The Pipelines display

Usually, Designer experiments will show the pipeline and provide statuses for the job. As with the UIs for Jobs and Datasets, the UI provides customization when viewing pipelines. You can also display Pipeline endpoints and Pipeline drafts. The view can be sorted or filtered by Status, Experiment, or Tags, with options to select all filters, clear filters, and choose how many rows to display.

  • Environments: Setting up a Python environment can be a difficult task, because the value of leveraging open source packages comes with the complexity of managing the versions of those packages. While this problem is not unique to the AMLS workspace, Azure has created a solution for managing these resources, which AMLS calls environments. Environments is a section in AMLS that allows users to view and register which packages, and which Docker images, should be leveraged by the compute resources. Microsoft has already created several of these environments, which are considered curated, and users can also create their own custom environments. We will be leveraging custom environments in Chapter 3, Training Machine Learning Models in AMLS, as we run experiment jobs on compute clusters.

The Environments section provides a list of environments leveraged by the AMLS workspace:

Figure 1.22 – Environments

In the Curated environments section, there is a wide variety of environments to select from. This is useful for applications that need specific environments with libraries. The list of environments created is available for selection. Click on each Name to see what is included in the environment. For now, most of the following environments are used for inference purposes.

  • Models: The Models section shows all the models registered and their versions. The UI provides customization of columns as shown in the following screenshot:
Figure 1.23 – The Models display

Models can be registered manually, through the SDK, or through the CLI. You can change how many models are displayed, show either the current version or all versions of a model, and sort, filter, and clear filters on the list.

  • Endpoints: Models can be deployed as REST endpoints. An endpoint passes the data it receives to the trained model and returns the predicted values as its response. Because they use the REST protocol, these models can easily be consumed by other applications. Clicking on Endpoints in the left-hand navigation menu of AMLS will bring these up.

The Endpoints section displays endpoints for both real-time and batch inferencing:

Figure 1.24 – The Endpoint display

Real-time endpoints are referred to as online endpoints and typically take a single row of data and produce a score output, and they are performant as a REST API. Batch endpoints are for batch-based execution, where we pass large datasets and are then provided with the predicted output. This is usually a long-running process. While CLI v1 and the SDK v1 allow AMLS users to deploy to ACI and Kubernetes, this book will focus on deployments leveraging CLI v2 and SDK v2, which leverage endpoints to deploy to managed online endpoints, Kubernetes online endpoints, and batch inference endpoints.
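
To give a feel for how another application might consume an online endpoint, the following sketch builds a REST request with the Python standard library without sending it. The scoring URI, the key, and the feature values are placeholders, the helper build_scoring_request is hypothetical, and the {"data": ...} payload shape is an assumption, since the expected body depends on the scoring script deployed behind the endpoint.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, rows):
    """Build (but do not send) a POST request for an online endpoint."""
    body = json.dumps({"data": rows}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder scoring URI
    "<endpoint-key>",                 # placeholder key
    [[5.1, 3.5, 1.4, 0.2]],           # one row of made-up feature values
)
print(request.get_method(), request.get_header("Authorization"))
# Sending the request would be: urllib.request.urlopen(request)
```

A single row in the payload matches the real-time pattern described above; a batch endpoint would instead be pointed at a large dataset and run as a long-running job.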

  • Manage is the section in which users can manage resources leveraged by the AMLS workspace, including Compute, Data Labeling, and Linked Services:
    • Compute: This is where we manage various compute for developing data science projects. There are four types of compute resources found within the Compute section in AMLS. These four include Compute instances, Compute clusters, Inference clusters, and Attached computes.

The Compute section provides visibility into the compute resources leveraged with an AMLS workspace:

Figure 1.25 – Compute options

A compute can be a single node or include several nodes. A node is a Virtual Machine (VM) instance. A single-node instance can only scale vertically and is limited to the Central Processing Unit (CPU) and Graphics Processing Unit (GPU) resources of that one VM. Compute instances are single nodes, which makes them a good fit for development work. Compute clusters, on the other hand, can be scaled horizontally and can be used for workloads with larger datasets, as the workload can be distributed across the nodes. To take advantage of this scaling, jobs can be run in parallel, scaling training and scoring with the AML SDK.

Within the Compute section, as compute resources are created, the available quota for your subscription is displayed, providing visibility into the number of cores that are available for a given subscription. Most Azure VM SKUs are available for compute resources. For GPUs, depending on the region, users can create support requests to extend their vCore quota if capacity is available in the region. When creating compute clusters, the number of nodes leveraged by the compute cluster can be set from 0 to N nodes.

Compute resources in an AMLS workspace incur a cost per node on an hourly basis. On a compute cluster, setting the minimum number of nodes to 0 shuts down the compute resources after an experiment completes and the Idle seconds before scale down period has elapsed. For a compute instance, there is the option to schedule when the instance switches on or off to save money. In addition to compute instances and compute clusters, AMLS has the concept of inference clusters. Inference clusters in the Compute section allow you to view or create an AKS cluster. The last type of compute is found under the Attached computes section, which allows you to attach your own compute resources, including Azure Databricks, Synapse Spark pools, HDInsight, VMs, and others.
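
A quick back-of-the-envelope calculation shows why a minimum of 0 nodes matters. The sketch below is a simplification that assumes every node stays up for the whole job plus the full idle period before scaling down; the function name cluster_job_cost and all the numbers, including the hourly rate, are made up for illustration.

```python
def cluster_job_cost(nodes, job_hours, idle_seconds_before_scale_down, hourly_rate_per_node):
    """Estimate the cost of one job on a cluster whose minimum node count is 0."""
    # Nodes are billed while the job runs and until the idle timer expires.
    billed_hours = job_hours + idle_seconds_before_scale_down / 3600
    return nodes * billed_hours * hourly_rate_per_node


# Made-up numbers: 4 nodes, a 2-hour job, 30 minutes of idle time, 0.50 per node-hour.
cost = cluster_job_cost(4, 2.0, 1800, 0.50)
print(cost)  # 4 nodes * 2.5 hours * 0.50 = 5.0
```

If the same cluster kept a minimum of 4 nodes instead, those nodes would continue to be billed around the clock whether or not a job was running.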

  • Data Labeling: Data Labeling is a newer feature added to AMLS. This feature is for projects that tag images for custom vision-based modeling. Images are labeled within an AMLS Data Labeling project, and multiple users can label images within one project. To further improve productivity, there is ML-assisted data labeling. Within a labeling project, both text and images can be labeled. For image projects, labeling tasks include Image Classification Multi-class, which assigns an image a single class from a set of classes, and Image Classification Multi-label, which applies one or more labels from a set of classes. There is also Object identification, which defines a bounding box for each object found in an image, and finally, Instance Segmentation, which draws a polygon around each object in an image and assigns a class label. Text projects include Multi-class, Multi-label, and Text Named Entity Recognition options. Multi-class applies a single label to a piece of text, while Multi-label allows you to apply one or more labels to a piece of text. Text Named Entity Recognition allows users to tag one or more entities within a piece of text.

The Data Labeling feature requires a GPU-enabled compute, due to its compute-intensive nature. An option to provide project instructions is available. Every user will be assigned a queue and the user’s progress in the project is also shown on a dashboard for each project.

The following screenshot shows how a sample labeling project is displayed:

Figure 1.26 – Data Labeling

  • Linked Services: This provides you with integration with other Microsoft products, currently including Azure Synapse Analytics, so that you can attach Apache Spark pools. Click on the + Add integration button to select an Azure subscription and then a Synapse workspace.

Linked Services, as seen in the following screenshot, provides visibility into established connections with other Microsoft products:

Figure 1.27 – Linked Services

Through this linked service, which is currently in public preview, AMLS can leverage an Azure Synapse workspace, bringing the power of an Apache Spark pool into your AMLS environment. A large component of a data science workload includes data preparation, and through Linked Services, data transformation can be delivered leveraging Spark.

With a basic understanding of the AMLS workspace, you can now move on to writing code. Before you do that, however, you need to create a VM that will power your jobs. Compute instances are AMLS VMs specifically for writing code. They come in many shapes and sizes and can be created via the AMLS GUI, the Azure CLI, Python code, or ARM templates. Each user is required to have their own compute instance, as AMLS allows only one user per compute instance.

We will begin by creating a compute instance via the AMLS GUI. Then, we will add a schedule to our compute instance so that it starts up and shuts down automatically; this is an important cost-saving measure. Next, we will create a compute instance by using the Azure CLI. Finally, we will create a compute instance with a schedule enabled with an ARM template. Even though you will create three compute instances, there is no need to delete them, as you only pay for them while they are in use.
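
As a preview of the Azure CLI route, the sketch below assembles the arguments for an az ml compute create call in Python without running it. It assumes the CLI v2 ml extension is installed; the instance name, VM size, resource group, and workspace name are placeholders, the helper compute_instance_command is hypothetical, and the flags are best confirmed with az ml compute create --help before use.

```python
import shlex


def compute_instance_command(name, vm_size, resource_group, workspace):
    """Return the argument list for creating a compute instance with the CLI v2."""
    return [
        "az", "ml", "compute", "create",
        "--name", name,
        "--type", "ComputeInstance",
        "--size", vm_size,
        "--resource-group", resource_group,
        "--workspace-name", workspace,
    ]


command = compute_instance_command(
    "my-compute-instance",  # placeholder instance name
    "Standard_DS3_v2",      # placeholder VM size
    "my-resource-group",    # placeholder resource group
    "my-amls-workspace",    # placeholder workspace name
)
print(shlex.join(command))
# To run it: subprocess.run(command, check=True), with the Azure CLI signed in.
```

Building the command as a list keeps each value as a separate argument, which avoids quoting problems if a name ever contains spaces.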

Tip

When you’re not using a compute instance, make sure it is shut down. Leaving compute instances up and running incurs an hourly cost.

In this section, we have navigated through AMLS, leveraging the left-hand navigation menu pane. We explored the Author, Assets, and Manage sections and each of the components found within AMLS. Now that we have covered navigating the components of AMLS, let us continue with creating a compute so that you can begin to write code in AMLS.