Book Image

Automated Machine Learning with Microsoft Azure

By : Dennis Michael Sawyers
Book Image

Automated Machine Learning with Microsoft Azure

By: Dennis Michael Sawyers

Overview of this book

Automated Machine Learning with Microsoft Azure will teach you how to build high-performing, accurate machine learning models in record time. It will equip you with the knowledge and skills to easily harness the power of artificial intelligence and increase the productivity and profitability of your business. Guided user interfaces (GUIs) enable both novices and seasoned data scientists to easily train and deploy machine learning solutions to production. Using a careful, step-by-step approach, this book will teach you how to use Azure AutoML with a GUI as well as the AzureML Python software development kit (SDK). First, you'll learn how to prepare data, train models, and register them to your Azure Machine Learning workspace. You'll then discover how to take those models and use them to create both automated batch solutions using machine learning pipelines and real-time scoring solutions using Azure Kubernetes Service (AKS). Finally, you will be able to use AutoML on your own data to not only train regression, classification, and forecasting models but also use them to solve a wide variety of business problems. By the end of this Azure book, you'll be able to show your business partners exactly how your ML models are making predictions through automatically generated charts and graphs, earning their trust and respect.
Table of Contents (17 chapters)
1
Section 1: AutoML Explained – Why, What, and How
5
Section 2: AutoML for Regression, Classification, and Forecasting – A Step-by-Step Guide
10
Section 3: AutoML in Production – Automating Real-Time and Batch Scoring Solutions

Creating a parallel scoring pipeline

Standard ML pipelines work just fine for the majority of ML use cases, but when you need to score a large amount of data at once, you need a more powerful solution. That's where ParallelRunStep comes in. ParallelRunStep is Azure's answer to scoring big data in batch. When you use ParallelRunStep, you leverage all of the cores on your compute cluster simultaneously.

Say you have a compute cluster consisting of eight Standard_DS3_v2 virtual machines. Each Standard_DS3_v2 node has four cores, so you can perform 32 parallel scoring processes at once. This parallelization essentially lets you score data many times faster than if you used a single machine. Furthermore, it can easily scale vertically (increasing the size of each virtual machine in the cluster) and horizontally (increasing the node count).

This section will allow you to become a big data scientist who can score large batches of data. Here, you will again be using simulated...