Automated Machine Learning

By: Adnan Masood

Overview of this book

Every machine learning engineer deals with systems that have hyperparameters, and the most basic task in automated machine learning (AutoML) is to automatically set these hyperparameters to optimize performance. The latest deep neural networks have a wide range of hyperparameters for their architecture, regularization, and optimization, which can be customized effectively to save time and effort. This book reviews the underlying techniques of automated feature engineering, model and hyperparameter tuning, gradient-based approaches, and much more. You'll discover different ways of implementing these techniques in open source tools and then learn to use enterprise tools for implementing AutoML in three major cloud service providers: Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform. As you progress, you’ll explore the features of cloud AutoML platforms by building machine learning models using AutoML. The book will also show you how to develop accurate models by automating time-consuming and repetitive tasks in the machine learning development lifecycle. By the end of this machine learning book, you’ll be able to build and deploy AutoML models that are not only accurate, but also increase productivity, allow interoperability, and minimize feature engineering tasks.
Table of Contents (15 chapters)
Section 1: Introduction to Automated Machine Learning
Section 2: AutoML with Cloud Platforms
Section 3: Applied Automated Machine Learning

The ML development life cycle

Before introducing you to automated ML, we should first define how we operationalize and scale ML experiments into production. To go beyond Hello-World apps and works-on-my-machine-in-my-Jupyter-notebook kinds of projects, enterprises need to adopt a robust, reliable, and repeatable model development and deployment process. Like the software development life cycle (SDLC), the ML or data science life cycle is a multi-stage, iterative process.

The life cycle includes several steps. These are defining and analyzing the problem, building the hypothesis (unless you are doing exploratory data analysis), selecting business outcome metrics, exploring and preparing data, building ML models, training those models, evaluating and deploying them, and maintaining the feedback loop:

Figure 1.1 – Team data science process

A successful data science team has the discipline to prepare the problem statement and hypothesis, preprocess the data, select the appropriate features based on input from the subject-matter expert (SME), choose the right model family, optimize model hyperparameters, review outcomes and the resulting metrics, and finally fine-tune the models. If this sounds like a lot, remember that it is an iterative process in which the data scientist also has to ensure that data versioning, model versioning, and drift are being addressed. They must also put guardrails in place to ensure that the model's performance is monitored. To make this even more interesting, champion/challenger and A/B experiments also run frequently in production: may the best model win.
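To make two of these steps concrete, here is a minimal sketch of hyperparameter optimization followed by a champion/challenger decision. Everything in it is an illustrative assumption rather than something taken from the book: the toy one-dimensional dataset, the k-nearest-neighbors classifier, the candidate values of k, and the helper names make_data, grid_search, and pick_champion. It uses only the Python standard library.

```python
import random


def make_data(n, seed):
    """Toy 1-D binary data: label is 1 when x > 0.5, with about 10% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:  # flip a small share of labels to simulate noise
            y = 1 - y
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return int(votes * 2 > k)


def accuracy(train, evaluation, k):
    """Share of evaluation points that a k-NN model built on train labels correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return correct / len(evaluation)


def grid_search(train, validation, candidates):
    """Score every candidate k on the validation set and return the best one."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


def pick_champion(champion_score, challenger_score, margin=0.0):
    """Promote the challenger only if it beats the champion by more than margin."""
    if challenger_score > champion_score + margin:
        return "challenger"
    return "champion"


train = make_data(200, seed=0)
validation = make_data(100, seed=1)
candidates = [1, 3, 5, 9, 15]

best_k, scores = grid_search(train, validation, candidates)
print("validation accuracy per k:", scores)
print("selected k:", best_k)

# Treat k=1 as the current production model and the tuned model as the challenger.
print("winner:", pick_champion(scores[1], scores[best_k], margin=0.01))
```

The search here is an exhaustive grid over a handful of values, which is the simplest possible strategy. The margin argument is one way to avoid promoting a challenger on a difference that may be noise. A real deployment would compare the models on live or held-out traffic rather than on the same validation set used for tuning.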

In such an intricate and multifaceted environment, data scientists can use all the help they can get. Automated ML extends a helping hand, promising to take care of the mundane, repetitive, and less intellectually demanding tasks so that data scientists can focus on the important work.