The Machine Learning Solutions Architect Handbook

By: David Ping

Overview of this book

When equipped with a highly scalable machine learning (ML) platform, organizations can quickly scale the delivery of ML products for faster business value realization. There is a huge demand for skilled ML solutions architects in different industries, and this handbook will help you master the design patterns, architectural considerations, and the latest technology insights you’ll need to become one. You’ll start by understanding ML fundamentals and how ML can be applied to solve real-world business problems. Once you've explored a few leading problem-solving ML algorithms, this book will help you tackle data management and get the most out of ML libraries such as TensorFlow and PyTorch. Using open source technology such as Kubernetes/Kubeflow to build a data science environment and ML pipelines will be covered next, before moving on to building an enterprise ML architecture using Amazon Web Services (AWS). You’ll also learn about security and governance considerations, advanced ML engineering techniques, and how to apply bias detection, explainability, and privacy in ML model development. By the end of this book, you’ll be able to design and build an ML platform to support common use cases and architecture patterns like a true professional.

What are AI and ML?

AI can be defined as a machine demonstrating intelligence similar to human natural intelligence, such as distinguishing different types of flowers by sight, understanding language, or driving a car. An AI system does not have to be powered by ML; it can also be built with other techniques, such as rule-based engines. ML is a form of AI that learns how to perform a task using different learning techniques, such as learning from examples in historical data or learning by trial and error. An example of ML is making credit decisions with a model trained on historical credit decision data.
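
To make the contrast between a rule-based engine and ML concrete, here is a minimal sketch. The credit scores, the hand-picked cutoff of 700, and the function names are all hypothetical, and the "learning" step is deliberately simplistic: it just searches for the single cutoff that best reproduces past decisions.

```python
# Hypothetical historical credit decisions: (credit_score, approved).
history = [
    (520, False), (580, False), (610, False), (640, True),
    (660, True), (690, True), (720, True), (780, True),
]

def rule_based_decision(credit_score):
    """Rule-based engine: an expert hard-codes the cutoff (assumed to be 700)."""
    return credit_score >= 700

def learn_cutoff(examples):
    """Pick the cutoff that agrees with the most historical decisions."""
    def correct_count(cutoff):
        return sum((score >= cutoff) == approved for score, approved in examples)
    candidates = sorted(score for score, _ in examples)
    return max(candidates, key=correct_count)

learned_cutoff = learn_cutoff(history)

def ml_decision(credit_score):
    """ML-style decision: the cutoff comes from the data, not from a person."""
    return credit_score >= learned_cutoff

print("learned cutoff:", learned_cutoff)
print("rule-based decision for 660:", rule_based_decision(660))
print("learned decision for 660:", ml_decision(660))
```

Both functions produce a decision, but only the second one changes its behavior when the historical data changes.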

Deep learning (DL) is a subset of ML that uses a large number of artificial neurons (known as an artificial neural network) to learn, an approach loosely inspired by how the human brain learns. An example of a DL-based solution is the Amazon Echo virtual assistant. To better understand how ML works, let's first talk about the different approaches taken by machines to learn. They are as follows:

Supervised ML
Unsupervised ML
Reinforcement learning

Let's have a look at each one of them in detail.

Supervised ML

Supervised ML is a type of ML in which, during training, the algorithm is given both the input data features (for example, the size and zip code of houses) and the answers, also known as labels (for example, the prices of the houses). A dataset with labels is called a labeled dataset. You can think of supervised ML as learning by example. To understand what this means, consider how we humans learn to distinguish different objects. Say you are first provided with a number of pictures of different flowers and their names. You are then told to study the characteristics of the flowers, such as the shape, size, and color, for each provided flower name. After you have gone through a number of different pictures for each flower, you are given flower pictures without the names and asked to identify them. Based on what you have learned previously, you should be able to name the flowers if they have the characteristics of the known flowers.

In general, the more varied training pictures you have looked at while learning, the more accurate you will likely be when you try to name flowers in new pictures. Conceptually, this is how supervised ML works. The following figure (Figure 1.1) shows a labeled dataset being fed into a computer vision algorithm to train an ML model:

Figure 1.1 – Supervised ML
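
The learning-by-example idea can be sketched in a few lines of code. The sketch below is an illustration under stated assumptions, not the computer vision model from the figure: the petal measurements are made up, and a nearest-neighbor lookup stands in for a real training algorithm. It names a new flower after the most similar labeled example it has seen.

```python
import math

# Hypothetical labeled dataset: (petal_length_cm, petal_width_cm) -> flower name.
labeled_flowers = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((6.0, 2.5), "virginica"),
    ((5.8, 2.2), "virginica"),
]

def predict(features, training_data):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(features, example[0]),
    )
    return nearest_label

# New flowers without names: the model answers from what it has "seen".
print(predict((1.3, 0.25), labeled_flowers))
print(predict((4.6, 1.45), labeled_flowers))
print(predict((5.9, 2.3), labeled_flowers))
```

Adding more labeled examples with more variation gives the lookup more reference points, which mirrors the intuition about looking at more training pictures.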

Supervised ML is mainly used for classification tasks that assign a label from a discrete set of categories to an example (for example, telling the names of different objects) and regression tasks that predict a continuous value (for example, estimating the value of something given supporting information). In the real world, the majority of ML solutions are based on supervised ML techniques. The following are some examples of ML solutions that use supervised ML:

Classifying documents into different document types automatically, as part of a document management workflow. The typical business benefits of ML-based document processing are lower costs from reduced manual effort, faster processing, and higher processing quality.
Assessing the sentiment of news articles to help understand the market perception of a brand or product or facilitate investment decisions.
Automating the detection of objects or faces in images as part of a media image processing workflow. The business benefits are cost savings from reduced human labor, faster processing, and higher accuracy.
Predicting the probability that someone will default on a bank loan. The business benefits this delivers are faster decision-making on loan application reviews and approvals, lower processing costs, and a reduced impact on a company's financial statement due to loan defaults.
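
The regression side of supervised ML, predicting a continuous value, can be illustrated with the house price example mentioned earlier. This is a minimal sketch that assumes a single feature (house size) and made-up prices, and it fits a straight line with ordinary least squares; real solutions typically use many features and a library implementation.

```python
# Hypothetical labeled dataset: house size (sq ft) -> price (thousands of USD).
sizes = [800, 1000, 1200, 1500, 2000]
prices = [165, 195, 245, 295, 405]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sizes, prices)

def predict_price(size):
    """Predict a continuous value for a house the model has not seen."""
    return slope * size + intercept

print(f"price per extra sq ft: {slope:.3f}")
print(f"predicted price for 1800 sq ft: {predict_price(1800):.1f}")
```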

Unsupervised ML

Unsupervised ML is a type of ML where an ML algorithm is provided with input data features without labels. Let's continue with the flower example. In this case, however, you are only provided with the pictures of the flowers and not their names. In this scenario, you will not be able to figure out the names of the flowers, regardless of how much time you spend looking at the pictures. However, through visual inspection, you should be able to identify the common characteristics (for example, color, size, and shape) of different types of flowers across the pictures, and group flowers with common characteristics together.

This is similar to how unsupervised ML works. Specifically, you have performed the clustering task of unsupervised ML:

Figure 1.2 – Unsupervised ML
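
Grouping unlabeled flowers by shared characteristics can be sketched with k-means, one common clustering algorithm (the text does not name a specific one, so this choice is an assumption). The measurements are hypothetical, and the initialization, which simply takes the first k points as starting centroids, is kept naive for brevity.

```python
import math

# Hypothetical unlabeled data: (petal_length_cm, petal_width_cm), no names.
flowers = [(1.4, 0.2), (1.5, 0.3), (1.3, 0.2), (4.5, 1.5), (4.7, 1.4), (4.6, 1.6)]

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = list(points[:k])  # naive start: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

labels, centroids = k_means(flowers, k=2)
print("cluster of each flower:", labels)
print("cluster centers:", centroids)
```

The algorithm finds two groups, but it has no way to name them; that is the practical difference from the supervised case.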

In addition to clustering, there are many other techniques in unsupervised ML. Another common and useful technique is dimensionality reduction, where a smaller number of transformed features represents the original set of features. The transformed features retain the critical information from the original features, so the original data can be largely reconstructed from them even though the number of data dimensions and the data size are reduced. To understand this more intuitively, let's take a look at Figure 1.3:

Figure 1.3 – Reconstruction of an image from reduced features

In this figure, the original picture on the left is transformed to the reduced representation in the middle. While the reduced representation does not look like the original picture at all, it still maintains the critical information about the original picture, so that when the picture on the right is reconstructed using the reduced representation, the reconstructed image looks almost the same as the original picture. The process that transforms the original picture to the reduced representation is called dimensionality reduction.

The main benefits of dimensionality reduction are a smaller training dataset and faster model training. Dimensionality reduction also helps visualize high-dimensional datasets in lower dimensions (for example, reducing the dataset to three dimensions so that it can be plotted and visually inspected).
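
The reduce-then-reconstruct idea can be shown on a much smaller scale than an image. The sketch below uses principal component analysis (PCA), one widely used dimensionality reduction technique; the text does not say which technique produced the figure, so PCA is an assumption here, as are the data points. Each 2D point is compressed to a single number along the direction of greatest variance and then rebuilt from that number.

```python
import math

# Hypothetical 2D dataset whose two features are strongly correlated.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

def fit_principal_axis(data):
    """Return the data mean and the unit vector of greatest variance."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in data) / n
    var_y = sum((y - mean_y) ** 2 for _, y in data) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data) / n
    # Closed-form angle of the major axis of a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))

mean, axis = fit_principal_axis(points)

def reduce(point):
    """2D -> 1D: coordinate of the point along the principal axis."""
    return (point[0] - mean[0]) * axis[0] + (point[1] - mean[1]) * axis[1]

def reconstruct(value):
    """1D -> 2D: approximate the original point from its reduced value."""
    return (mean[0] + value * axis[0], mean[1] + value * axis[1])

reduced = [reduce(p) for p in points]
rebuilt = [reconstruct(v) for v in reduced]
worst_error = max(math.dist(p, r) for p, r in zip(points, rebuilt))
print("reduced representation:", [round(v, 2) for v in reduced])
print(f"largest reconstruction error: {worst_error:.3f}")
```

The reduced representation stores half as many numbers per point, and the reconstruction error stays small because most of the variation in this dataset lies along one direction.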

Unsupervised ML is mainly used for recognizing underlying patterns within a dataset. Since unsupervised learning is not provided with actual labels to learn from, its predictions carry greater uncertainty than predictions made with the supervised ML approach. The following are some real-life examples of unsupervised ML solutions:

Customer segmentation for target marketing: This is done by using customer attributes such as demographics and historical engagement data. The data-driven customer segmentation approach is usually more accurate than human judgment, which can be biased and subjective.
Computer network intrusion detection: This is done by detecting outlier patterns that are different from normal network traffic patterns. Detecting anomalies in network traffic manually or with rule-based processing is extremely challenging due to the high volume and changing dynamics of traffic patterns.
Reducing the dimensions of datasets: This is done to visualize them in a 2D or 3D environment to help understand the data better and more easily.
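
As a rough illustration of the outlier detection idea in the intrusion detection example, the sketch below flags traffic measurements that sit far from the typical value. The request counts, the threshold of 5, and the use of the median absolute deviation as the yardstick are all assumptions chosen for simplicity; production systems use far richer features and models.

```python
import statistics

# Hypothetical requests-per-minute counts observed on a network link.
traffic = [120, 115, 130, 125, 118, 122, 128, 900, 119, 124]

def find_outliers(values, threshold=5.0):
    """Flag values that lie far from the median, measured in units of the
    median absolute deviation (MAD). No labels are needed."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # all values are (nearly) identical; nothing stands out
    return [v for v in values if abs(v - median) / mad > threshold]

print("suspicious traffic levels:", find_outliers(traffic))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value distorts them much less.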

Reinforcement learning

Reinforcement learning is a type of ML where an ML model learns by trying out different actions and adjusting its future behavior based on the response it receives to each action. For example, suppose you are playing a Space Invaders video game for the first time without knowing the game's rules. In that case, you will initially try out different actions randomly using the controls, such as moving left and right or shooting the cannon. As you make different moves, you will see responses to them, such as getting killed or killing an invader, and you will also see your score increase or decrease. Through these responses, you will learn which moves are good and which are bad for staying alive and increasing your score. After much trial and error, you will eventually become a very good player of the game. This is basically how reinforcement learning works.

A very popular example of reinforcement learning is the AlphaGo computer program, which mainly uses reinforcement learning to learn how to play the game of Go. Figure 1.4 shows the flow of reinforcement learning, where an agent (for example, the player of a Space Invaders game) takes actions (for example, moving the left/right control) in the environment (for example, the current state of the game) and receives rewards or penalties (a score increase or decrease). As a result, the agent adjusts its future moves to maximize the rewards in the future states of the environment. This cycle continues for a very large number of rounds, and the agent improves over time:

Figure 1.4 – Reinforcement learning
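
The agent, action, environment, and reward loop can be sketched with tabular Q-learning, one classic reinforcement learning algorithm (an assumption for illustration; it is not the method behind AlphaGo). The environment is a hypothetical five-cell corridor rather than a video game: reaching the right end earns a reward of +1, and reaching the left end earns a penalty of -1. All parameter values are illustrative.

```python
import random

N_STATES = 5                # corridor cells 0..4
START, LOSE, WIN = 2, 0, 4  # the agent starts in the middle
ACTIONS = (-1, +1)          # move left or move right

def step(state, action):
    """Environment: returns (next_state, reward, episode_finished)."""
    next_state = state + action
    if next_state == WIN:
        return next_state, 1.0, True
    if next_state == LOSE:
        return next_state, -1.0, True
    return next_state, 0.0, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try a random move
            else:
                best = max(q[state].values())  # exploit: best known move
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[next_state].values())
            # Nudge the estimate toward reward plus discounted future value.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[s][a]) for s in (1, 2, 3)}
print("learned move per cell (+1 = right):", policy)
```

Early episodes are mostly random wandering; as rewards and penalties accumulate in the table, the agent comes to prefer moving toward the rewarding end from every cell.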

There are many practical use cases for reinforcement learning in the real world. The following are some examples of reinforcement learning:

Robots or self-driving cars learn how to walk or navigate in unknown environments by trying out different moves and responding to the received results.
A recommendation engine optimizes product recommendations through adjustments based on the feedback of the customers to different product recommendations.
A truck delivery company optimizes the delivery route of its fleet to determine the delivery sequence required to achieve the best rewards, such as the lowest cost or shortest time.