Learn Amazon SageMaker - Second Edition

By: Julien Simon

Overview of this book

Amazon SageMaker enables you to quickly build, train, and deploy machine learning models at scale without managing any infrastructure. It helps you focus on the machine learning problem at hand and deploy high-quality models by eliminating the heavy lifting typically involved in each step of the ML process. This second edition will help data scientists and ML developers explore new features such as SageMaker Data Wrangler, Pipelines, Clarify, Feature Store, and much more. You'll start by learning how to use the various capabilities of SageMaker as a single toolset to solve ML challenges, then progress to features such as AutoML, built-in algorithms and frameworks, and writing your own code and algorithms to build ML models. The book will then show you how to integrate Amazon SageMaker with popular deep learning libraries, such as TensorFlow and PyTorch, to extend the capabilities of existing models. You'll also see how automating your workflows can help you get to production faster with minimal effort and at a lower cost. Finally, you'll explore SageMaker Debugger and SageMaker Model Monitor to detect quality issues in training and production. By the end of this book, you'll be able to use Amazon SageMaker on the full spectrum of ML workflows, from experimentation, training, and monitoring to scaling, deployment, and automation.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Share your thoughts

Section 1: Introduction to Amazon SageMaker

Chapter 1: Introducing Amazon SageMaker

Technical requirements

Exploring the capabilities of Amazon SageMaker

Setting up Amazon SageMaker on your local machine

Setting up Amazon SageMaker Studio

Deploying one-click solutions and models with Amazon SageMaker JumpStart

Summary

Chapter 2: Handling Data Preparation Techniques

Technical requirements

Labeling data with Amazon SageMaker Ground Truth

Transforming data with Amazon SageMaker Data Wrangler

Running batch jobs with Amazon SageMaker Processing

Summary

Section 2: Building and Training Models

Chapter 3: AutoML with Amazon SageMaker Autopilot

Technical requirements

Discovering Amazon SageMaker Autopilot

Using Amazon SageMaker Autopilot in SageMaker Studio

Using the SageMaker Autopilot SDK

Diving deep on SageMaker Autopilot

Summary

Chapter 4: Training Machine Learning Models

Technical requirements

Discovering the built-in algorithms in Amazon SageMaker

Training and deploying models with built-in algorithms

Using the SageMaker SDK with built-in algorithms

Working with more built-in algorithms

Summary

Chapter 5: Training CV Models

Technical requirements

Discovering the CV built-in algorithms in Amazon SageMaker

Preparing image datasets

Using the built-in CV algorithms

Summary

Chapter 6: Training Natural Language Processing Models

Technical requirements

Discovering the NLP built-in algorithms in Amazon SageMaker

Preparing natural language datasets

Using the built-in algorithms for NLP

Summary

Chapter 7: Extending Machine Learning Services Using Built-In Frameworks

Technical requirements

Discovering the built-in frameworks in Amazon SageMaker

Running your framework code on Amazon SageMaker

Using the built-in frameworks

Summary

Chapter 8: Using Your Algorithms and Code

Technical requirements

Understanding how SageMaker invokes your code

Customizing an existing framework container

Using the SageMaker Training Toolkit with scikit-learn

Building a fully custom container for scikit-learn

Building a fully custom container for R

Training and deploying with your own code on MLflow

Building a fully custom container for SageMaker Processing

Summary

Section 3: Diving Deeper into Training

Chapter 9: Scaling Your Training Jobs

Technical requirements

Understanding when and how to scale

Monitoring and profiling training jobs with Amazon SageMaker Debugger

Streaming datasets with pipe mode

Distributing training jobs

Scaling an image classification model on ImageNet

Training with the SageMaker data and model parallel libraries

Using other storage services

Summary

Chapter 10: Advanced Training Techniques

Technical requirements

Optimizing training costs with managed spot training

Optimizing hyperparameters with automatic model tuning

Exploring models with SageMaker Debugger

Managing features and building datasets with SageMaker Feature Store

Detecting bias in datasets and explaining predictions with SageMaker Clarify

Summary

Section 4: Managing Models in Production

Chapter 11: Deploying Machine Learning Models

Technical requirements

Examining model artifacts and exporting models

Deploying models on real-time endpoints

Deploying models on batch transformers

Deploying models on inference pipelines

Monitoring prediction quality with Amazon SageMaker Model Monitor

Deploying models to container services

Summary

Chapter 12: Automating Machine Learning Workflows

Technical requirements

Automating with AWS CloudFormation

Automating with AWS CDK

Building end-to-end workflows with AWS Step Functions

Building end-to-end workflows with Amazon SageMaker Pipelines

Summary

Chapter 13: Optimizing Prediction Cost and Performance

Technical requirements

Autoscaling an endpoint

Deploying a multi-model endpoint

Deploying a model with Amazon Elastic Inference

Compiling models with Amazon SageMaker Neo

Building a cost optimization checklist

Summary

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share your thoughts

Running your framework code on Amazon SageMaker

We will start from a vanilla scikit-learn program that trains and saves a linear regression model on the Boston Housing dataset, which we used in Chapter 4, Training Machine Learning Models:

import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Load the dataset and separate the target column (medv) from the features
data = pd.read_csv('housing.csv')
labels = data[['medv']]
samples = data.drop(['medv'], axis=1)
# Hold out 10% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    samples, labels, test_size=0.1, random_state=123)
# scikit-learn 1.2 and later no longer accept normalize=True
regr = LinearRegression()
regr.fit(X_train, y_train)
# Evaluate the model on the held-out rows
y_pred = regr.predict(X_test)
print('Mean squared error: %.2f'
      % mean_squared_error(y_test, y_pred))
print('Coefficient of determination: %.2f'
      % r2_score(y_test, y_pred))
...
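As a rough illustration of the "trains and saves" flow described above, here is a minimal, self-contained sketch. It rests on several assumptions that are not in the original: the save step uses `joblib.dump` (the `joblib` import hints at this, but the source does not show the call), the file name `model.joblib` and the helper `train_and_save` are hypothetical, and synthetic data stands in for `housing.csv` so the example runs without the dataset.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def train_and_save(samples, labels, model_path):
    """Hypothetical helper: train, score on a held-out split, save the model."""
    X_train, X_test, y_train, y_test = train_test_split(
        samples, labels, test_size=0.1, random_state=123)
    regr = LinearRegression()
    regr.fit(X_train, y_train)
    y_pred = regr.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    # Assumed save step: serialize the fitted estimator with joblib
    joblib.dump(regr, model_path)
    return mse, r2


# Synthetic stand-in for housing.csv: three features, noise-free linear target
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3))
labels = samples @ np.array([1.5, -2.0, 0.5]) + 4.0

# Hypothetical output location in a temporary directory
model_path = os.path.join(tempfile.mkdtemp(), 'model.joblib')
mse, r2 = train_and_save(samples, labels, model_path)
print('Mean squared error: %.2f' % mse)
print('Coefficient of determination: %.2f' % r2)

# Reload the saved model to confirm that it round-trips
restored = joblib.load(model_path)
```

Wrapping the steps in a function keeps the output path as a parameter, which is convenient if the same script later needs to write the model to a location chosen by the caller rather than a hard-coded one.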
