Getting Started with Amazon SageMaker Studio

By: Michael Hsieh

Overview of this book

Amazon SageMaker Studio is the first integrated development environment (IDE) for machine learning (ML) and is designed to integrate ML workflows in one environment: data preparation, feature engineering, statistical bias detection, automated machine learning (AutoML), training, hosting, ML explainability, monitoring, and MLOps. In this book, you'll start by exploring the features available in Amazon SageMaker Studio to analyze data, develop ML models, and productionize models to meet your goals. As you progress, you will learn how these features work together to address common challenges when building ML models in production. After that, you'll learn how to scale and operationalize the ML life cycle effectively using SageMaker Studio. By the end of this book, you'll have learned ML best practices for Amazon SageMaker Studio, and you'll be able to improve productivity across the ML development life cycle and to build and deploy models easily for your ML use cases.

Preface

Who this book is for

What this book covers

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Share Your Thoughts

Part 1 – Introduction to Machine Learning on Amazon SageMaker Studio

Chapter 1: Machine Learning and Its Life Cycle in the Cloud

Technical requirements

Understanding ML and its life cycle

Building ML in the cloud

Exploring AWS essentials for ML

Setting up an AWS environment

Summary

Chapter 2: Introducing Amazon SageMaker Studio

Technical requirements

Introducing SageMaker Studio and its components

Setting up SageMaker Studio

Walking through the SageMaker Studio UI

Demystifying SageMaker Studio notebooks, instances, and kernels

Using the SageMaker Python SDK

Summary

Part 2 – End-to-End Machine Learning Life Cycle with SageMaker Studio

Chapter 3: Data Preparation with SageMaker Data Wrangler

Technical requirements

Getting started with SageMaker Data Wrangler for customer churn prediction

Importing data from sources

Exploring data with visualization

Applying transformation

Exporting data for ML training

Summary

Chapter 4: Building a Feature Repository with SageMaker Feature Store

Technical requirements

Understanding the concept of a feature store

Getting started with SageMaker Feature Store

Accessing features from SageMaker Feature Store

Summary

Chapter 5: Building and Training ML Models with SageMaker Studio IDE

Technical requirements

Training models with SageMaker's built-in algorithms

Training with code written in popular frameworks

Developing and collaborating using SageMaker Notebook

Summary

Chapter 6: Detecting ML Bias and Explaining Models with SageMaker Clarify

Technical requirements

Understanding bias, fairness in ML, and ML explainability

Detecting bias in ML

Explaining ML models using SHAP values

Summary

Chapter 7: Hosting ML Models in the Cloud: Best Practices

Technical requirements

Deploying models in the cloud after training

Inferencing in batches with batch transform

Hosting real-time endpoints

Optimizing your model deployment

Summary

Chapter 8: Jumpstarting ML with SageMaker JumpStart and Autopilot

Technical requirements

Launching a SageMaker JumpStart solution

SageMaker JumpStart model zoo

Creating a high-quality model with SageMaker Autopilot

Summary

Further reading

Part 3 – The Production and Operation of Machine Learning with SageMaker Studio

Chapter 9: Training ML Models at Scale in SageMaker Studio

Technical requirements

Performing distributed training in SageMaker Studio

Monitoring model training and compute resources with SageMaker Debugger

Managing long-running jobs with checkpointing and spot training

Summary

Chapter 10: Monitoring ML Models in Production with SageMaker Model Monitor

Technical requirements

Understanding drift in ML

Monitoring data and performance drift in SageMaker Studio

Reviewing model monitoring results in SageMaker Studio

Summary

Chapter 11: Operationalize ML Projects with SageMaker Projects, Pipelines, and Model Registry

Technical requirements

Understanding ML operations and CI/CD

Creating a SageMaker project

Orchestrating an ML pipeline with SageMaker Pipelines

Running CI/CD in SageMaker Studio

Summary

Detecting bias in ML

For this chapter, I'd like to use the Adult census income dataset from the University of California Irvine (UCI) ML repository (https://archive.ics.uci.edu/ml/datasets/adult). This dataset contains demographic information from census data, with income level as the prediction target. The goal is to predict whether a person earns more or less than USD 50,000 ($50K) per year based on the census information. It is a good example of an ML use case that involves socially sensitive categories such as gender and race, which is the kind of use case that faces the most scrutiny and regulation to ensure fairness when producing an ML model.

In this section, we will analyze the dataset to detect bias in the training data, mitigate any bias we find, train an ML model, and analyze whether the model is biased against a particular group.
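To make the first step of this analysis concrete, here is a minimal sketch of two common pretraining bias metrics, class imbalance (CI) and difference in proportions of labels (DPL), computed in plain Python. It is not the book's notebook code and does not call SageMaker Clarify. The function names, the toy records, and the choice of "Male" as the reference group are assumptions made purely for illustration.

```python
def class_imbalance(facet, advantaged):
    """CI = (n_a - n_d) / (n_a + n_d).

    Ranges from -1 to 1; 0 means both groups are equally represented.
    """
    if not facet:
        raise ValueError("facet column is empty")
    n_a = sum(1 for value in facet if value == advantaged)
    n_d = len(facet) - n_a
    return (n_a - n_d) / (n_a + n_d)


def difference_in_proportions_of_labels(facet, labels, advantaged, positive):
    """DPL = q_a - q_d, where q is the share of positive labels in each group."""
    n_a = n_d = pos_a = pos_d = 0
    for value, label in zip(facet, labels):
        if value == advantaged:
            n_a += 1
            pos_a += label == positive
        else:
            n_d += 1
            pos_d += label == positive
    if n_a == 0 or n_d == 0:
        raise ValueError("both groups need at least one record")
    return pos_a / n_a - pos_d / n_d


# Made-up toy records (NOT taken from the UCI dataset): 6 men, 4 women.
sex = ["Male"] * 6 + ["Female"] * 4
income = [
    ">50K", ">50K", ">50K", "<=50K", "<=50K", "<=50K",  # men: 3 of 6 positive
    ">50K", "<=50K", "<=50K", "<=50K",                  # women: 1 of 4 positive
]

ci = class_imbalance(sex, advantaged="Male")
dpl = difference_in_proportions_of_labels(
    sex, income, advantaged="Male", positive=">50K"
)
print(f"CI = {ci:.2f}, DPL = {dpl:.2f}")
```

A positive CI means the reference group is overrepresented in the data, and a positive DPL means it receives the positive label more often. Values near zero on both metrics suggest the training data is balanced with respect to that attribute.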

Detecting pretraining bias

Please open the notebook in Getting-Started-with-Amazon-SageMaker...
