Chapter 3: Data Labeling with Amazon SageMaker Ground Truth

Amazon SageMaker Best Practices

By: Sireesha Muppala, Randy DeFauw, Shelbee Eigenbrode

Overview of this book

Amazon SageMaker is a fully managed AWS service that provides the ability to build, train, deploy, and monitor machine learning models. The book begins with a high-level overview of Amazon SageMaker capabilities that map to the various phases of the machine learning process to help set the right foundation. You'll learn efficient tactics to address data science challenges such as processing data at scale, data preparation, connecting to big data pipelines, identifying data bias, running A/B tests, and model explainability using Amazon SageMaker. As you advance, you'll understand how you can tackle the challenge of training at scale, including how to use large data sets while saving costs, monitoring training resources to identify bottlenecks, speeding up long training jobs, and tracking multiple models trained for a common goal. Moving ahead, you'll find out how you can integrate Amazon SageMaker with other AWS services to build reliable, cost-optimized, and automated machine learning applications. In addition, you'll build ML pipelines integrated with MLOps principles and apply best practices to build secure and performant solutions. By the end of the book, you'll confidently be able to apply Amazon SageMaker's wide range of capabilities to the full spectrum of machine learning workflows.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Share your thoughts

Section 1: Processing Data at Scale

Chapter 1: Amazon SageMaker Overview

Technical requirements

Preparing, building, training and tuning, deploying, and managing ML models

Discussion of data preparation capabilities

Feature tour of model-building capabilities

Feature tour of training and tuning capabilities

Feature tour of model management and deployment capabilities

Summary

Chapter 2: Data Science Environments

Technical requirements

Machine learning use case and dataset

Creating data science environment

Summary

References

Chapter 3: Data Labeling with Amazon SageMaker Ground Truth

Technical requirements

Challenges with labeling data at scale

Addressing unique labeling requirements with custom labeling workflows

Improving labeling quality using multiple workers

Using active learning to reduce labeling time

Security and permissions

Summary

Chapter 4: Data Preparation at Scale Using Amazon SageMaker Data Wrangler and Processing

Technical requirements

Visual data preparation with Data Wrangler

Bias detection and explainability with Data Wrangler and Clarify

Data preparation at scale with SageMaker Processing

Summary

Chapter 5: Centralized Feature Repository with Amazon SageMaker Feature Store

Technical requirements

Amazon SageMaker Feature Store essentials

Creating feature groups

Populating feature groups

Retrieving features from feature groups

Creating reusable features to reduce feature inconsistencies and inference latency

Designing solutions for near real-time ML predictions

Summary

References

Section 2: Model Training Challenges

Chapter 6: Training and Tuning at Scale

Technical requirements

ML training at scale with SageMaker distributed libraries

Automated model tuning with SageMaker hyperparameter tuning

Organizing and tracking training jobs with SageMaker Experiments

Summary

References

Chapter 7: Profile Training Jobs with Amazon SageMaker Debugger

Technical requirements

Amazon SageMaker Debugger essentials

Real-time monitoring of training jobs using built-in and custom rules

Gaining insight into the training infrastructure and training framework

Summary

Further reading

Section 3: Manage and Monitor Models

Chapter 8: Managing Models at Scale Using a Model Registry

Technical requirements

Using a model registry

Choosing a model registry solution

Managing models using the Amazon SageMaker model registry

Summary

Chapter 9: Updating Production Models Using Amazon SageMaker Endpoint Production Variants

Technical requirements

Basic concepts of Amazon SageMaker Endpoint Production Variants

Deployment strategies for updating ML models with SageMaker Endpoint Production Variants

Selecting an appropriate deployment strategy

Summary

Chapter 10: Optimizing Model Hosting and Inference Costs

Technical requirements

Real-time inference versus batch inference

Deploying multiple models behind a single inference endpoint

Scaling inference endpoints to meet inference traffic demands

Using Elastic Inference for deep learning models

Optimizing models with SageMaker Neo

Summary

Chapter 11: Monitoring Production Models with Amazon SageMaker Model Monitor and Clarify

Technical requirements

Basic concepts of Amazon SageMaker Model Monitor and Amazon SageMaker Clarify

End-to-end architectures for monitoring ML models

Best practices for monitoring ML models

Summary

References

Section 4: Automate and Operationalize Machine Learning

Chapter 12: Machine Learning Automated Workflows

Considerations for automating your SageMaker ML workflows

Building ML workflows with Amazon SageMaker Pipelines

Creating CI/CD pipelines using Amazon SageMaker Projects

Summary

Chapter 13: Well-Architected Machine Learning with Amazon SageMaker

Best practices for operationalizing ML workloads

Best practices for securing ML workloads

Best practices for reliable ML workloads

Best practices for building performant ML workloads

Best practices for cost-optimized ML workloads

Summary

Chapter 14: Managing SageMaker Features across Accounts

Examining an overview of the AWS multi-account environment

Understanding the benefits of using multiple AWS accounts with Amazon SageMaker

Examining multi-account considerations with Amazon SageMaker

Summary

References

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share your thoughts

Using active learning to reduce labeling time
