Getting Started with Amazon SageMaker Studio

By: Michael Hsieh

Overview of this book

Amazon SageMaker Studio is the first integrated development environment (IDE) for machine learning (ML) and is designed to integrate ML workflows in one environment: data preparation, feature engineering, statistical bias detection, automated machine learning (AutoML), training, hosting, ML explainability, monitoring, and MLOps. In this book, you'll start by exploring the features available in Amazon SageMaker Studio to analyze data, develop ML models, and productionize models to meet your goals. As you progress, you will learn how these features work together to address common challenges when building ML models in production. After that, you'll learn how to effectively scale and operationalize the ML life cycle using SageMaker Studio. By the end of this book, you'll know the ML best practices for Amazon SageMaker Studio and be able to improve productivity across the ML development life cycle and to build and deploy models easily for your ML use cases.
Table of Contents (16 chapters)
Part 1 – Introduction to Machine Learning on Amazon SageMaker Studio
Part 2 – End-to-End Machine Learning Life Cycle with SageMaker Studio
Part 3 – The Production and Operation of Machine Learning with SageMaker Studio

Understanding bias, fairness in ML, and ML explainability

There are two types of bias in ML that we can analyze and mitigate to ensure fairness: data bias and model bias. Data bias is an imbalance in the training data across different groups and categories. It can enter an ML solution through something as simple as a sampling error, or in subtler ways through inequities that are unfortunately ingrained in society. If neglected, data bias can translate into poor overall accuracy and unfair predictions against a certain group in the trained model. It is more critical than ever to discover inherent biases in the data early and take action to address them. Model bias, on the other hand, refers to bias introduced by model predictions, such as the distribution of classifications and errors among advantaged and disadvantaged groups. Should the model favor an advantaged group for a particular outcome or disproportionately predict incorrectly for a disadvantaged group, causing...
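
To make the distinction between data bias and model bias concrete, here is a minimal Python sketch that computes one simple measure of each on a toy dataset. The function names (class_imbalance, positive_rate, error_rate), the group labels "a" (advantaged) and "d" (disadvantaged), and the data itself are hypothetical and chosen only for illustration. These are simple illustrative measures and are not necessarily the ones the book or SageMaker Studio uses.

```python
from collections import Counter


def class_imbalance(groups, advantaged, disadvantaged):
    """Data bias: normalized difference in group sizes, in [-1, 1].

    0 means both groups are equally represented; values near 1 mean the
    advantaged group dominates the training data.
    """
    counts = Counter(groups)
    n_a, n_d = counts[advantaged], counts[disadvantaged]
    if n_a + n_d == 0:
        raise ValueError("neither group appears in the data")
    return (n_a - n_d) / (n_a + n_d)


def positive_rate(groups, outcomes, group):
    """Share of rows in `group` whose (binary) outcome is 1."""
    selected = [y for g, y in zip(groups, outcomes) if g == group]
    if not selected:
        raise ValueError(f"no rows for group {group!r}")
    return sum(selected) / len(selected)


def error_rate(groups, labels, predictions, group):
    """Share of rows in `group` where the prediction differs from the label."""
    pairs = [(y, p) for g, y, p in zip(groups, labels, predictions) if g == group]
    if not pairs:
        raise ValueError(f"no rows for group {group!r}")
    return sum(y != p for y, p in pairs) / len(pairs)


# Hypothetical toy data: six rows from group "a", two rows from group "d".
groups      = ["a", "a", "a", "a", "a", "a", "d", "d"]
labels      = [1,   1,   0,   1,   0,   1,   1,   0]
predictions = [1,   1,   0,   1,   1,   1,   0,   0]

# Data bias: how unevenly the two groups are represented.
ci = class_imbalance(groups, "a", "d")

# Model bias (outcomes): gap in predicted positive rates between groups.
prediction_gap = (positive_rate(groups, predictions, "a")
                  - positive_rate(groups, predictions, "d"))

# Model bias (errors): gap in error rates between groups.
error_gap = (error_rate(groups, labels, predictions, "d")
             - error_rate(groups, labels, predictions, "a"))

print(f"class imbalance:          {ci:.3f}")
print(f"predicted-positive gap:   {prediction_gap:.3f}")
print(f"error-rate gap (d - a):   {error_gap:.3f}")
```

In this toy example, a positive class imbalance indicates that group "a" is over-represented in the data, while positive gaps in predicted outcomes and error rates indicate that the hypothetical model favors group "a" and errs more often for group "d".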