Getting Started with Amazon SageMaker Studio

By: Michael Hsieh

Overview of this book

Amazon SageMaker Studio is the first integrated development environment (IDE) for machine learning (ML) and is designed to integrate ML workflows in one environment: data preparation, feature engineering, statistical bias detection, automated machine learning (AutoML), training, hosting, ML explainability, monitoring, and MLOps. In this book, you'll start by exploring the features available in Amazon SageMaker Studio to analyze data, develop ML models, and productionize models to meet your goals. As you progress, you will learn how these features work together to address common challenges when building ML models in production. After that, you'll learn how to scale and operationalize the ML life cycle effectively using SageMaker Studio. By the end of this book, you'll have learned ML best practices for Amazon SageMaker Studio, and you'll be able to improve productivity in the ML development life cycle and build and deploy models easily for your ML use cases.
Table of Contents (16 chapters)
1. Part 1 – Introduction to Machine Learning on Amazon SageMaker Studio
4. Part 2 – End-to-End Machine Learning Life Cycle with SageMaker Studio
11. Part 3 – The Production and Operation of Machine Learning with SageMaker Studio

Getting started with SageMaker Data Wrangler for customer churn prediction

Customer churn is a serious problem for businesses. Losing a customer is not something any business owner wants to see. You want your customers to be happy with your product or service and to continue using it for, well, forever. Customer churn will always happen, but understanding how and why a customer leaves the service or stops buying your product is critical for your business. Being able to predict churn ahead of time would be even better.

In this chapter, we will perform exploratory data analysis and data transformation with SageMaker Data Wrangler. At the end of the chapter, we will train an ML model on the wrangled data using the XGBoost algorithm.

Preparing the use case

We are going to take a synthetic telecommunication (telco) customer churn dataset for this chapter to demonstrate what it takes to prepare a dataset for...