Time Series Analysis on AWS

By: Michaël Hoarau

Overview of this book

As a business analyst or data scientist, you have to use many algorithms and approaches to prepare and process time series data and to build ML-based applications on top of it, but you face common problems, such as not knowing which algorithm to choose or how to combine and interpret them. Amazon Web Services (AWS) provides numerous services to help you build applications fueled by artificial intelligence (AI) capabilities. This book helps you get to grips with three managed AWS AI/ML services so that you can deliver your desired business outcomes. The book begins with Amazon Forecast, where you’ll discover how to use time series forecasting, leveraging sophisticated statistical and machine learning algorithms to produce accurate forecasts. You’ll then learn to use Amazon Lookout for Equipment to build multivariate time series anomaly detection models geared toward industrial equipment, and understand how it provides valuable insights to support teams focused on predictive maintenance and predictive quality use cases. In the last chapters, you’ll explore Amazon Lookout for Metrics to automatically detect and diagnose outliers in your business and operational data. By the end of this book, you’ll understand how to use these three AWS AI services effectively to perform time series analysis.
Table of Contents (20 chapters)
1
Section 1: Analyzing Time Series and Delivering Highly Accurate Forecasts with Amazon Forecast
9
Section 2: Detecting Abnormal Behavior in Multivariate Time Series with Amazon Lookout for Equipment
15
Section 3: Detecting Anomalies in Business Metrics with Amazon Lookout for Metrics

Preparing a dataset for anomaly detection purposes

Before you can train an anomaly detection model, you need to prepare a multivariate time series dataset. In this section, you will learn how to prepare such a dataset and how to allow Amazon Lookout for Equipment to access it.

Preparing the dataset

The dataset we are going to use is a cleaned-up version of the one that can be found on Kaggle here:

https://www.kaggle.com/nphantawee/pump-sensor-data/version/1

This dataset contains known time ranges when a pump is broken and when it is operating under nominal conditions. To adapt this dataset so that it can be fed to Amazon Lookout for Equipment, perform the following steps:

  1. Download the raw time series dataset. It contains 5 months of data sampled at a 1-minute rate, with several events of interest. The original dataset ranges from 2018-04-01 to 2018-08-31.

Figure 9.1 – Industrial pump dataset overview

  1. Amazon...
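As a quick sanity check on the raw data described in the download step above, the sketch below computes how many rows a gap-free series would contain and flags any irregular spacing between consecutive timestamps. It uses only the Python standard library. The helper names (`expected_sample_count`, `find_gaps`) are hypothetical, and it assumes that the stated date range covers whole days, from 2018-04-01 00:00 through 2018-08-31 23:59.

```python
from datetime import datetime, timedelta


def expected_sample_count(start, end, step):
    """Number of samples between start and end (both inclusive) at a fixed step."""
    return (end - start) // step + 1


def find_gaps(timestamps, step):
    """Return (previous, current) pairs whose spacing differs from the expected step."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a != step]


STEP = timedelta(minutes=1)            # 1-minute sampling rate, as stated in the text
START = datetime(2018, 4, 1, 0, 0)     # assumption: series starts at midnight
END = datetime(2018, 8, 31, 23, 59)    # assumption: last day is fully covered

# Row count of a complete, gap-free series over the stated range
expected = expected_sample_count(START, END, STEP)

# Tiny synthetic series with one missing minute (minute 3), for illustration only
sample = [START + i * STEP for i in (0, 1, 2, 4, 5)]
gaps = find_gaps(sample, STEP)

print(f"Expected rows: {expected}")
print(f"Irregular intervals found in the sample: {len(gaps)}")
```

With real data, you would parse the timestamp column of the downloaded file into `datetime` objects and pass the resulting list to `find_gaps`. Comparing the file's actual row count with `expected` gives a quick indication of missing or duplicated samples.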