Preparing a dataset for anomaly detection purposes
Before you can train an anomaly detection model, you need to prepare a multivariate time series dataset. In this section, you will learn how to prepare such a dataset and how to allow Amazon Lookout for Equipment to access it.
Preparing the dataset
The dataset we are going to use is a cleaned-up version of the one that can be found on Kaggle here:
https://www.kaggle.com/nphantawee/pump-sensor-data/version/1
This dataset contains known time ranges when a pump is broken and when it is operating under nominal conditions. To adapt this dataset so that it can be fed to Amazon Lookout for Equipment, perform the following steps:
- Download the raw time series dataset. It contains 5 months of sensor readings sampled at 1-minute intervals, with several events of interest. The original dataset ranges from 2018-04-01 to 2018-08-31.
- Amazon...
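Once the raw file is downloaded, it can be worth confirming that it matches the description above (a 2018-04-01 start and a 1-minute sampling rate) before going further. The following is a minimal sketch using only the Python standard library. The function name summarize_timestamps, the column name timestamp, the timestamp format, and the in-memory sample rows are all assumptions made for illustration; adjust them to match the file you actually downloaded.

```python
import csv
import io
from datetime import datetime, timedelta


def summarize_timestamps(csv_file, column="timestamp", fmt="%Y-%m-%d %H:%M:%S"):
    """Return (first timestamp, last timestamp, set of distinct gaps between rows).

    `column` and `fmt` are assumptions about the file layout, not facts
    taken from the dataset documentation.
    """
    reader = csv.DictReader(csv_file)
    stamps = [datetime.strptime(row[column], fmt) for row in reader]
    if not stamps:
        raise ValueError("no rows found")
    # Differences between consecutive rows; a regular 1-minute series
    # yields exactly one distinct gap.
    gaps = {later - earlier for earlier, later in zip(stamps, stamps[1:])}
    return stamps[0], stamps[-1], gaps


# Made-up in-memory rows standing in for the downloaded file.
sample = io.StringIO(
    "timestamp,value\n"
    "2018-04-01 00:00:00,1.0\n"
    "2018-04-01 00:01:00,2.0\n"
    "2018-04-01 00:02:00,3.0\n"
)

first, last, gaps = summarize_timestamps(sample)
print("first:", first)
print("last:", last)
print("regular 1-minute sampling:", gaps == {timedelta(minutes=1)})
```

To run the same check against the real file, pass an open file handle instead of the sample, for example with open(path, newline="") as f: summarize_timestamps(f). If more than one distinct gap is reported, the series has missing or duplicated timestamps that may need attention.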