Preparing a dataset for anomaly detection purposes
Throughout this chapter and the next, we will work with an e-commerce dataset. We will detect potential anomalies in it and identify likely root causes, so that problems can be investigated and remediated faster.
In the subsections that follow, we will walk through these steps in detail:
- Download the e-commerce dataset and split it into a training dataset (used for backtesting) and a testing dataset (used to simulate live data, so that you can see how the continuous mode of Amazon Lookout for Metrics works).
- Upload your prepared CSV files to Amazon Simple Storage Service (Amazon S3). Amazon S3 lets you store files and is often used as a file datastore by AWS services such as Amazon Lookout for Metrics.
- Authorize Amazon Lookout for Metrics to access your data in Amazon S3. This is optional as you can let Amazon Lookout for...
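As a rough illustration of the first step, the split between training and testing data can be sketched as a cutoff on the timestamp of each row. This is only a minimal sketch under stated assumptions: the column name `timestamp`, the timestamp format, and the helper name `split_by_cutoff` are hypothetical and are not taken from the e-commerce dataset itself, so adapt them to the actual file.

```python
from datetime import datetime


def split_by_cutoff(rows, cutoff, timestamp_column="timestamp",
                    timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Split rows into (training, testing) around a cutoff datetime.

    `rows` is an iterable of dicts, for example the rows yielded by
    csv.DictReader. The column name and format are assumptions.
    """
    training, testing = [], []
    for row in rows:
        ts = datetime.strptime(row[timestamp_column], timestamp_format)
        # Rows strictly before the cutoff are kept for backtesting;
        # rows at or after the cutoff simulate live data.
        if ts < cutoff:
            training.append(row)
        else:
            testing.append(row)
    return training, testing


# Hypothetical in-memory sample standing in for the real CSV content.
sample_rows = [
    {"timestamp": "2021-01-01 00:00:00", "views": "10"},
    {"timestamp": "2021-01-20 00:00:00", "views": "14"},
    {"timestamp": "2021-02-01 00:00:00", "views": "12"},
]
train_rows, test_rows = split_by_cutoff(sample_rows, datetime(2021, 1, 15))
```

Splitting on time rather than at random keeps the testing portion strictly later than the training portion, which is what replaying it as simulated live data requires.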