Modern Time Series Forecasting with Python

By: Manu Joseph

Overview of this book

We live in a serendipitous era where the explosion in the quantum of data collected and a renewed interest in data-driven techniques such as machine learning (ML) have changed the landscape of analytics, and with it, time series forecasting. This book, filled with industry-tested tips and tricks, takes you beyond commonly used classical statistical methods such as ARIMA and introduces you to the latest techniques from the world of ML. This is a comprehensive guide to analyzing, visualizing, and creating state-of-the-art forecasting systems, complete with common topics such as ML and deep learning (DL) as well as rarely touched-upon topics such as global forecasting models, cross-validation strategies, and forecast metrics. You’ll begin by exploring the basics of data handling, data visualization, and classical statistical methods before moving on to ML and DL models for time series forecasting. This book takes you on a hands-on journey in which you’ll develop state-of-the-art ML (linear regression to gradient-boosted trees) and DL (feed-forward neural networks, LSTMs, and transformers) models on a real-world dataset, along with exploring practical topics such as interpretability. By the end of this book, you’ll be able to build world-class time series forecasting systems and tackle problems in the real world.
Table of Contents (26 chapters)
Part 1 – Getting Familiar with Time Series
Part 2 – Machine Learning for Time Series
Part 3 – Deep Learning for Time Series
Part 4 – Mechanics of Forecasting

To get the most out of this book

You should have basic familiarity with Python programming, as all the code used in the practical sections is in Python. Familiarity with major Python libraries, such as pandas and scikit-learn, is not essential (because the book covers some basics) but will help you get through the book much faster. Familiarity with PyTorch, the framework the book uses for deep learning, is also not essential but would accelerate your learning manyfold. None of the software requirements should stop you because, in today’s internet-enabled world, the only thing standing between you and a world of knowledge is the search bar in your favorite search engine.

Another key to getting the most out of this book is to run the associated notebooks as you work through the lessons. Also, feel free to experiment with variations that the book doesn’t go into. That is a surefire way to internalize what’s being discussed in the book. For that, we need to set up an environment, as you’ll see in the following section.

Setting up an environment

The easiest way to set up an environment is by using Anaconda, a distribution of Python for scientific computing. You can use Miniconda, a minimal installer for Conda, as well if you do not want the pre-installed packages that come with Anaconda:

  1. Install Anaconda/Miniconda: Anaconda can be installed from https://www.anaconda.com/products/distribution. Depending on your operating system, choose the corresponding file and follow the instructions. Alternatively, you can install Miniconda from here: https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links.
  2. Open conda prompt: To open Anaconda Prompt (or Terminal on Linux or macOS), do the following:
    • Windows: Open the Anaconda Prompt (Start | Anaconda Prompt)
    • macOS: Open Launchpad and then open Terminal. Type conda activate.
    • Linux: Open Terminal. Type conda activate.
  3. Navigate to the downloaded code: Use operating system-specific commands to navigate to the folder where you have downloaded the code. For instance, in Windows, use cd.
  4. Install the environment: Using the anaconda_env.yml file that is included, install the environment:
    conda env create -f anaconda_env.yml

This creates a new environment named modern_ts and installs all the required libraries in it. This can take a while.

  5. Checking the installation: We can check whether all the libraries required for the book are installed properly by executing a script in the downloaded code folder:
    python test_installation.py
  6. Activating the environment and running notebooks: Every time you want to run the notebooks, first activate the environment using the conda activate modern_ts command and then launch Jupyter Notebook (jupyter notebook) or JupyterLab (jupyter lab), according to your preference.
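
As a rough illustration of what an installation check involves, the following sketch tests whether a few libraries can be located in the active environment. It is not the book's test_installation.py; the module list is an assumption limited to the libraries named in this section (pandas, scikit-learn, and PyTorch), and the function name find_missing is hypothetical.

```python
import importlib.util

# Assumption: an illustrative subset of libraries mentioned in the text.
# The environment file may install many more than these.
REQUIRED_MODULES = ["pandas", "sklearn", "torch"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it after conda activate modern_ts; if it reports missing modules, the environment was probably not activated or not created completely.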

Download the data

You are going to be using a single dataset throughout the book. The book uses the London Smart Meters dataset from Kaggle for this purpose. Therefore, if you don’t have an account with Kaggle, please go ahead and create one: https://www.kaggle.com/account/login?phase=startRegisterTab.

There are two ways you can download the data: automated and manual.

For the automated way, we need to download a key from Kaggle. Let’s do that first (if you are going to choose the manual way, you can skip this):

  1. Click on your profile picture in the top-right corner of Kaggle.
  2. Select Account, and find the section for API.
  3. Click the Create New API Token button. A file with the name kaggle.json will be downloaded.
  4. Copy the file and place it in the api_keys folder in the downloaded code folder.
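
Before running the automated download, it can help to confirm that the token file is where the steps above put it. The sketch below is a hypothetical helper, not part of the book's code. It assumes the token file is a JSON object with username and key entries, which is the layout of the classic kaggle.json; a different token format would need a different check.

```python
import json
from pathlib import Path


def check_kaggle_token(code_root):
    """Return a list of problems with api_keys/kaggle.json (empty if none)."""
    token_path = Path(code_root) / "api_keys" / "kaggle.json"
    if not token_path.is_file():
        return [f"{token_path} not found"]
    try:
        token = json.loads(token_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return [f"{token_path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{token_path} does not contain a JSON object"]
    # Assumption: the token holds non-empty "username" and "key" entries.
    return [f"missing field: {field}" for field in ("username", "key") if not token.get(field)]


if __name__ == "__main__":
    # "." assumes the script is run from the root of the downloaded code folder.
    problems = check_kaggle_token(".")
    print("Token file looks fine." if not problems else problems)
```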

Now that we have kaggle.json downloaded and placed in the right folder, let’s look at the two methods to download data:

Method one – automated download

  1. Activate the environment using conda activate modern_ts.
  2. Run the provided script from the root directory of the downloaded code:
    python scripts/download_data.py

That’s it. Now, just wait for the script to finish downloading the data, unzipping it, and organizing the files in the expected format.

Method two – manual download

  1. Go to https://www.kaggle.com/jeanmidev/smart-meters-in-london and download the dataset.
  2. Unzip the contents to data/london_smart_meters.
  3. Unzip hhblock_dataset to get the raw files we want to work with.
  4. Make sure the unzipped files are in the expected folder structure (see the next section).
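
If you prefer to script the manual unzip steps instead of using a file manager, a minimal sketch with Python's standard zipfile module could look like the following. The helper name extract_archive is hypothetical, and the archive file names in the usage comments are assumptions, since the exact names of the downloaded files are not stated here.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir, creating the folder if needed."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Return the top-level names now present in the target folder.
    return sorted(p.name for p in target.iterdir())


# Hypothetical usage; replace the archive names with the files you actually have:
# extract_archive("archive.zip", "data/london_smart_meters")
# extract_archive("data/london_smart_meters/hhblock_dataset.zip",
#                 "data/london_smart_meters/hhblock_dataset")
```

After extracting, compare the result with the expected folder structure, because the nesting produced by an archive depends on how it was packed.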

Now that you have downloaded the data, we need to make sure it is arranged in the expected folder structure. The automated download does this for you, but with the manual download, you need to create the structure yourself. To avoid ambiguity, the expected folder structure is as follows:

data
├── london_smart_meters
│   ├── hhblock_dataset
│   │   ├── hhblock_dataset
│   │       ├── block_0.csv
│   │       ├── block_1.csv
│   │       ├── ...
│   │       ├── block_109.csv
│   ├── acorn_details.csv
│   ├── informations_households.csv
│   ├── uk_bank_holidays.csv
│   ├── weather_daily_darksky.csv
│   ├── weather_hourly_darksky.csv

The extraction process may leave additional files behind. You can remove them without impacting anything. A helper script checks this structure:

python test_data_download.py
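
To give a sense of what such a structure check involves, here is a minimal sketch using pathlib. It is not the book's test_data_download.py. It assumes the five CSV files sit directly inside data/london_smart_meters (the folder the manual download unzips into) and that the block_*.csv files sit in the doubly nested hhblock_dataset folder; the function name find_missing_items is hypothetical.

```python
from pathlib import Path

# File names copied from the folder structure listed in this section.
EXPECTED_CSV_FILES = (
    "acorn_details.csv",
    "informations_households.csv",
    "uk_bank_holidays.csv",
    "weather_daily_darksky.csv",
    "weather_hourly_darksky.csv",
)


def find_missing_items(dataset_dir):
    """Return descriptions of expected files missing from dataset_dir."""
    dataset_dir = Path(dataset_dir)
    missing = [name for name in EXPECTED_CSV_FILES if not (dataset_dir / name).is_file()]
    # Assumption: the block files live in hhblock_dataset/hhblock_dataset.
    block_dir = dataset_dir / "hhblock_dataset" / "hhblock_dataset"
    if not (block_dir.is_dir() and any(block_dir.glob("block_*.csv"))):
        missing.append(f"{block_dir.as_posix()}/block_*.csv")
    return missing


if __name__ == "__main__":
    problems = find_missing_items("data/london_smart_meters")
    print("Structure looks complete." if not problems else f"Missing: {problems}")
```

The sketch only checks that at least one block file exists rather than counting them, so it will not notice a partially extracted hhblock_dataset.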

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository. Doing so will help you avoid any potential errors related to the copying and pasting of code.

The code that is provided along with the book is in no way a library but more of a guide for you to start experimenting on. The amount of learning you can derive from the book and code is directly proportional to how much you experiment with the code and stray outside your comfort zone. So, go ahead and start experimenting and putting the skills you pick up in the book to good use.