Modern Time Series Forecasting with Python

By: Manu Joseph

Overview of this book

We live in a serendipitous era in which the explosion in the amount of data collected and a renewed interest in data-driven techniques such as machine learning (ML) have changed the landscape of analytics and, with it, time series forecasting. This book, filled with industry-tested tips and tricks, takes you beyond commonly used classical statistical methods such as ARIMA and introduces you to the latest techniques from the world of ML. It is a comprehensive guide to analyzing, visualizing, and creating state-of-the-art forecasting systems, covering common topics such as ML and deep learning (DL) as well as rarely touched-upon topics such as global forecasting models, cross-validation strategies, and forecast metrics. You’ll begin by exploring the basics of data handling, data visualization, and classical statistical methods before moving on to ML and DL models for time series forecasting. This book takes you on a hands-on journey in which you’ll develop state-of-the-art ML (linear regression to gradient-boosted trees) and DL (feed-forward neural networks, LSTMs, and transformers) models on a real-world dataset, along with exploring practical topics such as interpretability. By the end of this book, you’ll be able to build world-class time series forecasting systems and tackle problems in the real world.
Table of Contents (26 chapters)

  • Part 1 – Getting Familiar with Time Series
  • Part 2 – Machine Learning for Time Series
  • Part 3 – Deep Learning for Time Series
  • Part 4 – Mechanics of Forecasting

Forecasting terminology

A few terms will help you follow this book as well as other literature on time series. They are described in more detail here:

  • Forecasting

Forecasting is the prediction of future values of a time series using the known past values of the time series and/or some other related variables. This is very similar to prediction in ML, where we use a model to predict unseen data.

  • Multivariate forecasting

A multivariate time series consists of more than one time series variable, each of which depends not only on its own past values but also, to some extent, on the other variables. For example, a set of macroeconomic indicators of a particular country, such as gross domestic product (GDP), inflation, and so on, can be considered a multivariate time series. The aim of multivariate forecasting is to come up with a model that captures the interrelationships between the different variables, along with each variable's relationship with its own past, and forecasts all the time series together into the future.

  • Explanatory forecasting

In addition to the past values of a time series, we might use some other information to predict its future values. For example, when predicting retail store sales, information regarding promotional offers (both historical and future ones) is usually helpful. This type of forecasting, which uses information beyond the series' own history, is called explanatory forecasting.
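To make the retail example concrete, here is a minimal sketch of how the inputs to an explanatory forecast might be assembled. The data, the `make_rows` helper, and the choice of two lags are all hypothetical and only for illustration; any regression model could then be trained on the resulting rows.

```python
# Hypothetical toy data: daily sales and a 0/1 promotion flag.
# promo has one extra value: the promotion already planned for the next day.
sales = [100, 102, 150, 101, 99, 155, 103]
promo = [0, 0, 1, 0, 0, 1, 0, 1]


def make_rows(sales, promo, n_lags=2):
    """Build (features, target) pairs from lagged sales plus the promo flag."""
    rows = []
    for t in range(n_lags, len(sales)):
        lags = sales[t - n_lags:t]      # the series' own history
        features = lags + [promo[t]]    # plus explanatory information for day t
        rows.append((features, sales[t]))
    return rows


rows = make_rows(sales, promo)

# Features for the next, not yet observed, day: the last two sales values
# and the promotion flag that is known in advance.
next_features = sales[-2:] + [promo[len(sales)]]
```

The point of the sketch is that the promotion flag for the forecast day is known ahead of time, so it can sit alongside the lagged sales values as a model input.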

  • Backtesting

Setting aside a validation set from your training data to evaluate your models is a practice that is common in the ML world. Backtesting is the time series equivalent of validation, whereby you use the history to evaluate a trained model. We will cover the different ways of doing validation and cross-validation for time series data later.
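As a rough illustration, one common way to backtest is an expanding-window scheme, sketched below under stated assumptions: the `expanding_window_splits` helper, the toy series, and the naive "repeat the last value" forecast are all hypothetical stand-ins, not the specific strategies covered later in the book.

```python
def expanding_window_splits(n_obs, initial_train, horizon):
    """Yield (train_indices, test_indices) pairs that never look into the future."""
    end = initial_train
    while end + horizon <= n_obs:
        yield list(range(end)), list(range(end, end + horizon))
        end += horizon  # grow the training window by one horizon


series = [10, 12, 13, 12, 15, 16, 18, 17, 19, 21]  # hypothetical history

errors = []
for train_idx, test_idx in expanding_window_splits(
    len(series), initial_train=4, horizon=2
):
    # Naive stand-in model: repeat the last value seen in training.
    last_value = series[train_idx[-1]]
    for i in test_idx:
        errors.append(abs(series[i] - last_value))

# Mean absolute error across all backtest windows.
backtest_mae = sum(errors) / len(errors)
```

Unlike a random train/validation split, every evaluation window here lies strictly after the data used to produce its forecast, which is what distinguishes backtesting from ordinary validation.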

  • In-sample and out-sample

Again, drawing parallels with ML, in-sample refers to training data and out-sample (often written out-of-sample) refers to unseen or testing data. In-sample metrics are metrics calculated on training data, and out-sample metrics are metrics calculated on testing data.
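The distinction can be sketched with a deliberately simple model. In the example below, the series, the split point, and the "forecast the training mean" model are hypothetical choices made only to keep the arithmetic easy to follow.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [10, 12, 11, 13, 12, 14, 18, 20]  # hypothetical data
train, test = series[:6], series[6:]       # time-ordered split

# Stand-in model: always predict the mean of the training data.
level = sum(train) / len(train)

in_sample_mae = mae(train, [level] * len(train))   # metric on training data
out_sample_mae = mae(test, [level] * len(test))    # metric on unseen data
```

In this toy case the out-sample error is much larger than the in-sample error because the series moves upward after the training period, which is exactly the kind of gap that in-sample metrics alone would hide.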

  • Exogenous and endogenous variables

Exogenous variables are parallel time series variables that are not modeled directly for output but used to help us model the time series that we are interested in. Typically, exogenous variables are not affected by other variables in the system. Endogenous variables are variables that are affected by other variables in the system. A purely endogenous variable is a variable that is entirely dependent on the other variables in the system. Relaxing the strict assumptions a bit, we can consider the target variable as the endogenous variable and the explanatory regressors we include in the model as exogenous variables.

  • Forecast combination

Forecast combination in the time series world is similar to ensembling in the ML world. It is the process of combining multiple forecasts using some function, either learned or heuristic-based, such as a simple average of the forecasts from three models.
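A minimal sketch of the heuristic case follows. The `combine` helper, the three forecast lists, and the fixed weights are hypothetical; in a learned combination, the weights would instead be estimated from past forecast errors.

```python
def combine(forecasts, weights=None):
    """Combine several forecasts step by step; defaults to a simple average."""
    n_models = len(forecasts)
    if weights is None:
        weights = [1 / n_models] * n_models
    horizon = len(forecasts[0])
    return [
        sum(w * f[h] for w, f in zip(weights, forecasts))
        for h in range(horizon)
    ]


# Hypothetical three-step-ahead forecasts from three different models.
model_a = [100.0, 110.0, 120.0]
model_b = [90.0, 100.0, 130.0]
model_c = [110.0, 120.0, 110.0]

simple_average = combine([model_a, model_b, model_c])
weighted = combine([model_a, model_b, model_c], weights=[0.5, 0.3, 0.2])
```

The simple average treats all models equally, while the weighted version leans on whichever model is trusted more; both return one combined forecast per step of the horizon.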

There are many more terms specific to time series, some of which we will cover throughout the book. For a basic familiarity with the field, though, these terms should be enough.