Machine Learning for Time Series with Python - Second Edition
Our forecasting engines had never been more advanced. Yet that very complexity created a dangerous paradox. In March 2020, as COVID‑19 swept the globe, these state‑of‑the‑art systems at major retailers began producing wildly inaccurate predictions.
At Target, machine-learning models trained on years of stable history couldn't explain an 845 percent surge in toilet-paper demand or an 89 percent collapse in apparel sales. At Amazon, inventory algorithms that had expertly balanced millions of SKUs suddenly flooded warehouses with home-gym and office gear nobody needed, while critical cleaning and health products vanished from shelves. These failures weren't the result of simple spreadsheet errors. They came from forecasting pipelines, some pure ML and others decomposition-based, that either omitted an explicit seasonality-and-anomaly step or relied on static, pre-pandemic settings. Instead of flagging the March–April 2020 spikes as one-off shocks, the pipelines folded them into their trend and seasonal baselines, yielding persistent over- and under-forecasts until the models were hurriedly re-tuned.
These were ML systems built by world-class data science teams using state-of-the-art methods, not simple statistical models running on outdated spreadsheets. Yet, within weeks, billions of dollars in forecasting infrastructure were rendered nearly useless. Walmart's demand-planning engines, which normally allocate stock across 4,700 stores with surgical precision, began issuing orders that bore no relation to real buying patterns. CVS Health's prescription models, built on years of consistent patient behavior, missed the surge in anxiety-drug scripts and failed to anticipate the drop in routine refills as people skipped doctor visits. Across retail and healthcare, systems once lauded for their high validation scores and rigorous A/B tests became sources of systematic error instead of competitive advantage.
In the next section, we'll examine the technical sophistication that made these failures so surprising and instructive.
These failures reveal why time series requires different thinking. The sophisticated feature engineering that works well for cross-sectional data can become actively harmful when underlying patterns change faster than models can adapt. These organizations were deploying methods that represented the pinnacle of data science best practices circa 2020.
Yet when consumer behavior shifted faster than their training pipelines could adapt, even the most advanced machine learning became a liability. The sophisticated feature engineering that captured nuanced seasonal patterns became actively misleading when seasonality disappeared overnight. The ensemble methods that provided robust predictions during normal times amplified errors when all component models failed simultaneously.
Understanding temporal dependencies is crucial because they create both opportunities and challenges. Past values can provide powerful predictive information, but the patterns that link them to the future can change over time. This phenomenon is called concept drift: the statistical relationship between inputs and the target shifts, so a model trained on yesterday's distribution gradually becomes wrong on tomorrow's. The same dependencies that make forecasting possible also make validation harder. The data-generating process itself moves with the calendar, so a holdout score collected last quarter cannot certify how the model will behave this quarter. A validation framework for time series therefore has to score both how well a model fits the past and how quickly that fit decays as the world drifts.
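To make the two quantities concrete (fit on the past, and decay of that fit on the future), here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration, not a method from the case studies above: the drifting slope, the no-intercept linear model, and the helper names `fit_slope` and `mae` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): the slope linking the
# input x to the target y drifts slowly over time, i.e. concept drift.
n = 600
x = rng.normal(size=n)
slope = np.linspace(1.0, 3.0, n)  # the input-target relationship changes with time
y = slope * x + rng.normal(scale=0.1, size=n)


def fit_slope(x_train, y_train):
    """Least-squares slope for a no-intercept linear model y ~ b * x."""
    return float(x_train @ y_train / (x_train @ x_train))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Fit once on the first 200 observations, then freeze the model.
train_end = 200
b = fit_slope(x[:train_end], y[:train_end])

# Quantity 1: how well the model fits the past.
fit_error = mae(y[:train_end], b * x[:train_end])

# Quantity 2: how the frozen model's error evolves on successive future blocks.
block = 100
decay = [
    mae(y[start:start + block], b * x[start:start + block])
    for start in range(train_end, n, block)
]

print(f"fit on training window: {fit_error:.3f}")
for i, err in enumerate(decay, start=1):
    print(f"future block {i}: {err:.3f}")
```

In this setup the error on later blocks grows well beyond the in-sample error, because the frozen slope keeps falling further behind the true one. Real drift is rarely this smooth, so the sketch only illustrates the shape of the measurement.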
Before we can build systems that work reliably when patterns change, we need to establish what makes time series fundamentally different from the static datasets most machine learning methods assume.
The key insight is that time series data violates a fundamental assumption of most machine learning methods: that observations are independent and identically distributed. When we treat temporal observations as independent data points that just happen to be ordered chronologically, we miss the crucial dependencies between past and future values. This is why techniques that work well for cross-sectional data—like random train/test splits or standard feature engineering—can fail with time series.
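The effect of a random split on dependent data can be seen in a small sketch. It assumes a synthetic random walk and a deliberately naive predictor that copies the target of the closest training timestamp. Both are hypothetical choices for illustration (as are the helper names), chosen because this predictor benefits directly from having future neighbours in its training set.

```python
import numpy as np

rng = np.random.default_rng(42)

# Random walk: each value depends on the previous one, so observations are not i.i.d.
n = 1000
t = np.arange(n)
y = np.cumsum(rng.normal(size=n))


def nearest_in_time_predict(t_train, y_train, t_test):
    """Predict each test point with the target of the closest training timestamp."""
    idx = np.abs(t_test[:, None] - t_train[None, :]).argmin(axis=1)
    return y_train[idx]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Random split: test points are interleaved with training points, so most
# test points have training neighbours from both their past and their future.
perm = rng.permutation(n)
test_idx, train_idx = perm[:200], perm[200:]
random_mae = mae(
    y[test_idx],
    nearest_in_time_predict(t[train_idx], y[train_idx], t[test_idx]),
)

# Chronological split: train on the first 800 steps, test on the last 200.
# The predictor can only reuse the last value it saw before the test window.
split = 800
chrono_mae = mae(
    y[split:],
    nearest_in_time_predict(t[:split], y[:split], t[split:]),
)

print(f"random split MAE:        {random_mae:.2f}")
print(f"chronological split MAE: {chrono_mae:.2f}")
```

The random split reports a much smaller error than the chronological one, even though the model and data are identical. Only the chronological number reflects what the model would face in deployment, where the future is never available at training time. For cross-validation with the same constraint, scikit-learn's `TimeSeriesSplit` produces folds whose training indices always precede the test indices.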