Machine Learning for Time-Series with Python - Second Edition

By: Ben Auffarth

Overview of this book

The Python time-series ecosystem is huge and can be challenging to navigate, since there are so many new libraries and models. Machine Learning for Time-Series with Python, Second Edition, aims to deepen your understanding of time series by providing a comprehensive overview of popular Python time-series packages and helping you build better predictive systems. This fully updated second edition starts by re-introducing the basics of time series and then helps you get to grips with traditional autoregressive models as well as modern non-parametric models. By working through practical examples and the theory behind them, you will learn how to load time-series datasets from any source and gain a deeper understanding of a variety of models, such as deep learning recurrent neural networks, causal convolutional network models, and gradient boosting with feature engineering. This book will also help you choose the right model for the right problem by explaining the theory behind several useful models. New updates include a chapter on forecasting and extracting signals on financial markets, and case studies with relevant examples from operations management, digital marketing, and healthcare. By the end of this book, you should feel at home analyzing time series and applying machine learning methods to them.

How can we approach time series problems?

Given a time series problem, there is no one-size-fits-all approach, as it depends on the specific problem that you are trying to solve. In fact, there are many different approaches and ways to solve problems. The choice of method will depend on the type of data, the nature of the problem, and the desired outcome. However, there is a set of problem-solving techniques that can help guide your approach to any time series problem.

Time series analysis (TSA) comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Importing the data can be considered a step that precedes time series analysis, and data cleaning, feature engineering, and training a machine learning model are not strictly part of time series analysis. We'll discuss TSA in chapter 2, Time-Series Analysis.

The steps belonging to TSA and leading to preprocessing (feature engineering) and machine learning are highly iterative, as illustrated in the following time-series machine learning flywheel:

Figure 2.1: The time-series machine learning flywheel

This flywheel emphasizes the iterative nature of the work. For example, data cleaning often comes right after loading the data, but it will come up again after we've made another discovery about our variables. I've highlighted TSA in a dark shade, while steps that are not strictly part of TSA are grayed out.

In summary, we have these steps:

  1. Define the problem: Is it a classification or regression problem? What are the inputs and outputs? What are the steps involved in solving the problem?
  2. Explore the data: This step involves visualizing the data to understand the trends and patterns. This helps in developing the intuition for building the machine learning model.
  3. Preprocess the data: This step involves cleaning the data, dealing with missing values, and transforming the data so that it can be used by the machine learning algorithm. This step is crucial and includes feature engineering, scaling, and normalization.
  4. Decompose the time series: This step is useful in understanding the trend, seasonality, and noise in the data. It is necessary for certain kinds of methods, but not for others.
  5. Build models: Once the data is preprocessed, you can build models using traditional machine learning algorithms or time series specific algorithms.
  6. Evaluate the models: This step includes comparing the performance of different models and choosing the best one. It involves assessing the performance of the machine learning model on unseen data. This helps in fine-tuning the model and making it ready for deployment.
  7. Make predictions: This step involves using the final model to make predictions on new data.
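
To make the steps above more concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the book's own code: the toy series, the helper names (`fill_missing`, `chronological_split`, `seasonal_naive_forecast`, `mean_absolute_error`), the forward-fill strategy, and the seasonal period of 4 are all hypothetical choices. The seasonal naive forecast stands in for a model here; it is a common baseline, not a recommendation.

```python
from statistics import mean


def fill_missing(values):
    """Preprocess: forward-fill None entries with the last observed value."""
    filled, last = [], None
    for v in values:
        if v is None:
            if last is None:
                raise ValueError("series starts with a missing value")
            v = last
        filled.append(v)
        last = v
    return filled


def chronological_split(values, test_size):
    """Hold out the last `test_size` points; the order of time is preserved."""
    return values[:-test_size], values[-test_size:]


def seasonal_naive_forecast(history, horizon, period):
    """Build a baseline model: repeat the last observed seasonal cycle."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Evaluate: average absolute deviation between actual and predicted."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical toy data with a cycle of length 4 and one missing value.
raw = [10, 20, 30, 20, 11, None, 31, 21, 12, 22, 32, 22]

series = fill_missing(raw)                                # preprocess
train, test = chronological_split(series, test_size=4)   # keep unseen data aside
forecast = seasonal_naive_forecast(train, horizon=len(test), period=4)
mae = mean_absolute_error(test, forecast)                 # evaluate on unseen data

print("forecast:", forecast)
print("MAE on held-out data:", mae)
```

Note that the split is chronological rather than random: evaluating on the most recent observations mimics how the model would be used to predict new data.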

There are many ways to deal with time series. Let's briefly go through the main categories of approaches: traditional time series methods, machine learning, and deep learning. All of them can be used for problems such as forecasting, classification, and regression.

Traditional time series methods are the most common approach to dealing with time series data. They involve using statistical methods to model the data and make predictions. They are simple and easy to interpret. A downside is that many of them assume that the data is stationary.

Machine learning methods can be applied to data that is not stationary. The downside is that applications can be more complex and results more difficult to understand.

Deep learning is a branch of machine learning that uses neural networks to learn from data. These neural networks often have many parameters, which gives the model the potential to capture a wide range of phenomena, but they require a lot of data to tune, and their predictions can often be highly variable.

To summarize: the best approach depends on the data, the problem, and the resources available.
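
The stationarity assumption mentioned above for traditional methods can be illustrated with a small sketch. Roughly speaking, a stationary series has statistical properties, such as its mean, that do not change over time. The example below uses a hypothetical series with a linear trend and shows that first-order differencing, one common transformation, removes the drift in the mean. The helper names `difference` and `half_means` are made up for this illustration, and comparing the means of two halves is only an informal check, not a formal statistical test.

```python
from statistics import mean


def difference(values, lag=1):
    """Return the lagged differences values[i] - values[i - lag]."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def half_means(values):
    """Informal check: compare the mean of the first and second half."""
    mid = len(values) // 2
    return mean(values[:mid]), mean(values[mid:])


# Hypothetical series: a linear trend (2 * t) plus an alternating wiggle.
series = [2 * t + (1 if t % 2 == 0 else -1) for t in range(21)]
diffed = difference(series)

raw_first, raw_second = half_means(series)
diff_first, diff_second = half_means(diffed)

# The mean of the raw series drifts upward because of the trend ...
print("raw halves:", raw_first, raw_second)
# ... while the differenced series has the same mean in both halves.
print("differenced halves:", diff_first, diff_second)
```

In practice, a formal test such as the augmented Dickey-Fuller test (available in statsmodels) would usually be preferred over this kind of eyeball comparison.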