Machine Learning for Time-Series with Python

By: Ben Auffarth

Overview of this book

The Python time-series ecosystem is huge and can be hard to get a good grasp on, especially since there are so many new libraries and new models. This book aims to deepen your understanding of time series by providing a comprehensive overview of popular Python time-series packages and helping you build better predictive systems. Machine Learning for Time-Series with Python starts by re-introducing the basics of time series and then builds your understanding of traditional autoregressive models as well as modern non-parametric models. By working through practical examples and the theory behind them, you will become confident with loading time-series datasets from any source, with deep learning models such as recurrent neural networks and causal convolutional networks, and with gradient boosting with feature engineering. This book will also guide you in matching the right model to the right problem by explaining the theory behind several useful models. You’ll also look at real-world case studies covering weather, traffic, biking, and stock market data. By the end of this book, you should feel at home analyzing time series and applying machine learning methods to them.

Bandit algorithms

A Multi-Armed Bandit (MAB) is a classic reinforcement learning problem in which a player faces a slot machine (bandit) that has k levers (arms), each with a different reward distribution. The player, or agent, aims to maximize its cumulative reward over a sequence of trials. Because MABs are a simple but powerful framework for algorithms that make decisions over time under uncertainty, a large number of research articles have been dedicated to them.

Bandit learning refers to algorithms that aim to optimize a single unknown stationary objective function. In each round, an agent chooses an action from a set of available actions, and the environment reveals the reward of the chosen action at time t. As information accumulates over multiple rounds, the agent can build a good representation of the value (or reward) distribution of each arm.
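The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's own code: the `BernoulliBandit` class, the `run_epsilon_greedy` function, the reward probabilities, and the choice of an epsilon-greedy policy are all hypothetical and were picked only because they keep the example short. The agent pulls an arm, observes a reward, and updates a running average of the observed rewards for that arm.

```python
import random


class BernoulliBandit:
    """Hypothetical k-armed bandit: arm i pays 1.0 with probability probs[i], else 0.0."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def pull(self, arm):
        # The environment reveals the reward of the chosen arm only.
        return 1.0 if self.rng.random() < self.probs[arm] else 0.0


def run_epsilon_greedy(bandit, n_rounds=5000, epsilon=0.1, seed=1):
    """Epsilon-greedy policy (an assumption made for illustration)."""
    rng = random.Random(seed)
    k = len(bandit.probs)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # sample-average reward estimate per arm
    total_reward = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                         # explore
        else:
            arm = max(range(k), key=lambda a: values[a])   # exploit

        reward = bandit.pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return values, counts, total_reward


bandit = BernoulliBandit([0.2, 0.5, 0.8])
values, counts, total_reward = run_epsilon_greedy(bandit)
print("estimated values:", values)
print("pull counts:", counts)
print("cumulative reward:", total_reward)
```

With enough rounds, the estimate for the most frequently pulled arm should approach its true reward probability, while rarely pulled arms keep noisier estimates. The fixed exploration rate `epsilon` is the simplest way to keep sampling all arms; other policies trade off exploration and exploitation differently.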

Therefore, a good policy might converge so that the choice of arm becomes optimal. According to one policy, UCB1 (published by Peter Auer, Nicol...