#### Overview of this book

Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including insurance, transactions, and lending. This book explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself. The book is based on Jannes Klaas’ experience of running machine learning training courses for financial professionals. Rather than providing ready-made financial algorithms, the book focuses on advanced machine learning concepts and ideas that can be applied in a wide variety of ways. The book systematically explains how machine learning works on structured data, text, images, and time series. You'll cover generative adversarial learning, reinforcement learning, debugging, and launching machine learning products. Later chapters will discuss how to fight bias in machine learning. The book ends with an exploration of Bayesian inference and probabilistic programming.

## Median forecasting

Medians are a good sanity check and an often underrated forecasting tool. A median is the value that separates the higher half of a distribution from the lower half; it sits exactly in the middle of the distribution. Medians smooth out noise and are less susceptible to outliers than means, and because they simply capture the midpoint of a distribution, they are also easy to compute.
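To see the robustness to outliers in practice, here is a minimal sketch comparing the mean and the median of a small series containing one extreme spike. The numbers are made up purely for illustration.

```python
import numpy as np

# Hypothetical values with one extreme spike (made-up numbers).
values = np.array([10.0, 12.0, 11.0, 13.0, 500.0])

mean_value = np.mean(values)      # pulled far upward by the single spike
median_value = np.median(values)  # middle value of the sorted data

print(f"mean:   {mean_value}")
print(f"median: {median_value}")
```

The spike drags the mean far away from the typical values, while the median stays at the middle observation.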

To make a forecast, we compute the median over a look-back window in our training data. In this case, we use a window size of 50, but you could experiment with other values. The next step is to select the last 50 values from our X values and compute the median.

Take a minute to note that in the NumPy median function, we have to set `keepdims=True`. This ensures that we keep a two-dimensional matrix rather than a flat array, which is important when computing the error. So, to make a forecast, we need to run the following code:

```
lookback = 50

lb_data = X_train...
```
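The steps described above can be sketched end to end as follows. This is only an illustration under stated assumptions, not the book's own code: the data is synthetic, `X_train` is assumed to be a two-dimensional array of shape `(n_series, n_timesteps)`, `y_train` is assumed to hold one target value per series, and `mape` is a hypothetical helper standing in for whatever error metric you use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 4 series with 200 past time steps each,
# and one future value per series to forecast.
X_train = rng.poisson(lam=100, size=(4, 200)).astype(float)
y_train = rng.poisson(lam=100, size=(4, 1)).astype(float)

lookback = 50

# Keep only the last `lookback` time steps of every series.
lb_data = X_train[:, -lookback:]

# keepdims=True yields shape (4, 1) instead of (4,), matching y_train.
forecast = np.median(lb_data, axis=1, keepdims=True)

# Without keepdims the result is flat; subtracting it from y_train would
# silently broadcast to a (4, 4) matrix instead of an element-wise difference.
flat = np.median(lb_data, axis=1)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (hypothetical helper)."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100


print(forecast.shape, flat.shape, (y_train - flat).shape)
print(mape(y_train, forecast))
```

The shape printout shows why `keepdims=True` matters: the flat version does not raise an error when combined with `y_train`, it just produces a result of the wrong shape, which would distort the error value.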