Machine Learning for Finance

By: Jannes Klaas

Overview of this book

Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including insurance, transactions, and lending. This book explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself. The book is based on Jannes Klaas’ experience of running machine learning training courses for financial professionals. Rather than providing ready-made financial algorithms, the book focuses on advanced machine learning concepts and ideas that can be applied in a wide variety of ways. The book systematically explains how machine learning works on structured data, text, images, and time series. You'll cover generative adversarial learning, reinforcement learning, debugging, and launching machine learning products. Later chapters will discuss how to fight bias in machine learning. The book ends with an exploration of Bayesian inference and probabilistic programming.

Sources of unfairness in machine learning


As we have discussed many times throughout this book, models are a function of the data that they are trained on. Generally speaking, more data leads to smaller errors. Almost by definition, there is less data on minority groups, simply because there are fewer people in those groups.

This disparate sample size can lead to worse model performance for the minority group. This increased error is often referred to as a systematic error. The model may fit the majority group data so closely that the relationships it finds do not apply to the minority group data. Since there is little minority group data, these mistakes contribute little to the overall loss, so the model is not punished much for them.
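The effect of disparate sample sizes can be illustrated with a small synthetic experiment. The following is a minimal sketch, not code from the book: the group sizes (5,000 versus 50), the two slopes, the noise level, and the helper names `make_group`, `fit_line`, and `mse` are all assumptions chosen for illustration. A single line is fitted to the pooled data, and the error is then measured separately for each group.

```python
import random


def make_group(n, slope, rng, noise=0.5):
    """Draw n (x, y) pairs with y = slope * x + Gaussian noise (hypothetical data)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed setup: the two groups follow different relationships,
# and the minority group contributes only 1% of the samples.
maj_x, maj_y = make_group(5000, 2.0, rng)
min_x, min_y = make_group(50, -1.0, rng)

# One model is trained on the pooled data, as is common in practice.
slope, intercept = fit_line(maj_x + min_x, maj_y + min_y)

majority_mse = mse(maj_x, maj_y, slope, intercept)
minority_mse = mse(min_x, min_y, slope, intercept)

print(f"fitted slope: {slope:.2f}")
print(f"majority MSE: {majority_mse:.2f}")
print(f"minority MSE: {minority_mse:.2f}")
```

In this setup the fitted slope stays close to the majority group's relationship, because the 50 minority samples barely move the pooled loss. The minority group's error is therefore much larger, even though the overall error looks acceptable.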

Imagine you are training a credit scoring model, and the clear majority of your data comes from people living in lower Manhattan, and a small minority of it comes from people living in rural areas. Manhattan housing is much more expensive, so the model might learn that you need a very high income to buy an apartment...