Book Image

Machine Learning for Finance

By : Jannes Klaas
Book Image

Machine Learning for Finance

By: Jannes Klaas

Overview of this book

Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including insurance, transactions, and lending. This book explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself. The book is based on Jannes Klaas’ experience of running machine learning training courses for financial professionals. Rather than providing ready-made financial algorithms, the book focuses on advanced machine learning concepts and ideas that can be applied in a wide variety of ways. The book systematically explains how machine learning works on structured data, text, images, and time series. You'll cover generative adversarial learning, reinforcement learning, debugging, and launching machine learning products. Later chapters will discuss how to fight bias in machine learning. The book ends with an exploration of Bayesian inference and probabilistic programming.
Table of Contents (15 chapters)
Machine Learning for Finance
Contributors
Preface
Other Books You May Enjoy
Index

Word embeddings


The order of words in a text matters. Therefore, we can expect higher performance if we do not just look at texts in aggregate but see them as a sequence. This section makes use of a lot of the techniques discussed in the previous chapter; however, here we're going to add a critical ingredient, word vectors.

Words and word tokens are categorical features. As such, we cannot directly feed them into a neural network. Previously, we have dealt with categorical data by turning it into one-hot encoded vectors. Yet for words, this is impractical. Since our vocabulary is 10,000 words, each vector would contain 10,000 numbers that are all zeros except for one. This is highly inefficient, so instead, we will use an embedding.

In practice, embeddings work like a lookup table. For each token, they store a vector. When the token is given to the embedding layer, it returns the vector for that token and passes it through the neural network. As the network trains, the embeddings get optimized...