Machine Learning for Finance

By: Jannes Klaas

Overview of this book

Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including insurance, transactions, and lending. This book explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself. The book is based on Jannes Klaas’ experience of running machine learning training courses for financial professionals. Rather than providing ready-made financial algorithms, the book focuses on advanced machine learning concepts and ideas that can be applied in a wide variety of ways. The book systematically explains how machine learning works on structured data, text, images, and time series. You'll cover generative adversarial learning, reinforcement learning, debugging, and launching machine learning products. Later chapters will discuss how to fight bias in machine learning. The book ends with an exploration of Bayesian inference and probabilistic programming.
Table of Contents (15 chapters)
Machine Learning for Finance
Contributors
Preface
Other Books You May Enjoy
Index

Index

A

  • activation function / A logistic regressor
  • active learning
    • about / Using less data – active learning
    • labeling budgets, using / Using labeling budgets efficiently
    • leverage machines, for human labeling / Leveraging machines for human labeling
    • pseudo labeling, for unlabeled data / Pseudo labeling for unlabeled data
    • generative models, using / Using generative models
  • activity_regularizer / Regularization in Keras
  • Adam (adaptive moment estimation) / The Adam optimizer
  • Adam optimizer / The Adam optimizer
  • advantage actor-critic (A2C) model
    • about / Advantage actor-critic models
    • pendulum, balancing / Learning to balance
    • trading / Learning to trade
  • aggregate global feature statistics / Aggregate global feature statistics
  • Amazon Mechanical Turk (MTurk) / Fine-tuning the NER
  • Anaconda
    • reference / Running notebooks locally
  • anti-discrimination law
    • disparate impact / Legal perspectives
  • approaches, beyond image classification
    • about / Computer vision beyond classification
    • facial recognition / Facial recognition
    • bounding box prediction / Bounding box prediction
  • asynchronous advantage actor-critic (A3C) / Advantage actor-critic models
  • attention mechanism
    • about / Attention
  • auto-sklearn
    • reference / Learning how to learn
  • autocorrelation / Autocorrelation
  • autoencoders
    • about / Understanding autoencoders
    • for MNIST / Autoencoder for MNIST
    • for credit cards / Autoencoder for credit cards
  • Automatic Differentiation Variational Inference (ADVI) / From probabilistic programming to deep probabilistic programming
  • autoregression / ARIMA
  • Autoregressive Integrated Moving Average (ARIMA)
    • about / ARIMA
  • AutoWEKA
    • reference / Learning how to learn
  • AWS deep learning AMI
    • reference / Using the AWS deep learning AMI
    • using / Using the AWS deep learning AMI

B

  • backtesting
    • about / A note on backtesting
    • biases / A note on backtesting
  • bag of words classification / Bag-of-words
  • batchnorm
    • about / Batchnorm
  • Bayesian deep learning / Bayesian deep learning
  • Bayesian inference
    • about / An intuitive guide to Bayesian inference
    • flat prior / Flat prior
    • < 50% prior / <50% prior
    • prior / Prior and posterior
    • posterior / Prior and posterior
    • Markov Chain Monte Carlo / Markov Chain Monte Carlo
    • stochastic volatility example / Metropolis-Hastings MCMC
    • probabilistic programming, migrating to deep probabilistic programming / From probabilistic programming to deep probabilistic programming
  • behavioral economics / Understanding the brain through RL
  • Bellman equation
    • about / Markov processes and the bellman equation – A more formal introduction to RL, The Bellman equation in economics
  • biases, backtesting
    • look-ahead bias / A note on backtesting
    • survivorship bias / A note on backtesting
    • psychological tolerance bias / A note on backtesting
    • overfitting / A note on backtesting
  • bias_regularizer / Regularization in Keras
  • bounding box prediction
    • about / Bounding box prediction
    • YOLO approach / Bounding box prediction
  • building blocks, ConvNets in Keras
    • Conv2D / Conv2D
    • padding / Padding
    • input shape / Input shape
    • MaxPooling2D / MaxPooling2D
    • flatten operation / Flatten
    • dense layers / Dense

C

  • catastrophes / Catastrophes are caused by multiple failures
  • Catch
    • about / Catch – a quick guide to reinforcement learning
    • playing / Training to play Catch
  • categorical data / Preparing the data for the Keras library
  • causal learning
    • about / Causal learning
    • causal models, obtaining / Obtaining causal models
    • instrument variables / Instrument variables
    • nonlinear causal models / Non-linear causal models
  • complex system failure
    • unfairness approach / Unfairness as complex system failure
  • complex systems
    • disadvantages / Complex systems are intrinsically hazardous systems
    • executing, in degraded mode / Complex systems run in degraded mode
  • computers / Our journey in this book
  • confusion matrix
    • used, for evaluating heuristic model / Evaluating with a confusion matrix
  • Conv1D / Conv1D
  • Conv2D
    • about / Conv2D
    • kernel size / Kernel size
    • stride size / Stride size
    • padding / Padding
    • input shape / Input shape
    • ReLU activation / ReLU activation
  • ConvNet
    • building blocks, in Keras / The building blocks of ConvNets in Keras
    • training, on MNIST / Training MNIST
    • MNIST model / The model
    • MNIST dataset, loading / Loading the data
    • MNIST dataset, compiling / Compiling and training
    • MNIST dataset, training / Compiling and training
  • ConvNets
    • about / Convolutional Neural Networks
    • filters, on MNIST / Filters on MNIST
    • second filter, adding / Adding a second filter
  • convolve operation
    • using / Examining the sample time series
  • count vector / Bag-of-words
  • covariance stationarity
    • about / Different kinds of stationarity
  • CUDA
    • reference / Installing TensorFlow
  • Cython documentation
    • reference / Speeding up your code with Cython

D

  • data
    • preparing / Preparing the data
  • data, Seq2Seq models
    • about / The data
    • characters, encoding / Encoding characters
  • data debugging
    • about / Debugging data
    • task eligibility, checking / How to find out whether your data is up to the task
    • rules / How to find out whether your data is up to the task
    • enough data situations / What to do if you don't have enough data
    • unit testing / Unit testing data
    • privacy, maintaining / Keeping data private and complying with regulations
    • best practices / Keeping data private and complying with regulations
    • preparation, for training / Preparing the data for training
    • inputs, comparing to predictions / Understanding which inputs led to which predictions
  • data preparation
    • characters, sanitizing / Sanitizing characters
    • lemmatization / Lemmatization
    • target, preparing / Preparing the target
    • training set, preparing / Preparing the training and test sets
    • test set, preparing / Preparing the training and test sets
  • dataset / The data
  • Dataset API
    • reference / Optimizing your pipeline
  • data trap / The feature engineering approach
  • deeper network
    • creating / A deeper network
  • deep learning
    • shortcoming / Learning how to learn
  • deep neural networks / All models are wrong
  • deployment
    • about / Deployment
    • product launch / Launching fast
    • metrics, monitoring / Understanding and monitoring metrics
    • data origin / Understanding where your data comes from
  • dilated and causal convolution / Dilated and causal convolution
  • discrete Fourier transform (DFT) / Fast Fourier transformations
  • disparate sample size / Sources of unfairness in machine learning
  • dropout
    • about / Dropout
  • dummy variable / One-hot encoding

E

  • end-to-end (E2E) modeling
    • about / E2E modeling
  • end-to-end models / Heuristic, feature-based, and E2E models
  • entity embeddings
    • about / Entity embeddings
    • categories, tokenizing / Tokenizing categories
    • input models, creating / Creating input models
    • model, training / Training the model
  • evolutionary strategies (ES) / Evolutionary strategies and genetic algorithms

F

  • 2010 Flash Crash use case / VAEs for time series
  • fair models
    • developing, checklist / A checklist for developing fair models, Is the data biased?
  • false negatives (FN) / Observational fairness
  • false positives (FP) / Observational fairness
  • Fast Fourier transformations / Fast Fourier transformations
  • feature-based models / Heuristic, feature-based, and E2E models
  • feature engineering approach
    • about / The feature engineering approach
    • fraudsters / A feature from intuition – fraudsters don't sleep
    • fraudulent transfer destination / Expert insight – transfer, then cash out
    • fraudulent cash outs / Expert insight – transfer, then cash out
    • balance errors / Statistical quirks – errors in balances
  • feature scaling, ways
    • standardization / Preparing the data for training
    • Min-Max rescaling / Preparing the data for training
    • mean normalization / Preparing the data for training
    • unit length scaling, applying / Preparing the data for training
  • filters
    • applying, on color images / Filters on color images
  • forecasting, with neural nets
    • about / Forecasting with neural networks
    • data preparation / Data preparation
    • data preparation, weekdays / Weekdays
  • forward pass / A forward pass
  • four-fifths rule / Legal perspectives
  • fraud detection
    • SGAN, using / SGANs for fraud detection
  • frontiers, RL
    • about / Frontiers of RL, Understanding the brain through RL
    • multi agents / Multi-agent RL
    • many agents / Multi-agent RL
  • function approximators / Approximating functions
  • functions
    • approximating / Approximating functions

G

  • GANs
    • about / GANs
    • training process / GANs
    • MNIST GAN / A MNIST GAN
    • latent vectors / Understanding GAN latent vectors
    • training tricks / GAN training tricks
  • General Data Protection Regulation (GDPR) / Keeping data private and complying with regulations
  • generative models
    • using / Using generative models
  • genetic algorithms / Evolutionary strategies and genetic algorithms
  • global features
    • about / Visualization and preparation in pandas
    • issues / Aggregate global feature statistics
  • Global Vectors (GloVe) / Loading pretrained word vectors
  • Google Cloud AutoML
    • reference / Learning how to learn
  • graphics processing units (GPUs) / Using the right hardware for your problem
  • Graphics Processing Units (GPUs) / Setting up your workspace
  • Gym
    • reference / Learning to balance

H

  • H2O AutoML
    • reference / Learning how to learn
  • heuristic model
    • about / Heuristic, feature-based, and E2E models, The heuristic approach
    • used, for making predictions / Making predictions using the heuristic model
    • F1 score / The F1 score
    • evaluating, with confusion matrix / Evaluating with a confusion matrix
  • hyper-parameters / Gradient descent
  • hyperas
    • used, for searching hyperparameter / Hyperparameter search with Hyperas
    • reference / Hyperparameter search with Hyperas
    • installation adjustments / Hyperparameter search with Hyperas
  • Hyperopt
    • reference / Learning how to learn
  • hyperparameter
    • searching, with hyperas / Hyperparameter search with Hyperas

I

  • image datasets
    • working with / Working with big image datasets
  • instrumental variables two-stage least squares (IV2SLS) / Instrument variables
  • integrated / ARIMA

J

  • JobLib
    • reference / Flat prior

K

  • Kaggle
    • reference / Using Kaggle kernels, The data, An introductory guide to spaCy
  • Kaggle Kernel demoing marbles
    • reference / Unit testing data
  • Kalman filters / Kalman filters
  • Keras
    • about / A brief introduction to Keras
    • importing / Importing Keras
    • two-layer model / A two-layer model in Keras
    • and TensorFlow / Keras and TensorFlow
    • used, for creating predictive models / Creating predictive models with Keras
    • building blocks, ConvNets / The building blocks of ConvNets in Keras
    • documentation, reference / Augmentation with ImageDataGenerator
  • Keras functional API
    • about / A quick tour of the Keras functional API
  • Keras library
    • data, preparing / Preparing the data for the Keras library
    • nominal data / Preparing the data for the Keras library
    • ordinal data / Preparing the data for the Keras library
    • numerical data / Preparing the data for the Keras library
    • one-hot encoding / One-hot encoding
    • entity embeddings / Entity embeddings
  • kernel_regularizer / Regularization in Keras
  • Kullback-Leibler (KL) divergence / Visualizing latent spaces with t-SNE

L

  • Latent Dirichlet Allocation (LDA) / Topic modeling
  • latent spaces
    • visualizing, with t-SNE / Visualizing latent spaces with t-SNE
  • learning rate / Parameter updates
  • linear step / A logistic regressor
  • Local Interpretable Model-Agnostic Explanations (LIME) / Understanding which inputs led to which predictions
  • logistic regressor
    • about / A logistic regressor
    • Python version / Python version of our logistic regressor
  • LSTM
    • about / LSTM
    • carry / The carry

M

  • machine learning / What is machine learning?
  • marbles
    • reference / Unit testing data
  • Markov Chains
    • reference / Markov processes and the bellman equation – A more formal introduction to RL
  • Markov processes / Markov processes and the bellman equation – A more formal introduction to RL
  • matrix multiplication (matmul) / Tensors and the computational graph
  • mean absolute percentage error (MAPE) / Establishing a training and testing regime
  • mean stationarity
    • about / Different kinds of stationarity
  • median forecasting / Median forecasting
  • ML
    • unfairness, sources / Sources of unfairness in machine learning
  • ML software stack
    • Keras / The machine learning software stack
    • NumPy / The machine learning software stack
    • Pandas / The machine learning software stack
    • Scikit-learn / The machine learning software stack
    • Matplotlib / The machine learning software stack
    • Jupyter / The machine learning software stack
    • about / The machine learning software stack
  • MNIST
    • filters / Filters on MNIST
  • MNIST Autoencoder VAE
    • reference / Autoencoder for MNIST
  • model debugging
    • about / Debugging your model
    • hyperas, used for searching hyperparameter / Hyperparameter search with Hyperas
    • learning rate, searching / Efficient learning rate search
    • learning rate, scheduling / Learning rate scheduling
    • TensorBoard, used for training monitoring / Monitoring training with TensorBoard
    • vanishing gradient problem / Exploding and vanishing gradients
    • exploding gradient problem / Exploding and vanishing gradients
  • model loss
    • measuring / Measuring model loss
    • gradient descent / Gradient descent
    • backpropagation / Backpropagation
    • parameter updates / Parameter updates
    • 1-layer neural network, training / Putting it all together
  • model parameters
    • optimizing / Optimizing model parameters
  • models
    • training, for maintaining fairness measures / Training to be fair
    • interpreting, for ensuring fairness / Interpreting models to ensure fairness
    • inspecting, for unfairness / Unfairness as complex system failure, Complex systems run in degraded mode, Accident-free operation requires experience with failure
  • modularity trade-off / The modularity tradeoff
  • momentum / Momentum
  • Moving Average / ARIMA

N

  • named entity recognition (NER)
    • about / Named entity recognition
    • fine tuning / Fine-tuning the NER
  • neural nets
    • used, for forecasting / Forecasting with neural networks
    • uncertainty / Bayesian deep learning
  • neural networks (NNs) / Approximating functions
  • No-U-Turn Sampler (NUTS) / Metropolis-Hastings MCMC
  • nonlinear causal models / Non-linear causal models
  • nonlinear step / A logistic regressor
  • notebook, executing
    • TensorFlow, installing / Installing TensorFlow
    • Keras, installing / Installing Keras

O

  • observational fairness
    • about / Observational fairness
  • off the shelf AutoML solutions
    • tpot / Learning how to learn
    • auto-sklearn / Learning how to learn
    • AutoWEKA / Learning how to learn
    • H2O AutoML / Learning how to learn
    • Google Cloud AutoML / Learning how to learn
  • one-hot encoding / One-hot encoding
  • overfitting / Creating a test set

P

  • pandas
    • visualization / Visualization and preparation in pandas
    • preparation / Visualization and preparation in pandas
    • reference / Named entity recognition
  • part of speech (POS) tagging
    • about / Part-of-speech (POS) tagging
  • performance tips, machine learning applications
    • about / Performance tips
    • right hardware, using / Using the right hardware for your problem
    • distributed training, using with TF estimators / Making use of distributed training with TF estimators
    • CuDNNLSTM, using / Using optimized layers such as CuDNNLSTM
    • pipeline, optimizing / Optimizing your pipeline
    • Cython, used for speeding up code / Speeding up your code with Cython
    • cache frequent requests / Caching frequent requests
  • predictive models, creating
    • training data, oversampling / Oversampling the training data
  • predictive models, creating with Keras
    • about / Creating predictive models with Keras
    • target, extracting / Extracting the target
    • test set, creating / Creating a test set
    • building / Building the model
    • simple baseline, creating / Creating a simple baseline
    • complex models, building / Building more complex models
  • preexisting social biases / Sources of unfairness in machine learning
  • pretrained models
    • working with / Working with pretrained models
    • VGG16, modifying / Modifying VGG-16
    • random image augmentation / Random image augmentation
    • random image augmentation, with ImageDataGenerator / Augmentation with ImageDataGenerator
  • pretrained word vectors
    • loading / Loading pretrained word vectors
  • principal component analysis (PCA) / Understanding autoencoders
  • Python
    • regex module, using / Using Python's regex module

Q

  • Q-function / Catch – a quick guide to reinforcement learning
  • Q-learning
    • about / Catch – a quick guide to reinforcement learning
    • used, for conversion of RL into supervised learning / Q-learning turns RL into supervised learning
    • exploration / Training to play Catch
  • Q-learning model
    • defining / Defining the Q-learning model
  • qualitative rationale / The feature engineering approach

R

  • recurrent dropout / Recurrent dropout
  • recurrent neural networks (RNN) / Simple RNN
  • regex module
    • using, in Python / Using Python's regex module
    • using, in pandas / Regex in pandas
    • using / When to use regexes and when not to
  • Region-based Convolutional Neural Network (R-CNN) / Bounding box prediction
  • regular expressions
    • about / Regular expressions
  • regularization
    • about / Regularization
    • L2 regularization / L2 regularization
    • L1 regularization / L1 regularization
    • in Keras / Regularization in Keras
  • reinforcement learning
    • about / Reinforcement learning
    • effectiveness of data / The unreasonable effectiveness of data
    • machine learning models / All models are wrong
    • conversion, to supervised learning with Q-learning / Q-learning turns RL into supervised learning
  • reinforcement learning (RL)
    • about / Catch – a quick guide to reinforcement learning, Markov processes and the bellman equation – A more formal introduction to RL
    • frontiers / Frontiers of RL
  • reward functions, designing
    • manual reward shaping / Careful, manual reward shaping
    • inverse reinforcement learning (IRL) / Inverse reinforcement learning
    • human preferences, learning / Learning from human preferences
    • robust RL, creating / Robust RL
  • RL engineering
    • best practices / Designing good reward functions
    • reward functions, designing / Designing good reward functions
  • rule-based matching
    • about / Rule-based matching
    • custom functions, adding to matchers / Adding custom functions to matchers
    • matchers, adding to pipeline / Adding the matcher to the pipeline
    • combining, with learning based systems / Combining rule-based and learning-based systems

S

  • sampling biases / Sources of unfairness in machine learning
  • semi-supervised generative adversarial network (SGAN)
    • about / Using generative models
    • used, for fraud detection / SGANs for fraud detection
    • reference / SGANs for fraud detection
  • semi-supervised learning / Using less data – active learning
  • Seq2Seq models
    • about / Seq2seq models
    • architecture overview / Seq2seq architecture overview
    • data / The data
    • inference models, creating / Creating inference models
    • translations, creating / Making translations, Exercises
  • SHAP (SHapley Additive exPlanation) / Interpreting models to ensure fairness
  • simple model / What to do if you don't have enough data
  • simple RNN / Simple RNN
  • spaCy
    • about / An introductory guide to spaCy, Document similarity with word embeddings
    • Doc instance / An introductory guide to spaCy
    • Vocab class / An introductory guide to spaCy
  • Spearmint
    • reference / Learning how to learn
  • stationarity
    • types / Different kinds of stationarity
    • significance / Why stationarity matters
  • stationarity issues
    • avoiding / When to ignore stationarity issues
  • Stochastic Gradient Descent (SGD) / Compiling the model
  • stochastic volatility
    • reference / Metropolis-Hastings MCMC
  • supervised learning / Supervised learning, Using less data – active learning
  • Synthetic Minority Over-sampling Technique (SMOTE) / Oversampling the training data
  • systematic error / Sources of unfairness in machine learning

T

  • t-SNE algorithm
    • used, for visualizing latent spaces / Visualizing latent spaces with t-SNE
  • Tatoeba project / The data
  • tensors
    • and computational graph / Tensors and the computational graph
  • Term Frequency, Inverse Document Frequency (TF-IDF) / TF-IDF
  • test set / Creating a test set
  • text classification task
    • about / A text classification task
  • time series
    • examining / Examining the sample time series
  • time series models
    • using, with word vectors / Time series models with word vectors
  • time series stationary
    • making / Making a time series stationary
  • topic modeling / Topic modeling
  • tpot
    • reference / Learning how to learn
  • training and testing regime
    • establishing / Establishing a training and testing regime
  • transfer learning / What to do if you don't have enough data
  • tree-based methods
    • about / A brief primer on tree-based methods
    • decision tree / A simple decision tree
    • random forest / A random forest
    • XGBoost / XGBoost
  • Tree-structured Parzen Estimator (TPE) algorithm / Hyperparameter search with Hyperas
  • true negatives (TN) / Observational fairness
  • true payoff probability (TPP) / An intuitive guide to Bayesian inference
  • true positives (TP) / Observational fairness
  • two-layer model, Keras
    • about / A two-layer model in Keras
    • layer, stacking / Stacking layers
    • model, compiling / Compiling the model
    • model, training / Training the model

U

  • unsupervised learning / Unsupervised learning, Using less data – active learning

V

  • vanishing gradient problem / Monitoring training with TensorBoard
  • variance stationarity
    • about / Different kinds of stationarity
  • variational autoencoders (VAEs)
    • about / Variational autoencoders
    • MNIST example / MNIST example
    • Lambda layer, using / Using the Lambda layer
    • Kullback-Leibler divergence / Kullback–Leibler divergence
    • custom loss, creating / Creating a custom loss
    • using, for data generation / Using a VAE to generate data
    • used, for end-to-end fraud detection / VAEs for an end-to-end fraud detection system
    • using, in time series / VAEs for time series

W

  • word embeddings
    • about / Word embeddings
    • document similarity / Document similarity with word embeddings
  • word vectors
    • used, for preprocessing / Preprocessing for training with word vectors
  • workspace
    • setting up / Setting up your workspace, Using Kaggle kernels
    • notebooks, local execution / Running notebooks locally

X

  • eXtreme Gradient Boosting (XGBoost)
    • reference / A brief primer on tree-based methods, XGBoost
    • about / XGBoost

Y

  • You Only Look Once (YOLO) / Bounding box prediction