Index
A
- activation functions
- about / The basic artificial neuron, Activation functions
- sigmoid / Sigmoid and hyperbolic tangent
- hyperbolic tangent / Sigmoid and hyperbolic tangent
- rectifier activation functions / Rectifier activation functions
- softmax function / Softmax
- Actor-Critic TD(0)
- in checkerboard environment / Actor-Critic TD(0) in the checkerboard environment
- AdaBoost / AdaBoost
- AdaBoost, with Scikit-Learn
- example / Example of AdaBoost with Scikit-Learn
- AdaBoost.R2 / AdaBoost.R2
- AdaBoost.SAMME / AdaBoost.SAMME
- AdaBoost.SAMME.R / AdaBoost.SAMME.R
- AdaDelta
- about / AdaDelta
- with Keras / AdaDelta with Keras
- AdaGrad
- about / AdaGrad
- with Keras / AdaGrad with Keras
- Adam
- about / Adam
- with Keras / Adam with Keras
- adjacency matrix / Label propagation
- Adjusted Rand Index / Adjusted Rand Index
- advantage Actor-Critic (A2C) / Actor-Critic TD(0) in the checkerboard environment
- adversarial training / Adversarial training
- affinity matrix / Label propagation
- approaches, ensemble learning
- bagging / Ensemble learning fundamentals
- boosting / Ensemble learning fundamentals
- stacking / Ensemble learning fundamentals
- approaches, spectral clustering
- k-Nearest Neighbors (KNN) / Spectral clustering
- radial basis function (RBF) / Spectral clustering
- artificial neuron / The basic artificial neuron
- assumptions, semi-supervised model
- smoothness assumption / Smoothness assumption
- cluster assumption / Cluster assumption
- manifold assumption / Manifold assumption
- atrous convolution / Atrous convolution
- autoencoders / Autoencoders
- average pooling / Pooling layers
B
- back-propagation algorithm
- about / Back-propagation algorithm
- stochastic gradient descent (SGD) / Stochastic gradient descent
- weight initialization / Weight initialization
- backpropagation through time (BPTT) / Backpropagation through time (BPTT)
- Ball Trees / Ball Trees
- batch normalization (BN) / Batch normalization
- batch normalization (BN), with Keras
- Bayes' theorem / Conditional probabilities and Bayes' theorem
- Bayes accuracy / Underfitting
- Bayesian network
- about / Bayesian networks
- sampling from / Sampling from a Bayesian network
- direct sampling / Direct sampling
- Markov chains / A gentle introduction to Markov chains
- Gibbs sampling / Gibbs sampling
- Metropolis-Hastings sampling / Metropolis-Hastings sampling
- bidimensional discrete convolutions
- about / Bidimensional discrete convolutions
- padding / Strides and padding
- strides / Strides and padding
- binary classification / Label propagation based on Markov random walks
- bootstrap sampling / Ensemble learning fundamentals
- brute-force algorithm / k-Nearest Neighbors
- bucketing / Ensemble learning as model selection
C
- candidate-generating distribution / Metropolis-Hastings sampling
- capacity, models
- defining / Capacity of a model
- Vapnik-Chervonenkis capacity /
- categorical cross-entropy / Categorical cross-entropy
- CD-k algorithm / RBMs
- chain rule of derivatives / Back-propagation algorithm
- chain rule of probabilities / Conditional probabilities and Bayes' theorem
- Chapman-Kolmogorov equation / A gentle introduction to Markov chains
- checkerboard environment
- policy iteration / Policy iteration in the checkerboard environment
- value iteration / Value iteration in the checkerboard environment
- TD(0) algorithm / TD(0) in the checkerboard environment
- Actor-Critic TD(0) / Actor-Critic TD(0) in the checkerboard environment
- SARSA algorithm / SARSA in the checkerboard environment
- Q-learning / Q-learning in the checkerboard environment
- CIFAR
- reference link / An example of a variational autoencoder with TensorFlow
- CIFAR-10
- reference link / Example of DCGAN with TensorFlow
- class rebalancing / Example of label propagation based on Markov random walks
- clique / MRF
- completeness score / Completeness score
- complex checkerboard environment
- temporal difference algorithm / TD(λ) in a more complex Checkerboard environment
- conditional independence / Conditional probabilities and Bayes' theorem
- conditional probability / Conditional probabilities and Bayes' theorem
- consistent estimator / Bias of an estimator
- constant error carousel (CEC) / LSTM
- Constrained Optimization by Linear Approximation (COBYLA) / Example of S3VM
- Contrastive Pessimistic Likelihood Estimation (CPLE) algorithm
- convolutional LSTM / LSTM
- convolutions
- about / Convolutions
- bidimensional discrete convolutions / Bidimensional discrete convolutions
- separable convolution / Separable convolution
- transpose convolution / Transpose convolution
- cost function
- about / Loss and cost functions
- starting point / Loss and cost functions
- local minima / Loss and cost functions
- ridges/local maxima / Loss and cost functions
- plateaus / Loss and cost functions
- global minimum / Loss and cost functions
- examples / Examples of cost functions
- mean squared error / Mean squared error
- Huber cost function / Huber cost function
- Hinge cost function / Hinge cost function
- categorical cross-entropy / Categorical cross-entropy
- regularization / Regularization
- covariance rule
- about / Hebb's rule
- analysis / Analysis of the covariance rule
- example / Example of covariance rule application
- Cramér-Rao bound / The Cramér-Rao bound
- cross-validation / Cross-validation
D
- data / Models and data
- data generating process / Models and data
- DCGAN, with TensorFlow
- example / Example of DCGAN with TensorFlow
- decision stumps / Random forests
- decoder / Autoencoders
- Deep Belief Network (DBN)
- about / DBNs
- reference link / Example of unsupervised DBN in Python
- deep convolutional autoencoder
- with TensorFlow / An example of a deep convolutional autoencoder with TensorFlow
- deep convolutional network, with data augmentation
- deep convolutional network, with Keras
- deep convolutional networks
- about / Deep convolutional networks
- convolutions / Convolutions
- pooling layers / Pooling layers
- padding layers / Other useful layers
- upsampling layers / Other useful layers
- cropping layers / Other useful layers
- flattening layers / Other useful layers
- deep learning / Example of a perceptron with Scikit-Learn
- degree matrix / Label propagation
- denoising autoencoders
- about / Denoising autoencoders
- with TensorFlow / An example of a denoising autoencoder with TensorFlow
- depth multiplier / Separable convolution
- depthwise separable convolution / Separable convolution
- Dijkstra algorithm / Isomap
- dilated convolution / Atrous convolution
- direct sampling
- about / Direct sampling
- example / Example of direct sampling
- Discrete AdaBoost / AdaBoost
- discrete Laplacian operator / Bidimensional discrete convolutions
- dropout / Regularization and dropout, Dropout
- dropout, with Keras
- example / Example of dropout with Keras
- Dunn's partitioning coefficient / Fuzzy C-means
E
- early stopping / Early stopping
- ElasticNet / ElasticNet
- emissions / Hidden Markov Models (HMMs)
- empirical risk / Loss and cost functions
- encoder / Autoencoders
- ensemble learning
- fundamentals / Ensemble learning fundamentals
- using, as model selection / Ensemble learning as model selection
- environment, Reinforcement Learning (RL)
- rewards / Rewards
- checkerboard environment, in Python / Checkerboard environment in Python
- estimator
- bias, measuring / Bias of an estimator
- underfitting / Underfitting
- variance, measuring / Variance of an estimator
- overfitting / Overfitting
- Cramér-Rao bound / The Cramér-Rao bound
- evaluation metrics
- about / Evaluation metrics
- homogeneity score / Homogeneity score
- completeness score / Completeness score
- Adjusted Rand Index / Adjusted Rand Index
- silhouette score / Silhouette score
- Expectation Maximization (EM) algorithm
- about / Models and data, EM algorithm
- parameter estimation, example / An example of parameter estimation
- expected risk / Loss and cost functions
F
- factor analysis (FA) / Factor analysis, Example of AdaBoost with Scikit-Learn
- factor analysis (FA), with Scikit-Learn
- FastICA with Scikit-Learn
- example / An example of FastICA with Scikit-Learn
- feature map / Convolutions
- feature selection / Example of random forest with Scikit-Learn
- feed-forward network / Multilayer perceptrons
- Fisher information / The Cramér-Rao bound
- forward-backward algorithm
- about / Forward-backward algorithm
- forward phase / Forward phase
- backward phase / Backward phase
- HMM parameter estimation / HMM parameter estimation
- Forward Stage-wise Additive Modeling / Gradient boosting
- fuzzy C-means / Fuzzy C-means
- fuzzy C-means, with Scikit-Fuzzy
- fuzzy logic / Fuzzy C-means
G
- Gated recurrent unit (GRU) / GRU
- Gaussian mixture / Gaussian mixture
- Gaussian mixture, with Scikit-Learn
- Generalized Hebbian Rule (GHA) / Sanger's network
- Generative Gaussian mixtures
- about / Generative Gaussian mixtures
- example / Example of a generative Gaussian mixture
- weighted log-likelihood / Weighted log-likelihood
- Gibbs sampling / Gibbs sampling
- Gini impurity / Random forests
- gradient boosting / Gradient boosting
- gradient perturbation / Gradient perturbation
- gradient tree boosting, with Scikit-Learn
- Gram-Schmidt / Sanger's network
- Greedy in the Limit with Infinite Exploration (GLIE) / TD(0) algorithm
H
- Hammersley–Clifford theorem / MRF
- Harmonium / RBMs
- Hebb's rule / Hebb's rule
- He initializer / Weight initialization
- Hidden Markov Models (HMMs)
- about / Hidden Markov Models (HMMs), Addendum to HMMs
- forward-backward algorithm / Forward-backward algorithm
- Viterbi algorithm / Viterbi algorithm
- Hinge cost function / Hinge cost function
- hmmlearn
- reference link / Example of HMM training with hmmlearn
- most likely hidden state sequence, finding / Finding the most likely hidden state sequence with hmmlearn
- HMM parameter estimation / HMM parameter estimation
- HMM training
- hmmlearn / Example of HMM training with hmmlearn
- homogeneity score / Homogeneity score
- Huber cost function / Huber cost function
- hyperbolic tangent / Sigmoid and hyperbolic tangent
I
- independent and identically distributed (i.i.d.) / Models and data
- independent component analysis / Independent component analysis
- inductive learning / Inductive learning
- instance-based learning / k-Nearest Neighbors
- Isomap algorithm
- about / Isomap
- example / Example of Isomap
K
- K-Fold cross-validation
- about / Cross-validation
- Stratified K-Fold / Cross-validation
- Leave-one-out (LOO) / Cross-validation
- Leave-P-out (LPO) / Cross-validation
- K-means / K-means
- K-means++ / K-means++
- K-means, with Scikit-Learn
- example / Example of K-means with Scikit-Learn
- k-Nearest Neighbors (KNN)
- about / k-Nearest Neighbors
- KD Trees / KD Trees
- Ball Trees / Ball Trees
- KD Trees / KD Trees
- Keras
- reference link / Example of MLP with Keras
- SGD with momentum / SGD with momentum in Keras
- KNN, with Scikit-Learn
- example / Example of KNN with Scikit-Learn
- Kohonen / Self-organizing maps
L
- label propagation
- about / Label propagation
- example / Example of label propagation
- label propagation, based on Markov random walks
- label spreading
- about / Label spreading
- example / Example of label spreading
- Laplacian Spectral Embedding
- about / Laplacian Spectral Embedding
- example / Example of Laplacian Spectral Embedding
- Lasso regularization / Lasso
- Latent Dirichlet Allocation (LDA) / MLE and MAP learning
- Leave-one-out (LOO) / Cross-validation
- Leave-P-out (LPO) / Cross-validation
- LeCun initialization / Weight initialization
- likelihood / Conditional probabilities and Bayes' theorem
- Lloyd's algorithm / K-means
- Locally Linear Embedding (LLE)
- about / Locally linear embedding
- example / Example of locally linear embedding
- long short-term memory (LSTM) / LSTM
- long-term depression (LTD) / Hebb's rule
- long-term potentiation (LTP) / Hebb's rule
- loss function
- about / Loss and cost functions
- defining / Loss and cost functions
- LSTM network, with Keras
- example / Example of an LSTM network with Keras
M
- manifold learning
- about / Manifold learning
- Isomap algorithm / Isomap
- Locally Linear Embedding (LLE) / Locally linear embedding
- Markov chains / A gentle introduction to Markov chains
- Markov Decision Process (MDP) / Reinforcement Learning fundamentals, TD(λ) algorithm
- Markov random field (MRF) / MRF
- maximal clique / MRF
- Maximum A Posteriori (MAP) learning / MLE and MAP learning
- Maximum Likelihood Estimation (MLE) learning / MLE and MAP learning, Hebb's rule
- max pooling / Pooling layers
- mean squared error / Mean squared error
- metric multidimensional scaling / Isomap
- Metropolis-Hastings sampling
- about / Metropolis-Hastings sampling
- example / Example of Metropolis-Hastings sampling
- mini-batch gradient descent / Stochastic gradient descent
- MLLE
- reference link / Locally linear embedding
- MLP, with Keras
- example / Example of MLP with Keras
- models
- about / Models and data
- zero-centering / Zero-centering and whitening
- whitening / Zero-centering and whitening
- training set / Training and validation sets
- validation set / Training and validation sets
- cross-validation / Cross-validation
- models, features
- about / Features of a machine learning model
- capacity, defining / Capacity of a model
- estimator bias, measuring / Bias of an estimator
- estimator variance, measuring / Variance of an estimator
- Modified LLE / Locally linear embedding
- momentum / Momentum and Nesterov momentum
- Multilayer Perceptron (MLP)
- about / Multilayer perceptrons
- activation functions / Activation functions
- back-propagation algorithm / Back-propagation algorithm
N
- Nesterov momentum / Momentum and Nesterov momentum
- neural network
- used, in Q-learning / Q-learning using a neural network
- non-parametric models / Models and data
O
- Occam's razor principle / The Cramér-Rao bound
- Oja's rule / Weight vector stabilization and Oja's rule
- optimization algorithms
- about / Optimization algorithms
- gradient perturbation / Gradient perturbation
- momentum / Momentum and Nesterov momentum
- Nesterov momentum / Momentum and Nesterov momentum
- RMSProp / RMSProp
- Adam / Adam
- AdaGrad / AdaGrad
- AdaDelta / AdaDelta
- Ordinary Least Squares (OLS) / Ridge
- overfitting / Overfitting
P
- pandas
- reference link / Example of an LSTM network with Keras
- parametric models / Models and data
- PCA with Scikit-Learn
- about / An example of PCA with Scikit-Learn
- example / An example of PCA with Scikit-Learn
- peephole LSTM / LSTM
- perceptron / Perceptron
- perceptron, with Scikit-Learn
- point of inflection / Loss and cost functions
- policy iteration
- about / Policy iteration
- in checkerboard environment / Policy iteration in the checkerboard environment
- pooling layers / Pooling layers
- Principal Component Analysis (PCA) / Isomap, Principal Component Analysis, Analysis of the covariance rule, Example of AdaBoost with Scikit-Learn
- prior probability / Conditional probabilities and Bayes' theorem
- PyMC3
- reference link / Sampling example using PyMC3
Q
- Q-learning
- about / Q-learning
- in checkerboard environment / Q-learning in the checkerboard environment
- neural network, using / Q-learning using a neural network
R
- random forests / Random forests
- random forests, with Scikit-Learn
- Rayleigh-Ritz method / Locally linear embedding
- rectifier activation functions / Rectifier activation functions
- recurrent networks
- about / Multilayer perceptrons, Recurrent networks
- backpropagation through time (BPTT) / Backpropagation through time (BPTT)
- long short-term memory (LSTM) / LSTM
- Gated recurrent unit (GRU) / GRU
- regularization
- about / Overfitting, Regularization, Regularization and dropout
- Ridge regularization / Ridge
- Lasso regularization / Lasso
- ElasticNet / ElasticNet
- early stopping / Early stopping
- Reinforcement Learning (RL)
- fundamentals / Reinforcement Learning fundamentals
- environment / Environment
- policy / Policy
- representational capacity / Capacity of a model
- Restricted Boltzmann Machines (RBM) / RBMs
- Ridge regularization / Ridge
- RMSProp
- about / RMSProp
- with Keras / RMSProp with Keras
- Rubner-Tavan's network
- about / Rubner-Tavan's network
- example / Example of Rubner-Tavan's network
S
- saddle points / Loss and cost functions
- same padding / Strides and padding
- Sanger's network
- about / Sanger's network
- example / Example of Sanger's network
- SARSA algorithm
- about / SARSA algorithm
- in checkerboard environment / SARSA in the checkerboard environment
- Scikit-Fuzzy
- reference link / Example of fuzzy C-means with Scikit-Fuzzy
- Scikit-Learn
- label propagation / Label propagation in Scikit-Learn
- Self-Organizing Maps (SOMs)
- about / Self-organizing maps
- example / Example of SOM
- semi-supervised model
- scenario / Semi-supervised scenario
- transductive learning / Transductive learning
- inductive learning / Inductive learning
- assumptions / Semi-supervised assumptions
- semi-supervised Support Vector Machines (S3VM)
- about / Semi-supervised Support Vector Machines (S3VM)
- example / Example of S3VM
- separable convolution / Separable convolution
- Sequential Least Squares Programming (SLSQP) / Example of S3VM
- SGD, with momentum
- in Keras / SGD with momentum in Keras
- shattering /
- Shi-Malik spectral clustering algorithm / Spectral clustering
- sigmoid / Sigmoid and hyperbolic tangent
- silhouette score / Silhouette score
- singular value decomposition (SVD) / Zero-centering and whitening, Principal Component Analysis
- softmax function / Models and data, Softmax
- sparse autoencoders / Sparse autoencoders
- sparse coding / Lasso
- sparseness
- adding, to Fashion MNIST deep convolutional autoencoder / Adding sparseness to the Fashion MNIST deep convolutional autoencoder
- spectral clustering / Spectral clustering
- spectral clustering, with Scikit-Learn
- stacking / Ensembles of voting classifiers
- Stagewise Additive Modeling using Multi-class Exponential loss (SAMME) / AdaBoost.SAMME
- Standard K-Fold / Cross-validation
- stochastic gradient descent (SGD) / Mean squared error, Stochastic gradient descent, TD(λ) algorithm
- Stratified K-Fold / Cross-validation
- supervised DBN, with Python
- example / Example of Supervised DBN with Python
- Support Vector Machine (SVM) / Cross-validation, Semi-supervised Support Vector Machines (S3VM), DBNs, Ensemble learning fundamentals
- synaptic weight vector / The basic artificial neuron
T
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
- about / t-SNE
- example / Example of t-distributed stochastic neighbor embedding
- TD(0) algorithm
- about / TD(0) algorithm
- in checkerboard environment / TD(0) in the checkerboard environment
- temporal difference algorithm
- about / TD(0) algorithm, TD(λ) algorithm
- in complex checkerboard environment / TD(λ) in a more complex Checkerboard environment
- TensorFlow
- installation link / Example of MLP with Keras, An example of a deep convolutional autoencoder with TensorFlow
- Tikhonov regularization / Ridge
- training set / Training and validation sets
- transductive learning / Transductive learning
- Transductive Support Vector Machines (TSVM)
- about / Transductive Support Vector Machines (TSVM)
- example / Example of TSVM
- transfer learning / Transfer learning
- transition probability / A gentle introduction to Markov chains
- transpose convolution / Transpose convolution
- truncated backpropagation through time (TBPTT) / Backpropagation through time (BPTT)
U
- unbiased estimator / Bias of an estimator
- underfitting / Underfitting
- unsupervised DBN, in Python
- example / Example of unsupervised DBN in Python
V
- validation set / Training and validation sets
- valid padding / Strides and padding
- value iteration
- about / Value iteration
- in checkerboard environment / Value iteration in the checkerboard environment
- vanishing gradients / Sigmoid and hyperbolic tangent, Back-propagation algorithm
- Vapnik-Chervonenkis capacity /
- Vapnik-Chervonenkis theory /
- variance scaling / Weight initialization
- variational autoencoder (VAE)
- about / Variational autoencoders
- with TensorFlow / An example of a variational autoencoder with TensorFlow
- VC-capacity /
- VC-dimension /
- Viterbi algorithm / Viterbi algorithm
- voting classifiers
- ensemble, creating / Ensembles of voting classifiers
- voting classifiers, with Scikit-Learn
W
- Wasserstein GAN (WGAN) / Wasserstein GAN (WGAN)
- weighted log-likelihood / Weighted log-likelihood
- weight initialization / Weight initialization
- weight shrinkage / Ridge
- weight vector
- about / The basic artificial neuron
- stabilization / Weight vector stabilization and Oja's rule
- WGAN, with TensorFlow
- example / Example of WGAN with TensorFlow
- whitening
- about / Zero-centering and whitening
- advantages / Zero-centering and whitening
- winner-takes-all / Self-organizing maps
X
- Xavier initialization / Weight initialization
Z
- zero-centering / Zero-centering and whitening