Scala for Machine Learning, Second Edition

Overview of this book

The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, engineering design, logistics, manufacturing, and trading strategies to the detection of genetic anomalies. This book is a one-stop guide to the functional capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to the creation of machine learning algorithms. You start by learning data preprocessing and filtering techniques. You then move on to unsupervised learning techniques such as clustering and dimension reduction, followed by probabilistic graphical models such as Naïve Bayes, hidden Markov models, and Monte Carlo inference. The book then covers discriminative algorithms such as linear and logistic regression with regularization, kernelization, support vector machines, neural networks, and deep learning. You'll move on to evolutionary computing, multi-armed bandit algorithms, and reinforcement learning. Finally, the book includes a comprehensive overview of parallel computing in Scala and Akka, followed by a description of Apache Spark and its ML library. With updated code based on the latest version of Scala and comprehensive examples, this book will ensure that you have more than just a solid fundamental knowledge of machine learning with Scala.

Scala for Machine Learning Second Edition

Credits

About the Author

About the Reviewers

www.PacktPub.com

Customer Feedback

Preface

Getting Started

Mathematical notations for the curious

Why machine learning?

Model categorization

Taxonomy of machine learning algorithms

Leveraging Java libraries

Tools and frameworks

Let's kick the tires

Data Pipelines

Defining a methodology

Monadic data transformation

Workflow computational model

Assessing a model

Data Preprocessing

Time series in Scala

Moving averages

Fourier analysis

The discrete Kalman filter

Alternative preprocessing techniques

Unsupervised Learning

K-mean clustering

Expectation-Maximization (EM)

Dimension Reduction

Challenging model complexity

The divergences

Principal components analysis (PCA)

Nonlinear models

Naïve Bayes Classifiers

Probabilistic graphical models

Naïve Bayes classifiers

Multivariate Bernoulli classification

Naïve Bayes and text mining

Sequential Data Models

Markov decision processes

The hidden Markov model (HMM)

Conditional random fields

Regularized CRF and text analytics

Comparing CRF and HMM

Performance consideration

Monte Carlo Inference

The purpose of sampling

Gaussian sampling

Monte Carlo approximation

Bootstrapping with replacement

Markov Chain Monte Carlo (MCMC)

Regression and Regularization

Linear regression

Numerical optimization

Logistic regression

Multilayer Perceptron

Feed-forward neural networks (FFNN)

The multilayer perceptron (MLP)

Benefits and limitations

Deep Learning

Sparse autoencoder

Restricted Boltzmann Machines (RBMs)

Convolution neural networks

Kernel Models and SVM

Kernel functions

The support vector machine (SVM)

Performance considerations

Evolutionary Computing

Genetic algorithms and machine learning

Genetic algorithm components

GA for trading strategies

Advantages and risks of genetic algorithms

Multiarmed Bandits

Thompson sampling

Upper bound confidence

Reinforcement Learning

Reinforcement learning

Learning classifier systems

Parallelism in Scala and Akka

Scalability with Actors

Apache Spark MLlib

Apache Spark core

Reusable ML pipelines

Extending Spark

Streaming engine

Performance evaluation

Basic Concepts

Scala programming

Suggested online courses

References

Index

Index

A

A/B testing / Bayesian Bernoulli bandits
accuracy / Key quality metrics
- versus model fitness / Model fitness
action-value iterative update / Action-value iterative update
activation convolution neural network / Convolution layers
activation function / Activation function
Actors
- scalability / Scalability with Actors
adaptive modeling / Model categorization
aggregation effect / Tuning the number of clusters
Akka / Scala as a scalable language
Akka framework
- about / Akka
- URL, for downloading / Akka
- master-workers design / Master-workers
- futures / Futures
Algebird / Other libraries and frameworks
- reference link / Algebraic and numerical libraries
algebraic libraries
- about / Algebraic and numerical libraries
alpha (forward pass) / Alpha (forward pass)
Analysis of Variance (ANOVA) / Challenging model complexity
anomaly / Anomaly detection with one-class SVC
anomaly detection
- with one-class SVC / Anomaly detection with one-class SVC
Apache Commons Math
- reference link / Leveraging Java libraries
- about / Apache Commons Math
- description / Description
- licensing / Licensing
- installation / Installation
- URL, for download / Installation
Apache Commons Math library
- exceptions, handling / Training workflow
Apache Mesos resource manager
- URL / Deploying Spark
Apache License 2.0
- URL / Licensing
Apache Spark / Scala as a scalable language
- about / Apache Spark core, Extending Spark
- need for / Why Spark?
- design principles / Design principles
- experimenting with / Experimenting with Spark
- deploying / Deploying Spark
- URL / Deploying Spark
- shell, using / Using Spark shell
- Kullback-Leibler divergence / Kullback-Leibler divergence
- Kullback-Leibler evaluator / Kullback-Leibler evaluator
area under PRC / Area under PRC
area under ROC / Area under ROC
areaUnderROC (AuROC)
- about / Training the model
attributes
- about / What is a model?
Auto-Regressive Integrated Moving Average (ARIMA) / Alternative preprocessing techniques
Auto-Regressive Moving Average (ARMA) / Alternative preprocessing techniques
autoencoders
- sparse autoencoder / Sparse autoencoder, Categorization
- undercomplete autoencoder / Undercomplete autoencoder, Categorization, Feed-forward sparse, undercomplete autoencoder
- topology of hidden layers, characteristics / Undercomplete autoencoder
- deterministic autoencoder / Deterministic autoencoder
- complete autoencoder / Categorization
- overcomplete autoencoder / Categorization
- regularized autoencoder / Categorization
- stochastic autoencoder / Categorization
- feed-forward sparse autoencoder / Feed-forward sparse, undercomplete autoencoder
- implementing / Implementation
autonomous systems
- about / Understanding the challenge

B

batch EM / Online EM
batch training
- versus online training / Online versus batch training
- about / Online versus batch training
Baum-Welch estimator (EM) / Baum-Welch estimator (EM)
Bayesian Bernoulli bandits / Bayesian Bernoulli bandits
Bayesian networks
- about / Probabilistic graphical models
Bellman optimality equations / Bellman optimality equations
Bernoulli model / Multivariate Bernoulli classification
best practices, Scala programming
- encapsulation / Encapsulation
- class constructor template / Class constructor template
- companion objects, versus case classes / Companion objects versus case classes
- enumeration, versus case classes / Enumerations versus case classes
- overloading / Overloading
- design template, for immutable classifiers / Design template for immutable classifiers
beta (backward pass) / Beta (backward pass)
bias-variance decomposition / Bias-variance decomposition
BinaryClassificationEvaluator
- about / Validating the model
BinaryLogisticRegressionSummary
- about / Training the model
binary restricted Boltzmann machines
- about / Binary restricted Boltzmann machines
- conditional probabilities / Conditional probabilities
- sampling / Sampling
- log-likelihood gradient / Log-likelihood gradient
- contrastive divergence / Contrastive divergence
- configuration parameters / Configuration parameters
- unsupervised learning / Unsupervised learning
binary SVC
- about / The binary SVC
- LIBSVM / LIBSVM
- design / Design
- configuration parameters / Configuration parameters
- interface, creating to LIBSVM / Interface to LIBSVM
- training / Training
- classification / Classification
- margin / C-penalty and margin
- C-penalty / C-penalty and margin
- C-penalty, optimizing / C-penalty and margin
- kernel, evaluation / Kernel evaluation
- application, to risk analysis / Application to risk analysis
binomial logistic regression / Step 6 – evaluating the model
bitwise swap / Fast Fisher-Yates shuffle
Bloomberg / Naïve Bayes and text mining
Boltzmann machine / Boltzmann machine
bootstrapping
- with replacement / Bootstrapping with replacement
- overview / Overview
- resampling / Resampling
- implementation / Implementation
- pros and cons / Pros and cons of bootstrap
Box-Muller transform
- about / Box-Muller transform
Breeze / Other libraries and frameworks
Bregman distance
- about / The divergences
Broyden-Fletcher-Goldfarb-Shanno (BFGS) / BFGS
Broyden-Fletcher-Goldfarb-Shanno method / Numerical optimization

C

cake pattern
- about / Scala as an object oriented language, Step 3 – instantiation, Composing mixins to build workflow
canonical forms / The hidden Markov model (HMM)
- evaluation / The hidden Markov model (HMM)
- training / The hidden Markov model (HMM)
- decoding / The hidden Markov model (HMM)
category M
- about / Abstraction
centroid / K-means
Cholesky factorization
- about / Cholesky factorization
chromosomes
- about / Evolutionary computing
class constructor template / Class constructor template
classification / Step 4 – Classification, Classification
classification, multilayer perceptron (MLP)
- about / Training and classification
- regularization / Regularization
- model generation / Model generation
- Fast Fisher-Yates shuffle / Fast Fisher-Yates shuffle
- prediction / Prediction
- model fitness / Model fitness
cluster analysis
- about / K-mean clustering
clustering
- about / K-mean clustering
clustering algorithms
- K-means / K-mean clustering, K-means
- Expectation-Maximization / K-mean clustering
clusters
- defining / Defining clusters
- initializing / Initializing clusters
clusters assignment, K-means algorithm / Step 2 – Clusters assignment
clusters configuration, K-means algorithm / Step 1 – Clusters configuration
CNBC / Naïve Bayes and text mining
Colt
- reference link / Algebraic and numerical libraries
companion objects
- versus case classes / Companion objects versus case classes
complete autoencoder / Categorization
complex Fourier transform / Fourier analysis
conditional independence
- about / Probabilistic graphical models
conditional random field (CRF)
- about / Conditional random fields, Introduction to CRF
- linear chain CRF / Linear chain CRF
- potential functions (fi) / Linear chain CRF
- identity potential functions / Linear chain CRF
- transition feature functions / Linear chain CRF
- state feature functions / Linear chain CRF
- versus hidden Markov model (HMM) / Comparing CRF and HMM
configuration parameters, binary SVC
- about / Configuration parameters
- SVM formulation / The SVM formulation
- SVM kernel function / The SVM kernel function
- SVM execution / The SVM execution
confusion matrix
- about / F-score for multinomial classification
conjugate gradient / Conjugate gradient
connectionism / The biological background
constructive tuning / Regularization
consumer price index (CPI)
- about / Introducing the multinomial Naïve Bayes
context bounds / Context bounds
context free Thompson sampling / Prior/posterior beta distribution
continuation-passing style (CPS)
- about / Beyond Actors – reactive programming
continuous-time Kalman filter / Benefits and drawbacks
contract
- about / Error handling
contravariant vectors
- about / Manifolds
control learning
- about / A solution – Q-learning
convexity / The kernel trick
convex minimization / Ln roughness penalty
convolution layer / Local receptive fields
convolution neural networks
- about / Convolution neural networks
- local receptive fields / Local receptive fields
- weight sharing / Weight sharing
- convolution layers / Convolution layers
- sub-sampling layers / Sub-sampling layers
- implementing / Putting it all together
Cooley-Tukey algorithm / Discrete Fourier transform (DFT)
cosine distance / Measuring similarity
Counter class / Counter
covariant functor
- about / Functors
covariant vectors
- about / Manifolds
cross-validation
- about / Cross-validation
- 1-fold cross-validation / One-fold cross-validation
- K-fold cross-validation / K-fold cross-validation
crossover
- about / Crossover
- population / Population
- chromosomes / Chromosomes
- genes / Genes
curse of dimensionality
- about / Curse of dimensionality
curve-fitting algorithms / Alternative preprocessing techniques

D

Darwinian process
- about / The origin
data
- about / Modeling
- profiling / Profiling data
data clustering / Clustering
data partitioning / Clustering
data scientist
- about / Defining a methodology
data segmentation / Clustering
DataSourceConfig class / Data extraction
Davidon-Fletcher-Powell method / Numerical optimization
DBpedia / Basics information retrieval
decision boundary / Visualizing model features
decoding, hidden Markov model (HMM)
- about / Decoding (CF-3)
- Viterbi algorithm / The Viterbi algorithm
deep belief network (DBN) / Restricted Boltzmann Machines (RBMs)
Deep belief networks (DBNs) / Boltzmann machine
Deep Boltzmann machines (DBMs) / Boltzmann machine
def
- overriding, with val / Understanding the problem
density estimation / Unsupervised learning
dependency injection
- about / Scala as an object oriented language, Understanding the problem
deployment modes, Apache Spark
- standalone mode / Deploying Spark
- local mode / Deploying Spark
- Yarn clusters manager / Deploying Spark
- Apache Mesos resource manager / Deploying Spark
descriptive models / Model categorization
design
- versus model / Model versus design
designing / Model versus design
design principles, Spark
- about / Design principles
- in-memory persistency / In-memory persistency
- laziness / Laziness
- transforms and actions / Transforms and actions
- shared variables / Shared variables
design template
- for immutable classifiers / Design template for immutable classifiers
destructive tuning / Regularization
deterministic autoencoder / Deterministic autoencoder
DFT-based filtering / DFT-based filtering
differential operator / Differential operator
dimension reduction / Dimension reduction
directed graphical models
- about / Probabilistic graphical models
Dirichlet Latent Allocation / Retrieving textual information
Discrete Fourier transform (DFT)
- about / Discrete Fourier transform (DFT)
discrete Fourier transform (DFT) / Performance
- about / Distributed discrete Fourier transform
discrete Kalman filter
- about / The discrete Kalman filter
- state space estimation / The state space estimation
- recursive algorithm / The recursive algorithm
discrete Markov chain
- about / The Markov property
discretization / Value encoding
discretized streams (DStreams)
- about / Discretized streams
discriminative kernels
- about / Common discriminative kernels
discriminative models / Discriminative models
divergences
- about / The divergences
- Kullback-Leibler (KL) divergence / The divergences
- Jensen-Shannon metric / The divergences
- mutual information / The divergences
- Bregman distance / The divergences
DMatrix class / DMatrix class
DNA
- about / Evolutionary computing
DocumentsSource class / Documents extraction
domain
- about / Defining a methodology
dynamic programming
- overview / Overview dynamic programming

E

Eclipse Scala IDE
- about / Eclipse Scala IDE
- reference link / Eclipse Scala IDE
Eigen-decomposition
- about / Eigenvalue decomposition
encapsulation
- about / Encapsulation
encoding scheme
- flat encoding / Flat encoding
- hierarchical encoding / Hierarchical encoding
enumerations
- versus case classes / Enumerations versus case classes
epoch
- about / Training epoch
epoch training, multilayer perceptron (MLP)
- about / Training epoch
- input forward propagation / Step 1 – input forward propagation
- error backpropagation / Step 2 – error backpropagation
- exit condition / Step 3 – exit condition
- implementing / Putting it all together
Epsilon-greedy algorithm / Epsilon-greedy algorithm
error backpropagation, multilayer perceptron (MLP)
- about / Step 2 – error backpropagation
- weights adjustment / Weights adjustment
- error propagation / Error propagation
- computational model / The computational model
error handling
- about / Error handling
error insensitive zone / Overview
Euclidean distance / Measuring similarity
European Central Bank
- URL / Financial data sources
evaluation, hidden Markov model (HMM)
- about / Evaluation (CF-1)
- alpha (forward pass) / Alpha (forward pass)
- beta (backward pass) / Beta (backward pass)
evaluation, multilayer perceptron (MLP)
- about / Evaluation
- execution profile / Execution profile
- learning rate, impact / Impact of learning rate
- momentum factor, impact / Impact of the momentum factor
- number of hidden layers, impact / Impact of the number of hidden layers
- test case / Test case
exception handling / Implementation
exchange-traded funds (ETFs) / Test case
expectation-maximization (EM) / Training (CF-2)
Expectation-Maximization algorithm
- about / Expectation-Maximization (EM)
- Gaussian mixture model / Gaussian mixture model
- overview / EM overview
- implementation / Implementation
- classification / Classification
- testing / Testing
explicit model
- about / Monadic data transformation
explicit models
- about / Explicit models
exponential moving average
- about / Exponential moving average
extended Kalman filter (EKF) / Benefits and drawbacks
Extended Kalman Filters (EKF) / The discrete Kalman filter
Extended Learning Classifiers (XCS)
- about / Learning classifier systems

F

1-fold cross-validation / One-fold cross-validation
F-score
- for binomial classification / F-score for binomial classification
- for multinomial classification / F-score for multinomial classification
F1-measure / Key quality metrics
F1-score / Key quality metrics
False Negatives (FNs) / Key quality metrics
false positive rate (FPR) / Area under ROC
- about / Training the model
False Positives (FPs) / Key quality metrics
Fast Fisher-Yates shuffle / Fast Fisher-Yates shuffle
Fast Fourier Transform (FFT) / Discrete Fourier transform (DFT)
feature extraction
- about / Extracting features
feature functions / Linear chain CRF
features
- about / What is a model?
features extraction
- automation / Test case 2 – features selection
features selection
- about / Selecting features
federal fund rate (FDR)
- about / Introducing the multinomial Naïve Bayes
feed-forward neural networks (FFNN)
- about / Feed-forward neural networks (FFNN)
- biological background / The biological background
- mathematical background / Mathematical background
- without hidden layers / The multilayer perceptron (MLP)
feed-forward sparse autoencoder / Feed-forward sparse, undercomplete autoencoder
filtering
- versus smoothing / The discrete Kalman filter
filtering algorithms / Modularizing
final val
- versus val / C-penalty and margin
finances 101
- about / Finances 101
- fundamental analysis / Fundamental analysis
- technical analysis / Technical analysis
- options trading / Options trading
- financial data sources / Financial data sources
financial data sources
- about / Financial data sources
financial metrics
- earnings per share (EPS) / Fundamental analysis
- Price/Earnings Ratio (PE) / Fundamental analysis
- Price/Sales Ratio (PS) / Fundamental analysis
- Price/Book Value Ratio (PB) / Fundamental analysis
- Price to Earnings/Growth (PEG) / Fundamental analysis
- Operating Income / Fundamental analysis
- Net Sales / Fundamental analysis
- Operating Profit Margin / Fundamental analysis
- Net Profit Margin / Fundamental analysis
- Short Interest / Fundamental analysis
- Short Interest Ratio / Fundamental analysis
- Cash per Share / Fundamental analysis
- Pay-out Ratio / Fundamental analysis
- Annual Dividend yield / Fundamental analysis
- Dividend Coverage Ratio / Fundamental analysis
- Gross Domestic Product (GDP) / Fundamental analysis
- Consumer Price Index (CPI) / Fundamental analysis
- Federal Fund rate / Fundamental analysis
first-order discrete Markov chain
- about / The first-order discrete Markov chain
first order predicate logic / First order predicate logic
fitness function
- about / Fitness score
- fixed fitness function / Fitness score
- evolutionary fitness function / Fitness score
- approximate fitness function / Fitness score
fixed lag smoothing / Fixed lag smoothing
- complex strategies / Fixed lag smoothing
Fn score / Key quality metrics
forms, models
- parametric / What is a model?
- differential / What is a model?
- probabilistic / What is a model?
- graphical / What is a model?
- directed graphs / What is a model?
- numerical methods / What is a model?
- chemistry / What is a model?
- taxonomy / What is a model?
- grammar / What is a model?
- lexicon / What is a model?
- inference logic / What is a model?
Fourier analysis
- about / Fourier analysis
- Discrete Fourier transform (DFT) / Discrete Fourier transform (DFT)
- DFT-based filtering / DFT-based filtering
- market cycles, detecting / Detection of market cycles
Fourier transform / Fourier analysis
frameworks
- about / Tools and frameworks
- Java / Java
- Scala / Scala
- SBT / Simple build tool
- Apache Commons Math / Apache Commons Math
- JFreeChart / JFreeChart
- libraries / Other libraries and frameworks
- tools / Other libraries and frameworks
frequency domain / Discrete Fourier transform (DFT)
functional language
- Scala / Scala as a functional language
functor
- about / Scala as a functional language
functors
- about / Functors
fundamental analysis
- about / Fundamental analysis
futures
- about / Futures
- blocking / Blocking on futures
- callbacks / Future callbacks
- implementing / Putting it all together

G

Gauss-Newton method / Numerical optimization, Training workflow
Gauss-Newton technique
- about / Gauss-Newton
Gaussian distribution / Z-score and Gauss
Gaussian mixture / Class likelihood
Gaussian mixture model
- about / Gaussian mixture model
Gaussian noise / The transition equation
Gaussian sampling
- about / Gaussian sampling
- Box-Muller transform / Box-Muller transform
generalized autoregressive conditional heteroscedasticity (GARCH) / Alternative preprocessing techniques
generalized linear models (GLM) / Logistic regression
generative models / Generative models
Generic message handler
- about / Blocking on futures
genetic algorithm, implementations
- about / Implementation
- software design / Software design
- key components / Key components
- selection process / Selection
- population growth, controlling / Controlling population growth
- configuration / GA configuration
- crossover / Crossover
- mutation / Mutation
- reproduction / Reproduction
- solver / Solver
genetic algorithms (GA)
- about / Genetic algorithms and machine learning
- for trading strategies / GA for trading strategies
- advantages / Advantages and risks of genetic algorithms
- risks / Advantages and risks of genetic algorithms
genetic algorithms components
- about / Genetic algorithm components
- genetic encoding / Encodings
- genetic operators / Genetic operators
- fitness function / Fitness score
genetic encoding
- about / Encodings
- value encoding / Value encoding
- predicate encoding / Predicate encoding
- solution encoding / Solution encoding
- encoding scheme / The encoding scheme
genetic operators
- about / Genetic operators
- selection / Genetic operators, Selection
- crossover / Genetic operators, Crossover
- mutation / Genetic operators, Mutation
Gibbs sampling / Test
G measure / Key quality metrics
GNU Lesser General Public License (LGPL)
- about / Licensing
goal state
- about / Putting it all together
Google Finance
- reference link / Financial data sources
Googles Breeze
- reference link / Abstraction
gradient descent methods
- steepest descent / Steepest descent
- conjugate gradient / Conjugate gradient
- stochastic gradient descent / Stochastic gradient descent
graph-structured CRF
- about / Introduction to CRF
graphical models
- about / Probabilistic graphical models
GraphX
- about / Overview
greedy approach
- about / Solver
grid search
- about / Grid search
gross domestic product (GDP)
- about / Introducing the multinomial Naïve Bayes

H

Hadoop Distributed File System (HDFS) / Step 2 – loading data
Hadoop distributed file system (HDFS)
- about / Overview
hard margin / The separable case (hard margin)
heat kernel function / Kernel monadic composition
Hessian matrix / Jacobian and Hessian matrices
hidden layers / The multilayer perceptron (MLP)
hidden Markov model (HMM)
- about / The hidden Markov model (HMM)
- notation / Notation
- lambda model / The lambda model
- design / Design
- evaluation (CF-1) / Evaluation (CF-1)
- training (CF-2) / Training (CF-2)
- decoding (CF-3) / Decoding (CF-3)
- implementing / Putting it all together
- ViterbiPath class / Putting it all together
- ViterbiPath singleton / Putting it all together
- test case / Test case 1 – Training, Test case 2 – Evaluation
- as filtering technique / HMM as filtering technique
- versus conditional random field (CRF) / Comparing CRF and HMM
- performance consideration / Performance consideration
Hidden Naïve Bayes (HNB) / Training
higher kinded types (HKTs)
- about / Higher kinded types
hinge loss / The non-separable case (soft margin)
HMM constructor
- config argument / Putting it all together
- xt argument / Putting it all together
- form argument / Putting it all together
- quantize argument / Putting it all together

I

identically distributed population (i.i.d) / The purpose of sampling
IITB CRF Java library
- evaluation / Training the CRF model
immutable statistics / Immutable statistics
immutable transformations
- about / Explicit models
implementation, regularized CRF
- about / Implementation
- CRF classifier, configuring / Configuring the CRF classifier
- CRF model, training / Training the CRF model
- CRF model, applying / Applying the CRF model
implicit model
- about / Monadic data transformation
implicit models
- about / Implicit models
incremental EM / Online EM
initialization
- about / Genetic operators
input forward propagation, multilayer perceptron (MLP)
- about / Step 1 – input forward propagation
- computational flow / Computational flow
- error functions / Error functions
- operating modes / Operating modes
- softmax / Softmax
input value
- about / Error handling
IntelliJ IDEA Scala plugin
- about / IntelliJ IDEA Scala plugin
- reference link / IntelliJ IDEA Scala plugin
intercept / Step 5 – implementing the classifier
iterators / Lazy views

J

Jacobian J matrix / Numerical optimization
Jacobian matrix / Jacobian and Hessian matrices
Java
- about / Java
- reference link / Overview
Java libraries
- leveraging / Leveraging Java libraries
Java Native Interface (JNI) / Algebraic and numerical libraries
Java Virtual Machine (JVM)
- about / Overview
jblas
- reference link / Leveraging Java libraries, Algebraic and numerical libraries
Jensen-Shannon metric
- about / The divergences
JFreeChart
- about / JFreeChart, Plotting data
- description / Description
- licensing / Licensing
- installation / Installation
- URL, for installation / Installation

K

K-armed bandit
- about / K-armed bandit
- exploration-exploitation trade-offs / Exploration-exploitation trade-offs
- expected cumulative regret / Expected cumulative regret
- Bayesian Bernoulli bandits / Bayesian Bernoulli bandits
- Epsilon-greedy algorithm / Epsilon-greedy algorithm
K-fold cross-validation / K-fold cross-validation
K-means
- with MLlib / K-means using MLlib
K-means algorithm
- defining / Defining the algorithm
- steps / Defining the algorithm
- exit condition / Step 2 – Clusters assignment
K-means clustering
- about / K-means
- similarity measures / Measuring similarity
- evaluation, setting up / Evaluation
- results, evaluating / The results
- number of clusters, tuning / Tuning the number of clusters
- output, validating / Validation
K-means components
- creating / Creating K-means components
Kalman filter / The discrete Kalman filter
- recursive characteristic / The discrete Kalman filter
- optimal characteristic / The discrete Kalman filter
- non-linear systems / The discrete Kalman filter
Kalman smoothing / Kalman smoothing
Kernel functions
- about / Kernel functions
- overview / Overview
- discriminative kernels / Common discriminative kernels
- monadic composition / Kernel monadic composition
- monadic composition, interpretation / Kernel monadic composition
- in SVM / Kernel monadic composition
kernel functions, types
- linear kernel (dot product) / Common discriminative kernels
- polynomial kernel / Common discriminative kernels
- radial basis function (RBF) / Common discriminative kernels
- sigmoid kernel / Common discriminative kernels
- Laplacian kernel / Common discriminative kernels
- log kernel / Common discriminative kernels
Kernel PCA
- about / Kernel PCA
kernels, types
- probabilistic kernels / Common discriminative kernels
- smoothing kernels / Common discriminative kernels
- reproducing kernel Hilbert spaces / Common discriminative kernels
kernel trick
- about / The kernel trick
key components
- population / Population
- chromosomes / Chromosomes
- genes / Genes
Kohonen's self-organizing maps / Manifolds
Kullback-Leibler (KL) divergence
- about / The divergences, The Kullback-Leibler divergence
- overview / Overview
- implementation / Implementation
- testing / Testing
Kullback-Leibler divergence
- about / Kullback-Leibler divergence
- implementation / Implementation
Kullback-Leibler evaluator
- about / Kullback-Leibler evaluator

L

L1 regularization / Challenging model complexity
Lagrange multipliers / Max-margin classification, Lagrange multipliers
Laplacian Eigenmaps
- about / Manifolds
Laplacian kernel / Common discriminative kernels
Lasso regularization / Ln roughness penalty
Latent Dirichlet Allocation (LDA)
- about / Probabilistic graphical models
lazy directed acyclic graph (DAG)
- about / Use case – continuous parsing
lazy value trigger / Step 3 – instantiation
lazy views / Lazy views
LDL decomposition
- about / LDL decomposition
Learning Classifier Systems (LCS)
- about / Learning classifier systems, Introduction to LCS
- components / Introduction to LCS
- learning strategy, combining with evolutionary approach / Combining learning and evolution
- terminology / Terminology
- extended learning classifier systems / Extended learning classifier systems
- XCS components / XCS components
- portfolio management, application / Application to portfolio management
- XCS core data / XCS core data
- XCS rules / XCS rules
- covering phase / Covering
- implementation, example / Example of implementation
- benefits / Benefits and limitations of learning classifier systems
- limitations / Benefits and limitations of learning classifier systems
learning vector quantization (LVQ)
- about / K-mean clustering
least squares problem / Numerical optimization
lemmatization / Basics information retrieval
Levenberg-Marquardt algorithm
- about / Levenberg-Marquardt
Levenberg-Marquardt method / Numerical optimization, Training workflow
Levenberg-Marquardt optimizer / Alternative preprocessing techniques
LIBSVM
- about / LIBSVM
- URL / LIBSVM
- reference / LIBSVM
- need for / LIBSVM
- Java code / LIBSVM
- svm_node / Interface to LIBSVM
- scaling / Application to risk analysis
LIBSVM, Java class
- svm_model / LIBSVM
- svm_node / LIBSVM
- svm_parameters / LIBSVM
- svm_problem / LIBSVM
- svm / LIBSVM
linear algebra
- about / Linear algebra
- QR decomposition / QR decomposition
- LU factorization / LU factorization
- LDL decomposition / LDL decomposition
- Cholesky factorization / Cholesky factorization
- SVD / Singular Value Decomposition (SVD)
- Eigen-decomposition / Eigenvalue decomposition
- algebraic libraries / Algebraic and numerical libraries
- numerical libraries / Algebraic and numerical libraries
linear chain CRF
- about / Introduction to CRF, Linear chain CRF
linear chain structured graph CRF
- about / Introduction to CRF
linear kernel (dot product) / Common discriminative kernels
linear regression
- about / Linear regression
- univariate linear regression / Univariate linear regression
- ordinary least squares regression (OLS) / Ordinary least squares (OLS) regression
- concept / Test case 2 – features selection
- versus support vector regression (SVR) / SVR versus linear regression
linear SVM
- about / The linear SVM
- separable case (hard margin) / The separable case (hard margin)
- non-separable case (soft margin) / The non-separable case (soft margin)
Locally Linear Embedding
- about / Manifolds
logistic regression
- about / Logistic regression
- logistic function / Logistic function
- design / Design
- training workflow / Training workflow
- classification / Classification
logistic regression, test case
- about / Let's kick the tires
- workflow, writing / Writing a simple workflow
- issues, scoping / Step 1 – scoping the problem
- data, loading / Step 2 – loading data
- data, preprocessing / Step 3 – preprocessing data, Immutable normalization
- immutable normalization / Immutable normalization
- patterns, discovering / Step 4 – discovering patterns
- data, analyzing / Analyzing data
- data, plotting / Plotting data
- model features, visualizing / Visualizing model features
- label, visualizing / Visualizing label
- classifier, implementing / Step 5 – implementing the classifier
- optimizer, selecting / Selecting an optimizer
- model, training / Training the model
- observations, classifying / Classifying observations
- model, evaluating / Step 6 – evaluating the model
log kernel / Common discriminative kernels
loss function / Selecting an optimizer
loss function approach
- about / Solver
Lotka-Volterra equation
- about / Selection
LU factorization
- about / LU factorization

M

machine learning
- need for / Why machine learning?
- classification / Classification
- prediction / Prediction
- optimization / Optimization
- regression / Regression
- about / Genetic algorithms and machine learning
machine learning algorithms
- taxonomy / Taxonomy of machine learning algorithms
- unsupervised learning / Unsupervised learning
- supervised learning / Supervised learning
- discriminative models / Discriminative models
- semi-supervised learning / Semi-supervised learning
- reinforcement learning / Reinforcement learning
macro formulas
- for multinomial precision / F-score for multinomial classification
- for recall / F-score for multinomial classification
/ Area under ROC
Manhattan distance / Measuring similarity
manifolds
- about / Manifolds
Markov chain
- about / The hidden Markov model (HMM)
Markov Chain Monte Carlo (MCMC)
- about / Markov Chain Monte Carlo (MCMC)
- overview / Overview
- Metropolis-Hastings (MH) / Metropolis-Hastings (MH)
- implementation / Implementation
- testing / Test
/ Log-likelihood gradient
Markov decision process (MDP) / K-armed bandit
Markov decision processes
- about / Markov decision processes
- Markov property / The Markov property
- first-order discrete Markov chain / The first-order discrete Markov chain
Markov property
- about / The Markov property
master-workers design
- about / Master-workers
- messages exchange / Messages exchange
- worker Actors / Worker Actors
- workflow controller / The workflow controller
- master Actor / The master Actor
- with routing / Master with routing
- DFT / Distributed discrete Fourier transform
- limitations / Limitations
mathematical abstractions
- supporting / Supporting mathematical abstractions
- variable declaration / Step 1 – variable declaration
- model definition / Step 2 – model definition
- instantiation / Step 3 – instantiation
mathematical notations / Mathematical notations for the curious
mathematics
- linear algebra / Linear algebra
- first order predicate logic / First order predicate logic
- Hessian matrix / Jacobian and Hessian matrices
- Jacobian matrix / Jacobian and Hessian matrices
- optimization techniques / Summary of optimization techniques
max-margin classification
- about / Max-margin classification
mean / Immutable statistics
mean square error (MSE) / Bias-variance decomposition
measurement equation / The state space estimation, The measurement equation
measurement noise covariance / The measurement equation
memory management
- about / Explicit models
methodology
- defining / Defining a methodology
Metropolis-Hastings (MH) / Metropolis-Hastings (MH)
mixin composition
- for ITransform / Instantiating the workflow
mixins
- composing, to build workflow / Composing mixins to build workflow
- about / Composing mixins to build workflow
mixins linearization
- about / Understanding the problem
MLlib library
- about / Overview, MLlib library
- components / Overview
- RDDs, creating / Creating RDDs
- using, for K-means / K-means using MLlib
- tests / Tests
model
- about / What is a model?
- versus design / Model versus design
model assessment
- about / Assessing a model
- validation / Validation
- area under curves / Area under the curves
- cross-validation / Cross-validation
- bias-variance decomposition / Bias-variance decomposition
- overfitting / Overfitting
model categorization
- about / Model categorization
- predictive models / Model categorization
- descriptive models / Model categorization
model complexity
- challenging / Challenging model complexity
model fitness
- versus accuracy / Model fitness
modeling
- about / Modeling
/ Model versus design
models
- forms / What is a model?
model validation
- about / Validation
- key quality metrics / Key quality metrics
- F-score, for binomial classification / F-score for binomial classification
- F-score, for multinomial classification / F-score for multinomial classification
modules
- defining / Defining modules
monad
- about / Scala as a functional language
monadic composition
- about / Monads
monadic data transformation
- about / Monadic data transformation
monads
- about / Monads, Monads to the rescue
Monitor class / Monitor
Monte Carlo approximation
- about / Monte Carlo approximation
- overview / Overview
- implementation / Implementation
Monte Carlo EM / Online EM
Monte Carlo integration / Sampling
morphism
- about / Error handling
moving averages
- about / Moving averages
- simple moving average / Simple moving average
- weighted moving average / Weighted moving average
- exponential moving average / Exponential moving average
- on multi-dimensional time series / Exponential moving average
multi-class scoring / F-score for binomial classification
multilayer perceptron (MLP)
- about / The multilayer perceptron (MLP)
- activation function / Activation function
- network topology / Network topology
- design / Design
- configuration / Configuration
- network components / Network components
- model / Model
- problem types (modes) / Problem types (modes)
- online training, versus batch training / Online versus batch training
- epoch, training / Training epoch
- training / Training and classification
- classification / Training and classification
- evaluation / Evaluation
- limitations / Benefits and limitations
- benefits / Benefits and limitations
multinomial Naïve Bayes
- about / Introducing the multinomial Naïve Bayes
- formalism / Formalism
- frequentist perspective / The frequentist perspective
- predictive model / The predictive model
- zero-frequency problem / The zero-frequency problem
multivariate Bernoulli classification
- about / Multivariate Bernoulli classification
- model / Model
- implementation / Implementation
mutation
- about / Mutation
- population / Population
- chromosome / Chromosomes
- genes / Genes
mutual information
- about / The divergences, The mutual information

N

n-fold cross-validation / Application to risk analysis
NASDAQ
- URL / Financial data sources
natural language processing (NLP) / The feature functions model
natural selection
- about / Selection
Naïve Bayes
- pros and cons / Pros and cons
Naïve Bayes classifier
- used, for text mining / Naïve Bayes and text mining
Naïve Bayes classifiers
- about / Naïve Bayes classifiers
- multinomial Naïve Bayes / Introducing the multinomial Naïve Bayes
- implementation / Implementation
- design / Design
- training / Training
- classification / Classification
- F1 Validation / F1 Validation
- features extraction / Features extraction
- testing / Testing
Naïve Bayes models
- about / Probabilistic graphical models
network components, multilayer perceptron (MLP)
- about / Network components
- network topology / Network topology
- hidden layers / Input and hidden layers
- input layers / Input and hidden layers
- output layers / Output layer
- synapses / Synapses
- connections / Connections
- weights initialization / Weights initialization
news
- macro trends / Naïve Bayes and text mining
- micro updates / Naïve Bayes and text mining
Nondeterministic Polynomial (NP) problems
- about / NP problems
- categories / NP problems
nonlinear least squares minimization
- about / Nonlinear least squares minimization
- Gauss-Newton technique / Gauss-Newton
- Levenberg-Marquardt algorithm / Levenberg-Marquardt
nonlinear models
- about / Nonlinear models
- Kernel PCA / Kernel PCA
- manifolds / Manifolds
nonlinear SVM
- about / The nonlinear SVM
- max-margin classification / Max-margin classification
- kernel trick / The kernel trick
normalization
- about / Immutable normalization
normalized inner product / Measuring similarity
null frequencies
- handling / Implementation
numerical libraries
- about / Algebraic and numerical libraries
numerical optimization
- about / Numerical optimization
- Newton (or second-order techniques) / Numerical optimization
- Quasi-Newton (or first-order techniques) / Numerical optimization
Nyquist / Discrete Fourier transform (DFT)

O

object oriented language
- Scala / Scala as an object oriented language
observation
- about / Extracting features
one-class SVC
- anomaly detection / Anomaly detection with one-class SVC
online EM / Online EM
online training
- versus batch training / Online versus batch training
- about / Online versus batch training
operations, time series
- transpose operator / Transpose operator
- differential operator / Differential operator
optimization techniques
- gradient descent methods / Steepest descent
- Quasi-Newton algorithms / Quasi-Newton algorithms
- nonlinear least squares minimization / Nonlinear least squares minimization
- Lagrange multipliers / Lagrange multipliers
- dynamic programming, overview / Overview dynamic programming
options trading
- about / Options trading
option trading
- Q-learning, used / Option trading using Q-learning
- option property / Option property
- option model / Option model
- quantization / Quantization
ordinary least squares regression (OLS)
- about / Ordinary least squares (OLS) regression
- design / Design
- implementation / Implementation
- test case / Test case 1 – trending, Test case 2 – features selection
output unit activation function / Activation function
output value
- about / Error handling
overcomplete autoencoder / Categorization
overfitting / Overfitting
- empirical estimation / Bias-variance decomposition
- versus sparsity / Sparsity updating equations
overloading / Overloading

P

padding / Value encoding
parallel collections
- about / Parallel collections
- processing / Processing a parallel collection
- benchmark framework / Benchmark framework
- performance evaluation / Performance evaluation
Parallel Colt
- reference link / Leveraging Java libraries
parent chromosomes
- preserving / Crossover
partial functions
- reusability / Error handling
- about / Error handling
- runtime validation / Error handling
- versus partially applied functions / DFT-based filtering
Partial Least Squares Regression (PLSR) / Validation
partially applied functions
- versus partial functions / DFT-based filtering
partially connected networks / Network topology
particle filter / Alternative preprocessing techniques
penalized least squares regression / Ln roughness penalty
performance evaluation, Spark
- about / Performance evaluation
- tuning parameters / Tuning parameters
- considerations / Performance considerations
- pros and cons / Pros and cons
plate model
- about / Probabilistic graphical models
polynomial kernel / Common discriminative kernels
population growth
- controlling / Controlling population growth
pre-processing techniques
- alternative techniques / Alternative preprocessing techniques
precision / Key quality metrics
precision-recall curve (PRC) / Area under PRC
Predicted Residual Error Sum of Squares (PRESS) / Validation
predictive models / Model categorization
price pattern
- about / Price patterns
principal components analysis (PCA)
- about / Principal components analysis (PCA)
- algorithm / Algorithm
- covariance matrix / Algorithm
- implementation / Implementation
- test case / Test case
- evaluation / Evaluation
- extending / Extending PCA
- validation / Validation
- categorical features / Categorical features
- performance / Performance
probabilistic graphical models
- about / Probabilistic graphical models
- directed graphical models / Probabilistic graphical models
- Bayesian networks / Probabilistic graphical models
- Naïve Bayes models / Probabilistic graphical models
probabilistic kernels / Common discriminative kernels
projection
- about / Functors
proposal distribution / Overview
Proteins / Overview
protein sequence annotation / Overview
Python
- reference link / Overview

Q

Q-learning, implementation
- about / Implementation
- software design / Software design
- states / The states and actions
- actions / The states and actions
- search space / The search space
- action-value / The policy and action-value
- policy / The policy and action-value
- components / The Q-learning components
- training / The Q-learning training
- tail recursion / Tail recursion to the rescue
- validation / Validation
- prediction / The prediction
Q-learning algorithm
- about / A solution – Q-learning
- terminology / Terminology
- concept / Concept
- value of policy / Value of policy
- Bellman optimality equations / Bellman optimality equations
- temporal difference, for model-free learning / Temporal difference for model-free learning
- action-value iterative update / Action-value iterative update
- used, for option trading / Option trading using Q-learning
QR Decomposition
- about / QR decomposition
QStar class / The Viterbi algorithm
Quandl
- URL / Financial data sources
quantization / Value encoding, Quantization
Quasi-Newton algorithms
- BFGS / BFGS
- L-BFGS / L-BFGS

R

radial basis function (RBF) / Common discriminative kernels
- terminology / Common discriminative kernels
recall / Key quality metrics
receiver operating characteristic (ROC)
- about / Training the model
Receiver Operating Characteristics (ROC) / Area under ROC
recombination
- about / Evolutionary computing
reconstruction error minimization
- about / Step 3 – Reconstruction error minimization
- tail recursive implementation / Tail recursive implementation
- iterative implementation / Iterative implementation
recursive algorithm
- about / The recursive algorithm
- prediction / Prediction
- correction / Correction
- Kalman smoothing / Kalman smoothing
- fixed lag smoothing / Fixed lag smoothing
- experimentation / Experimentation
- benefits / Benefits and drawbacks
- drawbacks / Benefits and drawbacks
regression
- about / Regression
regression model / Design
regularization
- about / Regularization, Ln roughness penalty
- Ln roughness penalty / Ln roughness penalty
- in machine learning / Ln roughness penalty
- model estimation / Ln roughness penalty
- feature selection / Ln roughness penalty
- overfitting / Ln roughness penalty
- computation / Ln roughness penalty
- ridge regression / Ridge regression
regularized autoencoder / Categorization
regularized CRF
- text analytics / Regularized CRF and text analytics
- feature functions model / The feature functions model
- design / Design
- implementation / Implementation
- testing / Tests
Reinforcement learning
- Q-learning, implementation / Implementation
- Q-learning, used for option trading / Option trading using Q-learning
reinforcement learning / Model categorization, Reinforcement learning, K-armed bandit
- about / Reinforcement learning
- Q-learning algorithm / A solution – Q-learning
- implementing / Putting it all together
- evaluation / Evaluation
- pros and cons / Pros and cons of reinforcement learning
relative fitness degradation
- about / Selection
relative strength index (RSI) / Terminology
replicate / Resampling
reproducing kernel Hilbert spaces / Common discriminative kernels
resampling / Overview
resilient distributed dataset (RDD)
- about / Overview, Apache Spark core, Using Spark shell
- creating / Creating RDDs
Restricted Boltzmann machines (RBMs) / Restricted Boltzmann Machines (RBMs)
reusable ML pipelines
- about / Reusable ML pipelines
- Apache Spark application, debugging with ScalaTest / Apache Spark and ScalaTest
reusable ML transforms
- about / Reusable ML transforms
- features, encoding / Encoding features
- model, training / Training the model
- predictive model / Predictive model
- summary statistics, training / Training summary statistics
- model, validating / Validating the model
- grid search / Grid search
ridge regression / Ln roughness penalty
- about / Ridge regression
- design / Design
- implementation / Implementation
- test case / Test case

S

@specialized annotation / Discrete Fourier transform (DFT)
sampling
- purpose / The purpose of sampling
Scala
- about / Why Scala?, Scala, Scala
- used, as functional language / Scala as a functional language
- abstraction / Abstraction
- HKTs / Higher kinded types
- functors / Functors
- monads / Monads
- used, as object oriented language / Scala as an object oriented language
- used, as scalable language / Scala as a scalable language
- Eclipse Scala IDE / Eclipse Scala IDE
- IntelliJ IDEA Scala plugin / IntelliJ IDEA Scala plugin
- time series / Time series in Scala
- object, creating / Object creation
- Streams / Streams
- parallel collections / Parallel collections
- reference link / Overview
scalability
- with Actors / Scalability with Actors
- Actor model / The Actor model
- partitioning / Partitioning
- reactive programming / Beyond Actors – reactive programming
scalable language
- Scala / Scala as a scalable language
ScalaNLP / Other libraries and frameworks
- URL / Algebraic and numerical libraries
Scala programming
- about / Scala programming
- libraries / List of libraries and tools
- tools / List of libraries and tools
- code snippet format / Code snippets format
Scala reactive library
- example, reference link / Beyond Actors – reactive programming
Scala standard library
- reference link / Scala
Scalaz
- reference link / Abstraction
scientific model
- about / What is a model?
selection process
- about / Selection
self-reference
- about / Composing mixins to build workflow
semi-supervised learning / Semi-supervised learning
Sequential Minimal Optimization (SMO) / The non-separable case (soft margin), LIBSVM
service level agreement (SLA)
- need for / Why streaming?
shared variables
- about / Shared variables
- broadcast values / Shared variables
- accumulator variables / Shared variables
shrinkage
- about / Ln roughness penalty
sigmoid activation
- versus tanh / Weight sharing
sigmoid kernel / Common discriminative kernels
similarity
- visualization / Overview
similarity measures
- Manhattan distance / Measuring similarity
- Euclidean distance / Measuring similarity
- normalized inner product / Measuring similarity
simple build tool (sbt)
- about / Deploying Spark
Simple Build Tool (SBT)
- about / Simple build tool
- reference link / Simple build tool
simple moving average
- about / Simple moving average
singular value decomposition (SVD) / Performance
Singular Value Decomposition (SVD)
- about / Singular Value Decomposition (SVD)
smoothing
- versus filtering / The discrete Kalman filter
smoothing kernels / Common discriminative kernels
soft margin / The non-separable case (soft margin)
software developer
- about / Defining a methodology
source code, Scala
- about / Source code
- conventions / Convention
- context bounds / Context bounds
- presentation / Presentation
- primitives / Primitives and implicits
- implicits / Primitives and implicits
- immutability / Immutability
SparkSQL
- about / Overview
Spark Streaming
- about / Design for reusing Streams memory
sparse autoencoder
- about / Sparse autoencoder
/ Categorization
sparsity
- versus overfitting / Sparsity updating equations
sparsity updating equations / Sparsity updating equations
spectral density estimation / Fourier analysis
Spectral theory / Fourier analysis
stackable trait injection
- about / Composing mixins to build workflow
state space estimation
- about / The state space estimation
- transition equation / The state space estimation, The transition equation
- measurement equation / The state space estimation, The measurement equation
steepest descent / Steepest descent
stemming / Basics information retrieval
steps, K-means algorithms
- clusters configuration / Step 1 – Clusters configuration
- clusters assignment / Step 2 – Clusters assignment
- reconstruction error minimization / Step 3 – Reconstruction error minimization
- classification / Step 4 – Classification
stepwise EM / Online EM
stimuli / The biological background
stochastic autoencoder / Categorization
stochastic gradient descent / Stochastic gradient descent
Stochastic Gradient Descent (SGD) / Selecting an optimizer
streaming engine
- about / Streaming engine
- need for / Why streaming?
- batch processing / Batch and real-time processing
- real-time processing / Batch and real-time processing
- architecture / Architecture overview
- discretized streams (DStreams) / Discretized streams
- continuous parsing, use case / Use case – continuous parsing
- checkpointing / Checkpointing
Streams
- about / Streams
- memory, allocating / Memory on demand
- memory, reusing designs / Design for reusing Streams memory
streams / Lazy views
subject-matter expert
- about / Defining a methodology
sum of the squared error (SSE) / Online versus batch training
supervised learning
- about / Supervised learning
- generative models / Generative models
support vector classifier (SVC)
- about / Support vector classifier (SVC)
- binary SVC / The binary SVC
support vector machine (SVM)
- about / The support vector machine (SVM)
- optional mathematical formulation / The support vector machine (SVM)
- linear SVM / The linear SVM
- nonlinear SVM / The nonlinear SVM
- support vector classifier (SVC) / Support vector classifier (SVC)
- anomaly detection, with one-class SVC / Anomaly detection with one-class SVC
- support vector regression (SVR) / Support vector regression (SVR)
- performance considerations / Performance considerations
support vector regression (SVR)
- about / Support vector regression (SVR)
- overview / Overview
- versus linear regression / SVR versus linear regression
- L2 regularization / SVR versus linear regression
SVM dual problem / Max-margin classification
SVM model
- accuracy / Training

T

tagging / The feature functions model
tagging model / Basics information retrieval
tail recursion
- about / Tail recursion to the rescue
tanh
- versus sigmoid activation / Weight sharing
technical analysis
- about / Technical analysis
- terminology / Terminology
- trading data / Trading data
- trading signal / Trading signal and strategy
- trading strategy / Trading signal and strategy
- price patterns / Price patterns
temporal difference
- for model-free learning / Temporal difference for model-free learning
tensors
- about / Manifolds
test case, multilayer perceptron (MLP)
- about / Test case
- implementation / Implementation
- models evaluation / Models evaluation
- hidden layers' architecture, impact / Impact of hidden layers' architecture
testing, regularized CRF
- about / Tests
- convergence profile, training / The training convergence profile
- training set size impact, evaluating / Impact of the size of the training set
- L2 regularization factor, evaluating / Impact of L2 regularization factor
text analytics
- about / Regularized CRF and text analytics
text mining
- with Naïve Bayes classifier / Naïve Bayes and text mining
- information retrieval / Basics information retrieval
- implementation / Implementation
- documents, analyzing / Analyzing documents
- relative terms frequency, extracting / Extracting relative terms frequency
- features, generating / Generating the features
- testing / Testing
- textual information, retrieving / Retrieving textual information
- classifier, evaluating / Evaluating text mining classifier
theory of evolution
- about / Evolution
- origin / The origin
- Nondeterministic Polynomial (NP) problems / NP problems
- evolutionary computing / Evolutionary computing
Thompson sampling
- about / Thompson sampling
- Bandit context / Bandit context
- prior/posterior beta distribution / Prior/posterior beta distribution
- implementation / Implementation
- simulated exploration / Simulated exploration and exploitation
- simulated exploitation / Simulated exploration and exploitation
- versus UCB1 algorithm / Implementation
time-domain function / Discrete Fourier transform (DFT)
time dependency model / The measurement equation
time series
- about / Time series in Scala
- context bound / Context bounds
- types / Types and operations
- operations / Types and operations
- lazy views / Lazy views
trading data
- about / Trading data
trading signal
- about / Trading signal and strategy
trading strategies
- GA / GA for trading strategies
- definition / Definition of trading strategies
- operators / Trading operators
- cost function / The cost function
- market signals / Market signals
- about / Trading strategies
- signal encoding / Signal encoding
- test case / Test case – Fall 2008 market crash
- creating / Creating trading strategies
- optimizer, configuring / Configuring the optimizer
- finding / Finding the best trading strategy
- tests / Tests
- weighted score / The weighted score
- unweighted score / The unweighted score
trading strategy
- about / Trading signal and strategy
training, hidden Markov model (HMM)
- about / Training (CF-2)
- Baum-Welch estimator (EM) / Baum-Welch estimator (EM)
training, Naïve Bayes classifiers
- about / Training
- Likelihood class / Class likelihood
- binomial model / Binomial model
- multinomial model / Multinomial model
- classifier components / Classifier components
training files
- raw dataset / Training the CRF model
- tagged dataset / Training the CRF model
training workflow, logistic regression
- about / Training workflow
- optimizer, configuring / Step 1 – configuring the optimizer
- Jacobian matrix, computing / Step 2 – computing the Jacobian matrix
- convergence of optimizer, managing / Step 3 – managing the convergence of optimizer
- least squares problem, defining / Step 4 – defining the least squares problem
- sum of square errors, minimizing / Step 5 – minimizing the sum of square errors
- testing / Test
traits
- about / Composing mixins to build workflow
transition equation
- about / The state space estimation, The transition equation
transpose operator / Transpose operator
transposition operator
- about / Genetic operators
TrueFx (Forex)
- URL / Financial data sources
True Negatives (TNs) / Key quality metrics
true positive rate (TPR)
- about / Training the model
True Positives (TPs) / Key quality metrics
tuning, genetic algorithm / Mutation
tuning parameters
- about / Performance evaluation
Twitter's Algebird
- reference link / Abstraction
two-step lag smoothing / Experimentation

U

UCB1 algorithm
- versus Thompson sampling / Implementation
unapply method
- about / Genes
undercomplete autoencoder / Undercomplete autoencoder, Categorization, Feed-forward sparse, undercomplete autoencoder
univariate linear regression
- about / Univariate linear regression
- implementation / Implementation
- test case / Test case
unsupervised learning / Unsupervised learning
- data clustering / Clustering
- dimension reduction / Dimension reduction
upper bound confidence
- about / Upper bound confidence
- confidence interval / Confidence interval
- for Bernoulli variables / Confidence interval
- implementation / Implementation
utility classes, Scala programming
- about / Utility classes
- data extraction / Data extraction
- financial data sources / Financial data sources
- documents extraction / Documents extraction
- DMatrix class / DMatrix class
- Counter class / Counter
- Monitor class / Monitor

V

val
- def, overriding with / Understanding the problem
- versus final val / C-penalty and margin
variables
- about / What is a model?
variance / Immutable statistics
variance-bias trade-off / Bias-variance decomposition
vector quantization
- about / K-mean clustering
views / Lazy views
Viterbi algorithm / The Viterbi algorithm
- psi matrix / The Viterbi algorithm
- qStar matrix / The Viterbi algorithm
- delta matrix / The Viterbi algorithm

W

weight decay / Ln roughness penalty
weighted moving average
- about / Weighted moving average
While loop / Discrete Fourier transform (DFT)
white noise / The transition equation
WordNet / Basics information retrieval
workflow
- instantiating / Instantiating the workflow
- modularizing / Modularizing
workflow computational model
- about / Workflow computational model

Y

1-year Treasury bill (1yTB)
- about / Introducing the multinomial Naïve Bayes
Yahoo Finance
- reference link / Financial data sources

Z

Z-score / Z-score and Gauss