Scala for Machine Learning

By : R. Nicolas

Overview of this book

Are you curious about AI? All you need is a good understanding of the Scala programming language, a basic knowledge of statistics, a keen interest in Big Data processing, and this book!

Multivariate Bernoulli classification

The previous example applies the Gaussian distribution to features that are essentially binary, {UP=1, DOWN=0}, representing the change in value. The mean is computed as the ratio of the number of observations for which x_i = UP to the total number of observations.
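The ratio described above can be sketched in a few lines of Scala. This is a minimal illustration, assuming the observations of a single feature are encoded as a sequence of 0/1 integers; the names `BinaryMean`, `UP`, `DOWN`, and `meanOfBinary` are hypothetical and are not taken from the book's source code.

```scala
object BinaryMean {
  // Hypothetical encoding of the binary feature values {UP=1, DOWN=0}
  val UP = 1
  val DOWN = 0

  /** Mean of a binary feature: number of UP observations over the total count. */
  def meanOfBinary(xs: Seq[Int]): Double = {
    require(xs.nonEmpty, "at least one observation is required")
    xs.count(_ == UP).toDouble / xs.size
  }
}
```

For instance, `BinaryMean.meanOfBinary(Seq(1, 0, 1, 1))` evaluates to 0.75, since three of the four observations are UP.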

As stated in the first section, the Gaussian distribution is more appropriate for continuous features, or for binary features in very large labeled datasets. Because its features are binary, the example is a natural candidate for the Bernoulli model.

Model

The Bernoulli model differs from the Naïve Bayes classifier in that it penalizes the features x that do not have any observations; the Naïve Bayes classifier ignores them [5:10].

Note

The Bernoulli mixture model

For a feature function f_i, with f_i = 1 if the feature is observed and f_i = 0 if it is not, and with p the probability that the feature x_i is observed for class C_j:

$$p(x_i \mid C_j) = f_i \, p + (1 - f_i)(1 - p)$$
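As a rough sketch of how the Bernoulli likelihood f·p + (1 − f)(1 − p) of a binary feature might be computed, where p is the probability that the feature is observed for a given class and f is 1 if it is observed and 0 otherwise, consider the following. The object and method names (`BernoulliSketch`, `bernoulli`, `logLikelihood`) are assumptions made for illustration and do not reproduce the book's `Likelihood` class.

```scala
object BernoulliSketch {
  /** Bernoulli likelihood of a single binary feature.
    *
    * @param p probability that the feature is observed for the class, in [0, 1]
    * @param f 1 if the feature is observed, 0 otherwise
    */
  def bernoulli(p: Double, f: Int): Double = {
    require(p >= 0.0 && p <= 1.0, s"p = $p is not a probability")
    require(f == 0 || f == 1, s"f = $f is not binary")
    f * p + (1 - f) * (1.0 - p)
  }

  /** Log-likelihood of a binary feature vector for one class,
    * assuming the features are conditionally independent given the class.
    */
  def logLikelihood(ps: Seq[Double], fs: Seq[Int]): Double = {
    require(ps.size == fs.size, "one probability per feature is required")
    ps.zip(fs).map { case (p, f) => math.log(bernoulli(p, f)) }.sum
  }
}
```

Note that an unobserved feature (f = 0) still contributes the term log(1 − p) to the sum, so a feature that is usually present for a class lowers the score when it is absent. This is the penalty that distinguishes the Bernoulli model from a classifier that ignores unobserved features.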

Implementation

The implementation of the Bernoulli model consists of modifying the Likelihood.score scoring function by using the Bernoulli...