Scala for Machine Learning

By: Patrick R. Nicolas

The Multivariate Bernoulli classification


The previous example applies the Gaussian distribution to features that are essentially binary (UP = 1 and DOWN = 0), each representing the direction of a change in value. The mean value is computed as the ratio of the number of observations for which x_i = UP to the total number of observations.
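The ratio described above can be sketched in a few lines of Scala. This is only an illustration: the names `BinaryMean`, `UP`, `DOWN`, and `mean` are hypothetical and not taken from the book's library, and the encoding UP = 1, DOWN = 0 is the one given in the text.

```scala
// Sketch only: BinaryMean and its members are hypothetical names.
object BinaryMean {
  // Encoding taken from the text: UP = 1, DOWN = 0
  val UP = 1
  val DOWN = 0

  /** Ratio of UP observations to the total number of observations. */
  def mean(xs: Seq[Int]): Double = {
    require(xs.nonEmpty, "at least one observation is required")
    require(xs.forall(x => x == UP || x == DOWN), "observations must be UP or DOWN")
    xs.count(_ == UP).toDouble / xs.size
  }
}
```

For a binary feature this ratio is also the estimate of the probability that the feature takes the value UP, which is the quantity a Bernoulli model works with directly.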

As stated in the first section, the Gaussian distribution is better suited to continuous features, or to binary features when the labeled dataset is very large. The example is therefore a natural candidate for the Bernoulli model.

Model

The Bernoulli model differs from the Naïve Bayes classifier in that it explicitly penalizes a feature x that has no observation, whereas the Naïve Bayes classifier ignores it [5:10].
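The difference between penalizing and ignoring an unobserved feature can be made concrete with a small sketch. It assumes the usual Bernoulli form, in which a binary feature f with class probability p contributes p * f + (1 - p) * (1 - f), and it contrasts that with a score that multiplies the probabilities of observed features only. All names here (`BernoulliSketch`, `likelihood`, `bernoulliScore`, `presentOnlyScore`) are hypothetical and are not the book's API.

```scala
// Sketch only: BernoulliSketch and its methods are hypothetical names.
object BernoulliSketch {

  /** Probability of one binary feature given a class: p*f + (1 - p)*(1 - f). */
  def likelihood(p: Double, f: Int): Double = {
    require(p >= 0.0 && p <= 1.0, s"p must be in [0, 1], found $p")
    require(f == 0 || f == 1, s"f must be 0 or 1, found $f")
    p * f + (1.0 - p) * (1 - f)
  }

  /** Product over all features: an unobserved feature contributes (1 - p). */
  def bernoulliScore(ps: Seq[Double], fs: Seq[Int]): Double = {
    require(ps.size == fs.size, "one probability per feature is required")
    ps.zip(fs).map { case (p, f) => likelihood(p, f) }.product
  }

  /** Product over observed features only: unobserved features are ignored. */
  def presentOnlyScore(ps: Seq[Double], fs: Seq[Int]): Double = {
    require(ps.size == fs.size, "one probability per feature is required")
    ps.zip(fs).collect { case (p, 1) => p }.product
  }
}
```

With class probabilities `Seq(0.9, 0.8)` and observations `Seq(1, 0)`, `bernoulliScore` returns about 0.18 because the missing second feature contributes a factor of 1 - 0.8, while `presentOnlyScore` returns 0.9 because that feature is skipped. A production version would typically sum logarithms rather than multiply probabilities to avoid underflow.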

Note

The Bernoulli mixture model

M8: Let f_k be a feature function with f_k = 1 if the feature is observed and f_k = 0 otherwise, and let p be the probability that the observed feature x_k belongs to the class C_j. The posterior probability is then computed as follows:
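With f_k and p defined as above, the standard Bernoulli expression for a single feature can be written as follows. This is a reconstruction from those definitions using the usual textbook form; the left-hand-side notation is an assumption.

```latex
p(x_k \mid C_j) = p \, f_k + (1 - p)\,(1 - f_k), \qquad f_k \in \{0, 1\}
```

When f_k = 1 the expression reduces to p, and when f_k = 0 it reduces to 1 - p, so an unobserved feature lowers the score of any class in which that feature is common. The same quantity is often written in the equivalent product form p^{f_k} (1 - p)^{1 - f_k}.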

Implementation...