In this chapter, we presented the generic naive Bayes approach, starting from Bayes' theorem and the philosophy behind it. The naiveness of these algorithms comes from the assumption that all causes are conditionally independent: each cause contributes in the same way in every combination, and the presence of a specific cause cannot alter the probability of the others. This assumption is seldom realistic; however, under certain conditions, it's possible to show that the internal dependencies cancel each other out, so that the resulting probability appears unaffected by their relations.
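As a compact reminder of the formula behind this discussion (the symbols y for the class and x_1, ..., x_n for the features are generic placeholders introduced here, not notation taken from this summary), Bayes' theorem combined with the conditional-independence assumption gives:

\[
P(y \mid x_1, \ldots, x_n) = \frac{P(y)\,P(x_1, \ldots, x_n \mid y)}{P(x_1, \ldots, x_n)} \propto P(y)\prod_{i=1}^{n} P(x_i \mid y)
\]

It is this factorization of the joint likelihood into a product of per-feature terms that makes the approach "naive", and also what makes it so cheap to train and evaluate.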
scikit-learn provides three naive Bayes implementations: Bernoulli, multinomial, and Gaussian. The only difference among them is the probability distribution they adopt. The Bernoulli variant is a binary algorithm, particularly useful when a feature can be either present or absent. The multinomial variant assumes feature vectors in which each element represents the number of times a feature appears (or, very often, its frequency). The Gaussian variant, finally, is suited to continuous features, which it models for each class with a normal distribution.
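To make the distinction concrete, the following minimal sketch (the toy arrays are made-up placeholders, not data used in this chapter) fits each of the three scikit-learn classes on input matching its assumed distribution:

import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB

y = np.array([0, 0, 1, 1])

# Bernoulli: binary features (each feature is either present or absent)
X_binary = np.array([[1, 0, 1],
                     [1, 1, 0],
                     [0, 1, 1],
                     [0, 0, 1]])
print(BernoulliNB().fit(X_binary, y).predict([[1, 0, 0]]))

# Multinomial: non-negative counts (or frequencies) of each feature
X_counts = np.array([[3, 0, 1],
                     [2, 1, 0],
                     [0, 4, 2],
                     [1, 2, 3]])
print(MultinomialNB().fit(X_counts, y).predict([[2, 0, 1]]))

# Gaussian: continuous features, modeled per class as normally distributed
X_continuous = np.array([[1.2, 0.7],
                         [0.9, 1.1],
                         [3.4, 2.8],
                         [2.9, 3.1]])
print(GaussianNB().fit(X_continuous, y).predict([[3.0, 3.0]]))

Only the input representation changes from one variant to the next; the fit/predict interface is identical across the three classes.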