Summary
There is a reason why the Naïve Bayes model is one of the first supervised learning techniques taught in a machine learning course: it is simple and robust. In fact, it is the first technique that should come to mind when you consider creating a model from a labeled dataset, as long as the features are conditionally independent given the class.
This chapter also introduced you to the basics of text mining as an application of Naïve Bayes.
Despite all its benefits, the Naïve Bayes classifier assumes that the features are conditionally independent, a limitation that cannot always be overcome. In the classification of documents or news releases, Naïve Bayes incorrectly assumes that terms are semantically independent; for instance, the two entities age and date of birth are highly correlated. The discriminative classifiers described in the next few chapters address some of the limitations of Naïve Bayes [5:14].
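The effect of violating the conditional independence assumption can be illustrated with a minimal sketch. The class labels "A" and "B", the probabilities, and the function name naive_bayes_posterior are hypothetical and chosen for illustration only; they do not come from this chapter. The second model simply duplicates the single feature, which is an extreme case of two perfectly correlated features such as age and date of birth.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observation):
    """Posterior p(class | observation) under conditional independence.

    priors:      {class: p(class)}
    likelihoods: {class: [p(feature_i = 1 | class), ...]}
    observation: list of 0/1 feature values
    """
    scores = {}
    for label, prior in priors.items():
        # Naive Bayes multiplies the per-feature likelihoods as if the
        # features were independent given the class.
        per_feature = [
            p if x == 1 else 1.0 - p
            for p, x in zip(likelihoods[label], observation)
        ]
        scores[label] = prior * prod(per_feature)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Hypothetical two-class problem with equal priors.
priors = {"A": 0.5, "B": 0.5}

# One binary feature versus the same feature counted twice.
one_feature = {"A": [0.8], "B": [0.4]}
duplicated = {"A": [0.8, 0.8], "B": [0.4, 0.4]}

single = naive_bayes_posterior(priors, one_feature, [1])
double = naive_bayes_posterior(priors, duplicated, [1, 1])

print(f"p(A | x)    = {single['A']:.3f}")
print(f"p(A | x, x) = {double['A']:.3f}")
```

The duplicated feature carries no new information, yet the posterior for class "A" moves further from the prior because the same evidence is counted twice. This is one way to see why strongly correlated features can make a Naïve Bayes classifier overconfident.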
This chapter does not treat temporal dependencies, sequence of events, or conditional...