In this chapter, we took our first stab at probability theory, learning about random variables and conditional probabilities, which allowed us to get a glimpse of Bayes' theorem, the underpinning of a naive Bayes classifier. We talked about the difference between discrete and continuous random variables, between likelihoods and probabilities, between priors and evidence, and between normal and naive Bayes classifiers.
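To recap how those last few terms fit together, writing $C$ for a class (ham or spam) and $x$ for the observed features, Bayes' theorem ties the posterior, likelihood, prior, and evidence into a single relationship, with the "naive" part being the extra independence assumption on the features:

```latex
\underbrace{P(C \mid x)}_{\text{posterior}}
  = \frac{\overbrace{P(x \mid C)}^{\text{likelihood}}
          \;\overbrace{P(C)}^{\text{prior}}}
         {\underbrace{P(x)}_{\text{evidence}}},
\qquad
P(x \mid C) \approx \prod_i P(x_i \mid C)
\quad \text{(the ``naive'' assumption)}.
```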
Finally, our theoretical knowledge would have been of no use if we hadn't applied it to a practical example. We obtained a dataset of raw email messages, parsed it, and used a variety of feature extraction approaches to train Bayesian classifiers that label each email as either ham or spam.
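As a rough sketch of that kind of pipeline (not the chapter's exact code: the toy messages and labels below are invented for illustration, and scikit-learn's `CountVectorizer` and `MultinomialNB` stand in for the bag-of-words features and Bayesian model):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the parsed email bodies and their labels
# (0 = ham, 1 = spam); the chapter used a real email dataset here.
emails = [
    "Meeting moved to noon tomorrow",
    "WIN a FREE prize, click now",
    "Can you review the attached report?",
    "Cheap meds, limited time offer",
]
labels = [0, 1, 0, 1]

# Bag-of-words feature extraction: each message becomes a vector
# of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# Multinomial naive Bayes models those counts per class under the
# naive independence assumption from the equation above.
classifier = MultinomialNB()
classifier.fit(X, labels)

# Classify an unseen message.
new_message = vectorizer.transform(["FREE prize waiting, click here"])
print(classifier.predict(new_message))  # expected: [1], i.e., spam
```

Swapping in a different vectorizer or naive Bayes variant here is the kind of feature-extraction comparison the chapter's experiments explored.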
In the next chapter, we will switch gears and, for once, discuss what to do if we have to deal with unlabeled data.