Chapter 3
Naïve Bayes and the Incredible Lightness of Being an Idiot
In the previous chapter, you hit the ground running with a bit of unsupervised learning. You looked at k-means clustering, which is like the chicken nugget of the data mining world: simple, intuitive, and useful. Delicious too.
In this chapter you're going to move from unsupervised into supervised artificial intelligence models by training up a naïve Bayes model, which is, for lack of a better metaphor, also a chicken nugget, albeit a supervised one.
As mentioned in Chapter 2, in supervised artificial intelligence, you “train” a model to make predictions using data that's already been classified. The most common use of naïve Bayes is for document classification. Is this e-mail spam or ham? Is this tweet happy or angry? Should this intercepted satellite phone call be flagged for further investigation by the spooks? You provide “training data,” i.e., classified examples...