Our first foray into predictive analytics began with regression techniques for predicting continuous variables. In this chapter, we will discuss a perhaps even more popular class of statistical learning techniques: classification.
All classification techniques have at least one thing in common: we train a learner on input for which the correct classifications are known, with the intention of using the trained model on new data whose class is unknown. In this sense, classification is a set of algorithms and methods for predicting categorical variables.
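To make the train-then-predict pattern concrete, here is a minimal sketch in Python using one of the simplest possible classifiers, a single nearest neighbor. The two features (fraction of capitalized words, number of links), the tiny data set, and the function names `train` and `predict` are all made up for illustration; this is not any particular library's API.

```python
import math


def train(examples):
    # For a nearest-neighbor learner, "training" is simply
    # memorizing the labeled examples.
    return list(examples)


def predict(model, features):
    # Return the class of the stored example closest to the new
    # observation (Euclidean distance).
    nearest = min(model, key=lambda example: math.dist(example[0], features))
    return nearest[1]


# Hypothetical labeled input: (features, known class).
# Features: (fraction of capitalized words, number of links).
labeled = [
    ((0.9, 8.0), "spam"),
    ((0.8, 6.0), "spam"),
    ((0.1, 1.0), "not spam"),
    ((0.2, 0.0), "not spam"),
]

model = train(labeled)

# New data whose class is unknown.
new_message = (0.85, 7.0)
predicted = predict(model, new_message)
print(predicted)
```

The value being predicted is a category, not a number, which is what separates this from the regression setting.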
Whether you know it or not, statistical learning algorithms performing classification are all around you. For example, if you've ever accidentally checked the spam folder of your email and been horrified, you can thank your lucky stars that your email is run through sophisticated classification mechanisms that automatically mark spam as such, so you don't have to see it. On the other...