-
Book Overview & Buying
-
Table Of Contents
Python Machine Learning by Example - Third Edition
By :
Movie recommendation can be framed as a machine learning classification problem. If it is predicted that you like a movie, for example, then it will be on your recommended list, otherwise, it won't. Let's get started by learning the important concepts of machine learning classification.
Classification is one of the main instances of supervised learning. Given a training set of data containing observations and their associated categorical outputs, the goal of classification is to learn a general rule that correctly maps the observations (also called features or predictive variables) to the target categories (also called labels or classes). Putting it another way, a trained classification model will be generated after the model learns from the features and targets of training samples, as shown in the first half of Figure 2.1. When new or unseen data comes in, the trained model will be able to determine their desired class memberships. Class information will be predicted based on the known input features using the trained classification model, as displayed in the second half of Figure 2.1:

Figure 2.1: The training and prediction stages in classification
In general, there are three types of classification based on the possibility of class output—binary, multiclass, and multi-label classification. We will cover them one by one in the next section.
This classifies observations into one of two possible classes. The example of spam email filtering we encounter every day is a typical use case of binary classification, which identifies email messages (input observations) as spam or not spam (output classes). Customer churn prediction is another frequently mentioned example, where a prediction system takes in customer segment data and activity data from CRM systems and identifies which customers are likely to churn.
Another application in the marketing and advertising industry is click-through prediction for online ads—that is, whether or not an ad will be clicked, given users' cookie information and browsing history. Last but not least, binary classification has also been employed in biomedical science, for example, in early cancer diagnosis, classifying patients into high or low risk groups based on MRI images.
As demonstrated in Figure 2.2, binary classification tries to find a way to separate data from two classes (denoted by dots and crosses):

Figure 2.2: Binary classification example
Don't forget that predicting whether a person likes a movie is also a binary classification problem.
This type of classification is also referred to as multinomial classification. It allows more than two possible classes, as opposed to only two in binary cases. Handwritten digit recognition is a common instance of classification and has a long history of research and development since the early 1900s. A classification system, for example, can learn to read and understand handwritten ZIP codes (digits from 0 to 9 in most countries) by which envelopes are automatically sorted.
Handwritten digit recognition has become a "Hello, World!" in the journey of studying machine learning, and the scanned document dataset constructed from the National Institute of Standards and Technology, called MNIST (Modified National Institute of Standards and Technology), is a benchmark dataset frequently used to test and evaluate multiclass classification models. Figure 2.3 shows four samples taken from the MNIST dataset:

Figure 2.3: Samples from the MNIST dataset
In Figure 2.4, the multiclass classification model tries to find segregation boundaries to separate data from the following three different classes (denoted by dots, crosses, and triangles):

Figure 2.4: Multiclass classification example
In the first two types of classification, target classes are mutually exclusive and a sample is assigned one, and only one, label. It is the opposite in multi-label classification. Increasing research attention has been drawn to multi-label classification by the nature of the omnipresence of categories in modern applications. For example, a picture that captures a sea and a sunset can simultaneously belong to both conceptual scenes, whereas it can only be an image of either a cat or dog in a binary case, or one type of fruit among oranges, apples, and bananas in a multiclass case. Similarly, adventure films are often combined with other genres, such as fantasy, science fiction, horror, and drama.
Another typical application is protein function classification, as a protein may have more than one function—storage, antibody, support, transport, and so on.
A typical approach to solving an n-label classification problem is to transform it into a set of n binary classification problems, where each binary classification problem is handled by an individual binary classifier.
Refer to Figure 2.5 to see the restructuring of a multi-label classification problem into a multiple binary classification problem:

Figure 2.5: Transforming three-label classification into three independent binary classifications
To solve these problems, researchers have developed many powerful classification algorithms, among which Naïve Bayes, support vector machine (SVM), decision tree, and logistic regression are often used. In the following sections, we will cover the mechanics of Naïve Bayes and its in-depth implementation, along with other important concepts, including classifier tuning and classification performance evaluation. Stay tuned for upcoming chapters that cover the other classification algorithms.