Classification is a technique to put data into different classes based on its utility. For example, an e-commerce company can apply two labels, namely will buy or will not buy, to the potential visitors.
This classification is done by providing some already labeled data to machine-learning algorithms called training data, as you know already. The challenge is how to mark the boundary between the two classes. Let's take a simple example, as shown in the following figure:
In the preceding case, we designated gray and black to the "will not buy" and "will buy" labels, respectively. Here, drawing a line between the two classes is easy, as follows:
Is this the best we can do? Not really. Let's try to do a better job. The black classifier is not really equidistant from the will buy and will not buy carts. Let's make a better attempt:
This looks good, doesn't it? This, in fact, is what the SVM algorithm does. You can see in the preceding diagram that there are...