Book Image

Learning Data Mining with Python

Book Image

Learning Data Mining with Python

Overview of this book

Table of Contents (20 chapters)
Learning Data Mining with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Support vector machines


Support vector machines (SVMs) are classification algorithms based on a simple and intuitive idea. It performs classification between only two classes (although we can extend it to more classes). Suppose that our two classes can be separated by a line such that any points above the line belong to one class and any below the line belong to the other class. SVMs find this line and use it for prediction, much the same way as linear regression works. SVMs, however, find the best line for separating the dataset. In the following figure, we have three lines that separate the dataset: blue, black, and green. Which would you say is the best option?

Intuitively, a person would normally choose the blue line as the best option, as this separates the data the most. That is, it has the maximum distance from any point in each class.

Finding this line is an optimization problem, based on finding the lines of margin with the maximum distance between them.

Note

The derivation of these...