Python Machine Learning (Wiley)

By: Wei-Meng Lee

Overview of this book

With computing power increasing exponentially and costs decreasing at the same time, this is the best time to learn machine learning using Python. Machine learning tasks that once required enormous processing power are now possible on desktop machines. Python Machine Learning begins by covering some fundamental Python libraries that make machine learning possible. You'll learn how to manipulate arrays of numbers with NumPy and use pandas to deal with tabular data. Once you have a firm foundation in the basics, you'll explore machine learning using Python and the scikit-learn library. You'll learn how to visualize data by plotting different types of charts and graphs using the matplotlib library. You'll gain a solid understanding of how the various machine learning algorithms work behind the scenes. The later chapters explore common machine learning algorithms, such as regression, clustering, and classification, and discuss how to deploy the models that you have built so that they can be used by client applications running on mobile and desktop devices. By the end of the book, you'll have all the knowledge you need to begin doing machine learning using Python.

Kernel Trick

The points in a dataset are not always linearly separable. Consider the points shown in Figure 8.11.

Figure 8.11: A scatter plot of two groups of points distributed in circular fashion

You can see that it is not possible to draw a straight line to separate the two sets of points. With some manipulation, however, you can make this set of points linearly separable. This technique is known as the kernel trick: it transforms data into a higher-dimensional space so that, after the transformation, there is a clear dividing margin between the classes of data.

Adding a Third Dimension

To make the points linearly separable, we can add a third dimension, say the z‐axis, and define z to be:

$z = x^2 + y^2$
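
To see numerically why this choice of z helps, here is a minimal sketch in plain Python. It is not the book's code: the helper names make_ring and lift, the two radii (1 and 3), and the noise-free sampling are all assumptions made for illustration. Because z is the squared distance from the origin, every point on the inner ring gets z close to 1 and every point on the outer ring gets z close to 9, so a flat plane between those values divides the two groups.

```python
import math
import random


def make_ring(radius, n, rng):
    """Return n (x, y) points on a circle of the given radius (hypothetical helper)."""
    points = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


def lift(points):
    """Add the third dimension z = x^2 + y^2 to each 2D point."""
    return [(x, y, x * x + y * y) for x, y in points]


rng = random.Random(0)                  # fixed seed so the run is repeatable
inner = lift(make_ring(1.0, 50, rng))   # assumed inner radius
outer = lift(make_ring(3.0, 50, rng))   # assumed outer radius

max_inner_z = max(z for _, _, z in inner)
min_outer_z = min(z for _, _, z in outer)

# Any plane z = c with max_inner_z < c < min_outer_z separates the two groups;
# here we take the midpoint.
threshold = (max_inner_z + min_outer_z) / 2.0
separable = (all(z < threshold for _, _, z in inner)
             and all(z > threshold for _, _, z in outer))
print(separable)
```

With real, noisy data the two bands of z values would be wider, but the same idea applies as long as they do not overlap.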

With this third dimension, the points become linearly separable. This is difficult to visualize unless you plot the points in a 3D chart. The following code snippet does just that:

%matplotlib inline
from mpl_toolkits.mplot3d...