Python Machine Learning (Wiley)

By: Wei-Meng Lee

Overview of this book

With computing power increasing exponentially and costs decreasing at the same time, this is the best time to learn machine learning using Python. Machine learning tasks that once required enormous processing power are now possible on desktop machines. Python Machine Learning begins by covering some fundamental libraries used in Python that make machine learning possible. You'll learn how to manipulate arrays of numbers with NumPy and use pandas to deal with tabular data. Once you have a firm foundation in the basics, you'll explore machine learning using Python and the scikit-learn libraries. You'll learn how to visualize data by plotting different types of charts and graphs using the matplotlib library. You'll gain a solid understanding of how the various machine learning algorithms work behind the scenes. The later chapters explore the common machine learning algorithms, such as regression, clustering, and classification, and discuss how to deploy the models that you have built, so that they can be used by client applications running on mobile and desktop devices. By the end of the book, you'll have all the knowledge you need to begin machine learning using Python.

What Is Unsupervised Learning?

So far, all of the machine learning algorithms that you have seen are supervised learning algorithms. That is, the datasets they were trained on have all been labeled, classified, or categorized. Datasets that have been labeled are known as labeled data, while datasets that have not been labeled are known as unlabeled data. Figure 10.1 shows an example of labeled data.

[Figure: table of labeled data listing each house's size, the year in which it was built, and the price at which it was sold]

Figure 10.1: Labeled data

Based on the size of the house and the year in which it was built, you have the price at which the house was sold. The selling price of the house is the label, and your machine learning model can be trained to give the estimated worth of the house based on its size and the year in which it was built.
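To make the idea of a label concrete, here is a minimal sketch of training on labeled data. The house records are invented for illustration and are not the values in Figure 10.1, and the use of scikit-learn's LinearRegression is an assumption about which model to fit; any supervised regressor would serve the same purpose.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical labeled data (made up for illustration, not from Figure 10.1).
# Features: [size in square feet, year built]
X = np.array([
    [1500, 1995],
    [2000, 2003],
    [1200, 1987],
    [2400, 2010],
    [1800, 1999],
])

# Labels: the price at which each house was sold
y = np.array([250_000, 340_000, 190_000, 420_000, 300_000])

# Supervised learning: the model sees both the features and the labels
model = LinearRegression()
model.fit(X, y)

# Estimate the worth of a house that is not in the training data
new_house = np.array([[1700, 1998]])
estimated_price = float(model.predict(new_house)[0])
print(f"Estimated price: {estimated_price:,.0f}")
```

The key point is that fit() receives two arguments: the features X and the labels y. Without y, there would be nothing for the model to learn to predict.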

Unlabeled data, on the other hand, is data without labels. For example, Figure 10.2 shows a dataset containing the waist circumference and corresponding leg length of a group of people. Given this set of data, you can try to cluster the people into groups based on their waist circumference and leg length...
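As a rough sketch of what clustering unlabeled data can look like, the following example groups people by waist circumference and leg length. The measurements are invented and are not the values in Figure 10.2. The choice of scikit-learn's KMeans, and of two clusters, are both assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled data (made up for illustration, not from Figure 10.2).
# Columns: [waist circumference in cm, leg length in cm]
X = np.array([
    [67, 71],
    [69, 73],
    [70, 72],
    [95, 88],
    [97, 90],
    [99, 89],
])

# Unsupervised learning: there is no y; the algorithm sees only the features.
# n_clusters=2 is an assumed value chosen for this toy dataset.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Each person is assigned a cluster number (0 or 1)
print(cluster_ids)
```

Unlike the supervised case, fit_predict() receives only X. The cluster numbers it returns are arbitrary identifiers, not labels with a predefined meaning.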