scikit-learn Cookbook

By: Trent Hauck

Overview of this book

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. Its consistent API and plethora of features help solve a wide range of machine learning problems.

The book starts by walking through different methods to prepare your data, whether it is a dataset with missing values or text columns whose categories need to be turned into indicator variables. After the data is ready, you'll learn different techniques aligned with different objectives, from datasets with known outcomes, such as sales by state, to more complicated problems, such as clustering similar customers. Finally, you'll learn how to polish your algorithm to ensure that it's both accurate and resilient to new datasets.

Quantizing an image with KMeans clustering


Image processing is an important topic in which clustering has some application. Python has several very good image-processing libraries; scikit-image, a "sister" project of scikit-learn, is worth a look if you want to do anything complicated.

Getting ready

We will have some fun in this recipe. The goal is to use clustering to blur an image.
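As a rough preview of the idea, the following is a minimal sketch of quantizing colors with KMeans. It assumes a small synthetic random image rather than the photo used in this recipe, and the variable names and the choice of `n_clusters=5` are illustrative assumptions, not necessarily what the recipe itself uses.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for a photo: a 20x30 RGB image of random values.
rng = np.random.RandomState(0)
img = rng.randint(0, 256, size=(20, 30, 3)).astype(np.uint8)

# Flatten to one row per pixel so each pixel is a point in RGB space.
height, width, channels = img.shape
pixels = img.reshape(-1, channels).astype(float)

# n_clusters=5 is an arbitrary, illustrative choice.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pixels)

# Replace every pixel with the center of the cluster it was assigned to,
# then restore the original (height, width, channels) layout.
quantized = kmeans.cluster_centers_[kmeans.labels_]
quantized = quantized.reshape(height, width, channels).astype(np.uint8)
```

The result has the same shape as the input but contains at most five distinct colors, which is what gives a quantized image its flattened, blurred look.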

We'll make use of SciPy to read the image. The image is loaded as a 3-dimensional array: the first two dimensions index the rows (height) and columns (width) of the image, and the third dimension holds the RGB values for each pixel. First, download the image:

# in your terminal
$ wget http://blog.trenthauck.com/assets/headshot.jpg
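To make the array layout described above concrete, here is a small sketch that uses a synthetic NumPy array instead of the downloaded photo. The 4x6 size and the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical stand-in image: 4 rows (height) by 6 columns (width),
# with 3 values (red, green, blue) per pixel.
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint the pixel at row 1, column 2 pure red.
img[1, 2] = [255, 0, 0]

# The shape unpacks into height, width, and number of color channels.
height, width, channels = img.shape

# Slicing the last axis gives a 2-D array holding only the red values.
red_channel = img[:, :, 0]
```

Note that the first index selects the row (the vertical position) and the second selects the column, so `img[1, 2]` is the pixel in the second row and third column.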

How to do it…

Now, let's read the image in Python:

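>>> import matplotlib.pyplot as plt
>>> from scipy import ndimage
>>> # ndimage.imread was removed in SciPy 1.2; on newer versions, plt.imread is one alternative
>>> img = ndimage.imread("headshot.jpg")
>>> plt.imshow(img)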

The following image is seen:

Hey, that's (a younger) me!

Now that we have the image, let's check its dimensions...