IPython Interactive Computing and Visualization Cookbook

By: Cyrille Rossant

Detecting hidden structures in a dataset with clustering


A large part of unsupervised learning is devoted to the clustering problem. The goal is to group similar points together without using any labels. Clustering is a hard problem, as the very definition of clusters (or groups) is not necessarily well posed. In most datasets, stating that two points should belong to the same cluster may be context-dependent or even subjective.

There are many clustering algorithms. We will see a few of them in this recipe, applied to a toy example.
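
Before turning to those algorithms, the ambiguity mentioned above can be made concrete with a small sketch. It is not one of the algorithms used in this recipe: the helper `group_by_gap`, the `max_gap` parameter, and the sample points are all hypothetical and chosen only for illustration. The sketch groups one-dimensional points by starting a new group whenever the gap to the previous point exceeds a threshold, and shows that the same data yields different clusters depending on the scale we pick.

```python
def group_by_gap(points, max_gap):
    """Group 1D points: sort them, then start a new group whenever the
    distance to the previous point is larger than max_gap.
    (Hypothetical helper, for illustration only.)"""
    groups = []
    for p in sorted(points):
        if groups and p - groups[-1][-1] <= max_gap:
            # Close enough to the previous point: same group.
            groups[-1].append(p)
        else:
            # Gap too large (or first point): open a new group.
            groups.append([p])
    return groups


# Made-up toy data: two nearby clumps and one distant clump.
points = [0.0, 0.5, 1.0, 3.0, 3.5, 10.0, 10.5]

fine = group_by_gap(points, max_gap=1.0)    # small scale
coarse = group_by_gap(points, max_gap=2.5)  # larger scale

print(len(fine), fine)
print(len(coarse), coarse)
```

With the smaller threshold, the points 1.0 and 3.0 fall into different groups; with the larger one, they end up in the same group. Neither answer is wrong, which is why the choice of algorithm and of its parameters matters in the steps that follow.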

How to do it...

  1. Let's import the libraries:

    In [1]: from itertools import permutations
            import numpy as np
            import sklearn
            import sklearn.decomposition as dec
            import sklearn.cluster as clu
            import sklearn.datasets as ds
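            # sklearn.grid_search was removed in scikit-learn 0.20;
            # its contents now live in sklearn.model_selection.
            import sklearn.model_selection as gs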
            import matplotlib.pyplot as plt
            %matplotlib inline
  2. Let's generate a random dataset with three clusters:

    In [2]: X, y = ds.make_blobs(n_samples=200, n_features...