Rapid - Apache Mahout Clustering designs

Summary


In this chapter, we discussed K-means clustering and how the algorithm works, and we ran the Mahout implementation of K-means on a text dataset. To do so, we downloaded the data and converted it to a vector format that Mahout can read.

We then looked at how to inspect and interpret the resulting clusters with the clusterdumper utility, and at how to visualize them using the example class that ships with the Mahout examples.

In the next chapter, we will discuss Canopy clustering, a technique that can also be used to estimate a suitable value of K (the number of clusters) for K-means clustering.