Rapid - Apache Mahout Clustering Designs

Summary


In this chapter, we discussed Canopy clustering and how it can be used to estimate the initial number of clusters. We looked at how the Canopy clustering algorithm works and used Mahout's implementation of Canopy on a text dataset to generate canopies. We also covered how Canopy clustering is implemented with MapReduce, walked through an example class from the Mahout examples that visualizes the resulting clusters, and reviewed the code that converts a CSV file into the vector format used by Mahout.

In the next chapter, we will move on to the Fuzzy K-means clustering algorithm, another important technique among clustering algorithms.