Apache Mahout Clustering Designs

Visualizing clusters


The Mahout example package provides classes to generate a sample dataset.

For K-means, DisplayKMeans is the class that displays the clusters. You can run the class directly. In its code, the points are generated as follows:

generateSamples(500, 1, 1, 3);   // 500 samples, mean (1, 1), sd 3
generateSamples(300, 1, 0, 0.5); // 300 samples, mean (1, 0), sd 0.5
generateSamples(300, 0, 2, 0.1); // 300 samples, mean (0, 2), sd 0.1

The data is a set of randomly generated 2D points. Each group of points is drawn from a normal distribution centered at a mean location, with a constant standard deviation.
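To make the sampling idea concrete, here is a minimal sketch of how such points could be drawn in plain Java. It is not Mahout's implementation: the class name GaussianSampleSketch is hypothetical, and the sketch assumes the four arguments mean sample count, mean x, mean y, and standard deviation, with the same standard deviation applied to both axes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout: illustrates the sampling idea only.
class GaussianSampleSketch {

    // Draws num 2D points from a normal distribution centered at (mx, my),
    // using the same standard deviation sd on both axes (an assumption).
    static List<double[]> generateSamples(int num, double mx, double my, double sd, Random rng) {
        List<double[]> points = new ArrayList<>(num);
        for (int i = 0; i < num; i++) {
            // nextGaussian() returns a value with mean 0 and standard deviation 1,
            // so scale by sd and shift by the mean.
            double x = mx + sd * rng.nextGaussian();
            double y = my + sd * rng.nextGaussian();
            points.add(new double[] {x, y});
        }
        return points;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so runs are repeatable
        List<double[]> data = new ArrayList<>();
        data.addAll(generateSamples(500, 1, 1, 3, rng));
        data.addAll(generateSamples(300, 1, 0, 0.5, rng));
        data.addAll(generateSamples(300, 0, 2, 0.1, rng));
        System.out.println("Generated " + data.size() + " points"); // Generated 1100 points
    }
}
```

With these parameters, the first group is spread widely around (1, 1), while the other two form tight blobs near (1, 0) and (0, 2). A mix like this is a reasonable test case for a clustering algorithm.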

Once you run this class, you will see the clusters, as shown here:

The final clustering produced by the algorithm is drawn as bold red circles. The console shows the output related to point generation and cluster formation.