Mahout's mahout-example package provides classes that generate a sample dataset and run the reference clustering implementations over it.
For Canopy, DisplayCanopy is the class that displays the clusters. You can run the class directly. As per the code in the class, points are generated as follows:
generateSamples(500, 1, 1, 3);   // 500 samples of sd 3
generateSamples(300, 1, 0, 0.5); // 300 samples of sd 0.5
generateSamples(300, 0, 2, 0.1); // 300 samples of sd 0.1
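To make the arguments of these calls concrete, here is a minimal, self-contained sketch of what such a generator might look like. It is not Mahout's code: the class name SampleGenerator, the explicit Random parameter, and the reading of the arguments as (count, mean x, mean y, standard deviation) are assumptions made for illustration. Only the count and the standard deviation are confirmed by the comments above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the sample generator; not the Mahout implementation.
final class SampleGenerator {

    // Draws `num` 2-D points from a normal distribution centred on (mx, my),
    // using the same standard deviation `sd` on both axes (assumed meaning of the arguments).
    static List<double[]> generateSamples(int num, double mx, double my, double sd, Random rng) {
        List<double[]> points = new ArrayList<>(num);
        for (int i = 0; i < num; i++) {
            double x = mx + sd * rng.nextGaussian();
            double y = my + sd * rng.nextGaussian();
            points.add(new double[] {x, y});
        }
        return points;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so repeated runs give the same points
        List<double[]> samples = new ArrayList<>();
        samples.addAll(generateSamples(500, 1, 1, 3, rng));   // wide cloud around (1, 1)
        samples.addAll(generateSamples(300, 1, 0, 0.5, rng)); // tighter cloud around (1, 0)
        samples.addAll(generateSamples(300, 0, 2, 0.1, rng)); // very tight cloud around (0, 2)
        System.out.println("Generated " + samples.size() + " points");
    }
}
```

Under these assumptions the three calls produce 1,100 points in three overlapping clouds of very different spread, which gives the clustering algorithm something non-trivial to separate.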
Once you run this class, it displays the clusters as shown here:
The clusters drawn in bold red are the final clustering produced by the algorithm. In the console, you can find the output related to the generation of points and the formation of clusters.