Besides generating statistics to validate the quality of the generated clusters, you can use known data clusters as the ground truth to compare different clustering methods. In this recipe, we will demonstrate how clustering methods differ with regard to data with known clusters.
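As a minimal sketch of this idea, the following compares k-means and hierarchical clustering against known labels. It is not the recipe's own code: the built-in iris dataset stands in for data with known clusters, and the `purity()` helper is a hypothetical function introduced only for illustration.

```r
# Sketch: compare two clustering methods against known labels.
# Assumptions: iris stands in for data with known clusters;
# purity() is a hypothetical helper, not part of the recipe.
data(iris)
features <- scale(iris[, 1:4])   # standardize the four numeric columns
truth <- iris$Species            # known classes used as the ground truth

set.seed(22)                     # k-means starts from random centers
km <- kmeans(features, centers = 3, nstart = 10)
hc <- cutree(hclust(dist(features), method = "ward.D2"), k = 3)

# Share of points that belong to the majority class of their cluster
purity <- function(clusters, classes) {
  tab <- table(clusters, classes)
  sum(apply(tab, 1, max)) / length(classes)
}

# Cross-tabulate cluster assignments against the known classes
km_tab <- table(km$cluster, truth)
hc_tab <- table(hc, truth)

km_purity <- purity(km$cluster, truth)
hc_purity <- purity(hc, truth)

print(km_tab)
print(hc_tab)
cat("k-means purity:", km_purity, "\n")
cat("hclust purity: ", hc_purity, "\n")
```

A method whose clusters line up more closely with the known classes gives a cross-table dominated by one class per row and a purity closer to 1.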
In this recipe, we will continue to use handwritten digits as clustering inputs; you can find the image on the author's GitHub page: https://github.com/ywchiu/ml_R_cookbook/tree/master/CH9.
Perform the following steps to cluster digits with different clustering techniques:
- First, install and load the png package:
> install.packages("png")
> library(png)
- Then, read the image from handwriting.png and transform the data into a scatter plot:
> img2 = readPNG("handwriting.png", TRUE)
> img3 = img2[,nrow(img2):1]
> b = cbind(as.integer(which(img3 < -1) %% 28), which(img3 < -1) / 28)
>...