Centroid-based clustering is a method in which each cluster is represented by a central vector, and each object is assigned to the cluster whose central vector is nearest, so that the squared distances from the objects to their cluster's central vector are minimized.
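To make the idea concrete, here is a minimal sketch in base R of a single assignment-and-update step. The points and the two starting central vectors are made up for illustration, and the variable names (points, centroids, sq_dist, assignment, new_centroids) are hypothetical. It is not a full clustering algorithm, only the nearest-centroid rule described above.

```r
# Toy data (made up): six points in two dimensions, one point per row
points <- matrix(c(1.0, 1.2,
                   0.8, 1.0,
                   1.1, 0.9,
                   5.0, 5.1,
                   5.2, 4.9,
                   4.8, 5.0),
                 ncol = 2, byrow = TRUE)

# Two hypothetical central vectors (centroids), one per row
centroids <- matrix(c(1, 1,
                      5, 5),
                    ncol = 2, byrow = TRUE)

# Squared Euclidean distance from every point to every centroid
# (rows = points, columns = centroids)
sq_dist <- sapply(seq_len(nrow(centroids)), function(k) {
  rowSums(sweep(points, 2, centroids[k, ])^2)
})

# Assign each point to the centroid with the smallest squared distance
assignment <- apply(sq_dist, 1, which.min)
assignment

# Update step: each centroid becomes the mean of the points assigned to it
new_centroids <- t(sapply(sort(unique(assignment)), function(k) {
  colMeans(points[assignment == k, , drop = FALSE])
}))
new_centroids
```

With this toy data, the first three points fall into cluster 1 and the last three into cluster 2. Repeating the two steps until the assignments stop changing is the usual way an iterative centroid-based method converges.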
In this section, we will create clusters using the K-means algorithm and implement it in R.
We need the fpc package (Flexible Procedures for Clustering), which implements various clustering algorithms in R:
install.packages("fpc")
library(fpc)
Before creating the clusters using the K-means algorithm, we need to identify the ideal number of clusters for the given dataset. We can estimate it with the pamk function, which performs partitioning around medoids for a range of candidate cluster counts and selects the best one. The nc element of the returned object (clusters$nc) holds the ideal number of clusters:
clusters <- pamk(wdata)
n <- clusters$nc
n
[1] 5
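Because wdata is not defined in this excerpt, the following sketch repeats the same steps on a stand-in dataset so that it can be run on its own. The choice of the built-in iris measurements, the scaling step, the seed, and the names demo_data, demo_clusters, demo_n, and fit are all assumptions made for illustration. It also shows one plausible way to pass the estimated count to the base R kmeans function.

```r
library(fpc)  # assumes install.packages("fpc") has already been run

# Stand-in data (assumption): the four numeric columns of iris, standardised
demo_data <- scale(iris[, 1:4])

# pamk() tries a range of cluster counts (2 to 10 by default)
# and keeps the one it scores best
demo_clusters <- pamk(demo_data)
demo_n <- demo_clusters$nc
demo_n

# Use the estimated number of clusters as the 'centers' argument of kmeans()
set.seed(42)  # K-means starts from random centers, so fix the seed
fit <- kmeans(demo_data, centers = demo_n, nstart = 10)

# Number of observations that ended up in each cluster
table(fit$cluster)
```

The nstart argument reruns K-means from several random starts and keeps the best result, which makes the outcome less sensitive to an unlucky initialisation.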
The n vector will hold...