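As briefly mentioned, agglomerative clustering refers to algorithms that start with each case in its own cluster and repeatedly merge the two closest clusters until a single cluster remains. Let's start with the data we used last in the previous chapter: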
1 rownames(life.scaled) = life$country
2 a = hclust(dist(life.scaled))
3 par(mfrow = c(1, 2))
4 plot(a, hang = -1, xlab = "Case number", main = "Euclidean")
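Readers who do not have the life.scaled data loaded can try the same sequence of calls on a built-in dataset. The following is a minimal sketch, assuming R's built-in USArrests data as a stand-in (it is not the data used in this chapter); the names arrests.scaled, d, and hc are chosen here purely for illustration.

```r
# Stand-in data: USArrests ships with base R (not the life data of this chapter).
arrests.scaled <- scale(USArrests)   # standardize each column; row names are kept

# Euclidean distances between all pairs of rows (the default method of dist()).
d <- dist(arrests.scaled)

# Agglomerative clustering; hclust() uses complete linkage unless told otherwise.
hc <- hclust(d)

# Dendrogram with all leaf labels aligned at the bottom (hang = -1).
plot(hc, hang = -1, xlab = "State", main = "Euclidean")

# The merge history has one row per merge step: n - 1 rows for n cases.
nrow(hc$merge)   # 49 merges for the 50 states
```

Because the row names of the scaled matrix are carried through dist() into the hclust object, the leaves of the dendrogram are labeled with state names without any extra step.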
In the life.scaled code above, we started by setting the name of each country as the row name of the corresponding case (line 1), so that it is displayed on the graph. The function hclust() was then used to generate a hierarchical agglomerative clustering solution from the data (line 2). The algorithm uses a distance matrix, provided as its argument and computed here by dist() (whose default is the Euclidean distance), to determine how to build the hierarchy of clusters. We discussed measures of distance in the previous chapter; please refer to that explanation if in doubt. Finally, the hclust object a created at line 2 was plotted as a dendrogram (line 4; see the following diagram). At line 3, we set the plotting area to...