Hierarchical clustering and k-means clustering take a heuristic approach and do not depend on a formal model. Model-based clustering, by contrast, assumes that the data were generated by a statistical model, fits a variety of candidate models with the EM algorithm, selects the most likely one, and then uses it to infer the most likely number of clusters. In this recipe, we will demonstrate how to use the model-based method to determine the most likely number of clusters.
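As a minimal sketch of this idea, the snippet below fits mixture models to a small synthetic dataset with `Mclust` and reads the selected model off the result. The data, the names `x` and `fit`, and the search range `G = 1:5` are illustrative assumptions and are not part of the recipe.

```r
library(mclust)

# Synthetic two-group data (assumption: illustrative only, not the customer data)
set.seed(42)
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 4, sd = 0.5), ncol = 2)
)

# Fit candidate mixture models with 1 to 5 components using EM
fit <- Mclust(x, G = 1:5)

fit$G                      # number of clusters in the selected model
fit$modelName              # covariance structure of the selected model
table(fit$classification)  # how many observations fall in each cluster
```

Restricting `G` keeps the search small; leaving it out lets `Mclust` use its default range.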
To cluster the customer data with a model-based method, you need to have completed the previous recipe, which generates the customer dataset.
Perform the following steps to carry out model-based clustering:
- First, install and load the mclust package:
> install.packages("mclust")
> library(mclust)
- You can then perform model-based clustering on the customer dataset:
> mb = Mclust(customer)
> plot(mb)
- Then, you can press the 1 key to obtain the BIC against...
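The interactive menu can also be bypassed by naming the plot through the `what` argument of `plot`. The following is a minimal sketch of that approach. It assumes `customer` comes from the previous recipe; the stand-in data frame, with its hypothetical columns `a` and `b`, is generated only so that the snippet runs on its own.

```r
library(mclust)

# Stand-in for the recipe's `customer` data (hypothetical values), used only
# when the real object is not already in the workspace.
if (!exists("customer")) {
  set.seed(1)
  customer <- data.frame(
    a = c(rnorm(30, mean = 0), rnorm(30, mean = 5)),
    b = c(rnorm(30, mean = 0), rnorm(30, mean = 5))
  )
}

mb <- Mclust(customer)

plot(mb, what = "BIC")             # BIC for each model and number of components
plot(mb, what = "classification")  # observations marked by assigned cluster
summary(mb)                        # selected model, number of clusters, cluster sizes
```

Naming the plot this way is convenient in scripts and reports, where no one is present to answer the menu prompt.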