Mastering Scientific Computing with R

Clustering


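An alternative to PCA for exploring group structure is k-means clustering, an unsupervised method that partitions the data into k clusters so that each observation belongs to the cluster with the nearest mean. That mean (the centroid) serves as the prototype of the cluster. Because kmeans() starts from randomly chosen centers, we call set.seed() first to make the result reproducible. Here we ask for five clusters using the first three columns of fish.data, then plot the results with plot3d() from the rgl package as follows: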

> set.seed(44)
> cl <- kmeans(fish.data[,1:3],5)
> fish.data$cluster <- as.factor(cl$cluster)
> plot3d(fish.log.pca$x[,1:3], col=fish.data$cluster, main="k-means clusters")

Note

The colors differ from those in the 3D plot of the PCA results because cluster numbers are assigned arbitrarily and carry no meaning of their own. However, the overall distribution of the groups is similar.
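
The nearest-mean property described above can be checked directly. The following is a minimal sketch, not part of the fish example: it uses the built-in iris data as a stand-in because fish.data is not reproduced here, and the choice of three clusters and nstart = 25 are assumptions made purely for illustration.

```r
# Stand-in data (an assumption): the four numeric columns of the built-in
# iris data set, used because fish.data is not available here.
x <- as.matrix(iris[, 1:4])

set.seed(44)
# nstart = 25 reruns k-means from 25 random starts and keeps the best fit;
# both 3 and 25 are illustrative choices.
cl <- kmeans(x, centers = 3, nstart = 25)

# Squared Euclidean distance from every observation to every cluster center:
# one row per observation, one column per cluster.
d2 <- sapply(seq_len(nrow(cl$centers)), function(j) {
  rowSums(sweep(x, 2, cl$centers[j, ])^2)
})

# Index of the nearest center for each observation.
nearest <- max.col(-d2, ties.method = "first")

# Compare the nearest center with the cluster that kmeans() assigned.
all(nearest == cl$cluster)

# Total within-cluster sum of squares, the quantity k-means tries to minimize.
cl$tot.withinss
```

Rerunning the sketch with a different seed and nstart = 1 is a quick way to see how much a single random start can change the partition, which is why the seed is fixed in the fish example.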

Let's now evaluate how well the clusters correspond to the known fish species by cross-tabulating the cluster assignments against the fish column as follows:

> with(fish.data, table(cluster, fish))
       fish
cluster Bluegill Bowfin Carp Goldeye Largemouth_Bass
      1        0      0   14      39              18
      2        0     27   12       0              22
      3        0     23   13   ...