R for Data Science

By: Dan Toomey

Table of Contents (19 chapters)

Chapter 6. Data Analysis – Clustering

Clustering is the process of grouping objects so that the objects within a group are more similar to each other than to objects in other groups. Clustering is also called cluster analysis.
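
To make the definition concrete, here is a minimal sketch using base R only. The six data points and their group labels are hypothetical, chosen for illustration; the code compares the average distance between points in the same group with the average distance between points in different groups.

```r
# Six hypothetical 1-D points forming two visually obvious groups.
x <- c(1.0, 1.2, 0.9, 5.0, 5.3, 4.8)
groups <- c(1, 1, 1, 2, 2, 2)            # assumed labels, for illustration

d <- as.matrix(dist(x))                  # pairwise Euclidean distances
same <- outer(groups, groups, "==")      # TRUE where two points share a group
pairs <- upper.tri(d)                    # count each pair once, skip the diagonal

within  <- mean(d[same & pairs])         # average distance inside groups
between <- mean(d[!same & pairs])        # average distance across groups

within < between                         # TRUE for a sensible grouping
```

A grouping in which the within-group distance is much smaller than the between-group distance matches the definition above; clustering algorithms search for such groupings when the labels are not known in advance.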

R has several tools for clustering data, which we will investigate in this chapter:

  • K-means, including optimal number of clusters

  • Partitioning Around Medoids (PAM)

  • Bayesian hierarchical clustering

  • Affinity propagation clustering

  • Computing a gap statistic to estimate the number of clusters

  • Hierarchical clustering
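
As a quick preview of some of the tools listed above, the following is a minimal sketch and not the chapter's own example. It assumes the built-in iris dataset, an arbitrary choice of three clusters, and the cluster package (a recommended package that ships with standard R installations). Bayesian hierarchical clustering and affinity propagation rely on add-on packages and are not shown here.

```r
library(cluster)   # provides pam() and clusGap()

set.seed(42)                       # k-means and the gap statistic use random starts
x <- scale(iris[, 1:4])            # numeric columns only, standardised

# K-means with an assumed k = 3 and several random restarts
km <- kmeans(x, centers = 3, nstart = 25)

# Partitioning Around Medoids with the same k
pm <- pam(x, k = 3)

# Agglomerative hierarchical clustering, then cut the tree into 3 groups
hc <- hclust(dist(x), method = "average")
hc_groups <- cutree(hc, k = 3)

# Gap statistic for k = 1..6; B is kept small so the sketch runs quickly
gap <- clusGap(x, FUNcluster = kmeans, K.max = 6, B = 20, nstart = 25)

# Cross-tabulate the k-means and PAM assignments
table(kmeans = km$cluster, pam = pm$clustering)
```

Printing gap shows the gap statistic for each candidate number of clusters, and plot(hc) draws the dendrogram for the hierarchical result. The choice of k = 3, the "average" linkage, and B = 20 are illustrative settings only.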