The standard R package stats provides the kmeans function for K-means clustering. We also use the cluster package to plot the results of our cluster analysis.
If you have not already downloaded the files for this chapter, do so now and ensure that the auto-mpg.csv file is in your R working directory. Also, ensure that you have installed the cluster package.
To perform cluster analysis using K-means clustering, follow these steps:
Read the data:
> auto <- read.csv("auto-mpg.csv")
Define a convenience function to standardize the relevant variables and append the resulting variables to the original data:
rdacb.scale.many <- function (dat, column_nos) {
  nms <- names(dat)
  for (col in column_nos) {
    name <- paste0(nms[col], "_z")
    dat[name] <- scale(dat[, col])
  }
  cat(paste("Scaled", length(column_nos), "variable(s)\n"))
  dat
}
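To see what the convenience function does to each column it processes, the following minimal sketch applies scale() to a single column of a small data frame. The data frame toy and its values are made up for illustration and are not taken from auto-mpg.csv; the as.numeric() wrapper is an addition here (the function above stores the result of scale() directly).

```r
# Toy data frame; the column names and values are made up for illustration.
toy <- data.frame(mpg = c(18, 15, 36, 26),
                  weight = c(3504, 3693, 2130, 2600))

# scale() centers a column on its mean and divides by its standard deviation.
# It returns a one-column matrix, so as.numeric() converts it to a plain vector.
toy$mpg_z <- as.numeric(scale(toy$mpg))

mean(toy$mpg_z)   # effectively 0, up to floating-point error
sd(toy$mpg_z)     # 1
names(toy)        # "mpg" "weight" "mpg_z"
```

Standardizing matters for K-means because the algorithm relies on distances: without it, a variable measured in large units (such as weight) would dominate one measured in small units (such as mpg).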
Use the preceding convenience function to standardize the variables...