12.10 HOW TO PERFORM PRINCIPAL COMPONENTS ANALYSIS USING R
Import the clothing_store_PCA_training and clothing_store_PCA_test data sets as clothes_train and clothes_test, respectively. To simplify code that comes later, we will separate the training and test data into X and y variables.
y <- clothes_train$Sales.per.Visit
X <- clothes_train[, c(1:5)]
X_test <- clothes_test[, c(1:5)]
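To see what this split produces, here is a minimal sketch on a made-up data frame. The name toy_train and all of its values are hypothetical, and we assume, as the code above implies, that the five predictors occupy the first five columns and that the response is stored in Sales.per.Visit.

```r
# Hypothetical stand-in for clothes_train; every value here is made up
toy_train <- data.frame(
  Days.since.Purchase    = c(10, 45, 200, 3),
  Purchase.Visits        = c(5, 2, 1, 12),
  Days.on.File           = c(400, 150, 700, 90),
  Days.between.Purchases = c(80, 75, 700, 7.5),
  Diff.Items.Purchased   = c(9, 4, 1, 20),
  Sales.per.Visit        = c(55.2, 110.0, 32.5, 78.9)
)

y_toy <- toy_train$Sales.per.Visit   # response, as a numeric vector
X_toy <- toy_train[, c(1:5)]         # predictors, as a five-column data frame

dim(X_toy)     # number of rows and columns among the predictors
names(X_toy)   # the response column is no longer present
```

Selecting columns by position is compact but depends on column order, so it is worth checking names(X) on the real data before moving on.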
Remember to standardize the predictor variables.
X_z <- as.data.frame(scale(X))
colnames(X_z) <- c("Days.since.Purchase.Z", "Purchase.Visits.Z", "Days.on.File.Z",
                   "Days.between.Purchases.Z", "Diff.Items.Purchased.Z")
To obtain the correlation matrix, we use the cor() command.
round(cor(X_z), 3)
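The standardize-then-correlate steps can be checked on made-up data. The sketch below is for illustration only: the data frame demo and its columns u, v, and w are hypothetical and are not the clothing-store data. It confirms that scale() gives each column mean 0 and standard deviation 1, that standardizing leaves the correlations unchanged, and that the covariance matrix of standardized variables equals their correlation matrix.

```r
# Made-up data for illustration only; not the clothing-store data
set.seed(1)
demo <- data.frame(u = rnorm(50, mean = 10, sd = 3),
                   v = rnorm(50, mean = 200, sd = 40))
demo$w <- 2 * demo$u + rnorm(50)   # w is built to correlate with u

# Standardize: subtract each column mean, divide by its standard deviation
demo_z <- as.data.frame(scale(demo))

# Each standardized column should have mean 0 and standard deviation 1
round(colMeans(demo_z), 10)
apply(demo_z, 2, sd)

# Correlation matrix of the standardized variables, shown to 3 decimals
R <- cor(demo_z)
round(R, 3)

# Standardizing does not change the correlations
all.equal(cor(demo), R)

# For standardized data, the covariance matrix equals the correlation matrix
all.equal(cov(demo_z), R)
```

Both all.equal() calls should return TRUE. The last check is the reason the correlation matrix of the standardized predictors is the natural starting point here.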
The cor() command, with the predictor variables X_z as input, is placed inside the round() command. The second input of the round() command is the number of decimal places the answers will be rounded...