Principal component analysis (PCA) is another exploratory method you can use to separate your samples into groups. PCA converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. PCA is widely used as a dimension reduction technique to help visualize data. PCA is different from LDA because it doesn't rely on class information to explore the relationship between the variable values and the sample group numbers. For example, let's perform a PCA to explore our simulated fish.data
dataset. Before performing PCA, it is important to remember that the magnitude of the variables and any skews in the data will influence the resulting principal components. So, we need to scale and transform our data.
First, we recommend you to log transform the data (if necessary). Then, run PCA using the prcomp()
function as follows:
> fish.data.mx <- as.matrix(fish.data[, 1:3]) > fish.data...