MLlib provides two closely related models for dimensionality reduction: Principal Components Analysis (PCA) and Singular Value Decomposition (SVD).
Types of dimensionality reduction
Principal components analysis
PCA operates on a data matrix X and seeks to extract a set of k principal components from X. The principal components are uncorrelated with one another and are computed such that the first principal component accounts for the largest variation in the input data. Each subsequent principal component is, in turn, computed such that it accounts for the largest remaining variation, provided that it is uncorrelated with the principal components computed so far.
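To make the description above concrete, here is a minimal sketch in plain Python, not MLlib's API. It assumes a tiny, made-up two-dimensional dataset and uses the closed-form eigendecomposition of the 2x2 sample covariance matrix. The function name pca_2d and the sample points are illustrative only.

```python
import math

def pca_2d(points):
    """Return the centered points and (variance, unit_vector) pairs, largest variance first."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]] of the centered data.
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix; each eigenvalue is
    # the variance of the data along the matching eigenvector.
    mid = (a + c) / 2
    half_gap = math.hypot((a - c) / 2, b)
    variances = [mid + half_gap, mid - half_gap]

    if abs(b) > 1e-12:
        # (lam - c, b) solves (C - lam * I) v = 0 whenever b is non-zero.
        vectors = [(lam - c, b) for lam in variances]
    elif a >= c:
        # Covariance is already diagonal: the components are the axes.
        vectors = [(1.0, 0.0), (0.0, 1.0)]
    else:
        vectors = [(0.0, 1.0), (1.0, 0.0)]

    components = []
    for lam, (vx, vy) in zip(variances, vectors):
        norm = math.hypot(vx, vy)
        components.append((lam, (vx / norm, vy / norm)))
    return centered, components

# Illustrative data only (hypothetical values, not taken from the text).
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

centered, components = pca_2d(points)
(var1, pc1), (var2, pc2) = components

# Project the centered data onto each component and measure how the two
# sets of scores co-vary; for principal components this should be ~0.
scores1 = [x * pc1[0] + y * pc1[1] for x, y in centered]
scores2 = [x * pc2[0] + y * pc2[1] for x, y in centered]
covariance = sum(s1 * s2 for s1, s2 in zip(scores1, scores2)) / (len(points) - 1)

print("variance along PC1:", var1)
print("variance along PC2:", var2)
print("covariance between projections:", covariance)
```

The first component carries at least as much variance as the second, and the covariance between the two projections is zero up to rounding error, which is the "uncorrelated" property in practice. For more than two dimensions the closed-form step would be replaced by a general eigensolver or an SVD.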
In this way, the k principal components returned are guaranteed...