# Principal Component Analysis

Another common approach to reducing the dimensionality of a high-dimensional dataset is based on the assumption that the total variance is usually not explained equally by all components. If $p_{data}$ is a multivariate Gaussian distribution with covariance matrix $\Sigma$, then the entropy (which is a measure of the amount of information contained in the distribution) is as follows:

$$H(p_{data}) = \frac{1}{2} \ln \det \left( 2 \pi e \Sigma \right)$$

Therefore, components with a very low variance contribute little to the entropy and provide little additional information. Hence, they can be removed without a significant loss of accuracy.
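The relationship between variance and entropy can be checked numerically. The following is a minimal sketch, assuming a diagonal covariance matrix whose variances (4.0, 1.0, and 0.01) are hypothetical values chosen only for illustration; the helper `gaussian_entropy` is likewise an illustrative name, not part of any library. For a diagonal covariance, the entropy splits into one term per component, and each term grows with the variance of that component:

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian: 0.5 * ln det(2*pi*e*cov)."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = cov.shape[0]
    # slogdet is numerically safer than log(det(...)) for small variances
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("cov must be positive definite")
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet)


# Hypothetical variances, chosen only for illustration
variances = np.array([4.0, 1.0, 0.01])
cov = np.diag(variances)

# Entropy of the full distribution
total_entropy = gaussian_entropy(cov)

# With a diagonal covariance, the entropy is a sum of per-component terms
per_component = 0.5 * np.log(2.0 * np.pi * np.e * variances)

# Fraction of the total variance carried by each component
explained_ratio = variances / variances.sum()

print("Total entropy:", total_entropy)
print("Per-component terms:", per_component)
print("Explained variance ratio:", explained_ratio)
```

In this sketch, the third component carries well under 1% of the total variance and has the smallest entropy term. Note that differential entropy terms can be negative when the variance is small, so the contributions are best compared in relative terms rather than read as absolute amounts of information.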

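Just as we did with FA, let's consider a dataset drawn from $p_{data}$ (for simplicity, we assume that it's zero-centered, although this isn't necessary):

$$X = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_M\} \quad \text{where} \quad \bar{x}_i \in \mathbb{R}^n$$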

Our goal is to define a linear transformation, $\bar{z} = A^T \bar{x}$ (a vector is normally considered a column; therefore, $\bar{x}$ has a shape of $(n \times 1)$), such as the following:

As we want...