Principal Component Analysis Flashcards
What is PCA
PCA is a simple but useful method, often used as a preprocessing step for supervised methods (whitening, dimensionality reduction), for compression, or for visualization. It finds the directions in feature space along which the data has maximum variance.
PCA is a projection method that uses variance as its measure of “interestingness”.
Assumptions
The data from which the covariance matrix is calculated is centered, and the dimensions should be on comparable scales. PCA is unfortunately susceptible to outliers.
How can we formally define the interesting directions we are looking for?
By asking: what are the directions in feature space along which the variance is maximal?
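One common way to write this question down is sketched below. The notation is an assumption and does not come from the cards: x_1, ..., x_n are the centered samples, C is their sample covariance matrix, and w is a unit-length direction.

```latex
C = \frac{1}{n-1} \sum_{i=1}^{n} x_i x_i^\top,
\qquad
w_1 = \arg\max_{\|w\| = 1} \operatorname{Var}\!\left(w^\top x\right)
    = \arg\max_{\|w\| = 1} w^\top C w
```

Adding the constraint with a Lagrange multiplier \lambda and setting the gradient to zero gives C w = \lambda w, so the maximizer is an eigenvector of C, and the variance along it is w^\top C w = \lambda.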
The eigenvalue problem
The most informative, orthogonal directions (complex features) are given by the eigenvectors of the covariance matrix. These eigenvectors are called the principal components and the variance along any one of these directions is given by the corresponding eigenvalue.
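A minimal NumPy sketch of this card, assuming a small synthetic data set (the mixing matrix `A` and all variable names are made up for illustration). It checks that the variance of the data along each eigenvector matches the corresponding eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 500 samples, 3 correlated features.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.2, 0.3]])
X = rng.normal(size=(500, 3)) @ A.T

# Center the data, then form the sample covariance matrix.
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (Xc.shape[0] - 1)

# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, so reorder to put the largest variance first.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Variance of the data projected onto each eigenvector (one per column).
proj_var = (Xc @ eigvecs).var(axis=0, ddof=1)

print(np.allclose(proj_var, eigvals))
```

`np.linalg.eigh` is used instead of `np.linalg.eig` because the covariance matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.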
Principal components (PCs)
The normalized eigenvectors of the covariance matrix are called its principal components (PCs). They can be used to project the data onto the subspace spanned by the PCs with the largest variance.
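The projection onto the leading PCs could look roughly like this. The helper `pca_project` and the toy data are hypothetical names introduced for this sketch, not part of any library.

```python
import numpy as np

def pca_project(X, k):
    """Project X (n_samples, n_features) onto its k leading PCs.

    Returns the projected data, the PCs as columns, and their variances.
    """
    Xc = X - X.mean(axis=0)                  # center each feature
    C = np.cov(Xc, rowvar=False)             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)     # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    W = eigvecs[:, top]                      # columns are unit-norm PCs
    return Xc @ W, W, eigvals[top]

# Hypothetical data: 5 independent features with very different spreads.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

Z, W, variances = pca_project(X, k=2)
print(Z.shape)
```

Keeping only the first `k` columns is what makes this a dimensionality reduction: the discarded directions are the ones with the smallest variance.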
Properties of PCA
Whitening: Whitening transforms the data so that its covariance matrix becomes the identity matrix. This is done by multiplying the centered data by the inverse square root of the covariance matrix, C^(-1/2). Variance is scale-sensitive: rescaling one dimension can change all PCs, so variance is only informative if the different dimensions are comparable. If the scales are incomparable, the variance information can be removed by scaling the variance along all directions to 1 after decorrelation by PCA.
Decorrelation: Since the covariance matrix C is diagonal with respect to its eigenbasis, all off-diagonal entries are 0. Thus, the transformation into the eigenbasis yields uncorrelated features.
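A small sketch of PCA whitening under the assumption that all eigenvalues are strictly positive (the synthetic data below is chosen so that this holds). In practice a small constant is often added to the eigenvalues before taking the square root; that is left out here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data whose features live on very different scales.
M = np.array([[10.0, 0.0, 0.0],
              [3.0, 1.0, 0.0],
              [0.0, 0.5, 0.1]])
X = rng.normal(size=(1000, 3)) @ M

Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)

# Step 1: rotate into the eigenbasis (decorrelation).
# Step 2: divide each direction by sqrt(eigenvalue) so its variance is 1.
X_white = (Xc @ eigvecs) / np.sqrt(eigvals)

# The covariance of the whitened data should be the identity matrix.
C_white = np.cov(X_white, rowvar=False)
print(np.allclose(C_white, np.eye(3), atol=1e-6))
```

This variant stays in the eigenbasis. Rotating back with `X_white @ eigvecs.T` would correspond to multiplying by C^(-1/2) directly and also yields identity covariance.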
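To illustrate decorrelation, the following sketch builds two strongly correlated features (a made-up example) and compares the correlation before the change of basis with the off-diagonal covariance after it.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: x2 is mostly a scaled copy of x1 plus a little noise.
x1 = rng.normal(size=2000)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=2000)
X = np.column_stack([x1, x2])

Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)

# Express the data in the eigenbasis of its covariance matrix.
Y = Xc @ eigvecs
C_Y = np.cov(Y, rowvar=False)

corr_before = np.corrcoef(X, rowvar=False)[0, 1]
off_diag_after = C_Y[0, 1]
print(corr_before, off_diag_after)
```

The covariance in the new basis is diagonal up to floating-point error, and its diagonal entries are the eigenvalues of `C`.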
Restriction: PCA is a linear method, meaning its features are linear combinations of the original dimensions. Extensions:
nonlinear features ⇒ kernel PCA (“kernel trick”)
no underlying generative model ⇒ probabilistic PCA, factor analysis