Principal Component Analysis Flashcards

1
Q

What is PCA?

A

PCA is a simple but useful method, often used as a preprocessing step for supervised methods (whitening, dimensionality reduction) and for compression or visualization. It finds the directions in feature space along which the data has maximum variance.
PCA is a projection method whose measure of “interestingness” is variance.
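A minimal usage sketch (my addition, not part of the card), assuming scikit-learn is available and using synthetic data:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Keep the 2 directions of maximum variance, e.g. for visualization.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)        # shape (100, 2)
print(pca.explained_variance_)     # variance along each kept direction
```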

2
Q

Assumptions

A

The data from which we compute the covariance matrix is assumed to be centered (zero mean), and the dimensions should be comparable in scale. Note that variance, and hence PCA, is unfortunately susceptible to outliers.
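A minimal sketch of the centering assumption in NumPy (variable names are my own):

```python
import numpy as np

def center(X):
    """Subtract the per-feature mean so every column has zero mean."""
    return X - X.mean(axis=0)

Xc = center(np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]))
C = Xc.T @ Xc / (len(Xc) - 1)   # sample covariance of the centered data
print(Xc.mean(axis=0))          # ≈ [0. 0.]
```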

3
Q

How can we formally define the interesting directions we are looking for?

A

They are the directions in feature space along which the variance of the (centered) data is maximal, i.e. unit vectors that maximize the variance of the projections onto them, as formalized below.
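As a formula (my notation, assuming centered data x with covariance matrix C), the first principal direction solves:

```latex
% First principal direction: the unit vector maximizing projected variance.
\mathbf{w}_1
  = \arg\max_{\|\mathbf{w}\|=1} \operatorname{Var}\!\left(\mathbf{w}^{\top}\mathbf{x}\right)
  = \arg\max_{\|\mathbf{w}\|=1} \mathbf{w}^{\top} C \, \mathbf{w}
```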

4
Q

The eigenvalue problem

A

Maximizing the projected variance wᵀCw under the constraint ‖w‖ = 1 (via a Lagrange multiplier) leads to the eigenvalue problem Cw = λw. The most informative, orthogonal directions (complex features) are therefore given by the eigenvectors of the covariance matrix. These eigenvectors are called the principal components, and the variance along any one of these directions is given by the corresponding eigenvalue.
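A minimal NumPy sketch of the eigenvalue problem (synthetic data; names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.0, 0.0, 0.3]])
X = X - X.mean(axis=0)                 # center first (see the assumptions card)

C = np.cov(X, rowvar=False)            # covariance matrix of the features
eigvals, eigvecs = np.linalg.eigh(C)   # eigh: symmetric matrices, ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending

# Each eigenvalue is the variance along its eigenvector (a principal component).
print(eigvals)
```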

5
Q

Principal components (PCs)

A

The normalized eigenvectors of the covariance matrix are called its principal components (PCs). They can be used to project the data onto the subspace spanned by the PCs with the largest variance, as in the sketch below.
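A minimal projection sketch in NumPy, continuing the eigendecomposition example above (function name and data are my own):

```python
import numpy as np

def pca_project(X, k):
    """Project centered data X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)   # ascending order
    W = eigvecs[:, ::-1][:, :k]            # top-k PCs as columns
    return Xc @ W                          # (n_samples, k) projected data

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca_project(X, 2)
```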

6
Q

Properties of PCA

A

Whitening: the data is transformed so that its covariance matrix becomes the identity matrix, by scaling the variance along every principal direction to 1 (equivalently, multiplying the data by the inverse square root of the covariance matrix). Variance is scale-sensitive: scaling one dimension can change all PCs, so variance is only informative if the different dimensions are comparable. If the scales are incomparable, decorrelate with PCA and then rescale the variance along all directions to 1.
Decorrelation: with respect to the eigenbasis, C is diagonal, i.e. all entries besides those on the diagonal are 0. Thus, the transformation into the eigenbasis yields uncorrelated features.
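A minimal PCA-whitening sketch in NumPy (my own; eps is an assumed numerical guard):

```python
import numpy as np

def whiten(X, eps=1e-8):
    """PCA-whiten X: decorrelate, then scale each direction to unit variance."""
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    # Rotate into the eigenbasis (decorrelation), then divide by sqrt(variance).
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))
Xw = whiten(X)
print(np.cov(Xw, rowvar=False).round(3))   # ≈ identity matrix
```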

7
Q

The restriction is that PCA is a linear combination method. What are its extensions?

A

nonlinear features ⇒ kernel PCA (“kernel trick”); see the sketch after this list

no underlying generative model ⇒ probabilistic PCA, factor analysis
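A rough kernel-PCA sketch (my own, with an assumed RBF kernel and hypothetical function name): instead of the covariance matrix, one eigendecomposes the centered kernel matrix.

```python
import numpy as np

def kernel_pca(X, k, gamma=1.0):
    """Project training points onto the top-k kernel PCs (RBF kernel)."""
    # Pairwise squared distances -> RBF kernel matrix.
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    # Center the kernel matrix in feature space.
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals, eigvecs = eigvals[::-1][:k], eigvecs[:, ::-1][:, :k]
    # Projection of training point j onto component i is sqrt(lambda_i) * a_ij.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0))

rng = np.random.default_rng(0)
Z = kernel_pca(rng.normal(size=(50, 3)), k=2)
```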
