PCA Flashcards

1
Q

What is PCA?

A

Decomposition of multivariate data into the N most important components

Identifies the principal directions in which the data varies

Explains the variance-covariance structure of a set of variables

Organises the data into a set of composite variables
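A minimal sketch of this idea, assuming scikit-learn and an illustrative 3-feature toy dataset (all names and numbers are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy multivariate data: 100 samples, 3 correlated features
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=100), rng.normal(size=100)])

# Decompose into the 2 most important components (composite variables)
pca = PCA(n_components=2)
scores = pca.fit_transform(X)

print(pca.components_)                # principal directions in which the data varies
print(pca.explained_variance_ratio_)  # share of the total variance each PC explains
```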

2
Q

What are the 2 approaches to dimensionality reduction? Describe them.

A
  1. Feature Selection:
    Choose a subset of the original features and keep only the data for those features.
  2. Feature Extraction:
    Create a new, smaller set of features by combining the original features (both approaches are sketched below).
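A rough sketch of the contrast, assuming a plain numpy array with features as columns (the column indices and projection are chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))   # 50 samples, 4 features

# Feature selection: keep a subset of the existing columns unchanged
selected = X[:, [0, 2]]        # e.g. keep features 0 and 2

# Feature extraction: build new features as combinations of the originals
W = rng.normal(size=(4, 2))    # some projection (PCA would derive W from the data)
extracted = X @ W              # each new column mixes all 4 original features

print(selected.shape, extracted.shape)  # both are (50, 2)
```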
3
Q

What are the 2 types of feature extraction? Describe them.

A
  1. Classification - ICA
    The feature extraction mapping enhances discriminatory information in the low-dimensional space.
  2. Signal Representation - PCA
    The feature extraction mapping accurately represents the samples in the low-dimensional space (both are illustrated below).
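As a rough illustration, assuming scikit-learn, both mappings reduce the same toy data to 2 dimensions but optimise different criteria (the mixing setup below is invented for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(2)
S = np.column_stack([np.sin(np.linspace(0, 8, 200)),            # two independent sources
                     np.sign(np.sin(np.linspace(0, 6, 200)))])
X = S @ rng.normal(size=(2, 4))                                  # mixed into 4 observed features

# Signal representation: PCA keeps the directions of largest variance
X_pca = PCA(n_components=2).fit_transform(X)

# ICA instead looks for statistically independent components,
# which is what makes it useful for discriminatory / source structure
X_ica = FastICA(n_components=2, random_state=0).fit_transform(X)

print(X_pca.shape, X_ica.shape)
```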
4
Q

What underlying quantity does PCA work on?

A

Total sample variance

5
Q

What are principal components?

A

Eigenvectors of the covariance matrix that describe the directions of variance in the data
All PCs are orthogonal, i.e. the covariance between any 2 PCs = 0
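A quick numerical check of this property, assuming numpy and a made-up correlated dataset:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))   # correlated toy data
Xc = X - X.mean(axis=0)                                    # centre each column

# The eigenvectors of the covariance matrix are the principal components
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# The PCs are orthogonal ...
print(np.allclose(eigvecs.T @ eigvecs, np.eye(3)))         # True

# ... so the projected scores have (near-)zero covariance between any two PCs
scores = Xc @ eigvecs
print(np.round(np.cov(scores, rowvar=False), 6))           # off-diagonal entries ~ 0
```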

6
Q

What do the eigenvectors and eigenvalues represent?

A

Eigenvalues give the magnitude of the variance of each principal component

Eigenvectors describe the directions of variance
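A small check of this correspondence, assuming numpy (the 2-feature data below is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # stretched toy data
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# The variance of the data projected onto each eigenvector matches its eigenvalue
proj_var = np.var(Xc @ eigvecs, axis=0, ddof=1)
print(np.round(eigvals, 4))
print(np.round(proj_var, 4))   # same values: eigenvalues are the PC variances
```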

7
Q

What is the PCA algorithm?

A
  1. Define the input matrix with P feature columns
  2. Calculate the mean of each column
  3. Calculate a new matrix with centred columns (value - mean)
  4. Calculate the covariance matrix
  5. Calculate the eigenvectors/eigenvalues
  6. Select the K PCs with the largest eigenvalues (see the numpy sketch below)
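A minimal numpy sketch of these steps, assuming an (n_samples, P) data matrix and an illustrative K of 2:

```python
import numpy as np

def pca(X, k):
    """Steps 1-6 above for an (n_samples, P) input matrix X."""
    mean = X.mean(axis=0)                    # 2. mean of each column
    Xc = X - mean                            # 3. centre the columns: value - mean
    cov = np.cov(Xc, rowvar=False)           # 4. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # 5. eigenvectors / eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # 6. K PCs with the largest eigenvalues
    return Xc @ eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 4))                # 1. input matrix with P = 4 feature columns
scores, variances = pca(X, k=2)
print(scores.shape, variances)
```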