PCA Flashcards

1
Q

What is the purpose of Principal Component Analysis (PCA)?

A

PCA transforms the original variables X_1, …, X_p into p new variables Z_1, …, Z_p called principal components (PCs).
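
A minimal sketch of this transformation using scikit-learn; the random toy data here is made up for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # 100 observations of p = 3 variables X_1, X_2, X_3

pca = PCA()                     # keep all p components
Z = pca.fit_transform(X)        # columns of Z are the PCs Z_1, Z_2, Z_3

print(X.shape, Z.shape)         # (100, 3) (100, 3): same dimension, new axes
```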

2
Q

How are the new variables created by PCA ordered?

A

The new variables are ordered by how much of the total variation each one accounts for.

That is: Var(Z_1) ≥ Var(Z_2) ≥ … ≥ Var(Z_p).
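
A quick check of this ordering on made-up toy data:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 3))
pca = PCA().fit(X)

# explained_variance_ holds Var(Z_1), ..., Var(Z_p), already sorted
v = pca.explained_variance_
assert np.all(np.diff(v) <= 0)  # Var(Z_1) >= Var(Z_2) >= Var(Z_3)
```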

3
Q

How is the importance of PCs determined?

A

The PCs that account for more variation are more important. If some subset of the PCs accounts for most of the variation, it is conventional to forget about the rest!

4
Q

What are PCs in a linear algebra sense?

A

PCs are linear combinations of X_1, …, X_p, i.e.,
Z_1 = a_11 X_1 + a_12 X_2 + … + a_1p X_p,
Z_2 = a_21 X_1 + a_22 X_2 + … + a_2p X_p, etc.

Example of a linear combination:
Z = 2 × [1 0] + 3 × [0 1].
Because the coefficient (or weight) of [0 1] is larger, Z points more in the direction of [0 1] than of [1 0].

Application to scalars:
Even though the X_i are scalars, the same idea applies.
Let X_1 = 3, X_2 = 4, a_11 = 2, and a_12 = -1.
Then Z_1 = 2(3) + (-1)(4) = 2: the difference between X_1 and X_2, with X_1 weighted twice as much as X_2.
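
The same worked example as a dot product in NumPy:

```python
import numpy as np

x = np.array([3.0, 4.0])    # X_1 = 3, X_2 = 4
a1 = np.array([2.0, -1.0])  # weights a_11 = 2, a_12 = -1
z1 = a1 @ x                 # Z_1 = 2*3 + (-1)*4
print(z1)                   # 2.0
```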

5
Q

Why should we first normalize our data so that the variances are all 1?

A

Normalizing the data so that variances are all 1 ensures that each variable contributes equally to the analysis.

Without normalization, variables with larger variances (often due to differences in units or scales) could dominate the principal components, skewing the results and reducing the interpretability of the analysis.
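
A sketch of the effect, assuming a made-up dataset where one variable's units inflate its variance:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=200),          # variance ~ 1
                     rng.normal(size=200) * 1000])  # variance ~ 1e6 (different units)

raw = PCA().fit(X).explained_variance_ratio_
std = PCA().fit(StandardScaler().fit_transform(X)).explained_variance_ratio_

print(raw)  # first PC dominated by the large-variance column (~1.0)
print(std)  # after normalization, variation is shared (~0.5 each)
```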

6
Q

How is the first PC chosen?

A

The first principal component, Z_1, is chosen so that Var(Z_1) is as large as possible among all linear combinations of X_1, …, X_p.
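
A sketch checking this optimality empirically: no random unit-norm weight vector should yield a combination with higher variance than PC1 (toy data assumed):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3, 1], [1, 2]], size=500)

pc1_var = PCA().fit(X).explained_variance_[0]

for _ in range(1000):
    a = rng.normal(size=2)
    a /= np.linalg.norm(a)  # enforce |a| = 1
    assert np.var(X @ a, ddof=1) <= pc1_var + 1e-9
```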

7
Q

How is Var(Z_1) made as large as possible?

A

This is achieved by maximizing Var(Z_1) = a_1' C a_1, where C is the sample covariance matrix and a_1 = [a_11, …, a_1p]' is the vector of weights.

However, this optimization is not interesting unless we enforce the constraint |a_1| = 1; otherwise the variance could be made arbitrarily large just by scaling the weights.

8
Q

How are the PCs after the first chosen?

A

Subsequent PCs are chosen so that:
They have maximal variance (a_i' C a_i is as large as possible)
The squares of the weights sum to 1 (which means |a_i| = 1)
And each PC is uncorrelated with all previous PCs.

i.e.
Each new PC is chosen to capture as much of the remaining variation in the data as possible. Each PC is a weighted sum of the original variables, with the weights normalized so their squares sum to 1, and each PC is completely uncorrelated with the ones before it.
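
A sketch verifying the uncorrelatedness, assuming made-up correlated data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0, 0],
                            [[2, 1, 0.5], [1, 2, 0.3], [0.5, 0.3, 1]],
                            size=500)

Z = PCA().fit_transform(X)           # PC scores
corr = np.corrcoef(Z, rowvar=False)  # correlation matrix of Z_1, Z_2, Z_3
print(np.round(corr, 6))             # ~ identity: off-diagonals ~ 0
```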

9
Q

What do the solutions for a_1, …, a_p turn out to be?

A

The weight vectors turn out to be eigenvectors of the sample covariance matrix C, and the PC variances turn out to be its eigenvalues.

That is: Var(Z_i) = λ_i, where λ_i is the ith largest eigenvalue of C.
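
A sketch confirming this numerically on toy data (np.linalg.eigh returns eigenvalues in ascending order, so we reverse them):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3, 1], [1, 2]], size=500)

C = np.cov(X, rowvar=False)           # sample covariance matrix
eigvals = np.linalg.eigh(C)[0][::-1]  # eigenvalues, largest first

pca_vars = PCA().fit(X).explained_variance_
print(np.allclose(eigvals, pca_vars))  # True: Var(Z_i) = lambda_i
```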

10
Q

How do we decide how many principal components to keep?

A
  • Scree Plot: Look for an elbow point where the variance explained drops off.
  • 80% Rule: Keep enough components to explain at least 80% of the total variation (see the sketch below).
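
A sketch of the 80% rule on made-up data (a scree plot is just these variances plotted against component number):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))  # correlated toy data

pca = PCA().fit(X)
cum = np.cumsum(pca.explained_variance_ratio_)
k = np.searchsorted(cum, 0.80) + 1  # smallest k with cumulative variance >= 80%
print(cum, k)
```
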
11
Q

What does this equation mean?

C = QVQโ€™

A

This is the spectral decomposition of the covariance matrix:
C is the covariance matrix.
Q is an orthonormal matrix whose columns are the eigenvectors; multiplying by Q doesn't change lengths.
V is the diagonal matrix of eigenvalues, which are the variances of the PCs.
Q' is the transpose of Q.
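
A sketch verifying this decomposition with NumPy, using a made-up covariance matrix:

```python
import numpy as np

C = np.array([[3.0, 1.0],
              [1.0, 2.0]])              # a toy covariance matrix

eigvals, Q = np.linalg.eigh(C)          # columns of Q are orthonormal eigenvectors
V = np.diag(eigvals)                    # V: diagonal matrix of eigenvalues

print(np.allclose(C, Q @ V @ Q.T))      # True: C = Q V Q'
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q is orthonormal
```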

12
Q

What is a principal component?

A

A principal component is a new variable created by PCA that combines the original variables in a way that captures the most important patterns and variation in the data while reducing complexity.

13
Q

True or false: PCs are uncorrelated with each other.

A

True: each PC is constructed to be uncorrelated with all previous PCs.

14
Q

Explain this equation: ∑ Var(X_i) = ∑ λ_i

A

The sum of the variances of the original variables equals the sum of the eigenvalues along the diagonal of V. In other words, PCA redistributes the total variance across the PCs but does not change it.
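
A sketch checking this identity on a made-up covariance matrix:

```python
import numpy as np

C = np.array([[3.0, 1.0, 0.5],
              [1.0, 2.0, 0.3],
              [0.5, 0.3, 1.0]])          # toy covariance matrix

total_var = np.trace(C)                  # sum of Var(X_i), the diagonal of C
eig_sum = np.linalg.eigvalsh(C).sum()    # sum of eigenvalues lambda_i

print(np.isclose(total_var, eig_sum))    # True
```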

15
Q

True or false: PCA doesn't do much to reduce the dimension of data that is largely uncorrelated.

A

True: when the variables are already largely uncorrelated, the variance is spread roughly evenly across the PCs, so no small subset of PCs captures most of the variation.

16
Q

When do we use spectral decomposition vs. singular value decomposition (SVD)?

A

Computing eigenvectors via spectral decomposition relies on the covariance matrix being invertible (well-conditioned).

If the matrix isn't invertible (for example, when there are more variables than observations), singular value decomposition (SVD) applied directly to the data matrix works better.
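
A sketch of PCA via SVD of the centered data matrix, which avoids forming C at all (toy data assumed; note that scikit-learn's PCA is itself implemented via SVD):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))    # n = 10 observations, p = 50 variables: C is singular

Xc = X - X.mean(axis=0)          # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

variances = s**2 / (len(X) - 1)  # eigenvalues of C, largest first
scores = Xc @ Vt.T               # PC scores (rows of Vt are the weight vectors a_i)
print(variances[:3])
```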