PCA Flashcards
What is the purpose of Principal Component Analysis (PCA)?
PCA transforms the original variables X_1, …, X_p into p new variables Z_1, …, Z_p called principal components (PCs).
how are the new variables created by PCA ordered?
The new variables are ordered by how much the variation is accounted for by that variable.
That is: Var(Z_1) ≥ Var(Z_2) ≥ … ≥ Var(Z_p).
how is the importance of PCs determined?
The variables which account for more variation are more important. If some subset of the variables accounts for most of the variation, the convention is that we can forget about the rest of the variables!
what are PCs in a linear algebra sense?
PCs are linear combinations of X_1, …, X_p, i.e.,
Z_1 = a_11 X_1 + a_12 X_2 + … + a_1p X_p,
Z_2 = a_21 X_1 + a_22 X_2 + … + a_2p X_p, etc.
Example of a Linear Combination:
Z = 2 × [1 0] + 3 × [0 1].
Because the coefficient (or weight) of [0 1] is higher, Z points more in the direction of [0 1] than [1 0].
Application to Scalars:
Even though the X_i are usually scalars, the same idea applies.
Let X_1 = 3, X_2 = 4, a_11 = 2, and a_12 = -1.
Then Z_1 = 2(3) + (-1)(4) = 2: the difference between X_1 and X_2, with X_1 weighted twice as much as X_2.
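A minimal numpy sketch of both toy examples above (nothing here is part of PCA itself; it just shows what a weighted sum of inputs looks like):

```python
import numpy as np

# Vector example: Z = 2*[1 0] + 3*[0 1]
Z = 2 * np.array([1, 0]) + 3 * np.array([0, 1])
print(Z)        # [2 3] -- leans more toward the [0 1] direction

# Scalar example: X_1 = 3, X_2 = 4 with weights a_11 = 2, a_12 = -1
X = np.array([3, 4])
a1 = np.array([2, -1])
Z1 = a1 @ X     # 2*3 + (-1)*4
print(Z1)       # 2
```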
why should we first normalize our data so that the variances are all 1?
Normalizing the data so that variances are all 1 ensures that each variable contributes equally to the analysis.
Without normalization, variables with larger variances (often due to differences in units or scales) could dominate the principal components, skewing the results and reducing the interpretability of the analysis.
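As a rough sketch, standardizing each column to unit variance before PCA might look like this (the data here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])  # columns on very different scales

# Subtract each column's mean and divide by its standard deviation
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_std.var(axis=0))  # each variance is now (approximately) 1
```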
how is the first PC chosen?
The first principal component, Z_1, is chosen so that Var(Z_1) is as large as possible among all linear combinations of X_1, …, X_p.
how is Var(Z_1) made as large as possible?
This is achieved by maximizing |C a_1|, where a_1 = [a_11, …, a_1p]ᵀ is the vector of weights for Z_1.
However, this optimization is not interesting unless we enforce the constraint |a_1| = 1 (otherwise the maximum is unbounded, since scaling a_1 scales |C a_1|).
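A tiny illustration of why the constraint is needed: without it, the quantity being maximized grows without bound as the weight vector is scaled up (the covariance matrix here is made up):

```python
import numpy as np

C = np.array([[2.0, 0.5],
              [0.5, 1.0]])      # toy covariance matrix

a = np.array([1.0, 1.0])
for scale in (1, 10, 100):
    print(np.linalg.norm(C @ (scale * a)))   # |C a| keeps growing as a is scaled
```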
how are the PCs after the first one chosen?
Subsequent PCs are chosen so that:
They have maximal variance (|C a_i| is as large as possible)
The squares of the weights sum to 1 (which means |a_i| = 1)
And each PC is totally uncorrelated with the previous PCs.
i.e.
Each new Principal Component (PC) is chosen to capture as much variation as possible in the data. Each PC is a weighted sum of the original variables, but the weights are normalized so their squares sum to 1.
PCs are constructed to be completely uncorrelated with each other.
what do the solutions for a_1, …, a_p turn out to be?
The weight vectors a_i turn out to be eigenvectors of the sample covariance matrix C, and the variances turn out to be the corresponding eigenvalues.
That is, Var(Z_i) = λ_i, where λ_i is the i-th largest eigenvalue of C.
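A rough numpy check of these claims on made-up data (eigh is used because C is symmetric; this is a sketch, not a full PCA implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # made-up correlated data
X = X - X.mean(axis=0)                                    # center the columns

C = np.cov(X, rowvar=False)                          # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)                 # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # largest first

Z = X @ eigvecs                                      # PC scores: Z_i uses weight vector a_i

print(np.var(Z, axis=0, ddof=1))                     # matches eigvals: Var(Z_i) = lambda_i
print(eigvals)
print(np.round(np.cov(Z, rowvar=False), 6))          # diagonal matrix: PCs are uncorrelated
```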
How do we decide how many principal components to keep?
- Scree Plot: Look for an elbow point where variance explained drops off.
- 80% Rule: Keep enough components to explain at least 80% of total variation.
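A short sketch of the 80% rule applied to a made-up set of eigenvalues:

```python
import numpy as np

eigvals = np.array([4.2, 2.1, 0.9, 0.5, 0.3])   # eigenvalues of C, largest first (made up)

explained = eigvals / eigvals.sum()             # proportion of variance per PC
cumulative = np.cumsum(explained)

k = np.searchsorted(cumulative, 0.80) + 1       # smallest k reaching 80%
print(cumulative)   # [0.525  0.7875 0.9    0.9625 1.    ]
print(k)            # keep 3 components to explain >= 80% of the variation
```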
what does this eqn mean?
C = QVQᵀ
C is the covariance matrix
Q is an orthogonal matrix (its columns are orthonormal eigenvectors), so multiplying by Q doesn't change lengths
V is the diagonal matrix of eigenvalues, which are the variances
Qᵀ is the transpose of Q
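A quick numerical check of this factorization on a toy symmetric matrix (not from the cards):

```python
import numpy as np

C = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.5, 0.2],
              [0.3, 0.2, 1.0]])            # any symmetric covariance-like matrix

eigvals, Q = np.linalg.eigh(C)             # columns of Q are orthonormal eigenvectors
V = np.diag(eigvals)                       # diagonal matrix of eigenvalues

print(np.allclose(C, Q @ V @ Q.T))         # True: C = Q V Q^T
print(np.allclose(Q.T @ Q, np.eye(3)))     # True: Q^T Q = I
v = np.array([1.0, 2.0, 3.0])
print(np.linalg.norm(Q @ v), np.linalg.norm(v))   # multiplying by Q preserves length
```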
what is a principal component?
A principal component is a new variable created by PCA that combines the original variables in a way that captures the most important patterns and variation in the data while reducing complexity.
true or false, PCs are uncorrelated with each other
true
explain this eqn: ∑ Var(X_i) = ∑ λ_i
The sum of the variances of the original variables is the sum of the eigenvalues along the diagonal of V.
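A small check of this identity, reusing a toy covariance matrix (assumed, not from the cards):

```python
import numpy as np

C = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.5, 0.2],
              [0.3, 0.2, 1.0]])

eigvals = np.linalg.eigvalsh(C)

# Sum of the original variances (the diagonal of C) equals the sum of the eigenvalues
print(np.trace(C), eigvals.sum())          # both are 4.5
```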
true or false? PCA doesn't do much to reduce the dimension of data which is largely uncorrelated.
true
when do we use spectral decomposition and when do we use singular value decomposition (SVD)?
Computing eigenvectors via the spectral decomposition relies on the matrix being invertible.
If the matrix isn't invertible, singular value decomposition (SVD) works better.
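A sketch of the SVD route, working directly on the centered data matrix instead of forming (or inverting) a covariance matrix; the data are made up:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 5))   # made-up data, n = 50
Xc = X - X.mean(axis=0)                                   # center the columns

# SVD of the data matrix itself
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

pc_weights = Vt.T                        # columns are the PC weight vectors
pc_variances = s**2 / (len(Xc) - 1)      # equal the eigenvalues of the covariance matrix

print(pc_variances)
print(np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1])   # same values, for comparison
```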