1. Principal Component Analysis Flashcards

Question 1

Q

What is the purpose of PCA?

Answer

A

The purpose of principal component analysis is dimensionality reduction: representing the data with fewer dimensions while preserving as much of the variance as possible

Question 2

Q

What is the purpose of PCA?

Answer

A

The purpose of principal component analysis is dimensionality reduction: representing the data with fewer dimensions while preserving as much of the variance as possible

Question 3

Q

What is the purpose of SVD in PCA?

Answer

A

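Singular value decomposition (SVD) is the key mathematical tool behind PCA: it decomposes the mean-centered data matrix into orthogonal components (X = U Σ V^T), ordered by how much variance each one captures, which makes dimensionality reduction possible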
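As an illustration of the card above, here is a minimal sketch of PCA computed through SVD with NumPy. The toy data, the random seed and the choice k = 2 are assumptions made for this example only.

```python
import numpy as np

# Toy data (made up): 6 samples in rows, 3 features in columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# Mean-center each feature before decomposing.
Xc = X - X.mean(axis=0)

# Thin SVD: Xc = U @ diag(S) @ Vt, with S sorted in descending order.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are orthonormal principal directions.
# Projecting onto the first k of them reduces 3 features to k.
k = 2
scores = Xc @ Vt[:k].T  # shape (6, 2)
```

The projected data can equivalently be obtained as `U[:, :k] * S[:k]`, which avoids the extra matrix product.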

Question 4

Q

What is feature transformation?

Answer

A

Feature transformation refers to changing an existing feature or adding a new feature to the dataset, for example by mean-centering or standardization
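The two transformations named above can be sketched in a few lines of NumPy. The small array is made-up data, and using the population standard deviation (ddof=0) is an assumption; some tools use ddof=1 instead.

```python
import numpy as np

# Made-up data: 3 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Mean-centering: subtract each feature's mean so every column averages to 0.
centered = X - X.mean(axis=0)

# Standardization: additionally divide by each feature's standard deviation,
# so every column has unit variance.
standardized = centered / X.std(axis=0, ddof=0)
```

After standardization both columns are on the same scale, so neither feature dominates the variance simply because of its units.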

Question 5

Q

Why should you consider feature transformation?

Answer

A

When there are outliers that the model is sensitive to
To make a simpler machine learning method more powerful
When ordinal values (e.g. origin) may not be suitably encoded as integers

Question 6

Q

What does the transformation (mean-centering) ensure?

Answer

A

The goal of PCA is to capture the directions of maximum variance in the data. Mean-centering ensures that PC1 points in the direction of the largest variance rather than merely toward the location of the data

Question 7

Q

What happens if there is no transformation (no mean-centering)?

Answer

A

The first principal component might just reflect the overall mean of the dataset, rather than meaningful patterns of variation
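The effect described in this card can be checked with a small sketch. The data are made up: points that vary only along the direction (1, -1) but sit far from the origin, around (10, 10).

```python
import numpy as np

# Made-up data: variation along (1, -1), located around the point (10, 10).
t = np.linspace(-1.0, 1.0, 21)
X = np.column_stack([10.0 + t, 10.0 - t])

# SVD without centering versus SVD after mean-centering.
_, _, Vt_raw = np.linalg.svd(X, full_matrices=False)
_, _, Vt_cen = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)

pc1_raw = Vt_raw[0]  # first direction from the uncentered data
pc1_cen = Vt_cen[0]  # first direction from the centered data

# Reference directions (unit length) for comparison.
toward_mean = np.array([1.0, 1.0]) / np.sqrt(2.0)
along_spread = np.array([1.0, -1.0]) / np.sqrt(2.0)
```

Up to sign, `pc1_raw` lines up with `toward_mean` (the location of the data), while `pc1_cen` lines up with `along_spread` (the actual direction of variation).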

Question 8

Q

What are the right singular vectors?

Answer

A

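The right singular vectors (columns of V) are calculated from the columns (features) of the data: they are the eigenvectors of X^T X, and each one gives the feature weights that define a principal direction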

Question 9

Q

What are the singular values?

Answer

A

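The singular values are the square roots of the eigenvalues of X^T X (feature side, right singular vectors) or, equivalently, of X X^T (sample side, left singular vectors), where X is the mean-centered data. Both products share the same nonzero eigenvalues, and each singular value scales its corresponding principal component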

Question 10

Q

What are the left singular vectors?

Answer

A

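The left singular vectors (columns of U) are calculated from the rows (samples) of the data: they are the eigenvectors of X X^T, and their entries give each sample's normalized coordinate along the corresponding principal component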
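The relationship between singular vectors, singular values and eigen-decompositions in the three cards above can be checked numerically. This is a sketch on random toy data; the matrix size and seed are arbitrary assumptions.

```python
import numpy as np

# Toy data (made up): 5 samples, 3 features, mean-centered.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Xc = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of the feature-side matrix Xc^T Xc (3x3), largest first.
evals_feat = np.linalg.eigvalsh(Xc.T @ Xc)[::-1]
# Eigenvalues of the sample-side matrix Xc Xc^T (5x5); at most 3 are nonzero.
evals_samp = np.linalg.eigvalsh(Xc @ Xc.T)[::-1][:3]

# Each right singular vector v satisfies (Xc^T Xc) v = s^2 v.
residual_right = Xc.T @ Xc @ Vt.T - Vt.T * S**2
# Each left singular vector u satisfies (Xc Xc^T) u = s^2 u.
residual_left = Xc @ Xc.T @ U - U * S**2
```

Both sets of eigenvalues match the squared singular values, and both residuals are zero up to floating-point error.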

Question 11

Q

How are principal components identified?

Answer

A

The right singular vectors (rows of V^T) give the directions of maximum variance in the data; these are the principal components
The singular values (diagonal of Σ) give the amount of variance captured by each principal component, which is proportional to the squared singular value
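The link between singular values and captured variance can be written out explicitly. The notation here is an assumption for illustration: X is the mean-centered data matrix with n samples in rows, and σ_i is the i-th diagonal entry of Σ.

```latex
X = U \Sigma V^{T}, \qquad
\operatorname{Var}(\mathrm{PC}_i) = \frac{\sigma_i^{2}}{n-1}, \qquad
\text{explained variance ratio}_i = \frac{\sigma_i^{2}}{\sum_{j} \sigma_j^{2}}
```

The factor n - 1 cancels in the ratio, so the share of variance per component depends only on the squared singular values.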

Question 12

Q

What does a large singular value represent?

Answer

A

A large singular value means that the corresponding principal component explains a large share of the variance

Question 13

Q

Why do you not keep all principal components?

Answer

A

To capture most of the variance, only the first k principal components (those with the largest singular values) are kept; the others do not capture significant variance and are discarded
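One common way to pick k is a cumulative explained-variance threshold. The sketch below assumes hypothetical singular values and a 90% target; neither comes from the cards.

```python
import numpy as np

# Hypothetical singular values of a mean-centered data matrix, largest first.
S = np.array([9.0, 4.0, 2.0, 1.0])

# Share of total variance captured by each principal component.
explained_ratio = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained_ratio)

# Assumed target: keep the fewest components reaching 90% of the variance.
threshold = 0.90
k = int(np.searchsorted(cumulative, threshold) + 1)
```

With these values the first component alone captures about 79% of the variance and the first two about 95%, so k comes out as 2 and the last two components are discarded.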

Question 14

Q

What are the benefits of PCA?

Answer

A

Easy to compute
Helps reduce overfitting in predictive algorithms
Can improve ML performance by eliminating unnecessary correlated variables
Reduces noise that would not otherwise be filtered out automatically
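The noise-reduction benefit listed above can be illustrated by reconstructing data from only the leading component. This is a sketch on synthetic data: the rank-1 signal, the noise level and k = 1 are all assumptions chosen for the example.

```python
import numpy as np

# Synthetic data: a rank-1 signal (50 samples, 3 features) plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
clean = np.outer(t, [1.0, 2.0, 3.0])
noisy = clean + 0.05 * rng.normal(size=clean.shape)

# PCA via SVD on the mean-centered noisy data.
mean = noisy.mean(axis=0)
U, S, Vt = np.linalg.svd(noisy - mean, full_matrices=False)

# Keep only the first component and map back to the original feature space.
k = 1
denoised = (U[:, :k] * S[:k]) @ Vt[:k] + mean

# Distance to the clean signal before and after the reconstruction.
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
```

Because the discarded components contain mostly noise here, the reconstruction should end up closer to the clean signal than the noisy input is.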

Question 15

Q

What are the limitations of PCA?

Answer

A

PCA can be difficult to interpret; in rare cases it is hard to identify the most important features
Sometimes the data are harder to read after the analysis than before it
More than two retained principal components are often harder to interpret

(15 cards)