1. Principal Component Analysis Flashcards
What is the purpose of PCA?
The purpose of principal component analysis is dimension reduction while preserving as much variance as possible.
What is the purpose of SVD in PCA?
Singular value decomposition is the key mathematical tool behind PCA: it decomposes the (mean-centered) data matrix into orthogonal components, which is what makes the dimensionality reduction possible.
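A minimal sketch of the decomposition with NumPy, assuming a small random data matrix (the data, the seed, and the variable names are illustrative, not part of the flashcards):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # hypothetical data: 100 samples, 3 features

# Mean-center each feature (column) before decomposing.
Xc = X - X.mean(axis=0)

# Thin SVD: Xc = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# The rows of Vt are orthonormal directions in feature space.
gram = Vt @ Vt.T

# Projecting the centered data onto those directions gives the scores,
# which is the same as scaling the columns of U by the singular values.
scores = Xc @ Vt.T
```

Here `scores` holds each sample's coordinates in the new orthogonal basis, and `gram` should be close to the identity matrix.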
What is feature transformation?
Feature transformation means changing an existing feature or adding a new feature to the dataset, for example by mean-centering or standardization.
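As a small illustration, the two transformations named above can be sketched with NumPy. The toy matrix is made up, and the use of the sample standard deviation (`ddof=1`) is an assumption; some tools divide by the population standard deviation instead.

```python
import numpy as np

# Hypothetical data: 3 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Mean-centering: subtract each feature's mean so every column averages to 0.
X_centered = X - X.mean(axis=0)

# Standardization: additionally divide by each feature's standard deviation
# so every column has unit variance.
X_standardized = X_centered / X.std(axis=0, ddof=1)
```

After standardization both columns are on the same scale, so neither dominates the variance just because of its units.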
Why should you consider feature transformation?
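- There are outliers that the model is sensitive to
- It can make a simpler machine learning method more powerful
- Some values (e.g. origin) are categories rather than quantities, so encoding them as plain integers may imply an order or spacing that does not exist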
What does the transformation (mean-centering) ensure?
PCA aims to capture the directions of maximum variance in the data. Mean-centering moves the data so that its mean sits at the origin, which lets PC1 point along the direction of largest variance rather than toward the location of the data.
What happens if there is no transformation (no mean-centering)?
The first principal component may simply reflect the overall mean of the dataset rather than meaningful patterns of variation.
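The effect can be seen in a short sketch, assuming synthetic 2D data that sits far from the origin (mean near (10, 10)) but spreads mostly along the direction (1, -1). The data and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.normal(scale=1.0, size=200)       # large spread along (1, -1)
noise = rng.normal(scale=0.1, size=200)   # small spread along (1, 1)
X = np.column_stack([10 + t + noise, 10 - t + noise])

# SVD of the raw data (no centering).
_, _, Vt_raw = np.linalg.svd(X, full_matrices=False)

# SVD of the mean-centered data.
Xc = X - X.mean(axis=0)
_, _, Vt_centered = np.linalg.svd(Xc, full_matrices=False)

mean_dir = np.array([1.0, 1.0]) / np.sqrt(2)     # direction toward the data's location
spread_dir = np.array([1.0, -1.0]) / np.sqrt(2)  # direction of largest variance

# Absolute cosine between the first direction found and each reference direction
# (absolute value because the sign of a singular vector is arbitrary).
align_raw = abs(Vt_raw[0] @ mean_dir)
align_centered = abs(Vt_centered[0] @ spread_dir)
```

In this sketch `align_raw` is close to 1, meaning the first direction of the uncentered data points at the mean, while `align_centered` is close to 1 for the true direction of spread.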
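What is a right singular vector?
A column of V (a row of V^T) in the decomposition X = U Σ V^T. It has one entry per feature (column of X) and is an eigenvector of X^T X. For mean-centered data it gives the direction of a principal component in feature space.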
What are the singular values?
The non-negative diagonal entries of Σ in X = U Σ V^T, sorted from largest to smallest. Each one is the square root of an eigenvalue of X^T X (equivalently of X X^T) and scales one pair of singular vectors: the right vector relates to the features, the left vector to the samples.
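What is a left singular vector?
A column of U in the decomposition X = U Σ V^T. It has one entry per sample (row of X) and is an eigenvector of X X^T. Multiplied by its singular value, it gives each sample's coordinate (score) along the corresponding principal component.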
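The relationships between the three parts of the SVD can be checked numerically. This is a sketch on a small random matrix; the sizes and names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))   # hypothetical data: 6 samples, 3 features
Xc = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
# U has one row per sample:   shape (6, 3)
# Vt has one column per feature: shape (3, 3)

# Squared singular values are the eigenvalues of Xc^T Xc (sorted descending here).
eigvals = np.linalg.eigvalsh(Xc.T @ Xc)[::-1]

# First right singular vector is an eigenvector of Xc^T Xc (feature side).
right_check = Xc.T @ Xc @ Vt[0]

# First left singular vector is an eigenvector of Xc Xc^T (sample side).
left_check = Xc @ Xc.T @ U[:, 0]
```

Both `right_check` and `left_check` should equal the original vector scaled by `S[0] ** 2`.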
How are principal components identified?
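- The right singular vectors (rows of V^T) give the directions of maximum variance in the data; these are the principal components
- The singular values (diagonal of Σ) give the amount of variance captured by each principal component; the variance is proportional to the squared singular value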
What does a large singular value represent?
A large singular value means that the corresponding principal component explains a large share of the variance.
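A short sketch of how singular values translate into explained variance, assuming synthetic data whose three features have deliberately different scales (data and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 50 samples, 3 features with very different spreads.
X = rng.normal(size=(50, 3)) * np.array([5.0, 2.0, 0.5])
Xc = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Variance captured by each principal component.
explained_variance = S ** 2 / (X.shape[0] - 1)

# Share of the total variance captured by each component.
explained_variance_ratio = explained_variance / explained_variance.sum()

# The same variances, computed directly from the projected data.
scores = Xc @ Vt.T
score_variance = scores.var(axis=0, ddof=1)
```

The ratios sum to 1, and the largest singular value corresponds to the largest share.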
Why do you not keep all principal components?
Only the first k principal components, those with the largest singular values, are kept because they capture most of the variance. The remaining components capture little variance and are discarded, which is what reduces the dimension.
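One common way to pick k is a cumulative explained-variance threshold. The sketch below assumes a 95% threshold and synthetic data built from 3 hidden factors; both are illustrative choices, not something the flashcards prescribe.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical data: 3 latent factors mixed into 10 features, plus small noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
ratio = S ** 2 / np.sum(S ** 2)
cumulative = np.cumsum(ratio)

threshold = 0.95  # assumed cutoff
# Smallest k whose first k components reach the threshold.
k = int(np.searchsorted(cumulative, threshold) + 1)

# Keep only the first k principal components.
X_reduced = Xc @ Vt[:k].T
```

With this data at most 3 components are needed, so the 10 original features shrink to `k` columns in `X_reduced`.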
What are the benefits of PCA?
- Easy to calculate and compute
- Can help reduce overfitting in predictive algorithms by lowering the number of input features
- Can improve ML performance by removing redundant, correlated variables
- Reduces noise that would not otherwise be filtered out automatically
What are the limitations of PCA?
- Principal components can be difficult to interpret, and in some cases it is hard to identify which original features matter most
- The data can be harder to read after the analysis than before it
- Results with more than two retained principal components are often harder to interpret