Week 7 Flashcards
Principal Component Analysis
What is Principal Component Analysis (PCA)?
Primarily used for Dimensionality Reduction.
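Also used for feature extraction (often loosely called feature selection): it builds new features from the original ones rather than picking a subset of them.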
Main Idea: Project the given data onto a lower-dimensional subspace such that
1. Reconstruction error is minimised.
2. Variance of the projected data is maximised.
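The two goals above are two views of the same thing for centred data: along any unit direction, projected variance plus mean squared reconstruction error equals the total variance, so maximising one minimises the other. A minimal NumPy sketch of that identity follows; the toy data and the helper name split_variance are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data (assumption): 200 points in 2D with correlated features.
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])
Xc = X - X.mean(axis=0)  # centre the data first


def split_variance(Xc, w):
    """Return (projected variance, mean squared reconstruction error) for direction w."""
    w = w / np.linalg.norm(w)          # make w a unit vector
    scores = Xc @ w                    # coordinate of each point along w
    recon = np.outer(scores, w)        # projected points, back in the original space
    proj_var = np.mean(scores ** 2)
    recon_err = np.mean(np.sum((Xc - recon) ** 2, axis=1))
    return proj_var, recon_err


total = np.mean(np.sum(Xc ** 2, axis=1))  # total variance of the centred data

# Try a few directions: the two quantities always add up to the total.
for angle in np.linspace(0.0, np.pi, 5, endpoint=False):
    w = np.array([np.cos(angle), np.sin(angle)])
    var, err = split_variance(Xc, w)
    print(f"angle={angle:.2f}  variance={var:.3f}  error={err:.3f}  sum={var + err:.3f}")
```

The printed sum stays constant across directions, which is why the two criteria pick the same subspace.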
How does PCA work?
Fix a subspace and then find the best projection of the data onto it. (initial)
Then find the optimal subspace itself, i.e. the one whose best projection loses the least. (later)
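The two stages can be sketched in NumPy, assuming a one-dimensional subspace (a line through the origin) and centred toy data. The data, the fixed direction w, and the variable names are illustrative assumptions, not part of the original card.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data (assumption): 3 features with very different spreads.
X = rng.normal(size=(300, 3)) * np.array([3.0, 1.0, 0.3])
Xc = X - X.mean(axis=0)

# Stage 1 (initial): the subspace is fixed, here the line spanned by unit vector w.
# The best approximation of x on that line is (x . w) w.
w = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
proj_fixed = np.outer(Xc @ w, w)
err_fixed = np.mean(np.sum((Xc - proj_fixed) ** 2, axis=1))

# Stage 2 (later): choose the line itself. The eigenvector of the covariance
# matrix with the largest eigenvalue gives the smallest reconstruction error.
C = Xc.T @ Xc / len(Xc)
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
w_best = eigvecs[:, -1]
proj_best = np.outer(Xc @ w_best, w_best)
err_best = np.mean(np.sum((Xc - proj_best) ** 2, axis=1))

print(f"error with fixed direction:   {err_fixed:.3f}")
print(f"error with optimal direction: {err_best:.3f}")
```

The residual of the stage 1 projection is orthogonal to w, and the stage 2 error equals the sum of the eigenvalues that were left out.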
How does PCA work?
If we have too many features, we may not be able to analyse the data using every one of them. PCA combines the original features in a smart way (as linear combinations) to produce new features, the Principal Components (PCs), chosen so that as little information as possible is lost.
What are the properties of Principal Components (PCs)?
PCs are ordered: the first PC explains more of the variance than the second, and so on. We may stop once the desired level of variance is reached. Ideally, we want to capture around 90% of the variance with just 2 to 3 PCs, so that enough information is retained while we can still visualize our data on a plot.
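One way to apply the stopping rule is to sort the eigenvalues of the covariance matrix, turn them into explained-variance ratios, and keep the smallest number of PCs whose cumulative ratio reaches the target. The sketch below assumes toy data with 5 features and uses the 90% target from the card; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data (assumption): 5 observed features driven mostly by 2 hidden factors.
latent = rng.normal(size=(500, 2)) * np.array([4.0, 2.0])
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.3 * rng.normal(size=(500, 5))
Xc = X - X.mean(axis=0)

C = np.cov(Xc, rowvar=False)               # 5 x 5 covariance matrix
eigvals = np.linalg.eigvalsh(C)[::-1]      # eigenvalues, largest first

ratio = eigvals / eigvals.sum()            # share of variance explained by each PC
cumulative = np.cumsum(ratio)              # share explained by the first k PCs

# Smallest k whose cumulative share reaches the 90% target.
target = 0.90
k = int(np.searchsorted(cumulative, target) + 1)

for i, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"PC{i}: explains {r:.1%}, cumulative {c:.1%}")
print(f"PCs needed for {target:.0%} of the variance: {k}")
```

The ratios come out in non-increasing order, which is the ordering property described above.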
How are PCs determined? What are loadings?
Loadings indicate the contribution of the variables to each PC. Note: each variable will get
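As a rough illustration, the sketch below treats the unit eigenvectors of the covariance matrix as the loadings. This is one common convention and an assumption here; some texts scale them by the square root of the eigenvalue instead. The variable names ("height", "weight", "age") and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
names = ["height", "weight", "age"]        # hypothetical variable names
X = rng.normal(size=(200, 3))
X[:, 1] += 0.8 * X[:, 0]                   # make the first two variables correlated
Xc = X - X.mean(axis=0)

# The covariance matrix is real and symmetric, so eigh applies.
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder so that the PC with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Convention assumed here: column j of eigvecs holds the loadings of PC j+1,
# one number per original variable.
for j in range(eigvecs.shape[1]):
    parts = ", ".join(f"{n}: {v:+.2f}" for n, v in zip(names, eigvecs[:, j]))
    print(f"PC{j + 1} loadings -> {parts}")

scores = Xc @ eigvecs                      # the data expressed in the new PC features
```

A loading with a large absolute value means that variable contributes strongly to that PC; the sign of a whole column is arbitrary.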
What are real-symmetric matrices?