Principal Component Analysis Flashcards
1
Q
Curse of Dimensionality
A
- Standard regression/classification techniques can become:
- ill-defined for M >> N (more features than samples)
- ill-conditioned / numerically unstable even for M < N
- increase in dimensionality → exponential increase of volume → data becomes sparse
- amount of data needed for a reliable result often grows exponentially with the dimensionality
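The exponential growth of volume in the last two bullets can be illustrated with a tiny sketch (the neighborhood side length 0.5 is an arbitrary illustrative choice):

```python
# Fraction of the unit cube [0, 1]^d covered by a neighborhood cube of side 0.5:
# it shrinks as 0.5**d, so a fixed amount of data covers exponentially less space.
fractions = {d: 0.5 ** d for d in (1, 2, 10, 20)}
for d, f in fractions.items():
    print(f"d={d:2d}: covered fraction = {f:.2e}")
```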
2
Q
Regularization
A
- Idea: impose constraints on the parameters to stabilize the solution
- example: introduce a prior probability on the parameters
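A common instance of this idea is ridge regression: a quadratic penalty (equivalently, a Gaussian prior on the weights) makes an otherwise singular system well-posed. A minimal sketch with synthetic data; all sizes and the penalty strength are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 50, 20                    # more features (M) than samples (N)
X = rng.normal(size=(N, M))
y = rng.normal(size=N)

# Unregularized normal equations X^T X w = X^T y are singular here (rank <= N < M).
# Adding lam * I (a quadratic penalty on w) makes the system solvable:
lam = 1.0                        # regularization strength, an arbitrary choice
w = np.linalg.solve(X.T @ X + lam * np.eye(M), X.T @ y)
```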
3
Q
Maximum a-posteriori approach
A
- maximize the posterior p(w|D) ∝ p(D|w) p(w) instead of the likelihood alone
- the prior p(w) acts as a regularizer; e.g. a Gaussian prior on the weights yields a quadratic (L2) penalty
4
Q
Dimensionality Reduction
A
- Goal: reduce data to the features most relevant for the learning task
- e.g. significance tests for single features; finding relevant directions/subspaces in correlated data
WHY?
- Visualization
- better generalization
- speeding up learning
- data compression
5
Q
Principal Component Analysis
A
- assume data is centered (subtract the mean)
- find the direction of maximum variance
- eigenvalue problem: the direction of largest variance is the eigenvector of the scatter/covariance matrix with the largest eigenvalue
- not robust to outliers (variance is sensitive to extreme points)
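The steps above can be sketched with NumPy on a toy dataset (the data and sizes are illustrative, not from the cards):

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy correlated 2-D data
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
X = X - X.mean(axis=0)                 # center the data, as PCA assumes

S = X.T @ X / len(X)                   # covariance (scaled scatter) matrix
eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns eigenvalues in ascending order
w = eigvecs[:, -1]                     # direction of maximum variance

# The variance of the data projected onto w equals the largest eigenvalue
var_along_w = (X @ w).var()
```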
6
Q
PCA applications
A
- Dimensionality reduction
- Eigenfaces
- Denoising
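The denoising application amounts to projecting the data onto the top-k principal components and reconstructing; a minimal sketch on synthetic data (all names, sizes, and the noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 200, 10, 2             # samples, dimension, components kept
# Data lying near a k-dimensional subspace, plus isotropic noise
Z = rng.normal(size=(n, k))
A = rng.normal(size=(k, d))
X = Z @ A + 0.1 * rng.normal(size=(n, d))

mu = X.mean(axis=0)
Xc = X - mu
_, eigvecs = np.linalg.eigh(Xc.T @ Xc)
W = eigvecs[:, -k:]              # top-k principal directions

# Project onto the subspace and reconstruct; noise outside the subspace is removed
X_denoised = (Xc @ W) @ W.T + mu
```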
7
Q
Power Iteration
A
- Why: a full eigendecomposition of the scatter matrix is slow; often only the first few principal components are needed
- Power iteration method: start with a random vector w, iterate the update w ← Sw / ||Sw||; w converges to the leading eigenvector
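The update rule can be sketched directly (toy anisotropic data so the leading eigenvalue is well separated; sizes and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
# Stretch one axis so the top eigenvalue clearly dominates
X = rng.normal(size=(300, 5)) * np.array([3.0, 1.0, 1.0, 1.0, 1.0])
X = X - X.mean(axis=0)
S = X.T @ X                            # scatter matrix

w = rng.normal(size=5)                 # random starting vector
for _ in range(100):                   # update: w <- S w / ||S w||
    w = S @ w
    w /= np.linalg.norm(w)

# w converges (up to sign) to the eigenvector with the largest eigenvalue
leading = np.linalg.eigh(S)[1][:, -1]
```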