Principal Components Analysis Flashcards
How do we deal with high dimensionality (3)?
- Use domain knowledge
  - Feature engineering (e.g. color histograms for object detection)
- Make assumptions
  - Independence
  - Smoothness
  - Symmetry
- Reduce dimensionality
What are the two methods for reducing dimensionality?
- Feature selection
- Feature extraction
What is feature selection?
Choosing a subset of the original features (e.g. those with the highest information gain)
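A minimal sketch of selecting features by information gain, assuming discrete feature values and class labels. The function names and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: "shape" predicts the label perfectly, "color" does not.
columns = {
    "color": ["red", "red", "blue", "blue"],
    "shape": ["round", "square", "round", "square"],
}
labels = ["yes", "no", "yes", "no"]
selected = select_features(columns, labels, k=1)
```

Here `selected` keeps only the original feature that best separates the labels; no new dimensions are constructed.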
What is feature extraction?
Construct a new set of dimensions as linear combinations of the original features
What does PCA try to preserve?
The structure (variance) in the data
What are principal components?
The eigenvectors of the covariance matrix with the largest eigenvalues
What happens when you multiply a random vector with the covariance matrix?
It is pulled toward the direction of greatest variance (the eigenvector with the largest eigenvalue)
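A minimal sketch of this effect, assuming a hypothetical 2x2 covariance matrix and a fixed starting vector in place of a random one so the run is repeatable. Multiplying repeatedly and rescaling to unit length (a technique usually called power iteration) makes the drift visible. The helper names are illustrative.

```python
from math import sqrt


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def normalize(vector):
    """Scale a vector to unit length."""
    length = sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


# Hypothetical covariance matrix; its dominant eigenvector is (1, 1) / sqrt(2).
covariance = [[2.0, 1.0],
              [1.0, 2.0]]

vector = normalize([1.0, 0.0])  # arbitrary starting direction
for _ in range(50):
    # Each multiplication pulls the vector toward the direction of greatest variance.
    vector = normalize(mat_vec(covariance, vector))
```

After the loop, `vector` is numerically indistinguishable from the dominant eigenvector of `covariance`.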
What is an eigenvector?
A nonzero vector that, when multiplied by a matrix, keeps its direction and changes only in magnitude
What is an eigenvalue?
The scalar by which an eigenvector is scaled when multiplied by the matrix
How do you find eigenvalues?
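A standard approach, assuming $A$ is a square matrix and $I$ is the identity matrix of the same size (symbols introduced here for illustration): solve the characteristic equation for $\lambda$. The second line expands it for a 2x2 matrix with rows $(a, b)$ and $(c, d)$.

```latex
\det(A - \lambda I) = 0
\lambda^2 - (a + d)\,\lambda + (ad - bc) = 0 \quad \text{(2x2 case)}
```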
What is the determinant of a 2x2 matrix?
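For a matrix with rows $(a, b)$ and $(c, d)$ (symbols chosen here for illustration), the standard formula is:

```latex
\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc
```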
How do you find eigenvectors (given the eigenvalues)?
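A standard approach is to solve $(A - \lambda I)e = 0$ for each eigenvalue $\lambda$. The sketch below does this for the 2x2 case only, assuming real eigenvalues (true for symmetric matrices such as covariance matrices). The function names, the example matrix, and the `1e-12` zero tolerance are illustrative assumptions.

```python
from math import sqrt


def eigenvalues_2x2(a, b, c, d):
    """Roots of lambda^2 - (a + d) * lambda + (a*d - b*c) = 0, largest first.

    Assumes the roots are real, which holds for symmetric matrices.
    """
    trace = a + d
    det = a * d - b * c
    # Clamp at zero to absorb tiny negative values caused by rounding.
    root = sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + root) / 2.0, (trace - root) / 2.0


def eigenvector_2x2(a, b, c, d, eigenvalue):
    """Unit-length solution e of (A - eigenvalue * I) e = 0 for A = [[a, b], [c, d]]."""
    # The first row gives (a - eigenvalue) * x + b * y = 0, solved by (b, eigenvalue - a).
    x, y = b, eigenvalue - a
    if abs(x) < 1e-12 and abs(y) < 1e-12:
        # First row is all zeros; use the second row c * x + (d - eigenvalue) * y = 0.
        x, y = eigenvalue - d, c
    if abs(x) < 1e-12 and abs(y) < 1e-12:
        # A - eigenvalue * I is the zero matrix, so every direction is an eigenvector.
        x, y = 1.0, 0.0
    length = sqrt(x * x + y * y)
    return x / length, y / length


# Hypothetical symmetric matrix [[2, 1], [1, 2]].
lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 1.0, 2.0)
e1 = eigenvector_2x2(2.0, 1.0, 1.0, 2.0, lam1)
e2 = eigenvector_2x2(2.0, 1.0, 1.0, 2.0, lam2)
```

For larger matrices a numerical routine is the usual choice; the closed form above is only practical for the 2x2 case.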
Which eigenvectors do we pick for principal components?
Unit-length eigenvectors
How do you project a point x’ onto the eigenvectors e1, …, em?
$(x' - \mu)^T e_j$ for $j = 1, \dots, m$
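The projection can be sketched as follows, assuming mu is the mean of the data and each eigenvector has unit length. The function names and the toy points are hypothetical.

```python
def mean_vector(points):
    """Component-wise mean (mu) of a list of equal-length points."""
    count = len(points)
    return [sum(column) / count for column in zip(*points)]


def project(point, mu, eigenvectors):
    """Return the m coordinates (point - mu)^T e_j, one per eigenvector e_j."""
    centered = [p - m for p, m in zip(point, mu)]
    return [sum(c * e for c, e in zip(centered, vector)) for vector in eigenvectors]


# Hypothetical data lying along the line y = x, so (1, 1) / sqrt(2) captures all the spread.
points = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mu = mean_vector(points)
e1 = [2 ** -0.5, 2 ** -0.5]  # assumed unit-length principal component
coords = [project(p, mu, [e1]) for p in points]
```

Each 2-dimensional point is reduced to a single coordinate along `e1`; passing more eigenvectors yields one coordinate per eigenvector.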
What property does the eigenvector for a principal component have?
It points in the direction along which the data is spread out the most
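A small check of this property on hypothetical data: the variance of the points projected onto the top eigenvector is at least as large as along any other unit direction tried. The points were chosen so that the covariance matrix is [[2.5, 2], [2, 2.5]], with eigenvalues 4.5 and 0.5; the function name is illustrative.

```python
from math import cos, pi, sin, sqrt


def variance_along(points, direction):
    """Variance (divided by n) of the points projected onto a unit-length direction."""
    count = len(points)
    mu = [sum(column) / count for column in zip(*points)]
    scores = [
        sum((p - m) * d for p, m, d in zip(point, mu, direction))
        for point in points
    ]
    return sum(s * s for s in scores) / count


points = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]
e1 = [1 / sqrt(2), 1 / sqrt(2)]    # eigenvector with the largest eigenvalue
e2 = [1 / sqrt(2), -1 / sqrt(2)]   # eigenvector with the smallest eigenvalue

var_e1 = variance_along(points, e1)
var_e2 = variance_along(points, e2)

# Sweep unit directions in 1-degree steps; none should spread the data more than e1.
sweep = [
    variance_along(points, [cos(t * pi / 180), sin(t * pi / 180)])
    for t in range(180)
]
```

In this example the variance along each eigenvector equals its eigenvalue, which is why ranking eigenvectors by eigenvalue ranks directions by spread.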