Dimensionality reduction Flashcards
Data preprocessing for PCA
Mean normalization and feature scaling
Mean normalization
Calculate the mean of each feature across the data set and subtract it from every data point, so each feature has zero mean
Feature scaling
Scale each feature so all features have a comparable range of values (e.g. divide by the standard deviation or by the range); see the sketch below
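A minimal Octave sketch of both preprocessing steps, assuming X is an m x n data matrix with one example per row; the variable names are illustrative:
(Octave code)
mu = mean(X);               % 1 x n row vector of per-feature means
sigma = std(X);             % 1 x n row vector of per-feature standard deviations
X_norm = (X - mu) ./ sigma; % zero-mean, comparably scaled features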
2D PCA vs. linear regression
1) Linear regression tries to predict y from x, so y is treated specially; PCA treats all features equally. 2) Linear regression minimizes the squared vertical distance between the points and the line; PCA minimizes the orthogonal projection distance from the line
Steps in PCA
1) Compute covariance matrix (Σ)
2) Compute the eigenvectors of Σ (e.g. via [U,S,V] = svd(Sigma)); the first k columns of U provide our new basis vectors
3) Project the data onto those basis vectors: z = Ureduce' * x (see the combined sketch below)
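Taken together, a minimal Octave sketch of these steps, assuming X is the mean-normalized m x n data matrix (rows are examples) and k is already chosen; variable names are illustrative:
(Octave code)
m = size(X, 1);          % number of examples
Sigma = (1/m) * X' * X;  % n x n covariance matrix
[U, S, V] = svd(Sigma);  % columns of U are the eigenvectors of Sigma
Ureduce = U(:, 1:k);     % first k eigenvectors form the reduced basis
Z = X * Ureduce;         % m x k projected data (z = Ureduce' * x per example)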
Compute Eigenvectors of Σ
(Octave code)
[U,S,V] = svd(Sigma);
Compute Σ
(Octave code)
Sigma = (1/m) * X' * X;
Define new reduced basis vectors
(Octave code)
Ureduce = U(:,1:k);
Convert n-dim data x to k-dim data z
(Octave code)
z = Ureduce' * x;
% of Variance Retained
1 - (average squared projection error / total variation in the data). Equivalently, the sum of the first k diagonal entries of S divided by the sum of all n diagonal entries of S, where S comes from [U,S,V] = svd(Sigma) (see the sketch below)
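A sketch of computing this from the S matrix returned by svd(Sigma), assuming k has already been chosen:
(Octave code)
s = diag(S);                              % eigenvalues of Sigma, largest first
variance_retained = sum(s(1:k)) / sum(s); % fraction of variance retained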
Determine k dim to retain
1) Run PCA
2) Pick the smallest k s.t. at least 99% (or another chosen threshold) of the variance is retained (see the loop sketch below)
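A sketch of the selection loop, assuming S comes from a single call to svd(Sigma) and n is the number of features:
(Octave code)
s = diag(S);                      % eigenvalues of Sigma, largest first
for k = 1:n
  if sum(s(1:k)) / sum(s) >= 0.99 % at least 99% of variance retained
    break;                        % smallest such k
  end
end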
How to avoid overfitting
Use regularization, not PCA. PCA throws away information without looking at the labels y, so it is not a good way to address overfitting
When to use PCA
Always try running the learning process on the raw data first; use PCA only if there is a good reason to (e.g. learning is too slow or the data uses too much memory)
Good uses of PCA
Compression: Reduce memory / disk space needed; speed up learning
Visualization: Reduce to 2/3 dimensions