PCA Flashcards
What is PCA?
Separates multivariate data into its N most important components
Identifies the principal directions in which the data varies
Explains the variance-covariance structure of a set of variables
Organises the data into a set of composite variables
What are the 2 approaches to dimensionality reduction? Describe them.
- Feature Selection:
Choose a subset of the features present in the data set and extract the data points with these features.
- Feature Extraction:
Create a new set of features by combining the existing features
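A minimal sketch of the difference, using NumPy and toy data (the column indices and projection weights below are arbitrary illustrations): feature selection keeps a subset of the original columns, while feature extraction builds new composite columns.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # 100 samples, 5 features (toy data)

# Feature selection: keep a chosen subset of the original columns.
selected = X[:, [0, 2]]         # e.g. keep features 0 and 2

# Feature extraction: build new features as combinations of existing ones.
W = rng.normal(size=(5, 2))     # hypothetical combination weights
extracted = X @ W               # 2 new composite features per sample
```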
What are the 2 types of feature extraction? Describe them.
- Classification (ICA):
A feature extraction mapping that enhances discriminatory information in a low-dimensional space
- Signal Representation (PCA):
A feature extraction mapping that accurately represents the samples in a low-dimensional space
What underlying component does PCA work on?
Total sample variance
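A small check of this claim, assuming toy data in NumPy: the total sample variance is the trace of the covariance matrix, which equals the sum of the eigenvalues that PCA distributes across its components.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))          # toy data: 200 samples, 3 features

C = np.cov(X, rowvar=False)            # sample covariance matrix
eigenvalues = np.linalg.eigvalsh(C)

total_variance = np.trace(C)           # sum of per-feature variances
print(np.isclose(total_variance, eigenvalues.sum()))   # True
```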
What are principal components?
Eigenvectors that describe the directions of variance in the data
All PCs are orthogonal, i.e. the covariance between any 2 PCs = 0
What do the eigenvectors and eigenvalues represent?
Eigenvalues are the magnitudes of the variances of the principal components
Eigenvectors describe the directions of variance
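A sketch of both of the last two points, assuming centred toy data in NumPy: the eigenvectors of the covariance matrix give the directions, the eigenvalues match the variances of the data projected onto those directions, and the covariance between any two projected components is (numerically) zero.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))   # correlated toy data
Xc = X - X.mean(axis=0)                                    # centre each column

C = np.cov(Xc, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(C)              # ascending eigenvalues

scores = Xc @ eigenvectors                                 # project onto the PCs
print(np.allclose(np.var(scores, axis=0, ddof=1), eigenvalues))  # variances = eigenvalues
print(np.allclose(np.cov(scores, rowvar=False), np.diag(eigenvalues)))  # off-diagonals ~ 0
```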
What is the PCA algorithm?
- Define the input data matrix (N samples × P features)
- Calculate the mean of each column
- Calculate a new matrix with centred columns (value − mean)
- Calculate the covariance matrix of the centred data
- Calculate its eigenvectors/eigenvalues
- Select the K PCs with the largest eigenvalues and project the data onto them
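A minimal NumPy sketch of these steps, assuming the rows of `X` are samples and `K` is the number of components to keep (the function name `pca` and the toy data are illustrative, not from the cards):

```python
import numpy as np

def pca(X, K):
    """Project X (N samples x P features) onto its K largest principal components."""
    mean = X.mean(axis=0)                            # mean of each column
    Xc = X - mean                                    # centre: value - mean
    C = np.cov(Xc, rowvar=False)                     # P x P covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(C)    # ascending eigenvalues
    order = np.argsort(eigenvalues)[::-1][:K]        # indices of the K largest
    components = eigenvectors[:, order]              # P x K matrix of PCs
    return Xc @ components, eigenvalues[order]       # projected data and their variances

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))                        # toy data
Z, variances = pca(X, K=2)
print(Z.shape, variances)                            # (100, 2) and the top-2 eigenvalues
```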