Final Flashcards
What is Feature Transformation?
The problem of pre-processing a set of features to create a new (smaller, more compact) feature set, while retaining as much relevant, useful information as possible.
What is the relation between feature selection and feature transformation?
Feature Selection is a subset of feature transformation.
Transformation here usually means a linear transformation.
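A minimal NumPy sketch of that idea (the data and the matrix P below are made up purely for illustration): a linear feature transformation is just multiplication of the feature vectors by a d x k matrix, and PCA/ICA are principled ways of choosing that matrix.

```python
import numpy as np

# Toy data: 5 examples with d = 3 original features (made-up numbers).
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 2.0, 2.0],
              [1.0, 0.0, 5.0]])

# A linear feature transformation: project onto k = 2 new features.
# P is an arbitrary d x k matrix here; PCA/ICA are ways of choosing P well.
P = np.array([[0.5, 0.1],
              [0.5, -0.2],
              [0.0, 0.9]])

X_new = X @ P          # shape (5, 2): a smaller, more compact feature set
print(X_new.shape)
```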
Why do feature transformation?
Example: information retrieval (the ad hoc retrieval problem, e.g. the Google search problem). TBC
PCA vs ICA
PCA -> Finds correlations -> by maximizing variance -> gives you the ability to reconstruct the data.
PCA -> Mutually orthogonal components.
PCA -> Maximizes variance.
PCA -> Ordered features (by variance captured).
ICA -> Allows you to analyze your data to discover its fundamental features.
ICA -> Maximal mutual information with the original data.
ICA -> Mutually independent components.
ICA -> Finds a linear transformation of the feature space by maximizing independence.
(Mutual information I(Y_i, Y_j) = 0 between new features, and I(Y, X) as high as possible.)
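A hedged illustration of the contrast using scikit-learn (the Laplace sources and mixing matrix below are an assumed toy setup, not from the flashcards): PCA returns orthogonal directions ordered by explained variance, while FastICA looks for statistically independent components.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)

# Two independent non-Gaussian sources, linearly mixed (classic ICA setup).
S = rng.laplace(size=(500, 2))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])          # mixing matrix
X = S @ A.T

# PCA: orthogonal directions ordered by explained variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print("PCA explained variance ratio:", pca.explained_variance_ratio_)

# ICA: directions chosen for statistical independence (order is arbitrary).
ica = FastICA(n_components=2, random_state=0)
X_ica = ica.fit_transform(X)
print("ICA recovered sources shape:", X_ica.shape)
```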
What is the set of principal components?
1) The direction of greatest variability in the data.
2) Perpendicular to the first: the direction of greatest variability of what is left.
3) ... and so on, up to d (the original dimensionality).
How to do PCA?
1) Normalize the data ("center the data at zero").
2) Compute the covariance matrix:
* Do x1 and x2 tend to increase together?
* Or does x2 decrease as x1 increases?
How to calculate covariance?
cov(a, b) = (1/n) * Σ_i x_i^a · x_i^b (computed on the centered data).
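A small NumPy check of this formula (the data matrix is made up): after centering, cov(a, b) is the mean of the elementwise product of features a and b, which matches np.cov up to the 1/n vs 1/(n-1) convention.

```python
import numpy as np

X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 6.0]])          # n = 4 examples, 2 features

Xc = X - X.mean(axis=0)             # step 1: center the data at zero
n = Xc.shape[0]

# cov(a, b) = (1/n) * sum_i x_i^a * x_i^b  on the centered data
cov_manual = (Xc.T @ Xc) / n
print(cov_manual)

# np.cov divides by (n - 1) by default; bias=True divides by n and matches above.
print(np.cov(X, rowvar=False, bias=True))
```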
Multiplying a random vector by the covariance matrix turns it towards the direction of greatest variance in the data.
We want vectors e which aren't turned: C e = λ e, where C is the covariance matrix and λ the eigenvalue.
- Principal components = eigenvectors with the largest eigenvalues.
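A sketch of that eigenvector view in NumPy (the correlated Gaussian data is assumed for illustration): the eigenvectors of the covariance matrix are not turned by it, and repeatedly multiplying a random vector by the covariance matrix pushes it toward the top principal component.

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated 2-D data (illustrative).
X = rng.multivariate_normal(mean=[0, 0],
                            cov=[[3.0, 1.5],
                                 [1.5, 1.0]],
                            size=1000)

Xc = X - X.mean(axis=0)
C = (Xc.T @ Xc) / Xc.shape[0]        # covariance matrix

# Eigenvectors e with C e = lambda e are not "turned" by C.
eigvals, eigvecs = np.linalg.eigh(C) # eigh: for symmetric matrices, ascending eigenvalues
order = np.argsort(eigvals)[::-1]    # sort descending: largest variance first
principal_components = eigvecs[:, order]
print("eigenvalues (variance along each PC):", eigvals[order])
print("first principal component:", principal_components[:, 0])

# Power-iteration intuition: multiplying by C turns a random vector
# toward the direction of greatest variance.
v = rng.normal(size=2)
for _ in range(50):
    v = C @ v
    v /= np.linalg.norm(v)
print("power iteration converges to:", v)   # matches the first PC up to sign
```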
Formulation of PCA Problem
Reduce from 2 dimensions to 1: find a direction (a vector u ∈ R^n) onto which to project the data so as to minimize the projection error.
Reduce from n dimensions to k: find k vectors u_1, u_2, ..., u_k onto which to project the data so as to minimize the projection error.
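A minimal sketch of that formulation (illustrative data again): project onto the top-k eigenvectors U_k and measure the average squared projection error, which is the quantity PCA minimizes.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0, 0],
                            [[4.0, 2.0, 0.5],
                             [2.0, 3.0, 0.3],
                             [0.5, 0.3, 1.0]],
                            size=500)        # n-dimensional data, here n = 3

Xc = X - X.mean(axis=0)
C = (Xc.T @ Xc) / Xc.shape[0]
eigvals, eigvecs = np.linalg.eigh(C)
U = eigvecs[:, np.argsort(eigvals)[::-1]]    # columns u_1 ... u_n, ordered by variance

k = 2                                        # reduce from 3 dimensions to k = 2
U_k = U[:, :k]

Z = Xc @ U_k                                 # projected (k-dimensional) representation
X_hat = Z @ U_k.T                            # reconstruction back in n dimensions

# Average squared projection error: the quantity PCA minimizes over the choice of U_k.
proj_error = np.mean(np.sum((Xc - X_hat) ** 2, axis=1))
print("k =", k, "average projection error:", proj_error)
```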