Lecture 17 Flashcards
PCA what is it? (+/-)
pros: reduces dimension by combining the feature variables into linear combinations that retain the most valuable parts of all of the features; the new variables (principal components) are uncorrelated with one another; strong correlation between the original features is what makes the dimension reduction effective
cons: loss of interpretability
PCA how to overall
Find the directions of maximum variance in a high-dimensional dataset (n dimensions) and project the data onto a subspace of smaller dimension (k dimensions, with k < n), while retaining most of the information.
PCA eq
A = X.T@X (n x n): covariance matrix of the (mean-centered) data X
Diagonalize A = UDU^{-1} (D: λ1, …, λn; U: u1, …, un)
New features X* = XU such that (X*).T@(X*) = D
The first 2 columns of X* (the principal components) explain the largest share of the variance (λ1/λtotal + λ2/λtotal)
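A minimal numpy sketch of the recipe above, assuming X is an (m x n) data matrix that gets mean-centered first; the data and variable names are illustrative only:

```python
import numpy as np

# Toy data: 100 samples, 3 features, with correlation injected (illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = 0.9 * X[:, 0] + 0.1 * X[:, 2]

X = X - X.mean(axis=0)          # center so X.T @ X acts as a covariance matrix
A = X.T @ X                     # n x n covariance matrix
lam, U = np.linalg.eigh(A)      # eigh: A is symmetric; eigenvalues come ascending
idx = np.argsort(lam)[::-1]     # sort descending so lambda_1 >= lambda_2 >= ...
lam, U = lam[idx], U[:, idx]

X_star = X @ U                  # new features; columns are uncorrelated
k = 2
explained = lam[:k].sum() / lam.sum()   # (lambda_1 + lambda_2) / lambda_total
print(X_star[:, :k].shape, explained)
```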
Diagonalization
A (n x n) with n linearly independent eigenvectors ui, then A = UDU^{-1} (columns of U are the linearly independent, normalised eigenvectors; D holds the eigenvalues)
If there are fewer than n linearly independent eigenvectors, A is defective and not diagonalizable
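A quick numpy check of A = UDU^{-1}; the 2 x 2 matrix below is just an illustrative example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # illustrative diagonalizable matrix
lam, U = np.linalg.eig(A)             # columns of U are normalised eigenvectors
D = np.diag(lam)
A_rebuilt = U @ D @ np.linalg.inv(U)  # A = U D U^{-1}
print(np.allclose(A, A_rebuilt))      # True when A has n independent eigenvectors
```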
nxn symmetric matrix A with n distinct eigenvalues ?
n distinct eigenvalues give n linearly independent eigenvectors, hence diagonalizable; for symmetric A the eigenvectors can be chosen orthonormal, so U^{-1} = U.T (/!\ converse fails: a diagonalizable A does not necessarily have distinct eigenvalues)
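A small numpy check of the symmetric case (the matrix is illustrative; its eigenvalues happen to be distinct):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])               # symmetric example
lam, U = np.linalg.eigh(A)                    # orthonormal eigenvectors for symmetric A
print(np.allclose(U.T @ U, np.eye(3)))        # eigenvectors are orthonormal
print(np.allclose(A, U @ np.diag(lam) @ U.T)) # A = U D U^T
```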
Identity matrix diagonalization
Iv = 1·v for every v, so I = UDU^{-1} with all eigenvalues equal to 1 and u1 = (1 0 ... 0), u2 = (0 1 0 ... 0), ... (i.e. U = I, D = I)
SVD
Factorization of an m x n matrix A = UΣV.T where U is m x m orthogonal, V is n x n orthogonal, Σ is m x n diagonal (σ1 >= σ2 >= …)
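A minimal check of the factorization with numpy (the 2 x 3 matrix is illustrative):

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])              # 2 x 3 example
U, s, Vt = np.linalg.svd(A)                   # U: 2x2, Vt: 3x3, s: singular values
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)          # embed sigma_i into an m x n matrix
print(np.allclose(A, U @ Sigma @ Vt))         # True
```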
SVD, V?
A.T@A = V@Σ^2@V.T
Columns of V are eigenvectors of A.T@A (orthonormal: vi·vj = 0 for i ≠ j; right singular vectors)
SVD, A@A.T eigenvectors?
A@A.T = U@Σ^2@U.T
Columns of U are eigenvectors of A@A.T (left singular vectors)
SVD computation
1) A.T@A
2) Get V, D = eigen(A.T@A) (columns of V are right singular vectors, D holds eigenvalues, sorted in decreasing order)
3) σi = √λi or Σ=D^{1/2} (singular values)
4) U = AVΣ^{-1} (columns are left singular vectors)
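A numpy sketch following steps 1–4 above, assuming a full-column-rank matrix so that Σ^{-1} exists (this yields the reduced U; the example values are illustrative):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])                    # m x n example, full column rank

lam, V = np.linalg.eigh(A.T @ A)              # 1) + 2) eigendecomposition of A^T A
idx = np.argsort(lam)[::-1]                   # sort eigenpairs in decreasing order
lam, V = lam[idx], V[:, idx]

sigma = np.sqrt(lam)                          # 3) sigma_i = sqrt(lambda_i)
U = A @ V @ np.diag(1.0 / sigma)              # 4) U = A V Sigma^{-1} (reduced, m x n)

Sigma = np.diag(sigma)                        # n x n here since rank(A) = n
print(np.allclose(A, U @ Sigma @ V.T))        # reconstructs A
```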
A.T@A eigenvalues
Non-negative, because A.T@A is symmetric and positive semi-definite, so the square root σi = √λi always exists
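A quick numerical check that A.T@A has non-negative eigenvalues (random matrix, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))           # any real matrix
lam = np.linalg.eigvalsh(A.T @ A)     # A^T A is symmetric positive semi-definite
print(np.all(lam >= -1e-12))          # eigenvalues non-negative (up to round-off)
```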
Is an orthogonal matrix singular?
No: an orthogonal matrix X satisfies X^{-1} = X.T, so it is always invertible
Can Σ have zeros on its diagonal?
Yes, a singular value can be zero (this happens when A is not full rank)
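A small illustrative check on a rank-deficient matrix, where one singular value is zero:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                    # rank 1: second row is twice the first
s = np.linalg.svd(A, compute_uv=False)        # singular values only
print(s)                                      # one positive singular value, one ~0
```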
Singular values of A?
Squared singular values of A are eigenvalues of A.T@A (σi = √λi)