Module 2: 5. Dimensionality Reduction and Visualization Flashcards

1
Q

Name two techniques to perform dimensionality reduction.

A

PCA and t-SNE
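Both techniques are available in scikit-learn. A minimal sketch (scikit-learn and the load_digits dataset are assumptions for illustration, not part of the card) reducing 64-dimensional digit images to 2-D with each method:

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE

    X, y = load_digits(return_X_y=True)    # X has shape (1797, 64)

    # PCA: linear projection onto the directions of maximal variance
    X_pca = PCA(n_components=2).fit_transform(X)

    # t-SNE: non-linear embedding that tries to preserve local neighborhoods
    X_tsne = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(X)

    print(X_pca.shape, X_tsne.shape)       # (1797, 2) (1797, 2)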

2
Q

True or False.
By default, a vector is a row vector.

A

False. By convention, a vector is a column vector (d × 1) unless stated otherwise.

3
Q

Given 2 vectors x1 = [2.2, 4.2] and x2 = [1.2, 3.2].
Calculate x1 + x2 and the mean vector x̄.

A

x1 + x2 = [3.4, 7.4]
x̄ = (1/n) Σ xi = (1/2)(x1 + x2) = [1.7, 3.7]
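The same arithmetic as a quick NumPy check (NumPy is an assumption here; the card itself is pen-and-paper):

    import numpy as np

    x1 = np.array([2.2, 4.2])
    x2 = np.array([1.2, 3.2])

    print(x1 + x2)          # [3.4 7.4]
    print((x1 + x2) / 2)    # [1.7 3.7] -> x-bar, the mean vector (n = 2)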

4
Q

Explain Feature/Column Normalization.

A

a1, a2, …, an → the n values of feature fj

ai' = (ai − amin) / (amax − amin) → ai' ∈ [0, 1]

Check the endpoints:
amin' = (amin − amin) / (amax − amin) = 0
amax' = (amax − amin) / (amax − amin) = 1

a1, a2, …, an → column normalization → a1', a2', …, an'
such that ai' ∈ [0, 1]

Basically we transform the data and move it into the unit square (a unit hypercube in higher dimensions).
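A minimal NumPy sketch of column normalization (the toy matrix is made up for illustration):

    import numpy as np

    X = np.array([[1.0, 200.0],
                  [2.0, 300.0],
                  [4.0, 250.0]])

    a_min = X.min(axis=0)                  # per-column minimum
    a_max = X.max(axis=0)                  # per-column maximum
    X_norm = (X - a_min) / (a_max - a_min)

    print(X_norm)                          # every entry lies in [0, 1]

sklearn.preprocessing.MinMaxScaler implements the same transformation.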

5
Q

Explain Feature/Column Standardization.

A

a1, a2, …, an → column standardization → a1', a2', …, an'
such that mean(ai') = 0 and std-dev(ai') = 1

Formula:
ai' = (ai − ā) / σ, where ā = mean of the ai's and σ = std-dev of the ai's

Column standardization, summarized:
1. Move the mean to the origin.
2. Squish/expand the data so that the std-dev of every feature is 1.
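A matching NumPy sketch (same made-up toy matrix as in the normalization card; sklearn.preprocessing.StandardScaler does the same thing):

    import numpy as np

    X = np.array([[1.0, 200.0],
                  [2.0, 300.0],
                  [4.0, 250.0]])

    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    print(X_std.mean(axis=0))   # ~[0. 0.] -> mean moved to the origin
    print(X_std.std(axis=0))    # [1. 1.]  -> unit std-dev per feature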

6
Q

Explain covariance of a data matrix.

A
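For a column-standardized data matrix X of shape n × d (every column has mean 0), the covariance matrix is
S = (1/n) XᵀX,
a d × d matrix whose entry Sij is the covariance between features fi and fj (see the covariance-matrix card below).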
7
Q

How do you convert a 28 × 28 matrix of pixels to a vector?

A

28 × 28 matrix of pixels → row flattening → vector (784 × 1)
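A one-line version in NumPy (the random image stands in for, e.g., an MNIST digit):

    import numpy as np

    img = np.random.rand(28, 28)     # a 28 x 28 image
    vec = img.reshape(-1, 1)         # row-major ("row") flattening -> (784, 1)

    print(vec.shape)                 # (784, 1)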

8
Q

Explain PCA with 2 examples.

A

PCA stands for Principal Component Analysis.
We use PCA for dimensionality reduction.

Eg 1.
f1 : blackness of hair
f2 : height
The spread on f2 is high and the spread on f1 is minimal.
If we have to convert this to 1-D, we can consider skipping f1, since the spread on f1 is minimal.
We should preserve the feature with maximal spread.

Eg 2.
X → a 2-D dataset, say it is column standardized,
i.e. mean(f1) = mean(f2) = 0 and variance(f1) = variance(f2) = 1.
The spread on both f1 and f2 is significant.
If we consider rotated axes f1' and f2', spread(f2') << spread(f1'), with f1' perpendicular to f2'.
So basically we rotate (f1, f2) by an angle θ such that the variance of the xi's projected onto f1' is maximal, and then drop f2'.
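A sketch of Eg 2 on synthetic data (the covariance values and θ = 45° are assumptions chosen so the rotation works out; NumPy assumed):

    import numpy as np

    rng = np.random.default_rng(0)
    # correlated 2-D data, then column standardized
    X = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=1000)
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    theta = np.pi / 4                          # 45 degrees fits this symmetric cloud
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    X_rot = X @ R                              # coordinates on the (f1', f2') axes

    print(X_rot.var(axis=0))                   # ~[1.9 0.1]: spread(f2') << spread(f1')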

9
Q

Derive the mathematical objective function of PCA.

A

u1 : unit vector (same direction as f1'), ||u1|| = 1
X → column standardized
xi' = projection of xi on u1
    = (u1 · xi) / ||u1||
    = u1ᵀ xi
x̄' = u1ᵀ x̄

Find u1 s.t. variance{projection of xi on u1} is maximal:
var{u1ᵀ xi} = (1/n) Σ (u1ᵀ xi − u1ᵀ x̄)²
Since X is column standardized, x̄ = [0, 0, …, 0], so
var{u1ᵀ xi} = (1/n) Σ (u1ᵀ xi)²

Objective function of PCA: max over u1 of (1/n) Σ (u1ᵀ xi)²
s.t. u1ᵀ u1 = ||u1||² = 1
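The maximizer of this objective is the top eigenvector of the covariance matrix S = (1/n) XᵀX (the standard result via Lagrange multipliers). A hedged NumPy sketch on the same synthetic data as in the previous card:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=1000)
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # column standardize

    S = (X.T @ X) / len(X)                     # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    u1 = eigvecs[:, -1]                        # top eigenvector: the optimal u1

    # the objective value at u1 is the top eigenvalue lambda_1
    print((X @ u1).var(), eigvals[-1])         # both ~lambda_1 (~1.9 here)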

10
Q

Derive the alternative formulation of PCA: distance minimization.

A

di : the distance from xi to the line along u1, where u1 is a unit vector (u1ᵀ u1 = ||u1||² = 1)
Minimize the total squared distance: min over u1 of Σ di²
By Pythagoras, di² = ||xi||² − (u1ᵀ xi)²
                   = xiᵀ xi − (u1ᵀ xi)²

Distance-minimization PCA:
min over u1 of Σ ( xiᵀ xi − (u1ᵀ xi)² )
s.t. u1ᵀ u1 = ||u1||² = 1

Since Σ xiᵀ xi does not depend on u1, minimizing this is the same as maximizing Σ (u1ᵀ xi)², i.e. the variance-maximization formulation above.
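A numerical confirmation (brute force over candidate angles in the 2-D case; the setup is synthetic) that both formulations pick the same u1:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=1000)
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    thetas = np.linspace(0, np.pi, 1800)
    U = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)  # candidate unit u1's

    proj_sq = (X @ U.T) ** 2                        # (u1^T xi)^2 for every candidate
    variance = proj_sq.mean(axis=0)                 # objective 1: maximize
    sq_dist = (X ** 2).sum() - proj_sq.sum(axis=0)  # objective 2: sum of di^2, minimize

    print(thetas[variance.argmax()] == thetas[sq_dist.argmin()])   # True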

11
Q

Explain Covariance matrix.

A

The covariance matrix is a square, symmetric matrix whose diagonal elements are the variances of the individual features and whose off-diagonal elements are the covariances between pairs of features.
Sij = Covariance(fi, fj) ; i : 1 → d ; j : 1 → d
Covariance(X, Y) = (1/n) Σ (xi − μx)(yi − μy)
Covariance(X, X) = Variance(X) → (1)
Covariance(fi, fj) = Covariance(fj, fi) → (2)
Therefore,
Sij = Covariance(fi, fj) = Covariance(fj, fi) = Sji
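A quick check on synthetic data that S = (1/n) XᵀX for column-standardized X agrees with np.cov, and that S is symmetric:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    X = (X - X.mean(axis=0)) / X.std(axis=0)        # column standardize

    S = (X.T @ X) / len(X)
    print(np.allclose(S, np.cov(X.T, bias=True)))   # True (bias=True -> 1/n)
    print(np.allclose(S, S.T))                      # True: Sij = Sji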

15
Q

What are the limitations of PCA?

A
  1. When λ1 ≈ λ2, the information lost is very high (see the sketch below).
     e.g. a sine wave, a circular distribution of points, etc.
  2. PCA can be used for dimensionality reduction but is not well suited for visualization (t-SNE typically preserves structure better for that purpose).
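A sketch of limitation 1 (points on a unit circle, made up for illustration): both eigenvalues are equal, so keeping only the top component retains only about half of the variance:

    import numpy as np

    t = np.linspace(0, 2 * np.pi, 500, endpoint=False)
    X = np.stack([np.cos(t), np.sin(t)], axis=1)   # circular distribution, zero mean

    S = (X.T @ X) / len(X)                         # covariance matrix
    eigvals = np.linalg.eigvalsh(S)
    print(eigvals)                                 # [0.5 0.5] -> lambda_1 ~ lambda_2
    print(eigvals[-1] / eigvals.sum())             # ~0.5 of the variance kept in 1-D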