Course session 4 (Dimensionality reduction) Flashcards
What is the curse of dimensionality?
The volume of a space grows exponentially with its dimensionality, so a fixed number of samples covers the space increasingly sparsely.
Classification accuracy depends on the dimensionality (and the amount of training data), and the computational complexity also grows with the dimensionality.
Why do we want to reduce the dimensionality?
Reduce time complexity: Less computation.
Reduce space complexity: Fewer parameters.
Simpler models are more robust on small datasets.
More interpretable & simpler explanation.
Data visualization (structure, groups, outliers, etc.) when the data can be plotted in 2 or 3 dimensions.
What are the two general methods of doing dimensionality reduction?
Feature selection, which tries to find a subset of the original variables. This is done by choosing M < D important features and ignoring the remaining D − M features.
Feature extraction, which applies a mapping of the multidimensional space into a space of fewer dimensions. This is done by projecting the original dimensions x_i, i = 1, …, D onto new dimensions z_j, j = 1, …, M, with M < D.
Name the methods of feature selection
Forward search
Hill-climbing algorithm
Backward search
Floating search
How does forward search work?
The method adds the best feature at each step.
Let the set of selected features F initially be empty (F = Ø). Then, at each iteration, find the best new feature
j = argmin_i E(F ∪ x_i),
where E is the error of a model trained on the given feature subset. Add x_j to F if E(F ∪ x_j) < E(F).
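A minimal sketch of forward search, assuming an error function E over feature subsets (e.g., the cross-validated error of a classifier trained on that subset); the names here are illustrative, not from the course material:

```python
def forward_search(all_features, E):
    """Sequential forward selection (a minimal sketch).

    all_features: iterable of candidate features.
    E: assumed error function over feature subsets; E(set()) should
       return a baseline error (e.g., majority-class error).
    """
    selected = set()
    while True:
        remaining = [f for f in all_features if f not in selected]
        if not remaining:
            break
        # Forward step: the feature whose addition yields the lowest error.
        best = min(remaining, key=lambda f: E(selected | {f}))
        # Add it only if it actually improves on the current subset.
        if E(selected | {best}) >= E(selected):
            break
        selected.add(best)
    return selected
```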
How does the hill-climbing algorithm work?
Start with an arbitrary or random solution. Identify neighboring states of the current solution by making small adjustments. If one of the neighboring states offers a better solution (according to some evaluation function), move to this new state. Repeat this process until no neighboring state is better than the current one; at this point, you have reached a local maximum.
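A generic sketch of hill climbing, assuming a neighbor-generating function and a score to maximize (all names are illustrative placeholders):

```python
def hill_climb(initial, neighbors, score, max_iters=1000):
    """Generic hill climbing.

    initial: starting solution.
    neighbors: function returning the neighboring states of a solution.
    score: evaluation function to maximize.
    """
    current = initial
    for _ in range(max_iters):
        candidates = neighbors(current)
        best = max(candidates, key=score, default=current)
        # Stop once no neighbor improves on the current solution.
        if score(best) <= score(current):
            return current  # local maximum reached
        current = best
    return current
```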
How does backward search work?
It starts from the full feature set and iteratively removes one feature at a time, at each step discarding the feature with the least predictive power (the one whose removal hurts the error the least).
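A minimal sketch, mirroring the forward-search sketch above and assuming the same illustrative error function E over feature subsets:

```python
def backward_search(all_features, E):
    """Sequential backward elimination (mirror image of forward search)."""
    selected = set(all_features)
    while len(selected) > 1:
        # The feature whose removal increases the error the least.
        worst = min(selected, key=lambda f: E(selected - {f}))
        # Stop once every removal would make the error strictly worse.
        if E(selected - {worst}) > E(selected):
            break
        selected.remove(worst)
    return selected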
How does floating search work?
It adds k features and then removes l features. Both k and l are varied, thereby searching for the best feature subset of a given dimension.
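A rough sketch of one floating variant, sequential floating forward selection (SFFS), again assuming the illustrative error function E; full SFFS implementations also track the best subset found at each size to rule out cycling:

```python
def floating_forward_search(all_features, E, target_dim, max_rounds=100):
    """Sketch of sequential floating forward selection (SFFS).

    After each forward step, backward steps are taken as long as
    dropping a feature improves the error, so the subset size floats.
    """
    selected = set()
    for _ in range(max_rounds):  # cap the rounds; see the cycling caveat above
        if len(selected) >= target_dim:
            break
        remaining = [f for f in all_features if f not in selected]
        if not remaining:
            break
        # Forward step: add the single best new feature.
        best = min(remaining, key=lambda f: E(selected | {f}))
        selected.add(best)
        # Conditional backward steps: drop features while that helps.
        while len(selected) > 2:
            worst = min(selected, key=lambda f: E(selected - {f}))
            # Never drop the feature just added, and only drop if it helps.
            if worst == best or E(selected - {worst}) >= E(selected):
                break
            selected.remove(worst)
    return selected
```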
What are the two main feature extraction methods?
Principal component analysis (PCA)
Linear discriminant analysis (LDA)
What is principal component analysis?
It is an unsupervised linear method for dimensionality reduction.
It can be defined in two ways:
* The linear projection that minimizes the average projection cost, defined as the mean squared distance between the data points and their projections.
* The direction in the data space that has the highest variance. The eigenvector corresponding to the largest eigenvalue of the data covariance matrix is the first principal component.
The principal components must be mutually orthogonal.
What does the eigenvalue in principal component analysis mean?
The magnitude of an eigenvalue indicates how much of the data's variance the corresponding principal component explains.
How is principal component analysis performed in practice?
Center the data and construct its covariance matrix. Compute the eigenvectors and eigenvalues of this matrix. Project the data onto the M eigenvectors with the largest eigenvalues.
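A minimal NumPy sketch of these steps (function name and return values are illustrative):

```python
import numpy as np

def pca(X, M):
    """PCA via eigendecomposition of the covariance matrix.

    X: (N, D) data matrix; M: number of components to keep.
    Returns the projected data (N, M) and the sorted eigenvalues.
    """
    X_centered = X - X.mean(axis=0)            # center the data
    cov = np.cov(X_centered, rowvar=False)     # (D, D) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = X_centered @ eigvecs[:, :M]            # project onto top M components
    return Z, eigvals
```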
How should the number of dimensions M < D be chosen in principal component analysis?
The proportion of variance is defined as the fraction of the sum of the M largest eigenvalues over the sum of all D eigenvalues: PoV = (λ_1 + … + λ_M) / (λ_1 + … + λ_D), with the eigenvalues sorted in decreasing order. Choose the smallest M for which the proportion of variance exceeds a threshold, typically 0.9.
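A small snippet for this choice, assuming eigenvalues sorted in decreasing order (e.g., from the pca sketch above):

```python
import numpy as np

# eigvals: eigenvalues in decreasing order
pov = np.cumsum(eigvals) / np.sum(eigvals)
M = np.argmax(pov >= 0.9) + 1  # smallest M with proportion of variance >= 0.9
```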
Define the Fisher criterion.
It is the projection of the data that maximizes the ratio of the between-class variance to the within-class variance.
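In the standard two-class form (not spelled out in the card above), writing m_1, m_2 for the projected class means and s_1², s_2² for the within-class scatters, the projection w maximizes
J(w) = (m_2 − m_1)² / (s_1² + s_2²).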
Define linear discriminant analysis.
Find a low-dimensional space such that, when x is projected onto it, the classes are well separated.
For two classes, this reduces to the Fisher criterion.
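As an illustration (not part of the original card), scikit-learn's LinearDiscriminantAnalysis computes such a projection; the toy data here are made up:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy labeled data: two Gaussian classes in 2D (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# For C classes, LDA can project to at most C - 1 dimensions.
lda = LinearDiscriminantAnalysis(n_components=1)
Z = lda.fit_transform(X, y)  # (100, 1): the Fisher projection of the data
```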