Dimensionality Reduction (Brainscape) Flashcards

1
Q

What are the main motivations for reducing a dataset’s dimensionality (3)? What are the main drawbacks (4)?

A

1) To speed up subsequent training algorithms
2) To visualize the data and gain insight into the most important features
3) To save space (data compression)

The main drawbacks are:

1) Some information is lost
2) Can be computationally intensive
3) It adds some complexity to your machine learning pipelines.
4) Transformed features are often hard to interpret.

2
Q

What is the curse of dimensionality?

A

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.

For example, randomly sampled high-dimensional vectors are generally very sparse (most points lie far from all the others), which increases the risk of overfitting.
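
A minimal NumPy sketch of this effect (the point counts and dimensions are arbitrary choices for illustration): the average distance between random points grows roughly with the square root of the dimension, so in high dimensions every point is far from every other.

import numpy as np

rng = np.random.default_rng(42)
for d in (2, 100, 10_000):
    # 1,000 random points in the d-dimensional unit hypercube
    points = rng.random((1_000, d))
    # distances between 500 random pairs of points
    dists = np.linalg.norm(points[:500] - points[500:], axis=1)
    print(f"d={d:>6}: mean pairwise distance ~ {dists.mean():.2f}")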

3
Q

Once a dataset’s dimensionality has been reduced, is it possible to reverse the operation? If so, how? If not, why not?

A

It is almost always impossible to perfectly reverse the operation, because some information is lost during dimensionality reduction. However, it is possible to estimate with good accuracy what the original dataset looked like.
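
A minimal sketch of such an approximate reversal, assuming scikit-learn is installed: PCA's inverse_transform maps the compressed data back to the original space, leaving only a small reconstruction error.

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data                          # 1797 samples, 64 features
pca = PCA(n_components=0.95)                    # keep 95% of the variance
X_reduced = pca.fit_transform(X)                # compression
X_recovered = pca.inverse_transform(X_reduced)  # approximate decompression
print(X.shape, "->", X_reduced.shape)
print("mean squared reconstruction error:", ((X - X_recovered) ** 2).mean())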

4
Q

Does it make any sense to chain two different dimensionality reduction algorithms?

A

It can absolutely make sense. A common example is using PCA to quickly get rid of a large number of useless dimensions, then applying another, much slower dimensionality reduction algorithm such as LLE (Locally Linear Embedding).
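
A minimal sketch of such a chain, assuming scikit-learn; the digits dataset stands in for any high-dimensional data:

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.pipeline import make_pipeline

X = load_digits().data                        # 64-dimensional inputs
pca_lle = make_pipeline(
    PCA(n_components=30),                     # fast: drop near-useless dimensions
    LocallyLinearEmbedding(n_components=2),   # slow: unroll the remaining manifold
)
X_2d = pca_lle.fit_transform(X)
print(X.shape, "->", X_2d.shape)              # (1797, 64) -> (1797, 2)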

5
Q

What is the main problem of dimensionality reduction?

A

You lose some information.
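
With PCA, the lost information can be quantified through the explained variance ratio. A minimal sketch, assuming scikit-learn:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)    # variance captured per component
print("variance lost:", 1 - pca.explained_variance_ratio_.sum())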

6
Q

What is the main idea of manifold learning?

A

Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many datasets is only artificially high: the data actually lies on or near a much lower-dimensional manifold embedded in the high-dimensional space.
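
The classic illustration is the Swiss roll: 3D points that actually lie on a rolled-up 2D sheet. A minimal sketch, assuming scikit-learn:

from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# 3D points lying on a rolled-up 2D surface
X, t = make_swiss_roll(n_samples=1_000, noise=0.1, random_state=42)
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10)
X_unrolled = lle.fit_transform(X)         # recover the underlying 2D sheet
print(X.shape, "->", X_unrolled.shape)    # (1000, 3) -> (1000, 2)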

7
Q

Applications of dimensionality reduction

A

Customer relationship management
Text mining
Image retrieval
Microarray data analysis
Protein classification
Face recognition
Handwritten digit recognition
Intrusion detection

8
Q

Feature Selection

A

A process that chooses an optimal subset of features according to an objective function

Objectives: reduce dimensionality and remove noise; improve learning speed, predictive accuracy, and simplicity

Think stepwise / forward / backward regressions
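
A minimal sketch of forward selection with scikit-learn's SequentialFeatureSelector (the estimator and the number of features to keep are arbitrary choices for illustration):

from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)
# greedily add one feature at a time, keeping the best-scoring subset
sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=4, direction="forward"
)
sfs.fit(X, y)
print("selected feature mask:", sfs.get_support())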

9
Q

Feature Extraction

A

The mapping of the original high-dimensional data to a lower-dimensional space

Goals can change based on end usage:
Unsupervised learning - minimize information loss (PCA)
Supervised learning - maximize class discrimination (LDA)

Think PCA
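
A minimal sketch contrasting the two goals, assuming scikit-learn; both map the 13 wine features down to 2, but LDA also uses the class labels:

from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_wine(return_X_y=True)             # 13 features, 3 classes
# unsupervised: preserve as much variance as possible
X_pca = PCA(n_components=2).fit_transform(X)
# supervised: maximize separation between the classes
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print(X_pca.shape, X_lda.shape)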

10
Q

Pros of feature reduction

A

All original features are used, although they may not appear in their original form: they are combined linearly into the new features, so no potentially useful signal is discarded outright.

In feature selection, by contrast, only a subset of the original features is kept.
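
This can be seen directly in PCA's components_ matrix, where each new feature is a weighted (linear) combination of all the original ones. A minimal sketch, assuming scikit-learn:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

data = load_iris()
pca = PCA(n_components=2).fit(data.data)
# each row holds the weights combining ALL four original features
for weights in pca.components_:
    print(dict(zip(data.feature_names, weights.round(2))))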

11
Q

Feature selection methods

A

Remove features with missing values
Remove features with low variance
Remove highly correlated features
Univariate feature selection
Feature selection from a model (e.g., scikit-learn's SelectFromModel)
Filter methods
Wrapper methods
Embedded methods
Hybrid methods
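
A minimal sketch of two of these methods in scikit-learn (the dataset, thresholds, and estimator are arbitrary choices for illustration):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, VarianceThreshold

X, y = load_breast_cancer(return_X_y=True)

# remove features with (near-)zero variance
X_hv = VarianceThreshold(threshold=0.01).fit_transform(X)

# keep features whose model importance is above the median
sfm = SelectFromModel(RandomForestClassifier(random_state=42), threshold="median")
X_sfm = sfm.fit_transform(X, y)
print(X.shape[1], "->", X_hv.shape[1], "and", X_sfm.shape[1], "features")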

12
Q

Filter Methods for Feature Selection

A

Filter features based on:
Information gain
Chi-squared test
Fisher's score
Correlation coefficient
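
A minimal sketch of univariate filtering with scikit-learn's SelectKBest; mutual_info_classif gives an information-gain-style score, and chi2 implements the chi-squared test (iris is used here only because chi2 requires non-negative features):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

X, y = load_iris(return_X_y=True)
X_chi2 = SelectKBest(chi2, k=2).fit_transform(X, y)               # chi-squared filter
X_mi = SelectKBest(mutual_info_classif, k=2).fit_transform(X, y)  # mutual-information filter
print(X_chi2.shape, X_mi.shape)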

13
Q

Information gain

A

Calculates the reduction in entropy that results from transforming (e.g., splitting) a dataset on a given feature; the larger the reduction, the more informative the feature.
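
A minimal worked example in plain NumPy (the labels and the split are made up for illustration):

import numpy as np

def entropy(labels):
    # Shannon entropy in bits
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

parent = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # entropy = 1.0 bit
left = np.array([0, 0, 0, 0, 1])                   # one side of a candidate split
right = np.array([0, 1, 1, 1, 1])                  # the other side
weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(parent)
print("information gain:", entropy(parent) - weighted)  # ~0.28 bits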

14
Q

Fisher Score

A

Fisher's score is one of the most widely used supervised feature selection methods.

The algorithm ranks the variables by their Fisher score: for each feature, the ratio of between-class variance to within-class variance, so higher scores indicate features that separate the classes better.
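
A minimal NumPy sketch of one common formulation (the dataset is an arbitrary choice for illustration):

import numpy as np
from sklearn.datasets import load_iris

def fisher_scores(X, y):
    # F_j = sum_k n_k (mu_kj - mu_j)^2 / sum_k n_k var_kj, per feature j
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / within

X, y = load_iris(return_X_y=True)
scores = fisher_scores(X, y)
print("features ranked best-first:", np.argsort(scores)[::-1])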
