What is Dimensionality Reduction Flashcards

Question 1

Q

WHAT IS DIMENSIONALITY? P355

Answer

A

The number of input variables or features for a dataset is referred to as its dimensionality.
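
As a minimal sketch, assuming the dataset is held in a NumPy array with one row per example and one column per input feature (the values below are made up for illustration), the dimensionality is simply the number of columns:

```python
import numpy as np

# Hypothetical dataset: 4 rows (examples) by 3 columns (input features).
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 3.4, 5.4],
    [5.9, 3.0, 5.1],
])

n_examples, n_features = X.shape
print(n_features)  # the dimensionality of the dataset
```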

Question 2

Q

WHAT IS THE “CURSE OF DIMENSIONALITY”? P355

Answer

A

Having more input features often makes a predictive modeling task more challenging; this effect is generally referred to as the curse of dimensionality.
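
One commonly cited symptom of the curse of dimensionality is that distances between points become harder to tell apart as features are added. The sketch below illustrates this on synthetic uniform random data; the helper name `relative_contrast` and the sample sizes are assumptions chosen for the example, not something taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_contrast(n_features, n_points=500):
    """(max - min) / min of the distances from one query point to random points."""
    X = rng.random((n_points, n_features))   # points uniform in the unit hypercube
    query = rng.random(n_features)
    dists = np.linalg.norm(X - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

contrasts = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"{d:>4} features: relative contrast = {c:.2f}")
```

With few features the nearest point is far closer than the farthest one; with many features the two distances are nearly the same, which weakens any method that relies on nearest neighbors.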

Question 3

Q

EXTERNAL Q: WHAT IS DEGREE OF FREEDOM IN ML?

Answer

A

In machine learning, degrees of freedom commonly refers to the number of parameters of a model.
Parameters, in machine learning and deep learning, are the values the learning algorithm can change independently as it learns; these values are affected by the hyperparameters you choose.
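
Taking the card's definition literally, the degrees of freedom can be counted directly. The sketch below does so for a linear model and for a small fully connected network; both helper functions are hypothetical names written for this example, and the layer sizes stand in for a hyperparameter choice.

```python
def linear_model_param_count(n_features):
    """One weight per input feature plus one bias (intercept)."""
    return n_features + 1

def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network, e.g. [10, 4, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

print(linear_model_param_count(10))   # 11
print(mlp_param_count([10, 4, 1]))    # 10*4 + 4 + 4*1 + 1 = 49
```

Changing the layer sizes (a hyperparameter) changes how many parameters the learning algorithm is free to adjust.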

Question 4

Q

AT WHICH STAGE OF THE PROJECT DO WE PERFORM DIMENSIONALITY REDUCTION? P356

Answer

A

Dimensionality reduction is a data preparation technique performed on data prior to modeling. It might be performed after data cleaning and data scaling and before training a predictive model.
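
The ordering described above (clean, scale, reduce, then model) can be sketched with NumPy on synthetic data. Everything here is an assumption made for illustration: the data, the mean imputation used as "cleaning", the correlation-based feature selection used as the reduction step, and the least-squares model. Any other reduction method could take the place of step 3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 rows, 8 features; only features 0 and 1 drive the target.
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
X[rng.random(X.shape) < 0.02] = np.nan          # inject a few missing values

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# 1. Data cleaning: fill missing values with the training-column means.
col_means = np.nanmean(X_train, axis=0)
X_train = np.where(np.isnan(X_train), col_means, X_train)
X_test = np.where(np.isnan(X_test), col_means, X_test)

# 2. Data scaling: standardize using training statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train, X_test = (X_train - mu) / sigma, (X_test - mu) / sigma

# 3. Dimensionality reduction: keep the k features most correlated with y.
k = 2
corr = np.array([abs(np.corrcoef(X_train[:, j], y_train)[0, 1])
                 for j in range(X_train.shape[1])])
keep = np.sort(np.argsort(corr)[-k:])
X_train_red, X_test_red = X_train[:, keep], X_test[:, keep]

# 4. Modeling: ordinary least squares on the reduced inputs.
A_train = np.column_stack([X_train_red, np.ones(len(X_train_red))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
A_test = np.column_stack([X_test_red, np.ones(len(X_test_red))])
mse = np.mean((A_test @ coef - y_test) ** 2)
print(keep, X_test_red.shape, mse)
```

Note that the imputation means, the scaling statistics, and the selected columns are all computed from the training rows and then reused on the test rows.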

Question 5

Q

WHAT ARE THE MAIN TECHNIQUES FOR DIMENSIONALITY REDUCTION? P357

Answer

A

Feature Selection Methods
Matrix Factorization: the most common method is Principal Component Analysis (PCA)
Manifold Learning: often used for data visualization
Autoencoder Methods
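
To make the matrix factorization entry concrete, here is a minimal sketch of PCA computed through a singular value decomposition in NumPy. The synthetic data (5 features mixed from 2 hidden factors) and the choice of 2 components are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 300 rows, 5 features that are noisy mixtures of 2 hidden factors.
factors = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + 0.05 * rng.normal(size=(300, 5))

# Center the columns, then factorize: X_centered = U @ diag(S) @ Vt.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Fraction of total variance captured by each principal component.
explained = S**2 / np.sum(S**2)

# Project onto the first 2 principal components (the first 2 rows of Vt).
n_components = 2
X_reduced = X_centered @ Vt[:n_components].T

print(X_reduced.shape)
print(np.round(explained, 3))
```

Because the data were built from 2 factors, the first 2 components should capture nearly all of the variance, so the 5 columns can be replaced by 2 with little loss.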

Question 6

Q

WHAT ARE SOME EXAMPLES OF MANIFOLD LEARNING TECHNIQUES FOR DIMENSIONALITY REDUCTION? P358

Answer

A

Kohonen Self-Organizing Map (SOM)
Sammon's Mapping
Multidimensional Scaling (MDS)
t-distributed Stochastic Neighbor Embedding (t-SNE)
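
Of the techniques listed, MDS is the simplest to sketch by hand. The code below implements the classical (eigendecomposition-based) variant of MDS in NumPy; other MDS variants work differently, and the function name `classical_mds` and the synthetic data are assumptions for this example.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed points given a matrix D of pairwise Euclidean distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[idx], 0.0, None))
    return eigvecs[:, idx] * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 10))  # 2-D structure in 10-D
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

Y = classical_mds(D, n_components=2)
D_embedded = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
print(np.max(np.abs(D - D_embedded)))  # largest distortion of any pairwise distance
```

The 10-dimensional points here lie on a 2-dimensional plane, so a 2-dimensional embedding can reproduce their pairwise distances almost exactly, which is what makes the result suitable for plotting.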

Question 7

Q

WHAT ARE AUTOENCODERS? P358

Answer

A

An auto-encoder is a kind of unsupervised neural network that is used for dimensionality reduction and feature discovery. More precisely, an auto-encoder is a feedforward neural network that is trained to predict the input itself.

Question 8

Q

WHAT ARE ENCODERS AND DECODERS IN AUTOENCODERS? P358

Answer

A

An auto-encoder is a network model that compresses the data flowing through it into a bottleneck layer with far fewer dimensions than the original input data. The part of the model up to and including the bottleneck is referred to as the encoder, and the part of the model that reads the bottleneck output and reconstructs the input is called the decoder.

Question 9

Q

WHAT HAPPENS AFTER TRAINING AN AUTO-ENCODER?

Answer

A

After training, the decoder is discarded and the output of the bottleneck is used directly as the reduced-dimensionality representation of the input.
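
The encoder, bottleneck, and decoder roles can be sketched with a deliberately simplified auto-encoder: a single linear encoder layer and a single linear decoder layer trained by gradient descent in NumPy. This is an illustration only; practical auto-encoders use non-linear layers and a deep learning library, and the data, layer sizes, learning rate, and step count below are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 rows, 6 features lying close to a 2-D subspace.
Z_true = rng.normal(size=(200, 2))
X = Z_true @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

n_features, n_bottleneck = X.shape[1], 2
W_enc = 0.1 * rng.normal(size=(n_features, n_bottleneck))  # encoder weights
W_dec = 0.1 * rng.normal(size=(n_bottleneck, n_features))  # decoder weights

def encode(X):
    return X @ W_enc          # input -> bottleneck

def decode(Z):
    return Z @ W_dec          # bottleneck -> reconstructed input

initial_loss = np.mean((decode(encode(X)) - X) ** 2)

learning_rate = 0.01
for _ in range(5000):
    Z = encode(X)
    error = decode(Z) - X                        # reconstruction error
    grad_dec = Z.T @ error / len(X)
    grad_enc = X.T @ (error @ W_dec.T) / len(X)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = np.mean((decode(encode(X)) - X) ** 2)

# After training, the decoder is dropped; the encoder output is the reduced data.
X_reduced = encode(X)
print(X_reduced.shape, initial_loss, final_loss)
```

The network is trained to predict its own input, so no labels are needed; only `encode` is kept for later use.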

Question 10

Q

WHAT IS PROJECTION? P358

Answer

A

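In mathematics, a projection is a kind of function or mapping that transforms data in some way. In dimensionality reduction, the mapping typically sends data from a high-dimensional space to a lower-dimensional one.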

Question 11

Q

DEEP AUTO-ENCODERS ARE AN EFFECTIVE FRAMEWORK FOR ____ DIMENSIONALITY REDUCTION. P358

Answer

A

Non-linear

Question 12

Q

WHEN USING DEEP AUTO-ENCODERS, WHICH LAYER DO WE USE AS THE REDUCED INPUT FOR THE PROBLEM? P358

Answer

A

The top-most layer of the encoder (the bottleneck)

Question 13

Q

WHY IS IT CHALLENGING TO INTERPRET THE OUTPUT OF THE BOTTLENECK? P358

Answer

A

The output of the encoder is a type of projection and like other projection methods, there is no direct relationship from the bottleneck output back to the original input variables, making them challenging to interpret.
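
A small sketch may help show why this is so. It assumes a hypothetical linear encoder with random weights and made-up feature names, and prints each bottleneck output as a formula over the original inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_names = ["age", "income", "height", "weight"]   # hypothetical names

# Hypothetical linear encoder: 4 inputs -> 2 bottleneck outputs.
W = rng.normal(size=(4, 2))

lines = []
for j in range(W.shape[1]):
    terms = " ".join(f"{W[i, j]:+.2f}*{name}"
                     for i, name in enumerate(feature_names))
    lines.append(f"z{j} = {terms}")
    print(lines[-1])
```

Every bottleneck output mixes all of the input variables, so no output corresponds to a single original feature. With non-linear layers the relationship is harder still to read off.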

Question 14

Q

WHICH DIMENSIONALITY REDUCTION METHODS ASSUME THE SAME SCALE OR DISTRIBUTION FOR ALL INPUT FEATURES, AND WHAT SHOULD WE DO BEFORE USING THEM? P359

Answer

A

Linear algebra methods and manifold learning methods (manifold learning is an approach to non-linear dimensionality reduction). It is good practice to either normalize or standardize the data before using these methods.
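
The effect of skipping this step can be sketched with PCA on two synthetic, independent features measured on very different scales. The data and helper names are assumptions for the example, and standardization is used here; min-max normalization would be the alternative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent features on very different scales.
X = np.column_stack([
    rng.normal(scale=1.0, size=500),
    rng.normal(scale=1000.0, size=500),
])

def first_component(X):
    """Unit-length direction of largest variance (first principal component)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

def standardize(X):
    """Rescale each column to zero mean and unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

pc_raw = first_component(X)
pc_std = first_component(standardize(X))
print(np.abs(pc_raw))   # dominated by the large-scale feature
print(np.abs(pc_std))   # both features carry equal weight
```

Without scaling, the first component points almost entirely along the feature with the largest units, regardless of how informative that feature is.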