Dimensionality Reduction Flashcards

1
Q

Why would you want to use dimensionality reduction techniques to transform your data before training?

A

Dimensionality reduction can allow you to:

  • remove collinearity from the feature space
  • speed up training by reducing the number of features
  • reduce memory usage by reducing the number of features
  • identify underlying, latent features that affect multiple features in the original space (see the sketch below)
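
A minimal scikit-learn sketch of the idea; the synthetic data, the injected collinear column, and the choice of 10 components are all illustrative assumptions:

```python
# Sketch: shrink a correlated 50-feature space to 10 components before training.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))                    # 1,000 samples, 50 features
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=1000)   # inject collinearity

pca = PCA(n_components=10)                         # keep 10 orthogonal components
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)              # (1000, 50) -> (1000, 10)
```

Training on X_reduced instead of X uses far fewer columns, which is where the speed and memory savings come from.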
2
Q

Why would you want to avoid dimensionality reduction techniques to transform your data before training?

A

Dimensionality reduction can:

  • Add unnecessary computation
  • Make the model difficult to interpret, since the latent features are not easy to understand
  • Add complexity to the model pipeline
  • Reduce the predictive power of the model if too much signal is lost
3
Q

Name four popular dimensionality reduction algorithms and briefly describe them.

A
  1. PCA: uses an eigendecomposition to transform the original feature data into linearly independent eigenvectors. The most important vectors (those with the highest eigenvalues) are then selected to represent the features in the transformed space.
  2. Non-negative matrix factorization (NMF): can be used to reduce dimensionality for certain problem types while preserving more information than PCA.
  3. Embedding techniques: various embedding techniques, e.g. finding local neighbors as done in Locally Linear Embedding, can be used to reduce dimensionality.
  4. Clustering or centroid techniques: each value can be described as a member of a cluster, or as a linear combination of cluster centroids.

By far the most popular is PCA and similar eigendecomposition-based variations; a brief sketch of all four follows.
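
A hedged scikit-learn sketch of one possible way to apply each of the four families; the data shape and hyperparameters are illustrative assumptions, not recommendations:

```python
# Sketch: four dimensionality reduction families, each reducing 20 features to 5.
import numpy as np
from sklearn.decomposition import PCA, NMF
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import KMeans

# Non-negative data so that NMF is applicable.
X = np.abs(np.random.default_rng(0).normal(size=(200, 20)))

X_pca = PCA(n_components=5).fit_transform(X)                     # eigendecomposition
X_nmf = NMF(n_components=5, init="nndsvda", max_iter=500,
            random_state=0).fit_transform(X)                     # matrix factorization
X_lle = LocallyLinearEmbedding(n_neighbors=10,
                               n_components=5).fit_transform(X)  # local-neighbor embedding
X_km = KMeans(n_clusters=5, n_init=10,
              random_state=0).fit_transform(X)                   # distances to 5 centroids
```

Note that KMeans.fit_transform represents each row by its distances to the cluster centroids, which is one way to use a clustering model for dimensionality reduction.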

4
Q

After doing dimensionality reduction, can you transform the data back to the original feature space? If so, how?

A

Yes and no.

Most dimensionality reduction methods have inverse transformations, but signal is often lost when reducing dimensions, so the inverse transformation is usually only an approximation of the original data.
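
For example, with scikit-learn's PCA the round trip looks like this (a sketch on synthetic data; the shapes and component count are arbitrary assumptions):

```python
# Sketch: inverse_transform returns the original shape, but only approximately.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(500, 30))

pca = PCA(n_components=10).fit(X)
X_reduced = pca.transform(X)                    # (500, 10)
X_restored = pca.inverse_transform(X_reduced)   # (500, 30) again, but lossy

reconstruction_error = np.mean((X - X_restored) ** 2)
print(X_restored.shape, reconstruction_error)   # same shape, non-zero error
```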

5
Q

How do you select the number of principal components needed for PCA?

A

Selecting the number of latent features to retain is typically done by inspecting the eigenvalue of each eigenvector (where each eigenvalue is proportional to the percentage of variance explained). As the eigenvalues decrease, the impact of the corresponding latent feature on the target variable also decreases.

This means that principal components with small eigenvalues explain little variance and can be removed with little impact on the model.

There are various rules of thumb, but one general rule is to include the most significant principal components that account for at least 95% of the variation of the features.
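
A short sketch of both approaches with scikit-learn's PCA; the data and the 95% threshold here are purely illustrative:

```python
# Sketch: pick the number of principal components that explains >= 95% of variance.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(1000, 40))

# Option 1: inspect the cumulative explained-variance ratio directly.
pca_full = PCA().fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
k = int(np.searchsorted(cumulative, 0.95)) + 1   # smallest k reaching 95%

# Option 2: pass a float to n_components and let PCA choose the count.
pca_95 = PCA(n_components=0.95).fit(X)
print(k, pca_95.n_components_)                   # typically the same count
```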
