Dimensionality Reduction (Unsupervised Learning) Flashcards

1
Q

Dimension

A

A dimension is an attribute of an instance.

2
Q

Other names for Dimension

A

Variable, Feature, or Attribute

3
Q

Dimension Set

A

Set of attributes used to describe an instance

4
Q

Other names for Dimension Set

A

Feature set or Attribute set

5
Q

Feature Importance in Dimensionality

A

When an instance is described with a dimension set, not all of the dimensions are equally important for describing the instance.

6
Q

What happens if we use all possible attributes to describe an instance in machine learning?

A

Using all possible attributes can lead to overfitting.

7
Q

When does a model overfit (in terms of attributes)?

A

A model overfits when we use all possible attributes to describe an instance in machine learning.

8
Q

What happens to computation time and efficiency when dimensions increase in machine learning?

A

When dimensions increase, computation time increases and efficiency decreases.

9
Q

What challenge arises when dealing with higher dimensions in data analysis?

A

When there are many dimensions, it is very difficult to plot and analyse the data.

10
Q

How is data represented in different dimensions, and what challenge arises with higher dimensions?

A
  • 1-D data is represented with a dot (.)
  • 2-D data is represented with an x-axis and a y-axis
  • 3-D data is represented with an x-axis, y-axis, and z-axis
  • Data with more than 3 dimensions is very difficult to plot.
    To address this, dimensionality reduction techniques are required to effectively train models in high-dimensional spaces (see the plotting sketch below).
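For illustration, a minimal Python sketch of these three plots (numpy, a recent matplotlib, and the random data are assumptions, not part of the original card):

```python
# A minimal sketch (assumes numpy and a recent matplotlib) of how data
# is plotted in 1-D, 2-D, and 3-D; 4-D and above has no direct plot.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x, y, z = rng.random(20), rng.random(20), rng.random(20)

fig = plt.figure(figsize=(9, 3))

ax1 = fig.add_subplot(1, 3, 1)               # 1-D: dots on a single axis
ax1.scatter(x, np.zeros_like(x), marker='.')
ax1.set_title('1-D')

ax2 = fig.add_subplot(1, 3, 2)               # 2-D: x-axis and y-axis
ax2.scatter(x, y)
ax2.set_title('2-D')

ax3 = fig.add_subplot(1, 3, 3, projection='3d')  # 3-D: x-, y-, z-axis
ax3.scatter(x, y, z)
ax3.set_title('3-D')

plt.show()
```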
11
Q

What is Dimensionality Reduction?

A

Dimensionality Reduction is the process of reducing the dimensions of the input dataset by selecting a smaller set of principal attributes.
– Here, the consideration criterion is minimum attribute selection.
– Here, a principal attribute is an attribute that contains minimum error.

12
Q

Advantages of Dimensionality Reduction

A
  • Decreases the complexity of an algorithm.
  • Saves time by eliminating less important input variables.
  • Allows simpler models to train the machine.
  • Makes it easier to plot and analyze data.
  • Simplifies knowledge extraction (i.e., identifying prediction patterns).
  • Reduces computation time.
  • Increases the efficiency of machine predictions.
13
Q

Dimensionality Reduction methods

A

Two methods
1.) Feature selection
2.) Feature extraction

14
Q

What happens in feature selection

A

‘k’ dimensions are selected from the given d-dimensional set, meaning (d − k) dimensions are discarded.
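A minimal sketch of this, assuming scikit-learn (SelectKBest and the Iris data are illustrative choices, not prescribed by the card): d = 4 input dimensions, k = 2 kept, d − k = 2 discarded.

```python
# A minimal feature-selection sketch, assuming scikit-learn: keep k of the
# d input dimensions and discard the remaining (d - k).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)                 # d = 4 dimensions
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)          # k = 2 kept, d - k = 2 discarded

print(X.shape, '->', X_reduced.shape)             # (150, 4) -> (150, 2)
print('kept dimensions:', selector.get_support(indices=True))
```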

15
Q

Compare the efficiency of feature selection and feature extraction

A

Feature selection is less efficient than feature extraction

16
Q

Why is feature selection less efficient than feature extraction?

A

Feature selection doesn’t cover all the dimensions given in the input set

17
Q

Which method is used in feature selection?

A

The subset selection method is used in feature selection.

18
Q

Types of subset selection algorithms

A

1.) Forward Selection
2.) Backward Selection

19
Q

Forward Selection

A

It starts with an empty subset (no variables), adds variables one by one, and at each step keeps the variable whose addition decreases the error the most, stopping when the error cannot be decreased further by adding another variable.
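A minimal sketch of this greedy loop (the linear model, the diabetes dataset, and cross-validated MSE as the error measure are illustrative assumptions, not part of the original card):

```python
# A minimal sketch of forward selection: start with an empty subset, add
# variables one by one, and stop when no addition lowers the error.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

def cv_error(features):
    # negated so that lower values mean lower mean squared error
    return -cross_val_score(LinearRegression(), X[:, features], y,
                            scoring='neg_mean_squared_error', cv=5).mean()

selected, best_err = [], float('inf')
while True:
    remaining = [f for f in range(X.shape[1]) if f not in selected]
    if not remaining:
        break
    errors = {f: cv_error(selected + [f]) for f in remaining}
    best_f = min(errors, key=errors.get)
    if errors[best_f] >= best_err:
        break  # no variable decreases the error further
    selected.append(best_f)
    best_err = errors[best_f]

print('selected dimensions:', selected)
```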

20
Q

Backward Selection

A

This selection starts with the complete (full) set of variables. Variables are then removed one by one, at each step removing the variable whose deletion decreases the error the most, until the error cannot be decreased further by removing any other variable.
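Both directions are also available off the shelf; a minimal sketch assuming scikit-learn >= 0.24 (the model and dataset are again illustrative):

```python
# A minimal sketch of backward (and forward) subset selection using
# scikit-learn's SequentialFeatureSelector.
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

for direction in ('backward', 'forward'):
    sfs = SequentialFeatureSelector(LinearRegression(),
                                    n_features_to_select=5,
                                    direction=direction, cv=5)
    sfs.fit(X, y)
    print(direction, '-> kept dimensions:', sfs.get_support(indices=True))
```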

21
Q

Advantages of forward selection (3)

A

1.) Faster for large datasets (since it starts with an empty set of features).
2.) Gives a good understanding of the incremental benefit of adding each feature.
3.) Less computationally intensive compared with backward selection.

22
Q

What is a key advantage of forward selection when dealing with large datasets and why?

A

It is faster for large datasets since it starts with an empty set of features.

23
Q

How does forward selection help in understanding model performance?

A

It provides a good understanding of the incremental benefits after adding each feature.

24
Q

Which is less computationally intensive (Forward selection or backward selection) and why?

A

Forward selection starts with an empty set and adds features gradually, making it less computationally intensive than backward selection, which starts with all features and removes them.

25
Q

Disadvantages of forward selection (2)

A

1.) It is computationally expensive when the dataset is very large.
2.) It may miss interactions between features that are more relevant.

26
Q

What is a major disadvantage of forward selection with very large datasets?

A

It can be computationally expensive when the dataset is very large.

27
Q

What is a potential issue with feature interactions in forward selection?

A

Forward selection may miss out on interactions between features that are more relevant.

28
Q

Advantages of backward selection (2)

A

1.) We can identify and remove irrelevant attributes at an early stage.
2.) We can consider the interactions between features, since it starts from the full set.

29
Q

What is an advantage of backward selection in terms of irrelevant attributes?

A

It allows us to identify and remove irrelevant attributes in the early stages of the model-building process.

30
Q

How does backward selection address feature interactions?

A

It considers the interactions between features since it starts from a full set of attributes.

31
Q

Disadvantages of backward selection (2)

A

1.) It is generally slower for large datasets because they contain more features.
2.) It is more computationally intensive when the dataset is large.

32
Q

What is a key disadvantage of backward selection when dealing with large datasets?

A

It is generally slower for large datasets because there are more features to evaluate.

33
Q

Why is backward selection considered computationally intensive for large datasets?

A

It requires high CPU usage, large memory consumption, extended execution time, high storage requirements, and may involve parallel processing.

34
Q

Feature extraction

A
  • In this method, ‘k’ new dimensions are derived from combinations of the ‘d’ dimensions given in the input dataset.
  • In this method, the value of k is necessarily less than d, meaning there are fewer new dimensions than original dimensions.
  • In this method, all the input dimensions contribute to the new dimensions. Therefore, it is more efficient than feature selection.
  • The PCA (Principal Component Analysis) technique is used in feature extraction (see the sketch below).
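A minimal PCA sketch, assuming scikit-learn (the Iris data is an illustrative choice): k = 2 new dimensions are derived as weighted combinations of all d = 4 inputs.

```python
# A minimal feature-extraction sketch with PCA, assuming scikit-learn:
# k = 2 new dimensions are derived from all d = 4 input dimensions.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)        # d = 4 original dimensions
pca = PCA(n_components=2)                # k = 2 < d
X_new = pca.fit_transform(X)

print(X.shape, '->', X_new.shape)        # (150, 4) -> (150, 2)
# each new dimension is a weighted combination of every original dimension
print(pca.components_)
```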
35
Q

What happens to the dimensions in feature extraction?

A

‘k’ new dimensions are derived by combining the ‘d’ dimensions given in the input dataset.

36
Q

In feature extraction, how does ‘k’ compare to ‘d’?

A

The value of ‘k’ is always less than ‘d’, meaning the new dimensions (k) are fewer than the original dimensions (d).

37
Q

Why is feature extraction more efficient than feature selection?

A

Because all input dimensions are covered in the new dimensions, making it more efficient.

38
Q

Which technique is commonly used in feature extraction?

A

Principal Component Analysis (PCA) is used in feature extraction.