Dimensionality Reduction (Unsupervised Learning) Flashcards
Dimension
A dimension is an attribute of an instance.
Another name for Dimension
Variable or Feature or Attribute
Dimension Set
Set of attributes used to describe an instance
Another name for Dimension Set
Feature set or Attribute set
Feature Importance in Dimensionality
When an instance is described with a dimension set, not all of the dimensions are important for describing the instance.
What happens if we use all possible attributes to describe an instance in machine learning?
Using all possible attributes can lead to overfitting.
When does a model overfit (in terms of attributes)?
A model can overfit when all possible attributes are used to describe an instance in machine learning.
What happens to computation time and efficiency when dimensions increase in machine learning?
When dimensions increase, computation time increases and efficiency decreases.
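One way to see the cost is to count the work in a distance-based method. The sketch below is illustrative only: `euclidean` and `pairwise_term_count` are hypothetical helper names, and the 1000-instance dataset size is an assumption. Each distance needs one subtract-and-square term per dimension, so the total work for all pairwise distances grows in proportion to the number of dimensions.

```python
import math

def euclidean(p, q):
    """Distance between two instances: one subtract-and-square term per dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pairwise_term_count(n_instances, n_dims):
    """Number of per-dimension terms needed to compute all pairwise distances."""
    n_pairs = n_instances * (n_instances - 1) // 2
    return n_pairs * n_dims

# Assumed dataset size of 1000 instances; only the dimension count varies.
for d in (2, 20, 200):
    print(d, pairwise_term_count(1000, d))
```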
What challenge arises when dealing with higher dimensions in data analysis?
When there are many dimensions, it is very difficult to plot and analyse the data.
How is data represented in different dimensions, and what challenge arises with higher dimensions?
- 1-D data is represented with a dot (.) on a single axis
- 2-D data is represented with x-axis and y-axis
- 3-D data is represented with x-axis, y-axis, and z-axis
- Data with more than 3 dimensions is very difficult to plot.
To address this, dimensionality reduction techniques are required to effectively train models in high-dimensional spaces.
What is Dimensionality Reduction?
Dimensionality Reduction is the process of reducing the number of dimensions of the input dataset under consideration by selecting a smaller set of principal attributes.
- Here, the selection criterion is to keep the minimum number of attributes.
- Here, a principal attribute is an attribute that introduces minimum error.
Advantages of Dimensionality Reduction
- Decreases the complexity of an algorithm.
- Saves time by eliminating less important input variables.
- Allows simpler models to be trained.
- Makes it easier to plot and analyze data.
- Simplifies knowledge extraction (i.e., identifying prediction patterns).
- Reduces computation time.
- Increases the efficiency of machine predictions.
Dimensionality Reduction methods
Two methods
1.) Feature selection
2.) Feature extraction
What happens in feature selection?
k dimensions are selected from the given d-dimensional set, meaning the remaining (d - k) dimensions are discarded.
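A minimal sketch of feature selection follows. The cards do not say how the k dimensions are chosen, so ranking columns by variance is an assumption made here for illustration, and `select_top_k_by_variance` is a hypothetical helper name. The point is only that k of the original d columns are kept unchanged and the other (d - k) are dropped.

```python
from statistics import pvariance

def select_top_k_by_variance(rows, k):
    """Keep the k columns with the largest variance; discard the other d - k."""
    d = len(rows[0])
    columns = list(zip(*rows))
    variances = [pvariance(col) for col in columns]
    # Rank columns by variance (assumed criterion), then keep the
    # chosen indices in their original column order.
    ranked = sorted(range(d), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced

# Toy data with d = 3; the middle column is constant and carries no information.
rows = [
    [1.0, 5.0, 0.10],
    [2.0, 5.0, 0.40],
    [3.0, 5.0, 0.20],
    [4.0, 5.0, 0.30],
]
keep, reduced = select_top_k_by_variance(rows, k=2)
print(keep)     # indices of the retained columns
print(reduced)  # the same instances described with k = 2 dimensions
```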
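Compare the efficiency of feature selection and feature extraction.
Feature selection is generally regarded as less efficient than feature extraction: selection discards (d - k) of the original dimensions outright, whereas extraction derives k new dimensions from combinations of the original d dimensions.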
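Feature extraction is not defined in the cards above. In its usual sense, it builds k new dimensions from combinations of the original d instead of picking k of the originals. A minimal sketch, assuming 2-D input and a PCA-style projection onto the direction of greatest variance (`extract_one_feature` is a hypothetical helper, not a library function):

```python
import math

def extract_one_feature(points):
    """Project 2-D points onto their direction of greatest variance (d = 2, k = 1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)

    # Closed-form angle of the principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    direction = (math.cos(theta), math.sin(theta))

    # The new feature is a linear combination of both original dimensions.
    feature = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, feature

# Toy data lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, feature = extract_one_feature(points)
print(direction)
print(feature)
```

Because the toy points lie on a single line, the one extracted feature preserves their spacing exactly, whereas keeping only x or only y would drop the other coordinate entirely.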