Module 11 - Decision Tree Flashcards
it is a tree-shaped diagram used to determine a course of action.
Decision Tree
2 Uses of a Decision Tree:
- Regression Analysis
- Classification Analysis
is a statistical method that allows you to examine the relationship between two or more variables of interest.
Regression Analysis
is a data analysis task that identifies and assigns categories to a collection of data to allow for more accurate analysis.
Classification Analysis
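To make the two uses concrete, here is a minimal sketch of how a leaf's prediction is commonly formed in each case: the majority class for classification, and the mean of the target values for regression. The function names and sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import mean

def classification_leaf(labels):
    """Return the most common class label among the samples in a leaf."""
    return Counter(labels).most_common(1)[0][0]

def regression_leaf(values):
    """Return the mean target value among the samples in a leaf."""
    return mean(values)

# Hypothetical samples that ended up in the same leaf.
print(classification_leaf(["spam", "spam", "ham"]))
print(regression_leaf([10.0, 12.0, 14.0]))
```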
3 Advantages of Decision Tree:
- Simple to understand, interpret, and visualize.
- Can handle both numerical and categorical data
- Non-linear relationships between parameters don’t affect its performance.
3 Disadvantages of Decision Tree:
- Overfitting
- High Variance
- Low Biased Tree
this occurs when the algorithm captures noise in the data.
Overfitting
the model can become unstable due to small variations in the data.
High Variance
a highly complicated decision tree tends to have low bias on its training data, which makes it difficult for the model to work with new data.
Low Biased Tree
4 Definitions of Terms:
- Entropy
- Information Gain
- Leaf Node
- Root Node
is a measure of the randomness or unpredictability in the dataset.
Entropy
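As a worked illustration, entropy is usually computed with the Shannon formula, H = -Σ p_i log2(p_i), where p_i is the share of samples in class i. The sketch below assumes that formula and uses a hypothetical helper name, `entropy`; the labels are made-up sample data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # p * log2(1 / p) is the same as -p * log2(p), summed over the classes.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

mixed = ["yes", "yes", "no", "no"]    # evenly split: most unpredictable
pure = ["yes", "yes", "yes", "yes"]   # one class only: fully predictable

print(entropy(mixed))  # 1.0
print(entropy(pure))   # 0.0
```

An evenly split two-class set gives the highest entropy (1 bit), while a set with a single class gives 0.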
is a measure of the decrease in entropy after the dataset is split.
Information Gain
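A minimal sketch of that calculation, assuming Shannon entropy and weighting each subset by its share of the samples: gain = entropy(parent) - Σ (|child| / |parent|) × entropy(child). The helper names `entropy` and `information_gain` and the sample splits are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each subset is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # subsets as mixed as the parent

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each subset as mixed as the parent gains nothing.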
it carries the classification or the decision.
Leaf Node
is the topmost decision node in the decision tree.
Root Node
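The root and leaf nodes can be pictured with a small hand-built tree. This is only an illustrative sketch: the dictionary layout, the feature names (`outlook`, `windy`), and the `predict` helper are hypothetical, not taken from any library.

```python
# Hypothetical tree: decision nodes test a feature, leaf nodes carry the decision.
tree = {
    "feature": "outlook",  # root node: the topmost decision node
    "branches": {
        "sunny": {
            "feature": "windy",  # an inner decision node
            "branches": {
                True: {"leaf": "stay in"},
                False: {"leaf": "play"},
            },
        },
        "rainy": {"leaf": "stay in"},  # leaf node
    },
}

def predict(node, sample):
    """Walk from the root down to a leaf and return the leaf's decision."""
    while "leaf" not in node:
        value = sample[node["feature"]]
        node = node["branches"][value]
    return node["leaf"]

print(predict(tree, {"outlook": "sunny", "windy": False}))  # play
print(predict(tree, {"outlook": "rainy", "windy": True}))   # stay in
```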