Lecture 9 Flashcards
What are decision trees used for?
Classifying data by recursively splitting it based on feature values.
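A minimal sketch in Python (assuming scikit-learn is installed; the tiny feature matrix and labels below are made up purely for illustration):

# Fit a decision tree on a toy dataset and classify a new example.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # made-up feature vectors
y = [0, 1, 1, 0]                       # made-up class labels
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[0, 1]]))           # -> [1]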
What is an internal node in a decision tree?
A decision point that tests a feature to split data.
What is a leaf node in a decision tree?
A terminal node that assigns a class label.
Why are decision trees prone to overfitting?
They can learn noise and irrelevant patterns in training data.
What is the primary advantage of decision trees?
They are easy to interpret and visualize.
What is a common method to regularize decision trees?
Limiting tree depth or pruning unnecessary branches.
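A sketch of both options with scikit-learn on synthetic data; max_depth caps the depth and ccp_alpha applies cost-complexity pruning (the values chosen here are arbitrary):

# Cap tree depth and prune branches via cost-complexity pruning (ccp_alpha).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)  # synthetic toy data
tree = DecisionTreeClassifier(max_depth=3, ccp_alpha=0.01).fit(X, y)
print(tree.get_depth())  # stays at or below the depth cap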
What is the standard algorithm for training decision trees?
The ID3 (Iterative Dichotomiser 3) algorithm.
What is the greedy strategy in decision tree learning?
Choosing the best split at each step without backtracking.
How do decision trees handle missing values?
They can assign the most common value or split on other features.
What is entropy in decision trees?
A measure of uncertainty (impurity) in the class labels of a dataset.
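A worked sketch on made-up label lists: for class proportions p_i, entropy is H(S) = -sum_i p_i log2(p_i).

# Entropy H(S) = -sum_i p_i * log2(p_i), computed on made-up label lists.
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((labels.count(c) / total) * log2(labels.count(c) / total)
                for c in set(labels))

print(entropy([1, 1, 0, 0]))  # 1.0 bit: a 50/50 split is maximally uncertain
print(entropy([1, 1, 1, 1]))  # 0.0 bits: a pure set has no uncertainty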
What is information gain?
The reduction in entropy after splitting on a feature.
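A worked sketch on a made-up split: the gain is the parent's entropy minus the size-weighted entropy of the children.

# Information gain = H(parent) - weighted average entropy of the children.
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((labels.count(c) / total) * log2(labels.count(c) / total)
                for c in set(labels))

def information_gain(parent, children):
    total = len(parent)
    return entropy(parent) - sum(len(c) / total * entropy(c) for c in children)

parent = [1, 1, 1, 0, 0, 0]
children = [[1, 1, 1], [0, 0, 0]]           # a perfect split on some feature
print(information_gain(parent, children))   # 1.0: entropy drops from 1 to 0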
What is the Gini index?
A measure of node impurity used to evaluate splits; lower values mean purer nodes.
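A worked sketch on made-up labels: Gini impurity is 1 - sum_i p_i^2, which is 0 for a pure node.

# Gini impurity = 1 - sum_i p_i^2 (0 for a pure node), on made-up labels.
def gini(labels):
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

print(gini([1, 1, 0, 0]))  # 0.5: the maximum for two classes
print(gini([1, 1, 1, 1]))  # 0.0: pure node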
What is overfitting in decision trees?
When a tree is too complex and memorizes training data instead of generalizing.
What is pruning in decision trees?
Removing branches that do not improve generalization.
What is an ensemble model?
A combination of multiple models to improve performance.
What is bagging in ensemble learning?
Training multiple models on different bootstrap samples of the data and averaging (or voting on) their predictions.
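A minimal bagging sketch with scikit-learn on synthetic data; BaggingClassifier's default base model is a decision tree:

# Bagging sketch: each tree is trained on a bootstrap sample of the data;
# predictions are combined by voting.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, random_state=0)  # synthetic toy data
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
print(bag.predict(X[:3]))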
What is the key idea behind random forests?
Building multiple decision trees with random feature selection to improve generalization.
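A random forest sketch with scikit-learn on synthetic data; max_features="sqrt" makes each split consider only a random subset of the features:

# Random forest sketch: many trees, each split drawn from a random feature subset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=0).fit(X, y)
print(rf.score(X, y))   # training accuracy of the averaged ensemble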
What is boosting in ensemble learning?
A method that trains models sequentially, giving more weight to misclassified instances.
What is a weak learner in boosting?
A model that performs slightly better than random chance.
What is AdaBoost?
A boosting algorithm that combines weak learners to create a strong classifier.
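An AdaBoost sketch with scikit-learn on synthetic data; the default base learner is a depth-1 tree, i.e. a decision stump:

# AdaBoost sketch: each round reweights the samples the current ensemble
# misclassifies and fits the next weak learner to the reweighted data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, random_state=0)
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
print(ada.score(X, y))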
What is gradient boosting?
A boosting method that optimizes a loss function by adding models sequentially.
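A gradient boosting sketch with scikit-learn on synthetic data:

# Gradient boosting sketch: each new tree is fit to the gradient of the loss
# with respect to the current ensemble's predictions, then added with a small
# learning rate.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, random_state=0)
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                random_state=0).fit(X, y)
print(gb.score(X, y))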
How does boosting differ from bagging?
Boosting trains models sequentially, while bagging trains models independently.
What is the role of decision stumps in boosting?
One-level trees that split on a single feature; they serve as simple weak learners in boosting algorithms.
Why are ensemble models often better than single models?
They reduce variance and improve generalization.
What is out-of-bag error in bagging?
The error estimated for each model on the bootstrap samples it was not trained on.
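A sketch with scikit-learn on synthetic data; with oob_score=True, each sample is scored only by the trees whose bootstrap sample did not contain it:

# Out-of-bag error sketch using a random forest (a bagging ensemble of trees).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=0).fit(X, y)
print(1.0 - rf.oob_score_)   # out-of-bag error estimate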
How do random forests differ from standard decision trees?
They train many trees, each on a bootstrap sample with a random subset of features considered at each split, and aggregate the trees' predictions.
What is the main disadvantage of ensemble methods?
They are less interpretable compared to single decision trees.
What is feature importance in decision trees?
A measure of how much a feature contributes to making decisions.
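A sketch with scikit-learn on synthetic data, reading the impurity-based importances from a fitted tree:

# Feature importance sketch: one non-negative value per feature, summing to 1.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.feature_importances_)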
What is the takeaway from decision trees and ensemble learning?
Combining multiple decision trees using ensemble methods improves model robustness and accuracy.