Chapter 6 Decision Trees Flashcards
What are Decision Trees used for in machine learning?
Classification, regression, and multioutput tasks.
What algorithm is used to train Decision Trees?
The CART (Classification and Regression Tree) algorithm.
How do Decision Trees work for classification?
By traversing the tree from root to leaf, checking conditions at each node to reach a classification.
What is Gini impurity?
A measure of node impurity; it’s 0 if all training instances belong to the same class.
How can Decision Trees estimate class probabilities?
By checking the class distribution in the leaf node where the instance ends up.
What does CART try to do when training?
Repeatedly split data to create the purest subsets, weighted by size.
What’s the training time complexity of Decision Trees?
O(n × m log(m)), where n is the number of features and m is the number of samples.
What are the two impurity measures in Decision Trees?
Gini impurity (default) and entropy.
When is entropy used in decision trees?
As an alternative impurity measure using ‘criterion=entropy’.
Which is faster to compute: Gini or Entropy?
Gini is faster.
What is a “white box” model in ML?
A model that is easy to interpret, like a Decision Tree.
What hyperparameters help regularize a Decision Tree?
max_depth, min_samples_split, min_samples_leaf, max_leaf_nodes, max_features.
How do you prevent overfitting in a Decision Tree?
By increasing min_* and decreasing max_* parameters.
What’s the difference in training a regression tree vs classification tree?
Regression uses MSE instead of impurity to split nodes.
What are two main drawbacks of Decision Trees?
Sensitivity to training data changes and axis-aligned (orthogonal) splits.
Why are Decision Trees unstable?
Small changes in data can lead to a completely different model.
How can instability in Decision Trees be fixed?
Use Random Forests (average predictions from many trees).
What happens when you rotate the training data?
The model may fail to generalize because splits are orthogonal (not rotationally invariant).