Chapter 6 Decision Trees Flashcards

1
Q

What are Decision Trees used for in machine learning?

A

Classification, regression, and multioutput tasks.

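A minimal sketch of the card above, assuming scikit-learn and its bundled iris dataset (the specific dataset and hyperparameter values are illustrative):

```python
# Minimal sketch: fitting a Decision Tree classifier (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# A shallow tree; max_depth=2 keeps the example easy to inspect.
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

pred = tree_clf.predict([[5.1, 3.5, 1.4, 0.2]])  # one setosa-like flower
```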
2
Q

What algorithm is used to train Decision Trees?

A

The CART (Classification and Regression Tree) algorithm.

3
Q

How do Decision Trees work for classification?

A

By traversing the tree from root to leaf, checking conditions at each node to reach a classification.

4
Q

What is Gini impurity?

A

A measure of node impurity; it’s 0 if all training instances belong to the same class.

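The definition above can be written out directly; a small sketch of the Gini impurity formula:

```python
# Sketch of Gini impurity: G = 1 - sum_k p_k**2, where p_k is the fraction
# of training instances of class k in the node.
def gini(class_counts):
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

pure = gini([50, 0, 0])    # all one class -> 0.0
mixed = gini([0, 49, 5])   # mostly one class -> roughly 0.168
```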
5
Q

How can Decision Trees estimate class probabilities?

A

By checking the class distribution in the leaf node where the instance ends up.

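A hedged sketch of this, assuming scikit-learn: predict_proba returns the class ratios of the leaf node the instance falls into.

```python
# Sketch (scikit-learn assumed): class probabilities are the class ratios
# of the leaf node that the instance ends up in.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

# Probabilities for one flower; each row sums to 1 across the three classes.
proba = clf.predict_proba([[6.0, 2.8, 5.0, 1.5]])
```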
6
Q

What does CART try to do when training?

A

It repeatedly splits the training set on the feature and threshold that produce the purest subsets, weighted by their size.

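The size-weighted cost above can be sketched directly (a toy calculation, not scikit-learn's internals):

```python
# Sketch of the CART cost for one candidate split: the impurity of the two
# child subsets, weighted by their relative sizes:
#   J = (m_left / m) * G_left + (m_right / m) * G_right
def gini(class_counts):
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def cart_cost(left_counts, right_counts):
    m_left, m_right = sum(left_counts), sum(right_counts)
    m = m_left + m_right
    return (m_left / m) * gini(left_counts) + (m_right / m) * gini(right_counts)

cost = cart_cost([50, 0], [5, 45])  # one pure child keeps the cost low
```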
7
Q

What’s the training time complexity of Decision Trees?

A

O(n × m log₂(m)), where n is the number of features and m is the number of training instances.

8
Q

What are the two impurity measures in Decision Trees?

A

Gini impurity (default) and entropy.

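Entropy, the second measure named above, can also be sketched as a formula:

```python
# Sketch of entropy as an impurity measure: H = -sum_k p_k * log2(p_k),
# summed over the classes actually present in the node.
import math

def entropy(class_counts):
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts if c > 0)

pure = entropy([50, 0, 0])   # pure node -> 0.0
mixed = entropy([0, 49, 5])  # mixed node -> roughly 0.445
```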
9
Q

When is entropy used in decision trees?

A

As an alternative impurity measure, selected by setting the hyperparameter criterion="entropy".

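A short sketch of switching the criterion, assuming scikit-learn:

```python
# Sketch (scikit-learn assumed): selecting entropy via the criterion
# hyperparameter instead of the default Gini impurity.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
entropy_tree = DecisionTreeClassifier(criterion="entropy", max_depth=2,
                                      random_state=42).fit(X, y)
accuracy = entropy_tree.score(X, y)  # training accuracy
```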
10
Q

Which is faster to compute: Gini or Entropy?

A

Gini is faster.

11
Q

What is a “white box” model in ML?

A

A model that is easy to interpret, like a Decision Tree.

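One way to see the "white box" quality in practice (assuming scikit-learn's export_text helper) is to print the learned rules:

```python
# Sketch: a tree's learned rules can be printed as plain if/else text,
# which is why it counts as a "white box" model (export_text assumed).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=42)
clf.fit(iris.data, iris.target)

rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)  # human-readable threshold rules, one line per node
```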
12
Q

What hyperparameters help regularize a Decision Tree?

A

max_depth, min_samples_split, min_samples_leaf, max_leaf_nodes, max_features.

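A sketch of two of these hyperparameters in action; the dataset and values are illustrative, not tuned:

```python
# Sketch: an unrestricted tree can memorize noisy data; max_depth and
# min_samples_leaf rein it in (scikit-learn assumed, values illustrative).
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=42)

unrestricted = DecisionTreeClassifier(random_state=42).fit(X, y)
regularized = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5,
                                     random_state=42).fit(X, y)

depths = (unrestricted.get_depth(), regularized.get_depth())
```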
13
Q

How do you prevent overfitting in a Decision Tree?

A

By increasing min_* and decreasing max_* parameters.

14
Q

What’s the difference in training a regression tree vs classification tree?

A

Regression splits nodes to minimize MSE instead of impurity, and each leaf predicts a value (the mean target of its instances) rather than a class.

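A regression-tree sketch, assuming scikit-learn's DecisionTreeRegressor on made-up quadratic data:

```python
# Sketch (scikit-learn assumed): a regression tree splits to reduce MSE and
# predicts the mean target value of the leaf an instance lands in.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5
y = X[:, 0] ** 2 + 0.025 * rng.randn(200)  # noisy quadratic

reg = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)
pred = reg.predict([[0.0]])  # the mean y of the leaf containing x = 0
```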
15
Q

What are two main drawbacks of Decision Trees?

A

Sensitivity to training data changes and axis-aligned (orthogonal) splits.

16
Q

Why are Decision Trees unstable?

A

Small changes in data can lead to a completely different model.

17
Q

How can instability in Decision Trees be fixed?

A

Use Random Forests (average predictions from many trees).
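A sketch of the fix, assuming scikit-learn's RandomForestClassifier:

```python
# Sketch (scikit-learn assumed): a Random Forest averages the predictions
# of many trees trained on random subsets, smoothing out tree instability.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
train_accuracy = forest.score(X, y)
```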

18
Q

What happens when you rotate the training data?

A

The model may fail to generalize because splits are orthogonal (not rotationally invariant).
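The orthogonal-split sensitivity can be sketched with made-up data: a diagonal boundary forces a staircase of axis-aligned splits, but rotating the data 45 degrees makes the boundary a single threshold.

```python
# Sketch: a diagonal class boundary needs many axis-aligned splits, but
# after a 45-degree rotation the boundary is axis-aligned and the tree is
# much shallower. (Data and rotation are illustrative.)
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(42)
X = rng.rand(200, 2)
y = (X[:, 0] > X[:, 1]).astype(int)  # diagonal boundary x0 > x1

plain = DecisionTreeClassifier(random_state=42).fit(X, y)

# Rotate by 45 degrees: the first rotated coordinate is (x0 - x1)/sqrt(2),
# so the class boundary becomes a single threshold on that feature.
R = np.array([[1, 1], [-1, 1]]) / np.sqrt(2)
rotated = DecisionTreeClassifier(random_state=42).fit(X @ R, y)

depths = (plain.get_depth(), rotated.get_depth())
```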