Chapter 6 Decision Trees Flashcards

Question 1

Q

What are Decision Trees used for in machine learning?

Answer

A

Classification, regression, and multioutput tasks.

Question 2

Q

What algorithm is used to train Decision Trees?

Answer

A

Scikit-Learn uses the CART (Classification and Regression Tree) algorithm, which produces only binary trees.

Question 3

Q

How do Decision Trees work for classification?

Answer

A

By traversing the tree from root to leaf, checking conditions at each node to reach a classification.
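The traversal can be sketched in a few lines. This is a minimal illustration, not how any particular library stores its trees: the nested-dict layout, the feature names, and the thresholds are all assumptions chosen for the example.

```python
# Hypothetical hand-built tree (illustrative only): internal nodes test one
# feature against a threshold, leaves carry a class label.
tree = {
    "feature": "petal_length", "threshold": 2.45,
    "left": {"label": "setosa"},
    "right": {
        "feature": "petal_width", "threshold": 1.75,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

def predict(node, sample):
    """Walk from the root to a leaf and return the leaf's label."""
    while "label" not in node:
        # Go left when the condition "feature <= threshold" holds, else right.
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]
```

For instance, a sample with petal_length 5.0 and petal_width 1.5 fails the root condition, goes right, passes the second condition, and lands in the "versicolor" leaf.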

Question 4

Q

What is Gini impurity?

Answer

A

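A measure of node impurity: G = 1 - (sum over classes k of p_k^2), where p_k is the ratio of class-k instances among the training instances in the node. It is 0 (the node is "pure") when all of the node's training instances belong to the same class.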
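As a quick check of the definition, Gini impurity can be computed directly from a node's labels. The function name `gini` is just a helper chosen for this sketch.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class ratios."""
    m = len(labels)
    if m == 0:
        return 0.0  # convention for this sketch: an empty node counts as pure
    return 1.0 - sum((count / m) ** 2 for count in Counter(labels).values())
```

A node holding a single class gives 0, while an even two-class mix gives 0.5, the maximum for two classes.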

Question 5

Q

How can Decision Trees estimate class probabilities?

Answer

A

By finding the leaf node the instance falls into and returning the ratio of training instances of each class in that leaf.
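A small sketch of that calculation, assuming the leaf's per-class training counts are available as a dict. The counts below are made up for illustration, and `leaf_probabilities` is a hypothetical helper.

```python
def leaf_probabilities(class_counts):
    """Turn a leaf's per-class training counts into estimated probabilities."""
    total = sum(class_counts.values())
    return {cls: n / total for cls, n in class_counts.items()}

# Illustrative leaf: 54 training instances reached it.
counts = {"setosa": 0, "versicolor": 49, "virginica": 5}
probs = leaf_probabilities(counts)
predicted = max(probs, key=probs.get)  # class with the highest estimate
```

Every instance that reaches the same leaf gets the same estimates, no matter where in the leaf's region it sits.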

Question 6

Q

What does CART try to do when training?

Answer

A

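Greedily pick the feature and threshold that split the data into the purest subsets, weighted by their size, then repeat on each subset until a stopping condition (such as max_depth) is reached or no split reduces impurity.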
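One greedy step of this search might look like the sketch below. It is a simplified illustration, not a library's implementation: it tries every observed feature value as a threshold (real implementations typically use midpoints between sorted values), and `best_split` is a hypothetical name.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    m = len(labels)
    return 1.0 - sum((c / m) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (cost, feature_index, threshold) with the lowest weighted Gini."""
    m = len(y)
    best = None
    for k in range(len(X[0])):                      # every feature
        for t in sorted({row[k] for row in X}):     # every candidate threshold
            left = [y[i] for i in range(m) if X[i][k] <= t]
            right = [y[i] for i in range(m) if X[i][k] > t]
            if not left or not right:
                continue                            # not a real split
            # Cost: impurity of each side, weighted by its share of instances.
            cost = len(left) / m * gini(left) + len(right) / m * gini(right)
            if best is None or cost < best[0]:
                best = (cost, k, t)
    return best
```

Because the search only optimizes the current split, the resulting tree is reasonably good but not guaranteed to be optimal.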

Question 7

Q

What’s the training time complexity of Decision Trees?

Answer

A

O(n × m log(m)), where n is the number of features and m is the number of samples.

Question 8

Q

What are the two impurity measures in Decision Trees?

Answer

A

Gini impurity (default) and entropy.

Question 9

Q

When is entropy used in decision trees?

Answer

A

As an alternative impurity measure, selected by setting criterion="entropy" (the default is "gini").
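For comparison with Gini, entropy can be computed from the same class ratios. This is a minimal sketch using the standard definition, H = -(sum over classes k of p_k * log2(p_k)), taken over the classes present in the node; the helper name `entropy` is chosen for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    m = len(labels)
    # Counter only yields classes that occur, so log2(0) never arises.
    return sum(-(c / m) * math.log2(c / m) for c in Counter(labels).values())
```

Like Gini, it is 0 for a pure node; for an even two-class mix it is 1 bit.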

Question 10

Q

Which is faster to compute: Gini or Entropy?

Answer

A

Gini is slightly faster, since it needs no logarithm. In most cases the two measures lead to similar trees.

Question 11

Q

What is a “white box” model in ML?

Answer

A

A model that is easy to interpret, like a Decision Tree.

Question 12

Q

What hyperparameters help regularize a Decision Tree?

Answer

A

max_depth, min_samples_split, min_samples_leaf, max_leaf_nodes, max_features.

Question 13

Q

How do you prevent overfitting in a Decision Tree?

Answer

A

By regularizing it: increasing the min_* hyperparameters or decreasing the max_* hyperparameters restricts how far the tree can grow.
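To see the effect of these constraints, here is a toy single-feature tree builder. It is only a sketch under stated assumptions: `build` and `tree_depth` are hypothetical helpers, and the parameter names `max_depth` and `min_samples_leaf` merely mirror the hyperparameters named above.

```python
from collections import Counter

def gini(labels):
    m = len(labels)
    return 1.0 - sum((c / m) ** 2 for c in Counter(labels).values())

def build(xs, ys, depth=0, max_depth=None, min_samples_leaf=1):
    """Greedily grow a one-feature classification tree as nested dicts."""
    stop = gini(ys) == 0.0 or (max_depth is not None and depth >= max_depth)
    best = None
    if not stop:
        for t in sorted(set(xs))[:-1]:  # skip the max so the right side is non-empty
            left = [y for x, y in zip(xs, ys) if x <= t]
            right = [y for x, y in zip(xs, ys) if x > t]
            if min(len(left), len(right)) < min_samples_leaf:
                continue                # split would create a too-small leaf
            cost = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
            if best is None or cost < best[0]:
                best = (cost, t)
    if best is None:                    # stopped, or no admissible split
        return {"label": Counter(ys).most_common(1)[0][0]}
    t = best[1]
    lo = [(x, y) for x, y in zip(xs, ys) if x <= t]
    hi = [(x, y) for x, y in zip(xs, ys) if x > t]
    args = (depth + 1, max_depth, min_samples_leaf)
    return {
        "threshold": t,
        "left": build([x for x, _ in lo], [y for _, y in lo], *args),
        "right": build([x for x, _ in hi], [y for _, y in hi], *args),
    }

def tree_depth(node):
    """Number of splits on the longest root-to-leaf path."""
    if "label" in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

# Noisy toy data: labels alternate, so an unconstrained tree memorizes them.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 1, 0, 1, 0, 1, 0, 1]
```

On this data the unconstrained tree keeps splitting until every leaf is pure, while `max_depth=2` or `min_samples_leaf=4` stops it early and leaves a much smaller tree.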

Question 14

Q

What’s the difference in training a regression tree vs classification tree?

Answer

A

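A regression tree chooses splits that minimize the MSE rather than the impurity, and each leaf predicts the average target value of its training instances.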
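The regression variant of the split search can be sketched the same way as for classification, swapping impurity for MSE. This is a simplified single-feature illustration; `mse` and `best_regression_split` are hypothetical helper names.

```python
def mse(values):
    """Mean squared error of values around their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_regression_split(xs, ys):
    """Return (cost, threshold) minimizing the size-weighted MSE of both sides."""
    m = len(ys)
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so the right side is non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = len(left) / m * mse(left) + len(right) / m * mse(right)
        if best is None or cost < best[0]:
            best = (cost, t)
    return best
```

With inputs 1 to 4 and targets 1.0, 1.0, 5.0, 5.0, the best threshold is 2: both sides then have zero error, and each leaf would predict its own mean (1.0 and 5.0).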

Question 15

Q

What are two main drawbacks of Decision Trees?

Answer

A

Sensitivity to training data changes and axis-aligned (orthogonal) splits.

Question 16

Q

Why are Decision Trees unstable?

Answer

A

Small changes in data can lead to a completely different model.

Question 17

Q

How can instability in Decision Trees be fixed?

Answer

A

Use Random Forests (average predictions from many trees).

Question 18

Q

What happens when you rotate the training data?

Answer

A

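The decision boundary can become unnecessarily convoluted, a staircase of axis-aligned steps, so the model may generalize poorly. Trees are sensitive to rotation because every split is orthogonal to an axis.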
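The rotation effect can be demonstrated on a toy grid. In this sketch (all names and data are assumptions made for the example), the classes are separated by the vertical line x = 0, so one axis-aligned split suffices. After a 45-degree rotation no single split separates them.

```python
import math

def best_stump_accuracy(points, labels):
    """Best training accuracy of any single axis-aligned threshold split."""
    best = 0.0
    for axis in (0, 1):
        for t in sorted({p[axis] for p in points}):
            pred = [p[axis] > t for p in points]
            correct = sum(a == b for a, b in zip(pred, labels))
            # Either side of the split may be assigned the positive class.
            acc = max(correct, len(labels) - correct) / len(labels)
            best = max(best, acc)
    return best

def rotate(points, degrees):
    """Rotate 2D points counterclockwise around the origin."""
    a = math.radians(degrees)
    return [(x * math.cos(a) - y * math.sin(a),
             x * math.sin(a) + y * math.cos(a)) for x, y in points]

# Toy grid: the class depends only on the sign of x.
points = [(x, y) for x in range(-3, 4) if x != 0 for y in range(-3, 4)]
labels = [x > 0 for x, _ in points]

acc_original = best_stump_accuracy(points, labels)
acc_rotated = best_stump_accuracy(rotate(points, 45), labels)
```

Here `acc_original` is 1.0 while `acc_rotated` falls below 1.0, so a tree on the rotated data needs several stacked splits to approximate a boundary that a single split captured before.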