Chapter 9 Flashcards
What is a tree leaf node?
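A leaf node is a terminal node: one that is not split any further. In a classification tree, each partition divides the measurements in a node into 2 child nodes, and the children that are never split again are the leaf nodes.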
How do you determine which question is better in a decision tree?
1) Balance - how evenly each question splits up a group.
2) Purity - we want each resulting group to contain only one class (e.g. type of animal).
What is impurity?
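A number that represents how mixed the classes in a group are. It is computed for the root before a split and again for the resulting groups after the split, so the two values can be compared.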
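Example: a minimal sketch of comparing impurity before and after a split. It assumes Gini impurity as the impurity function and assumes the impurity after a split is the size-weighted average of the two groups, which is a common convention but not stated on the card. The names `gini`, `split_impurity`, and the animal labels are made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    """Impurity after a split: size-weighted average of both groups (assumed convention)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical root group and one candidate split of it.
root = ["cat", "cat", "dog", "dog"]
left, right = ["cat", "cat"], ["dog", "dog"]

impurity_before = gini(root)
impurity_after = split_impurity(left, right)
gain = impurity_before - impurity_after  # how much purer the split made things
```

A larger gain means the question separated the classes better.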
Name impurity functions:
1) Entropy
2) Gini
3) Classification error (ClassError)
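Example: a rough sketch of the three impurity functions, using their usual textbook definitions over the class proportions p in a group (entropy: -sum of p*log2(p); Gini: 1 - sum of p^2; classification error: 1 - max p). The function names and the helper `class_proportions` are illustrative, and the sketch assumes a non-empty list of labels.

```python
import math
from collections import Counter

def class_proportions(labels):
    """Fraction of observations in each class (labels must be non-empty)."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def entropy(labels):
    """Entropy in bits; only classes that are present contribute, so p > 0."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))

def class_error(labels):
    """Classification error: share of observations outside the majority class."""
    return 1.0 - max(class_proportions(labels))
```

All three return 0 for a group with a single class and reach their maximum when the classes are evenly mixed.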
What is early stopping?
Instead of growing the tree until every split is totally pure, the algorithm stops when one of these criteria is met:
1) When a branch contains fewer than x observations
2) When the tree reaches a certain depth
3) When the purity gain of the best split is below a certain value
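Example: the three criteria above can be sketched as a single check that a tree-growing routine might call before splitting a branch. The function name `should_stop`, its parameters, and the default thresholds are hypothetical and chosen only for illustration.

```python
def should_stop(n_observations, depth, best_gain,
                min_observations=5, max_depth=3, min_gain=0.01):
    """Return True if any early-stopping criterion is met (thresholds are assumptions)."""
    if n_observations < min_observations:  # 1) branch has too few observations
        return True
    if depth >= max_depth:                 # 2) tree has reached the depth limit
        return True
    if best_gain < min_gain:               # 3) best split barely improves purity
        return True
    return False
```

For instance, `should_stop(n_observations=3, depth=1, best_gain=0.2)` is true because the branch holds fewer than 5 observations.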
What is pruning?
Cutting branches off the tree after the tree-growing algorithm has stopped.