Lecture 9 - Decision Trees Flashcards
1
Q
Decision Tree
A
- Data-driven method
- Popular classification technique
Reasons
- Performs well across a wide range of situations
- Does not require much effort from the analyst
- Easily understandable by consumers
- At least when the trees are not too large
- Can be used for both:
- Classification, called classification trees
- Prediction, called regression trees
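A minimal sketch of both uses with scikit-learn (the library, toy datasets, and parameters are illustrative choices, not part of the lecture):

```python
from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: predicts a class label
X_cls, y_cls = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X_cls, y_cls)
print(clf.predict(X_cls[:1]))   # a class index

# Regression tree: predicts a numeric value
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3).fit(X_reg, y_reg)
print(reg.predict(X_reg[:1]))   # a real-valued prediction
```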
2
Q
Example
A
3
Q
Nodes
A
- Conditions in the nodes give the splitting value on a predictor
- The number inside the node gives the number of records after the split
- The bracket provides the number of records per class: [not acceptor, acceptor]
- The leaf nodes, called terminal nodes, are colour-coded to indicate non-acceptor (orange) or acceptor (blue)
4
Q
Trees are easily translated into a set of rules
A
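One way to see this in practice (a sketch, assuming scikit-learn; the dataset and depth are arbitrary): export_text prints a fitted tree as nested if/then conditions, and each root-to-leaf path reads as one rule.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2).fit(iris.data, iris.target)

# Every root-to-leaf path in the printout is one IF ... THEN ... rule
print(export_text(tree, feature_names=list(iris.feature_names)))
```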
5
Q
Induction (with a Greedy Strategy)
A
- The tree is constructed in a top-down, recursive, divide-and-conquer manner
- At the start, all the training instances are at the root
- The instances are then partitioned recursively based on selected attributes (a sketch of this recursion follows below)
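A bare-bones sketch of that recursion (illustrative Python, not the lecture's code), assuming each record is a dict with a "class" key and using Gini impurity for the greedy choice:

```python
from collections import Counter, defaultdict

def gini(records):
    # Gini impurity of a set of records (each record is a dict with a "class" key)
    counts = Counter(r["class"] for r in records)
    n = len(records)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def partition(records, attr):
    # Group records by their value of `attr` (a multi-way split)
    groups = defaultdict(list)
    for r in records:
        groups[r[attr]].append(r)
    return groups

def grow_tree(records, attributes):
    labels = {r["class"] for r in records}
    # Stop: the node is pure, or there is nothing left to split on
    if len(labels) == 1 or not attributes:
        return Counter(r["class"] for r in records).most_common(1)[0][0]

    # Greedy step: pick the attribute whose split gives the lowest
    # weighted child impurity right now (no look-ahead, no backtracking)
    def weighted_impurity(attr):
        groups = partition(records, attr)
        return sum(len(g) / len(records) * gini(g) for g in groups.values())

    best = min(attributes, key=weighted_impurity)
    rest = [a for a in attributes if a != best]
    return {"split_on": best,
            "children": {v: grow_tree(subset, rest)
                         for v, subset in partition(records, best).items()}}

data = [
    {"outlook": "sunny", "windy": "no",  "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "stay"},
    {"outlook": "rain",  "windy": "no",  "class": "play"},
]
print(grow_tree(data, ["outlook", "windy"]))
```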
6
Q
Issues with Induction (with a Greedy Strategy)
A
- Determine how to split the records
- How to specify the attribute test condition?
- How to determine the best split?
- Determine when to stop splitting
Specifying Test Condition
- Depends on attribute type:
- Nominal
- Ordinal
- Continuous
- Depends on number of ways to split:
- Binary split, i.e., 2-way
- Multi-way split
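A small illustration of the two split shapes for a nominal attribute (the attribute name and values are made up):

```python
# Multi-way split: one branch per distinct value of the attribute
multi_way = {"single": "branch_1", "married": "branch_2", "divorced": "branch_3"}

# Binary (2-way) split: the values are grouped into two subsets
binary = ({"single", "divorced"}, {"married"})

record = {"marital_status": "divorced"}
print(multi_way[record["marital_status"]])                # branch_3
print(0 if record["marital_status"] in binary[0] else 1)  # 0 (left branch)
```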
7
Q
Splitting based on nominal attributes
A
8
Q
Splitting based on Continuous Attributes
A
Discretization vs. binary split
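A minimal sketch of the binary option, A < v versus A >= v (the data and helper names are made up): sort the values, try midpoints between consecutive values as candidate thresholds, and keep the one with the lowest weighted impurity.

```python
def gini_of(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    # Choose the split point v minimising the weighted Gini of {x < v} vs {x >= v}
    pairs = sorted(zip(values, labels))
    best_v, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        v = (pairs[i - 1][0] + pairs[i][0]) / 2   # candidate midpoint
        left = [c for x, c in pairs if x < v]
        right = [c for x, c in pairs if x >= v]
        if not left or not right:
            continue
        score = (len(left) * gini_of(left) + len(right) * gini_of(right)) / len(pairs)
        if score < best_score:
            best_v, best_score = v, score
    return best_v

# Toy example: annual income (in thousands) vs. loan acceptance
print(best_threshold([20, 35, 50, 90, 120], ["no", "no", "yes", "yes", "yes"]))  # 42.5
```

Discretization instead buckets the attribute into ranges once and then treats the ranges as ordinal values.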
9
Q
Determining the Best Split
A
10
Q
Information gain
A
- Used to determine which feature/attribute provides the maximum information about a class
- Split records based on an attribute test that optimises a certain criterion
- Need a measure of node impurity, e.g., Gini Index, Entropy, etc. (a sketch follows below)
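As a sketch of the criterion itself (standard definitions; the toy labels are made up): the gain of a split is the parent node's impurity minus the weighted impurity of its children, here measured with entropy.

```python
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, child_label_lists):
    n = len(parent_labels)
    weighted_children = sum(len(child) / n * entropy(child) for child in child_label_lists)
    return entropy(parent_labels) - weighted_children

# A split that sends 4 records left and 4 records right
parent = ["yes"] * 4 + ["no"] * 4
left, right = ["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]
print(information_gain(parent, [left, right]))  # about 0.19 bits: the split adds information
```

Swapping entropy for the Gini index in the same formula gives the impurity-decrease criterion used with Gini-based trees.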
11
Q
Information gain (visual)
A
12
Q
Gini Index
A
13
Q
Entropy measure
A
14
Q
Combined impurity
A
15
Q
Categorical Attributes
A
16
Q
Stopping Criteria for Tree Induction
A
- Stop expanding a node when all the records belong to the same class
- Stop expanding a node when all the records have similar attribute values
- Early termination (to be discussed later)
17
Q
How to Address Overfitting
A
Pre-Pruning
- Stop the algorithm before the tree is fully grown
- Typical stopping conditions for a node:
- Stop if all instances belong to the same class
- Stop if all attribute values are the same
- More restrictive conditions:
- Stop if number of instances is less than some user-specified threshold
- Stop if expanding the current node does not improve impurity measures, e.g., Gini or information gain
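In practice these conditions are usually exposed as hyperparameters of the induction algorithm; as a sketch, scikit-learn's DecisionTreeClassifier has thresholds matching the conditions above (the specific values here are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Each argument is a pre-pruning rule: stop splitting a node if ...
clf = DecisionTreeClassifier(
    max_depth=4,                 # ... the tree is already this deep
    min_samples_split=20,        # ... the node holds fewer instances than this
    min_impurity_decrease=0.01,  # ... the best split barely reduces impurity
).fit(X, y)

print(clf.get_depth(), clf.get_n_leaves())
```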
18
Q
How to Address Overfitting
A
Post-pruning
- Grow the decision tree in its entirety
- Trim the nodes of the decision tree in a bottom-up fashion
- If generalisation error improves after trimming, replace sub-tree by a leaf node
- Class label of leaf node is determined from majority class of instances in the sub-tree
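scikit-learn does not expose this exact bottom-up, generalisation-error scheme; the closest built-in post-pruning is minimal cost-complexity pruning, sketched here as a substitute: grow the full tree, compute candidate pruning strengths, and keep the pruned tree that scores best on held-out data (the dataset and random_state are arbitrary).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

# Grow the decision tree to its entirety, then list candidate pruning strengths
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
alphas = full.cost_complexity_pruning_path(X_train, y_train).ccp_alphas

# Keep the pruned tree that generalises best on the held-out split
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in alphas),
    key=lambda t: t.score(X_test, y_test),
)
print(best.get_n_leaves(), best.score(X_test, y_test))
```

The trimming decision here is driven by a single complexity penalty rather than per-sub-tree error checks, but the effect is the same: sub-trees that do not improve generalisation are replaced by leaves.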
19
Q
Pros and cons of decision trees
A
Advantages:
- Easy to understand (domain experts love them)
- Easy to generate rules
Disadvantages:
- May suffer from overfitting
- Classifies by rectangular partitioning (so does not handle correlated features very well)
- Can be quite large - pruning is necessary
- Does not handle streaming data easily
- … but a few successful ideas/techniques exist