5: Decision Trees Flashcards
What are decision trees?
Decision trees are a family of classification techniques based on the construction of a tree-like structure. This structure encodes a series of steps, each of which tests one of the given features to help classify the input object.
Where are decision trees used?
Image processing and character recognition, medicine, financial analysis, astronomy, manufacturing, production, and molecular biology.
Are decision trees SL or UL?
SL, since they use labeled training instances to construct a classifier
How does DT work?
TOP-DOWN PROCESS. 1. Select the highest-ranked feature and create a decision node for it
2. From this node, create one branch per distinct feature value (or range)
− If all instances with this feature value (range) belong to the same class:
the child node of this branch is a leaf node
− else:
repeat steps 1 and 2 on the branch's instances
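The steps above can be sketched as a recursive function. This is a minimal sketch: the name `build_tree` is illustrative, and for brevity the "highest-ranked" feature is simply the first in the list rather than one ranked by a measure such as information gain.

```python
from collections import Counter

def build_tree(instances, features):
    """Top-down induction: instances are (feature_dict, label) pairs."""
    labels = [label for _, label in instances]
    # If all instances share one class (or no features remain), make a leaf.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    # Step 1: select the highest-ranked feature (here: the first, for brevity).
    best = features[0]
    rest = [f for f in features if f != best]
    # Step 2: create one branch per distinct value of the chosen feature.
    tree = {"feature": best, "branches": {}}
    for value in {x[best] for x, _ in instances}:
        subset = [(x, y) for x, y in instances if x[best] == value]
        tree["branches"][value] = build_tree(subset, rest)  # repeat steps 1-2
    return tree
```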
What is the structure of DT?
Nodes (decision based on features), Branches (conditional statements IF), Leaves (classes)
What happens to datasets that contain more than one feature?
For a dataset that contains more than one feature, the decision tree classifier uses a ranking technique to detect their degree of importance to the given classification problem. Accordingly, the classifier selects the most salient feature for representing the root node and then the remaining features in decreasing importance for the rest of the tree nodes.
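ID3-style information gain is one such ranking technique: the feature whose split most reduces label entropy is the most salient. A minimal sketch (function names are illustrative, instances are (feature_dict, label) pairs):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(instances, feature):
    """Entropy reduction obtained by splitting on `feature`."""
    labels = [y for _, y in instances]
    total = len(instances)
    remainder = 0.0
    for value in {x[feature] for x, _ in instances}:
        subset = [y for x, y in instances if x[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder
```

Ranking all features by this score, the best one becomes the root node, and the process repeats on each branch with the remaining features.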
How does the complexity of decision rules affect the interpretability and size of a decision tree?
A decision tree uses a tree structure to represent decision rules, which makes it easy for experts to understand the reasons behind classifications. However, as the tree adds more rules, it needs more training data, and if there are many features, the rules become more complex. This added complexity can make the tree harder to interpret, reducing its value as a visual tool.
What is underfitting?
If the model is not trained enough, the induced decision tree will be too simple to classify instances accurately.
When is a DT model successful?
When it is able to generalize.
What are some challenges with DT?
The tree may include branches that represent outliers or noise in the input dataset (overfitting).
What are the benefits?
- easy to interpret due to the natural representation (SVMs and neural networks are black-box classifiers where the decision logic is unknown)
- independent of the statistical distribution of the input data
- the relationship between the features and the class labels can be nonlinear
What is pruning?
- Handles overfitting by decreasing the size of the tree to make it less complex
− Method: removing sub-trees in the decision tree that have low classification power
What are the two types of pruning?
Pre-pruning: avoids building up the low-discriminating sub-trees while the decision tree is being constructed, and replaces them with leaf nodes
Post-pruning: removes spurious sub-trees from the fully constructed decision tree, and replaces them with leaf nodes
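Post-pruning can be sketched as follows, assuming a tree represented as nested dicts of the form `{"feature": ..., "branches": {...}}` with class labels as leaves (names and representation are illustrative): a sub-tree is replaced by a majority-class leaf whenever that does not reduce accuracy on held-out data.

```python
from collections import Counter

def classify(tree, x):
    """Follow branches until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][x[tree["feature"]]]
    return tree

def accuracy(tree, data):
    return sum(classify(tree, x) == y for x, y in data) / len(data)

def post_prune(tree, data):
    """Replace sub-trees with leaves when that does not hurt accuracy on `data`."""
    if not isinstance(tree, dict):
        return tree
    # Prune children first (bottom-up).
    for value, child in tree["branches"].items():
        subset = [(x, y) for x, y in data if x[tree["feature"]] == value]
        if subset:
            tree["branches"][value] = post_prune(child, subset)
    # Candidate leaf: the majority class under this node.
    leaf = Counter(y for _, y in data).most_common(1)[0][0]
    if accuracy(leaf, data) >= accuracy(tree, data):
        return leaf  # the sub-tree had low classification power
    return tree
```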
What are the most popular methods for DT?
ID3, C4.5 and CART (differ in feature selection and how the pruning mechanism is used)
What feature-selection techniques do these methods use?
ID3 –> information gain, C4.5 –> gain ratio technique, CART –> Gini index technique
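The measures differ only in the impurity formula applied at each node; C4.5's gain ratio additionally normalizes information gain by the split's intrinsic information. A minimal sketch of entropy (ID3) and the Gini index (CART), with illustrative function names:

```python
import math
from collections import Counter

def entropy(labels):
    """ID3's measure: -sum(p_i * log2(p_i)) over class proportions p_i."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gini_index(labels):
    """CART's measure: 1 - sum(p_i^2); 0 for a pure node."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())
```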