3 Predictive Analytics – Decision Trees Flashcards

1
Q

Question 1
Level: easy
Which statement is TRUE?
a) Estimating class probabilities in the leaf nodes of a classification tree is a better approach than majority voting for making the splitting decision in learning a decision tree, since it allows differentiating among the instances that are classified in the same class.
b) In addition to learning interaction effects, decision trees can easily mimic linear relations or decision boundaries; therefore, they typically outperform linear methods.
c) Decision trees can be transformed into a set of decision rules by following the various paths from the root node to the leaf nodes.
d) The information gain criterion allows selecting the split that maximizes the decrease in impurity, by comparing the impurity of the parent node and the average impurity of the child nodes.

A

c) Decision trees can be transformed into a set of decision rules by following the various paths from the root node to the leaf nodes.

a) Majority voting / class prediction is used for the assignment decision, not the splitting decision.
The splitting decision is based on an impurity criterion such as information gain or Gini impurity.
b) Decision trees are not inherently good at capturing linear relationships or linear decision boundaries.
They are better suited to capturing complex, non-linear relationships and interaction effects in the data.
Linear methods may outperform decision trees when the underlying relationship is (close to) linear.
d) Almost correct. The information gain criterion is commonly used in decision tree algorithms such as ID3 and C4.5.
It selects the split that maximizes the reduction in impurity by comparing the impurity of the parent node with the WEIGHTED sum (weighted by the fraction of instances in each child) of the impurities of the child nodes, not the plain average.
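To make the corrected statement in d) concrete (this example is my addition, not part of the original answer), the sketch below computes information gain with the weighted sum of child impurities; the toy labels and function names are made up for illustration.

    import numpy as np

    def entropy(labels):
        """Impurity of a node: entropy of its class distribution."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(parent, children):
        """Parent impurity minus the WEIGHTED sum of child impurities."""
        n = len(parent)
        weighted = sum(len(child) / n * entropy(child) for child in children)
        return entropy(parent) - weighted

    # Toy split of 10 instances (5 of each class) into children of size 7 and 3.
    parent = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
    left   = np.array([1, 1, 1, 1, 1, 0, 0])
    right  = np.array([0, 0, 0])
    print(round(information_gain(parent, [left, right]), 2))  # about 0.40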

2
Q

Question 2
Level: easy
Which statement is FALSE?
a) Decision tree learning methods such as C4.5 and CART adopt a greedy approach for making the splitting decision and therefore may not learn the globally optimal tree for the training data.
b) Decision trees can learn and incorporate both linear relations and interaction effects.
c) Regression trees can both minimize the error within leaf nodes and maximize the difference between predicted values of leaf nodes.
d) A greedy approach is adopted for learning decision trees, because the number of possible trees grows exponentially with the number of variables in the dataset.

A

b) Decision trees can learn and incorporate both linear relations and interaction effects.

a) TRUE: a greedy approach makes the locally optimal splitting choice at each stage, so the resulting tree is not guaranteed to be globally optimal.
b) FALSE: a decision tree cannot directly learn and incorporate linear (i.e., proportional) relations; it can only approximate them with a sequence of axis-parallel splits.
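To see the point about linear relations in practice (my addition, not from the card): a depth-limited regression tree can only approximate a perfectly linear target with a piecewise-constant step function, while linear regression recovers it exactly. A minimal sketch assuming scikit-learn is available; the data and parameter values are purely illustrative.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor

    X = np.linspace(0, 10, 200).reshape(-1, 1)
    y = 2 * X.ravel()                                    # perfectly linear target: y = 2x

    tree = DecisionTreeRegressor(max_depth=3).fit(X, y)  # at most 2^3 = 8 constant segments
    line = LinearRegression().fit(X, y)                  # recovers the slope exactly

    print("distinct tree predictions:", np.unique(tree.predict(X)).size)  # a staircase, not a line
    print("linear regression slope:", round(line.coef_[0], 2))            # 2.0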

3
Q

Question 1
Which three types of decisions are to be made by decision tree learners?

A

- splitting decision
- evaluation decision
- assignment decision
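As a memory aid (my own addition, not from the course material), the sketch below marks where each decision appears in a generic recursive tree learner. It uses Gini impurity, majority voting and simple stopping rules purely for illustration; it is not the exact algorithm from the lectures.

    import numpy as np

    def gini(y):
        """Gini impurity of a set of class labels."""
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def majority_class(y):
        """Assignment decision: the most frequent class in the node."""
        classes, counts = np.unique(y, return_counts=True)
        return classes[np.argmax(counts)]

    def learn_tree(X, y, depth=0, max_depth=3, min_samples=5):
        # Evaluation (stopping) decision: stop growing and turn this node into a leaf?
        if depth >= max_depth or len(y) < min_samples or gini(y) == 0.0:
            return {"leaf": majority_class(y)}

        # Splitting decision: pick the (feature, threshold) with the lowest weighted Gini impurity.
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                mask = X[:, j] <= t
                if mask.all() or not mask.any():
                    continue
                score = mask.mean() * gini(y[mask]) + (~mask).mean() * gini(y[~mask])
                if best is None or score < best[0]:
                    best = (score, j, t, mask)
        if best is None:                   # no valid split left: fall back to a leaf
            return {"leaf": majority_class(y)}

        _, j, t, mask = best
        return {"feature": j, "threshold": t,
                "left": learn_tree(X[mask], y[mask], depth + 1, max_depth, min_samples),
                "right": learn_tree(X[~mask], y[~mask], depth + 1, max_depth, min_samples)}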

4
Q

Question 2
Explain the use of the Gini measure, majority voting and tuning in learning a decision tree.

A

Gini measure (splitting decision): an impurity measure expressing how often a randomly chosen instance from a node would be incorrectly classified if it were labelled according to the node's class distribution.
At each node, the split that minimizes the (weighted) Gini impurity of the child nodes is chosen.

Majority voting (assignment decision): a leaf node predicts the class label that is most prevalent among the training instances ending up in that leaf.

Tuning hyperparameters (evaluation decision):
hyperparameters such as the maximum tree depth or the minimum number of instances per leaf are adjusted, typically on a validation set, to optimize the performance of the model and avoid overfitting.
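A small numeric illustration (my addition, with made-up toy numbers) of the splitting and assignment decisions:

    import numpy as np

    def gini(y):
        """Chance that a randomly drawn instance of the node is misclassified
        when labelled according to the node's class distribution."""
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    # Splitting decision: weighted Gini impurity of a candidate split of 10 instances.
    left, right = np.array([1, 1, 1, 0]), np.array([0, 0, 0, 0, 1, 1])
    weighted = len(left) / 10 * gini(left) + len(right) / 10 * gini(right)
    print("weighted Gini after split:", round(weighted, 3))   # 0.417

    # Assignment decision: majority voting in a leaf node.
    classes, counts = np.unique(left, return_counts=True)
    print("leaf predicts class:", classes[np.argmax(counts)])  # 1

Tuning would then, for example, repeat tree learning for several values of a hyperparameter such as the maximum depth and keep the value with the best validation performance.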

5
Q

Question 3
What is overfitting? How is overfitting avoided by early stopping?

A

Overfitting: the model essentially memorizes the training data, including its noise, so it fits the training set almost perfectly but generalizes poorly to unseen data.
Early stopping: the tree is not grown until it fits the training data perfectly; instead, growing (splitting) is stopped as soon as performance on a separate validation set no longer improves, or when stopping criteria such as a maximum depth or a minimum number of instances per leaf are reached.
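One simple way to realize this idea (an illustrative sketch of my own, not the exact procedure from the course; it assumes scikit-learn and uses synthetic data only to be self-contained): grow increasingly deep trees and stop as soon as validation accuracy no longer improves.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic data standing in for the real training set.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

    best_depth, best_score = None, -np.inf
    for depth in range(1, 15):
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
        score = tree.score(X_val, y_val)
        if score <= best_score:          # validation accuracy stopped improving -> stop early
            break
        best_depth, best_score = depth, score

    print("selected depth:", best_depth, "validation accuracy:", round(best_score, 3))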

6
Q

Question 4
Explain the steps in the ID3 decision tree algorithm with standard deviation reduction.

A
  • Step 1: The standard deviation of the target variable is calculated for the current node.
  • Step 2: The dataset is split on the different candidate attributes; for each attribute, the standard deviation reduction is computed as the standard deviation before the split minus the weighted standard deviation after the split.
  • Step 3: The attribute with the largest standard deviation reduction is chosen for the decision node.
  • Step 4: The dataset is divided based on the values of the selected attribute. This process is run recursively on the non-leaf branches until all data is processed or a stopping criterion is met.
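A minimal sketch (my addition; the toy numbers are made up) of the standard deviation reduction computation behind steps 1-3:

    import numpy as np

    def sdr(target, attribute):
        """Standard deviation reduction: sd(target) minus the weighted sd after splitting on attribute."""
        before = np.std(target)
        after = 0.0
        for value in np.unique(attribute):
            subset = target[attribute == value]
            after += len(subset) / len(target) * np.std(subset)
        return before - after

    # Toy regression target (e.g. hours played) split by a nominal attribute (e.g. outlook).
    hours   = np.array([40, 52, 23, 45, 30, 46, 35, 48])
    outlook = np.array(["sunny", "overcast", "rainy", "rainy", "sunny", "overcast", "sunny", "overcast"])
    print("SDR(outlook):", round(sdr(hours, outlook), 2))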