Lec2 - Decision Trees Flashcards

1
Q

What is Decision Tree Learning? Describe the Decision Tree algorithm.

A

Decision Tree Learning is a method for approximating discrete classification functions by means of a tree-based representation.

Algorithm:

  1. Search for a split point (or attribute) using a statistical test for each attribute to determine how well it classifies the training examples when considered alone.
  2. Split the dataset according to the split point.
  3. Repeat 1 and 2 on each of the created subsets.
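The three steps above can be sketched in Python (a minimal illustration, not the lecture's implementation; `find_best_split` stands in for whichever statistical test is used, and all names are mine):

```python
from collections import Counter

def majority_class(labels):
    """Most common class label in a subset."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, find_best_split):
    # Stop when the subset is pure (all examples share one class).
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    # Step 1: pick a split point via the supplied statistical test.
    split = find_best_split(rows, labels)
    if split is None:
        return {"leaf": majority_class(labels)}
    feature, value = split
    # Step 2: split the dataset according to the split point.
    left = [i for i, r in enumerate(rows) if r[feature] <= value]
    right = [i for i, r in enumerate(rows) if r[feature] > value]
    if not left or not right:  # degenerate split: fall back to a majority leaf
        return {"leaf": majority_class(labels)}
    # Step 3: repeat 1 and 2 on each of the created subsets.
    return {"split": split,
            "left": build_tree([rows[i] for i in left],
                               [labels[i] for i in left], find_best_split),
            "right": build_tree([rows[i] for i in right],
                                [labels[i] for i in right], find_best_split)}
```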
2
Q

Give the names of 3 statistical tests for Decision Tree Learning.

A
  1. Information Gain
  2. Gini Impurity
  3. Variance Reduction
3
Q

What is Information Entropy? Give its formula.

A

Entropy is a measure of the uncertainty of a random variable.

E = -Sum_c[ y_c * log2(y_c) ]

where the sum runs over the classes c, and y_c is the proportion of examples belonging to class c.
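The formula translates directly to Python (an illustrative sketch, not course code):

```python
from math import log2

def entropy(labels):
    """E = -sum over classes c of y_c * log2(y_c),
    where y_c is the proportion of examples of class c."""
    n = len(labels)
    proportions = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in proportions)
```

A pure subset has entropy 0, a 50/50 binary split has the maximal entropy 1, and a 2-vs-3 split gives the 0.971 value quoted in card 5.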

4
Q

Describe two different ways of splitting for Decision Tree Learning.

A

Ordered Values:
For each feature: start by sorting the values of the attribute, and then consider only split points that are between two examples in sorted order that have different classifications.

Symbolic Values:
Search for the most informative feature and then create as many branches as there are different values for that feature.
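The ordered-values strategy can be sketched as follows (a minimal illustration; the function name is mine, and midpoints between differing-class neighbours are one common choice of split point):

```python
def candidate_splits(values, labels):
    """Sort examples by attribute value, then keep only the midpoints
    between adjacent examples whose classifications differ."""
    pairs = sorted(zip(values, labels))
    splits = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if c1 != c2 and v1 != v2:
            splits.append((v1 + v2) / 2)
    return splits
```

This prunes the search: instead of testing every possible threshold, only the boundaries where the class changes are evaluated.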

5
Q

Compute the Information Gain for the following table and say which feature split would be the best.

H(Sunny) = 0.971

https://drive.google.com/open?id=1qkbSHoStEtFabV_3fpgmSeAm4LBgFQEE

A

Solution:

https://drive.google.com/open?id=1wteJEaM3xp0-MLoKsAdrljFgvZEWRCxO
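The linked table is not reproduced here, but the H(Sunny) = 0.971 hint matches the classic Play Tennis data (9 yes / 5 no overall; Outlook = Sunny gives 2 yes / 3 no). Assuming those counts (an assumption, since the image is unavailable), the computation can be sketched as:

```python
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(parent, children):
    """IG = E(parent) - sum_i (|child_i| / |parent|) * E(child_i)."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

# Assumed Play Tennis counts for the Outlook attribute:
sunny    = ["yes"] * 2 + ["no"] * 3   # H(Sunny) = 0.971
overcast = ["yes"] * 4                # H(Overcast) = 0
rain     = ["yes"] * 3 + ["no"] * 2   # H(Rain) = 0.971
parent   = sunny + overcast + rain    # 9 yes / 5 no

gain = information_gain(parent, [sunny, overcast, rain])  # ~ 0.247
```

The best feature split is the one with the highest information gain.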

6
Q

What is overfitting, how can you deal with it, and what is a common way of dealing with it for decision trees?

A

Overfitting is when the estimated function learns to fit the training data perfectly, such that it is hard for it to generalise to unseen data.

To deal with it we can split our data into a training set and a validation set, train on the training set, measure performance on the validation set, and stop training when the validation error starts to increase.

A common approach with decision trees is pruning: recursively visit nodes whose children are all leaves and check whether accuracy on the validation set would improve (or stay the same) if that node were converted to a leaf node; if so, replace it with a leaf.
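This pruning idea (reduced-error pruning) can be sketched as follows; the dict-based tree shape and all names are illustrative, and a node is collapsed to a majority leaf when that does not hurt accuracy on the validation examples reaching it:

```python
from collections import Counter

def predict(tree, row):
    while "leaf" not in tree:
        feature, value = tree["split"]
        tree = tree["left"] if row[feature] <= value else tree["right"]
    return tree["leaf"]

def prune(tree, rows, labels):
    """rows/labels: the validation examples that reach this node."""
    if "leaf" in tree or not labels:
        return tree
    feature, value = tree["split"]
    left = [i for i, r in enumerate(rows) if r[feature] <= value]
    right = [i for i, r in enumerate(rows) if r[feature] > value]
    tree["left"] = prune(tree["left"], [rows[i] for i in left],
                         [labels[i] for i in left])
    tree["right"] = prune(tree["right"], [rows[i] for i in right],
                          [labels[i] for i in right])
    # Only consider nodes whose children are both leaves.
    if "leaf" in tree["left"] and "leaf" in tree["right"]:
        subtree_acc = sum(predict(tree, r) == y
                          for r, y in zip(rows, labels)) / len(labels)
        majority = Counter(labels).most_common(1)[0][0]
        leaf_acc = labels.count(majority) / len(labels)
        if leaf_acc >= subtree_acc:  # converting to a leaf doesn't hurt
            return {"leaf": majority}
    return tree
```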
