Decision Trees Flashcards

1
Q

What is information gain?

A

Information gain is the reduction in entropy obtained by splitting the data on an attribute. When building the tree, we prefer the attribute whose split yields the highest gain.
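
As a rough illustration of "reduction in entropy", the sketch below computes the gain as the entropy of the labels minus the weighted entropy of each branch after the split. The function names and the toy data (`outlook`, `noise`, `play`) are assumptions made up for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four examples with a binary label.
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rainy", "rainy"]  # separates the classes perfectly
noise = ["a", "b", "a", "b"]                    # tells us nothing about the label

gain_outlook = information_gain(outlook, play)
gain_noise = information_gain(noise, play)
```

Here `outlook` removes all uncertainty about the label while `noise` removes none, so `outlook` would be the preferred split.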

2
Q

What is the downside of the UID?

A

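Attributes with many distinct values tend to get a higher gain, yet splitting on them produces a useless decision tree (for example, a unique ID puts every example in its own branch). To overcome this, we rely on the gain ratio: GainRatio(X, S) = Gain(X, S) / SplitInfo(X, S), where X is the candidate split attribute (not the label) and SplitInfo(X, S) is the entropy of S with respect to the values of X.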
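
A minimal sketch of why the ratio helps, assuming the usual C4.5-style definition in which the gain is divided by the split information (the entropy of the attribute's own value distribution). The helper names and the toy columns `row_id` and `windy` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain(values, labels):
    """Information gain of splitting labels by the attribute values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(g) / total * entropy(g) for g in groups.values())


def gain_ratio(values, labels):
    """Gain divided by split information; 0.0 if the attribute has a single value."""
    split_info = entropy(values)  # entropy of the partition induced by the attribute
    return gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data.
labels = ["yes", "yes", "no", "no"]
row_id = [1, 2, 3, 4]         # unique per example: many values, no predictive use
windy = ["t", "t", "f", "f"]  # two values that line up with the label

ratio_id = gain_ratio(row_id, labels)
ratio_windy = gain_ratio(windy, labels)
```

Both attributes reach the same raw gain on this data, but the unique ID pays a larger split-information penalty, so its gain ratio comes out lower than that of `windy`.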

3
Q

What is entropy?

A

Entropy reflects the uncertainty about a message, and so how much information we can extract from a particular attribute.

If there are n equally likely messages, the receiver needs to ask log2(n) yes/no questions to learn the message. In other words, the logarithm is base 2 because each yes/no answer is binary.
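
To make the yes/no-question picture concrete, here is a small sketch that computes Shannon entropy in bits from a list of probabilities. The function name `entropy_bits` and the example distributions are assumptions chosen for illustration.

```python
from math import log2


def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Eight equally likely messages: entropy equals log2(8) yes/no questions.
uniform_8 = entropy_bits([1 / 8] * 8)

# Unequal probabilities: less uncertainty than three equally likely messages.
skewed = entropy_bits([0.5, 0.25, 0.25])
```

With equally likely messages the entropy matches the log2(n) question count exactly; when some messages are more likely than others, fewer questions are needed on average.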
