Lecture 3 Notes Flashcards
What game is a Decision Tree similar to?
20 questions
What type of problem does a Decision Tree algorithm solve?
Classification
What does each non-leaf node in a Decision Tree represent?
A query about some feature
What happens when a leaf node is reached in a Decision Tree?
The sample is assigned the label stored at that leaf
What is the average number of questions in a Decision Tree?
∑ depth(leaf) × P(leaf), i.e., the number of questions needed to reach each leaf, weighted by the probability of reaching that leaf
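The weighted sum above can be sketched in a few lines. The leaf depths and probabilities here are made-up illustration values, not from the lecture:

```python
# Expected number of questions: sum of each leaf's depth (questions asked
# to reach it) weighted by the probability of reaching that leaf.
def expected_questions(leaves):
    """leaves: list of (depth, probability) pairs; probabilities sum to 1."""
    return sum(depth * p for depth, p in leaves)

# Hypothetical tree: one question resolves half the cases immediately,
# the rest need a second question.
leaves = [(1, 0.5), (2, 0.25), (2, 0.25)]
print(expected_questions(leaves))  # 1.5
```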
What is a common issue with large Decision Trees?
They tend to overfit; smaller trees generalize better
What should be done if all remaining samples at a node in a Decision Tree have the same label?
Make a leaf node with that label
What to do if no samples answer a question or features are exhausted in a Decision Tree?
Use default case / most common label
What is the Smart Choose function in Decision Trees?
Look for a question whose answers split a large proportion of the samples into homogeneous groups
How is impurity measured in a Decision Tree?
With n different kinds of labels and a sample set X, partition X by label into X1 … Xn; impurity is computed from the proportions |Xk| / |X|
What is the probability of sampling for a label in Decision Trees?
P(k) = |Xk| / |X| for each label k
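A minimal sketch of P(k) = |Xk| / |X| computed from a list of sample labels (the labels are illustrative):

```python
# Probability of sampling each label: count occurrences and divide by |X|.
from collections import Counter

def label_probabilities(labels):
    """labels: list of sample labels; returns {label: P(label)}."""
    counts = Counter(labels)
    n = len(labels)
    return {k: c / n for k, c in counts.items()}

probs = label_probabilities(["a", "a", "b", "b", "b", "c"])
print(probs["b"])  # 0.5  (3 of 6 samples)
```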
What is the Remainder in the context of Decision Trees?
The amount of uncertainty left in the data set after splitting: Remainder = ∑ (|Xi| / |X|) × Impurity(Xi), summed over the groups Xi produced by the split
What can metric attributes in Decision Trees ask about?
Whether the value is higher or lower than a threshold, splitting the samples into L (lower) and G (greater); the remainder is then
(|L|/|X|) Impurity(L) + (|G|/|X|) Impurity(G)
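A sketch of that remainder for a threshold test on a metric attribute, using Gini as the impurity function; the threshold and data are hypothetical:

```python
# Remainder of a threshold split: samples go to L (value <= threshold) or
# G (value > threshold); the remainder is the size-weighted impurity.
from collections import Counter

def gini(labels):
    """Gini impurity of a list of labels; 0 for an empty or pure group."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def remainder(samples, threshold):
    """samples: list of (value, label) pairs."""
    L = [lab for v, lab in samples if v <= threshold]
    G = [lab for v, lab in samples if v > threshold]
    n = len(samples)
    return (len(L) / n) * gini(L) + (len(G) / n) * gini(G)

data = [(1.0, "red"), (2.0, "red"), (3.0, "blue"), (4.0, "blue")]
print(remainder(data, 2.5))  # 0.0: both sides are pure
```

A perfect threshold drives the remainder to zero because each side becomes a homogeneous group.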
What is the bias in Decision Trees regarding space?
A bias towards axis-aligned rectangular decision regions, since each question tests a single feature
What are the Entropy and Gini impurity functions?
Information Entropy: − ∑ ( P(i) lg P(i) )
Gini: 1 - ∑ P(i)^2
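The two impurity formulas above, sketched directly from a list of label probabilities (entropy uses lg, i.e., log base 2):

```python
# Entropy: -sum of P(i) * lg P(i); Gini: 1 - sum of P(i)^2.
import math

def entropy(probs):
    """Information entropy in bits; terms with P(i) = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini_impurity(probs):
    return 1.0 - sum(p * p for p in probs)

print(entropy([0.5, 0.5]))        # 1.0  (one full bit of uncertainty)
print(gini_impurity([0.5, 0.5]))  # 0.5
print(entropy([1.0]))             # 0.0  (a pure group: no uncertainty)
```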
What is Information Gain, and what is its formula?
Information Gain = Entropy(whole set) − Remainder; it measures how good the split was. Higher is better.
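A sketch of information gain computed from labels, assuming an entropy-based impurity; the two example splits are hypothetical:

```python
# Information gain: entropy of the whole set minus the remainder after
# splitting, where the remainder is the size-weighted entropy of the groups.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(whole, groups):
    """groups: the partition of `whole` produced by the split."""
    n = len(whole)
    rem = sum((len(g) / n) * entropy(g) for g in groups)
    return entropy(whole) - rem

whole = ["yes", "yes", "no", "no"]
perfect = [["yes", "yes"], ["no", "no"]]   # each group is homogeneous
useless = [["yes", "no"], ["yes", "no"]]   # groups mirror the whole set
print(info_gain(whole, perfect))  # 1.0: split removes all uncertainty
print(info_gain(whole, useless))  # 0.0: split tells us nothing
```

This is what the Smart Choose step optimizes: among the candidate questions, pick the one whose split yields the highest gain.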