Decision Trees Flashcards
What role does Entropy play?
It controls how the DT splits the data. Entropy measures the impurity of a set of examples, that is, how mixed the classes in the set are.
What is the formula for Entropy?
Entropy = -Sum(i) { p(i) * log2(p(i)) }, where p(i) = fraction of examples in class i, and Sum(i) sums over all classes.
What is the entropy when all examples belong to the same class?
0
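A minimal Python sketch of the entropy calculation, using the standard definition with a leading minus sign. The function name `entropy` and the sample labels are illustrative assumptions, not part of the flashcards.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    result = 0.0
    for count in Counter(labels).values():
        p = count / total  # fraction of examples in this class
        # p * log2(1 / p) is the same quantity as -p * log2(p)
        result += p * log2(1 / p)
    return result


print(entropy(["a", "a", "a"]))       # all one class: 0.0
print(entropy(["a", "a", "b", "b"]))  # evenly split between two classes: 1.0
```

A set with a single class has zero entropy, and a two-class set split evenly has the maximum of 1 bit.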
What is information gain?
entropy(parent) - [weighted average of] entropy(children), where each child is weighted by the fraction of the parent's examples it contains
How does the decision tree utilize information gain?
It maximizes information gain to determine the splits.
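A small sketch of how information gain could be computed for a candidate split, assuming each child is weighted by its share of the parent's examples. The helper names `entropy` and `information_gain` and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(
        (c / total) * log2(total / c)  # same as -p * log2(p)
        for c in Counter(labels).values()
    )


def information_gain(parent, children):
    """entropy(parent) minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["slow", "slow", "fast", "fast"]
perfect_split = [["slow", "slow"], ["fast", "fast"]]
useless_split = [["slow", "fast"], ["slow", "fast"]]

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

Given these two candidate splits, a tree that maximizes information gain would choose the first one, since it separates the classes completely.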
Give an intuitive explanation for how to remember bias
I can train the model with all sorts of data, but it is biased toward its original behavior and doesn't change
Give an intuitive explanation for how to remember variance
It cares so much about the data it is being trained on that it will change its behavior to match whatever data it sees
What are DT strengths and weaknesses?
Strengths: Easy to use, graphically interpretable (knowledge representation), can build bigger classifiers from them with ensemble methods
Weaknesses: Prone to overfitting, especially with lots of features
Give an example for remembering the xor logic gate
When someone asks "do you want to go to the movies or go bowling?", they usually mean xor: pick one or the other, but not both and not neither
Decision tree space - compare xor and or
xor - exponential: the number of nodes grows exponentially with the number of inputs
or - linear: the number of nodes grows linearly as you add inputs
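The size difference can be checked with a short sketch that counts the leaves of a decision tree for n-input or and n-input xor (parity). It assumes a tree that tests the inputs in a fixed order and stops as soon as the output is already determined; the function names are illustrative.

```python
from itertools import product


def any_or(bits):
    """n-input or: 1 if any input is 1."""
    return int(any(bits))


def parity(bits):
    """n-input xor: 1 if an odd number of inputs are 1."""
    return sum(bits) % 2


def count_leaves(f, n, prefix=()):
    """Leaves of a tree that tests inputs in order and stops once f is constant."""
    remaining = n - len(prefix)
    outputs = {f(prefix + rest) for rest in product((0, 1), repeat=remaining)}
    if len(outputs) == 1:
        return 1  # output already decided, so this branch ends in a leaf
    # otherwise split on the next input and count both subtrees
    return count_leaves(f, n, prefix + (0,)) + count_leaves(f, n, prefix + (1,))


for n in range(1, 6):
    print(n, count_leaves(any_or, n), count_leaves(parity, n))
```

For or, a single input equal to 1 settles the answer, so the tree has n + 1 leaves. For xor, every input must be checked on every path, so the tree has 2^n leaves.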
What is Inductive Bias?
The inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs given inputs that it has not encountered. A classical example of an inductive bias is Occam’s Razor, assuming that the simplest consistent hypothesis about the target function is actually the best.
What is Preference Bias?
A preference bias is when a learning algorithm incompletely searches a complete hypothesis space. It chooses which part of the hypothesis space to search. An example is decision trees.
What is Representation Bias?
A representation bias is when a learning algorithm completely searches an incomplete hypothesis space. It searches the whole space, but the space is small and incomplete.