Chp 6 Geron Flashcards
Leaf node
Has no child nodes
Node samples
Counts how many training instances it applies to
Node value
Tells you how many training instances of each class this node applies to
Node Gini
Measures its impurity
A pure node (all its training instances belong to the same class) has Gini = 0
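A small sketch of how a node's Gini impurity could be computed from its value (the per-class instance counts), using the standard formula G = 1 - sum of p_k squared. The function name `gini` and the example counts are illustrative assumptions, not part of the flashcards.

```python
def gini(class_counts):
    """Gini impurity of a node, given how many instances of each class it holds."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("node must contain at least one instance")
    # p_k is the fraction of the node's instances that belong to class k.
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


pure_node = gini([50, 0, 0])    # every instance in one class
mixed_node = gini([0, 49, 5])   # mostly one class, a few of another
print(pure_node, round(mixed_node, 3))
```

The pure node scores 0, and the score grows as the classes become more evenly mixed.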
Binary trees
Every nonleaf node always has exactly 2 children
ID3
Algorithm that can produce decision trees whose nodes have more than 2 children
P
Set of problems that can be solved in polynomial time
NP
Set of problems whose solutions can be verified in polynomial time
NP-hard
A problem to which any NP problem can be reduced in polynomial time
NP-complete
Problems that are both in NP and NP-hard
Information gain
Reduction in entropy produced by a split
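A minimal sketch of information gain, assuming Shannon entropy in bits and a split described only by class counts. The helper names `entropy` and `information_gain` are hypothetical, chosen for illustration.

```python
import math


def entropy(class_counts):
    """Shannon entropy (in bits) of a node, given its per-class instance counts."""
    total = sum(class_counts)
    # Classes with zero instances contribute nothing, so they are skipped.
    return -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )


def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted average entropy of the children."""
    parent_total = sum(parent_counts)
    weighted_children = sum(
        (sum(child) / parent_total) * entropy(child) for child in children_counts
    )
    return entropy(parent_counts) - weighted_children


# A perfectly mixed parent split into two pure children.
perfect_split = information_gain([50, 50], [[50, 0], [0, 50]])
# A split whose children are exactly as mixed as the parent.
useless_split = information_gain([50, 50], [[25, 25], [25, 25]])
print(perfect_split, useless_split)
```

A split that separates the classes completely removes all of the parent's entropy, while a split that leaves the class mix unchanged gains nothing.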
Nonparametric model
Number of parameters is not determined prior to training, so the model structure is free to stick closely to the data, which risks overfitting
Parametric model
Predetermined number of parameters, so its degrees of freedom are limited, reducing the risk of overfitting but increasing the risk of underfitting
Regularization
Restricting the decision tree's freedom during training
Helps to avoid overfitting
min_samples_split
Minimum number of samples a node must have before it can be split
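A hedged sketch of this hyperparameter in practice, assuming scikit-learn is installed and using its bundled iris dataset. The threshold of 20 and the variable names are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

min_split = 20  # illustrative value, not a recommendation
clf = DecisionTreeClassifier(min_samples_split=min_split, random_state=42)
clf.fit(X, y)

tree = clf.tree_
# scikit-learn marks a leaf by setting its child index to -1,
# so any node with a real left child is a node that was split.
internal_sizes = [
    int(tree.n_node_samples[i])
    for i in range(tree.node_count)
    if tree.children_left[i] != -1
]
smallest_split_node = min(internal_sizes)
print(smallest_split_node, ">=", min_split)
```

Every node that was split holds at least `min_split` training instances. Raising the value should give a smaller, more regularized tree.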