W05 Supervised Learning Flashcards
Classification:
data basis
several independent attributes
one dependent attribute, the class
Classification:
condition
a priori knowledge of classification for some instances (supervised learning!)
Classification:
model building
generate rules from classified instances
first: generate best fit
then: prune based on validation set
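The "fit first, then prune on a validation set" idea can be sketched as follows. The card does not name a pruning method, so this sketch assumes reduced-error pruning. The nested-dict tree layout (keys "attr", "default", "children"), the helper names, and the toy data are all hypothetical choices made for illustration.

```python
def predict(tree, row):
    """Follow the branches matching the row until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

def errors(tree, rows, target):
    """Number of rows the tree (or a bare leaf label) misclassifies."""
    return sum(predict(tree, r) != r[target] for r in rows)

def prune(tree, val_rows, target):
    """Reduced-error pruning (assumed method): working bottom-up, replace a
    subtree by its majority-class leaf whenever that does not increase the
    error on the validation rows that reach this node."""
    if not isinstance(tree, dict):
        return tree
    children = {}
    for value, child in tree["children"].items():
        subset = [r for r in val_rows if r[tree["attr"]] == value]
        children[value] = prune(child, subset, target)
    pruned = {**tree, "children": children}
    leaf = tree["default"]
    if errors(leaf, val_rows, target) <= errors(pruned, val_rows, target):
        return leaf
    return pruned

# Hypothetical fully grown tree: the split on "windy" under "sunny" is noise.
full_tree = {
    "attr": "outlook", "default": "yes",
    "children": {
        "sunny": {"attr": "windy", "default": "yes",
                  "children": {"no": "yes", "yes": "no"}},
        "rainy": "no",
    },
}
# Hypothetical validation set, held back from tree construction.
validation = [
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
pruned_tree = prune(full_tree, validation, "play")
```

In this toy case the "windy" subtree is collapsed into the leaf "yes", while the root split on "outlook" survives because removing it would raise the validation error. A subtree that no validation row reaches is also collapsed, since nothing supports keeping it.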
Classification:
generalization
apply rules to new instances
Classification:
methods
logistic regression, naive Bayes classifier, support vector machines, decision trees, random forests, neural networks, nearest neighbour
Decision Tree Terminology:
Binary Tree
each node splits the data into at most 2 subsets
Decision Tree Terminology:
Classification Tree
split can lead to >2 branches
Decision Tree Terminology:
Decision Tree
Nominal (categorical) Classes
Decision Tree Terminology:
Regression Tree
Cardinal Classes
Decision Tree Terminology:
Input
Instance pool
Decision Tree Terminology:
Output
Full Tree
Decision Tree Terminology:
Objective
Formulate rules of type:
IF condition 1 AND ... AND condition n, THEN class
Decision Tree Terminology:
Rule
Path from root to leaf
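To illustrate "rule = path from root to leaf", here is a minimal sketch that walks a small tree and emits one IF/THEN rule per leaf. The nested-dict tree layout, the function name extract_rules, and the example tree are hypothetical and chosen only for illustration.

```python
def extract_rules(tree, conditions=()):
    """Yield one (conditions, label) pair per root-to-leaf path.

    Inner nodes are dicts with keys "attr" and "children"; leaves are plain
    class labels (an assumed representation)."""
    if not isinstance(tree, dict):
        yield list(conditions), tree
        return
    for value, child in tree["children"].items():
        yield from extract_rules(child, conditions + ((tree["attr"], value),))

# Hypothetical tree with three leaves, hence three rules.
tree = {
    "attr": "outlook",
    "children": {
        "sunny": "yes",
        "rainy": {"attr": "windy", "children": {"no": "yes", "yes": "no"}},
    },
}

rules = list(extract_rules(tree))
for conds, label in rules:
    text = " AND ".join(f"{attr} = {value}" for attr, value in conds)
    print(f"IF {text} THEN class = {label}")
# IF outlook = sunny THEN class = yes
# IF outlook = rainy AND windy = no THEN class = yes
# IF outlook = rainy AND windy = yes THEN class = no
```

Each rule's condition list is exactly the sequence of split decisions on the way down, so the number of rules equals the number of leaves.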
Algorithm for generating a decision tree
1. put all objects in a single node
2. search for the best classification criterion
3. classify (split) all objects accordingly
4. recursively apply steps 2 and 3 until a STOP criterion is met
5. prune the tree
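Steps 1 to 4 of this procedure can be sketched as a short recursive function. The card does not say how the "best" criterion is chosen or when to stop, so this sketch assumes Gini impurity as the criterion, nominal attributes with one branch per value, and stopping when a node is pure, no attributes remain, or no split reduces impurity. All function names and the toy data are hypothetical, and step 5 (pruning) is left out.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def partition(rows, attr):
    """Group rows by their value of one nominal attribute."""
    parts = {}
    for row in rows:
        parts.setdefault(row[attr], []).append(row)
    return parts

def split_impurity(rows, attr, target):
    """Size-weighted Gini impurity of the subsets produced by splitting on attr."""
    return sum(
        len(part) / len(rows) * gini([r[target] for r in part])
        for part in partition(rows, attr).values()
    )

def build_tree(rows, attrs, target):
    labels = [r[target] for r in rows]  # step 1: all objects in this node
    majority = Counter(labels).most_common(1)[0][0]
    # STOP: node is pure or no attributes are left to split on
    if len(set(labels)) == 1 or not attrs:
        return majority
    # step 2: pick the attribute whose split is least impure
    best = min(attrs, key=lambda a: split_impurity(rows, a, target))
    # STOP: the best split does not improve on the unsplit node
    if split_impurity(rows, best, target) >= gini(labels):
        return majority
    remaining = [a for a in attrs if a != best]
    # steps 3 and 4: distribute the objects and recurse into each subset
    children = {
        value: build_tree(part, remaining, target)
        for value, part in partition(rows, best).items()
    }
    return {"attr": best, "default": majority, "children": children}

def predict(tree, row):
    """Apply the tree to a new instance by following matching branches."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical toy data set.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
```

On this data the root splits on "outlook" into three branches, and only the "rainy" branch is split again on "windy". Other algorithms would differ in exactly these choices: the criterion, the stop conditions, and the number of branches per node.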
Variety of classification algorithms
1. stop criteria
2. pruning strategy
3. choice of attributes as classification criterion
4. number of splits per node
5. scale of measurement