5 - Supervised Learning, Classification Flashcards
Classification: Assign instances to predefined classes
Data basis:
- several (independent) attributes
- one (dependent) attribute
Condition:
- a priori knowledge of classification for some instances (supervised learning)
Model building:
- generate rules from classified instances
-> first: generate the best fit
-> then: prune based on a validation set
Generalization:
- apply rules to new instances
Classification: Assign instances to predefined classes
Exemplary methods
- logistic regression
- support vector machines
- decision trees, regression trees
- random forest
- neural networks
- nearest neighbor
Classification Examples
Can you think of binary vs. nominal classes for these examples?
- credit scoring
- marketing responses
- geo-temporal events
Credit scoring:
- nominal
Marketing responses:
- binary: response vs. no response
Geo-Temporal events:
- can be both
Decision Tree Terminology
Which types of trees are there?
Binary tree:
- each node splits the data into at most 2 subsets
Classification tree:
- splits can lead to more than 2 branches
Decision tree:
- classes are nominal (categorical) or ordinal
Regression tree:
- classes are cardinal (continuous values)
Decision Tree Terminology
Input
instance pool ((x1, …, xn), c)
with (x1, …, xn) = independent attributes, c = class attribute
Decision Tree Terminology
Output
Full tree
Decision Tree Terminology
Objective
Formulate rules of the type:
If (condition 1) AND … AND (condition n) then c
Decision Tree Terminology
Rule
Path from root to leaf
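Each root-to-leaf path can be read off as one if-then rule. A minimal sketch, assuming a hypothetical nested-dict tree representation (inner nodes carry "attr" and "children", leaves are class labels — this encoding is an assumption, not from the source):

```python
def extract_rules(tree, conditions=()):
    """Walk a decision tree; every root-to-leaf path becomes one rule
    of the form 'If cond1 AND ... AND condn then c'."""
    if not isinstance(tree, dict):               # leaf: emit the finished rule
        conds = " AND ".join(conditions) or "TRUE"
        return [f"If {conds} then {tree}"]
    rules = []
    for value, child in tree["children"].items():
        cond = f'{tree["attr"]} = {value}'       # condition tested on this path
        rules.extend(extract_rules(child, conditions + (cond,)))
    return rules
```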
Generating a decision tree
Algorithm steps
1. Start: all objects are in a single node
2. Search for the best classification criterion
3. Split all objects according to this criterion
4. Recursively apply steps 2 and 3 until a stop criterion is met
5. Go back and prune the tree
Generating a decision tree
Algorithm design varies in …
- stop criteria: e.g. number of instances per class, tree depth, homogeneity measures (e.g. Gini index)
- pruning strategy
- choice of attributes as classification criterion (split quality)
- number of splits per node
- scales of measurement
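One homogeneity measure named above, the Gini index, can be sketched directly from its definition (1 minus the sum of squared class proportions):

```python
def gini(labels):
    """Gini index of a set of class labels:
    0.0 = perfectly homogeneous node, larger = more mixed classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:                       # count instances per class
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())
```

A pure node scores 0.0; a 50/50 two-class mix scores 0.5, the worst case for a binary split.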
Which decision tree algorithms are there?
(CH)AID
- (chi-squared) automatic interaction detection
CART
- classification and regression trees
ID3
- iterative dichotomizer 3
Which decision tree algorithms are there?
(CH)AID
(chi-squared) automatic interaction detection
- objective: find significantly different subsets of data
- select attributes that generate significantly different subsets
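"Significantly different" is judged with the chi-squared statistic over the subsets a split produces. A minimal sketch that computes only the statistic (the significance test itself would compare it against a chi-squared distribution, which is omitted here):

```python
def chi_squared(table):
    """Chi-squared statistic for a contingency table
    (rows = subsets produced by a candidate split, columns = classes).
    Larger values mean the subsets' class distributions differ more."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of split and class.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```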
Which decision tree algorithms are there?
CART
Classification and regression trees
- objective: maximize the information content I
- select the attribute that splits the data with the best success rate
- only binary trees
Which decision tree algorithms are there?
ID3
Iterative dichotomizer 3
- objective: minimize entropy
- split on attribute that produces subsets with minimal entropy
ID3: classification by entropy
- Compute the entropy of a given set of instances with respect to the target attribute c
- entropy measures the homogeneity with respect to c: 0 for a perfectly homogeneous set, maximal for a uniform class mix
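The ID3 split choice described above can be sketched as: compute the Shannon entropy of each candidate subset, weight by subset size, and take the attribute with the minimal weighted entropy (equivalently, maximal information gain). Instances are assumed to be (attribute_dict, class_label) pairs.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of the class attribute c over a set of instances:
    0 bits = completely homogeneous, log2(k) bits = k classes uniform."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def id3_best_attribute(instances, attributes):
    """ID3 split choice: the attribute whose subsets have minimal
    size-weighted entropy."""
    n = len(instances)
    def weighted_entropy(attr):
        total = 0.0
        for value in {x[attr] for x, _ in instances}:
            subset = [c for x, c in instances if x[attr] == value]
            total += len(subset) / n * entropy(subset)
        return total
    return min(attributes, key=weighted_entropy)
```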