E2 Flashcards
Target Variable
The variable whose values are to be modeled and predicted by other variables
Supervised Segmentation
Split the set into disjoint subsets,
where all the individuals within a subset show the same value for a selected attribute
The process of recursively segmenting the population stops when at least one of the following conditions is met:
– All elements of a segment belong to the same class
– The maximum allowed tree depth has been reached
– Using more attributes does not “help”
Entropy
measures the general disorder (unpredictability) of a set
Entropy is zero at…
minimum disorder
-> all members belong to the same class
Entropy is one at…
maximal disorder
-> members equally distributed among classes
Information gain (IG)
measures the change in entropy due to any amount of new information being added