Lecture 4: Decision Trees and k-means clustering Flashcards
What is A ⇒ B?
A implies B
What is Deduction?
The conclusion follows necessarily from the premises.
From A ⇒ B and A, we conclude B.
Example:
“All men are mortal.”
“Socrates is a man.”
“Socrates is mortal.”
Abduction
Conclusion is one hypothetical (most probable) explanation for the premises
From A ⇒ B and B, we conclude A
Ex:
Drunk people do not walk straight.
John does not walk straight.
John is drunk.
Not sound… but it may be the most likely explanation for B
Induction
Conclusion about all members of a class from the examination of only a few members of the class.
From A ∧ C ⇒ B and A ∧ D ⇒ B, we conclude A⇒B
We construct a general explanation based on a specific case.
Ex:
All CS students in COMP 472 are smart.
All CS students on vacation are smart.
All CS students are smart.
Not sound
But it can be seen as hypothesis construction or generalisation
What is Inductive Learning?
= learning from examples
Most work in ML is inductive learning.
Examples (positive and/or negative) are given to train a system on a classification (or regression) task.
Given a new instance X you have never seen, you must produce an estimate of f(X), where f is the desired output function.
What is the framework for inductive learning?
Input data are represented by a vector of features (attributes), X
Each vector X is a list of (attribute, value) pairs.
Ex: X = [nose:big, teeth:big, eyes:big, moustache:no]
The number of attributes is fixed (positive, finite)
Each attribute has a fixed, finite number of possible values
Each example can be interpreted as a point in an n-dimensional feature space, where n is the number of attributes
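A minimal sketch of this representation in Python; the attribute names and values follow the example above, while the dict encoding and the alternative values in DOMAIN are invented for illustration:

```python
# One example = a fixed vector of (attribute, value) pairs.
X = {"nose": "big", "teeth": "big", "eyes": "big", "moustache": "no"}

# The attribute set is fixed and each attribute has finitely many values,
# so every example is a point in an n-dimensional feature space (n = 4 here).
DOMAIN = {
    "nose": {"big", "small"},
    "teeth": {"big", "small"},
    "eyes": {"big", "small"},
    "moustache": {"yes", "no"},
}
assert all(X[a] in values for a, values in DOMAIN.items())
```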
What are 3 common techniques in Machine Learning?
Probabilistic Methods
ex: Naïve Bayes Classifier
Decision Trees
Use only discriminating features as questions in a big if-then-else tree
Neural networks
Also called parallel distributed processing or connectionist systems
Intelligence arises from having a large number of simple computational units
How does a decision tree work?
Look for features that are very good indicators of the result; place these features (as questions) in the nodes of the tree
Split the examples so that those with different values for the chosen feature end up in different sets
Repeat the same process with another feature
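A compact sketch of this procedure in Python (an ID3-style recursion; the dict-based tree representation and helper names are my own, not from the lecture):

```python
import math
from collections import Counter

def entropy(labels):
    # H(S) = -sum_i p_i * log2(p_i) over the class proportions in labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attr):
    # H(S) minus the weighted entropy of the subsets produced by splitting on attr.
    n = len(labels)
    remainder = 0.0
    for v in {ex[attr] for ex in examples}:
        subset = [lab for ex, lab in zip(examples, labels) if ex[attr] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

def build_tree(examples, labels, attributes):
    # Leaf: all examples agree, or there are no attributes left to ask about.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Pick the most discriminating attribute, split, and recurse on each subset.
    best = max(attributes, key=lambda a: information_gain(examples, labels, a))
    tree = {}
    for v in {ex[best] for ex in examples}:
        keep = [i for i, ex in enumerate(examples) if ex[best] == v]
        tree[(best, v)] = build_tree(
            [examples[i] for i in keep],
            [labels[i] for i in keep],
            [a for a in attributes if a != best],
        )
    return tree
```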
How to select an attribute in a decision tree?
Search the space of all decision trees: always pick the next attribute to split the data on based on its “discriminating power” (information gain)
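This selection is the one greedy line in the build_tree sketch above; a tiny usage example, reusing its information_gain helper (the weather data is invented):

```python
examples = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["yes", "yes", "no", "no"]

# "outlook" separates the classes perfectly (gain 1.0), "windy" not at all
# (gain 0.0), so the greedy search asks about "outlook" first.
best = max(["outlook", "windy"],
           key=lambda a: information_gain(examples, labels, a))
assert best == "outlook"
```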
What are the 4 different factors that quantify the size of a tree?
Number of leaves
Height of the tree
External Path Length
Weighted External Path Length
What is the height of a tree?
The length of the longest path in the tree from the root to a leaf
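A small sketch, assuming the nested-dict tree representation from the build_tree sketch above (anything that is not a dict counts as a leaf):

```python
def height(tree):
    # A leaf has height 0; an internal node adds one edge to its tallest subtree.
    if not isinstance(tree, dict):
        return 0
    return 1 + max(height(child) for child in tree.values())

# Root with one leaf child and one deeper subtree: longest root-to-leaf path = 2 edges.
t = {"left": "leafA", "right": {"down": "leafB"}}
assert height(t) == 2
```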
What is External Path Length?
Start at a leaf, go up to the root, and count the number of edges
Do this for every leaf and add up the numbers
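The same procedure as a sketch (same nested-dict representation as above; each leaf contributes its depth, i.e. the number of edges back up to the root):

```python
def external_path_length(tree, depth=0):
    # A leaf contributes its depth; an internal node sums over its children.
    if not isinstance(tree, dict):
        return depth
    return sum(external_path_length(child, depth + 1) for child in tree.values())

# leafA is 1 edge from the root, leafB is 2 edges away: 1 + 2 = 3.
t = {"left": "leafA", "right": {"down": "leafB"}}
assert external_path_length(t) == 3
```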
What is weighted external path length?
Idea: not all paths are equally important/likely
Use the training data to compute a weighted sum
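A sketch of the weighted version; here a leaf is assumed to be a (label, count) pair, where count is the number of training examples that reach that leaf:

```python
def weighted_external_path_length(tree, depth=0):
    # Each leaf contributes depth * count, so frequent paths weigh more.
    if not isinstance(tree, dict):
        label, count = tree
        return depth * count
    return sum(weighted_external_path_length(child, depth + 1)
               for child in tree.values())

# 90 training examples reach the shallow leaf, only 10 reach the deep one.
t = {"left": ("yes", 90), "right": {"down": ("no", 10)}}
assert weighted_external_path_length(t) == 90 * 1 + 10 * 2
```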
What is the equation for entropy?
H(X) = -∑_{i=1}^{n} p(x_i) log₂ p(x_i)
where n is the number of possible outcomes
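A quick numeric check of the formula (a sketch; the probability lists are invented):

```python
import math

def H(probs):
    # H(X) = -sum_i p(x_i) * log2 p(x_i); terms with p = 0 contribute 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

assert H([0.5, 0.5]) == 1.0   # fair coin: maximally uncertain, 1 bit
assert H([1.0]) == 0.0        # certain outcome: 0 bits
print(H([0.9, 0.1]))          # skewed coin: ~0.47 bits
```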
Formula to choose the best feature
gain(S, A) = H(S) - H(S|A)
           = H(S) - ∑_{v ∈ Values(A)} (|S_v| / |S|) · H(S_v)
where S_v is the subset of S for which attribute A has value v
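A self-contained numeric check of this formula (toy data invented for illustration):

```python
import math
from collections import Counter

def H(labels):
    # Entropy of a multiset of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(pairs):
    # pairs: one (value of attribute A, class label) per example in S.
    labels = [lab for _, lab in pairs]
    n = len(pairs)
    remainder = 0.0
    for v in {val for val, _ in pairs}:
        S_v = [lab for val, lab in pairs if val == v]  # subset with A = v
        remainder += (len(S_v) / n) * H(S_v)           # (|S_v| / |S|) * H(S_v)
    return H(labels) - remainder                       # H(S) - H(S|A)

# A perfectly discriminating attribute removes all uncertainty: gain = H(S) = 1 bit.
S = [("sunny", "yes"), ("sunny", "yes"), ("rainy", "no"), ("rainy", "no")]
assert gain(S) == 1.0
```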