Lecture 11 - Decision Tree Induction Part 1 Flashcards

1
Q

What is Induction?

A

Learning by generalizing from examples or experiences.

2
Q

Induction can be contrasted with

A

Deduction

3
Q

Repeatedly adding pairs of odd numbers and noticing the result is always even is an example of

A

induction

4
Q

Constructing a mathematical proof that two odd numbers added will always be even is

A

deduction

5
Q

What is a decision tree?

A

A tree in which each leaf is a decision

Each non-leaf is an attribute

Each branch is a value that its parent node's attribute can take
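
As a hypothetical illustration in Python dict form (the Feathers/Straight/Heron entries echo the example on card 8; the Hooked/Hawk and Fur/Mammal entries are invented purely for illustration):

# Non-leaves name an attribute, branches are the values that
# attribute can take, and leaves hold the final decision (class).
tree = {
    "attribute": "Skin Covering",
    "children": {
        "Feathers": {
            "attribute": "Beak",
            "children": {
                "Straight": {"leaf": "Heron"},  # from card 8's example
                "Hooked":   {"leaf": "Hawk"},   # invented for illustration
            },
        },
        "Fur": {"leaf": "Mammal"},              # invented for illustration
    },
}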

6
Q

What is decision tree induction?

A

A procedure that attempts to build, from a training set of classified examples, a decision tree that will correctly predict the class of any unclassified data

7
Q

What is a training set, in decision tree induction?

A

Set of classified examples/samples

Very small fraction of population (usually)

8
Q

What is a classified example/sample in decision tree induction?

A

A vector of attribute/value pairs, together with the class

e.g:

(Skin Covering = Feathers, Beak = Straight, Teeth = None, Class = Heron)

9
Q

What is the basic decision tree induction procedure?

A

Function buildDecTree(examples, atts)
    create node N if necessary
    if examples are all in the same class then
        return N labelled with that class
    if atts is empty then
        return N labelled with the modal example class
    bestAtt = chooseBestAtt(examples, atts)
    label N with bestAtt
    for each value ai of bestAtt
        si = subset of examples with bestAtt = ai
        if si is not empty then
            newAtts = atts - bestAtt
            subtree = buildDecTree(si, newAtts)
            attach subtree as child of N
        else
            create leaf node l
            label l with the modal example class
            attach l as child of N
    return N
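
A minimal runnable sketch of this procedure in Python (assumptions, not from the lecture: examples are dicts with a "Class" key, atts maps each unused attribute to its set of possible values, and trees are nested dicts):

from collections import Counter

def modal_class(examples):
    # Most common ("modal") class among the examples
    return Counter(ex["Class"] for ex in examples).most_common(1)[0][0]

def build_dec_tree(examples, atts, choose_best_att):
    classes = {ex["Class"] for ex in examples}
    if len(classes) == 1:                    # all examples in the same class
        return {"leaf": classes.pop()}
    if not atts:                             # no attributes left to split on
        return {"leaf": modal_class(examples)}
    best = choose_best_att(examples, atts)   # e.g. highest information gain
    node = {"attribute": best, "children": {}}
    for value in atts[best]:                 # one branch per possible value
        subset = [ex for ex in examples if ex[best] == value]
        rest = {a: vs for a, vs in atts.items() if a != best}
        if subset:
            node["children"][value] = build_dec_tree(subset, rest, choose_best_att)
        else:                                # empty subset: modal-class leaf
            node["children"][value] = {"leaf": modal_class(examples)}
    return node

chooseBestAtt is left as a parameter here because the following cards cover how it is chosen.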

10
Q

What is the “best attribute” in decision tree induction?

A

The attribute that best discriminates the examples with respect to their classes

11
Q

What is the standard way of discriminating examples in decision tree induction?

A

Information gain

12
Q

What is Shannon’s Information Function?

Write the formula for information with equiprobable outcomes

A

A function that gives the number of bits of information gained by learning an outcome

Information = log2(N)

where N is the number of possible outcomes

or Information = -log2(p)

where p is the probability of any of the equiprobable outcomes
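
A quick Python check that the two forms agree (N = 8 is an arbitrary example):

from math import log2

# With N equiprobable outcomes, each has probability p = 1/N,
# so log2(N) and -log2(p) give the same number of bits.
N = 8
p = 1 / N
print(log2(N), -log2(p))   # 3.0 3.0 -> a fair 8-way choice carries 3 bits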

13
Q

What is the formula of Shannon’s Information Function for non-equiprobable outcomes?

A

Information = -Σ pi*log2(pi)

where pi is the probability of outcome i, and the sum runs over all possible outcomes
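
A minimal Python version of this function (the name information and the sample probabilities are illustrative):

from math import log2

def information(probs):
    # Shannon information/entropy for outcome probabilities summing to 1
    return -sum(p * log2(p) for p in probs if p > 0)

print(information([0.5, 0.5]))  # 1.0 bit (fair coin)
print(information([0.9, 0.1]))  # ~0.469 bits (a biased coin is less uncertain)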
14
Q

Information is also sometimes known as ________ or ________

A

Uncertainty

Entropy

15
Q

With the training set

       | red | blue
class1 |  63 |    7
class2 |   6 |   24

Calculate the information gain from knowing the colour

A

uncertainty_nocolour = -0.70*log2(0.70) - 0.30*log2(0.30) = 0.881

(without colour information the class split is 70 vs 30 out of 100 examples)

uncertainty_red = -(63/69)*log2(63/69) - (6/69)*log2(6/69) = 0.426

uncertainty_blue = -(7/31)*log2(7/31) - (24/31)*log2(24/31) = 0.771

uncertainty_colour = 0.69*0.426 + 0.31*0.771 = 0.533

(0.69 and 0.31 are P(red) and P(blue))

informationgain_colour = 0.881 - 0.533 = 0.348
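
A short Python check of this working (the entropy helper is written from the formula on card 13; the names are illustrative):

from math import log2

def entropy(counts):
    # Shannon entropy of a class distribution given as raw counts
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

h_prior  = entropy([70, 30])             # class counts, colour unknown
h_red    = entropy([63, 6])              # classes among the 69 red examples
h_blue   = entropy([7, 24])              # classes among the 31 blue examples
h_colour = 0.69 * h_red + 0.31 * h_blue  # weighted by P(red), P(blue)
print(round(h_prior, 3))                 # 0.881
print(round(h_colour, 3))                # 0.533
print(round(h_prior - h_colour, 3))      # 0.348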
