Lecture 11 - Decision Tree Induction Part 1 Flashcards
What is Induction?
Learning by generalizing from examples or experiences.
Induction can be contrasted with
Deduction
Repeatedly adding pairs of odd numbers and noticing the result is always even is an example of
induction
Constructing a mathematical proof that two odd numbers added will always be even is
deduction
What is a decision tree?
A tree in which each leaf is a decision
Each non-leaf is an attribute
Each branch is a value that the attribute parent can take
What is decision tree induction?
A procedure that uses a training set of classified data to build a decision tree intended to correctly predict the class of previously unseen, unclassified data
What is a training set, in decision tree induction?
Set of classified examples/samples
Very small fraction of population (usually)
What is a classified example/sample in decision tree induction?
Vector of attributes and values, and the class
e.g:
(Skin Covering = Feathers, Beak = Straight, Teeth = None, Class = Heron)
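One way such a classified example might be represented in code — a minimal sketch, assuming a plain dict of attribute–value pairs with the class label kept separately (the attribute names are taken from the heron example above):

```python
# A classified example: attribute-value pairs plus the class label.
example = {
    "Skin Covering": "Feathers",
    "Beak": "Straight",
    "Teeth": "None",
}
example_class = "Heron"
```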
What is the basic decision tree induction procedure?
Function buildDecTree(examples, atts)
  create node N if necessary
  if examples are all in same class, return N labelled with that class
  if atts is empty, return N labelled with modal example class
  bestAtt = chooseBestAtt(examples, atts)
  label N with bestAtt
  for each value ai of bestAtt
    si = subset of examples with bestAtt = ai
    if si is not empty then
      newAtts = atts - bestAtt
      subtree = buildDecTree(si, newAtts)
      attach subtree as child of N
    else
      create leaf node l
      label l with modal example class
      attach l as child of N
  return N
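The pseudocode above can be sketched in Python. This is an illustrative version, not the lecture's definitive implementation: the nested-dict tree representation is an assumption, and chooseBestAtt is passed in as a function (any scoring method, such as information gain, could be plugged in):

```python
from collections import Counter

def modal_class(examples):
    # Most common class label among the examples
    return Counter(cls for _, cls in examples).most_common(1)[0][0]

def build_dec_tree(examples, atts, choose_best_att):
    # examples: list of (attribute_dict, class_label) pairs
    # atts: set of attribute names still available for splitting
    classes = {cls for _, cls in examples}
    if len(classes) == 1:              # all examples in the same class
        return classes.pop()           # leaf labelled with that class
    if not atts:                       # no attributes left to split on
        return modal_class(examples)   # leaf labelled with modal class
    best = choose_best_att(examples, atts)
    node = {best: {}}                  # non-leaf labelled with bestAtt
    # This loop visits only values that occur in the examples, so every
    # subset is non-empty; the pseudocode's "else" branch is only needed
    # when iterating over the attribute's full domain of values.
    for value in {attrs[best] for attrs, _ in examples}:
        subset = [(a, c) for a, c in examples if a[best] == value]
        node[best][value] = build_dec_tree(subset, atts - {best},
                                           choose_best_att)
    return node
```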
What is the “best attribute” in decision tree induction?
The attribute that best discriminates the examples with respect to their classes
What is the standard way of discriminating examples in decision tree induction?
Information gain
What is Shannon’s Information Function?
A function that gives the number of bits of information gained by learning an outcome
Write the formula of Shannon’s Information Function for equiprobable outcomes
Information = log2(N)
where N is the number of possible outcomes
or Information = -log2(p)
where p is the probability of any of the equiprobable outcomes
What is the formula of Shannon’s Information Function for non-equiprobable outcomes?
Information = -Σ pi*log2(pi)
summed over all possible outcomes i, where pi is the probability of outcome i
Information is also sometimes known as ________ or ________
Uncertainty
Entropy
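Shannon's information function (uncertainty/entropy) can be sketched in a few lines of Python — a minimal version, assuming the probabilities sum to 1:

```python
from math import log2

def entropy(probs):
    # Shannon information / uncertainty / entropy in bits:
    # -sum(p * log2(p)); zero-probability outcomes contribute nothing.
    return -sum(p * log2(p) for p in probs if p > 0)

# For N equiprobable outcomes this reduces to log2(N),
# e.g. entropy([1/8] * 8) gives 3.0 bits = log2(8).
```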
With the training set
- | red | blue
class1 | 63 | 7
class2 | 6 | 24
Calculate the information gain from knowing the color
(class1 appears 70 times and class2 30 times out of 100 examples, so the prior class probabilities are 0.70 and 0.30; 69 examples are red and 31 are blue)
uncertainty_nocolor = -0.70*log2(0.70) - 0.30*log2(0.30) = 0.881
uncertainty_red = -(63/69)*log2(63/69) - (6/69)*log2(6/69) = 0.426
uncertainty_blue = -(7/31)*log2(7/31) - (24/31)*log2(24/31) = 0.771
uncertainty_color = 0.69*0.426 + 0.31*0.771 = 0.533
informationgain_color = 0.881 - 0.533 = 0.348
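This calculation can be checked numerically — a quick sketch (note that the base uncertainty uses the class proportions, 70/100 and 30/100, while the weighting of the per-colour uncertainties uses the colour proportions, 69/100 and 31/100):

```python
from math import log2

def entropy(probs):
    # -sum(p * log2(p)) over the outcome probabilities
    return -sum(p * log2(p) for p in probs if p > 0)

# counts:          red   blue   total
# class1            63      7      70
# class2             6     24      30
base    = entropy([70/100, 30/100])      # uncertainty with no colour known
u_red   = entropy([63/69, 6/69])         # uncertainty given red
u_blue  = entropy([7/31, 24/31])         # uncertainty given blue
u_color = 0.69 * u_red + 0.31 * u_blue   # expected uncertainty given colour
gain    = base - u_color                 # information gain from colour
```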