Lecture 3 Notes Flashcards

1
Q

What game is a Decision Tree similar to?

A

Twenty Questions

2
Q

What type of problem does a Decision Tree algorithm solve?

A

Classification

3
Q

What does each non-leaf node in a Decision Tree represent?

A

A query about some feature

4
Q

What happens when a leaf node is reached in a Decision Tree?

A

The leaf holds a label, which is assigned to the sample that reaches it

5
Q

What is the average number of questions in a Decision Tree?

A

The expected path length: ∑ over all leaves of (number of questions to reach the leaf) × (probability of reaching that leaf)
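The weighted sum can be sketched as follows; the leaf depths and probabilities here are made up for illustration:

```python
# A hypothetical tree: (depth = number of questions to reach the leaf, probability).
leaves = [
    (1, 0.50),
    (2, 0.25),
    (3, 0.25),
]
# Average number of questions = sum of depths weighted by leaf probabilities.
expected_questions = sum(depth * p for depth, p in leaves)
print(expected_questions)  # 1*0.50 + 2*0.25 + 3*0.25 = 1.75
```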

6
Q

What is a common issue with large Decision Trees?

A

They tend to overfit; smaller trees generalize better

7
Q

What should be done if all samples at a node in a Decision Tree have the same label?

A

Make a leaf node with that label

8
Q

What should be done if no samples match a branch, or all features are exhausted, in a Decision Tree?

A

Use the default case: the most common label

9
Q

What is the Smart Choose function in Decision Trees?

A

Look for a question (feature split) under which a large proportion of the samples fall into homogeneous groups

10
Q

How is impurity measured in a Decision Tree?

A

Given n different kinds of labels and a set of samples X, partition X by label into X₁, …, Xₙ; impurity is then computed from the resulting label proportions

11
Q

What is the probability of sampling for a label in Decision Trees?

A

P(k) = |X_k| / |X| for each label k
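A minimal sketch of computing P(k) for each label; the example labels are made up:

```python
from collections import Counter

def label_probabilities(labels):
    """P(k) = |X_k| / |X| for each label k in the sample set X."""
    counts = Counter(labels)
    n = len(labels)
    return {k: c / n for k, c in counts.items()}

print(label_probabilities(["yes", "yes", "no", "yes"]))  # {'yes': 0.75, 'no': 0.25}
```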

12
Q

What is the Remainder in the context of Decision Trees?

A

The amount of uncertainty left in the data set after splitting: the impurity of each part, weighted by the fraction of samples falling into it

13
Q

What can metric attributes in Decision Trees ask about?

A

Whether a value is higher or lower than a threshold. Splitting X into L (lower) and G (greater) gives the remainder:
(|L|/|X|) · Impurity(L) + (|G|/|X|) · Impurity(G)
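A sketch of the threshold-split remainder, here using Gini as the impurity function (the lecture's Impurity could equally be entropy; the data is illustrative):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared label proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def threshold_remainder(values, labels, threshold):
    """Remainder of a numeric split: (|L|/|X|)*Impurity(L) + (|G|/|X|)*Impurity(G)."""
    L = [y for x, y in zip(values, labels) if x <= threshold]
    G = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    rem = 0.0
    if L:
        rem += (len(L) / n) * gini(L)
    if G:
        rem += (len(G) / n) * gini(G)
    return rem

# A perfect threshold: values below 5 are all "a", above are all "b".
print(threshold_remainder([1, 2, 8, 9], ["a", "a", "b", "b"], 5))  # 0.0
```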

14
Q

What is the bias in Decision Trees regarding space?

A

A bias toward axis-aligned (rectangular) decision regions, since each question splits on a single feature

15
Q

Impurity & Gini function

A

Information Entropy: − ∑ P(i) log₂ P(i)
Gini: 1 − ∑ P(i)²
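Both impurity functions as a short sketch, computed from label proportions:

```python
import math
from collections import Counter

def entropy(labels):
    """Information entropy: -sum of P(i) * log2 P(i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: 1 - sum of P(i)^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# A maximally impure two-class set: entropy 1.0, Gini 0.5.
print(entropy(["a", "b"]))  # 1.0
print(gini(["a", "b"]))     # 0.5
```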

16
Q

Information Gain, formula and what is it

A

Gain = Impurity(whole set) − Remainder. It measures how good the split was; higher is better.
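A sketch of information gain using entropy as the impurity function, on a made-up perfect split:

```python
import math
from collections import Counter

def entropy(labels):
    """Information entropy: -sum of P(i) * log2 P(i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, parts):
    """Gain = Impurity(whole set) - Remainder (impurity of parts, weighted by size)."""
    n = len(labels)
    remainder = sum((len(part) / n) * entropy(part) for part in parts)
    return entropy(labels) - remainder

# A perfect split removes all uncertainty, so the gain equals the whole set's entropy.
labels = ["a", "a", "b", "b"]
print(information_gain(labels, [["a", "a"], ["b", "b"]]))  # 1.0
```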