Week 3 (Trees and SVMs) Flashcards

1
Q

What is a decision tree, and how does it divide space?

A
2
Q

How are trees structured?

A
3
Q

How is the tree structure learned?

A
4
Q

How are trees pruned?

A
5
Q

What does a kernel function measure?

A

Similarity between data points
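
A minimal sketch of the "similarity" idea, assuming the RBF (Gaussian) kernel that a later card names as the most popular choice. The function name `rbf_kernel`, the `gamma` value, and the sample points are illustrative assumptions, not taken from the course material.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

a = [0.0, 0.0]
near = [0.1, 0.0]   # hypothetical point close to a
far = [3.0, 4.0]    # hypothetical point far from a

k_same = rbf_kernel(a, a)     # identical points give the maximum similarity, 1
k_near = rbf_kernel(a, near)  # nearby points give a value just below 1
k_far = rbf_kernel(a, far)    # distant points give a value close to 0
```

With this kernel the value shrinks towards 0 as the points move apart, which is why it can be read as a similarity score.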

6
Q

What does regularised linear regression look like?

A
7
Q

What is the design matrix, and what are dual parameters?

A

(Dual parameters represent how important each training data point is to the decision.)

8
Q

What is a linear kernel?

A
9
Q

What is the kernel trick?

A

Computing the kernel without needing to compute the underlying feature map
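
As a small illustration, the sketch below checks the kernel trick on one example chosen for this purpose: the quadratic kernel k(x, z) = (x·z)² in two dimensions, whose feature map is φ(x) = (x₁², √2·x₁x₂, x₂²). The function names and test points are assumptions made for the example.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(x):
    """Explicit feature map for the 2-D quadratic kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def quadratic_kernel(x, z):
    """Same quantity, computed without ever building phi(x) or phi(z)."""
    return dot(x, z) ** 2

x = [1.0, 2.0]
z = [3.0, -1.0]

explicit = dot(phi(x), phi(z))     # inner product in feature space
implicit = quadratic_kernel(x, z)  # kernel evaluated in input space
```

Both routes give the same number (up to floating-point rounding), but the kernel route only needs a dot product in the original space. For kernels such as the RBF, the feature space is infinite-dimensional, so the explicit route is not available at all.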

10
Q

Why don’t we need all the dual parameters?

A

Dual parameters equal to zero (common in classification) contribute nothing to the prediction, so the corresponding data points can be dropped.
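
A minimal sketch of why zero-valued dual parameters can be discarded, assuming a prediction of the form Σ aₙ·k(x, xₙ) and a linear kernel. The data points and dual values are made up for illustration.

```python
def linear_kernel(x, z):
    return sum(xi * zi for xi, zi in zip(x, z))

def dual_score(x_new, points, duals, kernel):
    """Prediction as a weighted sum of kernel evaluations."""
    return sum(a * kernel(x_new, x_n) for a, x_n in zip(duals, points))

points = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]]
duals = [0.7, 0.0, -0.3, 0.0]  # made-up values; two of them are zero

# Keep only the points whose dual parameter is non-zero.
kept = [(a, x_n) for a, x_n in zip(duals, points) if a != 0.0]
kept_duals = [a for a, _ in kept]
kept_points = [x_n for _, x_n in kept]

x_new = [0.5, 1.5]
full = dual_score(x_new, points, duals, linear_kernel)
sparse = dual_score(x_new, kept_points, kept_duals, linear_kernel)
```

The two scores agree, so only the points with non-zero dual parameters need to be stored at prediction time.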

11
Q

What is the Gram matrix?

A
12
Q

What is the closed-form solution to compute the dual parameters?

A

(Note: t is the vector of labels.)
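
The formula on this card did not survive. Assuming it refers to the standard kernel ridge regression result, the closed form would read as below, where K is the Gram matrix with entries K_nm = k(x_n, x_m), I_N is the N×N identity, and λ is the regularisation strength (the symbol λ is an assumption here):

```latex
\mathbf{a} = (\mathbf{K} + \lambda \mathbf{I}_N)^{-1}\,\mathbf{t},
\qquad
y(\mathbf{x}) = \sum_{n=1}^{N} a_n \, k(\mathbf{x}, \mathbf{x}_n)
```

The second expression shows how the dual parameters are then used to predict for a new input x.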

13
Q

How are kernels used for regression?

A
14
Q

How do maximum margin classifiers work?

A
15
Q

Why does summing over the kernels with a new data point help classify it?

A

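Each training point contributes its label (+1 or -1), weighted by its dual parameter a and by its kernel similarity to the new point. The sum is therefore a weighted vote: the new point is assigned to whichever class it is closer to.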
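
The voting picture can be sketched as follows, assuming a decision rule of the form sign(Σ aₙ·tₙ·k(x, xₙ) + b) with an RBF kernel. The training points, the dual values, and the zero bias are made up for illustration rather than learned.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def classify(x_new, points, labels, duals, bias=0.0):
    """Each training point votes with its label, scaled by a_n and by similarity."""
    score = bias + sum(
        a * t * rbf_kernel(x_new, x_n)
        for a, t, x_n in zip(duals, labels, points)
    )
    return 1 if score >= 0 else -1

points = [[0.0, 0.0], [0.5, 0.0], [4.0, 4.0], [4.5, 4.0]]
labels = [+1, +1, -1, -1]
duals = [1.0, 1.0, 1.0, 1.0]  # hypothetical weights, not the result of training

pred_near_pos = classify([0.2, 0.1], points, labels, duals)  # near the +1 cluster
pred_near_neg = classify([4.2, 3.9], points, labels, duals)  # near the -1 cluster
```

Points far from the query get a kernel value near zero, so their votes barely count; the nearby cluster decides the sign.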

16
Q

What is the dual representation for maximum margin classifiers?

A
17
Q

What is the most popular kernel for SVMs?

A

RBF (radial basis function, also called the Gaussian kernel)

18
Q

What are a soft margin and a slack variable for SVMs?

A
19
Q

What does this equation mean for SVMs?

A

wᵀw is the margin term (minimising it maximises the margin), and ξ (xi) denotes the slack variables. The "subject to" constraints define the relationship between the slack variables and the predicted values.
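
The equation itself is missing from this card. As a sketch only, the code below assumes the common soft-margin form: minimise ½·wᵀw + C·Σξₙ subject to tₙ·y(xₙ) ≥ 1 − ξₙ and ξₙ ≥ 0, with y(x) = wᵀx + b. Under that assumption the smallest feasible slack for each point is max(0, 1 − tₙ·y(xₙ)). The weights, points, and value of C are made up.

```python
def decision(w, b, x):
    """Linear decision value y(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def slack(w, b, x, t):
    """Smallest slack satisfying t * y(x) >= 1 - slack and slack >= 0."""
    return max(0.0, 1.0 - t * decision(w, b, x))

def soft_margin_objective(w, b, points, labels, C):
    margin_term = 0.5 * sum(wi * wi for wi in w)  # small w^T w means a wide margin
    slack_term = sum(slack(w, b, x, t) for x, t in zip(points, labels))
    return margin_term + C * slack_term

w, b = [1.0, 0.0], 0.0  # hypothetical weights, not fitted to the data
points = [[2.0, 0.0], [0.5, 1.0], [-1.0, 0.0], [0.5, -1.0]]
labels = [+1, +1, -1, -1]

slacks = [slack(w, b, x, t) for x, t in zip(points, labels)]
objective = soft_margin_objective(w, b, points, labels, C=1.0)
```

A slack of 0 means the point is on or outside the margin, a value between 0 and 1 means it is inside the margin but correctly classified, and a value above 1 means it is misclassified.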

20
Q

What is the C parameter for in SVMs?

A
21
Q

How do SVMs perform multi-class classification?

A