Support Vector Machines Flashcards
How do you decide which resultant classification is best? Why does this occur?
There are multiple possible separating boundaries, each corresponding to a local minimum, but one is often better than the others because it:
> Correctly classifies all points
> Is as far away from all points as possible
[Picture 9]
What are the classes that we use in support vector machines?
-1 and 1
How can we express that a point is correctly classified (using an equation)?
t(w^Tx + w_0) ≥ 0
What happens when t(w^Tx + w_0) = 0 ?
The point lies exactly on the decision boundary, so the classifier is undecided. This situation should be avoided
How can we express that all points are correctly classified and none are on the decision boundary? How can this equation be transformed?
t(w^Tx + w_0) ≥ ε
ε > 0
We can divide both sides by ε, which rescales the right-hand side to any positive constant we like (for example, 1)
t(w^Tx + w_0) ≥ 1
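A quick numeric check of the rescaling idea (a sketch with made-up toy values): dividing w and w_0 by ε scales the constraint value up to ≥ 1 but leaves the sign, and therefore the classification, unchanged.

```python
import numpy as np

# Toy values (made up for illustration)
w = np.array([2.0, -1.0])
w0 = 0.5
x = np.array([1.0, 0.0])
t = 1          # class label
eps = 0.5      # original constraint: t(w^T x + w_0) >= eps

margin_before = t * (w @ x + w0)      # value before rescaling
w_s, w0_s = w / eps, w0 / eps         # rescale the weights by 1/eps
margin_after = t * (w_s @ x + w0_s)   # now satisfies >= 1 form

# The decision boundary is unchanged: the sign is the same
assert np.sign(w @ x + w0) == np.sign(w_s @ x + w0_s)
```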
Why can the equations: t(w^Tx + w_0) ≥ ε ε > 0 become: t(w^Tx + w_0) ≥ p with p being any positive number?
Because we can rescale the weights (divide w and w_0 by ε) to make the right-hand side whatever we want
What is the margin?
The distance between the closest point to the separation boundary and the boundary itself
[Picture 10]
What do we want from the margin?
We want to maximise it to get the best classification
How do we calculate the distance of points to the hyperplane? How do we apply this to the margin?
x = x_p + d × w/||w||   (x_p is the projection of x onto the hyperplane)
w^Tx + w_0 = d||w||   (since w^Tx_p + w_0 = 0)
|d| = t(w^Tx + w_0) / ||w||   (for a correctly classified point)
The margin is the smallest |d| over all points
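The distance formula above is easy to check numerically. A sketch with a made-up hyperplane and toy points: the signed distance is (w^T x + w_0)/||w||, and the margin is the smallest t·d over the data set.

```python
import numpy as np

# Toy hyperplane: w^T x + w_0 = 0 with ||w|| = 5 (values made up)
w = np.array([3.0, 4.0])
w0 = -5.0
X = np.array([[3.0, 4.0], [2.0, 1.0], [0.0, 0.0]])
t = np.array([1, 1, -1])

# Signed distance of each point to the hyperplane
d = (X @ w + w0) / np.linalg.norm(w)

# t * d > 0 for correctly classified points; the margin is the minimum
margin = np.min(t * d)
```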
Mathematically, how do we maximise the margin?
By minimising ||w||
|d| ≥ 1/||w||, with equality for the closest points, so the margin is 1/||w||: a smaller ||w|| gives a larger margin
What is the support vector formulation?
Minimise: 0.5||w||^2
(making the margin as large as possible)
Subject to the constraints: t_i(w^Tx_i + w_0) ≥ 1
(every point is on the correct side and no point is on the hyperplane; this must hold for every single point)
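The formulation above is a small quadratic programme. A hedged sketch, solving it with SciPy's general-purpose SLSQP optimiser rather than a dedicated QP solver, on a made-up linearly separable toy data set:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (made up for illustration)
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
t = np.array([1, 1, -1, -1])

def objective(v):
    # v packs [w_1, w_2, w_0]; minimise 0.5 ||w||^2
    w = v[:2]
    return 0.5 * w @ w

# One inequality constraint per point: t_i (w^T x_i + w_0) - 1 >= 0
constraints = [{"type": "ineq",
                "fun": lambda v, i=i: t[i] * (v[:2] @ X[i] + v[2]) - 1.0}
               for i in range(len(X))]

res = minimize(objective, x0=np.zeros(3), constraints=constraints)
w, w0 = res.x[:2], res.x[2]
# Every constraint t_i(w^T x_i + w_0) >= 1 now holds (up to tolerance)
```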
Which points matter for SVM?
Only the points closest to the boundary (the support vectors); removing any other point does not change the solution
What is a data set limitation of the support vector formulation? Why is this a problem? What can be done?
The formulation only has a solution if the data set is exactly linearly separable; with noise in the data set, no separating hyperplane may exist at all. We can allow exceptions, but we want as few as possible
What is it called when you allow some data points to be wrongly classified?
Slack
What is the equation for including slack in SVM? What is the additional constraint?
t_i (w^Tx_i + w_0 ) ≥ 1 - ξ_i
ξ_i ≥ 0
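The smallest slack satisfying both constraints is ξ_i = max(0, 1 − t_i(w^T x_i + w_0)), the hinge loss. A sketch with made-up numbers: points well inside the margin get ξ_i = 0, and a misclassified point gets ξ_i > 1.

```python
import numpy as np

# Toy hyperplane and points (made up for illustration)
w = np.array([1.0, 1.0])
w0 = -2.0
X = np.array([[2.0, 2.0], [0.0, 0.0], [1.5, 1.0]])
t = np.array([1, -1, -1])   # the last point is on the wrong side

scores = t * (X @ w + w0)
# Smallest slack satisfying t_i(w^T x_i + w_0) >= 1 - xi_i and xi_i >= 0
xi = np.maximum(0.0, 1.0 - scores)
```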