Support Vector Machines Flashcards

1
Q

Define a linear classifier margin

A

The width by which the boundary could be increased before touching a datapoint

2
Q

What is the margin of the Linear SVM

A

The maximum margin: the linear SVM chooses the separating boundary whose margin is as wide as possible

3
Q

What are support vectors

A

The datapoints that the margin pushes up against

4
Q

Advantages of Maximum Margin

A
  • If the boundary is marginally misplaced, this gives us the least chance of misclassification
  • Empirically, this works very well
  • The model is immune to removal of any non-support-vector data points
5
Q

The hyperplane w ⋅ x + b = 0 is fully determined by what?

A

(w, b)

w = weight vector, b = bias term

6
Q

w is perpendicular to the plus and minus planes, how does this help us calculate the margin width?

A

Since w is perpendicular to both planes, a point x- on the minus plane reaches a point x+ on the plus plane by moving some multiple λ of w:

w ⋅ x+ + b = +1
w ⋅ x- + b = -1
x+ = x- + λw

Substituting the third equation into the first gives -1 + λ(w ⋅ w) = +1, so λ = 2 / ||w||².
The margin width is then M = |x+ - x-| = |λ| ||w|| = 2 / ||w||
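A quick numeric check of this result (a sketch using scikit-learn's SVC; the toy dataset is chosen here so the true margin is known to be 2):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes, 2 units apart along the first axis
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# Large C approximates a hard margin (no slack is used)
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]                 # weight vector of the separating hyperplane
margin = 2 / np.linalg.norm(w)   # M = 2 / ||w||
print(margin)                    # ~2.0: the plus/minus planes touch x1 = 2 and x1 = 0
```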

7
Q

Problems with Maximum Margin

A
  • The solution can change drastically if there is an outlier
  • No solution if the classes are not linearly separable
8
Q

What is the general idea of the Soft Margin SVM

A
  • “Relax” the formulation to allow points to be on the “wrong” side
  • Penalize points according to how far they are on the wrong side (see the sketch below)
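A minimal sketch of the objective this idea leads to (the function name and setup are illustrative, not from the deck; this is the quantity a soft-margin SVM minimises, not a solver):

```python
import numpy as np

# Soft-margin objective: ||w||^2 / 2  +  C * sum of slacks, where a point on
# the wrong side of its margin contributes slack = max(0, 1 - y * (w.x + b)).
def soft_margin_objective(w, b, X, y, C=1.0):
    slacks = np.maximum(0.0, 1.0 - y * (X @ w + b))  # zero for well-classified points
    return 0.5 * (w @ w) + C * slacks.sum()
```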
9
Q

How well does Soft Margin SVM do on unseen data?

A
  • Depends on the training error and the number of support vectors
  • When the number of support vectors is small, we can be sure that the generalization error is not much higher than the training error
10
Q

What is a vector

A

an object that has both a magnitude and a direction

11
Q

what is a vector’s norm

A

the magnitude, or length, of a vector.
Denoted ||x||

12
Q

What is a vector norm and How is it computed

A

It is the magnitude, or length, of a vector.
Calculated using the Euclidean norm formula:
||x|| = √(x1² + x2² + … + xn²), the square root of the sum of the squared components
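For example (a quick Python sketch):

```python
import math

x = [3.0, 4.0]
norm = math.sqrt(sum(c * c for c in x))  # sqrt(3^2 + 4^2)
print(norm)  # 5.0
```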

13
Q

Give the vector that denotes the direction of a vector

A

w = (cos(θ), cos(α)), where θ and α are the angles the vector makes with the coordinate axes (its direction cosines). This is the unit vector x / ||x||.
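For example, for x = (3, 4) the direction is (3/5, 4/5) (a quick sketch):

```python
import math

x = (3.0, 4.0)
norm = math.hypot(x[0], x[1])        # ||x|| = 5.0
w = (x[0] / norm, x[1] / norm)       # (cos θ, cos α) = (0.6, 0.8)
print(w, math.hypot(w[0], w[1]))     # the direction vector has norm 1
```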

14
Q

Dot product of n-dimensional vectors (formula)

A

x ⋅ y = SUM(i=1, n) xiyi — the sum over all n components of the products of corresponding entries
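For example (a quick sketch):

```python
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
dot = sum(xi * yi for xi, yi in zip(x, y))  # 1*4 + 2*5 + 3*6
print(dot)  # 32.0
```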

15
Q

What separates data in a) one dimension b) two dimensions c) three dimensions

A

a) a point
b) a line
c) a plane

16
Q

Describe the kernel trick used in SVM and what is the mathematical reasoning that makes it work?

A

Implicitly applying a transformation φ to all data points before running the algorithm, so that non-linear patterns in the original space become linear in the transformed space.
A function K(⋅, ⋅) is a kernel if there exists a function φ(⋅) s.t.
K(xi, xj) = φ(xi) ⋅ φ(xj)
so the algorithm can work with K alone and never has to compute φ explicitly
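A small check of this identity for the degree-2 polynomial kernel (a sketch; this φ is the standard expansion for 2-D inputs):

```python
import numpy as np

# K(x, z) = (x . z)^2 equals the dot product in the feature space
# phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
def phi(v):
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

lhs = (x @ z) ** 2       # kernel value, computed without ever forming phi
rhs = phi(x) @ phi(z)    # explicit feature-space dot product
print(lhs, rhs)          # both 121.0
```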

17
Q

Why is the kernel trick important?

A

We get the advantage of using a lot of non-linear features without paying the computational price of constructing them explicitly

18
Q

What do SVMs learn and what approach is this called

A

SVMs learn the discrimination boundary directly
This is called a discriminative approach

19
Q

What are the 3 key ideas of SVMs

A
  1. Use optimisation to find a solution with few errors
  2. Seek a large-margin separator to improve generalisation
  3. Use the kernel trick to make large feature spaces computationally efficient