Lecture 12 - Support Vector Machines & Kernel methods Flashcards

1
Q

What is a Support Vector Machine (SVM)?

A

An SVM is a supervised machine learning model used for classification and regression tasks. It finds the hyperplane that best separates the classes by maximizing the margin, i.e. the distance from the hyperplane to the closest data points of each class (the support vectors).
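A minimal sketch of this in scikit-learn (assuming made-up toy data; SVC and its parameters are the standard scikit-learn API):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D toy data with labels -1 / +1
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear")   # linear SVM: finds the maximum-margin hyperplane
clf.fit(X, y)

print(clf.predict([[1.5, 2.5]]))   # predicted class for a new point
```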

2
Q

How does SVM find the optimal separating hyperplane?

A

It maximizes the margin (γ) between the closest data points of each class while satisfying r_t (w^T x_t + w_0) ≥ 1 for all training points t.
Objective: Minimize (1/2) ||w||^2.
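This is a quadratic program. A minimal sketch of the hard-margin primal using the cvxpy solver library (an assumption; the lecture does not prescribe a solver) on hypothetical toy data:

```python
import numpy as np
import cvxpy as cp

# Hypothetical linearly separable toy data with labels r_t in {-1, +1}
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.5]])
r = np.array([1, 1, -1, -1])

w = cp.Variable(2)
w0 = cp.Variable()

# Constraints: r_t (w^T x_t + w_0) >= 1 for every training point
constraints = [r[t] * (X[t] @ w + w0) >= 1 for t in range(len(r))]

# Objective: minimize (1/2) ||w||^2
problem = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints)
problem.solve()

print("w =", w.value, "w0 =", w0.value)
```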

3
Q

What are support vectors?

A

Support vectors are the data points closest to the hyperplane. They define the margin and are critical for determining the decision boundary.

Mnemonic: β€œSupport Vectors Support the Hyperplane!”
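With scikit-learn, the support vectors of a fitted model can be inspected directly (a sketch on hypothetical toy data; the attributes shown are standard SVC attributes):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear").fit(X, y)

print(clf.support_vectors_)   # the points that define the margin
print(clf.support_)           # their indices in the training set
print(clf.n_support_)         # number of support vectors per class
```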

4
Q

How does SVM handle non-linearly separable data?

A

SVM uses a soft margin: slack variables (ξ_t) are introduced so that some points may violate the margin or be misclassified.

L_p = (1/2) ||w||^2 + C Σ_t ξ_t
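A short sketch (assuming scikit-learn and hypothetical overlapping toy data) that fits a soft-margin SVM and recovers the slack values ξ_t from the decision function:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, slightly overlapping toy data with labels r_t in {-1, +1}
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.5, 0.4],
              [-1.0, -1.0], [-2.0, -1.5], [0.2, 0.3]])
r = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0).fit(X, r)   # C weights the slack penalty term

# xi_t = max(0, 1 - r_t (w^T x_t + w_0)); xi_t > 0 means the point violates the margin
xi = np.maximum(0.0, 1.0 - r * clf.decision_function(X))
print(xi)
```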

5
Q

What is the kernel trick?

A

The kernel trick implicitly maps the data into a higher-dimensional space where it can become linearly separable: a kernel function K(x, y) returns the inner product of the mapped points directly, so the transformation itself is never computed explicitly.
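A quick numpy check of the idea (a sketch with arbitrary vectors): the quadratic kernel (x^T y)^2 equals the inner product of the explicitly mapped points φ(x) = (x_1^2, x_2^2, √2·x_1·x_2), so the mapping never has to be carried out:

```python
import numpy as np

def phi(v):
    # Explicit feature map corresponding to the quadratic kernel in 2-D
    return np.array([v[0] ** 2, v[1] ** 2, np.sqrt(2) * v[0] * v[1]])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

explicit = phi(x) @ phi(y)   # inner product in the expanded feature space
implicit = (x @ y) ** 2      # kernel trick: same value, computed in the original space

print(explicit, implicit)    # both equal 1.0 here, since (1*3 + 2*(-1))^2 = 1
```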

6
Q

What is hinge loss in SVM?

A

Hinge loss penalizes misclassifications and points within the margin:
L_hinge = max(0, 1 − r_t (w^T x_t + w_0))
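The same quantity in numpy (a sketch with a hypothetical weight vector and toy data):

```python
import numpy as np

def hinge_loss(w, w0, X, r):
    # max(0, 1 - r_t (w^T x_t + w_0)), averaged over the training points
    margins = r * (X @ w + w0)
    return np.mean(np.maximum(0.0, 1.0 - margins))

# Hypothetical example
X = np.array([[2.0, 2.0], [-1.0, -1.0]])
r = np.array([1, -1])
print(hinge_loss(np.array([0.5, 0.5]), 0.0, X, r))   # 0.0: neither point violates the margin
```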

7
Q

What is the role of 𝐢 in SVM?

A

High 𝐢: Focuses on minimizing misclassification; smaller margin (low bias, high variance).
Low 𝐢: Allows more misclassification; larger margin (high bias, low variance).
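One way to see this effect (a sketch with scikit-learn on hypothetical overlapping blobs): a small C usually leaves more points as support vectors inside a wider margin, while a large C fits the training data more tightly:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical overlapping 2-D blobs with labels -1 / +1
X = np.vstack([rng.normal(1.0, 1.0, (50, 2)), rng.normal(-1.0, 1.0, (50, 2))])
r = np.array([1] * 50 + [-1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, r)
    print(f"C={C}: {clf.n_support_.sum()} support vectors")
```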

8
Q

What are the common kernel types?

A

Linear Kernel: K(x, y) = x^T y

Polynomial Kernel: K(x, y) = (x^T y + c)^d

RBF Kernel: K(x, y) = exp(−γ ||x − y||^2)

Custom Kernels: designed for specific data types such as strings or graphs.
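These kernels are available in scikit-learn, either as pairwise helpers or via the kernel argument of SVC (a sketch on hypothetical data; note that scikit-learn's polynomial kernel also includes a gamma scaling factor):

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

# Hypothetical data: 4 points in 2-D
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.5]])

K_lin = linear_kernel(X)                            # x^T y
K_poly = polynomial_kernel(X, degree=3, coef0=1.0)  # (gamma * x^T y + c)^d
K_rbf = rbf_kernel(X, gamma=0.5)                    # exp(-gamma * ||x - y||^2)

print(K_lin.shape, K_poly.shape, K_rbf.shape)       # each is a 4x4 Gram matrix
```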

9
Q

How does SVM handle multiclass classification?

A

One-vs-All: Train K binary classifiers for K classes (each class vs. the rest).

One-vs-One: Train K(K − 1)/2 binary classifiers, one for each pair of classes.
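Both strategies are available in scikit-learn as meta-estimators (a sketch on hypothetical 3-class toy data); SVC itself already applies one-vs-one internally when given more than two classes:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Hypothetical 3-class toy data
X = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0],
              [2.1, 1.9], [-2.0, 2.0], [-1.9, 2.2]])
y = np.array([0, 0, 1, 1, 2, 2])

ova = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)  # K binary classifiers
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)   # K(K-1)/2 binary classifiers

print(ova.predict([[2.0, 1.8]]), ovo.predict([[2.0, 1.8]]))
```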

10
Q

Why is SVM a convex optimization problem?

A

The objective (1/2) ||w||^2 + C Σ_t ξ_t is a convex quadratic function and the margin constraints are linear, so SVM training is a convex quadratic program: every local minimum is the global minimum, and it can be solved reliably with standard QP solvers.
