Lecture 12 - Support Vector Machines & Kernel methods Flashcards
What is a Support Vector Machine (SVM)?
An SVM is a supervised machine learning model used for classification and regression tasks. It aims to find the hyperplane that best separates data into classes by maximizing the margin between data points closest to the hyperplane (support vectors).
How does SVM find the optimal separating hyperplane?
It maximizes the margin (γ) between the closest data points of each class while satisfying: r_t(w^T x_t + w_0) ≥ 1
Objective: Minimize (1/2)||w||^2.
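A minimal numeric sketch of the constraint and objective above, using a hypothetical toy dataset and a hand-picked hyperplane (w, w_0 are assumed values, not fitted):

```python
import numpy as np

# Toy linearly separable data with labels r_t in {-1, +1} (hypothetical values).
X = np.array([[2.0, 0.0], [-2.0, 0.0], [3.0, 1.0], [-3.0, -1.0]])
r = np.array([1, -1, 1, -1])

# A candidate hyperplane w^T x + w_0 = 0.
w = np.array([0.5, 0.0])
w0 = 0.0

# Every point satisfies the constraint r_t (w^T x_t + w_0) >= 1.
margins = r * (X @ w + w0)
assert np.all(margins >= 1)

# Geometric margin width is 2 / ||w||, so minimizing (1/2)||w||^2 maximizes it.
width = 2.0 / np.linalg.norm(w)  # 4.0 for this w
```

This is why the two formulations are equivalent: shrinking ||w|| directly widens 2/||w||.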
What are support vectors?
Support vectors are the data points closest to the hyperplane. They define the margin and are critical for determining the decision boundary.
Mnemonic: "Support Vectors Support the Hyperplane!"
How does SVM handle non-linearly separable data?
SVM uses a soft margin by introducing slack variables (ξ_t) that allow some points to violate the margin or be misclassified:
L_p = (1/2)||w||^2 + C Σ_t ξ_t
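A short sketch of evaluating this soft-margin objective for a fixed hyperplane (the data and w are assumed toy values; slack is ξ_t = max(0, 1 − r_t(w^T x_t + w_0))):

```python
import numpy as np

# Toy data: the last point sits inside the margin, so it needs slack.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
r = np.array([1, -1, 1])
w, w0, C = np.array([0.5, 0.0]), 0.0, 1.0

# Slack xi_t is zero for points satisfying the hard-margin constraint,
# positive for margin violations and misclassifications.
xi = np.maximum(0.0, 1.0 - r * (X @ w + w0))  # [0.0, 0.0, 0.75]

# Primal objective: margin term plus C-weighted total slack.
Lp = 0.5 * np.dot(w, w) + C * xi.sum()
```

C trades off the two terms: a larger C makes each unit of slack more expensive.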
What is the kernel trick?
It implicitly maps data into a higher-dimensional space where linear separation is possible: the kernel function computes K(x, y) = φ(x)·φ(y) directly in the input space, without ever evaluating the transformation φ explicitly.
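A sketch of why this works, using the classic degree-2 example: the kernel (x^T y)^2 equals an inner product under an explicit feature map φ, which we never need in practice:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D input: [x1^2, sqrt(2) x1 x2, x2^2].
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x, y = np.array([1.0, 2.0]), np.array([3.0, 0.5])

# Kernel value computed cheaply in input space...
k = (x @ y) ** 2

# ...matches the inner product in the expanded feature space exactly.
assert np.isclose(k, phi(x) @ phi(y))
```

The kernel costs one dot product; the explicit map would grow rapidly with input dimension and polynomial degree.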
What is hinge loss in SVM?
Hinge loss penalizes misclassifications and points within the margin:
L_hinge = max(0, 1 − r_t(w^T x_t + w_0))
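A minimal sketch of hinge loss on a few assumed scores f(x) = w^T x + w_0: correct points beyond the margin get zero loss, points inside the margin or misclassified get a linearly growing penalty:

```python
import numpy as np

def hinge_loss(scores, labels):
    # max(0, 1 - r_t * f(x_t)) per point: zero only when the point is
    # correctly classified AND lies outside the margin.
    return np.maximum(0.0, 1.0 - labels * scores)

scores = np.array([2.0, 0.5, -0.3])  # hypothetical f(x) values
labels = np.array([1, 1, 1])
losses = hinge_loss(scores, labels)  # [0.0, 0.5, 1.3]
```

Note the middle point is classified correctly (score 0.5 > 0) yet still penalized for sitting inside the margin.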
What is the role of C in SVM?
High C: Focuses on minimizing misclassification; smaller margin (low bias, high variance).
Low C: Allows more misclassification; larger margin (high bias, low variance).
What are the common kernel types?
Linear Kernel: K(x, y) = x^T y
Polynomial Kernel: K(x, y) = (x^T y + c)^d
RBF Kernel: K(x, y) = exp(−γ ||x − y||^2)
Custom Kernels: Designed for specific tasks like strings or graphs.
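The three standard kernels above can be sketched directly in numpy (parameter defaults c, d, gamma are illustrative choices, not fixed by the lecture):

```python
import numpy as np

def linear(x, y):
    return x @ y

def polynomial(x, y, c=1.0, d=2):
    return (x @ y + c) ** d

def rbf(x, y, gamma=0.5):
    # Always 1 when x == y, decaying toward 0 as the points move apart.
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
```

For these orthogonal unit vectors: linear gives 0, the polynomial kernel gives (0 + 1)^2 = 1, and the RBF kernel gives exp(−0.5 · 2) = exp(−1).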
How does SVM handle multiclass classification?
One-vs-All: Train K classifiers for K classes.
One-vs-One: Train K(K − 1)/2 classifiers for K classes.
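The classifier counts for the two schemes are easy to check numerically:

```python
def one_vs_all(K):
    # One binary classifier per class: class k vs. everything else.
    return K

def one_vs_one(K):
    # One binary classifier per unordered pair of classes.
    return K * (K - 1) // 2

# e.g. for K = 10 classes: 10 one-vs-all vs. 45 one-vs-one classifiers
```

One-vs-one trains many more models, but each one sees only two classes' worth of data, so individual training runs are cheaper.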
Why is SVM a convex optimization problem?
Minimizing (1/2)||w||^2 (plus the linear slack term) subject to linear constraints is a convex quadratic program, so any local minimum is the global minimum.