SVM - Support Vector Machines Flashcards
What is a maximal margin classifier
It’s where the decision boundary is drawn so that there is the maximum “margin” between the boundary and the observations. If a decision boundary can perfectly separate 2 classes, there are infinitely many possible boundaries, b/c you can shift or tilt the boundary by a tiny amount and it still separates. The maximal margin classifier picks the one boundary with the most “space” between it and the closest observations.
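A minimal stdlib-only sketch of the idea, using hypothetical 1-D data (with one feature the “boundary” is just a threshold): any threshold in the gap separates the classes, but the maximal margin one sits exactly midway between the closest points of each class.

```python
# Hypothetical 1-D data: class A entirely left of class B.
class_a = [0.5, 1.0, 2.0]
class_b = [5.0, 6.5, 8.0]

# The points nearest the gap are the only ones that matter.
closest_a = max(class_a)  # 2.0
closest_b = min(class_b)  # 5.0

# Midpoint maximizes the distance ("margin") to the nearest point of either class.
boundary = (closest_a + closest_b) / 2  # 3.5
margin = (closest_b - closest_a) / 2    # 1.5
print(boundary, margin)
```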
What is a support vector
These are the “vectors” perpendicular to the maximal margin classifier’s decision boundary. They originate at the closest points on either side of the boundary and end at the boundary. The boundary ONLY depends on the location of these closest points: if they move, the boundary moves to keep the margin maximal, while moving any other point (without crossing the margin) changes nothing. That’s why they’re called “support” vectors, b/c they “support” the boundary.
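A quick sketch (assumes scikit-learn is installed) on made-up 2-D data: after fitting, the classifier exposes which training points ended up as support vectors, and it is only a subset of the data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable 2-D data.
X = np.array([[0, 0], [1, 1], [2, 0],   # class 0
              [5, 5], [6, 4], [7, 6]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard (maximal) margin.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# Only the points nearest the boundary become support vectors;
# the rest could move (without crossing the margin) with no effect.
print(clf.support_vectors_)
```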
Support Vector Classifier
Is like the Maximal Margin Classifier, but allows some points to be on the wrong side of the margin (or even the boundary) in order to achieve lower variance. There is a tuning parameter C - a “budget” for how much the observations can violate the margin.
Tuning parameter of Support Vector Classifier
C - a “budget” for how many observations are allowed on the wrong side of the margin/boundary. Choose it through CV.
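A sketch of choosing C by cross-validation, assuming scikit-learn as the tooling (any CV loop works). One caveat worth a flashcard of its own: sklearn’s C is the *penalty* for margin violations, i.e. the inverse of the “budget” formulation - in sklearn, large C means little tolerance for violations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data just for illustration.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

# 5-fold CV over a small grid of C values.
grid = GridSearchCV(SVC(kernel="linear"),
                    param_grid={"C": [0.01, 0.1, 1, 10, 100]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_["C"])
```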
Support Vector Machines Intuition
Is like a Support Vector Classifier, but generalized to support non-linear decision boundaries. Utilizes something called “kernels,” which determine the shape of the boundary (e.g. polynomial, radial).
Radial Kernel
An SVM option that can draw a circular (enclosing) decision boundary around a set of points. This kernel has its own tuning parameter that dictates how non-linear the boundary is.
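A sketch (assumes scikit-learn) on ring-shaped toy data: a radial (RBF) kernel can wrap a circular boundary around the inner class, while a linear kernel cannot separate the rings at all.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings - no straight line separates them.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

rbf = SVC(kernel="rbf").fit(X, y)
linear = SVC(kernel="linear").fit(X, y)

# Training accuracy: the radial kernel fits the rings, the linear one can't.
print(rbf.score(X, y), linear.score(X, y))
```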
Tuning Parameter to SVM
Gamma (γ) - for the radial kernel, the higher the gamma, the more non-linear/flexible the boundary (and the greater the risk of overfitting). The budget C from the support vector classifier still applies too, controlling how much slack the algo has to allow misclassification. Choose both through CV.
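A sketch of the gamma effect (assumes scikit-learn, where the radial kernel’s flexibility knob is named `gamma`): larger gamma lets the boundary bend tightly around individual points, so training accuracy rises with gamma - exactly the overfitting risk the card describes.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Noisy ring data so that a perfect fit requires a very wiggly boundary.
X, y = make_circles(n_samples=200, factor=0.5, noise=0.2, random_state=0)

scores = {}
for gamma in [0.01, 1, 100]:
    clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    scores[gamma] = clf.score(X, y)  # training accuracy
    print(gamma, scores[gamma])
```

Training accuracy climbing with gamma is the warning sign: pick gamma by CV on held-out data, not by training fit.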
SVM vs. Logistic Regression
They are closely related: both fit a linear function of the features, and their loss functions are similar - the SVM’s “hinge” loss and logistic regression’s log loss penalize observations in almost the same way, so they often give similar answers.
When classes are well separated, SVMs tend to behave better than logistic regression. When there is lots of overlap b/w classes, logistic regression wins.
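The relationship can be made concrete with a stdlib-only sketch of the two losses, written as functions of y·f(x) (positive = correct side of the boundary, with more confidence the larger it is): the curves are close for misclassified points, but hinge loss is exactly zero once a point is safely beyond the margin.

```python
import math

def hinge_loss(yf):
    # SVM loss as a function of y*f(x): zero once yf >= 1 (past the margin).
    return max(0.0, 1.0 - yf)

def logistic_loss(yf):
    # Logistic regression's log (binomial deviance) loss.
    return math.log(1.0 + math.exp(-yf))

# Both losses are large for badly misclassified points (yf << 0)
# and small for confidently correct ones (yf >> 0).
for yf in [-2, -1, 0, 1, 2, 3]:
    print(yf, hinge_loss(yf), round(logistic_loss(yf), 3))
```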