Lecture 6 Flashcards
What causes irregular boundaries?
Irregular distribution, imbalanced training sizes, and outliers
What causes misclassifications?
unoptimized decision boundaries
Support Vectors
a subset of the training vectors that support or determine the boundary
What is the goal of support vector machines?
to learn a boundary that leads to the largest margin (buffer) from points on both sides
What points of the data set have an influence on the decision boundary (when using an SVM)?
Only the support vectors. Any point that isn’t a support vector has no influence: it can be moved (as long as it stays outside the margin) or removed with no effect on the decision boundary.
What data points do SVMs use to compute predictions?
Only the support vectors, not the whole training set
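A minimal sketch of why prediction needs only the support vectors, assuming the usual dual form f(x) = sum_i alpha_i * y_i * K(x_i, x) + b. The names `svm_decision`, `linear_kernel`, and `alphas`, and all numeric values, are made up for illustration and do not come from the lecture:

```python
def linear_kernel(x, z):
    """Plain dot product, used here as the simplest kernel."""
    return sum(a * b for a, b in zip(x, z))

def svm_decision(x, support_vectors, labels, alphas, b, kernel=linear_kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b over the support vectors only."""
    total = b
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        total += alpha * y * kernel(sv, x)
    return total

# Hypothetical trained model: two support vectors with made-up coefficients.
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
labels = [1, -1]
alphas = [0.25, 0.25]
b = 0.0

for x in ([2.0, 0.0], [-3.0, 1.0]):
    score = svm_decision(x, support_vectors, labels, alphas, b)
    print(x, score, 1 if score >= 0 else -1)
```

No other training point appears anywhere in the computation, which is why the rest of the training set can be discarded after training.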
What dimension is a decision boundary for a dataset with 2 features?
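The decision boundary itself is a line (1D) in the 2D feature space. The decision function, plotted with its value as a third dimension, is a 2D plane, and the boundary is where that plane crosses zero.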
Where is the decision function equal to zero?
On the decision boundary
When are inputs labeled as ‘undefined’?
When they lie between the margins, i.e. when the decision function value is between -1 and 1
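A small sketch of the decision function cards above, assuming a linear decision function h(x) = w . x + b. The weights, bias, and the helper names `decision_function` and `label` are hypothetical, chosen only for illustration:

```python
def decision_function(w, b, x):
    """Linear decision function h(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def label(w, b, x):
    """Return +1 or -1 on or outside the margins, 'undefined' strictly between them."""
    h = decision_function(w, b, x)
    if h >= 1:
        return 1
    if h <= -1:
        return -1
    return "undefined"

w, b = [1.0, 1.0], -3.0  # hypothetical parameters

print(decision_function(w, b, [1.0, 2.0]))  # 0.0, so this point is on the boundary
print(label(w, b, [3.0, 2.0]))              # h = 2.0  -> 1
print(label(w, b, [0.5, 0.5]))              # h = -2.0 -> -1
print(label(w, b, [2.0, 1.5]))              # h = 0.5  -> undefined
```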
What is the goal of decision functions for SVMs?
to maximize the margin between the data points and the hyperplane
How can optimal values of w and b be found?
through optimization, e.g. via projected gradient descent
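A rough sketch of finding w and b by gradient-based optimization. It uses plain batch subgradient descent on the soft-margin hinge-loss objective, which is a technique swapped in for illustration and not necessarily the method used in the lecture. The function names, the toy data, and the values of `C`, `lr`, and `epochs` are all assumptions:

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Batch subgradient descent on 0.5 * ||w||^2 + C * sum of hinge losses."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of 0.5 * ||w||^2 is w itself
        grad_b = 0.0
        for xi, yi in zip(X, y):
            h = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * h < 1:  # this point violates the margin
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Classify by the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable data (made up for this sketch).
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]]
y = [1, 1, -1, -1]

w, b = train_linear_svm(X, y)
print(w, b)
print([predict(w, b, xi) for xi in X])
```

In practice, SVM libraries usually rely on dedicated quadratic programming or coordinate descent solvers instead of a hand-written loop like this one.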
What do your graph and results look like when parameters w and b are optimized?
The algorithm correctly classifies the training examples and the margin is maximized
In a margin-based classifier, what happens to the margin when the weight vector w gets smaller?
The margin gets larger
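For a linear SVM whose margins sit at decision function values +1 and -1, the distance between the two margins is 2 / ||w||, so shrinking w widens the margin. A quick numeric check, where `margin_width` is a hypothetical helper name:

```python
import math

def margin_width(w):
    """Distance between the margins h(x) = +1 and h(x) = -1: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(margin_width([3.0, 4.0]))  # ||w|| = 5.0 -> 0.4
print(margin_width([1.5, 2.0]))  # ||w|| = 2.5 -> 0.8
```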
In hard margin SVMs, do we minimize ‘1/2 ||w||^2’ or ‘||w||’ and why?
1/2 ||w||^2, because ||w|| is not differentiable at w = 0. Both have the same minimizer, so the resulting margin is the same.
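Written out, and assuming labels y_i in {-1, +1} with training points x_i (notation the card does not spell out), the hard margin problem is usually stated as:

```latex
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{T} x_i + b \right) \ge 1 \quad \text{for all } i
```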
What do you do when the data is not linearly separable?
Introduce slack variables and allow “error” in classification
What does the data have to look like for you to not be able to use a hard margin SVM?
A margin can’t cleanly split the data without leaving, for example, a blue point on the red side. In other words, the data is not linearly separable, e.g. blue points separate a red point from the other red points.
What are the two conflicting elements of soft margin SVMs?
- making the slack variables as small as possible to reduce the margin violations and
- making w^T * w as small as possible to increase the margin
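The trade-off between these two elements can be sketched as a single objective, 0.5 * w^T w + C * (sum of slacks), where each slack is max(0, 1 - y_i * (w . x_i + b)). The weight `C`, the function name, and the toy data are assumptions made for illustration:

```python
def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slacks), with slack_i = max(0, 1 - y_i * (w . x_i + b))."""
    slacks = []
    for xi, yi in zip(X, y):
        h = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slacks.append(max(0.0, 1.0 - yi * h))
    margin_term = 0.5 * sum(wj * wj for wj in w)
    return margin_term + C * sum(slacks), slacks

# The second point sits inside the margin, so it needs a nonzero slack.
X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]

objective, slacks = soft_margin_objective([1.0, 0.0], 0.0, X, y, C=1.0)
print(slacks)     # [0.0, 0.5, 0.0]
print(objective)  # 0.5 + 1.0 * 0.5 = 1.0
```

A larger `C` penalizes margin violations more heavily, while a smaller `C` favors a wider margin.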
What does a kernel function map?
It implicitly maps the low-dimensional data to a high-dimensional space: it returns the dot product in that space without computing the mapping explicitly
What type of function is the kernel function?
a similarity function
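As an illustration of the two kernel cards above, here is a sketch using a degree-2 polynomial kernel, chosen only because its feature map is small enough to write out. The names `phi` and `poly_kernel` are hypothetical. The kernel value computed in the original 2D space matches the dot product after an explicit mapping to 3D:

```python
import math

def phi(x):
    """Explicit degree-2 feature map from 2D to 3D."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def poly_kernel(x, z):
    """Degree-2 polynomial kernel (x . z)^2, computed in the original 2D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x, z = [1.0, 2.0], [3.0, 4.0]
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))

print(poly_kernel(x, z))  # (3 + 8)^2 = 121.0
print(explicit)           # same value, up to floating-point rounding
```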
What’s the relationship between the weight vector w and the margin?
The smaller the weight vector w, the larger the margin
When using an RBF kernel in an SVM, what does a high gamma value signify?
Each training point’s influence reaches only a short distance, so the model considers only the points close to the hyperplane.
Gamma parameter
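The gamma parameter in SVM tuning controls how far the influence of a single training point reaches, i.e. whether points near or far away from the hyperplane affect it. For a low gamma, the influence is far-reaching: the model will be too constrained and include all points of the training dataset, without really capturing the shape. For a higher gamma, the influence is local and the model will capture the shape of the dataset more closely, although a very high gamma can overfit.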
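A small sketch of how gamma changes the reach of a point, assuming the common RBF form K(x, z) = exp(-gamma * ||x - z||^2). The helper name `rbf_kernel` and the sample points are made up:

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, near, far = [0.0, 0.0], [0.5, 0.0], [3.0, 0.0]

for gamma in (0.1, 10.0):
    # Similarity of x to a nearby point and to a distant point.
    print(gamma, rbf_kernel(x, near, gamma), rbf_kernel(x, far, gamma))
```

With the low gamma, even the distant point keeps a noticeable similarity to x. With the high gamma, the similarity to the distant point is practically zero, so only close neighbours matter.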
Disadvantages of SVMs
It is not well suited to large datasets, as the training time with SVMs can be high
It is less effective on noisier datasets with overlapping classes
It was originally designed as a 2-class classifier