L18 - Support Vector Machine Flashcards
What type of model is a SVM?
Non-probabilistic Supervised Classification.
What is the goal of SVM?
Find the optimal plane through a set of data points. That being, the one with the largest margin from the plane to the support vectors.
This enables greater classification capability.
In a feature space, what do the dimensions represent?
Features. I.e if there are 5 features, there are 5 dimensions in the feature space.
How does SVM classify data?
The feature space is partitioned, each partition represents a class. The SVM places the data about the partition planes to classify it.
For example, 2 dimensions, data would be place above or below the plane for classification.
What does it mean the SVM is non-probabilistic?
The classification is based purely on features of the data point.
What does the Kernel Trick enable SVM to do?
Make classification of non-linear data
What are Support Vectors?
Data points closest to the separation plane.
They are called this since the support the classification decision boundaries.
Define the Maximum Margin Classifier…
When we classify based on the threshold that gives us the largest margin between the place and the support vectors.
What is an issue with using Max. Margin Classifier?
Prone to outliers.
What is the solution to the outlier problem?
Enable the SVM to make misclassifications.
What is the Soft Margin of the SVM?
If misclassifications are allowed, the soft margin is the margin between the support vectors and the separation plane.
But the soft margin enables misclassifications, meaning that outliers on the wrong side of the separation plane will be classified to prevent outlier impacting classification.
Thus, misclassification is allowed to enhance the general classification capability.
How do we know if we have non-linear classification?
If data overlaps in the features space.
If we have a non-linear problem, what transformation do we perform on the data?
A mapping transformation in which overlapping features are bought into higher dimensions. This means if the data is not linearly separable in one dimension, it should be separable in a higher dimensional space.
What is the name of the function that performs higher dimensionality mapping? Given an example of one
Kernel Function
Example : X mod 2
What is the issue with non-linear SVM?
Computational cost of operations increase greatly when they have to be performed in higher dimensions.
This is infeasible for large data.