Supervised Non-Linear Models Flashcards
Some notes from the lectures covering Supervised Non-Linear Models that may help in the final exam.
What do Non-Linear algorithms assume?
Non-Linear Algorithms assume a non-linear relationship between x and y.
What are the 4 common Non-Linear models?
- K-Nearest Neighbour
- Kernel SVM
- Decision Trees
- Neural Networks
How does K-Nearest Neighbour work?
- Decide on the value of K, which determines the number of neighbours taken into consideration.
- Calculate the Euclidean distance between the query point and every point in the training set
- Sort the training points by distance
- Select the K nearest neighbours
- Pick the majority class among those K neighbours
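The steps above can be sketched in a few lines of NumPy (a minimal illustration; the function name `knn_predict` and the toy data are assumptions, not from the lectures):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    # Euclidean distance from the query point to every training point
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    # Indices of the k nearest neighbours, sorted by distance
    nearest = np.argsort(distances)[:k]
    # Majority vote among the k neighbours' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.5, 0.5]), k=3))  # 0
```

Note there is no training step: prediction does all the work, which is also why KNN is slow at inference time.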
How do you calculate the Euclidean distance between two points?
distance = square root of ((x1 - x2)^2 + (y1 - y2)^2)
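The formula above as a runnable helper (the function name is an assumption, used only for illustration):

```python
import math

def euclidean_distance(p1, p2):
    # distance = sqrt((x1 - x2)^2 + (y1 - y2)^2)
    return math.sqrt((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2)

print(euclidean_distance((0, 0), (3, 4)))  # 5.0 (the classic 3-4-5 triangle)
```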
What are the advantages of using the K-nearest Neighbour algorithm?
Doesn’t require a training phase, just the storage of the training data
Can be used for both classification and regression
What are some disadvantages of K Nearest Neighbour?
- Computationally Expensive - Finding distances to all training points can be slow
- Sensitive to the choice of K and the distance metric used
- Curse of Dimensionality - Distance becomes less meaningful with higher dimensions
What are the steps involved in using a Hard Margin SVM?
- Define the Hyperplane with the equation w^Tx + b = 0, where w is the weight vector, b is the bias term, and x is the feature vector
- Maximise the margin, i.e. the distance between the hyperplane and the closest data points from either class
- Apply the hard-margin constraints: y_i(w^T x_i + b) >= 1 for every training point
- Equivalently, minimise the norm of the weight vector w, since the margin width is 2 / ||w||
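These steps can be approximated with scikit-learn's `SVC` (assumed available, not part of the lecture notes); a very large C penalises any margin violation, which mimics a hard margin on linearly separable data:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes
X = np.array([[0, 0], [1, 1], [4, 4], [5, 5]])
y = np.array([0, 0, 1, 1])

# A very large C effectively forbids margin violations (hard-margin behaviour)
clf = SVC(kernel="linear", C=1e10)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]  # hyperplane: w^T x + b = 0
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))  # [0 1]
```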
What are the advantages of Hard Margin SVMs?
Theoretical Guarantee - Finds the hyperplane with the maximum margin, leading to good generalisation
Deterministic - There is always a unique solution for linearly separable data.
What are some disadvantages of Hard Margin SVMs?
Assumes the data is perfectly linearly separable, otherwise the algorithm fails.
Sensitive to Outliers - A single outlier can drastically affect the hyperplane.
What is the main difference between Soft Margin SVMs and Hard Margin SVMs?
Soft Margin SVMs introduce slack variables so they can handle data that is not perfectly linearly separable, whereas Hard Margin SVMs fail entirely on such data.
What are the effects of increasing the value of C within Soft Margin SVMs?
A higher C value penalises margin violations more heavily, leading to a smaller margin and fewer misclassifications on the training data, at the risk of overfitting.
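A small sketch of this effect (using scikit-learn's `SVC`; the toy dataset with one deliberate outlier is an assumption): a small C tolerates the outlier and keeps the margin wide, while a large C shrinks the margin trying to classify it correctly. For a linear SVM the margin width is 2 / ||w||:

```python
import numpy as np
from sklearn.svm import SVC

# Two clusters plus one class-0 outlier sitting near class 1
X = np.array([[0, 0], [1, 0], [0, 1], [3.5, 3.5],   # class 0 (last point is an outlier)
              [4, 4], [5, 4], [4, 5], [5, 5]])      # class 1
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    width = 2 / np.linalg.norm(clf.coef_[0])  # margin width = 2 / ||w||
    print(f"C={C}: margin width = {width:.2f}")
```

Running this shows the margin narrowing as C grows, which is the trade-off the flashcard describes.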
How does a Kernel SVM differ from a Hard or Soft Margin SVM?
Kernel SVMs are able to solve problems where the data is not linearly separable in the original feature space, by mapping the data to a higher-dimensional space.
What are the advantages of using a Soft Margin SVM over a Hard Margin SVM?
Soft Margin SVMs are more robust to outliers, and the regularisation parameter C lets them be tuned to better suit the problem they’re attempting to solve.
What is the step-by-step operation of a Kernel SVM?
- Check whether the data is linearly separable in the original feature space
- If not, choose a mapping of the data to a higher-dimensional space where it becomes separable
- Apply the ‘Kernel Trick’: use a kernel function to compute inner products in that higher-dimensional space without ever computing the mapping explicitly
- Solve the SVM optimisation problem with the kernel function in place of ordinary dot products
- Construct the decision boundary, expressed in terms of the kernel and the support vectors
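As an illustration (scikit-learn's `SVC` with an RBF kernel; the XOR-style toy data and parameter values are assumptions), a kernel SVM can separate data that no straight line in the original space can:

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: not linearly separable in the original 2D feature space
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
y = np.array([0, 0, 1, 1])

# The RBF kernel implicitly maps points into a higher-dimensional space,
# where a separating hyperplane exists
clf = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)
print(clf.predict(X))  # [0 0 1 1]
```

A linear SVM would fail on this dataset, which is exactly the motivation for the kernel trick described above.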
What are the advantages of a Kernel SVM?
Able to handle non-linear data
Very powerful tool for high-dimensional and non-linear datasets
Multiple kernel options allow adaptation to different types of data.