Topic 5: Support Vector Machines Flashcards

1
Q

What is the primary goal of a support vector machine?

A

To find an optimal hyperplane that separates two classes with the largest margin.
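
A minimal sketch of this goal in code, assuming scikit-learn (the cards themselves don't name a library): a linear SVC exposes the fitted hyperplane w·x + b = 0 through coef_ and intercept_.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear").fit(X, y)

# The separating hyperplane is w . x + b = 0; SVC picks the one
# with the largest margin between the two clusters.
print("w =", clf.coef_[0], " b =", clf.intercept_[0])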

2
Q

What is the role of kernel functions in SVMs?

A

To transform data into a higher-dimensional space where a linear separator is possible.

3
Q

What are two strategies for multiclass classification using SVMs?

A

One-versus-all and one-versus-one.

4
Q

What is a hyperplane in the context of SVM?

A

A decision boundary that separates different classes in the feature space.

5
Q

What is the margin in SVM?

A

The distance between the hyperplane and the nearest data points from each class; those nearest points are the support vectors.
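
As a worked equation (a standard result the card leaves implicit): if the support vectors satisfy |w·x_i + b| = 1 in the canonical formulation, the margin has a closed form.

% Hyperplane and canonical constraint at the support vectors
w \cdot x + b = 0, \qquad y_i \, (w \cdot x_i + b) \ge 1
% Each support vector lies at distance 1/\|w\| from the hyperplane,
% so the full margin between the two classes is
\text{margin} = \frac{2}{\|w\|}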

6
Q

What are support vectors in SVM?

A

The data points that are closest to the hyperplane and influence its position and orientation.

7
Q

How does SVM handle non-linearly separable data?

A

By using kernel functions to transform the data into a higher-dimensional space where it becomes linearly separable.

8
Q

What are common types of kernel functions used in SVM?

A

Linear Kernel: For linearly separable data.
Polynomial Kernel: For non-linear data with polynomial relationships.
Radial Basis Function (RBF) Kernel: For complex non-linear relationships.
Sigmoid Kernel: Similar to a neural network activation function.
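
A quick sketch comparing the four kernels on one toy dataset, again assuming scikit-learn (the strings below match its kernel parameter):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same data, four kernels; only the non-linear ones can bend the
# boundary around the interleaved moons.
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    acc = SVC(kernel=kernel).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{kernel:8s} test accuracy = {acc:.2f}")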

9
Q

What are the advantages of SVM?

A

Effective in high-dimensional spaces.
Robust to overfitting, especially with proper kernel selection.
Can model complex relationships using kernel functions.

10
Q

What are the disadvantages of SVM?

A

Computationally intensive for large datasets.
Choosing the correct kernel and hyperparameters can be challenging.
Less interpretable compared to simpler models like decision trees.

11
Q

What is the duality principle in SVM?

A

The equivalence between the primal and dual formulations of the SVM optimization problem; in the dual, the data appear only through inner products, which is what allows kernels to be applied.
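
In symbols, the standard soft-margin dual (background the card assumes rather than states): the multipliers α_i replace w, and the training data enter only through inner products, which is exactly where a kernel K(x_i, x_j) can be substituted.

\max_{\alpha} \; \sum_i \alpha_i
  - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j)
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0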

12
Q

What is a kernel function in SVM?

A

A function that computes the similarity between data points in a transformed feature space without explicitly computing the transformation.
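
A small sketch of this definition using only NumPy (the RBF form is the textbook one; the points are made up): the kernel returns a similarity score without ever constructing the transformed vectors.

import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    # Similarity in an implicit high-dimensional space, computed
    # entirely from the original coordinates.
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([1.5, 1.0])
print(rbf_kernel(x, x))  # 1.0: a point is maximally similar to itself
print(rbf_kernel(x, z))  # < 1.0: similarity decays with distance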

13
Q

How is the decision boundary determined in SVM?

A

By solving an optimization problem that maximizes the margin while minimizing classification errors.
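
Concretely, that optimization problem is the standard soft-margin primal (a detail the card leaves out): minimizing ||w|| maximizes the margin, while slack variables ξ_i price the classification errors, traded off by C.

\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad y_i \, (w \cdot x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0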

14
Q

What is the role of the support vectors in the decision function?

A

Support vectors define the position of the hyperplane, and only they contribute to the decision function.
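
A sketch verifying this with scikit-learn (this example's choice of library): the decision function can be rebuilt from the stored support vectors alone.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)

# Rebuild f(x) = sum_i alpha_i y_i K(sv_i, x) + b using only the
# support vectors; non-support points have zero dual coefficients.
K = rbf_kernel(clf.support_vectors_, X, gamma=0.5)
f = clf.dual_coef_ @ K + clf.intercept_

print(np.allclose(f.ravel(), clf.decision_function(X)))  # True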

15
Q

How can SVM handle multi-class classification?

A

One-vs-One (OvO): Trains a classifier for every pair of classes.
One-vs-All (OvA): Trains a classifier for each class against all other classes.
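
A sketch of both strategies using scikit-learn's generic wrappers (the wrappers are this example's choice; SVC itself applies one-vs-one internally for multiclass data):

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)  # 3 classes

# OvO: one binary classifier per pair of classes -> 3 here.
ovo = OneVsOneClassifier(LinearSVC(max_iter=10_000)).fit(X, y)
# OvA/OvR: one classifier per class against the rest -> also 3.
ovr = OneVsRestClassifier(LinearSVC(max_iter=10_000)).fit(X, y)

print(len(ovo.estimators_), len(ovr.estimators_))  # 3 3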

16
Q

What is the kernel trick?

A

A method to compute the inner product of data in a higher-dimensional space without explicitly transforming the data, enabling efficient computation.
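
A concrete check of the trick with NumPy (the quadratic map φ below is the usual textbook example, not something the card specifies): the kernel (x·z)² equals the inner product of explicitly transformed vectors.

import numpy as np

def phi(v):
    # Explicit quadratic feature map for 2-D input.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 3.0])
z = np.array([2.0, -1.0])

explicit = phi(x) @ phi(z)  # transform first, then take the inner product
kernel = (x @ z) ** 2       # kernel: stay in the original 2-D space
print(np.isclose(explicit, kernel))  # True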

17
Q

What happens when the data is not scaled in SVM?

A

Features with larger magnitudes dominate the distance and inner-product computations, leading to poor performance. SVM requires scaled or normalized data for optimal results.
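
An illustrative sketch with scikit-learn (the exaggerated feature scale is fabricated for the demo) of what standardizing buys an RBF SVM:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, random_state=0)
X[:, 0] *= 1000.0  # one feature now dwarfs all the others
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = SVC().fit(X_tr, y_tr).score(X_te, y_te)
scaled = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr).score(X_te, y_te)
print(f"unscaled: {raw:.2f}  scaled: {scaled:.2f}")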

18
Q

How does the RBF kernel parameter γ affect SVM performance?

A

A high γ: Models the data very closely, leading to potential overfitting.
A low γ: Produces a smoother decision boundary, possibly underfitting.
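
A quick sweep over γ, under the same scikit-learn assumption as the other sketches, makes the trade-off visible as a train/test gap:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for gamma in [0.01, 1, 100]:
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr)
    # A large train/test gap at high gamma signals overfitting;
    # mediocre scores on both at tiny gamma signal underfitting.
    print(f"gamma={gamma:<5} train={clf.score(X_tr, y_tr):.2f} test={clf.score(X_te, y_te):.2f}")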

19
Q

What is a sparse solution in SVM?

A

Only a subset of data points (support vectors) is used to define the model, making it memory efficient.
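
A short check of that sparsity (scikit-learn again, as an assumption): the fitted model keeps only the support vectors, not the whole training set.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)
clf = SVC(kernel="rbf").fit(X, y)

# Prediction needs only these points; the rest can be discarded.
print(f"{len(clf.support_)} support vectors out of {len(X)} training points")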

20
Q

How does SVM compare to logistic regression?

A

SVM: Maximizes margin and can handle non-linear relationships using kernels.
Logistic Regression: Directly models probabilities and is simpler for linear problems.
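
A side-by-side sketch (scikit-learn, as elsewhere in these notes): logistic regression outputs a probability directly, while the SVM outputs a signed margin distance.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

lr = LogisticRegression().fit(X, y)
svm = SVC(kernel="linear").fit(X, y)

# LR models P(y=1|x) directly; the SVM exposes a margin, not a probability.
print("LR  P(y=1):", lr.predict_proba(X[:1])[0, 1])
print("SVM margin:", svm.decision_function(X[:1])[0])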