SVM Flashcards

1
Q

SVM

A

A discriminative classifier formally defined by a separating hyperplane. Given labelled training data (supervised learning), the algorithm outputs an optimal hyperplane that categorizes new examples.

2
Q

Margin

SVM

A

The distance between the separating line (in general, the hyperplane) and the closest data point.
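
As a minimal sketch of this definition, assuming the hyperplane is given by a weight vector w and an offset b, the margin is the smallest point-to-hyperplane distance |w.x + b| / ||w||. The function names and sample points are illustrative and not from the card.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm

def margin(w, b, points):
    """Margin = distance from the hyperplane to the closest data point."""
    return min(distance_to_hyperplane(w, b, p) for p in points)

# Hypothetical 2D example: the line x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0
points = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0), (4.0, 2.0)]
print(margin(w, b, points))  # closest point is (1, 1), at distance 1/sqrt(2)
```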

3
Q

Maximum-Margin Hyperplane

A

The best (optimal) line for separating two classes is the one with the largest margin.

https://medium.com/@skilltohire/support-vector-machines-4d28a427ebd\

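w^T x + b = -1 (margin boundary for the negative class)
w^T x + b = 0 (separating hyperplane)
w^T x + b = 1 (margin boundary for the positive class)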
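
The three equations above can be sketched in code as follows. A point is classified by the sign of w^T x + b, and the two margin boundaries are 2 / ||w|| apart, which is why maximizing the margin amounts to minimizing ||w||. The weights and helper names below are hypothetical, chosen only for illustration.

```python
import math

def decision_value(w, b, x):
    """Signed score w^T x + b: 0 on the hyperplane, +/-1 on the margin boundaries."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Classify by which side of the hyperplane w^T x + b = 0 the point falls on."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the planes w^T x + b = -1 and w^T x + b = 1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical weights, for illustration only
w, b = [3.0, 4.0], -5.0
print(predict(w, b, (3.0, 1.0)))  # score 9 + 4 - 5 = 8, so class +1
print(predict(w, b, (0.0, 0.0)))  # score -5, so class -1
print(margin_width(w))            # ||w|| = 5, so width 2/5
```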

4
Q

Soft Margin Classifier

A
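  • Real data is messy and usually cannot be separated perfectly by a hyperplane.
  • Relaxing the margin constraint allows some training points to fall inside the margin or on the wrong side of the hyperplane.
  • A tuning parameter C is introduced that bounds the total amount of margin violation allowed across all points (the "wiggle room").
  • C = 0 means no violation -> Maximal Margin Classifier. Here C is a violation budget; some libraries (e.g. scikit-learn) instead use C as a penalty on violations, so there a large C approaches the hard margin.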
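
A rough sketch of the idea above, assuming the usual scaling in which the margin boundaries sit at w^T x + b = +/-1: each point gets a slack value measuring how far it intrudes past its own margin boundary, and the slacks are summed against the budget C. The `slack` and `within_budget` helpers and the data are hypothetical.

```python
def slack(w, b, x, y):
    """Margin violation of one point with label y in {-1, +1}.

    0: on or outside its margin boundary (no violation)
    between 0 and 1: inside the margin but still on the correct side
    above 1: on the wrong side of the hyperplane
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def within_budget(w, b, data, C):
    """True if the total violation does not exceed the budget C."""
    return sum(slack(w, b, x, y) for x, y in data) <= C

# Hypothetical hyperplane x1 = 0 and four labelled points
w, b = [1.0, 0.0], 0.0
data = [((2.0, 0.0), 1), ((0.5, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 1.0), -1)]

print([slack(w, b, x, y) for x, y in data])  # [0.0, 0.5, 0.0, 1.5]
print(within_budget(w, b, data, C=0))        # False: a hard margin is violated
print(within_budget(w, b, data, C=2))        # True: total slack is 2.0
```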
5
Q

What are the different kernel types? (5)

A

1) Linear Kernel - inner product + constant -> K(x, xi) = sum(x * xi) + c
2) Polynomial Kernel -> K(x, xi) = (1 + sum(x * xi))^d
3) Radial Basis Function (RBF) or Gaussian Kernel -> K(x, xi) = exp(-gamma * sum((x - xi)^2)). Note that gamma is often between 0 and 1.
4) Sigmoid Kernel -> K(x, xi) = tanh(gamma * sum(x * xi) + c)
5) Chi-Squared Kernel -> K(x, xi) = exp(-gamma * sum((x - xi)^2 / (x + xi))), for non-negative features. It is named after the chi-squared statistic, chi^2 = SUM((observed values - expected values)^2 / expected values).
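
The first four kernels listed above can be sketched in plain Python as below. The default values for c, d and gamma are arbitrary assumptions for illustration, and the function names are not from any particular library.

```python
import math

def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z, c=0.0):
    return dot(x, z) + c

def polynomial_kernel(x, z, d=2):
    return (1.0 + dot(x, z)) ** d

def rbf_kernel(x, z, gamma=0.5):
    # Squared Euclidean distance between the two points
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, z, gamma=0.5, c=0.0):
    return math.tanh(gamma * dot(x, z) + c)

x, z = (1.0, 2.0), (3.0, 0.0)   # dot product is 3, squared distance is 8
print(linear_kernel(x, z))      # 3.0
print(polynomial_kernel(x, z))  # (1 + 3)^2 = 16.0
print(rbf_kernel(x, z))         # exp(-4)
print(sigmoid_kernel(x, z))     # tanh(1.5)
```

Note that the RBF kernel of a point with itself is always 1, and it decays towards 0 as the two points move apart.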

6
Q

How do we evaluate an SVM model?

A

Confusion Matrix (rows: predicted values, columns: actual values)
| Predicted \ Actual | + | - |
| + | TP | FP |
| - | FN | TN |
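
As a small sketch of how the four cells are filled, the function below counts them from paired lists of actual and predicted labels. The label lists and the `confusion_counts` name are made up for illustration.

```python
def confusion_counts(actual, predicted, positive="+"):
    """Return (TP, FP, FN, TN) for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # predicted +, actually +
        elif p == positive:
            fp += 1   # predicted +, actually -
        elif a == positive:
            fn += 1   # predicted -, actually +
        else:
            tn += 1   # predicted -, actually -
    return tp, fp, fn, tn

# Hypothetical labels
actual    = ["+", "+", "-", "-", "+", "-"]
predicted = ["+", "-", "-", "+", "+", "-"]
print(confusion_counts(actual, predicted))  # (2, 1, 1, 2)
```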

7
Q

What are the formulas for precision, F1, recall and accuracy?

A

Precision = TP/(TP+FP)
Recall = TP/(TP + FN)
Accuracy = (TP + TN)/(TP + TN + FP + FN)
F1 = 2 * (Precision * Recall)/(Precision + Recall)
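
These formulas translate directly into code. The sketch below assumes the denominators are non-zero (it does not guard against a model that predicts no positives), and the counts in the usage lines are hypothetical.

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical counts
tp, fp, fn, tn = 8, 2, 2, 8
print(precision(tp, fp), recall(tp, fn), accuracy(tp, tn, fp, fn), f1_score(tp, fp, fn))
```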

8
Q

Calculate Precision, Recall, Accuracy and F1 when the following is true:
TP = 50 ; FP = 10 ; FN = 5 ; TN = 100

A

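Final Results
Precision = 50/(50 + 10) ≈ 0.8333
Recall = 50/(50 + 5) ≈ 0.9091
Accuracy = (50 + 100)/(50 + 10 + 5 + 100) ≈ 0.9091
F1 Score = 2 * (0.8333 * 0.9091)/(0.8333 + 0.9091) ≈ 0.8696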
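
One way to double-check this arithmetic is to work in exact fractions and round only at the end. This sketch uses Python's standard `fractions` module with the counts given in the question.

```python
from fractions import Fraction

tp, fp, fn, tn = 50, 10, 5, 100

precision = Fraction(tp, tp + fp)                 # 50/60
recall = Fraction(tp, tp + fn)                    # 50/55
accuracy = Fraction(tp + tn, tp + fp + fn + tn)   # 150/165
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("Precision", precision), ("Recall", recall),
                    ("Accuracy", accuracy), ("F1", f1)]:
    print(f"{name}: {value} = approx. {float(value):.4f}")
```

The exact values are 5/6, 10/11, 10/11 and 20/23, which round to 0.8333, 0.9091, 0.9091 and 0.8696.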

9
Q

Solve the following multi-class example. For each class, find the true positive, true negative, false positive and false negative values.

Also solve for precision, recall and F1.

| Predicted Class \ True Class | Apple | Orange | Mango |
| Apple | 7 | 8 | 9 |
| Orange | 1 | 2 | 3 |
| Mango | 3 | 2 | 1 |

A

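Rows are predicted classes and columns are true classes, so for each class FP comes from the rest of its row and FN from the rest of its column. Total = 36.

For class: Apple
TP = 7 (predicted Apple, true Apple)
FP = 8 + 9 = 17 (predicted Apple, but true Orange or Mango)
FN = 1 + 3 = 4 (true Apple, but predicted Orange or Mango)
TN = sum of all other cells = 2 + 3 + 2 + 1 = 8 (equivalently Total - TP - FP - FN = 36 - 7 - 17 - 4 = 8)
Precision = 7/(7 + 17) = 0.29
Recall = 7/(7 + 4) = 0.64
F1-score = 0.40

For class: Orange
TP = 2
FP = 1 + 3 = 4 (predicted Orange, but true Apple or Mango)
FN = 8 + 2 = 10 (true Orange, but predicted Apple or Mango)
TN = 36 - 2 - 4 - 10 = 20
Precision = 2/(2 + 4) = 0.33
Recall = 2/(2 + 10) = 0.17
F1-score = 0.22

For class: Mango
TP = 1
FP = 3 + 2 = 5 (predicted Mango, but true Apple or Orange)
FN = 9 + 3 = 12 (true Mango, but predicted Apple or Orange)
TN = 36 - 1 - 5 - 12 = 18
Precision = 1/(1 + 5) = 0.17
Recall = 1/(1 + 12) = 0.08
F1-score = 0.11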
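
The per-class counts can be derived mechanically from the matrix by treating each class in turn as "positive" and everything else as "negative" (one-vs-rest). The sketch below assumes the table's orientation, with rows as predicted classes and columns as true classes; the `one_vs_rest_counts` name is hypothetical.

```python
labels = ["Apple", "Orange", "Mango"]

# Rows = predicted class, columns = true class (as in the table in the question)
matrix = [
    [7, 8, 9],
    [1, 2, 3],
    [3, 2, 1],
]

def one_vs_rest_counts(matrix, k):
    """Return (TP, FP, FN, TN) for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                  # predicted k, but true class differs
    fn = sum(row[k] for row in matrix) - tp   # true k, but predicted class differs
    tn = total - tp - fp - fn
    return tp, fp, fn, tn

for k, name in enumerate(labels):
    tp, fp, fn, tn = one_vs_rest_counts(matrix, k)
    p, r = tp / (tp + fp), tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn} "
          f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```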

10
Q

What are the advantages and disadvantages of SVM?

A

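Advantages
1) High accuracy
2) Works well when the data is linearly separable
3) Maximizing the margin helps avoid overfitting

Disadvantages
1) Sensitive to noise
2) Natively handles only two classes (multi-class problems must be split into binary ones)
3) Computationally inefficient on large datasets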
