Linear classification Flashcards
Data set
x: vector of d attributes
x = [x1, x2, …, xd]
y = f(x), where f is the unknown target function
data set: {(x1,y1), …, (xN,yN)}
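A minimal sketch of this data representation in NumPy (the names X, y and the toy values are illustrative, not from the flashcards):

import numpy as np

# N = 4 points with d = 2 attributes each; binary labels y in {-1, +1}
X = np.array([[ 1.0,  2.0],
              [ 2.0,  0.5],
              [-1.0, -1.5],
              [-2.0,  0.0]])    # shape (N, d)
y = np.array([+1, +1, -1, -1])  # shape (N,)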
Hypothesis set
Perceptron:
h(x) = sign( sum(i=1..d) wi*xi + b )
a linear combination of the attributes, plus a bias b
-> h(x) = sign( sum(i=0..d) wi*xi )
where w0 = b and x0 = 1 (the bias is absorbed into the weight vector)
unknowns: w = [w0, w1, …, wd]' (d+1 weights)
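A minimal sketch of this hypothesis in NumPy (the function name h and the variable names are illustrative):

import numpy as np

def h(x, w):
    # Absorb the bias: prepend x0 = 1 so that w[0] plays the role of b
    x_aug = np.concatenate(([1.0], x))
    return np.sign(w @ x_aug)  # +1 or -1 (0 only exactly on the boundary)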
Learning algorithm
Perceptron Learning Algorithm (PLA)
Main assumption: there exists a separating hyperplane (the data must be linearly separable)
Idea:
- pick a misclassified point (xn, yn)
- update w so that the boundary moves toward classifying that point correctly
-> update rule: w(t+1) = w(t) + yn*xn (a minimal code sketch follows)
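A minimal PLA sketch in NumPy under the separability assumption (the function name pla and the max_iters safety cap are illustrative additions, not part of the algorithm):

import numpy as np

def pla(X, y, max_iters=10_000):
    # Augment with x0 = 1 so that w[0] plays the role of the bias b
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(X_aug.shape[1])
    for _ in range(max_iters):
        preds = np.sign(X_aug @ w)        # current classifications
        wrong = np.where(preds != y)[0]   # indexes of misclassified points
        if wrong.size == 0:               # Ein(h) = 0: perfect separation
            return w
        n = wrong[0]                      # pick one misclassified point
        w = w + y[n] * X_aug[n]           # update rule: w <- w + yn * xn
    return w                              # cap hit (data may not be separable)

On linearly separable data the loop exits with Ein(h) = 0; the cap only guards against non-separable input.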
Convergence of the PLA
If the data are linearly separable, PLA converges to a perfect classification in a finite number of iterations
-> Ein(h) = 0 (zero in-sample error)
Generalization of PLA
Eout(h) = P[h(x) != f(x)]
Eout(h) <= Ein(h) + O( sqrt( d/N * ln(N) ) )
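A quick numeric look at the O( sqrt( d/N * ln(N) ) ) term, taking the hidden constant to be 1 purely for illustration (an assumption):

import numpy as np

d = 3  # illustrative number of attributes
for N in (100, 1_000, 10_000):
    print(N, np.sqrt(d / N * np.log(N)))
# -> ~0.37, ~0.14, ~0.05: the generalization gap shrinks as N grows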
Confusion matrix
              actual +   actual -
predicted +   a          b
predicted -   c          d
a: true positives (TP)
b: false positives (FP)
c: false negatives (FN)
d: true negatives (TN)
Precision
Precision or positive predictive value (PPV)
PPV = a/(a+b)
Recall
Recall or true positive rate (TPR)
TPR = a/(a+c)
F1-score
F1 = 2 * PPV * TPR / (PPV + TPR)
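A minimal sketch computing the three indexes from the confusion-matrix counts (the function name scores and the zero-division guards are illustrative additions):

def scores(a, b, c, d):
    # a=TP, b=FP, c=FN, d=TN, as in the confusion matrix above
    ppv = a / (a + b) if a + b else 0.0   # precision
    tpr = a / (a + c) if a + c else 0.0   # recall
    f1 = 2 * ppv * tpr / (ppv + tpr) if ppv + tpr else 0.0
    return ppv, tpr, f1

# Example: 8 TP, 2 FP, 4 FN, 86 TN -> PPV = 0.8, TPR ~ 0.667, F1 ~ 0.727
print(scores(8, 2, 4, 86))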
Performance indexes
- Precision or positive predictive value
- Recall or true positive rate
- F1-score