Week 3 Flashcards
What is X in the marble-jar model?
The number of red marbles in the sample.
What is nu, expressed in terms of X and N?
nu = X/N
The number of red marbles in the sample divided by the sample size.
State the Hoeffding inequality:
P[|E_in - E_out| > epsilon] <= 2e^(-2 epsilon^2 N)
for all epsilon > 0
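As a rough illustration, the bound can be checked against a simulated marble jar. The sketch below assumes a jar in which a fraction mu of the marbles is red (mu is not named in the cards above; it plays the role of E_out, while nu = X/N plays the role of E_in). The function names and parameters are made up for this example.

```python
import math
import random

def hoeffding_bound(epsilon, n):
    """Right-hand side of the inequality: 2 * e^(-2 * epsilon^2 * n)."""
    return 2.0 * math.exp(-2.0 * epsilon ** 2 * n)

def estimate_deviation_probability(mu, n, epsilon, trials=10000, seed=0):
    """Estimate P[|nu - mu| > epsilon] by drawing many samples of size n.

    mu is the assumed fraction of red marbles in the jar.
    """
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # X: number of red marbles in this sample
        x = sum(1 for _ in range(n) if rng.random() < mu)
        nu = x / n
        if abs(nu - mu) > epsilon:
            bad += 1
    return bad / trials

if __name__ == "__main__":
    mu, n, epsilon = 0.5, 100, 0.1
    print("simulated:", estimate_deviation_probability(mu, n, epsilon))
    print("bound:    ", hoeffding_bound(epsilon, n))
```

The simulated frequency should come out well below the bound, since Hoeffding gives an upper limit that holds for any mu, not an exact probability.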
Name the two types of supervised learning problems:
Classification: Y consists of a small number of elements (binary case: 2 elements)
Regression: Y = R (the real numbers)
Linear regression
A linear model whose output is the signal s = w^T x itself, with no threshold applied: h(x) = w^T x
Logistic regression
A linear model that outputs a probability between 0 and 1. Uses no hard threshold; the signal is passed through the logistic function theta:
h(x) = theta(w^T x)
Give the logistic function theta(s):
theta(s) = e^s / (1 + e^s)
The output lies strictly between 0 and 1.
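A minimal sketch of the logistic function in Python. The split into two branches is an implementation choice added here to avoid overflow for large |s|; both branches are algebraically equal to e^s / (1 + e^s).

```python
import math

def theta(s):
    """Logistic function theta(s) = e^s / (1 + e^s)."""
    if s >= 0:
        # Equivalent form 1 / (1 + e^-s); e^-s cannot overflow for s >= 0.
        return 1.0 / (1.0 + math.exp(-s))
    # For s < 0, e^s is at most 1, so the textbook form is safe.
    e = math.exp(s)
    return e / (1.0 + e)

if __name__ == "__main__":
    for s in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(s, theta(s))
```

Note that theta(0) = 0.5 and theta(-s) = 1 - theta(s), which is what gives the curve its symmetric shape.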
Linear classification
Uses a hard threshold on the signal.
h(x) = sign(w^T x)
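The three linear models differ only in what they do with the signal s = w^T x. The sketch below puts them side by side; the function names, the example weights, and the convention that sign(0) maps to -1 are assumptions made for illustration, not part of the cards.

```python
import math

def signal(w, x):
    """The signal s = w^T x, with w and x given as equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def linear_regression(w, x):
    # No threshold: the output is the signal itself.
    return signal(w, x)

def logistic_regression(w, x):
    # Soft threshold: the logistic function theta applied to the signal.
    s = signal(w, x)
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def linear_classification(w, x):
    # Hard threshold: sign of the signal (0 is mapped to -1 here by assumption).
    return 1 if signal(w, x) > 0 else -1

if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]   # hypothetical weights
    x = [1.0, 3.0, 1.0]    # first coordinate fixed at 1 as a bias term
    print(linear_regression(w, x))      # real-valued
    print(logistic_regression(w, x))    # between 0 and 1
    print(linear_classification(w, x))  # +1 or -1
```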
classification output…
is bounded
regression output…
is real
What is meant by the ‘soft threshold’?
The logistic function theta(s), which moves smoothly from 0 to 1 instead of jumping between two values.
Why is the logistic function also called a sigmoid?
Because its shape looks like a flattened-out ‘s’.
What is the target that logistic regression is trying to learn?
A probability that depends on the input x.
What is the target function in logistic regression?
f(x) = P[y=+1 | x]
Error measure
A measure based on likelihood: how likely it is that we would get the output y from the input x if the target distribution P(y|x) were indeed captured by our hypothesis h(x).
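One way to make this concrete, assuming labels y in {+1, -1} and h(x) = theta(w^T x): the likelihood of y given x is h(x) when y = +1 and 1 - h(x) when y = -1, which can be written compactly as theta(y w^T x). Averaging the negative log of that likelihood over a data set gives the usual cross-entropy error. The names below are made up for this sketch.

```python
import math

def theta(s):
    """Logistic function, written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def signal(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def likelihood(w, x, y):
    """P(y | x) under the hypothesis h(x) = theta(w^T x), for y in {+1, -1}."""
    return theta(y * signal(w, x))

def cross_entropy_error(w, data):
    """Average of ln(1 + e^(-y * w^T x)) over (x, y) pairs in data."""
    total = 0.0
    for x, y in data:
        z = y * signal(w, x)
        # ln(1 + e^-z), computed in a form that stays finite for any z
        if z >= 0:
            total += math.log1p(math.exp(-z))
        else:
            total += -z + math.log1p(math.exp(z))
    return total / len(data)

if __name__ == "__main__":
    data = [([1.0, 2.0], 1), ([1.0, -2.0], -1)]  # hypothetical examples
    print(cross_entropy_error([0.0, 0.0], data))  # uninformative weights
    print(cross_entropy_error([0.0, 1.0], data))  # weights that fit the labels
```

A hypothesis that assigns high probability to the observed labels gets a low error; weights of all zeros assign probability 0.5 to everything and score ln 2 per example.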