Week 3 Flashcards
What is X in the marble-and-bin model?
The number of red marbles in the sample.
What is nu (the Greek letter ν), expressed in X and N?
X/N
The number of red marbles in the sample / the size of the sample.
Give the Hoeffding Inequality:
P[|E.in - E.out| > epsilon] <= 2e^(-2(epsilon^2)N)
for all epsilon > 0
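A minimal numeric sketch (the function name `hoeffding_bound` is my own): the right-hand side of the bound shrinks exponentially in N; for epsilon = 0.05 and N = 1000 it is about 0.0135.

```python
import math

def hoeffding_bound(epsilon, N):
    # Right-hand side of the Hoeffding inequality: 2e^(-2 * epsilon^2 * N)
    return 2 * math.exp(-2 * epsilon**2 * N)

print(hoeffding_bound(0.05, 1000))  # 2e^(-5) ≈ 0.0135
```

Doubling N squares the factor e^(-2 epsilon^2 N), so larger samples make a large deviation between E.in and E.out exponentially less likely.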
Name the two kinds of supervised learning problems:
Classification: Y consists of a small number of elements (binary: 2 elements)
Regression: Y = R
Linear regression
A linear model whose output is the signal itself: h(x) = w.T * x.
Logistic regression
A linear model that outputs a probability between 0 and 1. It applies no hard threshold; instead it passes the signal through the soft threshold theta:
h(x) = theta( w.T * x )
Give the logistic function theta(s):
theta(s) = (e^s) / (1+e^s)
output between 0 and 1
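A quick sketch of theta(s) in code (pure Python, names my own):

```python
import math

def theta(s):
    # Logistic function: e^s / (1 + e^s); output is always between 0 and 1
    return math.exp(s) / (1 + math.exp(s))

print(theta(0))  # 0.5: the soft threshold is centered at s = 0
```

For large negative s the output approaches 0, for large positive s it approaches 1, which is what makes theta a "soft" version of the hard sign threshold.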
Linear classification
Uses a hard threshold on the signal.
h(x) = sign(w^T *x)
classification output…
is bounded
regression output…
is real
What is meant with the ‘soft threshold’?
the logistic function theta(s)
Why is the logistic function also called a sigmoid?
because its shape looks like a flattened out ‘s’
What is the target that a logistic function is trying to learn?
A probability that depends on the input x.
What is the target function in logistic regression?
f(x) = P[y=+1 | x]
Error measure
How likely it is that we would get this output y from the input x if the target distribution P(y|x) were indeed captured by our hypothesis h(x).
Give the formula for the in-sample error in linear regression:
E.in(w) = 1/N * sum_{n=1}^{N} (w.T * x.n - y.n)^2
method of maximum likelihood
Selects the hypothesis h(x) that maximises the probability of getting all y.n's in the data set from the corresponding x.n's.
Give the formula for the in-sample error measure for logistic regression:
E.in(w) = 1/N * sum_{n=1}^{N} ln(1 + e^(-y.n * w.T * x.n))
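As a sketch (pure Python, function name and toy data my own), the cross-entropy in-sample error can be computed directly from this formula; with w = 0 every signal is 0, so E.in = ln 2 regardless of the data:

```python
import math

def logistic_ein(w, X, y):
    # E.in(w) = 1/N * sum over n of ln(1 + e^(-y_n * w^T x_n))
    N = len(X)
    total = 0.0
    for xn, yn in zip(X, y):
        signal = sum(wi * xi for wi, xi in zip(w, xn))  # w^T x_n
        total += math.log(1 + math.exp(-yn * signal))
    return total / N

print(logistic_ein([0.0, 0.0], [[1, 1], [1, -1]], [1, -1]))  # ln(2) ≈ 0.693
```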
What is the target in linear regression?
A noisy target function, formalized as a distribution P(y|x) of the random variable y.
Linear regression: method and goal
We have an unknown distribution P(x,y) that generates each (xn,yn), and we want to find a hypothesis g that minimizes the error between g(x) and y with respect to that distribution.
Matrix representation of E.in(w) in linear regression:
E.in(w) = 1/N * ||X*w - y||^2, where X is the N x (d+1) matrix with the input vectors x.n as rows and y is the target vector with the target values y.n as components.
How do you get the gradient of Ein(w) to be 0?
Solve the following for a w:
X^T * X * w = X^T * y
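A sketch with NumPy (toy data my own): `np.linalg.solve` solves the normal equations X.T * X * w = X.T * y directly, which is numerically preferable to forming the inverse explicitly.

```python
import numpy as np

# Toy data on the line y = 1 + 2x, with x0 = 1 prepended to each input
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# Solve X^T X w = X^T y for w (the zero-gradient condition)
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # [1. 2.]
```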
What is ^y in the squared error?
The estimate according to the hypothesis.
What is y in the squared error?
The correct value we are aiming for (the target value).
Give the formula for the squared error:
e(^y, y) = (^y - y)^2
Linear regression algorithm in 3 steps
1) Construct matrix X and vector y from the data set, with each x0=1.
2) Compute the pseudo-inverse of the matrix X.
3) Return w.lin = (pseudo-inverse of X) * y
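The three steps sketched with NumPy (toy data my own); `np.linalg.pinv` computes the pseudo-inverse, which equals (X.T * X)^-1 * X.T whenever X.T * X is invertible:

```python
import numpy as np

# Step 1: matrix X (each row an input with x0 = 1) and target vector y
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 4.0])

# Step 2: pseudo-inverse of X
X_dagger = np.linalg.pinv(X)

# Step 3: w_lin = (pseudo-inverse of X) * y
w_lin = X_dagger @ y
print(w_lin)  # [1. 1.] — the data lie exactly on y = 1 + x
```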
OLS
Ordinary least squares
How is the derivative of a function f in the direction w.i written?
∂f(x) / ∂w.i (with the "reversed a" symbol ∂)
What is ∇f(x) (the upside-down triangle ∇)?
The vector of the derivatives of f(x) in the directions w0, …, wn.
What is the least-squares estimator?
The solution w.lin that you get by setting the gradient of E.in to 0 and solving for w.
What is w.lin?
w.lin = ((X.T * X)^-1) * X.T * y
What can you do with the least-squares estimator?
Predict y for an arbitrary x.
How do you predict y for an arbitrary x with the least-squares estimator?
^y = w.lin.T * x
What does the hat matrix do?
It maps the actual output data y to the outputs predicted by the hypothesis: ^y = H * y, with H = X * (X.T * X)^-1 * X.T.
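A sketch with NumPy (toy data my own): the hat matrix H = X * (X.T * X)^-1 * X.T "puts the hat on y", and the result matches predicting with the least-squares estimator:

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])   # not exactly on a line, so ^y differs from y

# Hat matrix: H = X (X^T X)^{-1} X^T, so that ^y = H y
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y

# Same predictions as X @ w_lin with w_lin the least-squares estimator
w_lin = np.linalg.pinv(X) @ y
print(np.allclose(y_hat, X @ w_lin))  # True
```

H is idempotent (H @ H = H): applying the hat twice changes nothing, since the first application already projects y onto the column space of X.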
What is the main difference between the learning approach and the design approach?
The role that data plays: in the design approach, the problem is well-defined and f can be analytically derived without seeing data. In the learning approach, data is needed to pin down f.