Linear regression Flashcards
Data-set
Supervised Learning problem:
{(x1,y1),…,(xN,yN)}
xi ∈ R^d
yi ∈ R
Hypothesis
Assuming a linear relation between x and y
h(x) = sum(i=0,d) wi*xi = w’ x
where
w = [w0, w1, …, wd]’
x = [x0, x1, …, xd]’ with x0 = 1 (bias coordinate)
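A minimal numerical sketch of the hypothesis (NumPy; the weights and features are hypothetical values chosen for illustration):

```python
import numpy as np

# Hypothetical example with d = 2 features; x0 = 1 is the bias coordinate.
w = np.array([0.5, 2.0, -1.0])   # [w0, w1, w2]
x = np.array([1.0, 3.0, 4.0])    # [x0 = 1, x1, x2]

# h(x) = w' x = 0.5*1 + 2.0*3 - 1.0*4 = 2.5
h = w @ x
```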
Learning algorithm
Choose the weights w to minimize the sum of squared vertical distances between the data points and the hyperplane h(x)
In theory, minimization of the out-of-sample error:
Eout(h) = E[ (h(x) - f(x))^2 ]
Since the target function f and the input distribution are unknown, in practice we minimize the in-sample error:
Ein(h) = 1/N * sum(n=1,N) (h(xn) - yn)^2
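The in-sample error can be computed directly from the formula above; a small sketch with hypothetical toy data (bias column already included):

```python
import numpy as np

# Hypothetical data: N = 4 points, first column is the bias x0 = 1.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([0.1, 1.0, 2.1, 2.9])

w = np.array([0.0, 1.0])           # candidate hypothesis h(x) = x1

residuals = X @ w - y              # h(xn) - yn for each n
E_in = np.mean(residuals ** 2)     # (1/N) * sum of squared residuals
```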
Analytical formula that solves the problem
Least squares formula:
w^ = (X’ X)^-1 X’ Y
where X is the N × (d+1) input matrix (rows xn’, first column all ones)
Y ∈ R^N is the output vector
X’X is assumed invertible
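A sketch of the least squares formula on synthetic data (the true weights, noise level, and sample size are assumptions for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 2 + 3*x plus small noise, N = 50, d = 1.
N = 50
x = rng.uniform(-1, 1, size=N)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(N)

# N x (d+1) input matrix: first column of ones is the bias coordinate x0.
X = np.column_stack([np.ones(N), x])

# Closed-form least squares: w_hat = (X'X)^-1 X'Y, solved as a linear
# system rather than by explicit inversion for numerical stability.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With this much data and little noise, `w_hat` recovers the true weights [2, 3] closely.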
Generalization
Theorem:
Eout(h) = E[ (h(x) - f(x))^2 ] = Ein(h) + O(d/N)
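The O(d/N) gap can be observed empirically: estimate Eout on a large held-out sample and compare it with Ein for small and large N. Everything here (dimension, noise level, sample sizes) is a hypothetical setup for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_and_score(N, d=5, n_test=20000, noise=1.0):
    """Fit least squares on N samples; return (E_in, Monte Carlo E_out)."""
    w_true = np.ones(d + 1)

    def sample(n):
        # First column of ones = bias coordinate x0.
        X = np.column_stack([np.ones(n), rng.standard_normal((n, d))])
        y = X @ w_true + noise * rng.standard_normal(n)
        return X, y

    X_train, y_train = sample(N)
    X_test, y_test = sample(n_test)
    w_hat = np.linalg.lstsq(X_train, y_train, rcond=None)[0]
    E_in = np.mean((X_train @ w_hat - y_train) ** 2)
    E_out = np.mean((X_test @ w_hat - y_test) ** 2)
    return E_in, E_out

e_in_small, e_out_small = fit_and_score(N=10)     # small N: large gap
e_in_large, e_out_large = fit_and_score(N=5000)   # large N: gap ~ d/N
```

The gap Eout − Ein is large when N is close to d and shrinks toward zero as N grows, consistent with the O(d/N) term.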