Linear and Kernel Models Flashcards
Write down the general form of a linear predictive model
f(x) = <w, x> + b
Write down the optimal weight vector of a regularised model with loss L
w = argmin_w (1/N) sum_i L(f(xi), yi) + lambda * J(w)
What is the goal of linear regression?
Finding a linear function that best interpolates a given set of labelled training points.
Write down the optimal weight vector of Least Squares Regression
w = argmin_w (1/N) sum_i (f(xi) - yi)^2 = (X^T X)^(-1) X^T y
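A minimal NumPy sketch of the closed form above; the toy data and variable names are illustrative only, and the normal equations are solved rather than inverting X^T X explicitly:

```python
import numpy as np

# Toy design matrix X (N x p) and targets y; values are illustrative only
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])   # first column plays the role of the bias term
y = np.array([1.0, 2.0, 3.1])

# w = (X^T X)^(-1) X^T y, computed by solving the normal equations
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)
```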
What is Ridge Regression? What does it solve?
w = argmin_w (1/N) sum_i (f(xi) - yi)^2 + lambda * ||w||^2 = (X^T X + lambda I)^(-1) X^T y
It has a closed-form solution, prevents the weights from exploding, and can be used when N < p (where ordinary least squares fails because X^T X is not invertible)
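A short NumPy sketch of the ridge closed form, showing that it still works when N < p; the data and the lambda value are made up for illustration:

```python
import numpy as np

def ridge_fit(X, y, lam):
    # w = (X^T X + lambda I)^(-1) X^T y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.RandomState(0)
X = rng.randn(5, 20)          # N = 5 samples, p = 20 features (N < p)
y = rng.randn(5)
w = ridge_fit(X, y, lam=0.1)
print(w.shape)                # (20,)
```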
Define elastic net regression
A regression that combines the L1 (lasso) and L2 (ridge) penalties in one regulariser: J(w) = lambda1 * ||w||_1 + lambda2 * ||w||^2
Define 0-1 loss
Indicator function that returns 1 when the target and output are not equal and 0 otherwise
What is logistic regression?
A classification model that outputs a value between 0 and 1 representing the probability of an event; the linear part <w, x> + b models the log odds.
It uses a sigmoid function
p(y=1|x) = 1 / {1 + exp(-<w,x>-b)}
Or
log{p(y=1|x) / p(y=0|x)} = <w, x> + b
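A small sketch of the two equivalent forms above; the weights and input are made up for illustration:

```python
import numpy as np

def predict_proba(w, b, x):
    # p(y=1|x) = 1 / (1 + exp(-<w, x> - b))
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

w, b = np.array([2.0, -1.0]), 0.5   # illustrative parameters
x = np.array([1.0, 3.0])
p = predict_proba(w, b, x)
print(p)                            # probability of y = 1
print(np.log(p / (1 - p)))          # log odds, equals <w, x> + b = -0.5
```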
What happens when you apply high regularisation to regression?
It shrinks the weights, which limits the influence of individual points and smooths the fit; too much regularisation leads to underfitting.
What do SVMs maximise?
The margin, i.e. the distance from the separating hyperplane to the closest training points.
Write down the formula for the hard and soft margin versions of an SVM
Hard margin: min over w, b of (1/2)||w||^2 subject to yi(<w, xi> + b) >= 1 for all i.
Soft margin: min over w, b, s of (1/2)||w||^2 + C * sum_i s_i subject to yi(<w, xi> + b) >= 1 - s_i and s_i >= 0, where the s_i are slack variables.
What does increasing C do to an SVM?
It increases how much you pay for each point that violates the margin constraint, i.e. it decreases regularisation.
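An illustrative sketch using scikit-learn's SVC (assuming scikit-learn is available); the data and C values are arbitrary, the point is only that a larger C penalises margin violations more heavily:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) + 2, rng.randn(20, 2) - 2])   # two rough clusters
y = np.array([1] * 20 + [-1] * 20)

for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # small C: violations are cheap, so the margin stays wide and involves many support vectors;
    # large C: each violation is expensive, so the fit hugs the data more tightly
    print(C, len(clf.support_))
```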
What are the advantages of kernel methods?
They provide a computational shortcut (inner products in the feature space are computed without constructing it explicitly), and the dual solution only requires inverting an N x N matrix, which is cheaper than the primal when p > N.
What is the aim of the kernel function?
To embed data into a space where patterns can be discovered as linear relations
Write the formula for both the primal and dual solution
For ridge regression (see above):
Primal: w = (X^T X + lambda I_p)^(-1) X^T y
Dual: alpha = (X X^T + lambda I_N)^(-1) y, with w = X^T alpha, so f(x) = sum_i alpha_i <xi, x>
What is a Kernel function? How is it used?
A function k(x, x') = <phi(x), phi(x')> that computes the inner product of two points in a feature space. It is used as a substitute for the dot product in Kernel Regression, where for instance
alpha = (X X^T)^(-1) y
is replaced by
alpha = K^(-1) y
where K is the Gram matrix with entries K_ij = k(xi, xj).
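A sketch of this substitution using kernel ridge regression with an RBF kernel; the lambda*I term follows the dual solution above, and the data and gamma value are illustrative:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # K_ij = exp(-gamma * ||a_i - b_j||^2)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.RandomState(0)
X = rng.randn(30, 3)                   # illustrative training data
y = np.sin(X[:, 0])
lam = 0.1

K = rbf_kernel(X, X)                   # N x N Gram matrix replaces X X^T
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # alpha = (K + lambda I)^(-1) y

X_new = rng.randn(5, 3)
f_new = rbf_kernel(X_new, X) @ alpha   # f(x) = sum_i alpha_i k(x, xi)
print(f_new)
```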
What is multiple kernel learning?
The kernel K is considered a linear combination of M basis kernels, K = sum_m mu_m K_m. Both the kernel mixing weights mu_m and the dual coefficients alpha are then learned in a single optimisation problem.
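A toy sketch of the combined kernel; here the mixing weights mu are fixed by hand, whereas full multiple kernel learning would optimise mu and alpha jointly (for example by alternating updates). All names and values are illustrative:

```python
import numpy as np

def linear_kernel(A, B):
    return A @ B.T

def rbf_kernel(A, B, gamma=1.0):
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.RandomState(0)
X = rng.randn(20, 4)
y = rng.randn(20)

mu = np.array([0.7, 0.3])                                    # kernel mixing weights (fixed here)
K = mu[0] * linear_kernel(X, X) + mu[1] * rbf_kernel(X, X)   # K = sum_m mu_m * K_m
alpha = np.linalg.solve(K + 0.1 * np.eye(len(X)), y)         # dual coefficients for this K
print(alpha[:3])
```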