Regularisation Flashcards
Goal of regularisation in NNs
Take a model family that includes the true data generating process but also many other possible generating processes
And
Help the learned model match the true data generating process
Denote regularized objective function
J̃(θ; X, y) = J(θ; X, y) + αΩ(θ)
Where:
α ∈ [0, ∞) is a hyperparameter weighting the penalty (α = 0 means no regularisation)
Ω(θ) is the parameter norm penalty
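A minimal sketch of the definition above, assuming an L2 choice of Ω purely for illustration (the function name and arguments are my own):

```python
import numpy as np

def regularised_objective(J, w, alpha):
    # J~(theta; X, y) = J(theta; X, y) + alpha * Omega(theta);
    # Omega here is the L2 penalty (1/2)||w||^2, chosen only for illustration.
    omega = 0.5 * np.sum(w ** 2)
    return J + alpha * omega
```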
Normal practice when selecting a parameter norm penalty for NNs
Choose Ω to penalise only the weights of the affine transformation at each layer and leave the biases unregularised; biases typically require less data than weights to fit accurately
This is largely because each weight governs an interaction between two variables, while each bias controls only a single variable
Also, regularising the biases can introduce a significant amount of underfitting
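A sketch of this convention, assuming a hypothetical parameter dictionary whose "W"/"b" naming is my own:

```python
import numpy as np

def weight_only_penalty(params):
    # Sum the squared L2 penalty over weight matrices only, leaving the
    # bias vectors unregularised. The "W"/"b" key naming is an assumption.
    return 0.5 * sum(np.sum(p ** 2) for name, p in params.items()
                     if name.startswith("W"))
```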
w
Refers to all weights that should be affected by the norm penalty, while θ denotes all parameters, including both w and the unregularised parameters
L2 called?
Parameter norm penalty; AKA weight decay, ridge regression, or Tikhonov regularisation
Show how L2 works on gradient of objective function (let θ = w for simplicity)
With Ω(w) = (1/2)||w||₂², the objective is J̃(w; X, y) = (α/2)wᵀw + J(w; X, y)
Gradient: ∇_w J̃(w; X, y) = αw + ∇_w J(w; X, y)
A single gradient step becomes w ← (1 − εα)w − ε∇_w J(w; X, y), i.e. the weights are multiplicatively shrunk before the usual gradient update
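The weight-decay update can be sketched as follows (function name and arguments are illustrative):

```python
import numpy as np

def l2_gradient_step(w, grad_J, alpha, eps):
    # w <- w - eps * (alpha * w + grad_J) = (1 - eps*alpha) * w - eps * grad_J:
    # weight decay multiplicatively shrinks w before the usual gradient update.
    return (1.0 - eps * alpha) * w - eps * grad_J
```

With a zero gradient the step reduces to pure shrinkage of w by the factor (1 − εα).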
Further simplify by making a quadratic approximation of the objective function in the neighbourhood of w*, the value of the weights that obtains minimal unregularized training cost
If the objective function is truly quadratic, the approximation is perfect and given by
Ĵ(w) = J(w*) + (1/2)(w − w*)ᵀH(w − w*)
where H is the Hessian of J with respect to w, evaluated at w*
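A numerical check of the resulting L2 minimiser, using an arbitrary positive definite Hessian H and arbitrary w* of my own choosing:

```python
import numpy as np

# For the quadratic approximation J_hat(w) = J(w*) + 0.5 (w - w*)^T H (w - w*),
# adding the L2 term (alpha/2)||w||^2 and setting the gradient
# alpha * w + H (w - w*) to zero gives w~ = (H + alpha I)^{-1} H w*.
# H, w*, and alpha below are illustrative values.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
H = A @ A.T + 3.0 * np.eye(3)   # symmetric positive definite Hessian
w_star = rng.normal(size=3)      # unregularised minimiser
alpha = 0.7

w_tilde = np.linalg.solve(H + alpha * np.eye(3), H @ w_star)
grad = alpha * w_tilde + H @ (w_tilde - w_star)  # gradient at w~, should be ~0
```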
L1 regularisation term
Ω(θ) = ||w||₁ = Σᵢ |wᵢ|
Give regularized objective function and corresponding gradient for L1 (for SLRM)
J̃(w; X, y) = α||w||₁ + J(w; X, y)
∇_w J̃(w; X, y) = α·sign(w) + ∇_w J(w; X, y), where sign(w) is applied element-wise
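A sketch of the L1 objective and its (sub)gradient (function name and arguments are my own):

```python
import numpy as np

def l1_objective_and_grad(J, grad_J, w, alpha):
    # J~ = alpha * ||w||_1 + J; each coordinate of the (sub)gradient gains
    # a constant-magnitude term alpha * sign(w_i) rather than alpha * w_i.
    obj = alpha * np.sum(np.abs(w)) + J
    grad = alpha * np.sign(w) + grad_J
    return obj, grad
```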
Describe difference in grad of objective function from L2 to L1
The regularization contribution to the gradient no longer scales linearly with each wᵢ
Instead it is a constant factor α multiplied by sign(wᵢ)
Therefore we won't necessarily see clean algebraic solutions to quadratic approximations of the (unregularized) objective function
Taylor series expansion of grad of cost function for SLRM L1
∇_w Ĵ(w) = H(w − w*)
Here we approximate the more complicated cost function by a truncated Taylor series expansion around w*, and analyse the regularised objective on this simpler quadratic model
Decompose quadratic approximation of L1 regularised objective function for SLRM
(We assume the Hessian is diagonal with each diagonal entry Hᵢᵢ > 0, for simplicity)
Ĵ(w; X, y) = J(w*; X, y) + Σᵢ [ (1/2)Hᵢᵢ(wᵢ − wᵢ*)² + α|wᵢ| ]
Solution to the decomposed L1 objective (per component)
wᵢ = sign(wᵢ*) · max{ |wᵢ*| − α/Hᵢᵢ , 0 }
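The per-component L1 solution (soft thresholding) can be sketched as follows; the function name and arguments are illustrative:

```python
import numpy as np

def l1_componentwise_solution(w_star, H_diag, alpha):
    # w_i = sign(w*_i) * max(|w*_i| - alpha / H_ii, 0): components whose
    # unregularised optimum is small enough are clamped exactly to zero,
    # which is the source of L1-induced sparsity.
    return np.sign(w_star) * np.maximum(np.abs(w_star) - alpha / H_diag, 0.0)
```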
Sparsity in context of regularization
Having parameters with optimal value of 0
Solution to L2 regularisation for SLRM (with θ = w)
(7.13) w = (XᵀX + αI)⁻¹Xᵀy
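A numerical sketch of the closed-form ridge solution, using illustrative noiseless data of my own (true_w and the data shapes are assumptions):

```python
import numpy as np

def ridge_solution(X, y, alpha):
    # Closed-form L2-regularised linear regression:
    # w = (X^T X + alpha I)^{-1} X^T y; alpha = 0 recovers ordinary least squares.
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Illustrative noiseless data.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w_ols = ridge_solution(X, y, 0.0)    # recovers true_w on noiseless data
w_ridge = ridge_solution(X, y, 5.0)  # shrunk towards zero relative to OLS
```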