Past Papers Flashcards
What are hyperparameters?
Set before training
Be sure to specify that they DEFINE network’s architecture
For an L1 regularised neural network, write down how the regularisation term changes the way the parameters are modified during back prop
For an L1 regularised neural network, write the loss function expanded around θ* and show what the minimum is for θi
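A sketch of the standard answer to the two L1 cards above (following the usual textbook treatment; assumes the Hessian H of the unregularised loss at its minimum θ* is diagonal):

```latex
% Regularised loss and the resulting gradient step:
\tilde{L}(\theta) = L(\theta) + \lambda \|\theta\|_1,
\qquad
\theta_i \leftarrow \theta_i - \eta\left(\frac{\partial L}{\partial \theta_i}
  + \lambda\,\mathrm{sign}(\theta_i)\right)

% Quadratic expansion around the unregularised minimum \theta^*
% (diagonal Hessian H assumed):
\tilde{L}(\theta) \approx L(\theta^*)
  + \sum_i \left[ \tfrac{1}{2} H_{ii} (\theta_i - \theta_i^*)^2
  + \lambda |\theta_i| \right]

% Componentwise minimiser (soft thresholding):
\theta_i = \mathrm{sign}(\theta_i^*)
  \max\!\left( |\theta_i^*| - \frac{\lambda}{H_{ii}},\ 0 \right)
```

The max(·, 0) is what drives insignificant parameters exactly to zero (the sparsity mentioned in the next card).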
For an L2 regularised neural network:
Write down how regularisation term changes the way the parameters update
Expand around the min of the loss function
Write down an expression for the ith component of the minimum
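A sketch of the standard answer to the three L2 cards above (same setup as the L1 case: quadratic expansion around the unregularised minimum θ* with Hessian H):

```latex
% Regularised loss and the resulting update (weight decay):
\tilde{L}(\theta) = L(\theta) + \tfrac{\lambda}{2}\|\theta\|_2^2,
\qquad
\theta_i \leftarrow (1 - \eta\lambda)\,\theta_i
  - \eta \frac{\partial L}{\partial \theta_i}

% Quadratic expansion around the unregularised minimum \theta^*:
\tilde{L}(\theta) \approx L(\theta^*)
  + \tfrac{1}{2}(\theta - \theta^*)^\top H (\theta - \theta^*)
  + \tfrac{\lambda}{2}\|\theta\|_2^2

% Minimiser; with H diagonal, the i-th component is
\tilde{\theta} = (H + \lambda I)^{-1} H\, \theta^*,
\qquad
\tilde{\theta}_i = \frac{H_{ii}}{H_{ii} + \lambda}\, \theta_i^*
```

So each component is shrunk by a factor H_ii/(H_ii + λ): components with small curvature are shrunk towards zero but (unlike L1) never reach it exactly.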
When doing questions with L1 remember to
Mention that we are introducing sparsity to the solution -> some parameters will go to zero if they are not significant
Describe what early stopping does
At each iteration we check how the error on a held-out validation set behaves. After p (patience) consecutive iterations in which the validation error fails to improve, the algorithm terminates
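A minimal sketch of the patience rule described above, using one common variant (counting iterations with no improvement over the best validation error seen so far); the function name is hypothetical:

```python
def early_stopping(val_errors, patience):
    """Return the iteration index at which training stops, or None if
    stopping never triggers: stop after `patience` consecutive
    iterations with no improvement over the best validation error."""
    best = float("inf")
    bad_streak = 0
    for t, err in enumerate(val_errors):
        if err < best:       # validation error improved
            best = err
            bad_streak = 0
        else:                # no improvement this iteration
            bad_streak += 1
            if bad_streak >= patience:
                return t
    return None
```

In practice the parameters from the iteration with the best validation error are usually restored when stopping triggers.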
GPT4o definition of Universal Approximation Property
A feedforward NN with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact subset of R^d, given an appropriate activation function
What is the capacity of an infinitely wide neural network with a single hidden layer?
By the UAP, this network can approximate any continuous function on a compact set and therefore has infinite capacity
When asked to compare loss functions?
Remember to check whether the functions are bounded & whether they are differentiable
When discussing MSE?
Remember to mention that it is preferred for regression
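For reference when answering this card, the definition (differentiable everywhere, unbounded above, penalises large errors quadratically):

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2
```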
How does L2 regularisation change the way the parameters are updated using backprop?
When finding the values that minimise MSE for linear regression
Mention that the system of equations obtained by setting the derivatives with respect to β to zero must be linearly independent
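A sketch of the derivation this card refers to: setting the derivative of the MSE of a linear model to zero gives the normal equations, which have a unique solution exactly when the columns of X are linearly independent (so that X^T X is invertible):

```latex
% Minimise \|y - X\beta\|_2^2 over \beta:
\frac{\partial}{\partial \beta} \|y - X\beta\|_2^2
  = -2\, X^\top (y - X\beta) = 0
\;\Longrightarrow\; X^\top X \beta = X^\top y
\;\Longrightarrow\; \hat{\beta} = (X^\top X)^{-1} X^\top y
```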
How many times are parameters updated?
N * (1 - validation_split) = num of training samples (X)
X / batch size = num of batches (B)
B * epochs = number of parameter updates
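The three-step count above as a runnable sketch; all the numbers are hypothetical, chosen only to make the arithmetic concrete:

```python
# Hypothetical values for illustration:
N = 10_000              # total samples in the dataset
validation_split = 0.2  # fraction held out for validation
batch_size = 32
epochs = 10

num_train = int(N * (1 - validation_split))  # X = training samples
num_batches = num_train // batch_size        # B = batches per epoch
num_updates = num_batches * epochs           # total parameter updates
```

With these numbers: X = 8000, B = 250, and 2500 parameter updates in total.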
Total # of parameters in a NN
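For a plain fully connected network, each dense layer contributes n_in × n_out weights plus n_out biases; a minimal sketch (the function name is hypothetical):

```python
def dense_param_count(layer_sizes):
    """Total trainable parameters of a fully connected network, where
    layer_sizes = [input_dim, hidden_1, ..., output_dim].
    Each dense layer has n_in * n_out weights plus n_out biases."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# e.g. a 4 -> 8 -> 3 network: (4*8 + 8) + (8*3 + 3) = 67 parameters
```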
UAP conditions on g
Maps R to R
Measurable
Non-polynomial
Bounded on any finite interval
The closure of the set of all discontinuities of g in R has zero Lebesgue measure
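For context, these conditions come from the Leshno et al. (1993) form of the theorem; a hedged summary of its statement:

```latex
% For g satisfying the conditions above, the set
\mathrm{span}\{\, x \mapsto g(w^\top x + b) : w \in \mathbb{R}^d,\ b \in \mathbb{R} \,\}
% is dense in C(K) for every compact K \subset \mathbb{R}^d
% if and only if g is not (almost everywhere) a polynomial.
```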