Lecture 4 - Training CNNs Flashcards

1
Q

Regression

A

A set of processes for modelling the relationship between an output variable and one or more input features

2
Q

Classification

A

The problem of assigning an input to one of a discrete set of categories (classes)

3
Q

How to measure regression performance

A

With a loss function such as Mean Absolute Error (MAE) or Mean Squared Error (MSE), etc.
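A minimal sketch of both metrics in plain Python (function names are my own, not from the lecture):

```python
def mean_absolute_error(y_true, y_pred):
    # Average of the absolute differences between targets and predictions.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    # Average of the squared differences; penalises large errors more heavily.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# mean_absolute_error([1.0, 2.0], [1.0, 4.0]) -> 1.0
# mean_squared_error([1.0, 2.0], [1.0, 4.0]) -> 2.0
```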

4
Q

Loss Function: value when an input is well classified vs poorly classified

A

Well classified = small value
Poorly classified = large value

5
Q

Negative Log Likelihood Loss

A

L_p = -log(p), where p is the predicted probability of the true class
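A one-line sketch showing the shape of this loss (function name is my own):

```python
import math

def nll_loss(p):
    # Negative log likelihood of the probability p assigned to the true class.
    # Near zero when p is close to 1 (confident and correct);
    # grows without bound as p approaches 0.
    return -math.log(p)

# nll_loss(1.0) -> 0.0
```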

6
Q

Softmax

A

Normalises the network outputs to a probability distribution over the predicted output classes:

softmax(x_i) = exp(x_i) / sum_j exp(x_j)
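A direct translation of that formula into Python; the max-subtraction trick is a standard stability measure, not something from the slides:

```python
import math

def softmax(xs):
    # Subtracting the max before exponentiating avoids overflow
    # and does not change the result.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
```

The outputs are non-negative and sum to 1, so they can be read as class probabilities.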

7
Q

Cross entropy loss

A

-log(softmax(x_c)), i.e. negative log likelihood applied to the softmax output for the true class c
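A sketch combining the two previous cards; computing via the log-sum-exp identity (my choice, for numerical stability) gives the same value as -log of the softmax output:

```python
import math

def cross_entropy_loss(logits, true_class):
    # -log(softmax(logits)[true_class])
    # = log(sum_j exp(x_j)) - x_true, computed stably via log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[true_class]
```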

8
Q

Optimisation

A

The process of finding the weights that minimise the loss function

9
Q

Gradient Descent

A

Follow the gradient direction with a given step size: repeatedly “walk” downhill towards a minimum of the loss.

10
Q

Learning Rate

A

The step size of each gradient descent update.

Too low = very slow progress
Too high = instability; may never converge
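A toy illustration (my own example, not from the slides): gradient descent on f(w) = w^2, whose gradient is 2w. A moderate step size converges quickly; a too-large one makes w oscillate with growing magnitude:

```python
def descend(w, lr, steps):
    # Gradient descent on f(w) = w**2, gradient 2*w:
    # each step multiplies w by (1 - 2*lr).
    for _ in range(steps):
        w = w - lr * 2 * w
    return w

# descend(1.0, 0.4, 10) -> w shrinks towards 0
# descend(1.0, 1.1, 10) -> |w| grows each step (diverges)
```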

11
Q

Stochastic Gradient Descent

A

each w = w - a * (gradient of the loss w.r.t. w)

a here is alpha (learning rate); the gradient is estimated from a randomly sampled example or mini-batch rather than the full dataset

see slides for full equation
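A minimal sketch of one SGD step under these assumptions (the helper names and the toy gradient function in the usage note are mine):

```python
import random

def sgd_step(w, grad_fn, data, lr):
    # One stochastic gradient descent step: estimate the gradient from a
    # single randomly sampled example, then update each weight:
    #   w_j <- w_j - alpha * gradient_j
    x = random.choice(data)
    g = grad_fn(w, x)
    return [wj - lr * gj for wj, gj in zip(w, g)]
```

For example, minimising (w - x)^2 with grad_fn = lambda w, x: [2 * (w[0] - x)] moves w towards the sampled x by a fraction controlled by the learning rate.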

12
Q

SGD Weight Decay

A

Used to prevent the weights from growing too large.

w <- w - a(deriv(w)L(x,w) + yw) + p[last update]

a is alpha (learning rate) and y is gamma; the yw term is the weight-decay regularisation, which pulls each weight towards zero. p[last update] is the momentum term.
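A direct sketch of that update, assuming weight decay adds gamma * w to the gradient (the standard formulation) and that p[last update] means rho times the previous update (function and variable names are mine):

```python
def sgd_weight_decay_step(w, grad, lr, gamma, rho, last_update):
    # update = -alpha * (grad + gamma * w) + rho * last_update
    # gamma * w is the weight-decay term shrinking weights towards zero;
    # rho * last_update is the momentum term.
    update = [-lr * (g + gamma * wj) + rho * u
              for wj, g, u in zip(w, grad, last_update)]
    new_w = [wj + uj for wj, uj in zip(w, update)]
    return new_w, update
```

With zero gradient and zero previous update, the weight still shrinks by alpha * gamma * w per step, which is the regularising effect.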

13
Q

SGD Momentum

A

p * [last update]

where p (rho) is a momentum coefficient less than 1: a fraction of the previous update is added to the current one, so consistent gradient directions are amplified.
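A sketch of how that velocity accumulates over repeated steps (names are mine):

```python
def momentum_update(grad, velocity, lr, rho):
    # v <- rho * v - lr * grad; the weight then moves by v.
    # With rho < 1, v is a decaying sum of past gradients.
    return [rho * v - lr * g for v, g in zip(velocity, grad)]

# Under a constant gradient g, the velocity approaches -lr * g / (1 - rho),
# i.e. momentum amplifies a consistent gradient direction by 1 / (1 - rho).
```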

14
Q

“Gradient Value” equation. Using weights, inputs and derivative of..?

A

sum_i( deriv(w) L(x_i, w) )

i.e. the gradient of the loss L with respect to the weights w, summed over the training examples x_i
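A sketch of that summation (names and the toy gradient function in the test are mine):

```python
def batch_gradient(w, grad_fn, batch):
    # Sum of per-example gradients: sum_i dL(x_i, w)/dw,
    # accumulated component-wise across the batch.
    total = [0.0] * len(w)
    for x in batch:
        g = grad_fn(w, x)
        total = [t + gi for t, gi in zip(total, g)]
    return total
```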
