Handout #7 - Advances in Network Architecture Flashcards

1
Q

Explain what transfer learning is

A

Transfer learning is reusing knowledge learned from a task to boost performance on a related task.

E.g. reusing a network trained to recognise cars to help recognise trucks.

2
Q

What’s the need for transfer learning

A

Training a state-of-the-art CNN from scratch demands a very large amount of images/data.

3
Q

Explain why you’d use off-the-shelf networks such as VGG, AlexNet or GoogLeNet

A

These OTS networks have already been trained on huge datasets covering many different images and classes.

This means we only need to design and train the classification part (the final fully connected DNN layers).

4
Q

If you don’t have enough data, what can you do with the values of the imported weights?

A

Freeze the values of the imported weights.
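A minimal numpy sketch of the idea (toy numbers and a placeholder gradient, not a real training loop): freezing just means the imported weights are never written to during training.

```python
import numpy as np

# Imported, pre-trained feature-extractor weights (frozen).
W_frozen = np.array([[1.0, -1.0],
                     [0.5,  2.0]])
# New classifier head (trainable).
W_clf = np.array([[0.1],
                  [0.2]])
W_frozen_before = W_frozen.copy()

x = np.array([[1.0, 2.0]])
features = np.maximum(x @ W_frozen, 0)  # ReLU features from the frozen layer
logits = features @ W_clf

# Toy gradient step: only W_clf is assigned to; "freezing" = never
# updating W_frozen.
grad_clf = features.T @ (logits - 1.0)  # placeholder error signal
W_clf = W_clf - 0.1 * grad_clf
```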

5
Q

Explain the problem with implementing transfer learning

A

Transfer learning means reusing an already-trained network on a similar set of data.

The new data might not be normalised the same way -> if the network was trained on sunny days, cloudy days will produce different value statistics. The data isn’t in the value range the network expects.

6
Q

What can you do to avoid the problem with transfer learning?

A

You can apply BN or input normalisation, which shifts and scales the data according to the training-set statistics.

7
Q

Explain BN

A

BN is a normalisation layer, which shifts the data into the correct range.

After BN the values are zero-centred with variance 1.

It allows higher learning rates -> less need to worry about dropout or initialisation.
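A minimal sketch of what a BN layer computes at training time (pure numpy; the learnable scale/shift gamma and beta default to the identity here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature over the batch to zero mean / unit variance,
    then apply the learnable scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # badly scaled activations
y = batch_norm(x)
# After BN each feature is zero-centred with variance ~1.
```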

8
Q

Explain what GoogLeNet and ResNet do to avoid vanishing gradient.

A

They avoid a purely sequential network (layer -> layer -> layer) and add parallel paths: GoogLeNet’s Inception modules run layers in parallel, and ResNet’s skip connections give gradients a short path past each block.
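A toy numpy sketch of the ResNet-style skip connection (the parallel identity path):

```python
import numpy as np

def residual_block(x, W):
    """y = x + F(x): the input is added to the block's output, so the
    gradient can flow through the identity path even if F's gradient
    vanishes. Here F is a toy single ReLU layer."""
    return x + np.maximum(x @ W, 0)

x = np.array([[1.0, -2.0]])
W_zero = np.zeros((2, 2))
y = residual_block(x, W_zero)  # F contributes nothing -> output equals input
```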

9
Q

Explain how you can add more data to the model

A

Data augmentation:

Images -> flipping, rotation, zoom

Text -> replace words with synonyms, shuffle sentences, delete words

Audio -> add noise, reverb, compression
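Two of the augmentations above as toy sketches (numpy horizontal flip for images, random word deletion for text):

```python
import random
import numpy as np

def flip_image(img):
    """Horizontal flip: mirror the image along its width axis."""
    return img[:, ::-1]

def delete_random_word(sentence, rng):
    """Text augmentation: drop one word at random."""
    words = sentence.split()
    drop = rng.randrange(len(words))
    return " ".join(w for i, w in enumerate(words) if i != drop)

img = np.arange(6).reshape(2, 3)       # a tiny 2x3 "image"
flipped = flip_image(img)              # columns reversed

rng = random.Random(0)
aug = delete_random_word("the cat sat on the mat", rng)  # 5 words remain
```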

10
Q

What’s the best optimiser covered so far in the course for (i) vast data, (ii) small data?

A

(i) SGD
(ii) ADAM
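A toy numpy sketch contrasting one update step of SGD and Adam (textbook update rules with illustrative values, not a tuned training loop):

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    """Plain SGD: step straight down the gradient."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: adapt the step per-parameter using running moment estimates."""
    m = b1 * m + (1 - b1) * grad          # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.5])
w_sgd = sgd_step(w, grad)                                       # [0.95, -1.95]
w_adam, m, v = adam_step(w, grad, m=np.zeros(2), v=np.zeros(2), t=1)
# On the first step Adam's update is ~lr * sign(grad) -> [0.9, -1.9]
```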

11
Q

Explain what a learning rate scheduler is

A

Periodically raise the learning rate so training temporarily diverges, letting the optimiser hop over hills (escape poor local minima).
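One common scheduler of this kind is cosine annealing with warm restarts; a toy sketch (hypothetical cycle length and learning-rate range):

```python
import math

def cosine_warm_restart(step, cycle_len=10, lr_max=0.1, lr_min=0.001):
    """Decay the learning rate along a cosine within each cycle, then jump
    back up to lr_max at the start of the next cycle (the 'restart')."""
    pos = step % cycle_len  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * pos / cycle_len))

lrs = [cosine_warm_restart(s) for s in range(25)]
# lr is highest right after each restart (steps 0, 10, 20) and lowest
# just before the next one.
```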
