Handout #7 - Advances in Network Architecture Flashcards

1
Q

Explain what transfer learning is

A

Transfer learning is reusing knowledge learned from a task to boost performance on a related task.

E.g. reusing a network trained to recognise cars to help recognise trucks.

2
Q

What’s the need for transfer learning

A

Training a state-of-the-art CNN from scratch demands a very large amount of images/data.

3
Q

Explain why you’d use off-the-shelf networks such as VGG, AlexNet or GoogLeNet

A

These OTS networks have already been trained on huge datasets covering many different images and classes.

This means we only need to design and train the classification part (the final fully connected DNN layers).

4
Q

If you don’t have enough data, what can you do with the values of the imported weights?

A

Freeze the values of the imported weights.
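A minimal numpy sketch of the idea (toy numbers and a placeholder gradient, not a real training loop): freezing just means the imported weights are never written to during training.

```python
import numpy as np

# Imported, pre-trained feature-extractor weights (frozen).
W_frozen = np.array([[1.0, -1.0],
                     [0.5,  2.0]])
# New classifier head (trainable).
W_clf = np.array([[0.1],
                  [0.2]])
W_frozen_before = W_frozen.copy()

x = np.array([[1.0, 2.0]])
features = np.maximum(x @ W_frozen, 0)  # ReLU features from the frozen layer
logits = features @ W_clf

# Toy gradient step: only W_clf is assigned to; "freezing" = never
# updating W_frozen.
grad_clf = features.T @ (logits - 1.0)  # placeholder error signal
W_clf = W_clf - 0.1 * grad_clf
```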

5
Q

Explain the problem with implementing transfer learning

A

Transfer learning means reusing an already-trained network on a similar set of data.

The new data might not be normalised the same way -> if the network was trained on sunny days, cloudy days will produce different value statistics. The data isn’t in the value range the network expects.

6
Q

What can you do to avoid the problem with transfer learning?

A

You can apply BN or input normalisation, which shifts and scales the data according to the training-set statistics.

7
Q

Explain BN

A

BN is a normalisation layer, which shifts the data into the correct range.

After BN the values are zero-centred with variance 1.

It allows higher learning rates -> less need to worry about dropout or initialisation.
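A minimal sketch of what a BN layer computes at training time (pure numpy; the learnable scale/shift gamma and beta default to the identity here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature over the batch to zero mean / unit variance,
    then apply the learnable scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # badly scaled activations
y = batch_norm(x)
# After BN each feature is zero-centred with variance ~1.
```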

8
Q

Explain what GoogLeNet and ResNet do to avoid vanishing gradient.

A

They avoid a purely sequential network (layer -> layer -> layer) and add parallel paths: GoogLeNet’s Inception modules run layers in parallel, and ResNet’s skip connections give gradients a short path past each block.
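A toy numpy sketch of the ResNet-style skip connection (the parallel identity path):

```python
import numpy as np

def residual_block(x, W):
    """y = x + F(x): the input is added to the block's output, so the
    gradient can flow through the identity path even if F's gradient
    vanishes. Here F is a toy single ReLU layer."""
    return x + np.maximum(x @ W, 0)

x = np.array([[1.0, -2.0]])
W_zero = np.zeros((2, 2))
y = residual_block(x, W_zero)  # F contributes nothing -> output equals input
```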

9
Q

Explain how you can add more data to the model

A

Data augmentation:

Images -> flipping, rotation, zoom

Text -> replace words with synonyms, shuffle sentences, delete words

Audio -> add noise, reverb, compression
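Two of the augmentations above as toy sketches (numpy horizontal flip for images, random word deletion for text):

```python
import random
import numpy as np

def flip_image(img):
    """Horizontal flip: mirror the image along its width axis."""
    return img[:, ::-1]

def delete_random_word(sentence, rng):
    """Text augmentation: drop one word at random."""
    words = sentence.split()
    drop = rng.randrange(len(words))
    return " ".join(w for i, w in enumerate(words) if i != drop)

img = np.arange(6).reshape(2, 3)       # a tiny 2x3 "image"
flipped = flip_image(img)              # columns reversed

rng = random.Random(0)
aug = delete_random_word("the cat sat on the mat", rng)  # 5 words remain
```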

10
Q

What’s the best optimiser covered so far in the course for (i) vast data, (ii) small data?

A

(i) SGD
(ii) ADAM
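A toy numpy sketch contrasting one update step of SGD and Adam (textbook update rules with illustrative values, not a tuned training loop):

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    """Plain SGD: step straight down the gradient."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: adapt the step per-parameter using running moment estimates."""
    m = b1 * m + (1 - b1) * grad          # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.5])
w_sgd = sgd_step(w, grad)                                       # [0.95, -1.95]
w_adam, m, v = adam_step(w, grad, m=np.zeros(2), v=np.zeros(2), t=1)
# On the first step Adam's update is ~lr * sign(grad) -> [0.9, -1.9]
```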

11
Q

Explain what a learning rate scheduler is

A

Periodically raise the learning rate so training temporarily diverges, letting the optimiser hop over hills (escape poor local minima).
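One common scheduler of this kind is cosine annealing with warm restarts; a toy sketch (hypothetical cycle length and learning-rate range):

```python
import math

def cosine_warm_restart(step, cycle_len=10, lr_max=0.1, lr_min=0.001):
    """Decay the learning rate along a cosine within each cycle, then jump
    back up to lr_max at the start of the next cycle (the 'restart')."""
    pos = step % cycle_len  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * pos / cycle_len))

lrs = [cosine_warm_restart(s) for s in range(25)]
# lr is highest right after each restart (steps 0, 10, 20) and lowest
# just before the next one.
```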
