Ch5 Image Classification End-of-Chapter Questions Flashcards
Why do we first resize images to a large size on the CPU, and then to a smaller size on the GPU?
We first need all the images to be the same size so that they can be collated into tensors and passed to the GPU, but we don't want to lose much information, so we resize to a large size to keep as much of the original image as possible.
This initial resize leaves a spare margin, so that the later augmentation transforms and the final resize do not create empty zones, which would not teach the model anything.
Example: A transform rotating an image by 45 degrees would fill corner regions of the new bounds with empty space.
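A minimal sketch of this presizing strategy with a fastai DataBlock (assuming images labelled by their parent folder; 460 and 224 are the typical sizes used in the chapter):
~~~
from fastai.vision.all import *

# item_tfms runs per image on the CPU; batch_tfms runs per batch on the GPU
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(460),                                # large resize on the CPU
    batch_tfms=aug_transforms(size=224, min_scale=0.75))  # augment + final resize on the GPU
dls = dblock.dataloaders(path)  # `path` points at the image folder
~~~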
What are two ways in which data is most commonly provided for most deep learning datasets?
- A table of data, e.g. a CSV file
- Items of data in individual files, organized into folders or with filenames that describe those items (fastai provides factory methods for both layouts; see the sketch below)
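A hedged sketch of loading each layout with fastai (the paths and filenames here are hypothetical placeholders):
~~~
from fastai.vision.all import *

path = Path('my_dataset')  # hypothetical dataset root

# Layout 1: a table -- a CSV with one row per item (filename and label columns)
dls_csv = ImageDataLoaders.from_csv(
    path, csv_fname='labels.csv', folder='images',
    valid_pct=0.2, item_tfms=Resize(224))

# Layout 2: individual files -- here labelled by their parent folder's name
dls_folder = ImageDataLoaders.from_folder(
    path, valid_pct=0.2, item_tfms=Resize(224))
~~~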
Give two examples of ways that image transformations can degrade the quality of the data.
We can lose resolution after resizing and we can introduce empty zones by rotating after reducing the image to final size.
What method does fastai provide to view the data in DataLoaders?
show_batch method
dls.show_batch(nrows=2, ncols=3)
What method does fastai provide to help you debug a DataBlock?
summary method
dblk.summary('filepath')
Should you hold off on training a model until you have thoroughly cleaned your data?
No, you should train a model as soon as you can because the incorrect predictions from the model can help you clean the data more efficiently.
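For example, after a quick first training run, fastai's interpretation tools surface the items the model gets most wrong, which are often mislabelled or low-quality data. A sketch, assuming `dls` from a DataBlock like the one above (older fastai versions name the learner factory cnn_learner rather than vision_learner):
~~~
from fastai.vision.all import *

learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)  # a quick, imperfect model is enough to start

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(5, nrows=1)  # highest-loss items are good cleaning candidates
~~~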
What are the two pieces that are combined into cross-entropy loss in PyTorch?
Softmax (applied as log softmax) and negative log likelihood.
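A small sketch verifying that PyTorch's cross-entropy is log softmax followed by negative log likelihood (the activations and targets below are made up):
~~~
import torch
import torch.nn.functional as F

acts  = torch.randn(4, 3)            # raw activations: 4 items, 3 categories
targs = torch.tensor([0, 2, 1, 1])   # integer class labels

manual  = F.nll_loss(F.log_softmax(acts, dim=1), targs)
builtin = F.cross_entropy(acts, targs)
print(torch.isclose(manual, builtin))  # tensor(True)
~~~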
What are two properties of activations that softmax ensures? Why is this important?
It ensures that the activations are all between 0 and 1, and that the activations across all categories sum to 1. The raw activation values have no meaning by themselves; they only express the relative confidence of the input being in one category versus another. What we care about is which activation is higher and by how much, and once they sum to 1 we can read each activation as the probability of the input belonging to that category.
Softmax also uses an exponential, so if one activation is slightly bigger than the others, that difference gets amplified; this is useful when we really want the classifier to pick one category as its prediction.
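A quick demonstration of both properties, using made-up activations for a single item with three categories:
~~~
import torch

acts = torch.tensor([[0.6, 2.1, -1.5]])
sm = torch.softmax(acts, dim=1)
print(sm)             # every value is between 0 and 1
print(sm.sum(dim=1))  # tensor([1.]) -- the values sum to 1
# exp() amplifies differences: 2.1 is only modestly larger than 0.6,
# but its softmax share dominates the other two.
~~~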
When might you want your activations to not have the two properties that softmax ensures?
When you want your model to be able to tell you that it is not sure, e.g. when it comes across an image of a category it was not trained on.
Why can't we use torch.where to create a loss function for datasets where our label can have more than two categories?
The function we used to calculate the loss for a binary target was:
~~~
import torch

def mnist_loss(predictions, targets):
    return torch.where(targets==1, 1-predictions, predictions).mean()
~~~
Here torch.where returns 1-predictions where targets==1 and predictions otherwise. We would need multiple conditions to handle more than two categories, and we can't do that with torch.where, which only takes a single condition as an argument.
What is the value of log(-2)? Why?
It is undefined, because the logarithm is only defined for positive numbers.
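In PyTorch this shows up as NaN rather than an error:
~~~
import torch

print(torch.log(torch.tensor(-2.0)))  # tensor(nan) -- log is undefined for negative inputs
~~~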
What are two good rules of thumb for picking a learning rate from the learning rate finder?
- Find the learning rate where the loss is at its minimum, then use a rate one order of magnitude smaller, i.e. divide it by 10.
- Pick the last point where the loss was still clearly decreasing (this is somewhat subjective); the point where the curve is steepest is a common proxy. A sketch of both approaches follows.
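A sketch of using the learning rate finder in fastai, assuming `dls` from earlier. The suggestion functions and return value of lr_find have changed across fastai versions, so treat the unpacking below as illustrative:
~~~
from fastai.vision.all import *

learn = vision_learner(dls, resnet34, metrics=error_rate)
lrs = learn.lr_find(suggest_funcs=(minimum, steep))  # plots loss vs. learning rate
# `minimum` suggests roughly one-tenth of the rate at the lowest loss;
# `steep` suggests the point where the loss is falling fastest.
learn.fine_tune(2, base_lr=lrs.steep)
~~~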
What two steps does the fine_tune method do?
- Trains the randomly added layers (the new head) for one epoch, with all other layers frozen
- Unfreezes all the layers and trains them for the number of epochs requested (a simplified equivalent is sketched below)
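A simplified sketch of the equivalent manual steps (`learn` is assumed from earlier; the real fine_tune also adjusts the learning rates between the two phases):
~~~
learn.freeze()              # only the randomly added head is trainable
learn.fit_one_cycle(1)      # step 1: train the new layers for one epoch
learn.unfreeze()            # step 2: make every layer trainable again
learn.fit_one_cycle(4, lr_max=slice(1e-6, 1e-4))  # train the whole model
~~~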
What are discriminative learning rates?
Learning rates that are different depending on the depth of the layer. Use a lower learning rate for the early layers of the neural network and a higher learning rate for the later layers, particularly the randomly added layers.
How is a Python slice object interpreted when passed as a learning rate to fastai?
The first value will be the learning rate for the earliest layers of the neural network, and the second value will be the learning rate for the final layer. The layers in between will have learning rates that are multiplicatively equidistant throughout that range.
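For example, with the unfrozen `learn` from the previous sketch (the values are illustrative):
~~~
# earliest layer groups get 1e-6, the final group gets 1e-4, and the groups in
# between get multiplicatively spaced rates across that range
learn.fit_one_cycle(12, lr_max=slice(1e-6, 1e-4))
~~~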