Week 5 Flashcards

1
Q

Why don’t we use regular NNs for images?

A

Doesn’t scale:
100x100 pixels = 10,000 parameters per node in the first layer (see the sketch below)
Not robust to small changes in input (e.g. a one-pixel shift changes every input value)
Doesn’t take advantage of correlations between nearby pixels
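
To make the scaling point concrete, a back-of-the-envelope sketch (not from the notes), assuming a 100x100 greyscale image:

```python
# One fully connected neuron needs a weight per pixel of a 100x100
# image; one 3x3 conv filter is shared across the whole image.
dense_weights_per_neuron = 100 * 100   # 10,000
conv_weights_per_filter = 3 * 3        # 9
print(dense_weights_per_neuron, conv_weights_per_filter)
```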

2
Q

What are filters in a CNN?

A

A small matrix of weights, the same size as a patch of the image, that is slid across the image; the weights are learned by the NN via backpropagation

3
Q

How do we apply filters in a CNN?

A

Take the dot product between the filter and the patch of the image it currently covers, add a bias term, and store the result in a matrix called a feature map. Slide the filter across by one pixel (the stride) and repeat, as in the sketch below.
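
A minimal sketch of this sliding dot product, assuming a NumPy-array image, a square filter, and stride 1 (apply_filter is a hypothetical helper, not from the notes):

```python
import numpy as np

def apply_filter(image, kernel, bias=0.0, stride=1):
    # At each position, dot the filter with the patch it covers,
    # add the bias, and store the result in the feature map.
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            feature_map[i, j] = np.sum(patch * kernel) + bias
    return feature_map
```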

4
Q

What is a feature map?

A

A map of where the feature encoded by your filter appears in the image
Values > 0: the feature appears here
Values < 0: the feature does not appear here

5
Q

Limitations of deep learning

A

DL is data-hungry: it needs a LOT of data to work
DL is heavy: you often need GPUs and cloud computing to train it, and even to use it
DL is bad at representing uncertainty: it’s easy to trick a neural network into thinking it’s right
Hard to optimise: architecture, learning method, …
Hard to interpret: neural networks are black boxes

6
Q

What do we do to the feature map once it’s made?

A

Pass the values through a ReLU function:
relu(x) = max(0, x)
(i.e. the output equals x if x >= 0, and 0 otherwise)
This keeps all the filter matches and zeroes out everything else; a sketch follows below.
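
A minimal sketch, assuming NumPy and the feature map from the earlier convolution sketch:

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): keeps positive values (the filter
    # matches), zeroes out the negatives.
    return np.maximum(0, x)

# e.g. activated = relu(feature_map)
```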

7
Q

Why do we need to downsample?

A

With a 3x3 filter and stride 1, our feature map is (N-2)x(N-2) pixels, where N is the width of the image: barely smaller than the input. With many filters per layer, this does not scale (see the formula below).
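
The general output size is a standard formula (not stated in the notes): for input width N, filter width k, padding p and stride s, the output width is (N - k + 2p)/s + 1, rounded down. A quick sketch:

```python
def conv_output_size(n, k, padding=0, stride=1):
    # Standard formula for the width of a conv/pooling output.
    return (n - k + 2 * padding) // stride + 1

print(conv_output_size(100, 3))  # 98, i.e. N - 2 for a 3x3 filter
```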

8
Q

How do we downsample?

A

Pooling

9
Q

What is Pooling?

A

Aggregating,
Summarising,
Downsampling the feature map

Max-pooling (take the maximum of each window)
Average-pooling (take the mean of each window), as sketched below
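
A minimal sketch of 2x2 max-pooling with stride 2, assuming a NumPy feature map:

```python
import numpy as np

def max_pool(feature_map, size=2):
    # Crop so the sides divide evenly, split into size x size
    # windows, and keep the maximum of each window.
    h, w = feature_map.shape
    x = feature_map[:h - h % size, :w - w % size]
    x = x.reshape(x.shape[0] // size, size, x.shape[1] // size, size)
    return x.max(axis=(1, 3))
```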

10
Q

What is stride?

A

The step size when sliding the filter (or pooling window) across the image

11
Q

What’s a problem with ReLU?

A

It discards all negative values, so any information they carry is lost

12
Q

What other activation functions do we have?

A

ReLU
tanh
Sigmoid
Leaky ReLU
Maxout
ELU
Softmax

Different activation functions solve different problems; for example, Leaky ReLU (sketched below) addresses ReLU’s discarding of negative values
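
A minimal sketch of Leaky ReLU, assuming NumPy and the commonly used slope of 0.01 for negative inputs:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but negative values are scaled by alpha instead of
    # being zeroed, so their information (and gradient) survives.
    return np.where(x >= 0, x, alpha * x)
```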

13
Q

What’s a problem with tanh?

A

Its derivative goes to 0 (which is bad for backpropagation): d/dx tanh(x) = 1 - tanh(x)^2, so the gradient vanishes once tanh saturates at ±1 for large |x|

14
Q

What is softmax?

A

Uses exponentials to normalise the output layer of a NN into probabilities: softmax(x)_i = exp(x_i) / Σ_j exp(x_j), so the outputs are positive and sum to 1 (see the sketch below).
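
A minimal sketch of a numerically stable softmax, assuming NumPy (subtracting max(x) before exponentiating avoids overflow without changing the result):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))   # shift for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # positive, sums to 1
```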

15
Q

Issues with large learning rate

A

The updates overshoot the bottom of the error curve, so training oscillates or diverges

16
Q

Issues with small learning rate

A

Takes too long to learn and can get stuck in a suboptimal solution (e.g. a local minimum)

17
Q

Examples of hyperparameters for a CNN

A

Filter size and number of filters
Padding and stride value
Learning rate and dropout rate
Number of epochs and batch size
Activation function
Number of hidden layers
Number of neurons in each layer