Lecture 4 Flashcards
Name and define 3 properties of convolution.
- Sparse connectivity: the convolution kernel is much smaller than the input, so each output unit connects to only a few input units (fewer connections)
- Parameter sharing: the kernel coefficients are identical for each input location
- Equivariant representations: the convolution output covaries with the input (if you shift your input image, the output is shifted by the same amount)
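A minimal sketch of the equivariance property, using a 1D circular cross-correlation (circular shifts are an assumption here, chosen to keep boundary effects out of the picture):

```python
import numpy as np

# circular cross-correlation: out[i] = sum_j x[(i + j) % n] * k[j]
def circ_corr(x, k):
    n = len(x)
    return np.array([sum(x[(i + j) % n] * k[j] for j in range(len(k)))
                     for i in range(n)])

x = np.array([1., 2., 3., 4., 5., 0.])
k = np.array([1., 0., -1.])

# shifting the input first, or shifting the output afterwards,
# gives the same result: conv(shift(x)) == shift(conv(x))
a = circ_corr(np.roll(x, 2), k)
b = np.roll(circ_corr(x, k), 2)
assert np.allclose(a, b)
```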
If you have a 2d convolution with a 32x32 input and a 3x3 filter, how many parameters do you have to learn? And what will be the size of the feature map?
- 9 (3x3), plus 1 if you count the bias
- 30 x 30 (32 - 3 + 1 = 30, with stride 1 and no padding)
If you apply 6 filters in convolution how many feature maps do you get?
6
What is the filter size of a 2d convolution for an input with N channels?
(3, 3, N) for a 3x3 kernel — the filter always spans all N input channels
What is the result of applying 2d max pooling with a 2x2 filter and stride 2 to this input?
input =
1, 1, 2, 4
5, 6, 7, 8
3, 2, 1, 0
1, 2, 3, 4
output =
6, 8
3, 4
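The pooling card above can be checked with a small NumPy sketch (a straightforward loop, not an optimized implementation):

```python
import numpy as np

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2: take the max of each 2x2 block
    h, w = x.shape
    out = np.zeros((h // 2, w // 2), dtype=x.dtype)
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            out[i // 2, j // 2] = x[i:i+2, j:j+2].max()
    return out

print(max_pool_2x2(x))
# [[6 8]
#  [3 4]]
```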
If you have a convolution over a 7 by 7 input
Filter size: 3x3
stride: 1
What is the output size?
5x5
If you have a convolution over a 7 by 7 input
Filter size: 3x3
stride: 2
What is the output size?
3x3
What can be an advantage of maxpooling?
more robustness (to little shifts in the input) / better generalisation
What can be an advantage of increasing strides?
efficiency / space reduction
If you have a convolution over a 8 by 8 input
Filter size: 3x3
stride: 3
padding: 2
What is the output size?
4x4
How do you calculate the width and height of the output size
width_out = ((width_in - filter_width + 2 x padding) / stride) + 1
height_out = ((height_in - filter_height + 2 x padding) / stride) + 1
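The output-size formula, checked against the cards in this deck:

```python
def conv_out_size(in_size, filter_size, stride=1, padding=0):
    # width_out = ((width_in - filter_width + 2*padding) / stride) + 1
    return (in_size - filter_size + 2 * padding) // stride + 1

assert conv_out_size(7, 3, stride=1) == 5               # 7x7 input, stride 1
assert conv_out_size(7, 3, stride=2) == 3               # 7x7 input, stride 2
assert conv_out_size(8, 3, stride=3, padding=2) == 4    # 8x8, stride 3, pad 2
assert conv_out_size(5, 3, stride=2, padding=1) == 3    # 5x5, stride 2, pad 1
```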
If you have a convolution over a 5 by 5 input
Filter size: 3x3
stride: 2
padding: 1
What is the output size?
3x3
If you have a convolution over a 64 by 64 by 3 input
Filter size: 4x4x3
filters: 32
stride:2
What is the number of feature maps?
What is the output width and height?
What is the number of parameters of the convolutional layer?
output width: 31
output height: 31
number of feature maps: 32
number of parameters: 1568 (4x4x3 x 32 weights + 32 biases, one bias per filter)
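The parameter count and output size for this card, spelled out:

```python
def conv_layer_params(fh, fw, in_ch, n_filters):
    # one weight per kernel coefficient per filter, plus one bias per filter
    return fh * fw * in_ch * n_filters + n_filters

# 4x4x3 filters, 32 of them: 4*4*3*32 = 1536 weights + 32 biases = 1568
assert conv_layer_params(4, 4, 3, 32) == 1568

# 64x64 input, 4x4 filter, stride 2, no padding
assert (64 - 4) // 2 + 1 == 31
```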
Define: transposed convolution
A fixed upsampling transformation is not always useful, so a more robust way to upsample is to learn some filters that map a feature map to a larger one.
Apply transposed convolution:
input =
0, 1
2, 3
kernel =
0, 1
2, 3
output =
0, 0, 1
0, 4, 6
4, 12, 9
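A minimal sketch of the transposed convolution above: each input value scales a copy of the kernel, stamped at that input's position, and overlapping stamps are summed (stride 1 is assumed here):

```python
import numpy as np

def transposed_conv2d(x, k):
    # each input value x[i, j] adds x[i, j] * kernel at offset (i, j)
    (h, w), (kh, kw) = x.shape, k.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    for i in range(h):
        for j in range(w):
            out[i:i+kh, j:j+kw] += x[i, j] * k
    return out

x = np.array([[0., 1.], [2., 3.]])
k = np.array([[0., 1.], [2., 3.]])
print(transposed_conv2d(x, k))
# [[ 0.  0.  1.]
#  [ 0.  4.  6.]
#  [ 4. 12.  9.]]
```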
Name an application of transposed convolution
automatic colorization (encoder-decoder)
what is the size of a filter with 1 by 1 convolution?
1x1
why would you apply 1 by 1 convolution?
The deeper you get into the network, the more feature maps you have. If the network gets too big, you can apply 1x1 convolutions to reduce the number of feature maps, giving a compressed representation.
How does 1x1 convolution work?
input feature map with shape (W, H, N)
m filters of size (1, 1, N), with m < N
output feature map with shape (W, H, m)
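A sketch of this channel reduction in NumPy (the sizes are made-up examples): a 1x1 convolution is just a per-pixel linear map over the channel dimension.

```python
import numpy as np

W, H, N, m = 8, 8, 64, 16            # spatial size, input channels, output channels
x = np.random.rand(W, H, N)          # input feature maps
filters = np.random.rand(m, 1, 1, N) # m filters of size (1, 1, N)

# at every (w, h) position, mix the N input channels into m output channels
out = np.einsum('whn,mn->whm', x, filters[:, 0, 0, :])
assert out.shape == (W, H, m)        # channel dimension reduced from 64 to 16
```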
What is the goal of inception architectures?
Increasing the depth and width of the network while keeping the computational budget constant
How does the naïve version of the inception module work?
The information from the previous layer goes, in parallel, through:
- 1x1 convolutions
- 3x3 convolutions
- 5x5 convolutions
- 3x3 max pooling
and the resulting feature maps are then concatenated.
What was the problem with the naïve version of the inception module
It did not keep the computational budget constant.
What were the improvements made to the naïve version of the inception module?
1x1 convolutions were applied before the 3x3 and 5x5 convolutions, and after the 3x3 max pooling, to reduce the number of channels first.
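Why the 1x1 bottleneck keeps the budget in check can be seen by counting multiplications (the sizes below are assumed, GoogLeNet-like examples, not from this deck):

```python
def conv_mults(w, h, k, in_ch, out_ch):
    # multiplications for a kxk convolution, stride 1, same-size output
    return w * h * k * k * in_ch * out_ch

W, H, N, M, r = 28, 28, 192, 32, 16   # feature map size, channels in/out, reduced channels

# 5x5 convolution applied directly on all N channels
direct = conv_mults(W, H, 5, N, M)

# 1x1 reduction to r channels first, then the 5x5 convolution
bottleneck = conv_mults(W, H, 1, N, r) + conv_mults(W, H, 5, r, M)

assert bottleneck < direct            # far fewer multiplications
```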
What happens when you add more layers (get a deeper network)?
The intuition is that more layers mean more parameters, a more powerful network and better performance, but that is not always the case: a 20-layer network can outperform a 56-layer network. Each layer passes its output to the next; if one layer extracts information that is not useful, the next layer has to learn from that non-useful information. The later layers receive input that has been processed many times and has probably lost information that was in the original input.
What is the idea of Res-Nets?
Give the network the option to just copy the input: the block outputs F(x) + x, so if the learned function F(x) (the information passed through from the previous layer) is not informative, the network can fall back on the identity (skip connection).
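The residual idea in one line (a toy sketch, with F standing in for any learned layer):

```python
import numpy as np

def residual_block(x, F):
    # output is F(x) + x: if F learns nothing useful (F(x) ~ 0),
    # the block just passes the input through unchanged
    return F(x) + x

x = np.array([1., 2., 3.])
zero_F = lambda x: np.zeros_like(x)   # an "uninformative" layer
assert np.allclose(residual_block(x, zero_F), x)
```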
What are attention mechanisms?
Attention mechanisms highlight the most informative features (in an image). This process runs parallel to the feature extraction and creates a mask that suppresses the features that are not important.
What are the main methods for object detection?
- Region proposals: R-CNN (Fast R-CNN and Faster R-CNN)
- You Only Look Once - YOLO
- Single Shot MultiBox Detector - SSD
- RetinaNet