Computer Vision Flashcards

Question 1

Q

What is a feature map?

Answer

A

The output of a convolution: each output pixel is the sum of the elementwise products of the kernel and the input image at one position of the kernel

Question 2

Q

What is a kernel?

Answer

A

A (small) tensor whose values are multiplied with the values of the input tensor at the kernel's current position

Question 3

Q

Concept of Mapping a Convolutional Kernel

Answer

A

Applying the kernel across the image means sliding it over the coordinate grid of pixels. By default, no part of the kernel may overstep the boundaries of the grid
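
As a minimal sketch of the rule above, the following lists every position at which a kernel fits entirely inside the grid. The helper name `valid_positions` and the 4x4 grid with a 3x3 kernel are illustrative assumptions, not part of the original card.

```python
def valid_positions(height, width, k_h, k_w):
    """Top-left coordinates at which a k_h x k_w kernel fits fully inside the grid."""
    return [
        (row, col)
        for row in range(height - k_h + 1)
        for col in range(width - k_w + 1)
    ]

# Hypothetical example: a 3x3 kernel on a 4x4 grid of pixels.
positions = valid_positions(4, 4, 3, 3)
print(positions)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Only four positions are valid here, so the kernel produces a 2x2 output from the 4x4 grid.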

Question 4

Q

What effect does applying the convolutional kernel have on the output size?

Answer

A

Without padding, the resulting feature map will be smaller than the input (for any kernel larger than 1x1)

Question 5

Q

What does padding do?

Answer

A

Adds virtual rows and columns filled with zeros along the height and width dimensions of the input
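
A small sketch of zero padding using NumPy's `np.pad`. The 3x3 input and the padding width of 1 are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 3x3 single-channel image.
image = np.arange(1, 10).reshape(3, 3)

# Add one row/column of zeros on every side of the height and width dimensions.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)  # (5, 5)
print(padded)
```

The original values sit untouched in the center of the padded array; only the border is new.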

Question 6

Q

Why apply padding?

Answer

A

To keep the output feature map the same size as the input tensor
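
To illustrate, here is a sketch of how much padding keeps the size unchanged. It assumes a stride of 1 and an odd kernel size; the function names are hypothetical.

```python
def same_padding(kernel_size):
    """Padding per side that preserves the input size (assumes stride 1, odd kernel)."""
    if kernel_size % 2 == 0:
        raise ValueError("this sketch only handles odd kernel sizes")
    return (kernel_size - 1) // 2

def output_size(input_size, kernel_size, padding):
    """Output length along one dimension for a stride-1 convolution."""
    return input_size + 2 * padding - kernel_size + 1

print(same_padding(3))                     # 1
print(output_size(28, 3, same_padding(3)))  # 28
print(output_size(28, 3, 0))                # 26
```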

Question 7

Q

How many channels are used for color images?

Answer

A

Three: red, green, and blue (RGB)

Question 8

Q

How do kernels handle multichannel inputs?

Answer

A

For an input with multiple channels, each kernel processing this input has a channel dimension with the same number of channels as the input
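
A minimal NumPy sketch of this idea, assuming a channels-first layout `(channels, height, width)`, no padding, and a stride of 1. The function name `conv2d_multichannel` is hypothetical.

```python
import numpy as np

def conv2d_multichannel(image, kernel):
    """Apply one kernel of shape (C, kh, kw) to an image of shape (C, H, W)."""
    c, h, w = image.shape
    kc, kh, kw = kernel.shape
    if c != kc:
        raise ValueError("kernel and input must have the same number of channels")
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Multiply across all channels at once, then sum to a single value.
            out[i, j] = np.sum(image[:, i:i + kh, j:j + kw] * kernel)
    return out

rgb = np.ones((3, 5, 5))      # hypothetical 3-channel input
kernel = np.ones((3, 3, 3))   # kernel with a matching channel dimension
fmap = conv2d_multichannel(rgb, kernel)
print(fmap.shape)  # (3, 3)
```

Note that one kernel spanning all input channels yields a single-channel feature map, since the products are summed over the channel dimension as well.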

Question 9

Q

Rules for interpreting plots of model statistics

Answer

A

The mean and standard deviation should develop smoothly. Activations near zero are problematic since they cannot be processed further in a meaningful way
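
One way such statistics could be computed for a layer's activations is sketched below. The function name and the `eps` threshold for "near zero" are assumptions for illustration, not values from the original card.

```python
import numpy as np

def activation_stats(activations, eps=1e-3):
    """Mean, standard deviation, and fraction of near-zero activations."""
    acts = np.asarray(activations, dtype=float)
    return {
        "mean": float(acts.mean()),
        "std": float(acts.std()),
        # eps is a hypothetical cutoff for what counts as "near zero".
        "near_zero_fraction": float(np.mean(np.abs(acts) < eps)),
    }

stats = activation_stats([0.0, 0.0005, 0.5, -1.0])
print(stats["near_zero_fraction"])  # 0.5
```

Recording these values at every training step and plotting them over time gives the kind of curves the rule above refers to.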

Question 10

Q

The gradient estimate is more accurate with…

Answer

A

A larger batch size
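
The effect can be illustrated with a toy simulation. It assumes per-sample gradients are a true gradient of 1.0 plus unit Gaussian noise, which is a simplification chosen for illustration rather than a model of any real network.

```python
import numpy as np

rng = np.random.default_rng(0)
true_gradient = 1.0  # hypothetical "exact" gradient

def batch_estimate_spread(batch_size, trials=2000):
    """Standard deviation of the batch-averaged gradient over many random batches."""
    samples = true_gradient + rng.normal(0.0, 1.0, size=(trials, batch_size))
    return samples.mean(axis=1).std()

spread_small = batch_estimate_spread(4)
spread_large = batch_estimate_spread(256)

# Theory predicts a spread of about 1/sqrt(batch_size): roughly 0.5 versus 0.0625.
print(spread_small, spread_large)
```

The batch average scatters far less around the true gradient when the batch is larger.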

Question 11

Q

Which activations are problematic since they cannot be processed further in a meaningful way?

Answer

A

Activations near zero

Question 12

Q

Which requirements does 1cycle training incorporate?

Answer

A

Start with a low learning rate
Reach a high learning rate in the middle
End with a low learning rate

Question 13

Q

What are the phases of the learning rate schedule?

Answer

A

– Warmup phase: the learning rate is increased gradually
– Annealing phase: the learning rate is decreased again
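
The two phases can be sketched as a function of the training step. This is one possible shape (linear warmup followed by cosine annealing); the function name, the learning rate bounds, and the 30% warmup fraction are all assumptions, not a definitive 1cycle implementation.

```python
import math

def one_cycle_lr(step, total_steps, lr_low=1e-4, lr_high=1e-2, warmup_fraction=0.3):
    """Learning rate at a given step: linear warmup, then cosine annealing."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: increase gradually from lr_low towards lr_high.
        progress = step / warmup_steps
        return lr_low + progress * (lr_high - lr_low)
    # Annealing phase: decrease from lr_high back down to lr_low.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return lr_low + 0.5 * (lr_high - lr_low) * (1 + math.cos(math.pi * progress))

for step in (0, 15, 30, 65, 100):
    print(step, one_cycle_lr(step, total_steps=100))
```

With these assumed defaults the rate starts low at step 0, peaks at step 30, and returns to the low value at step 100.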

Question 14

Q

What addresses an initial distribution with many near-zero activations and the shift of the distribution during training?

Answer

A

Batch normalization
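
A minimal sketch of the training-time forward pass of batch normalization for a 2D input of shape `(batch, features)`. The scalar `gamma` and `beta` defaults and the `eps` value are assumptions; real layers learn per-feature scale and shift and also track running statistics, which this sketch omits.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Hypothetical batch: one feature clustered near zero, one on a much larger scale.
x = np.array([[0.0, 10.0],
              [0.1, 20.0],
              [0.2, 30.0]])
out = batch_norm(x)
print(out.mean(axis=0))  # approximately 0 for each feature
print(out.std(axis=0))   # approximately 1 for each feature
```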

Question 15

Q

Why use batch normalization?

Answer

A

Speeds up training (allows a high learning rate), tolerates sub-optimal starts (fewer iterations), adds randomness

Question 16

Q

What is convolution?

Answer

A

Sliding the kernel across the input tensor (image) and multiplying overlapping values, then summing the products to get a single output value for that location
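
The description above can be sketched as a naive NumPy loop, assuming a single-channel input, no padding, and a stride of 1. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation. The function name `conv2d` is hypothetical.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; multiply overlapping values and sum them."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # one output value per location
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
feature_map = conv2d(image, kernel)
print(feature_map)
# [[10. 14. 18.]
#  [26. 30. 34.]
#  [42. 46. 50.]]
```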

Question 17

Q

What does a higher stride lead to?

Answer

A

Smaller output feature map, reduced computational cost

Question 18

Q

What does a lower stride lead to?

Answer

A

More spatial detail for capturing fine features

Question 19

Q

What is stride?

Answer

A

Step size with which the kernel moves across the input during the convolution operation
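
The effect of the stride on the output size can be sketched with the standard formula for one spatial dimension, floor((n + 2p - k) / s) + 1. The function name and the example sizes are assumptions for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Hypothetical 7-pixel-wide input with a 3-wide kernel and no padding.
print(conv_output_size(7, 3, stride=1))  # 5
print(conv_output_size(7, 3, stride=2))  # 3
print(conv_output_size(7, 3, stride=3))  # 2
```

A larger step means fewer kernel positions, hence a smaller feature map and less computation.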

Question 20

Q

Where does a convolution typically start?

Answer

A

The top-left corner of the input