CNN Flashcards
CV problems
classification, object detection, neural style transfer
how to detect edges? (vertical, horizontal, diagonal)
use a filter that contains 1s, 0s, and -1s, like
1 0 -1
1 0 -1
1 0 -1
to detect vertical edges
the output values at the vertical edge will be very large/small (lighter/darker color) –> different from the other pixels
similar for horizontal edges (use the transposed filter)
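The vertical-edge card above can be checked with a minimal numpy sketch (the names conv2d, image, vertical are mine, not from the card):

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" convolution as used in deep learning (no filter flip)
    n, f = image.shape[0], kernel.shape[0]
    out = np.zeros((n - f + 1, n - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

# 6x6 image: bright left half (10s), dark right half (0s) -> one vertical edge
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)

vertical = np.array([[1, 0, -1],
                     [1, 0, -1],
                     [1, 0, -1]], dtype=float)

edges = conv2d(image, vertical)
# the middle columns of the 4x4 output light up where the edge is
```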
what are the Sobel and Scharr filters in edge detection?
Sobel:
1 0 -1
2 0 -2
1 0 -1
puts more weight on the center row of pixels –> more robust
Scharr:
3 0 -3
10 0 -10
3 0 -3
what is padding
To prevent losing information at the edges of the image (convolution shrinks the matrix), add p rows/columns of 0s around the original matrix to pad
p can be 0 1 2…
what is the resulting dimension when applying an fxf filter to an nxn matrix?
(n-f+1) x (n-f+1)
types of padding?
valid: no padding
same: output size same as input size
p = (f-1)/2 when s = 1, calculated from the filter size (f is usually odd so p is an integer)
what is strided conv?
moving the filter s steps at a time instead of 1
s can be 1 2 3…
formula to cal output dimension
floor((n+2p-f)/s) + 1 (round down)
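The formula can be wrapped in a small helper (a sketch; conv_output_size is my name, not from the course). Integer division gives the round-down:

```python
def conv_output_size(n, f, p=0, s=1):
    # floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1

# n=6, f=3, no padding, stride 1 -> n - f + 1 = 4
# "same" padding with s=1: p = (f - 1) // 2 keeps the output size at n
```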
what is the difference between convolution (math definition) and cross-correlation?
true convolution first flips the filter 180 degrees (flip it horizontally and vertically), then does the elementwise multiply-and-sum
deep learning "convolution" skips the flip, so it is technically cross-correlation
what is the condition on the input and filter when they have more than 1 channel?
the number of channels of input and filter must be the same
the number of channels of the output will be number of filters used
what is a pooling layer? what are the types of pooling?
at a pooling layer, instead of a multiply-and-sum, take the MAX or AVG over each fxf window –> max pooling, average pooling (MIN pooling exists but is rarely used)
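A minimal max-pooling sketch in numpy (max_pool is my name; f and s are the pooling hyperparameters):

```python
import numpy as np

def max_pool(x, f=2, s=2):
    # take the MAX over each f x f window, moving s steps at a time
    out_n = (x.shape[0] - f) // s + 1
    out = np.zeros((out_n, out_n))
    for i in range(out_n):
        for j in range(out_n):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1, 3, 2, 1],
              [4, 6, 5, 0],
              [7, 2, 9, 8],
              [1, 0, 3, 4]], dtype=float)
pooled = max_pool(x)  # [[6, 5], [7, 9]]
```

Replacing .max() with .mean() gives average pooling.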
diff between conv layer and pool layer?
conv layer has learned params (the filter weights) but pool layer doesn't (only hyperparameters f and s)
in a NN, conv1+pool1 –> layer 1, conv2+pool2 –> layer 2… (a conv layer and the pooling after it usually count as one layer, since pooling has no weights)
what is parameter sharing?
a feature detector (eg a vertical edge detector) useful in one part of the image can be applied across the whole image –> the same filter weights are shared by every position
why doing convolution?
parameter sharing
sparsity of connections
what is sparsity of connections in convolution?
in each layer, each output value depends on small number of inputs
what are some classic neural networks?
LeNet-5: conv1 avgpool1 conv2 avgpool2 fc1 fc2 softmax –> very simple, common type of arrangement
AlexNet: much bigger, uses maxpool instead of avgpool, same style of arrangement
VGG-16: uses "same" padding to preserve output dimensions, very uniform architecture (3x3 convs, 2x2 maxpools)
what is residual block?
shortcut from an earlier activation to a later layer's output (before the ReLU, sum the earlier activation and the current z)
advantage of resnet
allows training much deeper NNs without hurting generalization
why resnet work well without hurting performance?
the identity function is easy to learn –> the block can fall back to the earlier result
a[l+2] = g(W a[l+1] + b + a[l])
if W and b shrink toward 0 –> the past a[l] dominates –> a[l+2] ≈ g(a[l]) = a[l] (with ReLU, since a[l] ≥ 0)
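The fall-back-to-identity argument can be sketched with a fully connected residual block (residual_block and relu are my names; the idea carries over to conv layers):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual_block(a_prev, W1, b1, W2, b2):
    # a[l+2] = g(W2 a[l+1] + b2 + a[l]) -- skip connection added before the final ReLU
    a1 = relu(W1 @ a_prev + b1)
    z2 = W2 @ a1 + b2
    return relu(z2 + a_prev)

# if the weights collapse to 0, the block reduces to the identity
a = np.array([1.0, 2.0, 3.0])
Z = np.zeros((3, 3))
out = residual_block(a, Z, np.zeros(3), Z, np.zeros(3))  # == a
```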
feature of resnet?
have skip connections
what does a 1x1 convolution do?
shrinks or increases the number of channels while keeping height and width
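The channel-shrinking effect is easy to see from shapes alone (a sketch; the sizes 192 -> 32 are just an example I chose):

```python
import numpy as np

h, w, c_in, c_out = 4, 4, 192, 32
x = np.random.rand(h, w, c_in)   # input volume
W = np.random.rand(c_in, c_out)  # 32 filters, each of shape 1 x 1 x 192

# a 1x1 convolution is a per-pixel linear map across channels
y = x @ W                        # shape (4, 4, 32): height/width kept, channels shrunk
```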
motivation for inception network?
reduce computational cost
what is inception block?
apply different filters (eg 1x1, 3x3, 5x5, maxpool) to one input and concatenate their outputs along the channel dimension, using 1x1 "bottleneck" layers to cut computation –> the result is the input to the next layer
what is advantage of mobilenet?
runs with low computational resources (uses depthwise-separable convolutions) –> good for mobile and embedded devices