CNN Flashcards
CV problems
image classification, object detection, neural style transfer
how to detect edges? (vertical, horizontal, diagonal)
use a filter that contains 1s, 0s and -1s like
1 0 -1
1 0 -1
1 0 -1
to detect vertical edges
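the output values at the vertical edge will be very large or very small (shown as lighter/darker color) –> different from the rest of the output, which stays near 0
similar for horizontal edges: use the transposed filter (rows 1 1 1, 0 0 0, -1 -1 -1)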
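a minimal sketch of the vertical edge card, assuming a toy 6x6 image (bright left half, dark right half) that is not in the notes. The helper name conv2d_valid is made up for illustration; it slides the filter with no padding and stride 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the elementwise products."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# Assumed toy image: columns 0-2 are bright (10), columns 3-5 are dark (0).
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)

vertical = np.array([[1, 0, -1]] * 3, dtype=float)
horizontal = vertical.T  # rows 1 1 1 / 0 0 0 / -1 -1 -1

v_edges = conv2d_valid(image, vertical)
h_edges = conv2d_valid(image, horizontal)

print(v_edges[0])  # [ 0. 30. 30.  0.] -> large values where the vertical edge is
print(h_edges[0])  # [0. 0. 0. 0.]     -> no horizontal edge in this image
```

the large values in the middle columns mark the edge; the horizontal filter gives 0 everywhere because this image has no horizontal edge.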
what is the sobel or scharr filter in edge detection?
1 0 -1
2 0 -2
1 0 -1
puts more weight on the center row (center pixel) –> a bit more robust
scharr
3 0 -3
10 0 -10
3 0 -3
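a small sketch of what "more weight on the center" means, assuming two made-up 3x3 patches: one where the edge only shows in the center row and one where it only shows in the top row. The function name respond is hypothetical.

```python
import numpy as np

simple = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]])
sobel = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])
scharr = np.array([[3, 0, -3], [10, 0, -10], [3, 0, -3]])

# Assumed patches: a dark pixel (0) on the right side, in the center row or the top row.
center_patch = np.array([[10, 10, 10], [10, 10, 0], [10, 10, 10]])
top_patch = np.array([[10, 10, 0], [10, 10, 10], [10, 10, 10]])

def respond(kernel, patch):
    """Filter response for one window position: sum of elementwise products."""
    return int(np.sum(kernel * patch))

for name, k in [("simple", simple), ("sobel", sobel), ("scharr", scharr)]:
    print(name, respond(k, center_patch), respond(k, top_patch))
# simple 10 10
# sobel 20 10
# scharr 100 30
```

the simple filter cannot tell the two patches apart, while sobel and scharr respond more strongly when the change is in the center row.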
what is padding?
to avoid losing information at the edges of the image and to stop the output from shrinking (convolution shrinks the matrix), add p layers of 0s around the original matrix
p can be 0, 1, 2, …
what is the output dimension when applying an f x f filter to an n x n matrix?
(n-f+1) x (n-f+1) (no padding, stride 1)
types of padding?
valid: no padding (p = 0)
same: output size same as input size
for stride 1 this needs p = (f-1)/2, which is why f is usually odd (p must be a whole number)
what is strided conv?
move the filter s positions at a time instead of 1 (s is the stride)
s can be 1, 2, 3, …
formula to calculate the output dimension?
floor((n+2p-f)/s) + 1 (round the division down)
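the formula can be checked with a few numbers. A minimal sketch; the function names conv_output_size and same_padding are made up for illustration, and same_padding assumes stride 1 and an odd f.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output height/width for an n x n input, f x f filter, padding p, stride s."""
    return (n + 2 * p - f) // s + 1  # integer division rounds down

def same_padding(f):
    """Padding that keeps output size equal to input size (assumes stride 1, odd f)."""
    return (f - 1) // 2

print(conv_output_size(6, 3))                     # 4  (valid: n - f + 1)
print(conv_output_size(6, 3, p=same_padding(3)))  # 6  (same padding)
print(conv_output_size(7, 3, s=2))                # 3  (strided)
print(conv_output_size(6, 3, s=2))                # 2  (3/2 + 1, rounded down)
```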
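how does cross-correlation differ from true (mathematical) convolution?
true convolution: flip the filter both vertically and horizontally (rotate it 180°), then slide it over the input and sum the elementwise products
cross-correlation: skip the flip and do input x filter directly –> this is what deep learning calls "convolution"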
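a sketch of the flip, assuming a square toy image and the vertical edge filter from earlier. cross_correlate applies the filter as-is (what deep learning frameworks call convolution); convolve flips the filter on both axes first (the mathematical definition). Both function names are made up for illustration.

```python
import numpy as np

def cross_correlate(image, kernel):
    """Slide the kernel as-is over a square image (no padding, stride 1)."""
    f = kernel.shape[0]
    out_size = image.shape[0] - f + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

def convolve(image, kernel):
    """Mathematical convolution: flip the kernel on both axes, then cross-correlate."""
    flipped = kernel[::-1, ::-1]
    return cross_correlate(image, flipped)

image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
kernel = np.array([[1, 0, -1]] * 3, dtype=float)

corr = cross_correlate(image, kernel)
conv = convolve(image, kernel)
print(corr[0])  # [ 0. 30. 30.  0.]
print(conv[0])  # [  0. -30. -30.   0.]
```

for this filter the flip only changes the sign; since the filter values are learned anyway, skipping the flip makes no practical difference in a NN.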
what is the condition on the input and the filter when there is more than 1 channel?
the number of channels of input and filter must be the same
the number of channels of the output will be number of filters used
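a shape check for the multi-channel card, assuming a random 6x6x3 input and two 3x3x3 filters (sizes picked only for illustration). conv_volume is a hypothetical helper using no padding and stride 1.

```python
import numpy as np

def conv_volume(x, filters):
    """x: (n_h, n_w, n_c); filters: (n_filters, f, f, n_c). No padding, stride 1."""
    n_h, n_w, n_c = x.shape
    n_filters, f, _, f_c = filters.shape
    assert f_c == n_c, "filter channels must match input channels"
    out = np.zeros((n_h - f + 1, n_w - f + 1, n_filters))
    for k in range(n_filters):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # One filter spans all input channels and produces one output channel.
                out[i, j, k] = np.sum(x[i:i + f, j:j + f, :] * filters[k])
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 3))           # 3 input channels
filters = rng.standard_normal((2, 3, 3, 3))  # 2 filters, each 3x3x3

y = conv_volume(x, filters)
print(y.shape)  # (4, 4, 2) -> output channels = number of filters
```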
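what is a pooling layer? what are the types of pooling?
at a pooling layer, slide an f x f window over each channel and, instead of multiplying by a filter and summing (*), apply MAX, MIN or AVG to the values in the window
types: max pooling (most common), average pooling, min pooling (rare)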
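a minimal pooling sketch for one channel, assuming a made-up 4x4 input with f = 2 and s = 2. The helper name pool2d and its mode argument are hypothetical.

```python
import numpy as np

def pool2d(x, f=2, s=2, mode="max"):
    """Apply max/min/avg over each f x f window of a single-channel input."""
    reduce_fn = {"max": np.max, "min": np.min, "avg": np.mean}[mode]
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = reduce_fn(x[i * s:i * s + f, j * s:j * s + f])
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)

print(pool2d(x, mode="max"))  # [[9. 2.] [6. 3.]]
print(pool2d(x, mode="avg"))  # [[3.75 1.25] [3.75 2.  ]]
print(pool2d(x, mode="min"))  # [[1. 1.] [1. 1.]]
```

there is nothing to learn here: f, s and the mode are fixed hyperparameters.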
diff between conv layer and pool layer?
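conv layer has params (weights and biases) but pool layer doesn't (only hyperparameters f and s)
when counting layers in a NN, a conv layer and the pool layer after it are often counted together: conv1+pool1 –> layer 1, conv2+pool2 –> layer 2…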
what is parameter sharing?
a feature detector (e.g., a vertical edge detector) that is useful in one part of the image is probably useful in other parts too –> the same filter weights are reused at every position
why use convolution?
parameter sharing
sparsity of connections
what is sparsity of connections in convolution?
in each layer, each output value depends on only a small number of inputs (the f x f window under the filter)
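rough numbers for both ideas, assuming an illustrative 32x32x3 input mapped to a 28x28x6 output with six 5x5 filters (sizes are an assumption, not from the notes). It compares a fully connected layer with a conv layer of the same output size.

```python
# Assumed sizes: 32x32x3 input, six 5x5x3 filters, no padding, stride 1 -> 28x28x6 output.
n_in = 32 * 32 * 3    # 3072 input values
n_out = 28 * 28 * 6   # 4704 output values

# Fully connected: every output is connected to every input.
fc_weights = n_in * n_out

# Convolution: each filter has 5*5*3 weights + 1 bias, reused at every position.
conv_params = (5 * 5 * 3 + 1) * 6

# Sparsity of connections: each conv output only sees one 5x5x3 window.
inputs_per_output = 5 * 5 * 3

print(fc_weights)         # 14450688
print(conv_params)        # 456
print(inputs_per_output)  # 75 (out of 3072 inputs)
```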
what are some classic neural networks?
lenet5: conv1 avgpool1 conv2 avgpool2 fc1 fc2 softmax –> very simple, common type of arrangement
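alexnet: much bigger, uses maxpool instead of avgpool (and ReLU), similar arrangement
vgg16: uses same padding in the conv layers (3x3 filters, stride 1) to preserve output dim; 2x2 maxpool with stride 2 halves the height and width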
what is residual block?
shortcut (skip connection) from an earlier activation to a later layer: the earlier activation is added to the current z before the ReLU
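advantage of resnet?
allows training much deeper NNs without hurting performance (training error does not get worse as layers are added)
why does resnet work well without hurting performance?
the identity function is easy for a residual block to learn –> it can always get back the earlier result
a[l+2] = g(W a[l+1] + b + a[l]), where a[l] is the earlier activation from the shortcut
if W and b are close to 0 (e.g., with weight decay), W a[l+1] + b is close to 0 –> a[l+2] = g(a[l]) = a[l], because a[l] >= 0 after ReLU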
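a small numeric check of the identity argument, assuming a two-layer fully connected residual block with ReLU. The function residual_block and all the weights are made up for illustration; the second layer's weights and bias are set to exactly 0 to mimic the "small W and b" case.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def residual_block(a_prev, W1, b1, W2, b2):
    """Two layers with a shortcut: a_prev is added to z2 before the final ReLU."""
    a1 = relu(W1 @ a_prev + b1)
    z2 = W2 @ a1 + b2
    return relu(z2 + a_prev)

rng = np.random.default_rng(0)
a_prev = relu(rng.standard_normal(4))  # earlier activation, non-negative after ReLU
W1, b1 = rng.standard_normal((4, 4)), rng.standard_normal(4)

# Assumed extreme case: the second layer's weights and bias have shrunk to 0.
W2, b2 = np.zeros((4, 4)), np.zeros(4)

out = residual_block(a_prev, W1, b1, W2, b2)
print(np.array_equal(out, a_prev))  # True -> the block passes the earlier activation through
```

so adding the block cannot make things worse than the shallower network; anything useful the two layers learn is a bonus.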
key feature of resnet?
skip connections (shortcuts) between layers
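what does a 1x1 convolution do?
shrinks or increases the number of channels (height and width stay the same); each output value is a weighted sum over the input channels at that pixel, followed by a nonlinearity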
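a sketch of a 1x1 convolution as per-pixel channel mixing, assuming a random 28x28x192 input reduced to 32 channels (sizes chosen for illustration; bias and nonlinearity left out). conv1x1 is a hypothetical helper.

```python
import numpy as np

def conv1x1(x, w):
    """x: (h, w, n_c); w: (n_c, n_c_out). Mixes channels independently at each pixel."""
    return np.einsum("hwc,cd->hwd", x, w)

rng = np.random.default_rng(0)
x = rng.standard_normal((28, 28, 192))
w = rng.standard_normal((192, 32))  # 32 filters, each 1 x 1 x 192

y = conv1x1(x, w)
print(y.shape)  # (28, 28, 32) -> same height/width, fewer channels

# Each output pixel is just the input pixel's channel vector times w.
print(np.allclose(y[3, 5], x[3, 5] @ w))  # True
```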
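motivation for inception network?
no need to pick a single filter size: apply several in parallel and let the network learn which to use, while 1x1 convs keep the computational cost down
what is inception block?
apply different filters (e.g., 1x1, 3x3, 5x5, maxpool) to the same input with same padding and concat the outputs along the channel dimension –> input to the next layer
a 1x1 conv (bottleneck) shrinks the number of channels before the expensive 3x3 and 5x5 filters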
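a cost sketch for the bottleneck idea plus a shape check for the concat, assuming a 28x28x192 input and branch sizes of 64, 128, 32 and 32 channels (all sizes are illustrative assumptions). conv_mults is a made-up helper that counts multiplications only.

```python
import numpy as np

def conv_mults(n_out, f, c_in, c_out):
    """Multiplications for a conv layer producing an n_out x n_out x c_out volume."""
    return n_out * n_out * c_out * f * f * c_in

# 5x5 same conv, 192 -> 32 channels, applied directly.
direct = conv_mults(28, 5, 192, 32)

# Same output shape, but shrink to 16 channels with a 1x1 conv first (bottleneck).
bottleneck = conv_mults(28, 1, 192, 16) + conv_mults(28, 5, 16, 32)

print(direct)      # 120422400
print(bottleneck)  # 12443648 -> roughly 10x cheaper

# Concat: branch outputs share height/width and are stacked along channels.
# Zero arrays stand in for the real branch outputs.
branches = [np.zeros((28, 28, c)) for c in (64, 128, 32, 32)]
block_out = np.concatenate(branches, axis=-1)
print(block_out.shape)  # (28, 28, 256)
```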
what is the advantage of mobilenet?
low computational cost –> can run on devices with limited resources (e.g., mobile phones)
what is the feature of mobilenet?
depthwise separable conv
depthwise: nc filters, each f x f with a single channel (not f x f x nc); filter i is applied only to input channel i –> output has the same nc
followed by pointwise conv with nc' filters of size 1 x 1 x nc
–> final output is n' x n' x nc'
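a sketch of both steps with a multiplication count, assuming a 6x6x3 input, f = 3, no padding and nc' = 5 (sizes chosen for illustration). depthwise_conv and pointwise_conv are hypothetical helpers, not a library API.

```python
import numpy as np

def depthwise_conv(x, dw):
    """x: (n, n, n_c); dw: (f, f, n_c). One f x f filter per channel, no padding."""
    n, _, n_c = x.shape
    f = dw.shape[0]
    out = np.zeros((n - f + 1, n - f + 1, n_c))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sum over the spatial window only; channels are never mixed here.
            out[i, j, :] = np.sum(x[i:i + f, j:j + f, :] * dw, axis=(0, 1))
    return out

def pointwise_conv(x, pw):
    """pw: (n_c, n_c_out), i.e. n_c_out filters of size 1 x 1 x n_c."""
    return x @ pw

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 3))
dw = rng.standard_normal((3, 3, 3))  # 3 single-channel 3x3 filters
pw = rng.standard_normal((3, 5))     # 5 filters of size 1x1x3

mid = depthwise_conv(x, dw)
out = pointwise_conv(mid, pw)
print(mid.shape, out.shape)  # (4, 4, 3) (4, 4, 5)

# Multiplications: normal conv vs depthwise + pointwise, for the same output shape.
normal = 3 * 3 * 3 * 4 * 4 * 5                     # 2160
separable = 3 * 3 * 4 * 4 * 3 + 3 * 4 * 4 * 5      # 432 + 240 = 672
print(normal, separable)  # 2160 672
```

the ratio 672/2160 is about 0.31, which matches 1/nc' + 1/f^2 = 1/5 + 1/9.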
what is efficientnet?
depth (number of layers), width (number of channels) and input resolution are scaled up together with a fixed ratio (compound scaling)
what is depthwise convolution?
Depthwise convolution operates on each input channel separately, rather than combining all input channels for each output channel as in standard convolution. This significantly reduces the number of parameters, leading to a smaller and faster model.
what is the difference between depthwise conv and depthwise separable conv?
depthwise separable conv is depthwise conv + pointwise (1x1) conv, which mixes the channels and sets the number of output channels