Mentimeter Flashcards
A CNN filter is applied to
all channels of its input layer (the filter spans the full channel depth)
Stride is
step with which the filter is applied
Padding
adds values (typically zeros) around the border of the input, increasing its spatial size
Pooling
- combines feature values within a region
- downsamples feature maps
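A minimal sketch, assuming PyTorch, of how a filter spanning all input channels, stride, padding, and max pooling affect feature-map shapes (the layer sizes are illustrative, not from the cards):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)   # batch of one RGB 32x32 image

# One 3x3 filter spans all 3 input channels; stride 1 with padding 1
# keeps the spatial size at 32x32.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
print(conv(x).shape)            # torch.Size([1, 16, 32, 32])

# Max pooling combines values in each 2x2 region, downsampling to 16x16.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(conv(x)).shape)      # torch.Size([1, 16, 16, 16])
```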
CNN activation is applied to
each channel, element-wise
Hyperparameters can be learned with
a validation set
FC layer is typically used
close to the output side of the network
Typical loss for multiclass classification
- cross entropy
- softmax
- negative log likelihood
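A small sketch, assuming PyTorch, showing that cross entropy is softmax followed by negative log likelihood:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)                               # 4 samples, 10 classes
targets = torch.tensor([3, 1, 0, 7])

ce  = F.cross_entropy(logits, targets)                    # combined loss
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)   # explicit two-step form
print(torch.allclose(ce, nll))                            # True
```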
ReLU can be applied
before or after max-pooling
Learning rate is
the step size of the weight update
Weights are not updated once per
epoch (they are updated once per iteration/mini-batch)
All training data is used to update weights in one
epoch
Averaging updates over iterations is called
momentum
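A sketch of one epoch of SGD with momentum tying the three cards above together; `model`, `criterion`, and `train_loader` are hypothetical names, not from the cards:

```python
import torch

lr, mu = 0.1, 0.9    # learning rate = step size of the weight update
velocity = {p: torch.zeros_like(p) for p in model.parameters()}   # model is hypothetical

for inputs, targets in train_loader:          # weights update once per mini-batch,
    loss = criterion(model(inputs), targets)  # not once per epoch
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            velocity[p].mul_(mu).add_(p.grad)  # momentum: running average of past updates
            p.sub_(lr * velocity[p])
# one full pass over the training data = one epoch
```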
First and second order moments of gradients are used in
- Adadelta
- RMSProp
- Adam
- Adagrad
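A sketch of the Adam-style update, in which a first moment (mean of gradients) and a second moment (mean of squared gradients) steer the step; this is a simplified illustration, not any library's implementation:

```python
import torch

lr, beta1, beta2, eps = 1e-3, 0.9, 0.999, 1e-8

def adam_step(param, grad, m, v, t):          # t starts at 1
    m.mul_(beta1).add_((1 - beta1) * grad)          # first moment estimate
    v.mul_(beta2).add_((1 - beta2) * grad * grad)   # second moment estimate
    m_hat = m / (1 - beta1 ** t)                    # bias correction
    v_hat = v / (1 - beta2 ** t)
    param.sub_(lr * m_hat / (v_hat.sqrt() + eps))   # adaptive step per parameter
```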
Batch normalisation is applied to
channels
Dropout is an effective regularisation of
fully connected layers
L2 regularisation of weights is called
weight decay
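A combined sketch, assuming PyTorch, of the three regularisation cards above: per-channel batch normalisation, dropout between fully connected layers, and weight decay in the optimizer (the 64-channel, 8x8 feature-map size is an assumption for illustration):

```python
import torch.nn as nn
import torch.optim as optim

head = nn.Sequential(
    nn.BatchNorm2d(num_features=64),   # normalises each of the 64 channels
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),                 # dropout between fully connected layers
    nn.Linear(256, 10),
)

# weight_decay adds the L2 penalty on the weights to the update
optimizer = optim.SGD(head.parameters(), lr=0.01, weight_decay=5e-4)
```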
Finetuning is the process of
updating parameters pretrained on another dataset
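A finetuning sketch, assuming a recent torchvision: start from ImageNet-pretrained weights, swap the classifier head for the new task, then keep training so the pretrained parameters are updated (the 5-class target is a made-up example):

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 5)   # new head for a hypothetical 5-class dataset
# training then proceeds as usual; the pretrained parameters are updated ("finetuned")
```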
Data augmentation consists of
generating new samples from existing ones
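A data-augmentation sketch, assuming torchvision transforms; the specific transforms and parameters are illustrative choices:

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),                     # mirror images at random
    transforms.RandomResizedCrop(32, scale=(0.8, 1.0)),    # random crops resized back to 32x32
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # small colour perturbations
    transforms.ToTensor(),
])
```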
A hard negative is a
negative example which is similar to a positive one
A hard positive is a
positive sample which is dissimilar to other positive ones
To debug a model
overfit it on a small dataset
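A debugging sketch of the card above; `model`, `criterion`, and `train_loader` are hypothetical names. A correctly wired model should drive the loss on a handful of samples close to zero:

```python
import torch

small_x, small_y = next(iter(train_loader))   # a single small batch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    loss = criterion(model(small_x), small_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(loss.item())   # should be near zero if the model and pipeline are set up correctly
```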
Bias in a dataset is
a confounding signal introduced during data collection
VGG uses
3x3 filters and max pooling
VGG is widely used because of its
effective feature representation
The efficiency of 1x1 filters was exploited in
Inception
An Inception block uses
parallel filters with concatenated outputs
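A sketch of an Inception-style block, assuming PyTorch; the branch widths are illustrative. Parallel branches (including cheap 1x1 filters) run on the same input and their outputs are concatenated along the channel dimension:

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)                  # 1x1 branch
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, 16, kernel_size=1),   # 1x1 reduces channels
                                nn.Conv2d(16, 24, kernel_size=3, padding=1))
        self.b3 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, 8, kernel_size=1))

    def forward(self, x):
        # concatenate branch outputs along the channel dimension
        return torch.cat([self.b1(x), self.b2(x), self.b3(x)], dim=1)
```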
Skip connections are used in
ResNet
Skip connections in ResNet
do not change the data they carry (the input is passed through unchanged and added to the block's output)
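A sketch of a ResNet-style residual block, assuming PyTorch; the exact layer arrangement is an illustrative simplification. The skip connection leaves the input unchanged and adds it to the convolutional branch's output:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
            nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.body(x) + x)   # identity skip: x is added unchanged
```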