Mentimeter Flashcards
A CNN filter is applied to
all channels of the input layer
Stride is
the step size with which the filter is applied
Padding
increases the spatial size of the input
Pooling
- combines feature values within a region
- downsamples feature maps
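A minimal numpy sketch of the last three cards (illustrative and loop-based, not how frameworks implement it): the stride is the filter's step, padding enlarges the input, and max pooling combines values in each region, downsampling the map.

```python
import numpy as np

def conv2d(x, w, stride=1, pad=0):
    """Single-filter cross-correlation; padding enlarges the input,
    stride is the step with which the filter moves."""
    x = np.pad(x, pad)
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = (patch * w).sum()
    return out

def max_pool(x, k=2):
    """Combines feature values within each k-by-k region (downsampling)."""
    h, w = x.shape[0] // k, x.shape[1] // k
    return x[:h * k, :w * k].reshape(h, k, w, k).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((3, 3))
y = conv2d(x, w, stride=1, pad=1)   # padding of 1 keeps the 4x4 size
p = max_pool(y)                     # downsampled to 2x2
```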
CNN activation is applied to
each channel
Hyperparameters can be learned with
a validation set
FC layer is typically used
close to the output side of the network
Typical loss for multiclass classification
- cross entropy
- softmax
- negative log likelihood
ReLU can be applied
before or after max-pooling
Learning rate is
the step size of the weight update
Weights are not updated once per
epoch (they are updated once per iteration, i.e. per mini-batch)
All training data is used to update weights in one
epoch
Averaging updates over iterations is called
momentum
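The learning-rate and momentum cards can be sketched as a plain SGD loop on a toy quadratic (values are illustrative; momentum here is the classic heavy-ball running average of updates):

```python
# Minimise f(w) = w^2 (gradient 2w) with SGD plus heavy-ball momentum.
w, v = 5.0, 0.0
lr, beta = 0.1, 0.9              # learning rate = step size of the update
for _ in range(300):             # one weight update per iteration
    grad = 2 * w
    v = beta * v + grad          # momentum: running average of updates
    w = w - lr * v
```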
First and second order moments of gradients are used in
- Adadelta
- RMSProp
- Adam
- Adagrad
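Adam, for example, tracks both an exponential average of the gradients (first moment) and of their squares (second moment). A minimal sketch of the update, with illustrative constants and a toy quadratic objective:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: m is the first moment (mean) and v the second
    moment (uncentred variance) of the gradients, with bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    w, m, v = adam_step(w, 2 * w, m, v, t)   # gradient of f(w) = w^2
```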
Batch normalisation is applied to
channels
Dropout is an effective regularisation of
fully connected layers
L2 regularisation of weights is called
weight decay
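As a sketch (toy values, and the data gradient set to zero so only the penalty acts): the L2 term lam * ||w||^2 adds 2*lam*w to the gradient, so each step multiplies the weights by (1 - 2*lr*lam), i.e. they decay:

```python
import numpy as np

rng = np.random.default_rng(0)
w0 = rng.normal(size=5)
w = w0.copy()
lr, lam = 0.1, 0.01
for _ in range(100):
    grad = 2 * lam * w          # gradient of the penalty lam * ||w||^2
    w = w - lr * grad           # multiplies w by (1 - 2*lr*lam) each step
```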
fine-tuning is the process of
updating parameters pretrained on another dataset
data augmentation consists of
generating new samples from existing ones
hard negative is a
negative example which is similar to a positive one
hard positive is a
positive sample which is dissimilar to other positive ones
to debug a model
overfit on a small dataset
bias in a dataset is
confounding noise introduced during data collection
VGG uses
3x3 filters and max pooling
VGG is widely used because of its
effective feature representation
efficiency of 1x1 filters was exploited in
Inception
Inception block uses
parallel filters with concatenated outputs
skip connections are used in
ResNet
skip connections in ResNet
do not change data
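A sketch of why the skip path leaves the data unchanged: the block computes y = x + F(x), so if the residual branch F outputs zero, the block is exactly the identity (toy two-layer branch, assumed shapes):

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = x + F(x): the skip connection adds the input unchanged."""
    h = np.maximum(0, x @ w1)    # residual branch: ReLU(x W1)
    return x + h @ w2

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))
w1 = rng.normal(size=(4, 4))
w2 = np.zeros((4, 4))            # zero out the residual branch...
y = residual_block(x, w1, w2)    # ...and the block is exactly the identity
```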
best performing word embedding is
BERT
Which unit is least effective in remembering sequences
RNN
Gating mechanism uses
sigmoid
in GRU hidden state and input are
concatenated
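A minimal GRU-style cell illustrating both cards: each gate is a sigmoid of a linear map of the concatenated previous hidden state and input. Dimensions and the exact gate convention are illustrative; references differ in details.

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def gru_cell(x, h, Wz, Wr, Wh):
    """One GRU step: gates are sigmoids over the concatenation
    of the previous hidden state h and the current input x."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                         # update gate
    r = sigmoid(Wr @ hx)                         # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d_h, d_x = 3, 2
Wz, Wr, Wh = (rng.normal(size=(d_h, d_h + d_x)) for _ in range(3))
h = gru_cell(rng.normal(size=d_x), np.zeros(d_h), Wz, Wr, Wh)
```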
Language modelling uses architecture type
many to many
Transformer self-attention uses
linear projections
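A sketch of single-head self-attention: queries, keys and values are linear projections of the same input sequence, combined by a scaled row-wise softmax (shapes are illustrative):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: Q, K and V are linear
    projections of the same input sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)        # row-wise softmax
    return w @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                      # 5 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
```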
What is the goal of reinforcement learning?
maximise expected return
The discount factor in value function is used to
weigh immediate and future rewards
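A sketch of the discounted return: gamma close to 0 emphasises immediate rewards, gamma close to 1 weighs future rewards almost equally (reward values are made up):

```python
def discounted_return(rewards, gamma):
    """G = r0 + gamma*r1 + gamma^2*r2 + ...; gamma weighs
    immediate rewards against future ones."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)   # 1 + 0.5 + 0.25
```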
which behaviour is exploration in game playing
play an experimental move
what is the main drawback of the Monte Carlo sampling approach to RL
needs to run an entire episode before updating
what is the main difference between Q-learning and SARSA
SARSA uses the epsilon-greedy policy in its update (it is on-policy)
main problem of Q-learning
not scalable (the Q-table grows with the state-action space)
in deep Q-learning ‘deep’ is mainly used to
approximate Q function
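For contrast with the scalability card, here is tabular Q-learning on a toy 3-state chain, i.e. the table that deep Q-learning replaces with a network approximating Q (the environment and constants are made up):

```python
import numpy as np

# Toy 3-state chain: action 1 moves right, action 0 stays;
# reaching state 2 pays reward 1 and ends the episode.
Q = np.zeros((3, 2))
alpha, gamma = 0.5, 0.9
rng = np.random.default_rng(0)

for _ in range(200):
    s = 0
    while s != 2:
        a = int(rng.integers(2))                 # random behaviour policy
        s2 = min(s + 1, 2) if a == 1 else s
        r = 1.0 if s2 == 2 else 0.0
        target = r if s2 == 2 else r + gamma * Q[s2].max()   # off-policy max
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2
```

The `Q[s2].max()` in the target is what makes this off-policy: it bootstraps from the greedy action rather than the action the random behaviour policy actually takes.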
in policy-based methods, do we select actions according to a value function
no
policy optimisation can only be performed using gradient-based methods
false
REINFORCE is based on
Monte Carlo
in REINFORCE with baseline the baseline is used to
reduce variance
which method is not designed to reduce variance
REINFORCE
in actor-critic methods, the critic is similar to which part of a GAN
discriminator
compared to value-based methods, policy-based methods can handle continuous action spaces more easily?
true
which parameters are not hyperparameters
weights of convolutional kernel
which hyperparameter optimisation method is more efficient
random search
in successive halving, the number of configurations n indicates
exploration
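A sketch of successive halving with a made-up noisy evaluation function: a larger initial n explores more configurations, and the best half survive each round with a doubled budget:

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(config, budget):
    """Made-up stand-in for training a model: a noisy estimate of a
    configuration's quality, with noise shrinking as budget grows."""
    return config + rng.normal(scale=1.0 / budget)

def successive_halving(configs, budget=1):
    configs = list(configs)
    while len(configs) > 1:
        scores = [evaluate(c, budget) for c in configs]
        keep = np.argsort(scores)[::-1][: len(configs) // 2]
        configs = [configs[i] for i in keep]     # keep the best half
        budget *= 2                              # survivors get more budget
    return configs[0]

best = successive_halving(rng.uniform(size=16))  # larger n = more exploration
```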
in meta-learning, only training tasks contain a training set and a test set
false
in meta-learning the total loss is computed using
test examples
meta-learning and multi-task learning are the same
false
which colour representation can be used to compute colour similarities
RGB colour space
unsupervised representation learning can’t be used for
learning a mapping function from data to labels
autoencoder is an
unsupervised method
in an autoencoder the decoder must be symmetric to the encoder
false
as long as an autoencoder can reconstruct the input, this autoencoder can learn useful representations of the input
false
what objective function is used to train an autoencoder
reconstruction loss
which is not an autoencoder?
disruptive
which autoencoder can be used to perform dimensionality reduction
undercomplete autoencoders
in autoencoders which technique is used for anomaly detection
reconstruction error
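A sketch tying the last few cards together, using the standard observation that an undercomplete linear autoencoder is equivalent to PCA: encode synthetic 2-D data into a 1-D bottleneck, decode back, and flag the point with the largest reconstruction error as the anomaly:

```python
import numpy as np

rng = np.random.default_rng(0)
# Normal data lies near a 1-D line in 2-D; point 0 is placed off the line.
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(100, 2))
X[0] = [3.0, -3.0]                            # the anomaly

# Undercomplete linear "autoencoder" via the top principal component.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
code = Xc @ Vt[:1].T                          # encoder: 2-D -> 1-D bottleneck
recon = code @ Vt[:1]                         # decoder: back to 2-D
errors = np.linalg.norm(Xc - recon, axis=1)   # reconstruction error
anomaly = int(np.argmax(errors))              # worst-reconstructed point
```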
which autoencoder should be used to recover noisy data
denoising autoencoder
an image classification model is a
discriminative model
VAEs are
explicit methods
how are VAEs trained
maximising a lower bound on the likelihood (the ELBO)
the reparameterisation trick in VAEs is used for
training (it lets gradients backpropagate through the sampling step)
GANs are
implicit methods
which loss is better for training the generator of GANs
non-saturating heuristic
what do GANs and VAEs have in common
both are generative models
Are VAEs easier to train but generate less sharp images?
yes