Mentimeters Flashcards

1
Q

A CNN filter is applied to

A

all channels of the input layer

2
Q

Stride is

A

step with which the filter is applied

3
Q

Padding

A

increases the size of input data

4
Q

Pooling

A
  • combines feature values within a region
  • downsamples feature maps
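The convolution cards above (filter, stride, padding) and this pooling card can be sketched together. A minimal 1-D illustration in plain Python; the helper names (`conv1d`, `max_pool1d`) are made up for this sketch, not a library API:

```python
def conv1d(x, w, stride=1, pad=0):
    """Slide filter w over input x; the filter steps by `stride` each time."""
    x = [0.0] * pad + list(x) + [0.0] * pad          # padding increases the input size
    out = []
    for i in range(0, len(x) - len(w) + 1, stride):
        out.append(sum(xi * wi for xi, wi in zip(x[i:i + len(w)], w)))
    return out

def max_pool1d(x, size=2):
    """Combine feature values within each region, downsampling the map."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

feat = conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0], stride=1, pad=1)
pooled = max_pool1d(feat)      # half as many values as `feat`
```

In a real CNN the filter would also span all input channels; the 1-D case only shows the stride, padding, and pooling mechanics.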
5
Q

CNN activation is applied to

A

channel

6
Q

Hyperparameters can be learned with

A

a validation set

7
Q

FC layer is typically used

A

close to the output side of the network

8
Q

Typical loss for multiclass classification

A
  • cross entropy
  • softmax
  • negative log likelihood
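These three answers are tightly linked: softmax converts logits into probabilities, and cross entropy is the negative log-likelihood of the true class under those probabilities. A plain-Python sketch (function names are illustrative):

```python
import math

def softmax(logits):
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, true_class):
    """Negative log-likelihood of the true class under softmax probabilities."""
    return -math.log(softmax(logits)[true_class])

loss = cross_entropy([2.0, 1.0, 0.1], true_class=0)   # low loss: class 0 has the largest logit
```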
9
Q

ReLU can be applied

A

before or after max-pooling

10
Q

Learning rate is

A

the step size of weight updates
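A one-line sketch of what that means: in gradient descent the learning rate scales the gradient, so it directly sets the step size (illustrative names):

```python
def sgd_step(w, grad, lr=0.1):
    """One gradient-descent step: lr is the step size of the weight update."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

w = sgd_step([1.0, -2.0], grad=[0.5, -0.5], lr=0.1)   # halving lr would halve the step
```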

11
Q

Weights are not updated once per

A

epoch

12
Q

All training data is used to update weights in one

A

epoch

13
Q

Averaging updates over iterations is called

A

momentum

14
Q

first- and second-order moments of gradients are used in

A
  • adadelta
  • RMSProp
  • Adam
  • Adagrad
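Of the optimisers listed, Adam is the one that explicitly combines both moments: a running mean of gradients (first moment) and of squared gradients (second moment). A scalar sketch with the usual default constants; illustrative, not a library implementation:

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad           # first moment: running mean of gradients
    v = b2 * v + (1 - b2) * grad ** 2      # second moment: running mean of squared gradients
    m_hat = m / (1 - b1 ** t)              # bias correction for the zero initialisation
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

w, m, v = adam_step(w=1.0, grad=0.5, m=0.0, v=0.0, t=1)   # first step has size about lr
```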
15
Q

Batch normalisation is applied to

A

channels

16
Q

Dropout is an effective regularisation of

A

fully connected layers
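A sketch of (inverted) dropout on one activation vector: during training each unit is dropped with probability p and survivors are rescaled by 1/(1-p), so inference is just the identity. Names are illustrative:

```python
import random

def dropout(x, p=0.5, training=True, rng=random.Random(0)):
    if not training:
        return list(x)                     # identity at inference time
    keep = 1.0 - p
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]

out = dropout([1.0, 2.0, 3.0, 4.0], p=0.5)   # some units zeroed, the rest scaled by 2
```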

17
Q

L2 regularisation of weights is called

A

decay
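The name comes from the update rule: an L2 penalty on the weights contributes a gradient term lam * w, which shrinks (decays) every weight a little at each step. A scalar sketch with illustrative names:

```python
def decayed_step(w, grad, lr=0.1, lam=0.01):
    """SGD step with L2 weight decay: the lam * w term pulls w toward zero."""
    return w - lr * (grad + lam * w)

w = decayed_step(w=2.0, grad=0.0)   # even with zero loss gradient, w shrinks slightly
```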

18
Q

finetuning is a process of

A

updating parameters pretrained on another dataset

19
Q

data augmentation consists of

A

generating new samples from existing ones

20
Q

hard negative is a

A

negative example which is similar to a positive one

21
Q

hard positive is a

A

a positive example that is dissimilar to other positive ones

22
Q

to debug a model

A

overfit on a small dataset

23
Q

bias in a dataset is

A

confusing noise introduced during data collection

24
Q

VGG uses

A

3x3 filters and max pooling

25
Q

VGG is widely used because of its

A

effective feature representation

26
Q

efficiency of 1x1 filters was exploited in

A

inception

27
Q

inception block uses

A

parallel filters with concatenated outputs

28
Q

skip connections are used in

A

ResNet

29
Q

skip connections in ResNet

A

do not change data
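That is the point of the identity skip path: it passes the input through unchanged, so the block only has to learn the residual f(x). A minimal sketch:

```python
def residual_block(x, f):
    """Output = x + f(x): the skip path leaves x untouched."""
    return [xi + fi for xi, fi in zip(x, f(x))]

out = residual_block([1.0, 2.0], f=lambda x: [0.1 for _ in x])  # if f were zero, out == x
```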

30
Q

best performing word embedding is

A

BERT

31
Q

Which unit is least effective in remembering sequences

A

RNN

32
Q

Gating mechanism uses

A

sigmoid

33
Q

in a GRU, the hidden state and input are

A

concatenated

34
Q

Language modelling uses architecture type

A

many to many

35
Q

Transformers self attention uses

A

linear projections
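A toy sketch of single-head self-attention on 2-d vectors: queries, keys, and values are linear projections of the same inputs (here hand-picked identity matrices, purely for illustration), combined by softmax-weighted dot products:

```python
import math

def matvec(W, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    t = sum(exps)
    return [e / t for e in exps]

def self_attention(xs, Wq, Wk, Wv):
    qs, ks, vs = ([matvec(W, x) for x in xs] for W in (Wq, Wk, Wv))
    d = len(qs[0])
    out = []
    for q in qs:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in ks]
        weights = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(weights, vs)) for j in range(d)])
    return out

I2 = [[1.0, 0.0], [0.0, 1.0]]                       # identity "projections", for illustration
out = self_attention([[1.0, 0.0], [0.0, 1.0]], I2, I2, I2)
```

In a real transformer Wq, Wk, and Wv are learned, and multiple heads run in parallel.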

36
Q

What is the goal of reinforcement learning?

A

maximise expected return

37
Q

The discount factor in value function is used to

A

weigh immediate and future rewards
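Concretely, the return is G_t = r_t + gamma * G_{t+1}: gamma = 0 counts only the immediate reward, while gamma near 1 weighs future rewards almost as much. A sketch:

```python
def discounted_return(rewards, gamma):
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g          # G_t = r_t + gamma * G_{t+1}
    return g

myopic = discounted_return([1.0, 1.0, 1.0], gamma=0.0)      # counts only the first reward
farsighted = discounted_return([1.0, 1.0, 1.0], gamma=1.0)  # counts all rewards equally
```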

38
Q

which behaviour is exploration in game playing

A

play an experimental move

39
Q

what is the main drawback of the Monte Carlo sampling approach to RL

A

needs to run an entire episode before updating

40
Q

what is the main difference between Q-learning and SARSA

A

SARSA uses an epsilon-greedy (on-policy) update rule
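The two tabular updates side by side make the difference concrete: Q-learning bootstraps from the greedy (max) next action regardless of what the policy did, while SARSA uses the next action actually taken. An illustrative sketch:

```python
def q_learning_update(Q, s, a, r, s2, alpha=0.5, gamma=0.9):
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])   # off-policy: greedy target

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.5, gamma=0.9):
    Q[s][a] += alpha * (r + gamma * Q[s2][a2] - Q[s][a])    # on-policy: action actually taken

Q_q = [[0.0, 0.0], [1.0, 5.0]]
q_learning_update(Q_q, s=0, a=0, r=1.0, s2=1)               # target uses max(Q[1]) = 5.0
Q_s = [[0.0, 0.0], [1.0, 5.0]]
sarsa_update(Q_s, s=0, a=0, r=1.0, s2=1, a2=0)              # target uses Q[1][0] = 1.0
```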

41
Q

main problem of Q-learning

A

not scalable

42
Q

in deep Q learning ‘deep’ is mainly used to

A

approximate Q function

43
Q

in policy based methods do we select actions according to value function

A

no

44
Q

policy optimisation can only be performed using gradient based methods

A

false

45
Q

REINFORCE is based on

A

Monte Carlo

46
Q

in REINFORCE with baseline the baseline is used to

A

reduce variance

47
Q

which method is not designed to reduce variance

A

REINFORCE

48
Q

in actor-critic methods, critic is similar to which part of a GAN

A

discriminator

49
Q

compared to value-based methods, policy-based methods can handle continuous action easily?

A

true

50
Q

which parameters are not hyperparameters

A

weights of convolutional kernel

51
Q

which hyperparameters optimization method is more efficient

A

random search

52
Q

in successive halving, the number of configurations n indicates

A

exploration
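A sketch of the loop, with made-up names: start with n configurations on a small budget, keep the best half each round, and grow the budget for survivors; a larger n buys more exploration of the search space:

```python
def successive_halving(configs, evaluate, budget=1):
    while len(configs) > 1:
        ranked = sorted(configs, key=lambda c: evaluate(c, budget), reverse=True)
        configs = ranked[: max(1, len(ranked) // 2)]   # keep the best-scoring half
        budget *= 2                                    # survivors get more resources
    return configs[0]

best = successive_halving([0.1, 0.2, 0.3, 0.4], evaluate=lambda c, b: c)
```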

53
Q

in meta learning only training tasks contain training set and test set

A

false

54
Q

in meta learning total loss is computed using

A

test examples

55
Q

meta learning and multi task learning are the same

A

false

56
Q

which colour representation can be used to compute colour similarities

A

RGB colour space

57
Q

unsupervised representation learning can’t be used for

A

learning a mapping function from data to labels

58
Q

autoencoder is an

A

unsupervised method

59
Q

in autoencoder the decoder must be symmetric to the encoder

A

false

60
Q

as long as an autoencoder can reconstruct the input, this autoencoder can learn useful representations of the input

A

false

61
Q

what objective function is used to train autoencoder

A

reconstruction loss

62
Q

which is not an autoencoder?

A

disruptive

63
Q

which autoencoder can be used to perform dimensionality reduction

A

undercomplete autoencoders

64
Q

in autoencoders which technique is used for anomaly detection

A

reconstruction error

65
Q

which autoencoder should be used to recover noisy data

A

denoising autoencoder

66
Q

an image classification model is a

A

discriminative model

67
Q

VAEs are

A

explicit methods

68
Q

how are VAEs trained

A

maximising likelihood

69
Q

the reparameterisation trick in VAEs is used for

A

training
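Specifically, it rewrites sampling as z = mu + sigma * eps with eps ~ N(0, 1), moving the randomness outside the parameters so gradients can flow through mu and sigma during training. A sketch:

```python
import random

def reparameterise(mu, sigma, rng=random.Random(0)):
    eps = rng.gauss(0.0, 1.0)      # noise drawn independently of the parameters
    return mu + sigma * eps        # differentiable in mu and sigma

z = reparameterise(mu=0.0, sigma=1.0)
```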

70
Q

GANs are

A

implicit methods

71
Q

which loss is better for training the generator of GANs

A

non-saturating heuristic

72
Q

what do GANs and VAEs have in common

A

both are generative models

73
Q

Are VAEs easier to train but generate less sharp images?

A

yes