Multi-Choice: Deep Learning Flashcards

1
Q

Q1 (Deep Learning). A neural net outputs f(x1)=0.1, f(x2)=0.5, f(x3)=0.7 for targets (1, 0, 1). What is the binary cross-entropy loss, averaged over the three examples, to 2 decimals? (One choice) 1) -1.12 2) 4 3) 1.12 4) 0.77 5) 2.12

A

Correct item: 3. Explanation: BCE = -(1/3)[ln(0.1) + ln(1 - 0.5) + ln(0.7)] = (2.30 + 0.69 + 0.36)/3 ≈ 1.12.
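
A quick sanity check, as a minimal sketch in plain Python (assuming the natural-log form of BCE, averaged over the three examples):

```python
import math

# Predictions and targets from the question.
preds = [0.1, 0.5, 0.7]
targets = [1, 0, 1]

# Per-example BCE is -(y*log(p) + (1 - y)*log(1 - p)); average over examples.
bce = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
           for p, y in zip(preds, targets)) / len(preds)
print(round(bce, 2))  # 1.12
```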

2
Q

Q2 (Deep Learning). A fully connected NN has 5 hidden layers of 10 neurons each; the input is 55×55×3 (flattened), and the output has size 1. Total number of parameters, including biases? (One choice) 1) 91200 2) 91211 3) 91210 4) 91201 5) 91160 6) 91161 7) 9075

A

Correct item: 2. Explanation: Input layer 9075 → 10: 9075×10 + 10 = 90760; four 10 → 10 layers: 4×(10×10 + 10) = 440; output layer 10 → 1: 10 + 1 = 11. Total = 91211.
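
The total can be checked mechanically, as a minimal sketch (assuming the 55×55×3 input is flattened and every layer has one bias per output unit):

```python
# Layer sizes: flattened input, five hidden layers of 10 units, scalar output.
sizes = [55 * 55 * 3] + [10] * 5 + [1]

# Each layer contributes (inputs * outputs) weights plus one bias per output.
total = sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))
print(total)  # 91211
```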

3
Q

Q3 (Deep Learning). A CNN applies three conv layers to a 55×55 RGB image: (1) 5×5 kernels, 5 filters, stride 1, pad 0; (2) 7×7 kernels, 10 filters, stride 2, pad 0; (3) 7×7 kernels, 20 filters, stride 4, pad 0. What are the final output dimensions? (One choice) 1) 5×5×20 2) 6×6×20 3) 23×23×10 4) 51×51×5 5) 55×55×3 6) 5×5×3

A

Correct item: 1. Explanation: With no padding, each layer gives ⌊(n − k)/s⌋ + 1, so the spatial size goes 55 → 51 → 23 → 5; the final depth is the 20 filters of layer 3, giving 5×5×20.
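
A minimal sketch tracing the sizes (assuming 'valid' convolutions, i.e. out = (in − k) // s + 1):

```python
# Trace spatial size and depth through the three conv layers.
size, depth = 55, 3
for kernel, filters, stride in [(5, 5, 1), (7, 10, 2), (7, 20, 4)]:
    size = (size - kernel) // stride + 1  # no padding
    depth = filters
    print(f"{size}x{size}x{depth}")
# Prints 51x51x5, then 23x23x10, then 5x5x20.
```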

4
Q

Q4 (Deep Learning). Same CNN as Q3. Total number of parameters, including biases? (One choice) 1) 9820 2) 12660 3) 2460 4) 380 5) 12280 6) 12760

A

Correct item: 2. Explanation: Layer 1: 5×5×3×5 + 5 = 380; layer 2: 7×7×5×10 + 10 = 2460; layer 3: 7×7×10×20 + 20 = 9820. Total = 12660.
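
A minimal sketch of the per-layer counts (assuming one bias per filter, i.e. k·k·in_depth·filters + filters parameters per layer):

```python
# Per-layer parameter counts for the Q3 CNN.
in_depth, total = 3, 0
for kernel, filters in [(5, 5), (7, 10), (7, 20)]:
    params = kernel * kernel * in_depth * filters + filters
    print(params)       # 380, then 2460, then 9820
    total += params
    in_depth = filters  # the next layer convolves over this many channels
print(total)            # 12660
```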

5
Q

Q5 (Deep Learning). A single-hidden-layer net for multiclass classification: h = σ(W(1)x + b(1)), ŷ = softmax(W(2)h + b(2)). Input dimension D, k classes, H hidden units. Total number of parameters? (One choice) 1) (D+1)H + (H+1)k 2) D+H+k 3) HD+kH 4) H+k 5) HD+kH+k

A

Correct item: 1. Explanation: The first layer has DH weights plus H biases, i.e. (D+1)H parameters; the second has Hk weights plus k biases, i.e. (H+1)k.
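
The closed form can be cross-checked against an explicit count of weight-matrix and bias entries; the sizes below are illustrative, not from the card:

```python
# Hypothetical sizes: input dim D, hidden units H, classes k.
D, H, k = 7, 5, 3

closed_form = (D + 1) * H + (H + 1) * k
# W(1) is HxD with H biases; W(2) is kxH with k biases.
explicit = (H * D + H) + (k * H + k)
assert closed_form == explicit
print(closed_form)  # 58
```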

6
Q

Q6 (Deep Learning). Same net as Q5 with cross-entropy error; what is the gradient with respect to w(1)ᵢⱼ? (One choice) 1) Σₖ(yₖ - ŷₖ) hᵢ(1-hᵢ) xⱼ 2) Σₖ(yₖ - ŷₖ) w(2)ₖᵢ hᵢ(1-hᵢ) xⱼ 3) Σₖ(yₖ - ŷₖ) w(2)ₖᵢ hᵢ(1-hᵢ) 4) Σₖ(yₖ - ŷₖ) w(2)ₖᵢ hᵢ(1-hᵢ) xᵢ …

A

Correct item: 2. Explanation: By the chain rule, the softmax/cross-entropy pair contributes (yₖ - ŷₖ), the output weights contribute w(2)ₖᵢ, the sigmoid derivative contributes hᵢ(1-hᵢ), and the input contributes xⱼ, giving Σₖ(yₖ - ŷₖ) w(2)ₖᵢ hᵢ(1-hᵢ) xⱼ.
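
The formula can be verified numerically with a finite-difference check, sketched below in NumPy (all sizes and the seed are illustrative; note that with E = -Σₖ yₖ ln ŷₖ, dE/dw carries the opposite sign to the (yₖ - ŷₖ) form, so the difference quotient is negated to match):

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, K = 4, 3, 2                       # illustrative sizes
x = rng.normal(size=D)
y = np.array([1.0, 0.0])                # one-hot target
W1, b1 = rng.normal(size=(H, D)), rng.normal(size=H)
W2, b2 = rng.normal(size=(K, H)), rng.normal(size=K)

def forward(W1):
    h = 1 / (1 + np.exp(-(W1 @ x + b1)))            # sigmoid hidden layer
    z = W2 @ h + b2
    yhat = np.exp(z - z.max()); yhat /= yhat.sum()  # softmax output
    return -np.sum(y * np.log(yhat)), h, yhat       # cross-entropy E

E, h, yhat = forward(W1)
i, j = 1, 2  # arbitrary weight index

# Card's expression: sum_k (y_k - yhat_k) * w2_ki * h_i(1 - h_i) * x_j
analytic = np.sum((y - yhat) * W2[:, i]) * h[i] * (1 - h[i]) * x[j]

# Finite difference of E; negated to match the (y - yhat) sign convention.
eps = 1e-6
W1p = W1.copy(); W1p[i, j] += eps
numeric = -(forward(W1p)[0] - E) / eps
print(analytic, numeric)  # should agree to roughly 1e-5
```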

7
Q

Q7 (Deep Learning). Which statements are true of convolutional layers? (Two correct) 1) They are well suited to all tabular data 2) They are trained by gradient descent 3) They have fewer parameters than a fully connected layer on the same input 4) They use linear activations

A

Correct items: 2 and 3. Explanation: Convolutional layers share weights across spatial positions, so they need far fewer parameters than a fully connected layer on the same input, and like other layers they are trained via gradient descent. They exploit spatial structure (so 1 is false for generic tabular data) and are normally paired with nonlinear activations (so 4 is false).
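
To make item 3 concrete, a minimal sketch comparing one conv layer against a fully connected layer mapping the same input to the same output map (sizes borrowed from Q3's first layer; purely illustrative):

```python
# Conv layer: 5 filters of 5x5 over 3 input channels, one bias each.
conv_params = 5 * 5 * 3 * 5 + 5
print(conv_params)  # 380 parameters, shared across all spatial positions

# Fully connected layer mapping the 55x55x3 input to the same 51x51x5 output:
in_units, out_units = 55 * 55 * 3, 51 * 51 * 5
fc_params = in_units * out_units + out_units
print(fc_params)    # over 1e8 parameters for the same input/output sizes
```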

8
Q

Q8 (Deep Learning). Which statements are true of RNNs? (Two correct) 1) LSTMs fix vanishing-gradient issues 2) The parameter count depends on sequence length 3) RNNs only handle many-to-one tasks 4) The RNN design uses parameter sharing

A

Correct items: 1 and 4. Explanation: LSTM gating mitigates vanishing gradients, and RNNs reuse the same weights at every timestep (parameter sharing). Consequently the parameter count is independent of sequence length (2 is false), and RNNs support one-to-many, many-to-one, and many-to-many tasks (3 is false).
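
A minimal sketch of item 4, assuming the standard vanilla-RNN parameterization (W_xh, W_hh, b_h): the same weights are reused at every timestep, so unrolling over longer sequences adds no parameters.

```python
input_size, hidden_size = 8, 16  # illustrative sizes

# One input-to-hidden matrix, one hidden-to-hidden matrix, one bias vector,
# reused at every timestep.
n_params = input_size * hidden_size + hidden_size * hidden_size + hidden_size

for seq_len in (1, 10, 1000):
    print(seq_len, n_params)  # the count never changes with sequence length
```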
