ML-04 - Neural network Flashcards

1
Q

ML-04-Neural network

Why use ANN over polynomial regression?

A

For a large number of features, the polynomial feature set grows too large.

Ex:
1,000 raw features expanded with all quadratic terms x_i * x_j gives on the order of n^2/2 ≈ 500,000 input features. Scales even worse for cubic and beyond (~n^3/6).
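The blow-up is easy to count directly; a small sketch (the helper name is mine, not from the course):

```python
from math import comb

# Distinct quadratic terms for n raw features: all cross terms x_i * x_j
# (i < j) plus the n squares x_i^2 -- roughly n^2 / 2 in total.
def num_quadratic_features(n):
    return comb(n, 2) + n

print(num_quadratic_features(1_000))  # 500500 extra inputs from quadratics alone
```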

2
Q

ML-04-Neural network

Where are the dendrites located? (See image)

A

(See image)

3
Q

ML-04-Neural network

Where is the nucleus located? (See image)

A

(See image)

4
Q

ML-04-Neural network

Where is the axon located? (See image)

A

(See image)

5
Q

ML-04-Neural network

Where are the input wires located? (See image)

A

(See image)

6
Q

ML-04-Neural network

Where is the cell body located? (See image)

A

(See image)

7
Q

ML-04-Neural network

Where is the output wire located? (See image)

A

(See image)

8
Q

ML-04-Neural network

Where is the node of Ranvier located? (See image)

A

(See image)

9
Q

ML-04-Neural network

Where is the axon terminal located? (See image)

A

(See image)

10
Q

ML-04-Neural network

Where is the myelin sheath located? (See image)

A

(See image)

11
Q

ML-04-Neural network

Where is the Schwann cell located? (See image)

A

(See image)

12
Q

ML-04-Neural network

What is the difference between a NN and the perceptron?

A
  • The perceptron uses a step function.
  • Perceptron outputs are binary, i.e. in {0, 1}.
  • A NN can use other activation functions.
  • NN outputs are real-valued, often in [0, 1] or [-1, 1].
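The contrast shows up directly in the activations; a minimal sketch comparing the two:

```python
import numpy as np

def step(z):
    # Perceptron activation: hard threshold, outputs in {0, 1}.
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    # A typical NN activation: smooth and differentiable, outputs in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
print(step(z))     # [0 1 1]
print(sigmoid(z))  # real values strictly between 0 and 1
```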
13
Q

ML-04-Neural network

What notation would you use to denote which layer a weight belongs to?

A

w_3^(1) = weight in the 1st layer, belonging to the 3rd neuron.
It connects the input values (layer 1) with the 2nd layer.
(See image)

14
Q

ML-04-Neural network

What notation would you use to denote the 3rd neuron’s activation in the 2nd layer?

A

a_3^(2) = 2nd layer, 3rd neuron
(See image)

15
Q

ML-04-Neural network

Describe what the numbers mean in this picture (See image)

A
  • Red is the layer
  • Blue is which neuron in the layer it is or which neuron the weight belongs to.
  • Green is which specific weight it is.
16
Q

ML-04-Neural network

In a neural network, what is “a_1^(2)”?

A

The activation for the 1st neuron in the 2nd layer, i.e.

a = g(z) = g(W^T * x)

17
Q

ML-04-Neural network

In a neural network, what is z_1^(2)?

A

The weighted summation of the inputs for the 1st neuron in the 2nd layer.

z = W^T * x
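The two cards above combine into one forward pass; the weights and inputs below are made-up values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -1.0])       # activations of the previous layer
W = np.array([[0.2, -0.4,  0.1],     # weight vector of neuron 1
              [0.7,  0.3, -0.5]])    # weight vector of neuron 2

z = W @ x          # z_j^(2): weighted sum of the inputs for neuron j
a = sigmoid(z)     # a_j^(2) = g(z_j^(2)): the neuron's activation
print(z)           # [-0.1  1.35]
print(a)
```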

18
Q

ML-04-Neural network

What is the difference between cross-entropy and sparse cross-entropy?

A
  • CE uses one-hot coded labels.
  • SCE uses integer labels.
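Both variants compute the same number; sparse just indexes with the integer label instead of multiplying by a one-hot vector. A sketch with made-up softmax outputs:

```python
import numpy as np

probs = np.array([0.1, 0.7, 0.2])    # softmax output for one example
one_hot = np.array([0.0, 1.0, 0.0])  # class 1, one-hot coded
label = 1                            # class 1, integer coded

ce  = -np.sum(one_hot * np.log(probs))  # categorical cross-entropy
sce = -np.log(probs[label])             # sparse categorical cross-entropy
print(ce, sce)  # identical values
```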
19
Q

ML-04-Neural network

Why use sparse cross-entropy loss?

A

Saves memory and computation when there are many classes: integer labels avoid building and multiplying by large one-hot vectors.

20
Q

ML-04-Neural network

How can you use a neural network for linear regression?

A

Single layer network with activation function g(x) = x.
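A minimal sketch of that idea, assuming plain gradient descent on made-up data (fit y = 2x + 1 with one identity-activation unit):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X + 1.0                 # target: y = 2x + 1

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    y_hat = w * X + b                    # identity activation: a = g(z) = z
    w -= lr * np.mean((y_hat - y) * X)   # gradient of (1/2) * MSE
    b -= lr * np.mean(y_hat - y)
print(w, b)  # close to 2 and 1
```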

21
Q

ML-04-Neural network

What’s the requirement for an activation function?

A

Must be differentiable.

22
Q

ML-04-Neural network

Should you calculate regularization for the bias term?

A

No.

23
Q

ML-04-Neural network

How do you check if backpropagation is correctly implemented?

A

Gradient checking.

24
Q

ML-04-Neural network

What is gradient checking?

A

Numerically estimating the gradient and comparing it with the analytical (backpropagation) gradient. Roughly equal values mean the implementation is likely correct.
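A sketch of the check, assuming a scalar cost J of a parameter vector (function names are mine):

```python
import numpy as np

def numeric_grad(J, theta, eps=1e-4):
    # Two-sided estimate: (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

# Compare against a known analytical gradient: J = sum(theta^2), dJ/dtheta = 2*theta.
theta = np.array([1.0, -2.0, 3.0])
analytical = 2 * theta
estimate = numeric_grad(lambda t: np.sum(t ** 2), theta)
print(np.max(np.abs(estimate - analytical)))  # tiny difference -> gradient code is OK
```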

25
Q

ML-04-Neural network

What’s the formula for gradient checking?

A

(See image)
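If the image is unavailable: the usual two-sided (centered-difference) estimate for each parameter theta_i is

```latex
\frac{\partial J}{\partial \theta_i} \approx
\frac{J(\theta_1,\dots,\theta_i+\varepsilon,\dots,\theta_n) -
      J(\theta_1,\dots,\theta_i-\varepsilon,\dots,\theta_n)}{2\varepsilon},
\qquad \varepsilon \approx 10^{-4}.
```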

26
Q

ML-04-Neural network

What’s a problem you might face when initializing weights too large?

A

Exploding gradients problem.

27
Q

ML-04-Neural network

What’s a problem you might face when initializing weights too small?

A

Vanishing gradients problem.

28
Q

ML-04-Neural network

What causes vanishing/exploding gradients?

A

The recursive nature of backpropagation: gradients are products of many per-layer factors (weights and activation derivatives), so they can shrink or grow exponentially with depth.

29
Q

ML-04-Neural network

How does Xavier initialization work?

A
  • Weights: layer l is sampled from a normal distribution with mean 0 and variance 1/n^(l-1), where n^(l-1) is the number of units in the previous layer.
  • Bias: zero.
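A sketch of the normal-variant initialization described above (function name is mine):

```python
import numpy as np

def xavier_init(n_in, n_out, seed=0):
    # Xavier/Glorot, normal variant: W ~ N(0, 1/n_in), biases zero,
    # so activation variance stays roughly constant across layers.
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_out, n_in))
    b = np.zeros(n_out)
    return W, b

W, b = xavier_init(n_in=400, n_out=25)
print(W.std())  # close to sqrt(1/400) = 0.05
```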
30
Q

ML-04-Neural network

What output activation function would you use for binary classification?

A

Sigmoid

31
Q

ML-04-Neural network

What output activation function would you use for multiclass classification?

A

Softmax
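A minimal softmax sketch (made-up scores), showing why it suits multiclass output: the results form a probability distribution over the classes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())  # non-negative values summing to 1
```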

32
Q

ML-04-Neural network

What output activation function would you use for multi-label classification?

A

Sigmoid

33
Q

ML-04-Neural network

What output activation function would you use for regression?

A

A linear activation function.

34
Q

ML-04-Neural network

What loss function would you use for binary classification?

A

Binary cross entropy

35
Q

ML-04-Neural network

What loss function would you use for multiclass classification?

A

Categorical cross-entropy

36
Q

ML-04-Neural network

What loss function would you use for multi-label classification?

A

Binary cross-entropy

37
Q

ML-04-Neural network

What loss function would you use for regression?

A

E.g. MSE (mean squared error) or MAE (mean absolute error).
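Both losses in a couple of lines, with made-up targets and predictions:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5,  0.0, 2.0])

mse = np.mean((y_true - y_pred) ** 2)   # penalizes large errors quadratically
mae = np.mean(np.abs(y_true - y_pred))  # more robust to outliers
print(mse, mae)
```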