Activation Functions in FCNs Flashcards

1
Q

What does backpropagation compute in a neural network?

A

It computes the gradient of the loss with respect to each weight by applying the chain rule backward through the network; gradient descent then uses these gradients to update the weights.

2
Q

What is the role of gradient descent in training neural networks?

A

It minimizes the loss function by updating weights iteratively.
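
A minimal NumPy sketch (not part of the original deck) of a single gradient-descent update; the weights, gradients, and learning rate here are made-up illustrative values:

    import numpy as np

    def gradient_descent_step(weights, grads, learning_rate=0.01):
        # move each weight a small step against its gradient
        return weights - learning_rate * grads

    w = np.array([0.5, -1.2])
    grad = np.array([0.2, -0.4])          # gradient of the loss w.r.t. each weight
    print(gradient_descent_step(w, grad)) # [ 0.498 -1.196]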

3
Q

Why are activation functions necessary in neural networks?

A

They introduce non-linearity, enabling the network to model complex patterns beyond linear relationships.

4
Q

What is the limitation of using a purely linear activation function?

A

It cannot learn complex patterns: stacking linear layers still produces a linear mapping, so the whole network collapses to a single linear model.
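
A small NumPy sketch (illustrative only) showing this collapse: two stacked weight matrices with no activation between them are equivalent to one linear layer:

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3))          # first "layer" weights
    W2 = rng.normal(size=(2, 4))          # second "layer" weights
    x = rng.normal(size=3)

    two_layers = W2 @ (W1 @ x)            # two stacked layers, no activation
    one_layer = (W2 @ W1) @ x             # a single equivalent linear layer
    print(np.allclose(two_layers, one_layer))  # True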

5
Q

What advantages do non-linear activation functions provide?

A

They make deep networks worthwhile: each non-linear layer can build richer representations on top of the previous one, whereas stacked purely linear layers collapse into a single linear map.

6
Q

What is the primary issue with the sigmoid function in deep learning?

A

It suffers from the vanishing gradient problem for very large or very small input values.
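
A short NumPy sketch (illustrative, with hand-picked inputs) showing how the sigmoid's derivative shrinks toward zero for large inputs, which is the vanishing gradient issue:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        s = sigmoid(x)
        return s * (1.0 - s)              # peaks at 0.25, shrinks toward 0 as |x| grows

    for x in (0.0, 5.0, 10.0):
        print(x, sigmoid_grad(x))         # 0.25, ~0.0066, ~0.000045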

7
Q

When is the softmax function used?

A

It is used in multi-class classification to convert logits into probability distributions.
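
A minimal NumPy softmax sketch (illustrative; the logits are made up) converting logits into a probability distribution:

    import numpy as np

    def softmax(logits):
        z = logits - np.max(logits)       # subtract the max for numerical stability
        exp_z = np.exp(z)
        return exp_z / exp_z.sum()

    probs = softmax(np.array([2.0, 1.0, 0.1]))
    print(probs, probs.sum())             # probabilities over the classes, summing to 1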

8
Q

How does the tanh function differ from sigmoid?

A

It maps inputs to the range (-1, 1) and is zero-centered, which lets it represent negative as well as positive relationships and often makes optimization easier than with sigmoid's (0, 1) range.
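
A quick NumPy comparison (illustrative inputs) of tanh's zero-centered (-1, 1) range against sigmoid's (0, 1) range:

    import numpy as np

    x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
    print(np.tanh(x))                     # values in (-1, 1), zero-centered
    print(1.0 / (1.0 + np.exp(-x)))       # sigmoid for comparison: values in (0, 1)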

9
Q

What problem does tanh suffer from?

A

It still suffers from the vanishing gradient problem for extreme input values.

10
Q

What is the mathematical definition of ReLU?

A

f(x) = max(0, x), meaning it outputs zero for negative inputs and the input itself for positive values.
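
A one-line NumPy sketch (illustrative) of the ReLU definition:

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)         # zero for negatives, identity for positives

    print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))   # [0.  0.  0.  1.5]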

11
Q

What issue does ReLU suffer from?

A

The “dying ReLU” problem: if a neuron’s pre-activation becomes negative for essentially all inputs, its output and gradient are always zero, so it stops learning.

12
Q

How does Leaky ReLU address the dying ReLU problem?

A

It allows a small, non-zero gradient for negative inputs.
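
A minimal NumPy sketch (illustrative; the slope alpha = 0.01 is a common but assumed choice) of Leaky ReLU:

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # negative inputs keep a small slope `alpha`, so the gradient never
        # becomes exactly zero and the neuron can keep learning
        return np.where(x > 0, x, alpha * x)

    print(leaky_relu(np.array([-3.0, 0.0, 2.0])))   # [-0.03  0.    2.  ]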

13
Q

Why are activation functions necessary in neural networks?

A

They introduce non-linearity, enabling the network to learn complex patterns beyond linear transformations.

14
Q

ReLU

A

Rectified Linear Unit, a non-linear activation function

15
Q

Which activation function is best for hidden layers in deep networks?

A

ReLU is the most commonly used because it is cheap to compute and largely avoids the vanishing gradient problem, which makes training faster.

16
Q

Which activation function is preferred in the output layer for binary classification?

A

Sigmoid, which squashes the output into (0, 1) so it can be interpreted as the probability of the positive class.

17
Q

Which activation function is used for multi-class classification?

A

Softmax, which converts the output logits into a probability distribution over the classes.

18
Q

Why must activation functions be differentiable?

A

So that backpropagation can apply the chain rule through them to compute gradients, which gradient descent then uses to update the network’s weights.
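
A tiny NumPy sketch (illustrative values) of why the derivative matters: backpropagation multiplies the upstream gradient by the activation's derivative via the chain rule:

    import numpy as np

    def tanh_grad(x):
        return 1.0 - np.tanh(x) ** 2      # analytic derivative of tanh

    x = 0.7                               # pre-activation value at some neuron
    upstream = 1.5                        # gradient flowing back from the next layer
    local = upstream * tanh_grad(x)       # chain rule: dL/dx = dL/dy * dy/dx
    print(local)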