Loss and Learning Mechanisms Flashcards
Why are activation functions necessary in neural networks?
They introduce non-linearity, allowing neural networks to learn complex patterns and perform better on real-world tasks.
What is a linear activation function?
A function where the output is a linear transformation of the input, limiting the model’s ability to learn complex patterns.
What are the benefits of non-linear activation functions?
They make stacking layers meaningful: without non-linearity, any number of linear layers collapses into a single linear transformation, so deep networks could only learn linear functions.
What is the Sigmoid activation function’s main drawback?
It suffers from the vanishing gradient problem, where values far from 0 have very small gradients, hindering learning.
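A minimal sketch of this: the sigmoid's derivative is s(x)(1 - s(x)), which peaks at 0.25 at x = 0 and shrinks toward zero for inputs far from 0.

```python
import math

def sigmoid(x):
    """Sigmoid activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)); at most 0.25, tiny far from 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

sigmoid_grad(0)   # 0.25, the maximum possible gradient
sigmoid_grad(10)  # well below 1e-4: learning nearly stalls here
```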
What is the advantage of the Tanh function over Sigmoid?
It outputs values between -1 and 1, making it centered around zero and allowing faster convergence.
Why is ReLU the most popular activation function?
It is computationally cheap and, for positive inputs, has a constant gradient of 1, which mitigates the vanishing gradient problem; it outputs zero for negative values and the input itself for positive values.
What is the dying ReLU problem?
When a neuron's pre-activation is negative for all inputs, ReLU outputs zero and its gradient is zero, so the neuron's weights stop updating and it never recovers.
How does Leaky ReLU solve the dying ReLU problem?
It allows a small, non-zero gradient for negative inputs.
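The two functions side by side, as a minimal sketch (the leak coefficient `alpha` is commonly 0.01, but it is a tunable hyperparameter):

```python
def relu(x):
    """ReLU: zero for negative inputs, identity for positive inputs."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: passes a small fraction of negative inputs through,
    so the gradient for x < 0 is alpha instead of exactly zero."""
    return x if x > 0 else alpha * x
```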
What is the purpose of a loss function?
It quantifies how well a neural network’s predictions match the ground truth, guiding the training process.
How is loss computed for an entire training set?
By averaging individual loss values over all training examples.
What is Log Loss?
Also known as cross-entropy loss, it penalizes confident wrong predictions heavily: the loss grows without bound as the predicted probability of the true class approaches zero.
What is Binary Cross-Entropy loss?
It measures the difference between actual and predicted class probabilities for binary classification problems.
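A minimal single-example sketch (the `eps` clipping is an implementation detail added here to avoid log(0)):

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """BCE for one example: y_true is 0 or 1, y_pred is the predicted
    probability of class 1. Clip to avoid log(0)."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

binary_cross_entropy(1, 0.9)  # small loss: confident and correct
binary_cross_entropy(0, 0.9)  # large loss: confident and wrong
```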
What is Multi-class Cross-Entropy loss?
It extends Binary Cross-Entropy to multiple classes by summing losses over all possible classes.
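The multi-class sum can be sketched like so; with one-hot labels, only the true class's term is non-zero:

```python
import math

def cross_entropy(y_true_onehot, y_pred_probs, eps=1e-12):
    """Multi-class cross-entropy for one example: sum of
    -t_c * log(p_c) over all classes c."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(y_true_onehot, y_pred_probs))

cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])  # equals -log(0.8)
```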
What are the steps of gradient-based optimization?
- Run forward pass
- Compute loss
- Compute gradients
- Update weights using gradients
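The four steps above can be sketched on a toy one-parameter problem, where the forward pass is just evaluating f(w) = (w - 3)^2 and the loss is f itself:

```python
def gradient_step(w, grad, lr):
    """Move the weight opposite the gradient, scaled by the learning rate."""
    return w - lr * grad

w = 0.0
for _ in range(100):
    loss = (w - 3) ** 2     # forward pass + loss
    grad = 2 * (w - 3)      # gradient of the loss w.r.t. w
    w = gradient_step(w, grad, lr=0.1)  # weight update
# w has converged very close to the minimizer, 3
```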
Why do we use gradients in optimization?
Gradients indicate the steepest ascent direction; moving in the opposite direction minimizes loss.
What is the difference between gradient descent and stochastic gradient descent (SGD)?
Gradient descent updates weights using the entire dataset, while SGD updates weights per sample, making it computationally efficient but noisier.
What is mini-batch stochastic gradient descent?
A compromise between gradient descent and SGD, where updates are made using small batches instead of single samples.
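A minimal sketch of the batching itself: shuffle once per epoch, then yield consecutive slices (the last batch may be smaller).

```python
import random

def minibatches(data, batch_size):
    """Shuffle a copy of the dataset, then yield slices of batch_size."""
    data = data[:]
    random.shuffle(data)
    for i in range(0, len(data), batch_size):
        yield data[i:i + batch_size]
```

Each epoch, the update step from gradient descent would run once per batch instead of once per dataset (full-batch) or once per sample (pure SGD).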
What is overfitting in deep learning?
When a model learns training data too well, including noise, leading to poor generalization.
What is dropout, and how does it help?
Dropout randomly deactivates neurons during training to prevent overfitting.
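A minimal sketch of inverted dropout (the common variant): each unit is zeroed with probability p during training, and survivors are scaled by 1/(1-p) so no scaling is needed at inference.

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p during
    training; scale survivors by 1/(1-p). Identity at inference."""
    if not training or p == 0:
        return activations[:]
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0
            for a in activations]
```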
How does DropConnect differ from dropout?
Instead of deactivating neurons, DropConnect removes individual connections, providing stronger regularization.
What are the key hyperparameters in neural networks?
Learning rate, batch size, number of epochs, and number of hidden neurons.
What is the role of the learning rate?
It determines the step size for updating weights during training.
How does momentum improve gradient descent?
It accumulates a decaying average of past gradients, damping oscillations and accelerating progress along directions where gradients consistently agree.
What are popular optimizers other than SGD?
Adam and RMSprop, which use adaptive learning rates for efficient optimization.
What is the role of one-hot encoding in neural networks?
It converts categorical labels into binary vectors, allowing models to process categorical data.
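A minimal sketch, assuming the class vocabulary is known up front:

```python
def one_hot(label, classes):
    """Map a categorical label to a binary vector with a single 1
    at the label's index in the class list."""
    vec = [0] * len(classes)
    vec[classes.index(label)] = 1
    return vec

one_hot("dog", ["cat", "dog", "bird"])  # [0, 1, 0]
```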