lesson_3_flashcards
What is regularization in deep learning?
Techniques that reduce overfitting, for example by penalizing large weights (L2), encouraging sparsity (L1), or randomly deactivating units (dropout).
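Example: a minimal sketch of an L2 weight penalty added to a training loss (NumPy; the function name, lam, and the weight list are illustrative, not from the lesson):

    import numpy as np

    def l2_regularized_loss(data_loss, weights, lam=1e-4):
        # Add lambda times the sum of squared weights to the base loss.
        penalty = lam * sum(np.sum(W ** 2) for W in weights)
        return data_loss + penalty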
How does dropout regularization work?
At each training iteration, each node is randomly ‘dropped’ (deactivated) with probability p, preventing over-reliance on specific features.
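Example: a sketch of the common ‘inverted dropout’ variant, which rescales surviving activations by 1/(1-p) so no change is needed at test time (names and the p=0.5 default are illustrative):

    import numpy as np

    def dropout(x, p=0.5, training=True):
        # Zero each unit with probability p during training; scale the
        # survivors so the expected activation matches test time.
        if not training:
            return x
        mask = (np.random.rand(*x.shape) > p) / (1.0 - p)
        return x * mask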
What is batch normalization?
A method that normalizes a layer's activations over each mini-batch, then applies a learnable scale and shift, improving gradient flow and training stability.
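Example: a sketch of the batch-norm forward pass in training mode; at inference, running averages of the batch statistics would be used instead (names are illustrative):

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # Normalize each feature over the mini-batch (rows of x),
        # then scale and shift with learnable gamma and beta.
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta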
What is data augmentation?
Techniques like flipping, rotating, or adding noise to expand datasets artificially, improving model generalization.
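Example: a sketch of two simple augmentations, assuming images are float arrays scaled to [0, 1] (the function name and noise level are illustrative):

    import numpy as np

    def augment(img, rng=None):
        # Randomly flip the image horizontally, then add small pixel noise.
        if rng is None:
            rng = np.random.default_rng()
        if rng.random() < 0.5:
            img = img[:, ::-1]
        img = img + rng.normal(0.0, 0.01, img.shape)
        return np.clip(img, 0.0, 1.0)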
Why is weight initialization important?
Proper initialization ensures effective gradient flow, avoiding vanishing or exploding gradients, and helps models converge faster.
What is Xavier initialization?
A method to maintain consistent variance of activations across layers by scaling weights based on the number of input and output nodes.
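Example: a sketch of Xavier/Glorot initialization for a dense layer's weight matrix, using the standard 2 / (fan_in + fan_out) variance (names are illustrative):

    import numpy as np

    def xavier_init(fan_in, fan_out, rng=None):
        # Variance 2/(fan_in + fan_out) keeps activation variance
        # roughly constant from layer to layer.
        if rng is None:
            rng = np.random.default_rng()
        std = np.sqrt(2.0 / (fan_in + fan_out))
        return rng.normal(0.0, std, size=(fan_in, fan_out))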
What are optimizers in deep learning?
Algorithms like SGD, Adam, and RMSProp used to adjust model parameters to minimize the loss function.
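Example: a sketch of the simplest optimizer update, one vanilla SGD step (names are illustrative; Adam and RMSProp add per-parameter adaptive scaling on top of this idea):

    def sgd_step(params, grads, lr=0.01):
        # Move each parameter a small step against its gradient.
        return [p - lr * g for p, g in zip(params, grads)]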
What is momentum in optimization?
A technique to smooth updates by incorporating past gradients, improving convergence speed and stability.
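Example: a sketch of one common momentum formulation, keeping a running average of past gradients (exact formulations vary across libraries; names are illustrative):

    def momentum_step(params, grads, velocity, lr=0.01, beta=0.9):
        # Accumulate past gradients into a velocity, then step along it.
        velocity = [beta * v + g for v, g in zip(velocity, grads)]
        params = [p - lr * v for p, v in zip(params, velocity)]
        return params, velocity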
What is the vanishing gradient problem?
A phenomenon where gradients become very small as they propagate back through layers, slowing or halting learning.
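Example: a small numeric illustration of why this happens with saturating activations; the sigmoid's gradient is at most 0.25, so backpropagating through 20 sigmoid layers shrinks the signal by at least a factor of 0.25**20:

    import numpy as np

    def sigmoid_grad(x):
        s = 1.0 / (1.0 + np.exp(-x))
        return s * (1.0 - s)

    grad = 1.0
    for _ in range(20):
        grad *= sigmoid_grad(0.0)  # 0.25, the sigmoid's steepest point
    print(grad)  # 0.25**20, roughly 9.1e-13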
What are the key considerations for designing a neural network architecture?
Understanding the data, selecting appropriate layers, ensuring gradient flow, and leveraging domain-specific insights.
What is the purpose of normalization in deep learning?
To standardize input or layer data, ensuring balanced and effective gradient flow throughout the network.
What is data preprocessing?
Preparing raw data for training through techniques like normalization, scaling, or encoding.
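Example: a sketch of two common preprocessing steps, per-feature standardization and one-hot label encoding (names are illustrative):

    import numpy as np

    def standardize(X, eps=1e-8):
        # Zero-mean, unit-variance scaling, computed per feature (column).
        return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

    def one_hot(labels, num_classes):
        # Encode integer class labels as one-hot row vectors.
        return np.eye(num_classes)[labels]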
How does ReLU improve gradient flow?
By providing non-saturating gradients for positive inputs, ensuring gradients remain large enough to drive learning.
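Example: a sketch of ReLU and its gradient; the gradient is exactly 1 for positive inputs (no saturation) and 0 otherwise:

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def relu_grad(x):
        # 1 where the input was positive, 0 elsewhere.
        return (x > 0).astype(float)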
What is the importance of hyperparameter tuning?
Adjusting settings like learning rate, batch size, or regularization strength can significantly impact model performance.
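Example: a sketch of a naive grid search over two hyperparameters; train_and_evaluate is a hypothetical stand-in for a real training run that returns a validation score:

    import itertools

    def train_and_evaluate(lr, weight_decay):
        # Hypothetical placeholder: a real version would train a model
        # and return its validation score.
        return -abs(lr - 0.01) - weight_decay

    grid = itertools.product((1e-3, 1e-2, 1e-1), (0.0, 1e-4, 1e-2))
    best_lr, best_wd = max(grid, key=lambda cfg: train_and_evaluate(*cfg))
    print(best_lr, best_wd)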
What is the difference between overfitting and underfitting?
Overfitting occurs when a model performs well on training data but poorly on unseen data; underfitting occurs when a model fails to capture the underlying patterns even in the training data.