Lecture 16 - Introduction to deep learning Flashcards
What is regularization in machine learning?
Techniques designed to reduce test error, even at the cost of a higher training error, by keeping model complexity in check relative to the available data.
Think of regularization as a “balancing act” that prevents overfitting by simplifying the model’s behavior.
Name three common regularization techniques in deep learning.
Dropout: Temporarily disables random neurons during training.
Early Stopping: Halts training when validation error stops improving.
Parameter Norm Penalties: Penalizes large weights (e.g., L1/L2 norms).
What is dropout, and why is it used?
Dropout prevents overfitting by randomly dropping units (neurons) during training, making the model less reliant on specific features.
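A minimal sketch of dropout in a small network, assuming PyTorch; the layer sizes and the drop rate p=0.5 are illustrative choices, not values from the lecture.

```python
import torch
import torch.nn as nn

# Two-layer network with dropout between the layers (p=0.5 is an assumed rate).
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes 50% of activations during training
    nn.Linear(64, 2),
)

x = torch.randn(8, 20)
model.train()            # dropout active: random units are dropped on each forward pass
y_train = model(x)
model.eval()             # dropout disabled: all units are used at evaluation time
y_eval = model(x)
```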
Define Early Stopping.
A method to avoid overfitting by stopping training when validation set performance no longer improves.
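A minimal sketch of the early-stopping logic with a patience counter; the hooks run_epoch and val_loss_fn (and the patience value) are hypothetical placeholders supplied by the caller, not part of the lecture material.

```python
def train_with_early_stopping(run_epoch, val_loss_fn, patience=5, max_epochs=200):
    """run_epoch() performs one training epoch; val_loss_fn() returns the current
    validation loss. Both are hypothetical hooks supplied by the caller."""
    best = float("inf")
    epochs_since_improvement = 0
    for epoch in range(max_epochs):
        run_epoch()
        val = val_loss_fn()
        if val < best:
            best = val                      # new best: typically checkpoint the weights here
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                break                       # validation stopped improving: halt training
    return best
```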
What is stochastic gradient descent (SGD)?
A method that updates model weights by calculating gradients on small, random batches of data.
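A minimal sketch of SGD on a toy linear-regression problem in NumPy; the dataset, batch size, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)   # toy dataset
w = np.zeros(5)
lr, batch_size = 0.1, 32

for step in range(100):
    idx = rng.choice(len(X), size=batch_size, replace=False)   # random mini-batch
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size                # gradient of the mean squared error
    w -= lr * grad                                              # SGD update: w <- w - lr * grad
```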
How does momentum improve optimization?
Momentum accumulates an exponentially decaying average of past gradients, which accelerates convergence along consistent directions and smooths out oscillating updates.
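A minimal sketch of the momentum update (the heavy-ball form used by most frameworks); the learning rate and decay factor beta are assumed values.

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    velocity = beta * velocity + grad   # running accumulation of past gradients
    w = w - lr * velocity               # step along the smoothed direction
    return w, velocity

w = np.array([1.0, -2.0])
velocity = np.zeros_like(w)             # start with zero velocity
grad = np.array([0.5, 0.1])             # gradient from the current mini-batch
w, velocity = momentum_step(w, grad, velocity)
```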
What is the Adam optimizer?
Adam combines momentum (a running average of gradients) with per-parameter adaptive learning rates (scaled by a running average of squared gradients) for efficient optimization.
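A minimal sketch of a single Adam update in NumPy, using the commonly cited default hyperparameters; a framework optimizer would handle the state bookkeeping for you.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the step count starting at 1."""
    m = beta1 * m + (1 - beta1) * grad           # momentum: moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # adaptive term: moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v
```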
What is the primary purpose of Convolutional Neural Networks (CNNs)?
To process grid-like data such as images or time-series.
What are the three key stages of a CNN layer?
Convolution: Detects features using filters.
Non-linearity: Introduces non-linear decision boundaries (e.g., ReLU).
Pooling: Reduces spatial dimensions for efficiency and invariance.
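A minimal sketch of the three stages chained as one PyTorch block; the channel counts, kernel sizes, and image dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

layer = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),  # convolution: feature detection
    nn.ReLU(),                                                             # non-linearity
    nn.MaxPool2d(kernel_size=2),                                           # pooling: halve spatial size
)

images = torch.randn(8, 3, 32, 32)   # batch of 8 RGB images, 32x32 pixels
features = layer(images)
print(features.shape)                # torch.Size([8, 16, 16, 16])
```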
What are recurrent neural networks (RNNs) used for?
Processing sequential data (e.g., text, time series) by sharing parameters across time steps.
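A minimal sketch of sequence processing with PyTorch's nn.RNN, which applies the same weights at every time step; the input size, hidden size, and sequence length are illustrative assumptions.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=32, batch_first=True)

x = torch.randn(4, 25, 10)         # batch of 4 sequences, 25 time steps, 10 features each
outputs, h_n = rnn(x)              # outputs: hidden state at every step; h_n: final hidden state
print(outputs.shape, h_n.shape)    # torch.Size([4, 25, 32]) torch.Size([1, 4, 32])
```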
How do RNNs differ from CNNs?
RNNs process data across time (sequentially), while CNNs focus on spatial patterns in grid-like data.
What are parameter norm penalties?
Add terms to the loss function to penalize large weights.
L1 Norm: Promotes sparsity by driving some weights to exactly zero.
L2 Norm: Penalizes the squared magnitude of weights (weight decay), shrinking them toward zero.
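A minimal sketch, assuming PyTorch, of adding L1 and L2 penalty terms to a loss; the penalty coefficients are assumed values.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
x, y = torch.randn(16, 10), torch.randn(16, 1)

mse = nn.functional.mse_loss(model(x), y)
l2_penalty = sum((p ** 2).sum() for p in model.parameters())   # discourages large weights
l1_penalty = sum(p.abs().sum() for p in model.parameters())    # encourages sparse weights
loss = mse + 1e-4 * l2_penalty + 1e-5 * l1_penalty

# L2 is often applied instead through the optimizer's weight_decay argument:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```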
Why is dataset augmentation useful?
Generates synthetic data to reduce overfitting, especially in data-scarce domains.
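A minimal sketch of an image-augmentation pipeline using torchvision transforms; the specific transforms and their parameters are illustrative assumptions.

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),               # mirror images at random
    transforms.RandomCrop(32, padding=4),                  # random shifts via padded crops
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # small photometric changes
    transforms.ToTensor(),
])
# Each epoch sees a slightly different version of every image, acting as extra synthetic data.
```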