Lecture 16 - Introduction to deep learning Flashcards

1
Q

What is regularization in machine learning?

A

Techniques designed to reduce test error, even at the cost of higher training error, by balancing model complexity against the amount of available data.

Think of regularization as a “balancing act” that prevents overfitting by simplifying the model’s behavior.

2
Q

Name three common regularization techniques in deep learning.

A

Dropout: Temporarily disables random neurons during training.
Early Stopping: Halts training when validation error stops improving.
Parameter Norm Penalties: Penalizes large weights (e.g., L1/L2 norms).

3
Q

What is dropout, and why is it used?

A

Dropout prevents overfitting by randomly dropping units (neurons) during training, making the model less reliant on specific features.
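
A minimal sketch of inverted dropout in NumPy (function and argument names are illustrative, not from the lecture):

    import numpy as np

    def dropout(activations, rate=0.5, training=True):
        """Inverted dropout: randomly zero units and rescale the survivors."""
        if not training:
            return activations  # at test time all units are kept unchanged
        # keep each unit with probability (1 - rate)
        mask = (np.random.rand(*activations.shape) > rate).astype(activations.dtype)
        # rescale so the expected activation matches test-time behavior
        return activations * mask / (1.0 - rate)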

4
Q

Define Early Stopping.

A

A method to avoid overfitting by stopping training when validation set performance no longer improves.
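
A sketch of the early-stopping loop; train_one_epoch, validation_error, and model are hypothetical stand-ins, not from the lecture:

    import copy

    patience = 5        # epochs to tolerate without improvement
    best_error = float("inf")
    epochs_since_best = 0

    for epoch in range(100):
        train_one_epoch(model)           # hypothetical training helper
        error = validation_error(model)  # hypothetical evaluation helper
        if error < best_error:
            best_error = error
            best_model = copy.deepcopy(model)  # keep the best model seen so far
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break  # validation error has stopped improving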

5
Q

What is stochastic gradient descent (SGD)?

A

An optimization method that updates model weights using gradients computed on small, randomly sampled batches (minibatches) of data rather than the full dataset.
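
A minimal sketch of one SGD update in NumPy; grad_fn is an assumed helper that returns the gradient of the loss on a batch:

    import numpy as np

    def sgd_step(weights, inputs, targets, grad_fn, lr=0.01, batch_size=32):
        """One SGD update on a randomly sampled minibatch."""
        idx = np.random.choice(len(inputs), size=batch_size, replace=False)
        grad = grad_fn(weights, inputs[idx], targets[idx])  # gradient on the minibatch
        return weights - lr * grad  # step against the gradient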

6
Q

How does momentum improve optimization?

A

Momentum accumulates a decaying average of past gradients, which accelerates convergence along consistent directions and smooths out noisy updates.
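
A sketch of the classical momentum update; beta controls how quickly past gradients decay (0.9 is a common choice):

    import numpy as np

    def momentum_step(weights, velocity, grad, lr=0.01, beta=0.9):
        """Accumulate a decaying average of gradients and step along it."""
        velocity = beta * velocity + grad  # past gradients decay geometrically
        return weights - lr * velocity, velocity

    # velocity is initialized once before training, e.g.:
    # velocity = np.zeros_like(weights)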

7
Q

What is the Adam optimizer?

A

Adam (Adaptive Moment Estimation) combines momentum (a moving average of gradients) with per-parameter adaptive learning rates (based on a moving average of squared gradients) for efficient optimization.
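
A sketch of the Adam update following the standard formulation of Kingma & Ba; the defaults shown are the commonly published hyperparameters:

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        """One Adam update: momentum-style first moment + adaptive second moment."""
        m = b1 * m + (1 - b1) * grad       # moving average of gradients (momentum)
        v = b2 * v + (1 - b2) * grad**2    # moving average of squared gradients
        m_hat = m / (1 - b1**t)            # bias correction; t counts steps from 1
        v_hat = v / (1 - b2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
        return w, m, v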

8
Q

What is the primary purpose of Convolutional Neural Networks (CNNs)?

A

To process grid-like data such as images or time-series.

9
Q

What are the three key stages of a CNN layer?

A

Convolution: Detects features using filters.
Non-linearity: Introduces non-linear decision boundaries (e.g., ReLU).
Pooling: Reduces spatial dimensions for efficiency and invariance.
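
A sketch of the three stages as a PyTorch module (assuming PyTorch is available; channel counts and kernel sizes are illustrative):

    import torch.nn as nn

    # one CNN layer in the textbook sense: convolution -> non-linearity -> pooling
    layer = nn.Sequential(
        nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),  # feature detection
        nn.ReLU(),        # non-linear activation
        nn.MaxPool2d(2),  # halves the spatial dimensions
    )
    # a (1, 3, 32, 32) image tensor comes out as (1, 16, 16, 16)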

10
Q

What are recurrent neural networks (RNNs) used for?

A

Processing sequential data (e.g., text, time series) by sharing parameters across time steps.
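
A minimal NumPy sketch of an RNN, highlighting that the same weights are reused at every time step (weight names are illustrative):

    import numpy as np

    def rnn_forward(inputs, W_xh, W_hh, b):
        """Process a sequence with a single tanh RNN cell."""
        h = np.zeros(W_hh.shape[0])  # initial hidden state
        for x in inputs:             # one step per sequence element
            h = np.tanh(W_xh @ x + W_hh @ h + b)  # shared parameters across steps
        return h  # final hidden state summarizes the sequence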

11
Q

How do RNNs differ from CNNs?

A

RNNs process data across time (sequentially), while CNNs focus on spatial patterns in grid-like data.

12
Q

What are parameter norm penalties?

A

Terms added to the loss function that penalize large weights.

L1 Norm: Promotes sparsity (drives many weights exactly to zero).
L2 Norm: Penalizes the squared magnitude of the weights (weight decay).
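
A sketch of both penalties as terms added to a data loss; lam is the regularization strength (a hyperparameter):

    import numpy as np

    def l1_penalty(weights, lam=1e-4):
        return lam * np.sum(np.abs(weights))  # promotes sparse weights

    def l2_penalty(weights, lam=1e-4):
        return lam * np.sum(weights ** 2)     # shrinks weights toward zero (weight decay)

    # total objective during training (data_loss is assumed to exist):
    # total_loss = data_loss + l2_penalty(weights)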

13
Q

Why is dataset augmentation useful?

A

Generates synthetic data to reduce overfitting, especially in data-scarce domains.
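
A sketch of simple image augmentation in NumPy; the specific transforms (horizontal flip, mild noise) are illustrative:

    import numpy as np

    def augment(image, rng=None):
        """Return a synthetic variant of an (H, W) or (H, W, C) image array."""
        if rng is None:
            rng = np.random.default_rng()
        if rng.random() < 0.5:
            image = image[:, ::-1]  # random horizontal flip
        noise = rng.normal(0.0, 0.01, size=image.shape)  # mild Gaussian noise
        return image + noise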
