Deep Learning Flashcards

1
Q

True or False: Deep learning models are prone to overfitting.

A

True.
Deep learning models can have millions of parameters, making them complex and prone to overfitting.

2
Q

What is early stopping?

A

A method for avoiding overfitting.
It identifies the point during training at which the model's error on a validation dataset begins to increase while the training error continues to decrease; training is then halted.

3
Q

What is a patience parameter?

A

A predefined threshold specifying the number of consecutive epochs for which the validation-set error may increase (while the training error decreases) before training is stopped.
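
A minimal runnable sketch of early stopping with a patience parameter, using a toy NumPy regression loop (the synthetic data, learning rate, and patience value are illustrative assumptions, not part of the deck):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy synthetic regression task, purely for illustration.
X_train, y_train = rng.normal(size=(80, 3)), rng.normal(size=80)
X_val, y_val = rng.normal(size=(20, 3)), rng.normal(size=20)

w = np.zeros(3)                      # model weights
best_w, best_val_error = w.copy(), np.inf
patience, bad_epochs = 5, 0          # patience: allowed consecutive epochs without improvement

for epoch in range(200):
    # One gradient-descent step on the training set.
    grad = X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= 0.1 * grad

    val_error = np.mean((X_val @ w - y_val) ** 2)
    if val_error < best_val_error:   # validation error still improving
        best_val_error, best_w, bad_epochs = val_error, w.copy(), 0
    else:                            # validation error rose while training error keeps falling
        bad_epochs += 1
        if bad_epochs >= patience:   # patience exhausted: stop training early
            break

w = best_w  # keep the model version with the lowest validation-set error
```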

4
Q

What is dropout?
(don’t even think about it!)

A

Dropout is another method for avoiding overfitting.
For each feedforward phase, a random set of neurons is chosen from the input and hidden layers.

These selected nodes are dropped from the network and reinstated at the end of the subsequent backpropagation phase.

Essentially, training then runs on the smaller network after the neurons have been dropped.

It is a form of regularization. It keeps all the weights in a network small, thereby making a model’s predictions relatively stable with respect to small changes in the input.
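
A minimal NumPy sketch of one feedforward phase with dropout (the layer size and drop probability are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(size=8)   # activations of one input/hidden layer
drop_prob = 0.5                    # illustrative probability of dropping a neuron

# Choose a random set of neurons to drop for this feedforward phase.
mask = rng.random(activations.shape) >= drop_prob   # True = kept, False = dropped
thinned = activations * mask                        # dropped neurons contribute zero

# Training then runs on this smaller "thinned" network; the dropped neurons
# are reinstated after the corresponding backpropagation phase.
```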

5
Q

Complete: Once training is stopped, the version of the model that produced the lowest _____ set error is selected.

A

validation

6
Q

What does a traditional Convolutional Neural Network (CNN) consist of?
(Hint: there are five layers)

A

1) input layer
2) convolutional layer
3) flattened layer
4) fully connected layer
5) output layer
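
A minimal PyTorch sketch of those five layers, assuming a 28×28 grayscale input and 10 output classes (the channel counts and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # 1) input layer: the image tensor itself, shape (batch, 1, 28, 28)
    # 2) convolutional layer
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1),
    nn.ReLU(),
    # 3) flattened layer
    nn.Flatten(),
    # 4) fully connected layer
    nn.Linear(8 * 28 * 28, 64),
    nn.ReLU(),
    # 5) output layer (e.g. 10 class scores)
    nn.Linear(64, 10),
)

logits = model(torch.randn(1, 1, 28, 28))   # forward pass on a dummy image
```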

7
Q

What is inverted dropout?

A

A neuron is dropped by multiplying its activation by zero during the forward pass.

A drop-mask is created using a probability P that a node in the network will not be dropped.

The activations of the nodes that are not dropped are divided by P to preserve the magnitude of the weighted sum calculations feeding into the next layer.
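
A minimal NumPy sketch of inverted dropout, with P as the keep probability (the value 0.8 is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(size=8)   # activations of one layer
P = 0.8                            # probability that a node is NOT dropped

# Drop-mask: 1 with probability P (kept), 0 otherwise (dropped, i.e. multiplied by zero).
mask = (rng.random(activations.shape) < P).astype(float)

# Divide the surviving activations by P so the expected magnitude of the
# weighted sums feeding into the next layer is preserved.
activations = activations * mask / P
```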

8
Q

What are the input(s) and output(s) of a CNN?

A

A CNN accepts an image as input.
It can have a variety of outputs, e.g. a softmax class distribution or another image.

9
Q

Complete: Grayscale images are _-dimensional structures, where pixel values are gray level values in the range ___ (black) to ___ (white).

A

two-dimensional
0 to 255

10
Q

Complete: Color images are normally represented using the ____ color scheme as __-dimensional structures where the __ dimension (depth) has three channels such that ? < Img(i,j,k) < ?
Here i is the ____ of the image, ____ is the width of the image, k = __, and Img(i,j,k) is a color pixel in the image.

A

RGB color scheme.
three-dimensional
third dimension
0 to 255
i is the height
j is the width
k = 3
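
For example, in NumPy a color image is a three-dimensional array of shape (height, width, 3) (the 4×4 size below is an illustrative assumption):

```python
import numpy as np

# Hypothetical 4x4 RGB image: height x width x 3 channels, values in 0..255.
img = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

print(img.shape)      # (4, 4, 3): i indexes height, j indexes width, k indexes the 3 channels
print(img[0, 0, :])   # the R, G, B values of the top-left pixel
```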

11
Q

True or False: Image normalization prior to convolution ensures that the extracted features are agnostic of specific image intensity values.

A

True.
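
One common form of normalization is scaling gray levels into [0, 1] before convolution; a minimal NumPy sketch (zero-mean/unit-variance scaling is another common choice):

```python
import numpy as np

img = np.random.default_rng(0).integers(0, 256, size=(4, 4), dtype=np.uint8)

# Scale pixel values from 0..255 down to 0..1 before convolution.
normalized = img.astype(np.float32) / 255.0
```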

12
Q

What is convolution?

A

An image processing technique.

A filter/kernel is used as a sliding window that traverses similarly sized tiles of an input image, generating pixel values for an output image.

Each output value comes from an element-wise multiplication between the input image tile and the kernel pixel values, followed by a summation operation that includes a bias.
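
A minimal NumPy sketch of the sliding-window computation described above, with no padding and stride 1 (the 5×5 image and 3×3 kernel are illustrative assumptions):

```python
import numpy as np

def convolve2d(image, kernel, bias=0.0):
    """Slide the kernel over the image; each output pixel is the element-wise
    product of an image tile and the kernel, summed, plus a bias."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            tile = image[i:i + kh, j:j + kw]           # similarly sized input tile
            out[i, j] = np.sum(tile * kernel) + bias   # multiply element-wise, sum, add bias
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0., 1., 0.],
                   [1., -4., 1.],
                   [0., 1., 0.]])     # illustrative edge-detection kernel
print(convolve2d(image, kernel))      # 3x3 output ("valid" convolution)
```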

13
Q

What is padding?

A

Extends the image boundaries by appending rows/columns of zero values to address the border-effect problem.

“valid” - no padding
“same” - sufficient padding to ensure input and output dimensions are the same.
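
A minimal NumPy sketch of “same” zero padding for a 3×3 kernel (the 5×5 image is an illustrative assumption):

```python
import numpy as np

image = np.ones((5, 5))
pad = 3 // 2   # one row/column of zeros on each side for a 3x3 kernel

# "valid": no padding, so a 3x3 kernel shrinks the output to (5 - 3 + 1) = 3 per dimension.
# "same": pad with zeros so the output dimensions match the 5x5 input.
padded = np.pad(image, pad_width=pad, mode="constant", constant_values=0)
print(padded.shape)   # (7, 7); convolving this with a 3x3 kernel yields a 5x5 output
```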

14
Q

What is CNN training?

A

Process of learning kernel and fully connected layer weights.

15
Q

What is pooling in a CNN, and why is it used?

A

Pooling scales down feature maps by aggregating a window of pixels to reduce dimensions and capture important features.
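
A minimal NumPy sketch of 2×2 max pooling with stride 2 (the window size and feature map are illustrative assumptions):

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Aggregate non-overlapping size x size windows by keeping their maximum value."""
    h, w = feature_map.shape
    pooled = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            pooled[i // size, j // size] = feature_map[i:i + size, j:j + size].max()
    return pooled

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fm))   # 2x2 output: dimensions halved, strongest activations kept
```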

16
Q

What is the purpose of passing the convolution output through a ReLU activation function?

A

To facilitate a non-linear transformation, allowing the network to learn complex patterns.
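
A one-line NumPy sketch of ReLU applied to some illustrative convolution outputs:

```python
import numpy as np

conv_output = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])   # illustrative convolution outputs
relu_output = np.maximum(0.0, conv_output)             # negative values are clipped to zero
```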

17
Q

Why does pooling help convolution layers in a CNN?

A

Pooling enables convolution to extract higher-level (more abstract) features from the image.

18
Q

What happens after a series of convolution and pooling layers in a CNN?

A

The output is flattened into a feature vector that feeds into a fully connected layer for classification.

19
Q

Complete: Exploding gradients are caused by the effect of ____ and _____ on the backpropagation of an ____ over multiple time steps.

A

constant activation differentials
large weights
error gradient

20
Q

Complete: Diminishing gradients are caused by the effect of _____ and ____ on the backpropagation of an ____ over multiple time steps.

A

small activation differentials
small weights
error gradient
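
A minimal numeric sketch of both effects: backpropagating an error gradient over many time steps repeatedly multiplies it by the recurrent weight and the activation differential, so the product either blows up or shrinks toward zero (the scalar values below are illustrative assumptions):

```python
timesteps = 50
grad = 1.0

# Large weight with a roughly constant activation differential: exploding gradient.
print(grad * (1.5 * 1.0) ** timesteps)   # ~6.4e8

# Small weight and small activation differential: diminishing (vanishing) gradient.
print(grad * (0.5 * 0.4) ** timesteps)   # ~1.1e-35
```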

21
Q

True or False: Exploding gradients are evidenced by increasingly higher activation outputs as an RNN feeds forward an input over multiple time steps.

A

False.
Exploding gradients refer to rapidly increasing gradients during backpropagation.
