Introduction to Deep Learning Flashcards

1
Q

In a single perceptron (neuron), what is the formula for calculating a single output?

A

y = g(W0 + XW)
1. Dot product (XW)
2. Add bias (W0)
3. Apply non-linearity (g)
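
The three steps above can be sketched in a few lines of NumPy. The sigmoid used for g here is an assumed choice (any non-linearity works), and the input/weight values are illustrative:

```python
import numpy as np

def g(z):
    # Sigmoid non-linearity -- one common choice for g.
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, w0):
    # 1. Dot product (XW), 2. add bias (W0), 3. apply non-linearity (g).
    return g(np.dot(x, w) + w0)

x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
y = perceptron(x, w, w0=0.1)  # g(1*0.5 + 2*(-0.25) + 0.1) = g(0.1)
```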

2
Q

What are the 3 types of layers in a NN?

A

Input layer, hidden layers, output layer.
The input layer receives data, hidden layers process it, and the output layer produces the final predictions.

3
Q

How do we determine how well the model performs?

__ Function

A

Using a loss function.

4
Q

What are the 2 types of loss functions?

  1. Binary __ loss
  2. Mean __
A
  1. Binary cross entropy loss (Classification)
  2. Mean Squared Error (MSE) (Regression)
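Both losses can be sketched directly from their definitions. The example values are illustrative, and the clipping in BCE is an assumed safeguard against log(0):

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-12):
    # Binary cross entropy: mean of -[y*log(p) + (1-y)*log(1-p)].
    p = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def mse(y_true, y_pred):
    # Mean squared error: mean of (y - y_hat)^2.
    return float(np.mean((y_true - y_pred) ** 2))

bce_loss = bce(np.array([1.0, 0.0]), np.array([0.9, 0.2]))
mse_loss = mse(np.array([3.0, -1.0]), np.array([2.5, 0.0]))
```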
5
Q

What is the core problem of training a NN?

Finding the vector __. The __ determine the performance, because they…

A

Finding the weight vector (W). The weights determine the performance because they directly influence the output and thus the loss.

6
Q

What is backpropagation in a simple sense?

Backpropagation uses the __ rule to compute the gradient of the loss …

A

Backpropagation uses the chain rule to compute the gradient of the loss with respect to the weights (W). This allows us to update the weights in the direction that minimizes the loss.
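The chain rule can be verified on a tiny one-weight example. The sigmoid model, squared-error loss, and input values below are all illustrative assumptions; the analytic gradient is checked against a numerical (finite-difference) estimate:

```python
import math

# Tiny model: y = sigmoid(w*x + b), loss L = (y - t)^2.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, b, t):
    return (sigmoid(w * x + b) - t) ** 2

def grad_w(w, x, b, t):
    # Chain rule: dL/dw = dL/dy * dy/dz * dz/dw
    #           = 2*(y - t) * y*(1 - y) * x   (sigmoid derivative is y*(1-y))
    y = sigmoid(w * x + b)
    return 2 * (y - t) * y * (1 - y) * x

w, x, b, t = 0.5, 2.0, -1.0, 1.0
analytic = grad_w(w, x, b, t)
eps = 1e-6
numeric = (loss(w + eps, x, b, t) - loss(w - eps, x, b, t)) / (2 * eps)
```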

7
Q

In summary, what is forward propagation?

A

Forward propagation (FP) refers to the computation of the outputs from the inputs in a NN.
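A minimal forward pass through one hidden layer and one output layer might look like the following sketch; the layer sizes, tanh activation, and random initialization are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2):
    # Hidden layer: affine transform followed by a non-linearity.
    h = np.tanh(x @ W1 + b1)
    # Output layer: affine transform (no non-linearity here).
    return h @ W2 + b2

W1 = rng.normal(size=(3, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
y = forward(np.ones(3), W1, b1, W2, b2)
```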

8
Q

What is the optimization method we can use during backpropagation?

__ (GD)

A

The method is gradient descent.

9
Q

How does GD work?

It sets the ___ __ (LR) …

A

It takes steps against the gradient of the loss, with the step size set by the learning rate (LR); adaptive optimizers adjust the LR to the loss landscape.
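A gradient descent loop in its simplest form can be sketched on a one-dimensional loss; the function f(w) = (w - 3)^2 and the LR value are illustrative choices:

```python
# Gradient descent on f(w) = (w - 3)^2, whose gradient is 2*(w - 3).
# The minimum is at w = 3.
w = 0.0
lr = 0.1  # learning rate (fixed here; adaptive methods would vary it)
for _ in range(100):
    grad = 2 * (w - 3)
    w -= lr * grad  # step against the gradient
```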

10
Q

What are 2 tips for training a NN?

  1. S…
  2. B…
A

Stochastic GD (SGD) and batching

11
Q

Batching is better than SGD. Why?

It uses ___ batch of data to compute.

A

Instead of using a single training point, batching uses a small batch of data to compute each gradient. The gradient estimate is more accurate, and training is faster (batches parallelize well on GPUs).
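Mini-batch gradient descent can be sketched on a toy 1-D regression problem; the data (y = 2x with no noise), batch size, and LR are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy 1-D linear regression: learn w in y = w*x from data generated with w = 2.
X = rng.normal(size=200)
Y = 2.0 * X

w, lr, batch_size = 0.0, 0.1, 32
for _ in range(200):
    # Sample a small batch instead of a single training point.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    xb, yb = X[idx], Y[idx]
    # Gradient of the squared-error loss, averaged over the batch.
    grad = np.mean(2 * (w * xb - yb) * xb)
    w -= lr * grad
```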

12
Q

How do we deal with overfitting in a NN?

Re___

A

Regularization

13
Q

What are the 2 regularization techniques?

  1. D…
  2. E… S…
A
  1. Dropout
  2. Early Stopping
14
Q

How does Dropout work?

  1. Randomly set some ___ to 0.
  2. Drop __% of __ in layers.
A
  • Randomly set some activations to 0.
  • Drop 50% of activations in each layer (a common choice).
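The two bullets above can be sketched as a dropout mask. The "inverted dropout" scaling by 1/(1-p), which keeps the expected activation unchanged, is an assumed (though standard) detail not stated on the card:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5):
    # Randomly zero out a fraction p of activations (training time only).
    mask = rng.random(activations.shape) >= p
    # Inverted dropout: scale survivors by 1/(1-p) to preserve the mean.
    return activations * mask / (1 - p)

a = np.ones(10)
out = dropout(a, p=0.5)  # each entry is either 0.0 (dropped) or 2.0 (kept, scaled)
```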
15
Q

How does Early Stopping work?

Stop the model from ___ before ___

A

Stop the model from training before overfitting.
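
A minimal early-stopping loop can be sketched as follows; the validation-loss sequence and the `patience` value are illustrative assumptions:

```python
# Early stopping: halt when validation loss hasn't improved
# for `patience` consecutive epochs.
val_losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7]  # rises after epoch 3: overfitting
patience, best, wait, stop_epoch = 2, float("inf"), 0, None
for epoch, loss in enumerate(val_losses):
    if loss < best:
        best, wait = loss, 0  # improvement: record it and reset patience counter
    else:
        wait += 1
        if wait >= patience:
            stop_epoch = epoch  # stop before overfitting worsens
            break
```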
