Deep Learning Flashcards

Question 1

Q

How do neural networks (NNs) capture interactions?

Answer

A

By using hidden layers, in which each node's value is the dot product of the previous layer's node values and the weights on the edges leading into that node.

Question 2

Q

What is forward propagation?

Answer

A

Multiply each input node's value by the weight on the edge connecting it to a node in the next layer, then add the products to get that node's value. Repeat layer by layer until the output is reached.
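
A minimal NumPy sketch of the multiply-and-add step for a network with one hidden layer. The input values, the weight values, and the dictionary keys are made up for illustration; they are assumptions, not part of the cards.

```python
import numpy as np

# Illustrative values (assumptions, not from the cards).
input_data = np.array([2, 3])
weights = {
    "node_0": np.array([1, 1]),    # edges from the inputs into hidden node 0
    "node_1": np.array([-1, 1]),   # edges from the inputs into hidden node 1
    "output": np.array([2, -1]),   # edges from the hidden nodes into the output
}

# Each hidden node: multiply inputs by their edge weights, then add.
node_0_value = (input_data * weights["node_0"]).sum()   # 2*1 + 3*1 = 5
node_1_value = (input_data * weights["node_1"]).sum()   # 2*(-1) + 3*1 = 1
hidden_layer_values = np.array([node_0_value, node_1_value])

# The output node repeats the same multiply-and-add on the hidden layer.
output = (hidden_layer_values * weights["output"]).sum()  # 5*2 + 1*(-1) = 9
print(output)
```

Each `(a * b).sum()` here is a dot product, so `np.dot(a, b)` would give the same values.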

Question 3

Q

How to compute final NN output value using np?

Answer

A

output = (hidden_layer_values * weights['output']).sum()

Question 4

Q

What is an activation function?

Answer

A

A function applied to node inputs to produce node output.

Question 5

Q

What is ReLU?

Answer

A

Rectified Linear Unit, also called rectified linear activation: 0 if x < 0, else x (that is, max(0, x)).
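
One way to write this activation with NumPy, as a sketch. The function name `relu` is chosen here for illustration.

```python
import numpy as np

def relu(x):
    """Return 0 where x is negative and x itself otherwise."""
    return np.maximum(0, x)

# Works on single values and element-wise on arrays.
print(relu(-3))
print(relu(np.array([-2.0, 0.0, 3.5])))
```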

Question 6

Q

Why is there less of a need for feature engineering with DL?

Answer

A

Deep networks internally build representations of patterns in the data. Subsequent layers build increasingly sophisticated representations of raw data.

Question 7

Q

What are the steps of Gradient Descent?

Answer

A

Start at random point
Until you are somewhere flat:
● Find the slope
● Take a step downhill
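
The steps above can be sketched on a one-dimensional toy loss. The loss (w - 3)**2, the starting point, the step-size factor, and the flatness threshold are all assumptions chosen for illustration.

```python
def slope(w):
    """Derivative of the illustrative loss (w - 3)**2."""
    return 2 * (w - 3)

w = 10.0              # arbitrary starting point (the card says "random")
learning_rate = 0.1   # assumed factor that keeps each step small

# Repeat until the slope is (almost) flat.
for step in range(1000):
    s = slope(w)              # find the slope
    if abs(s) < 1e-6:         # "somewhere flat"
        break
    w -= learning_rate * s    # take a step downhill, against the slope

print(w)
```

The loop should stop close to w = 3, the minimum of this toy loss.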

Question 8

Q

How to avoid big steps with GD?

Answer

A

Use a learning rate: update each weight by subtracting learning rate * slope.

Question 9

Q

How do you calculate the new weight, given a current weight of 2 that connects a node with value 3 to a node with predicted value 6 and actual value 10, and a learning rate of 0.01?

Answer

A

Subtract the learning rate times the gradient from the current weight. The gradient is the product of three terms:
* Slope of the loss function (squared error) w.r.t. the value at the node we feed into: 2 * (predicted value [6] - actual value [10]) = -8
* The value of the node that feeds into our weight: 3
* Slope of the activation function at the node we feed into: 1, since there is no activation function here
Gradient: -8 * 3 * 1 = -24
Learning rate: 0.01
Result: 2 - 0.01 * (-24) = 2.24
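
The same arithmetic as a short script, assuming a squared-error loss and no activation function. The variable names are chosen here for illustration.

```python
weight = 2.0
input_value = 3.0      # value of the node feeding into the weight
actual = 10.0
learning_rate = 0.01

predicted = weight * input_value           # 6.0, no activation function
error = predicted - actual                 # -4.0
slope = 2 * error * input_value            # -24.0, gradient of the squared error
new_weight = weight - learning_rate * slope
print(new_weight)                          # approximately 2.24
```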

Question 10

Q

What is backpropagation?

Answer

A

It allows gradient descent to update all the weights in a neural network by computing the gradient for every weight. Forward propagation first calculates the predictions and errors; backpropagation then passes the error backward through the network, layer by layer, to estimate the slope of the loss function w.r.t. each weight.
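
A minimal sketch of the idea for a tiny network: two inputs, two ReLU hidden nodes, and one linear output with a squared-error loss. The architecture, the values, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def forward(x, W1, w2):
    """Forward pass: inputs -> ReLU hidden layer -> single linear output."""
    hidden_in = W1 @ x
    hidden_out = np.maximum(0, hidden_in)
    prediction = w2 @ hidden_out
    return hidden_in, hidden_out, prediction

def backward(x, y, W1, w2):
    """Return gradients of the squared error (prediction - y)**2."""
    hidden_in, hidden_out, prediction = forward(x, W1, w2)
    d_pred = 2 * (prediction - y)                  # slope of loss w.r.t. the output
    grad_w2 = d_pred * hidden_out                  # gradients for output weights
    d_hidden_out = d_pred * w2                     # pass the slope back one layer
    d_hidden_in = d_hidden_out * (hidden_in > 0)   # slope of ReLU is 0 or 1
    grad_W1 = np.outer(d_hidden_in, x)             # gradients for hidden weights
    return grad_W1, grad_w2

x = np.array([2.0, 3.0])
y = 10.0
W1 = np.array([[1.0, 1.0], [-1.0, 1.0]])   # row i holds the weights into hidden node i
w2 = np.array([2.0, -1.0])

grad_W1, grad_w2 = backward(x, y, W1, w2)
print(grad_W1)
print(grad_w2)
```

With these values the prediction is 9, so the output slope is 2 * (9 - 10) = -2, and every gradient is that slope multiplied back through the weights and node values along its path.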

Question 11

Q

What is stochastic gradient descent?

Answer

A

Gradient descent in which the slopes are calculated on one batch (a subset of the data) at a time, rather than on the full dataset.

Question 12

Q

What is an epoch?

Answer

A

One full pass through the training data: every batch has been used once to update the weights.
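
A sketch of how batches and epochs fit together when fitting a single weight. The synthetic data (y = 3 * x), the batch size, the learning rate, and the number of epochs are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 3 * x exactly.
X = rng.normal(size=100)
y = 3.0 * X

w = 0.0
learning_rate = 0.05
batch_size = 10
n_epochs = 20

for epoch in range(n_epochs):
    order = rng.permutation(len(X))              # reshuffle once per epoch
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        error = w * X[batch] - y[batch]
        slope = 2 * np.mean(error * X[batch])    # slope from this batch only
        w -= learning_rate * slope
    # Reaching this point means every batch was used once: one epoch.

print(w)
```

The weight should end up close to 3, the value used to generate the data.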

Question 13

Q

What is the Adam optimizer?

Answer

A

An algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments of the gradients.
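
A sketch of a single-weight Adam update, using the commonly cited default values for `beta1`, `beta2`, and `eps`. The function name `adam_step`, the toy loss, and the learning rate of 0.05 are assumptions chosen for illustration; this is not a library API.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new weight and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the illustrative loss (w - 3)**2, whose slope is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)

print(w)
```

Only first derivatives (slopes) are used; the running averages adapt the size of each step.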

Question 14

Q

How to load csv data with np?

Answer

A

predictors = np.loadtxt('predictors_data.csv', delimiter=',')

Question 15

Q

How to ease optimization?

Answer

A

Scaling data before fitting can ease optimization.
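
One common form of scaling is standardization: shift each feature to mean 0 and rescale it to standard deviation 1. A minimal sketch with made-up data follows; the cards do not say which scaling method is meant, so this choice is an assumption.

```python
import numpy as np

# Made-up feature matrix: two columns on very different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 5000.0]])

# Standardize each column to mean 0 and standard deviation 1.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std
print(X_scaled)
```

After scaling, both columns span the same range, so one learning rate suits all weights more easily.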

Question 16

Q

What is a good objective function for regression?

Answer

A

Mean squared error
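
A small NumPy sketch of this loss. The function name and the sample values are chosen here for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between predictions and targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_pred - y_true) ** 2)

# Errors are -4 and 1, so the loss is (16 + 1) / 2 = 8.5.
print(mean_squared_error([10.0, 4.0], [6.0, 5.0]))
```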

Question 17

Q

What is a good objective function for classification?

Answer

A

Categorical cross-entropy
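
A sketch of this loss for one-hot labels and predicted class probabilities. The function name, the clipping constant `eps`, and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean of -sum(y_true * log(y_prob)) over samples.

    y_true: one-hot labels, shape (n_samples, n_classes).
    y_prob: predicted class probabilities, same shape.
    """
    y_prob = np.clip(y_prob, eps, 1.0)   # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_prob), axis=1))

y_true = np.array([[1, 0, 0], [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
loss = categorical_cross_entropy(y_true, y_prob)
print(loss)
```

Only the probability assigned to the correct class contributes, so the loss shrinks toward 0 as that probability approaches 1.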

Question 18

Q

What is softmax?

Answer

A

A generalization of the logistic function that “squashes” a K-dimensional vector z of arbitrary real values to a K-dimensional vector s(z) of real values in the range (0, 1) that add up to 1.
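
A NumPy sketch of the function. Subtracting the maximum before exponentiating is a common numerical-stability trick and is an addition here, not something the card mentions.

```python
import numpy as np

def softmax(z):
    """Map a vector of real values to positive values that sum to 1."""
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - z.max())   # subtracting the max avoids overflow
    return exp_z / exp_z.sum()

s = softmax([1.0, 2.0, 3.0])
print(s, s.sum())
```

Larger inputs get larger shares, and the outputs can be read as class probabilities.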