Deep Learning Flashcards
How do neural networks (NNs) capture interactions?
By using hidden layers, in which each node's value is the dot product of the previous layer's node values and the weights on the edges feeding into it.
What is forward propagation?
Multiply each input node value by the weight on the edge connecting it to a node in the next layer, then add the products to get that node's value. Repeat layer by layer until the output.
How to compute final NN output value using np?
output = (hidden_layer_values * weights['output']).sum()
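A minimal sketch of a full forward pass, assuming a made-up network with two inputs and one hidden layer of two nodes. The input values and the weights (including the keys 'node_0' and 'node_1') are illustrative assumptions, not values from the cards.

```python
import numpy as np

# Hypothetical inputs and weights, chosen only for illustration.
input_data = np.array([2, 3])
weights = {
    "node_0": np.array([1, 1]),
    "node_1": np.array([-1, 1]),
    "output": np.array([2, -1]),
}

# Each hidden node: multiply the inputs by the incoming weights and add.
node_0_value = (input_data * weights["node_0"]).sum()
node_1_value = (input_data * weights["node_1"]).sum()
hidden_layer_values = np.array([node_0_value, node_1_value])

# The output node repeats the same multiply-and-add step on the hidden layer.
output = (hidden_layer_values * weights["output"]).sum()
print(output)
```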
What is an activation function?
A function applied to node inputs to produce node output.
What is ReLU?
Rectified Linear Activation: returns 0 if x < 0, else x, i.e. max(0, x).
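One possible way to write this activation with NumPy, so that it works on a whole layer of node inputs at once. The function name relu and the sample inputs are assumptions made for this sketch.

```python
import numpy as np

def relu(x):
    """Return 0 for negative inputs and x otherwise (applied elementwise)."""
    return np.maximum(0, x)

# Hypothetical node inputs: one negative, one zero, one positive.
node_inputs = np.array([-3.0, 0.0, 2.5])
node_outputs = relu(node_inputs)
print(node_outputs)
```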
Why is there less of a need for feature engineering with DL?
Deep networks internally build representations of patterns in the data. Subsequent layers build increasingly sophisticated representations of raw data.
What are the steps of Gradient Descent?
Start at a random point
Until you are somewhere flat:
● Find the slope
● Take a step downhill
How to avoid big steps with GD?
Use a learning rate: update each weight by subtracting learning rate * slope.
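The gradient descent steps and the learning-rate update can be sketched together as a short loop. This assumes a single linear node with a squared-error loss; the data, the tolerance used for "flat", and the iteration cap are all made up for illustration.

```python
import numpy as np

# Hypothetical data: one observation with two inputs and a target value.
input_data = np.array([1.0, 2.0])
target = 4.0
learning_rate = 0.01

# Start at a random point (seeded so the run is repeatable).
rng = np.random.default_rng(0)
weights = rng.normal(size=2)

for _ in range(1000):
    prediction = (input_data * weights).sum()
    error = prediction - target
    # Find the slope of the squared error w.r.t. each weight.
    slope = 2 * input_data * error
    # Stop once we are somewhere (almost) flat.
    if np.abs(slope).max() < 1e-6:
        break
    # Take a small step downhill.
    weights = weights - learning_rate * slope

final_prediction = (input_data * weights).sum()
print(final_prediction)
```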
How to calculate the new weight for a current weight of 2 that connects a node with value 3 to a node with predicted value 6 and actual value 10, using a learning rate of 0.01?
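Subtract learning rate * gradient from the current weight. The gradient is the product of the first three values below:
* Slope of the loss function (squared error) w.r.t. the value at the node we feed into: 2 * (predicted value [6] - actual value [10]) = -8
* The value of the node that feeds into our weight: 3
* Slope of the activation function at the node we feed into: none here (no activation function is applied), so it contributes a factor of 1
* Learning rate: 0.01
Gradient: -8 * 3 = -24
Result: 2 - 0.01 * (-24) = 2.24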
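The same arithmetic, written out as a short script. The variable names are assumptions; the numbers are the ones from the card.

```python
weight = 2.0
input_node_value = 3.0
actual_value = 10.0
learning_rate = 0.01

# Forward step: the predicted value is the input times the weight.
predicted_value = input_node_value * weight

# Slope of the squared-error loss w.r.t. the predicted value.
slope_of_loss = 2 * (predicted_value - actual_value)

# Gradient w.r.t. the weight (no activation function, so no extra factor).
gradient = slope_of_loss * input_node_value

# Step against the gradient, scaled by the learning rate.
new_weight = weight - learning_rate * gradient
print(new_weight)
```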
What is backpropagation?
It allows gradient descent to update all weights in a neural network by computing the slope of the loss function w.r.t. each weight. It first uses forward propagation to calculate predictions and errors, then propagates the error backward from the output layer, one layer at a time, to get the gradient for every weight.
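A rough sketch of one backpropagation pass, assuming a tiny network (two inputs, two ReLU hidden nodes, one output) and a squared-error loss. The function names forward and backward, the weights, and the data are hypothetical and chosen only to keep the numbers easy to follow.

```python
import numpy as np

def forward(input_data, w_hidden, w_output):
    """Forward propagation: return hidden inputs, hidden outputs, prediction."""
    hidden_in = w_hidden @ input_data        # weighted sum into each hidden node
    hidden_out = np.maximum(0, hidden_in)    # ReLU activation
    prediction = w_output @ hidden_out
    return hidden_in, hidden_out, prediction

def backward(input_data, target, w_hidden, w_output):
    """Return the gradients of the squared error w.r.t. all weights."""
    # Forward pass first, to get the prediction and the error.
    hidden_in, hidden_out, prediction = forward(input_data, w_hidden, w_output)
    # Slope of the loss w.r.t. the prediction.
    d_prediction = 2 * (prediction - target)
    # Output weights: slope at the output times the value feeding each weight.
    grad_output = d_prediction * hidden_out
    # Pass the slope back through the output weights and the ReLU slope (0 or 1).
    d_hidden_in = d_prediction * w_output * (hidden_in > 0)
    # Hidden weights: slope at each hidden node times the input feeding in.
    grad_hidden = np.outer(d_hidden_in, input_data)
    return grad_hidden, grad_output

input_data = np.array([1.0, 2.0])
target = 5.0
w_hidden = np.array([[1.0, 1.0], [-1.0, 2.0]])
w_output = np.array([2.0, -1.0])

grad_hidden, grad_output = backward(input_data, target, w_hidden, w_output)
print(grad_hidden)
print(grad_output)
```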
What is stochastic gradient descent?
Gradient descent in which slopes are calculated on one batch (a subset of the data) at a time, rather than on the full dataset.
What is an epoch?
One full pass through the training data, i.e. when every batch has been used once to update the weights.
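Batches and epochs can be illustrated with a nested loop: the outer loop counts epochs, the inner loop updates the weights once per batch. This is only a sketch under assumed conditions: a linear model, synthetic noise-free data, and arbitrary choices for batch size, learning rate, and number of epochs.

```python
import numpy as np

# Synthetic data generated from known weights (made up for this sketch).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(100, 2))
true_weights = np.array([1.5, -2.0])
target = predictors @ true_weights

weights = np.zeros(2)
learning_rate = 0.05
batch_size = 10
n_epochs = 20

for epoch in range(n_epochs):
    # Shuffle the rows so each epoch sees the batches in a new order.
    order = rng.permutation(len(predictors))
    for start in range(0, len(predictors), batch_size):
        batch = order[start:start + batch_size]
        error = predictors[batch] @ weights - target[batch]
        # Slope of the mean squared error, computed on this batch only.
        slope = 2 * predictors[batch].T @ error / len(batch)
        weights = weights - learning_rate * slope
    # Reaching this point means every batch was used once: one epoch is done.

print(weights)
```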
What is the Adam optimizer?
An algorithm for first-order gradient-based optimization of stochastic objective functions.
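For intuition, a single Adam update for one weight might look like the sketch below. The function name adam_step is an assumption; the default hyperparameters are the ones commonly quoted for Adam, and the weight and gradient are made-up inputs. Adam keeps running averages of the gradient (m) and the squared gradient (v) and uses them to scale each step.

```python
import numpy as np

def adam_step(weight, gradient, m, v, t, learning_rate=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step counter."""
    # Running averages of the gradient and of the squared gradient.
    m = beta1 * m + (1 - beta1) * gradient
    v = beta2 * v + (1 - beta2) * gradient ** 2
    # Correct the bias caused by starting both averages at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    new_weight = weight - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return new_weight, m, v

# First step from a weight of 2.0 with a gradient of -24.0.
new_weight, m, v = adam_step(weight=2.0, gradient=-24.0, m=0.0, v=0.0, t=1)
print(new_weight)
```

On the first step the update size is close to the learning rate regardless of how large the gradient is, which is one reason Adam tends to be less sensitive to gradient scale than plain gradient descent.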
How to load csv data with np?
predictors = np.loadtxt('predictors_data.csv', delimiter=',')
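A self-contained way to try the call without a file on disk, assuming the CSV holds only numeric columns and no header row. The in-memory text below is a made-up stand-in for 'predictors_data.csv'.

```python
import io
import numpy as np

# Stand-in for the CSV file: two rows, three numeric columns (invented values).
csv_text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"

# np.loadtxt accepts a file name or a file-like object.
predictors = np.loadtxt(io.StringIO(csv_text), delimiter=",")

# The number of columns is the number of input nodes the network needs.
n_cols = predictors.shape[1]
print(predictors.shape, n_cols)
```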
How to ease optimization?
Scaling data before fitting can ease optimization.
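One common form of scaling is standardization: subtract each column's mean and divide by its standard deviation. The sketch below assumes that choice and uses invented data whose columns differ in scale by a factor of 1000.

```python
import numpy as np

# Hypothetical predictors with columns on very different scales.
predictors = np.array([[1.0, 1000.0],
                       [2.0, 2000.0],
                       [3.0, 3000.0]])

# Standardize each column to mean 0 and standard deviation 1.
column_means = predictors.mean(axis=0)
column_stds = predictors.std(axis=0)
scaled = (predictors - column_means) / column_stds
print(scaled)
```

After scaling, a single learning rate suits all weights better, because no column dominates the slope purely through its units.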