Final Test Flashcards
What does Entropy mean?
It’s the degree of disorder
How random it is
Entropy’s formula
Entropy([6+,2-])
-(6/8)log2(6/8) - (2/8)log2(2/8)
≈ 0.8113
Entropy([0+,4-]) with log2
0
Entropy([4+,4-]) with log2
1
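The three entropy values above can be checked with a short Python sketch (the helper name `entropy` is my own):

```python
from math import log2

def entropy(pos, neg):
    """Shannon entropy of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:  # 0 * log2(0) is treated as 0
            result -= p * log2(p)
    return result

print(round(entropy(6, 2), 4))  # 0.8113
print(entropy(0, 4))            # 0.0
print(entropy(4, 4))            # 1.0
```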
What are the main concepts of Backpropagation?
It optimizes the weights and biases of a Neural Network
It starts from the last layer and works its way backward toward the input
What’s the Chain Rule?
dSize/dFat = (dSize/dHeight) x (dHeight/dFat)
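The chain rule can be illustrated with made-up linear relations (the slopes 2 and 3 are my own toy values): if Height = 2 x Fat and Size = 3 x Height, then dSize/dFat = 3 x 2 = 6.

```python
# Toy chain rule illustration with made-up linear functions:
# Height = 2 * Fat   -> dHeight/dFat   = 2
# Size = 3 * Height  -> dSize/dHeight  = 3
d_height_d_fat = 2.0
d_size_d_height = 3.0

# Chain rule: dSize/dFat = dSize/dHeight * dHeight/dFat
d_size_d_fat = d_size_d_height * d_height_d_fat

# Numerical check with a small finite-difference step
h = 1e-6
size = lambda fat: 3.0 * (2.0 * fat)
numeric = (size(1.0 + h) - size(1.0)) / h

print(d_size_d_fat)  # 6.0
```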
Sigmoid function’s formula
1 / (1 + pow(e, -x))
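The sigmoid, 1 / (1 + e^(-x)), squashes any real number into (0, 1); a minimal sketch:

```python
from math import exp

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + exp(-x))

print(sigmoid(0))  # 0.5 (the midpoint)
```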
What is the use of Gradient Descent?
It calculates better parameters for prediction.
It takes bigger steps when far from the minimum, and small steps when the parameter is close
It finds the minimum value by taking steps from an initial guess until it reaches the best value
=> The minimum is where the derivative = 0
What is the meaning of SSR?
Sum of Squared Residuals
1.1² + 0.4² + (-1.3)² = …
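The residual arithmetic above can be checked in one line (the three residuals are the ones listed):

```python
# Sum of squared residuals for the three example residuals above
residuals = [1.1, 0.4, -1.3]
ssr = sum(r ** 2 for r in residuals)
print(round(ssr, 2))  # 3.06
```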
What are the steps of Gradient Descent ?
When making Predictions
1. Choose the Loss function
2. Calculate SSR (how different we are)
3. Take derivative of SSR
4. Pick random value for the intercept
5. Calculate derivative using that intercept
6. Calculate the Step Size
7. Calculate the New Intercept
8. Use the new intercept and repeat steps 5 to 8
9. Stop when Step Size is close to 0
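The steps above can be sketched as a loop that fits only the intercept, with the slope fixed at 0.64 as in the SSR example; the data points and learning rate here are my own illustrative values:

```python
# Gradient descent on the intercept only; slope fixed at 0.64 (illustrative data).
data = [(0.5, 1.4), (2.3, 1.9), (2.9, 3.2)]  # (x, observed) pairs, made up
slope = 0.64
learning_rate = 0.1

def d_ssr_d_intercept(intercept):
    # Step 3: derivative of SSR w.r.t. the intercept is the sum of
    # -2 * (observed - predicted) over all data points (chain rule).
    return sum(-2 * (obs - (intercept + slope * x)) for x, obs in data)

intercept = 0.0  # step 4: initial (random) guess
for _ in range(1000):
    gradient = d_ssr_d_intercept(intercept)  # step 5
    step_size = gradient * learning_rate     # step 6: Slope x Learning rate
    intercept = intercept - step_size        # step 7: new intercept
    if abs(step_size) < 1e-6:                # step 9: stop when step size ~ 0
        break

print(round(intercept, 3))  # 0.951
```

With this data the closed-form optimum is the mean of (observed - 0.64 x), about 0.9507, which the loop converges to in a handful of iterations.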
SSR’s formula
(observed1 - predicted1)² +
(observed2 - predicted2)² + …
observed = real (measured) value
predicted = value on the equation line
= (intercept + 0.64 x axis val)
Example term: (1.4 - (intercept + 0.64 x 0.5))²
Derivative of SSR with respect to the intercept
= sum of the derivatives of each term
=> Use the CHAIN RULE
= -2 x (1.4 - (intercept + 0.64 x 0.5)) for the example term
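The chain-rule derivative above can be verified with a central finite difference; the intercept value 1.0 below is an arbitrary test point of my choosing:

```python
slope, x, observed = 0.64, 0.5, 1.4  # values from the example term

def ssr_term(intercept):
    # One squared-residual term of the SSR
    return (observed - (intercept + slope * x)) ** 2

def analytic(intercept):
    # Chain rule result: -2 * (observed - predicted)
    return -2 * (observed - (intercept + slope * x))

h = 1e-6
intercept = 1.0
numeric = (ssr_term(intercept + h) - ssr_term(intercept - h)) / (2 * h)
print(round(analytic(intercept), 4), round(numeric, 4))  # -0.16 -0.16
```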
Example of Loss function
SSR
What is the intercept in Gradient Descent?
It’s the y-value where the line crosses the y-axis (the prediction when x = 0)
How do you calculate the Step Size in Gradient Descent?
Step Size = Slope x Learning rate