Gradient Descent Flashcards
What is gradient descent?
It is an optimization algorithm used to minimize a function; in our case, that function is the loss function (we defined it in the previous lesson).
It is fundamental for training ML models.
The update rule is x_{k+1} = x_k − α ∇f(x_k): at each step we move from the current point in the direction opposite to the gradient, scaled by the learning rate α.
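As a rough sketch (not from the lesson), the update rule can be coded in a few lines of Python; `grad_f`, `x0`, `alpha`, and `n_iters` are illustrative names:

```python
# Minimal gradient descent sketch: iterate x_{k+1} = x_k - alpha * grad_f(x_k).
def gradient_descent(grad_f, x0, alpha=0.1, n_iters=100):
    x = x0
    for _ in range(n_iters):
        x = x - alpha * grad_f(x)  # step opposite to the gradient
    return x

# Example: f(x) = x^2 has gradient 2x and its minimum at x = 0.
print(gradient_descent(lambda x: 2 * x, x0=5.0))  # close to 0
```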
What is the main assumption we make when we apply gradient descent?
The function has to be differentiable (in a stochastic process we are not able to compute the derivative).
Why is the Taylor series useful in gradient descent?
It can be used to approximate derivatives that are computationally intensive to calculate.
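As a hedged illustration (not from the lesson), the first-order Taylor expansion f(x + h) ≈ f(x) + h·f′(x) gives the usual finite-difference approximation; `f` and `h` below are illustrative:

```python
# Finite-difference approximation of f'(x) from the first-order Taylor
# expansion: f(x + h) ≈ f(x) + h * f'(x)  =>  f'(x) ≈ (f(x + h) - f(x)) / h.
def approx_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x)) / h

# Example: the derivative of f(x) = x^3 at x = 2 is 12.
print(approx_derivative(lambda x: x ** 3, 2.0))  # ≈ 12.0
```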
What is the definition of the gradient?
The gradient is the generalization of the derivative to functions of several variables: the vector of all partial derivatives of the function.
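In symbols (the standard definition, added here for completeness):

```latex
\nabla f(x_1, \dots, x_n) =
  \left( \frac{\partial f}{\partial x_1}, \; \dots, \; \frac{\partial f}{\partial x_n} \right)
```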
What is the consequence of choosing **alpha** too large or too small in gradient descent?
If alpha is too large, convergence is faster but not guaranteed (the iterates can overshoot the minimum and diverge); if alpha is too small, convergence is guaranteed (for well-behaved functions) but slow.
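A quick illustrative sketch of both regimes on f(x) = x², whose gradient is 2x (the values of alpha are made up):

```python
# For f(x) = x^2 the update is x <- x - alpha * 2x = (1 - 2*alpha) * x,
# so the iterates diverge whenever |1 - 2*alpha| > 1 (i.e. alpha > 1).
def run(alpha, x=1.0, n_iters=20):
    for _ in range(n_iters):
        x = x - alpha * 2 * x
    return x

print(run(alpha=1.1))   # too large: |x| grows each step (diverges)
print(run(alpha=0.01))  # too small: shrinks toward 0, but slowly
print(run(alpha=0.4))   # moderate: reaches ~0 quickly
```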
Write down the formula and explain it
The formula is x_{k+1} = x_k − α ∇f(x_k): starting from an initial point x_0, at every iteration we evaluate the gradient at the current point and take a step in the opposite direction (the direction of steepest descent), scaled by the learning rate α, repeating until the gradient is approximately zero.
Given a function f(x, y), an initial point x_0, and a learning rate alpha, write down the iterations to perform (exam question).
Exercise solved on paper; the procedure is: evaluate ∇f = (∂f/∂x, ∂f/∂y) at the current point, update (x, y) ← (x, y) − α ∇f(x, y), and repeat for the required number of iterations.
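Since the resolution lives on paper, here is a hedged sketch of the same procedure on an assumed example, f(x, y) = x² + y² (the function, starting point, and alpha are all illustrative):

```python
# Three gradient descent iterations on an assumed f(x, y) = x^2 + y^2,
# whose gradient is (2x, 2y); starting point and alpha are illustrative.
def grad_f(x, y):
    return 2 * x, 2 * y

x, y, alpha = 1.0, 2.0, 0.1
for k in range(3):
    gx, gy = grad_f(x, y)
    x, y = x - alpha * gx, y - alpha * gy
    print(f"iteration {k + 1}: (x, y) = ({x:.3f}, {y:.3f})")
# iteration 1: (0.800, 1.600)
# iteration 2: (0.640, 1.280)
# iteration 3: (0.512, 1.024)
```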