Gradient Flashcards
Gradient Descent
Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).
Procedure:
The procedure starts with initial values for the coefficient or coefficients of the function. These could be 0.0 or small random values.
A learning rate parameter (alpha) must be specified that controls how much the coefficients can change on each update.
On each iteration, the cost is evaluated for the current coefficients, the derivative (gradient) of the cost is calculated, and each coefficient is updated by moving against the gradient: coefficient = coefficient - alpha * gradient (see the sketch after this list).
This process is repeated until the cost of the coefficients (cost) is 0.0 or close enough to zero to be good enough.
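Below is a minimal sketch of this loop in Python on a one-dimensional quadratic cost; the cost function, its derivative, and the target value 3.0 are illustrative assumptions, not part of the flashcards.

```python
def cost(w):
    # Illustrative cost: a simple quadratic with its minimum at w = 3.0.
    return (w - 3.0) ** 2

def gradient(w):
    # Analytic derivative of the cost with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0       # initial coefficient value
alpha = 0.1   # learning rate
for step in range(1000):
    w = w - alpha * gradient(w)   # update: move against the gradient
    if cost(w) < 1e-12:           # stop when the cost is close enough to zero
        break

print(w)  # approaches 3.0, the value that minimizes the cost
```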
Notes:
Gradient descent can be slow to run on very large datasets, because each update requires evaluating the cost and its gradient over the entire dataset.
Jacobian Matrix (J):
The Jacobian matrix of a vector-valued function of several variables is the matrix of all its first-order partial derivatives.
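As a reference, assuming f maps R^n to R^m with components f_1, ..., f_m (notation assumed here, not from the flashcards), the (i, j) entry of the Jacobian is the first partial derivative of f_i with respect to x_j:

```latex
J_{ij} = \frac{\partial f_i}{\partial x_j},
\qquad
J =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{pmatrix}
```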
Hessian Matrix (H):
The Hessian matrix (or Hessian) is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field.
H >= 0 - Convex (positive semidefinite)
H > 0 - Strictly Convex (positive definite)
H < 0 - Strictly Concave (negative definite)
H <= 0 - Concave (negative semidefinite)
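For reference, assuming f is a scalar-valued function of x_1, ..., x_n (notation assumed here), the (i, j) entry of the Hessian is the second partial derivative; the definiteness conditions above are understood to hold at every point of the domain:

```latex
H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}
```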
Gradient Ascent
Gradient ascent is the same procedure applied to maximize an objective function rather than minimize a cost: the coefficients are updated in the direction of the gradient instead of against it.
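Writing the coefficients as theta, the learning rate as alpha, and the objective as f (notation assumed here, not from the flashcards), the two update rules differ only in the sign of the step:

```latex
\text{descent: } \theta \leftarrow \theta - \alpha \, \nabla f(\theta)
\qquad
\text{ascent: } \theta \leftarrow \theta + \alpha \, \nabla f(\theta)
```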