Unconstrained and Constrained Optimization Flashcards
How do you find a critical point of a single variable function using calculus?
Take derivative and set it equal to zero. Solve for critical point(s).
How do you determine Max/Min of a single variate function using calculus?
Take the second derivative: if it is "+" the point is a minimum (convex, bowl); if "-" it is a maximum (concave, hill).
How do you find the critical points for multi-variate function with calculus?
Take gradient of function w.r.t. all variables. Set = 0, solve for critical point(s).
How do you determine if multi-variate function is max/min?
Take the Hessian H(x), the matrix of second partial derivatives, and examine its eigenvalues.
- Positive Definite Hessian - all eigenvalues '+'; point is a minimum.
- Positive Semi-Definite Hessian - all eigenvalues >= 0 with at least one eigenvalue = 0; the minimizer could be an infinite line/hyperplane of minima, or the function could be unbounded from below/above.
- Negative Definite Hessian - all eigenvalues '-'; point is a maximum.
- Indefinite Hessian - both positive and negative eigenvalues; saddle point.
Define a Positive Definite Matrix, what does PD hessian mean?
A PD matrix has all eigenvalues positive (its determinant is then also positive, though a positive determinant alone does not imply PD). A PD Hessian means the critical point is a minimum.
Define a Positive Semi-Definite Matrix, what does PSD hessian mean?
A PSD matrix has all eigenvalues >= 0, with at least one eigenvalue equal to 0. A PSD Hessian means the minimizer could be a hyperplane of infinitely many minima, or the function could be unbounded from below/above.
Define a Negative Definite Matrix, what does ND hessian mean?
An ND matrix has all eigenvalues negative. An ND Hessian means the point is a maximum.
Define a non-singular indefinite matrix
Both positive and negative eigenvalues
Define a non-singular/invertible matrix
matrix that can be inverted, AA^(-1) = I
Define a singular matrix
matrix that does not have an inverse
Find eigenvalues and eigenvectors of 2x2 matrix:
1) Form (A - lambda*I)
2) Solve det(A - lambda*I) = 0 for the eigenvalues lambda_i
3) Solve (A - lambda_i*I)v_i = 0 for the eigenvector v_i belonging to each lambda_i
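The three steps above can be sketched for a 2x2 matrix in pure Python; the matrix entries here are an assumed example, and the eigenvector formula is one valid choice for the 2x2 case.

```python
import math

# For a 2x2 matrix A = [[a, b], [c, d]], the characteristic polynomial is
# det(A - lambda*I) = lambda^2 - (a + d)*lambda + (a*d - b*c) = 0
a, b, c, d = 4.0, 1.0, 2.0, 3.0            # example matrix (assumed values)

tr, det = a + d, a * d - b * c
disc = math.sqrt(tr**2 - 4 * det)          # assumes real, distinct eigenvalues
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2

# Step 3: solve (A - lambda_i*I) v_i = 0. For a 2x2 with b != 0, one
# valid (unnormalized) choice is v_i = (b, lambda_i - a).
v1 = (b, lam1 - a)
v2 = (b, lam2 - a)
print(lam1, lam2)                          # 5.0 2.0 for this matrix
print(v1, v2)                              # (1.0, 1.0) (1.0, -2.0)
```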
What is optimization?
The process of making something as fully perfect, functional, or effective as possible. *Mathematically: finding the min/max of a function within a neighborhood or subject to constraints.
What is the goal of unconstrained optimization?
To minimize some function f(x), or maximize -f(x).
What are challenges associated with unconstrained optimization?
- The function f(x) may be of unknown form, or we may have little knowledge of its shape
- f(x) may be expensive to evaluate, may need to reduce number of function calls/queries of objective function
What is Newton’s Root Finding Method?
Root finding methods are used to determine the 'x' values at which a function g crosses zero, using successive approximations to these roots:
x_n+1 = x_n - g(x_n)/g'(x_n)
What is the first step to Newton’s root finding method?
Need an initial point.
How do we determine an initial pt for Newton’s root finding?
Make a plot, or guess based on intuition.
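The iteration above can be sketched in a few lines of Python; the example function g and the initial point are illustrative assumptions.

```python
# Newton's root finding: find x where g(x) = 0.
def g(x):
    return x**2 - 2.0        # example: root at sqrt(2)

def g_prime(x):
    return 2.0 * x

x = 1.0                      # initial point from a plot or intuition
for _ in range(20):
    step = g(x) / g_prime(x)
    x = x - step             # x_{n+1} = x_n - g(x_n)/g'(x_n)
    if abs(step) < 1e-12:    # stop once the correction is negligible
        break
print(x)                     # ~1.41421356 (sqrt(2))
```

Note the method assumes g'(x_n) != 0 at each iterate; a poor initial point can make it diverge, which is why the starting guess matters.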
What is a multimodal problem?
A multimodal problem is a problem with multiple local optima over a range of x.
*unimodal - monotonically increases for some range x <= m and decreases for some range x >= m (a single maximum at m; reverse the directions for a single minimum)
When do calculus-based methods fail to find a minimum?
- Higher order functions
- Transcendental functions
- Constrained optimization
- Functions that are “black box” functions of unknown form
When does a derivative of a function not exist?
- When the function is not continuous or not smooth: a cusp, a hole, or any discontinuity
- At a vertical asymptote, or where the tangent line is vertical
What is the Lagrangian Function?
A modification of the objective function accounting for the "m" inequality constraints g_i(x) and "l" equality constraints h_j(x). In one common sign convention:
L(x, lambda, mu) = f(x) + sum_i lambda_i*g_i(x) + sum_j mu_j*h_j(x)
What are Lagrange multipliers?
They tell you the sensitivity of the solution to perturbation of the constraint.
- zero = inactive / passive
- large = highly sensitive
- small = insensitive to changes in constraint
What is an active constraint?
A constraint for which the constrained optimization solution lies on the boundary, preventing the solution from reaching the unconstrained global minimum.
What are 3 characteristics of “Black Box” functions?
1) Functions of unknown form
2) Functions may be engineering tools that are expensive to run: inadequate number of samples, can't plot, too many variables to visualize
3) Functions often strongly non-linear and have discontinuities
What is engineering analysis?
Given the values for the design variables, analysis to determine how we expect the system to perform
What is engineering design?
Given a desired performance or measure of ranking designs based on performance, what system do we want?
What is the vector x in an optimization problem f(x)?
x is the design-variable vector describing the design, drawn from the design space.
What is a weak local solution to a minimization problem?
A weak local solution is a point x* for which f(x*) <= f(x) for all x in a neighborhood N around x*.
What is a strong local solution to a minimization problem?
A strong local solution is a point x* for which f(x*) < f(x) for all x != x* in a neighborhood N around x*.
What is a global solution? Which type of function guarantees global sol?
x* is a global minimizer if f(x*) <= f(x) for ALL x.
A convex function guarantees a global sol.
Why are global solutions difficult to find?
Because we can only evaluate f(x), grad_f(x), and H(x) locally - one point at a time. It's difficult to know when to stop; there is no test that guarantees a global solution without further information, unless the function is convex.
Define the concept of convexity, write the equation
A function is convex if, for all points x1, x2 in the domain of f(x) and for all lambda in [0,1], the value of the function at the weighted average of the points is at most the weighted average of the function values:
f(lambda*x1 + (1-lambda)*x2) <= lambda*f(x1) + (1-lambda)*f(x2)
- if f is convex, any local optimizer x* is a global minimizer
- convexity implies unimodality
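The convexity inequality can be spot-checked numerically at random sample points; the function f(x) = x**2 here is an assumed example of a convex function.

```python
import random

# Spot-check f(lambda*x1 + (1-lambda)*x2) <= lambda*f(x1) + (1-lambda)*f(x2)
# at many random points for the example convex function f(x) = x**2.
def f(x):
    return x**2

random.seed(0)
ok = True
for _ in range(1000):
    x1, x2 = random.uniform(-10, 10), random.uniform(-10, 10)
    lam = random.random()
    lhs = f(lam * x1 + (1 - lam) * x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)
    ok = ok and (lhs <= rhs + 1e-9)   # small tolerance for float rounding
print(ok)                             # True: inequality held at every sample
```

A sampled check like this can only disprove convexity (one failing sample suffices); passing it does not prove convexity.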
What are the first and second order necessary conditions for unconstrained optimization?
(Optimality Conditions)
1st Order Nec:
If x* is a local minimizer and f is continuously differentiable, then the gradient at x* must equal zero.
2nd Order Nec:
If x* is a local minimizer and f is twice continuously differentiable, then the gradient at x* must equal zero and the Hessian at x* must be positive semi-definite.
What are the first and second order sufficient conditions?
If the gradient grad_f(x*) equals zero and the Hessian H(x*) is positive definite at the critical point x*, then x* is a strong local solution.
**These are the ONLY conditions that guarantee a minimizer.
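The sufficient conditions can be checked by hand for a small example; the function f(x, y) = x**2 + x*y + y**2 and its candidate point are assumed for illustration, and positive definiteness of the 2x2 Hessian is tested via Sylvester's criterion (both leading principal minors positive).

```python
# f(x, y) = x**2 + x*y + y**2 (assumed example).
# Gradient: (2x + y, x + 2y); Hessian: [[2, 1], [1, 2]] (constant).
x, y = 0.0, 0.0                          # candidate critical point
grad = (2*x + y, x + 2*y)                # (0.0, 0.0) here

# Sylvester's criterion: a symmetric 2x2 matrix is positive definite
# iff both leading principal minors are positive.
H = [[2.0, 1.0], [1.0, 2.0]]
minor1 = H[0][0]
minor2 = H[0][0]*H[1][1] - H[0][1]*H[1][0]

is_strong_local_min = grad == (0.0, 0.0) and minor1 > 0 and minor2 > 0
print(is_strong_local_min)               # True: (0, 0) is a strong local min
```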
What is an optimization algorithm?
An optimization algorithm is one that evaluates a sequence of designs with the goal of finding the best.
Basic steps of an optimization algorithm (super abbreviated):
1) Starting point for design (x)
2) Information of state around x
3) Select subset of design space to search for improvement upon current point from state of x information
4) Search subset area, stop after sufficient improvement to “best” in subset
5) Move to this pt as “current state” - restart step 2
6) Iterate until improvement is “good enough”
What are three types of unconstrained optimization algorithms?
Direct search, line search, and trust region methods
Describe Direct Search Methods
- No derivatives used in search
- Search direction based on a sequence, pattern, or random choice
What are the types of Direct Search Methods?
- Grid & Random search - distribute pts, evaluate function and zoom to min(f) for finer discretization
- Random walk, compass search, and coordinate pattern search - use univariate directions (or random) to check function improvements, eval min(f), and move
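A compass (coordinate pattern) search can be sketched in a few lines: probe a +/- step along each coordinate, move to any improvement, and shrink the step when no probe improves. The objective, starting point, and step schedule below are illustrative assumptions.

```python
# Compass search sketch: no derivatives, univariate probe directions.
def f(x):
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2   # assumed example, min at (1, -2)

x = [0.0, 0.0]                # starting point
step = 1.0
while step > 1e-6:
    best = f(x)
    moved = False
    for i in range(len(x)):            # each coordinate direction
        for sign in (+1.0, -1.0):      # probe both ways
            trial = list(x)
            trial[i] += sign * step
            if f(trial) < best:        # accept any improvement
                x, best, moved = trial, f(trial), True
    if not moved:
        step *= 0.5                    # no improvement: refine the pattern
print(x)                               # close to the minimizer (1, -2)
```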
Describe Line Search Methods
- Typically uses derivatives to define search direction, but not always
- Search is conducted along successive lines in design space.
- Only allows for semi-definite or definite Hessians
“fix direction (p), compute step length (alpha)”
What are the orders of Line Search Methods? Explain each one
Zeroth Order:
- only uses function values f(x)
First Order:
- uses function value f(x), and gradient grad_f(x)
- may approximate hessian w/ first order methods
Second Order:
- uses f(x), grad_f(x), and Hessian H(x)
Describe Trust Region Methods.
- Search is conducted locally by approximating objective function as quadratic or linear
- “trust region” refers to how far from current point your model function is still a valid approximation
- Trust regions allow for indefinite or semi-definite Hessians (line search does not)
“fix distance (delta), choose direction (p)”
What are the two types of line search methods?
Exact and Inexact
What is an unconstrained optimization Merit Function? What is the relation to the objective function? What is its advantage?
The merit function is a 1D slice of high-dimensional space, approximating the objective function.
phi(alpha) = f(x_k + alpha*p)
min over alpha: phi(alpha)
phi(0) = f(x_k)
phi'(0) = dphi/dalpha at alpha = 0 = grad_f(x_k)^T p < 0
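The merit function's role as a 1D slice can be shown concretely; the objective f, the current point x_k, and the steepest-descent direction p below are assumed example choices.

```python
# Merit function phi(alpha) = f(x_k + alpha*p): a 1D slice of an n-D problem.
def f(x):
    return x[0]**2 + 4.0 * x[1]**2     # assumed example objective

x_k = [2.0, 1.0]                       # current iterate
grad = [2*x_k[0], 8*x_k[1]]            # grad f at x_k = (4, 8)
p = [-grad[0], -grad[1]]               # steepest-descent direction

def phi(alpha):
    return f([x_k[0] + alpha*p[0], x_k[1] + alpha*p[1]])

phi0 = phi(0.0)                        # equals f(x_k) = 8.0
slope0 = grad[0]*p[0] + grad[1]*p[1]   # phi'(0) = grad^T p = -80.0 < 0
print(phi0, slope0)                    # 8.0 -80.0
```

The negative slope at alpha = 0 confirms p is a descent direction, so minimizing phi over the scalar alpha improves the full objective.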
Why does unconstrained optimization require sufficient decrease and curvature conditions?
The simple decrease condition f(x_k + alpha*p_k) <= f(x_k) is not enough to guarantee convergence to a minimizer x*.
- It gives a relative decrease each iteration, but could still miss the local minimum (the decreases may shrink too fast)
- This is why we need the Wolfe conditions
What is the sufficient decrease condition?
Sufficient decrease, first Wolfe, or Armijo condition: requires that the merit function value at alpha lie below a line of slope c1*phi'(0) from the starting value:
phi(alpha) <= phi(0) + c1*alpha*phi'(0), with 0 < c1 < 1
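A backtracking line search that enforces the Armijo condition can be sketched as follows; the objective, starting point, c1, and the halving factor are conventional assumed choices.

```python
# Backtracking line search enforcing the Armijo (sufficient decrease)
# condition: f(x_k + alpha*p) <= f(x_k) + c1*alpha*grad_f(x_k)*p.
def f(x):
    return x**2                        # assumed 1D example

def grad_f(x):
    return 2.0 * x

x_k = 3.0
p = -grad_f(x_k)                       # descent direction (here -6.0)
c1, alpha = 1e-4, 1.0                  # c1 small, start with a full step
while f(x_k + alpha * p) > f(x_k) + c1 * alpha * grad_f(x_k) * p:
    alpha *= 0.5                       # backtrack until Armijo holds
x_next = x_k + alpha * p
print(alpha, x_next)                   # 0.5 0.0
```

Backtracking only enforces sufficient decrease; the second Wolfe (curvature) condition additionally rules out steps that are too short.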