Unconstrained and Constrained Optimization Flashcards
How do you find a critical point of a single variable function using calculus?
Take derivative and set it equal to zero. Solve for critical point(s).
How do you determine Max/Min of a single variate function using calculus?
Take the second derivative: if it is "+" the point is a minimum (convex, bowl); if "-" it is a maximum (concave, hill).
How do you find the critical points for multi-variate function with calculus?
Take gradient of function w.r.t. all variables. Set = 0, solve for critical point(s).
How do you determine if multi-variate function is max/min?
Take the Hessian H(x), the matrix of second partial derivatives, and examine its eigenvalues.
- Positive Definite Hessian - all eigenvalues '+'; point is a minimum.
- Positive Semi-Definite Hessian - all eigenvalues >= 0 with at least one eigenvalue = 0; the minimizer could be an infinite line/hyperplane of minima, or the function could be unbounded from below/above.
- Negative Definite Hessian - all eigenvalues '-'; point is a maximum.
- Indefinite Hessian - both positive and negative eigenvalues; saddle point.
Define a Positive Definite Matrix, what does PD hessian mean?
A PD matrix has all eigenvalues positive (its determinant is then also positive, though a positive determinant alone does not imply PD). A PD Hessian means the critical point is a minimum.
Define a Positive Semi-Definite Matrix, what does PSD hessian mean?
A PSD matrix has all eigenvalues >= 0, with at least one eigenvalue equal to 0. A PSD Hessian means the minimizer could be a hyperplane of infinitely many minima, or the function could be unbounded from below/above.
Define a Negative Definite Matrix, what does ND hessian mean?
An ND matrix has all eigenvalues negative. An ND Hessian means the point is a maximum.
Define a non-singular indefinite matrix
Both positive and negative eigenvalues
Define a non-singular/invertible matrix
matrix that can be inverted, AA^(-1) = I
Define a singular matrix
matrix that does not have an inverse
Find eigenvalues and eigenvectors of 2x2 matrix:
1) Form (A - lambda*I)
2) Solve det(A - lambda*I) = 0 for the eigenvalues lambda_i
3) Solve (A - lambda_i*I)v_i = 0 for the eigenvector v_i belonging to each lambda_i
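The three steps above can be sketched for a 2x2 matrix in pure Python; the matrix entries here are an assumed example, and the eigenvector formula is one valid choice for the 2x2 case.

```python
import math

# For a 2x2 matrix A = [[a, b], [c, d]], the characteristic polynomial is
# det(A - lambda*I) = lambda^2 - (a + d)*lambda + (a*d - b*c) = 0
a, b, c, d = 4.0, 1.0, 2.0, 3.0            # example matrix (assumed values)

tr, det = a + d, a * d - b * c
disc = math.sqrt(tr**2 - 4 * det)          # assumes real, distinct eigenvalues
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2

# Step 3: solve (A - lambda_i*I) v_i = 0. For a 2x2 with b != 0, one
# valid (unnormalized) choice is v_i = (b, lambda_i - a).
v1 = (b, lam1 - a)
v2 = (b, lam2 - a)
print(lam1, lam2)                          # 5.0 2.0 for this matrix
print(v1, v2)                              # (1.0, 1.0) (1.0, -2.0)
```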
What is optimization?
The process of making something as fully perfect, functional, or effective as possible. *Mathematically: finding the min/max of a function within a neighborhood or subject to constraints.
What is the goal of unconstrained optimization?
To minimize some function f(x), or maximize -f(x).
What are challenges associated with unconstrained optimization?
- The function f(x) may be of unknown form, or we may have little knowledge of its shape
- f(x) may be expensive to evaluate, may need to reduce number of function calls/queries of objective function
What is Newton’s Root Finding Method?
Root finding methods are used to determine the 'x' values at which a function g crosses zero, using successive approximations to these roots:
x_n+1 = x_n - g(x_n)/g'(x_n)
What is the first step to Newton’s root finding method?
Need an initial point.
How do we determine an initial pt for Newton’s root finding?
Make a plot, or guess based on intuition.
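The iteration above can be sketched in a few lines of Python; the example function g and the initial point are illustrative assumptions.

```python
# Newton's root finding: find x where g(x) = 0.
def g(x):
    return x**2 - 2.0        # example: root at sqrt(2)

def g_prime(x):
    return 2.0 * x

x = 1.0                      # initial point from a plot or intuition
for _ in range(20):
    step = g(x) / g_prime(x)
    x = x - step             # x_{n+1} = x_n - g(x_n)/g'(x_n)
    if abs(step) < 1e-12:    # stop once the correction is negligible
        break
print(x)                     # ~1.41421356 (sqrt(2))
```

Note the method assumes g'(x_n) != 0 at each iterate; a poor initial point can make it diverge, which is why the starting guess matters.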
What is a multimodal problem?
A multimodal problem is a problem with multiple local optima over a range of x.
*unimodal - monotonically increases for some range x <= m and decreases for some range x >= m (a single maximum at m; reverse the directions for a single minimum)
When do calculus-based methods fail to find a minimum?
- Higher order functions
- Transcendental functions
- Constrained optimization
- Functions that are “black box” functions of unknown form
When does a derivative of a function not exist?
- When the function is not continuous or not smooth: a cusp, a hole, or any discontinuity
- At a vertical asymptote, or where the tangent line is vertical
What is the Lagrangian Function?
A modification of the objective function accounting for the "m" inequality constraints g_i(x) and "l" equality constraints h_j(x). In one common sign convention:
L(x, lambda, mu) = f(x) + sum_i lambda_i*g_i(x) + sum_j mu_j*h_j(x)
What are Lagrange multipliers?
They tell you the sensitivity of the solution to perturbation of the constraint.
- zero = inactive / passive
- large = highly sensitive
- small = insensitive to changes in constraint
What is an active constraint?
A constraint for which the constrained optimization solution lies on the boundary, preventing the solution from reaching the unconstrained global minimum.
What are 3 characteristics of “Black Box” functions?
1) Functions of unknown form
2) Functions may be engineering tools that are expensive to run: inadequate number of samples, can't plot, too many variables to visualize
3) Functions often strongly non-linear and have discontinuities
What is engineering analysis?
Given the values for the design variables, analysis to determine how we expect the system to perform
What is engineering design?
Given a desired performance or measure of ranking designs based on performance, what system do we want?
What is the vector x in an optimization problem f(x)?
x is the design-variable vector describing the design, drawn from the design space.
What is a weak local solution to a minimization problem?
A weak local solution is a point x* for which f(x*) <= f(x) for all x in a neighborhood N around x*.
What is a strong local solution to a minimization problem?
A strong local solution is a point x* for which f(x*) < f(x) for all x != x* in a neighborhood N around x*.
What is a global solution? Which type of function guarantees global sol?
x* is a global minimizer if f(x*) <= f(x) for ALL x.
A convex function guarantees a global sol.
Why are global solutions difficult to find?
Because we can only evaluate f(x), grad_f(x), and H(x) locally - one point at a time. It's difficult to know when to stop; there is no test that guarantees a global solution without further information, unless the function is convex.
Define the concept of convexity, write the equation
A function is convex if, for all points x1, x2 in the domain of f(x) and for all lambda in [0,1], the value of the function at the weighted average of the points is at most the weighted average of the function values:
f(lambda*x1 + (1-lambda)*x2) <= lambda*f(x1) + (1-lambda)*f(x2)
- if f is convex, any local optimizer x* is a global minimizer
- convexity implies unimodality
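The convexity inequality can be spot-checked numerically at random sample points; the function f(x) = x**2 here is an assumed example of a convex function.

```python
import random

# Spot-check f(lambda*x1 + (1-lambda)*x2) <= lambda*f(x1) + (1-lambda)*f(x2)
# at many random points for the example convex function f(x) = x**2.
def f(x):
    return x**2

random.seed(0)
ok = True
for _ in range(1000):
    x1, x2 = random.uniform(-10, 10), random.uniform(-10, 10)
    lam = random.random()
    lhs = f(lam * x1 + (1 - lam) * x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)
    ok = ok and (lhs <= rhs + 1e-9)   # small tolerance for float rounding
print(ok)                             # True: inequality held at every sample
```

A sampled check like this can only disprove convexity (one failing sample suffices); passing it does not prove convexity.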
What are the first and second order necessary conditions for unconstrained optimization?
(Optimality Conditions)
1st Order Nec:
If x* is a local minimizer and f is continuously differentiable, then the gradient at x* must equal zero.
2nd Order Nec:
If x* is a local minimizer and f is twice continuously differentiable, then the gradient at x* must equal zero and the Hessian at x* must be positive semi-definite.
What are the first and second order sufficient conditions?
If the gradient grad_f(x*) equals zero and the Hessian H(x*) is positive definite at the critical point x*, then x* is a strong local solution.
**These are the ONLY conditions that guarantee a minimizer.
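The sufficient conditions can be checked by hand for a small example; the function f(x, y) = x**2 + x*y + y**2 and its candidate point are assumed for illustration, and positive definiteness of the 2x2 Hessian is tested via Sylvester's criterion (both leading principal minors positive).

```python
# f(x, y) = x**2 + x*y + y**2 (assumed example).
# Gradient: (2x + y, x + 2y); Hessian: [[2, 1], [1, 2]] (constant).
x, y = 0.0, 0.0                          # candidate critical point
grad = (2*x + y, x + 2*y)                # (0.0, 0.0) here

# Sylvester's criterion: a symmetric 2x2 matrix is positive definite
# iff both leading principal minors are positive.
H = [[2.0, 1.0], [1.0, 2.0]]
minor1 = H[0][0]
minor2 = H[0][0]*H[1][1] - H[0][1]*H[1][0]

is_strong_local_min = grad == (0.0, 0.0) and minor1 > 0 and minor2 > 0
print(is_strong_local_min)               # True: (0, 0) is a strong local min
```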
What is an optimization algorithm?
An optimization algorithm is one that evaluates a sequence of designs with the goal of finding the best.
Basic steps of an optimization algorithm (super abbreviated):
1) Starting point for design (x)
2) Information of state around x
3) Select subset of design space to search for improvement upon current point from state of x information
4) Search subset area, stop after sufficient improvement to “best” in subset
5) Move to this pt as “current state” - restart step 2
6) Iterate until improvement is “good enough”
What are three types of unconstrained optimization algorithms?
Direct search, line search, and trust region methods
Describe Direct Search Methods
- No derivatives used in search
- Search direction based on a sequence, pattern, or random choice
What are the types of Direct Search Methods?
- Grid & Random search - distribute pts, evaluate function and zoom to min(f) for finer discretization
- Random walk, compass search, and coordinate pattern search - use univariate directions (or random) to check function improvements, eval min(f), and move
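A compass (coordinate pattern) search can be sketched in a few lines: probe a +/- step along each coordinate, move to any improvement, and shrink the step when no probe improves. The objective, starting point, and step schedule below are illustrative assumptions.

```python
# Compass search sketch: no derivatives, univariate probe directions.
def f(x):
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2   # assumed example, min at (1, -2)

x = [0.0, 0.0]                # starting point
step = 1.0
while step > 1e-6:
    best = f(x)
    moved = False
    for i in range(len(x)):            # each coordinate direction
        for sign in (+1.0, -1.0):      # probe both ways
            trial = list(x)
            trial[i] += sign * step
            if f(trial) < best:        # accept any improvement
                x, best, moved = trial, f(trial), True
    if not moved:
        step *= 0.5                    # no improvement: refine the pattern
print(x)                               # close to the minimizer (1, -2)
```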
Describe Line Search Methods
- Typically uses derivatives to define search direction, but not always
- Search is conducted along successive lines in design space.
- Only allows for semi-definite or definite Hessians
“fix direction (p), compute step length (alpha)”
What are the orders of Line Search Methods? Explain each one
Zeroth Order:
- only uses function values f(x)
First Order:
- uses function value f(x), and gradient grad_f(x)
- may approximate hessian w/ first order methods
Second Order:
- uses f(x), grad_f(x), and Hessian H(x)
Describe Trust Region Methods.
- Search is conducted locally by approximating objective function as quadratic or linear
- “trust region” refers to how far from current point your model function is still a valid approximation
- Trust regions allow for indefinite or semi-definite Hessians (line search does not)
“fix distance (delta), choose direction (p)”
What are the two types of line search methods?
Exact and Inexact
What is an unconstrained optimization Merit Function? What is the relation to the objective function? What is its advantage?
The merit function is a 1D slice of high-dimensional space, approximating the objective function.
phi(alpha) = f(x_k + alpha*p)
min over alpha: phi(alpha)
phi(0) = f(x_k)
phi'(0) = dphi/dalpha at alpha = 0 = grad_f(x_k)^T p < 0
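The merit function's role as a 1D slice can be shown concretely; the objective f, the current point x_k, and the steepest-descent direction p below are assumed example choices.

```python
# Merit function phi(alpha) = f(x_k + alpha*p): a 1D slice of an n-D problem.
def f(x):
    return x[0]**2 + 4.0 * x[1]**2     # assumed example objective

x_k = [2.0, 1.0]                       # current iterate
grad = [2*x_k[0], 8*x_k[1]]            # grad f at x_k = (4, 8)
p = [-grad[0], -grad[1]]               # steepest-descent direction

def phi(alpha):
    return f([x_k[0] + alpha*p[0], x_k[1] + alpha*p[1]])

phi0 = phi(0.0)                        # equals f(x_k) = 8.0
slope0 = grad[0]*p[0] + grad[1]*p[1]   # phi'(0) = grad^T p = -80.0 < 0
print(phi0, slope0)                    # 8.0 -80.0
```

The negative slope at alpha = 0 confirms p is a descent direction, so minimizing phi over the scalar alpha improves the full objective.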
Why does unconstrained optimization require sufficient decrease and curvature conditions?
The simple decrease condition f(x_k + alpha*p_k) <= f(x_k) is not enough to guarantee convergence to a minimizer x*.
- It gives a relative decrease each iteration, but could still miss the local minimum (the decreases may shrink too fast)
- This is why we need the Wolfe conditions
What is the sufficient decrease condition?
Sufficient decrease, first Wolfe, or Armijo condition: requires that the merit function value at alpha lie below a line of slope c1*phi'(0) from the starting value:
phi(alpha) <= phi(0) + c1*alpha*phi'(0), with 0 < c1 < 1
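A backtracking line search that enforces the Armijo condition can be sketched as follows; the objective, starting point, c1, and the halving factor are conventional assumed choices.

```python
# Backtracking line search enforcing the Armijo (sufficient decrease)
# condition: f(x_k + alpha*p) <= f(x_k) + c1*alpha*grad_f(x_k)*p.
def f(x):
    return x**2                        # assumed 1D example

def grad_f(x):
    return 2.0 * x

x_k = 3.0
p = -grad_f(x_k)                       # descent direction (here -6.0)
c1, alpha = 1e-4, 1.0                  # c1 small, start with a full step
while f(x_k + alpha * p) > f(x_k) + c1 * alpha * grad_f(x_k) * p:
    alpha *= 0.5                       # backtrack until Armijo holds
x_next = x_k + alpha * p
print(alpha, x_next)                   # 0.5 0.0
```

Backtracking only enforces sufficient decrease; the second Wolfe (curvature) condition additionally rules out steps that are too short.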