CHAPTER 5: Duality Flashcards
Recall Lagrangian multipliers, which were used to find the critical points of a function subject to constraints given by equations.
Example: find the minimum of f(x1, x2) = x1^2 given that x1 and x2 satisfy x1 + x2 = 1. To solve the problem, you define a Lagrangian function
L = f(x1,x2)−λ(x1 + x2 −1),
where λ is the Lagrangian multiplier. The minimum of f (with the constraint) is one of the critical points of L, where ∂L/∂x_i = 0.
You will see that the definitions of the Lagrangian functions we introduce below are very similar to the one above, except that we will now be dealing with inequality constraints in general. It turns out that it is easier to see the key features of a dual problem from a more general point of view, without the details specific to LPPs. We therefore consider the following more general problem:
max z = f(x),    (5.1)
subject to g(x) ≤ 0, where g(x) = (g1(x), …, gm(x))^T ∈ R^m and x ∈ R^n.
We let x∗ be the feasible optimal solution for the problem, i.e., f(x∗) = max f(x) and g(x∗) ≤ 0, and let A denote the feasible region, i.e.,
A = {x ∈ R^n : g(x) ≤ 0}.
Note that constraints of the form xj ≥ 0 are already covered by the above problem, because xj ≥ 0 can be written as −xj ≤ 0 and we can take −xj as one of the components of g(x). Remark. In terms of the feasible region A, the problem can be written as max_{x∈A} f(x), where the constraints are not shown explicitly but are implicitly included in the domain over which we search for the maximum.
DEF 5.1
Lagrangian function, Lagrangian dual function
The Lagrangian function L(x,y) associated with the maximisation problem in Eq. 5.1 is defined as:
L(x,y) = f(x) − y^T g(x)
       = f(x) − Σ_{i=1}^m y_i g_i(x),
where y ∈ R^m, x ∈ R^n and y ≥ 0.
y is the vector of the dual variables.
(y is a vector of new variables that we introduce.)
For each constraint there is a corresponding dual variable.
The Lagrangian dual function is defined as:
v(y) = max_{x∈R^n} L(x,y), for y ≥ 0.
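To make the definition concrete, here is a minimal numerical sketch. The toy problem is my own choice, not taken from the notes: max f(x) = −x^2 subject to g(x) = 1 − x ≤ 0, so x∗ = 1 and f(x∗) = −1. For each fixed y ≥ 0, the inner maximisation over x gives v(y).

```python
from scipy.optimize import minimize_scalar

# Hypothetical toy problem: max f(x) = -x**2  subject to  g(x) = 1 - x <= 0.
f = lambda x: -x**2
g = lambda x: 1 - x

def v(y):
    # v(y) = max_x L(x, y); maximise L by minimising -L (L is concave in x here)
    res = minimize_scalar(lambda x: -(f(x) - y*g(x)))
    return -res.fun

for y in (0.0, 1.0, 2.0, 3.0):
    print(y, v(y))   # analytically v(y) = y**2/4 - y; smallest at y = 2, where v(2) = -1 = f(1)
```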
LAGRANGIAN FUNCTION: MOTIVATION
To solve the problem given by Eq. 5.1,
max z = f(x), subject to g(x) ≤ 0,
one may try to combine the constraint and the cost function into a single function, and somehow enforce the constraint directly. One way is to replace the problem by
max_{x∈R^n} l(x), where l(x) = f(x) + P(g(x))
(without constraints).
Here l(x) is the penalised cost function and P(·) is the penalty function:
- P(u) takes very large negative values when u > 0 (i.e., when a constraint is violated), and is nearly 0 when u ≤ 0.
- Hence P(g(x)) makes a very large negative contribution to l(x) when g(x) > 0.
- So the maximum of l(x) is never attained at an x where g(x) > 0.
- Thus the solution is the same as if the constraint g(x) ≤ 0 were imposed.
Since l(x) ≈ f(x) when g(x) ≤ 0, the solution would be essentially the same as that of the original constrained problem.
The Lagrangian function L(x,y) may be regarded as a poor, but much simpler, approximation of l(x), with P(g(x)) = −y^T g(x) for some y ≥ 0.
[Diagram: the penalty function P(u), just below 0 for u ≤ 0 and decreasing steeply for u > 0, together with its Lagrangian approximation, a dashed straight decreasing line through the origin.]
The Lagrangian dual function is the optimal value of this penalised cost function. Although the Lagrangian function would be a poor choice of penalised cost function if we were actually using the penalty method, it has the advantage of being simple, and it retains some key properties of the penalty method.
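The rough sketch below illustrates the penalty idea numerically on the same hypothetical toy problem as above; the quadratic penalty P(u) = −c·max(u, 0)^2 and the weights c are my own choices, not something prescribed in the notes.

```python
from scipy.optimize import minimize_scalar

# Hypothetical toy problem again: max f(x) = -x**2  subject to  g(x) = 1 - x <= 0.
f = lambda x: -x**2
g = lambda x: 1 - x

def penalised_maximiser(c):
    P = lambda u: -c * max(u, 0.0)**2        # ~0 for u <= 0, very negative for u > 0
    l = lambda x: f(x) + P(g(x))             # penalised cost function
    return minimize_scalar(lambda x: -l(x)).x

for c in (1, 10, 100, 1000):
    print(c, penalised_maximiser(c))   # tends to the constrained maximiser x* = 1 as c grows
```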
PROPOSITION 5.1
* usefulness of the dual problem: relation between v(y) and f(x)
v(y) ≥ f(x∗) ≥ f(x) for any y ≥ 0 and any x that is a feasible solution of the problem in Eq. 5.1 (i.e., x ∈ A).
* v(y) for any y ≥ 0 is an upper bound for f(x), so it provides useful information when the exact maximum of f(x) is difficult to find.
Proof
We show that v(y) ≥ f(x∗) ≥ f(x) for any y ≥ 0 and any feasible solution x of the problem in Eq. 5.1.
1) f(x∗) ≥ f(x) follows directly from the definition of x∗.
2) Note that v(y) = max_{x∈R^n} L(x,y) = max_{x∈R^n} [f(x) − y^T g(x)].
Since x∗ is feasible we have g(x∗) ≤ 0, hence −y^T g(x∗) ≥ 0 for y ≥ 0.
Thus v(y) = max_{x∈R^n} L(x,y) ≥ L(x∗,y) = f(x∗) − y^T g(x∗) ≥ f(x∗).
* the equality L(x∗,y) = f(x∗) − y^T g(x∗) is just the definition of L.
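A quick numerical spot check of the chain v(y) ≥ f(x∗) ≥ f(x), using the same hypothetical toy problem as in the earlier sketch (max −x^2 subject to 1 − x ≤ 0, so x∗ = 1 and f(x∗) = −1); the sampled y and x values are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: -x**2
g = lambda x: 1 - x
v = lambda y: -minimize_scalar(lambda x: -(f(x) - y*g(x))).fun   # Lagrangian dual function

f_star = f(1.0)                                   # optimal primal value, f(x*) = -1
rng = np.random.default_rng(0)
for y, x in zip(rng.uniform(0, 5, 4), rng.uniform(1, 5, 4)):
    # y is dual-feasible (y >= 0), x is primal-feasible (g(x) <= 0)
    print(f"v({y:.2f}) = {v(y):7.3f}  >=  f(x*) = {f_star:7.3f}  >=  f({x:.2f}) = {f(x):8.3f}")
```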
DEF 5.2
Can we find the minimum of v(y), and hence obtain the best upper bound for f(x)?
To answer this question, we need to solve the following problem: min v(y) subject to y ≥ 0. This is the dual problem of the primal problem in Eq. 5.1.
The dual problem of the primal problem in Eq. 5.1 is
min v(y), subject to y ≥ 0,
where v(y) is the Lagrangian dual function, i.e.,
v(y) = max_{x∈R^n} L(x,y),  L(x,y) = f(x) − y^T g(x),
and y is the vector of the dual variables.
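As a numerical sketch of what solving the dual means, the snippet below reuses the same hypothetical toy problem (max −x^2 subject to 1 − x ≤ 0): the inner optimisation computes v(y), and the outer one minimises it over y ≥ 0. This only illustrates the nested structure; it is not a recommended algorithm.

```python
from scipy.optimize import minimize, minimize_scalar

f = lambda x: -x**2
g = lambda x: 1 - x
v = lambda y: -minimize_scalar(lambda x: -(f(x) - y*g(x))).fun   # inner max over x

# Outer problem: min v(y) subject to y >= 0 (L-BFGS-B handles the bound).
res = minimize(lambda y: v(y[0]), x0=[1.0], bounds=[(0, None)])
print(res.x, res.fun)   # roughly y* ~ 2 and v(y*) ~ -1, matching the primal optimum f(1) = -1
```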
NOTES 5.2
For the dual problem of the primal problem in Eq. 5.1:
The Lagrangian dual function v(y) is always convex (or concave, if the primal problem is a minimisation problem), and the dual problem is a convex minimisation problem, which makes it easier to solve if we can write the problem down explicitly.
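A small symbolic check of the convexity claim for the hypothetical toy problem used above (my own verification, not part of the notes): the dual function works out to y^2/4 − y, whose second derivative is positive.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
L = -x**2 - y*(1 - x)                       # L(x, y) = f(x) - y*g(x) for the toy problem
x_star = sp.solve(sp.diff(L, x), x)[0]      # maximiser of L in x (L is concave in x)
v = sp.expand(L.subs(x, x_star))            # v(y) = y**2/4 - y
print(v, sp.diff(v, y, 2))                  # second derivative 1/2 > 0, so v is convex
```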
EXAMPLE 5.1
The primal problem is defined as:
max f(x) = 4+2x−x^2, s. t.
x ≤ 2.
Find the Lagrangian dual function of the problem.
Lagrangian function:
L(x,y) = f(x) − y^T g(x) = 4 + 2x − x^2 − y(x − 2).
For fixed y ≥ 0,
v(y) = max_{x∈R} L(x,y).
Setting ∂L/∂x = 2 − 2x − y = 0 gives x = (2 − y)/2, which is a maximum since ∂^2L/∂x^2 = −2 < 0. Substituting back,
v(y) = 4 + 2y + (2 − y)^2/4 = 5 + y + y^2/4, for y ≥ 0.
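The algebra above can be double-checked symbolically; the sketch below is my own verification of Example 5.1 with sympy.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
L = 4 + 2*x - x**2 - y*(x - 2)              # Lagrangian of Example 5.1
x_star = sp.solve(sp.diff(L, x), x)[0]      # stationary point x = 1 - y/2 = (2 - y)/2
v = sp.expand(L.subs(x, x_star))
print(x_star, v)                            # v(y) = y**2/4 + y + 5
```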
EXAMPLE 5.2
The primal problem is defined as: max f(x) = 4 + 2x − x^2, s. t.
x(3−x) ≤ 0.
Find the Lagrangian dual function of the problem.
Lagrangian function
L(x,y)=4+2x-x^2-yx(3-x)
we have
∂L/∂x = 2 − 2x − 3y + 2xy,
∂^2L/∂x^2 = −2 + 2y
If y = 1 then ∂L/∂x = −1 < 0 for all x, so L is decreasing in x and L → +∞ as x → −∞; therefore max_x L(x,y) = +∞ for y = 1. If y ≠ 1, setting ∂L/∂x = 0 gives the stationary point x = (3y − 2)/(2(y − 1)). This stationary point corresponds to a minimum if y > 1, since then ∂^2L/∂x^2 > 0, so again v(y) = max_x L(x,y) = +∞ for y > 1.
The stationary point corresponds to a maximum when y < 1 (∂^2L/∂x^2 < 0):
v(y) = L(x,y)|_{x=(3y−2)/(2(y−1))} = 4 − (3y − 2)^2/(4(y − 1)),
which is defined (and finite) for 0 ≤ y < 1.
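Again, a symbolic double check of Example 5.2 (my own verification sketch); note that sympy does not know about the restriction y < 1, which comes from the second-derivative argument above.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
L = 4 + 2*x - x**2 - y*x*(3 - x)            # Lagrangian of Example 5.2
x_star = sp.solve(sp.diff(L, x), x)[0]      # stationary point x = (3y - 2)/(2(y - 1))
v = sp.simplify(L.subs(x, x_star))
print(x_star, v)                            # equivalent to 4 - (3y - 2)**2/(4(y - 1)), the dual function on 0 <= y < 1
```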
DUAL FUNCTION AND PRIMAL FUNCTION
We briefly discuss the relation between the dual function and the primal function in this section.
PROP 5.2 WEAK DUALITY CONDITION
Let y* be the feasible optimal solution for the dual problem, i.e., v(y*) = min_{y≥0} v(y) and y* ≥ 0. Then v(y*) ≥ f(x*).
Proof: It follows from Proposition 5.1, which gives v(y) ≥ f(x*) for any y ≥ 0, in particular for y = y*.
DUALITY GAP
Remark. The weak duality condition is satisfied by any primal-dual pair. In general, v(y*) ≠ f(x*), and the difference v(y*) − f(x*) is called the duality gap.
DEF 5.3 STRONG DUALITY
We say the strong duality condition holds if v(y*) = f(x*), i.e., if the optimal costs for the primal and dual problems are the same.
EXAMPLE 5.3
It can be checked that the strong duality condition holds in both Example 5.1 and Example 5.2.
For the primal problem in Example 5.1 we find x* = 1 with f_max = 5, and for its dual, v(y) = 5 + y + y^2/4 is increasing on y ≥ 0, so y* = 0 with v_min = 5. For the primal problem in Example 5.2, the constraint is equivalent to x ≤ 0 or x ≥ 3, therefore f_max = f(0) = 4 with x* = 0. Letting dv/dy = 0 for its dual, we find y* = 2/3 and v_min = v(y*) = 4.
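The two dual minimisations can also be checked numerically; the snippet below is my own cross-check of Example 5.3, using the closed-form dual functions found in Examples 5.1 and 5.2.

```python
from scipy.optimize import minimize_scalar

# Dual functions from Examples 5.1 and 5.2 (the latter is only valid on 0 <= y < 1).
v1 = lambda y: 5 + y + y**2/4
v2 = lambda y: 4 - (3*y - 2)**2/(4*(y - 1))

print(minimize_scalar(v1, bounds=(0, 10), method='bounded').x)     # ~0   -> v_min = 5 = f_max of Example 5.1
print(minimize_scalar(v2, bounds=(0, 0.999), method='bounded').x)  # ~2/3 -> v_min = 4 = f_max of Example 5.2
```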