CHAPTER 5: Duality Flashcards

1
Q

The method of Lagrangian multipliers, which is used to find the critical points of a function subject to constraints given by equations.

A

Find the minimum of f(x_1, x_2) = x_1^2 given that x_1 and x_2 satisfy x_1 + x_2 = 1. To solve the problem, you define a Lagrangian function

L = f(x_1, x_2) − λ(x_1 + x_2 − 1),

where λ is the Lagrangian multiplier. The minimum of f (with the constraint) is one of the critical points of L, where ∂L/∂x_i = 0.
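As a quick check, here is a minimal sympy sketch (my own illustration, assuming the toy objective really is f = x_1^2 as written above) that finds the critical point of this Lagrangian:

import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
f = x1**2                       # toy objective from the card
g = x1 + x2 - 1                 # equality constraint x1 + x2 = 1
L = f - lam * g                 # Lagrangian L = f - lam*(x1 + x2 - 1)

# Critical points: dL/dx1 = dL/dx2 = 0 together with the constraint itself
sols = sp.solve([sp.diff(L, x1), sp.diff(L, x2), g], [x1, x2, lam], dict=True)
print(sols)                     # expect x1 = 0, x2 = 1, lam = 0, giving f = 0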

2
Q

You will see that the definitions of the Lagrangian functions we introduce below are very similar to the one above, except that we will now deal with inequality constraints in general. It turns out that it is easier to see the key features of a dual problem from a more general point of view, without the details related to LPPs. We therefore consider the following more general problem:

A

max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n.

We let x* be the feasible optimal solution for the problem, i.e., f(x*) = max f(x) and g(x*) ≤ 0, and A denotes the feasible region, i.e.,

A = {x ∈ R^n : g(x) ≤ 0}.

Note the constraint x_j ≥ 0 has been included in the above problem, because it can be written as −x_j ≤ 0 and we can take −x_j as one of the components of g(x). Remark. In terms of the feasible region A, the problem can be written as max_{x∈A} f(x), where the constraints are not shown explicitly, but have been implicitly included in the domain in which we search for the maximum.

3
Q

DEF 5.1

Lagrangian function, Lagrangian dual function

A

The Lagrangian function L(x,y) associated with the maximisation problem in Eq. 5.1 is defined as:

L(x, y) = f(x) − y^T g(x) = f(x) − ∑_{i=1}^{m} y_i g_i(x),

where y ∈ R^m, x ∈ R^n and y ≥ 0.
y is the vector of the dual variables (these are the new variables we introduce).

For each constraint there is a corresponding dual variable.

The Lagrangian dual function is defined as:
v(y) = max_{x∈R^n} L(x, y), for y ≥ 0.

4
Q

Lagrangian function motivation

A

To solve the problem given by

max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n,

one may try to combine the constraint and the cost function into a single function, and somehow enforce the constraint directly. One way is to replace the problem by

max_{x∈R^n} l(x), where l(x) = f(x) + P(g(x))
(with no constraints).

l(x) is the penalised cost function and P(·) is the penalty function:

  • P(u) takes very large negative values when u > 0 and is nearly 0 when u ≤ 0.
  • P(g(x)) therefore makes a very large negative contribution when g(x) > 0.
  • So the maximum of l(x) is never attained at an x where g(x) > 0.
  • Thus the solution is the same as if the constraint g(x) ≤ 0 were imposed.

Since l(x) ≈ f(x) when g(x) ≤ 0, the solution is essentially the same as that of the original constrained problem.

The Lagrangian function L(x, y) may be regarded as a poor, but much simpler, approximation of l(x), with P(g(x)) = −y^T g(x) for some y ≥ 0.

[Diagram: the ideal penalty P(u) stays just below 0 for u ≤ 0 and drops off steeply once u > 0; the Lagrangian term −yu is a dashed straight decreasing line through the origin, a rough approximation of P(u).]

The Lagrangian dual function is the optimal value of this penalised cost function. Though the Lagrangian function appears to be a poor choice of penalised cost function if we were using the penalty method, it has the advantage of being simple, and it retains some key properties of the penalty method.
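A minimal numerical sketch of the penalty idea (my own illustration, not from the notes), borrowing the problem of Example 5.2 further down: a very steep penalty P recovers the constrained maximiser from an unconstrained grid search.

import numpy as np

# max f(x) = 4 + 2x - x^2  subject to  g(x) = x(3 - x) <= 0  (see Example 5.2)
f = lambda x: 4 + 2*x - x**2
g = lambda x: x * (3 - x)

K = 1e4                                  # steepness of the penalty
P = lambda u: -K * np.maximum(u, 0.0)    # ~0 when u <= 0, hugely negative when u > 0

xs = np.linspace(-5, 8, 130001)          # brute-force grid for the unconstrained search
l = f(xs) + P(g(xs))                     # penalised cost l(x) = f(x) + P(g(x))
x_pen = xs[np.argmax(l)]
print(x_pen, f(x_pen))                   # expect x ~ 0 and f ~ 4, the constrained maximum

# The Lagrangian replaces P(g(x)) by the linear term -y*g(x): a much cruder
# approximation of the penalty, but far simpler to maximise analytically.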

5
Q

PROPOSITION 5.1

*usefulness of dual problem

v(y) and f(x)

A

v(y) ≥ f(x*) ≥ f(x) for y ≥ 0 and x being a feasible solution of

max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n

(i.e., x ∈ A).
* v(y) for any y ≥ 0 is an upper bound for f(x), so it provides useful information when the exact maximum of f(x) is difficult to find.

6
Q

Proof
v(y) ≥ f(x∗) ≥ f(x) for y≥0 and x being a feasible solution of

max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n

A

1) f(x*) ≥ f(x)
follows from the definition of x*.

2) Note v(y) = max_{x∈R^n} L(x, y) = max_{x∈R^n} [f(x) − y^T g(x)].

Since x* is feasible we have g(x*) ≤ 0, hence −y^T g(x*) ≥ 0 for y ≥ 0.

Thus v(y) = max_{x∈R^n} L(x, y) ≥ L(x*, y) = f(x*) − y^T g(x*) ≥ f(x*).

*def of L

7
Q

Def 5.2

can we find the minimum of v(y), hence obtain the best upper bound for f(x)?

To answer this question, we need to solve the following problem: min v(y) subject to y≥0. This problem is the dual problem of the problem
max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n.

A

The dual problem of the primal problem

max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n,

is

min v(y), subject to y ≥ 0,

where v(y) is the Lagrangian dual function, i.e.,

v(y) = max_{x∈R^n} L(x, y),

L(x, y) = f(x) − y^T g(x),

and y is the vector of the dual variables.
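A rough numerical sketch of what solving a dual problem involves (my own illustration): approximate v(y) by maximising L over a grid of x, then minimise v over a grid of y ≥ 0. The data are those of Example 5.1 below (f(x) = 4 + 2x − x^2, constraint x ≤ 2).

import numpy as np

f = lambda x: 4 + 2*x - x**2
g = lambda x: x - 2                      # constraint x <= 2 written as g(x) <= 0

xs = np.linspace(-10, 10, 4001)          # grid for the inner maximisation over x

def v(y):
    # Lagrangian dual function v(y) = max_x [ f(x) - y*g(x) ]
    return np.max(f(xs) - y * g(xs))

ys = np.linspace(0, 5, 501)              # grid for the outer minimisation over y >= 0
vy = np.array([v(y) for y in ys])
print(ys[np.argmin(vy)], vy.min())       # expect y* ~ 0 and min v ~ 5 = f(1) (see Example 5.3)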

8
Q

NOTES 5.2

the dual problem of the problem
max z = f(x),

subject to g(x) ≤ 0, where g(x) = (g_1(x), …, g_m(x))^T ∈ R^m, x ∈ R^n

A

The Lagrangian dual function v(y) is always convex (or concave, if the primal problem is a minimisation problem), and the dual problem is a convex minimisation problem, which makes it easier to solve if we can write down the problem explicitly.

9
Q

EXAMPLE 5.1
The primal problem is defined as:
max f(x) = 4+2x−x^2, s. t.

x ≤ 2.

Find the Lagrangian dual function of the problem.

A

Lagrangian function: the constraint x ≤ 2 is written as g(x) = x − 2 ≤ 0, so

L(x, y) = f(x) − y g(x) = 4 + 2x − x^2 − y(x − 2).

For fixed y, L is a concave quadratic in x; setting ∂L/∂x = 2 − 2x − y = 0 gives x = 1 − y/2. Hence

v(y) = max_{x∈R^n} L(x, y) = L(1 − y/2, y) = 5 + y + y^2/4, for y ≥ 0.
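A small sympy check of this derivation (my own sketch, not part of the notes):

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 4 + 2*x - x**2
g = x - 2                                  # constraint x <= 2 as g(x) <= 0
L = f - y*g

x_star = sp.solve(sp.diff(L, x), x)[0]     # maximiser of the concave quadratic L in x
v = sp.expand(L.subs(x, x_star))           # Lagrangian dual function
print(x_star, v)                           # expect x = 1 - y/2 and v = y**2/4 + y + 5
# Since dv/dy = 1 + y/2 > 0 for y >= 0, the dual minimum is v(0) = 5, attained at y = 0.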

10
Q
Example 5.2. The primal problem is defined as: 
max f(x) = 4 + 2x−x^2, s. t. 

x(3−x) ≤ 0.

Find the Lagrangian dual function of the problem.

A

Lagrangian function:
L(x, y) = 4 + 2x − x^2 − y x(3 − x).

We have
∂L/∂x = 2 − 2x − 3y + 2xy,
∂^2L/∂x^2 = −2 + 2y.

If y = 1 then ∂L/∂x = −1 < 0 for all x, so L is strictly decreasing in x and max_x L(x, y) = +∞ for y = 1. If y ≠ 1, setting ∂L/∂x = 0 gives the stationary point x = (3y − 2)/(2(y − 1)), which corresponds to a minimum if y > 1 (since then ∂^2L/∂x^2 > 0), i.e., v(y) = max_x L(x, y) = +∞ for y > 1.

The stationary point corresponds to a maximum when y < 1:
v(y) = L(x, y)|_{x=(3y−2)/(2(y−1))} = 4 − (3y − 2)^2 / (4(y − 1)),
which is finite for 0 ≤ y < 1.
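A sympy sketch (my own check) reproducing this dual function and the dual optimum quoted in Example 5.3:

import sympy as sp

x, y = sp.symbols('x y', real=True)
L = 4 + 2*x - x**2 - y * x * (3 - x)

x_star = sp.solve(sp.diff(L, x), x)[0]     # stationary point, valid for y != 1
v = sp.simplify(L.subs(x, x_star))         # algebraically equal to 4 - (3y-2)^2 / (4(y-1))
print(v)

print(sp.solve(sp.diff(v, y), y))          # expect [2/3, 4/3]; only y = 2/3 lies in 0 <= y < 1
print(v.subs(y, sp.Rational(2, 3)))        # expect 4 = f(0), so the duality gap is zero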

11
Q

dual function

primal function

A

We discuss briefly the relation between the dual function and the primal function in this section.

12
Q

PROP 5.2 WEAK DUALITY CONDITION

A

Let y* be the feasible optimal solution for the dual problem, i.e., v(y*) = min_{y≥0} v(y) and y* ≥ 0. Then v(y*) ≥ f(x*).

Proof: It follows from Proposition 5.1, which gives v(y) ≥ f(x*) for any y ≥ 0; in particular this holds for y = y*.

13
Q

DUALITY GAP

A

Remark. The weak duality condition is satisfied by any primal-dual pair. In general, v(y*) ≠ f(x*), and the difference v(y*) − f(x*) is called the duality gap.

14
Q

DEF 5.3 STRONG DUALITY

A

We say the strong duality condition holds if v(y*) = f(x*), i.e., if the optimal costs for the primal and dual problems are the same.

15
Q

EXAMPLE 5.3

It can be checked that the strong duality condition holds in both Examples 5.1 and 5.2.

A

For Example 5.1, we find x* = 1 with f_max = 5 and y* = 0 with v_min = 5. For the primal problem in Example 5.2, the constraint is equivalent to x ≤ 0 or x ≥ 3. Therefore we find f_max = f(0) = 4 with x* = 0. Setting dv/dy = 0, we find y* = 2/3 and v_min = v(y*) = 4.

16
Q

THEOREM 5.1 COMPLEMENTARY SLACKNESS

A

If the strong duality condition holds, then y*_i g_i(x*) = 0 for all i. These equations are called the complementary slackness conditions.

17
Q

PROOF

THEOREM 5.1 COMPLEMENTARY SLACKNESS

A
From the proof of Proposition 5.1, we have

v(y) = max_x L(x, y) ≥ L(x*, y) = f(x*) − y^T g(x*) ≥ f(x*),

where the last step follows from the fact that −y^T g(x*) ≥ 0, since y ≥ 0 and g(x*) ≤ 0 due to the feasibility of x*. The above expression is true for any y ≥ 0, so it is also true for y*. Hence

v(y*) ≥ f(x*) − y*^T g(x*) ≥ f(x*).

By strong duality, we have f(x*) = v(y*). This means

f(x*) ≥ f(x*) − y*^T g(x*) ≥ f(x*).

Thus y*^T g(x*) = 0. We also have y*_i g_i(x*) ≤ 0 for any i, since y*_i ≥ 0 and g_i(x*) ≤ 0. Therefore, for y*^T g(x*) = 0, we need y*_i g_i(x*) = 0 for all i.

Remark. Consider a maximisation primal problem, and assume the strong duality condition holds. We can show that x = x* is a maximum of L(x, y*) (as a function of x). To see this, we note

L(x, y*) = f(x) − y*^T g(x) ≤ max_{x∈R^n} L(x, y*) = v(y*) = f(x*).

On the other hand, L(x*, y*) = f(x*), since y*^T g(x*) = 0. Therefore we have L(x, y*) ≤ L(x*, y*), which means x* is a maximum of L(x, y*).

As a result, the gradient of L(x, y*) with respect to x must be zero at x*, i.e., ∂L(x, y*)/∂x_i |_{x=x*} = 0 for i = 1, 2, …, n. Substituting in the expression for L(x, y*), we have

∂f(x)/∂x_i |_{x=x*} − ∑_{j=1}^{m} y*_j ∂g_j(x)/∂x_i |_{x=x*} = 0.

18
Q

THEOREM 5.2 KKT CONDITIONS

narrows down where the optimal solution should be

WE WILL APPLY THIS IN EXAM

A

Let x* and y* be the optimal solutions for the primal and dual problems, respectively.

If the strong duality condition holds, then x* and y* satisfy the following Karush-Kuhn-Tucker (KKT) conditions:
∂L(x, y*)/∂x_i |_{x=x*} ≡ ∂f(x)/∂x_i |_{x=x*} − ∑_{j=1}^{m} y*_j ∂g_j(x)/∂x_i |_{x=x*} = 0   (i = 1, 2, …, n),

g_j(x*) ≤ 0,  y*_j ≥ 0,  y*_j g_j(x*) = 0   (j = 1, 2, …, m).

19
Q

THEOREM 5.2 KKT CONDITIONS

NOTES

A

The last condition in the KKT conditions is the complementary slackness condition. The first equation should be familiar to you, because the same condition is used in the method of Lagrangian multipliers, which finds the critical points of a function subject to (equality) constraints.

The above theorem shows that we can find x* and y* by solving the equations in the above KKT conditions. It forms the basis of a whole suite of new methods that complement the simplex methods. We will expand a bit more on this later.

20
Q

Duality for a minimisation primal problem

DEF 5.4 DUALITY FOR MINIMISATION PROBLEM

A

For a minimisation problem min f(x) subject to g(x) ≤ 0, the Lagrangian function is defined as
L(x, y) = f(x) + y^T g(x)
with y ≥ 0.

The second term is always non-positive, so that L(x, y) provides a lower bound for f(x). The Lagrangian dual function is defined as
v(y) = min_{x∈R^n} L(x, y)
and the dual problem is:
max v(y), with y ≥ 0

In this case f(x) ≥ f(x*) ≥ v(y) for any feasible x and any y ≥ 0, and the weak duality condition is f(x*) ≥ v(y*). The negative sign in the first of the KKT conditions is changed to +; the rest remain the same.
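A tiny worked check (my own hypothetical example, not from the notes): min f(x) = x^2 subject to 1 − x ≤ 0, i.e. x ≥ 1. Then L(x, y) = x^2 + y(1 − x), v(y) = min_x L = y − y^2/4, and max_{y≥0} v(y) = v(2) = 1 = f(1), so strong duality holds here.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2
g = 1 - x                                # constraint x >= 1 written as g(x) <= 0
L = f + y*g                              # note the + sign for a minimisation primal

x_star = sp.solve(sp.diff(L, x), x)[0]   # x = y/2 minimises the convex quadratic L
v = sp.expand(L.subs(x, x_star))         # v(y) = y - y**2/4
y_star = sp.solve(sp.diff(v, y), y)[0]   # y = 2
print(v, y_star, v.subs(y, y_star))      # dual optimum v(2) = 1 = f(1): strong duality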

21
Q

MEMORIZE KKT

A

If the definitions appear difficult to memorize, it may help to remember that the Lagrangian functions are penalized cost functions; the added terms penalize the costs when the constraints are not satisfied. For minimisation problems, the added term is y^T g(x) exactly because it is penalizing: it is positive when g > 0 (i.e., when the constraint is not satisfied), so that L would be bigger than z (hence penalized).

22
Q

The dual problems of LPPs

A

When the primal problem is an LPP, the above general definition leads to a rather simple expression for the dual problem, which is consistent with the dietician-salesman example we described before. In this sub-section, the LPPs are given in their original formulation, where only decision variables are present, unless specified otherwise explicitly.

23
Q

EXAMPLE 5.4
(The dual of a maximisation LPP with ≤ constraints)

Find the dual problem for the following
primal LPP:
max z = c^Tx, subject to Ax ≤ b, x ≥ 0,

where b ∈ R^m and x ∈ R^n

Here x does NOT include the slack variables or artificial variables. We assume
there are n decision variables x1, x2, …, xn and m constraints.

A

The constraints are Ax ≤ b, i.e., Ax − b ≤ 0, and −x ≤ 0, and the cost function is f(x) = c^Tx.
We introduce dual variables y ≥ 0 corresponding to the constraints Ax − b ≤ 0, and w ≥ 0 corresponding to −x ≤ 0, respectively. The Lagrangian function can be written as
L(x, y, w) = c^Tx − y^T(Ax − b) − w^T(−x)
= y^Tb + (c^T + w^T − y^TA)x.

The Lagrangian dual function is v(y, w) = max_{x∈R^n} L(x, y, w) = y^Tb + max_{x∈R^n} [(c^T + w^T − y^TA)x].

The inner maximum is finite (and equal to 0) only when c^T + w^T − y^TA = 0, i.e., A^Ty − w = c; since w ≥ 0, this amounts to A^Ty ≥ c. Hence the dual problem reduces to min b^Ty subject to A^Ty ≥ c, y ≥ 0, as quoted in the later cards.
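A numerical sketch with made-up data (my own, not from the notes): solve a small primal LPP and its dual with scipy and check strong duality and complementary slackness. Note that scipy's linprog minimises, so the maximisation primal is passed as min −c^T x.

import numpy as np
from scipy.optimize import linprog

# Made-up data: max z = c^T x  s.t.  Ax <= b, x >= 0
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 6.0])

# Primal (negate c because linprog minimises)
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual: min b^T y  s.t.  A^T y >= c, y >= 0  (written as -A^T y <= -c)
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print(-primal.fun, dual.fun)         # optimal values agree (strong duality): 10 and 10
print(primal.x, dual.x)              # x* = (2, 2), y* = (1, 1)

# Complementary slackness: y*_i s*_i = 0 with primal slacks s* = b - A x*
print(dual.x * (b - A @ primal.x))   # expect (numerically) zeros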

24
Q

PROPOSITION 5.3 dual of the dual LPP

A

The dual of the dual LPP is the same as the original primal LPP.

25
Q

We now present some properties of the dual LPP in relation to the primal LPP. We will assume the primal problem is the maximisation LPP shown in the above example. As we have found there, the dual LPP is

min v(y) = b^Ty, 
subject to A^Ty ≥ c, y ≥ 0.

Note some of the following results are valid only for the above primal and dual problems. Nevertheless, similar results can be stated for other LPPs (e.g., an LPP with ≥ constraints). See the example sheet on duality for more examples.

A

There are m dual variables, y_1, y_2, …, y_m, corresponding to the m constraints in the primal LPP.

The dual LPP has n constraints.

Assume the optimal solutions for the two problems are x* and y*, respectively.

In canonical form the constraints of the primal LPP are given by Āx = b, where Ā := [A, I] and x now also includes the slack variables.

Obviously, the primal and dual LPPs will satisfy the weak duality condition. However, we can show that the strong duality condition is also true for feasible LPPs.

26
Q

PROPOSITION 5.4

STRONG DUALITY CONDITIONS FOR LPPS

A

It can be proved that the optimal solution y* for the dual LPP in Eq. 5.3 is given by y* = B^{−T} c̄_B, and that the strong duality condition v(y*) = z(x*) holds. Here, B is the optimal basic matrix of the primal problem given by Eq. 5.2, defined by the suitable columns of the (extended) coefficient matrix Ā in the canonical form of the primal problem, c̄_B is the vector of the corresponding cost coefficients, and x* = B^{−1}b is the optimal solution of the primal LPP.

27
Q

PROPOSITION 5.5 optimal sols and shadow costs of primals

A

The optimal solution y* for the dual LPP is the same as the vector of shadow costs of the primal LPP. (As a consequence, the shadow cost of a constraint is also called the dual price of the corresponding resource.)

28
Q

Remark. By Propositions 5.5 and 3.1, we know that

A

The optimal solution of a dual variable = the shadow cost of the corresponding constraint = the optimal reduced cost of the slack variable for the corresponding constraint.

Optimal dual variables 
= 
optimal reduced costs for slack variables 
=
shadow costs for the constraints

AND
From the proof of 5.4 we see that: The optimality condition for the primal LPP ⇔
the feasibility condition for the dual LPP.

29
Q

PROPOSITION 5.6

COmplementary SLackness conditions

A

The optimal solutions for the primal and the dual LPPs satisfy the following two sets of complementary slackness conditions:

y*_i [∑_{j=1}^{n} a_ij x*_j − b_i] = 0,
x*_j [∑_{i=1}^{m} y*_i a_ij − c_j] = 0,   (5.4)

for i = 1, 2, …, m and j = 1, 2, …, n.

The conditions can also be written as

y*_i s*_i = 0,  x*_j r*_j = 0,  for i = 1, 2, …, m, j = 1, 2, …, n,   (5.5)

where s*_i is the slack of the ith constraint of the primal problem at the optimal solution, and r*_j is the optimal reduced cost for x_j.

30
Q

PROPOSITION 5.6
COmplementary SLackness conditions

NOTES

A

The first equation in Eq. 5.4 is the direct result of Theorem 5.1 applied to the primal problem. The second equation in Eq. 5.4 is the result of applying the theorem to the dual problem, where we note that, because of Proposition 5.3, x_j is the dual variable for the jth constraint of the dual problem, and ∑_{i=1}^{m} y_i a_ij − c_j ≥ 0 is the jth constraint of the dual problem.

Eq. 5.5 follows from two observations. First, by definition s*_i = b_i − ∑_{j=1}^{n} a_ij x*_j, therefore the first equation follows immediately from the first equation in Eq. 5.4. The second equation in Eq. 5.5 follows from the fact that ∑_{i=1}^{m} y*_i a_ij − c_j = r*_j.

The complementary slackness conditions are a mathematical way to summarize a few properties of the optimal solution of an LPP that we have described before. The 'long form' of the conditions, given in Eq. 5.4, contains the same information as the 'short form', given in Eq. 5.5. However, the 'short form' is given in terms of variables directly available in simplex calculations, therefore it is more useful.

The first equation in Eq. 5.5 says that if s*_i > 0 then y*_i must be zero, which means: if the ith constraint is non-binding, then its shadow cost must be zero. On the other hand, if y*_i > 0 then s*_i = 0, which means: if the shadow cost of a constraint is non-zero, then the constraint must be binding.

The second equation in Eq. 5.5 says that if x*_j > 0 then r*_j = 0, and if r*_j > 0 then x*_j = 0. It means: if a variable is basic, then its reduced cost must be zero, and if the reduced cost of a variable is positive, then the variable must be non-basic. This is the optimality condition we introduced before.

31
Q

EXAMPLE 5.5
As the KKT conditions are a generalisation of the method of Lagrangian multipliers, we should be able to use
them to find the optimal solution of an LPP. This is actually true and many algorithms are indeed based on the
KKT conditions.

Use the KKT conditions to find the optimal solution for the primal problem in Example 5.2:
max f(x) = 4 + 2x − x^2
subject to x(3 − x) ≤ 0. Note that, as stated in Example 5.3, the strong duality condition holds for this problem. Therefore the KKT conditions can be applied.

A

Solution. L(x, y) = 4 + 2x − x^2 − y x(3 − x).

The KKT conditions become
∂L/∂x = 2 − 2x − 3y + 2xy = 0,

yx(3 − x) = 0, x(3 − x) ≤ 0, y ≥ 0.

Solving the first two simultaneous equations gives three solutions:
(a) y = 0, x = 1, which does not satisfy the third inequality;

(b) x = 0, y = 2/3, which is feasible and gives f(x) = 4; and

(c) x = 3, y = 4/3, which
is also feasible but gives f(x) = 1.

Since solution (b) gives a bigger f(x), it is the optimal solution. That is, x* = 0 with f_max = f(0) = 4 and optimal dual variable y* = 2/3. The result is the same as the result shown in Example 5.3.

Remark. In this example, we used the known solution in Example 5.3 to show that the strong duality condition holds, hence the KKT conditions are applicable. Then we used the conditions to find the solution again! This seems a bit pointless. Fortunately, this is not necessary in reality. For many problems, there are simple methods available (no solution needed) to check if the KKT conditions are applicable.
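A sympy sketch (my own) that sets up this KKT system mechanically and then keeps the feasible candidate with the largest objective value:

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 4 + 2*x - x**2
g = x * (3 - x)                        # constraint g(x) <= 0
L = f - y*g

# Stationarity dL/dx = 0 together with complementary slackness y*g = 0
candidates = sp.solve([sp.diff(L, x), y * g], [x, y], dict=True)
print(candidates)                      # (x, y) = (1, 0), (0, 2/3), (3, 4/3)

feasible = [s for s in candidates if g.subs(s) <= 0 and s[y] >= 0]
best = max(feasible, key=lambda s: f.subs(s))
print(best, f.subs(best))              # expect x = 0, y = 2/3, f = 4, as in Example 5.3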

32
Q

The KKT system for an LPP

We consider the usual maximisation LPP: max z = c^Tx, s.t. Ax ≤ b and x ≥ 0. We assume x ∈ R^n, b ∈ R^m, A ∈ R^{m×n},

i.e., there are n decision variables and m constraints.

not examinable

A

The Lagrangian function for the LPP is
L(x, y, w) = c^Tx − y^T(Ax − b) − w^T(−x)
= y^Tb + (c^T + w^T − y^TA)x,

where y (≥ 0) ∈ R^m_+ and w (≥ 0) ∈ R^n_+. The first KKT condition says ∂L/∂x_i = 0 for i = 1, 2, …, n, which gives
c_i − ∑_{j=1}^{m} y_j a_ji + w_i = 0   (i = 1, 2, …, n),
which in matrix form is
[ A^Ty − w = c ].

The complementary slackness conditions (which are among the KKT conditions) give
y_j [(Ax)_j − b_j] = 0 (j = 1, 2, …, m) and
w_i x_i = 0 (i = 1, 2, …, n).

We introduce slack variables s_j ≥ 0, such that s_j = b_j − (Ax)_j. We then have
[ Ax + s = b,
y_j s_j = 0 (j = 1, 2, …, m),
w_i x_i = 0 (i = 1, 2, …, n) ].

The equations in the boxes are the full system of equations derived from the KKT conditions. We see that
the KKT system consists of the primal and the dual LPPs (both in canonical form) and the complementary
slackness conditions.

33
Q

The KKT system for an LPP

We consider the usual maximisation LPP: max z = c^Tx, s.t. Ax ≤ b and x ≥ 0. We assume x ∈ R^n, b ∈ R^m, A ∈ R^{m×n},

i.e., there are n decision variables and m constraints.

NOTES
not examinable

A

There are 2(n + m) unknowns in the system: x, y, w, and s, and the same number of equations. Therefore there is (usually) a unique solution to the system. According to the KKT conditions (i.e., Theorem 5.2), the solution gives the optimal solutions x* and y* for the LPP and its dual LPP, respectively. Thus, in the end the problem is reduced to a (more or less) routine problem of finding the unique solution of a set of (nearly linear) equations, for which many generic as well as specialised algorithms are available.

As a side note, the above KKT system reminds us that LPPs are not as eccentric as might have been suggested by the simplex methods (which have some very special features).

34
Q

INTERIOR POINT METHODS

We now very briefly introduce the interior point methods. The purpose is two-fold: to give you some ideas of another type of methods, and to show another application of the KKT conditions.

not examinable

A

The simplex methods search for the optimal solution along the edges of the feasible region. Although in practice they have been very efficient, it is still natural to ask if it might be more efficient to approach the optimal solution directly from the interior of the feasible region. The interior point methods are a class of iterative algorithms motivated by this line of thinking.

The idea is to use a "barrier function". The barrier function is added to the cost function and acts like a 'penalty' term: it penalises the cost function when the solution gets close to the boundary of the feasible region. The search algorithm thus automatically steers away from the boundary, as if there were a natural barrier (hence the name 'barrier function').
……
not included
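A minimal log-barrier sketch (my own illustration, with made-up data; not the algorithm from the notes): for max c^T x s.t. Ax ≤ b, maximise c^T x + μ ∑ log(b − Ax) from a strictly interior point while shrinking μ, so the iterates approach the optimal vertex from inside the feasible region.

import numpy as np
from scipy.optimize import minimize

# Made-up LPP: max 3x1 + 2x2  s.t.  x1 + x2 <= 4, 2x1 + x2 <= 6, x >= 0
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0], [2.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])   # x >= 0 folded into Ax <= b
b = np.array([4.0, 6.0, 0.0, 0.0])

def barrier_obj(x, mu):
    slack = b - A @ x
    if np.any(slack <= 0):             # on or outside the boundary: reject
        return np.inf
    # minimise the negative of  c^T x + mu * sum(log(slack))
    return -(c @ x + mu * np.sum(np.log(slack)))

x = np.array([0.5, 0.5])               # strictly interior starting point
for mu in [1.0, 0.1, 0.01, 0.001]:
    x = minimize(barrier_obj, x, args=(mu,), method='Nelder-Mead').x
    print(mu, x)                       # iterates approach the optimal vertex (2, 2) from inside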

35
Q

What's going to be on the exam?

A

*past papers:
topics covered are similar

4 questions:
1 SIMPLEX algorithm
2 Modelling
3 matrix formulations and duality
4 post optimality analysis

Duality and the last chapter: Lagrangian function, Lagrangian dual function, KKT conditions.