Convex Optimization_Free Document Search and Download

Page 1

Convex Optimization
Stephen Boyd Lieven Vandenberghe
Revised slides by Stephen Boyd, Lieven Vandenberghe, and Parth Nobel

Page 2

5. Duality

Page 3

Outline
Lagrangian and dual function
Lagrange dual problem KKT conditions Sensitivity analysis Problem reformulations Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.1

Page 4

Lagrangian
▶ standard form problem (not necessarily convex)
minimize f0(x) subject to fi(x) ≤ 0, i = 1,..., m hi(x) = 0, i = 1,..., p variable x ∈ Rn, domain D, optimal value p★
▶ Lagrangian: L : Rn × Rm × Rp → R, with dom L = D × Rm × Rp,
L(x, , ) = f0(x) +
m
∑︁
i=1
ifi(x) +
p
∑︁
i=1
ihi(x)
– weighted sum of objective and constraint functions – i is Lagrange multiplier associated with fi(x) ≤ 0 – i is Lagrange multiplier associated with hi(x) = 0
Convex Optimization Boyd and Vandenberghe 5.2

Page 5

Lagrange dual function
▶ Lagrange dual function: g : Rm × Rp → R,
g( , ) = inf
x∈D
L(x, , ) = inf
x∈D
( f0(x) +
m
∑︁
i=1
ifi(x) +
p
∑︁
i=1
ihi(x) )
▶ g is concave, can be −∞ for some , ▶ lower bound property: if ⪰ 0, then g( , ) ≤ p★ ▶ proof: if ˜x is feasible and ⪰ 0, then
f0(˜x) ≥ L(˜x, , ) ≥ inf
x∈D
L(x, , ) = g( , ) minimizing over all feasible ˜x gives p★ ≥ g( , )
Convex Optimization Boyd and Vandenberghe 5.3

Page 6

Least-norm solution of linear equations
minimize xTx subject to Ax = b
▶ Lagrangian is L(x, ) = xTx + T (Ax − b) ▶ to minimize L over x, set gradient equal to zero:
∇xL(x, ) = 2x + AT = 0 =⇒ x = −(1/2)AT
▶ plug x into L to obtain
g( ) = L((−1/2)AT , ) = − 1 4
TAAT
− bT
▶ lower bound property: p★ ≥ −(1/4) TAAT
− bT for all
Convex Optimization Boyd and Vandenberghe 5.4

Page 7

Standard form LP
minimize cTx subject to Ax = b, x ⪰ 0
▶ Lagrangian is
L(x, , ) = cTx + T (Ax − b) − Tx = −bT + (c + AT − )
Tx
▶ L is affine in x, so
g( , ) = inf
x
L(x, , ) = { −bT AT − + c = 0 −∞ otherwise
▶ g is linear on affine domain {( , ) | AT
− + c = 0}, hence concave
▶ lower bound property: p★ ≥ −bT
if AT + c ⪰ 0
Convex Optimization Boyd and Vandenberghe 5.5

Page 8

Equality constrained norm minimization
minimize ∥x∥ subject to Ax = b
▶ dual function is
g( ) = inf
x
(∥x∥ − TAx + bT ) = { bT ∥AT ∥∗ ≤ 1 −∞ otherwise where ∥v∥∗ = sup∥u∥≤1 uTv is dual norm of ∥·∥
▶ lower bound property: p★ ≥ bT
if ∥AT ∥∗ ≤ 1
Convex Optimization Boyd and Vandenberghe 5.6

Page 9

Two-way partitioning
minimize xTWx subject to x2
i= 1, i = 1,..., n
▶ a nonconvex problem; feasible set contains 2n discrete points ▶ interpretation: partition {1,..., n} in two sets encoded as xi = 1 and xi = −1 ▶ Wij is cost of assigning i, j to the same set; −Wij is cost of assigning to different sets ▶ dual function is
g( ) = inf
x
( xTWx + ∑︁
i
i(x2
i− 1)
) = inf
x
xT (W + diag( )) x−1T = { −1T W + diag( ) ⪰ 0 −∞ otherwise
▶ lower bound property: p★ ≥ −1T
if W + diag( ) ⪰ 0
Convex Optimization Boyd and Vandenberghe 5.7

Page 10

Lagrange dual and conjugate function
minimize f0(x) subject to Ax ⪯ b, Cx = d
▶ dual function
g( , ) = inf
x∈dom f0
( f0(x)+(AT + CT )
Tx − bT
− dT ) = −f∗
0
(−AT − CT ) − bT − dT where f∗(y) = supx∈dom f (yTx − f (x)) is conjugate of f0
▶ simplifies derivation of dual if conjugate of f0 is known ▶ example: entropy maximization
f0(x) =
n
∑︁
i=1
xi log xi, f∗
0
(y) =
n
∑︁
i=1
eyi−1
Convex Optimization Boyd and Vandenberghe 5.8

Page 11

Outline
Lagrangian and dual function
Lagrange dual problem
KKT conditions Sensitivity analysis Problem reformulations Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.9

Page 12

The Lagrange dual problem
(Lagrange) dual problem maximize g( , ) subject to ⪰ 0
▶ finds best lower bound on p★, obtained from Lagrange dual function ▶ a convex optimization problem, even if original primal problem is not ▶ dual optimal value denoted d★ ▶ , are dual feasible if ⪰ 0, ( , ) ∈ dom g ▶ often simplified by making implicit constraint ( , ) ∈ dom g explicit
Convex Optimization Boyd and Vandenberghe 5.10

Page 13

Example: standard form LP
(see page 5.5)
▶ primal standard form LP:
minimize cTx subject to Ax = b x ⪰ 0
▶ dual problem is
maximize g( , ) subject to ⪰ 0 with g( , ) = −bT if AT − + c = 0, −∞ otherwise
▶ make implicit constraint explicit, and eliminate to obtain (transformed) dual problem
maximize −bT subject to AT + c ⪰ 0
Convex Optimization Boyd and Vandenberghe 5.11

Page 14

Weak and strong duality
weak duality: d★ ≤ p★
▶ always holds (for convex and nonconvex problems) ▶ can be used to find nontrivial lower bounds for difficult problems, e.g., solving the SDP
maximize −1T subject to W + diag( ) ⪰ 0 gives a lower bound for the two-way partitioning problem on page 5.7 strong duality: d★ = p★
▶ does not hold in general ▶ (usually) holds for convex problems ▶ conditions that guarantee strong duality in convex problems are called constraint
qualifications
Convex Optimization Boyd and Vandenberghe 5.12

Page 15

Slater’s constraint qualification
strong duality holds for a convex problem minimize f0(x) subject to fi(x) ≤ 0, i = 1,..., m Ax = b if it is strictly feasible, i.e., there is an x ∈ int D with fi(x) < 0, i = 1,..., m, Ax = b
▶ also guarantees that the dual optimum is attained (if p★ > −∞) ▶ can be sharpened: e.g.,
– can replace int D with relint D (interior relative to affine hull) – linear inequalities do not need to hold with strict inequality
▶ there are many other types of constraint qualifications
Convex Optimization Boyd and Vandenberghe 5.13

Page 16

Inequality form LP
primal problem minimize cTx subject to Ax ⪯ b dual function g( ) = inf
x
( (c + AT )
Tx − bT
) = { −bT AT + c = 0 −∞ otherwise dual problem maximize −bT subject to AT + c = 0, ⪰ 0
▶ from the sharpened Slater’s condition: p★ = d★ if the primal problem is feasible ▶ in fact, p★ = d★ except when primal and dual are both infeasible
Convex Optimization Boyd and Vandenberghe 5.14

Page 17

Quadratic program
primal problem (assume P ∈ Sn
++)
minimize xTPx subject to Ax ⪯ b dual function g( ) = inf
x
( xTPx + T (Ax − b) ) = − 1 4
TAP−1AT
− bT dual problem maximize −(1/4) TAP−1AT − bT subject to ⪰ 0
▶ from the sharpened Slater’s condition: p★ = d★ if the primal problem is feasible ▶ in fact, p★ = d★ always
Convex Optimization Boyd and Vandenberghe 5.15

Page 18

Geometric interpretation
▶ for simplicity, consider problem with one constraint f1(x) ≤ 0 ▶ G = {(f1(x), f0(x)) | x ∈ D} is set of achievable (constraint, objective) values ▶ interpretation of dual function: g( ) = inf(u,t)∈G (t + u)
G p★ g() u + t = g() t u G p★ d★ t u
▶ u + t = g( ) is (non-vertical) supporting hyperplane to G ▶ hyperplane intersects t-axis at t = g( )
Convex Optimization Boyd and Vandenberghe 5.16

Page 19

Epigraph variation
▶ same with G replaced with A = {(u, t) | f1(x) ≤ u, f0(x) ≤ t for some x ∈ D}
A p★ g() u + t = g() t u
▶ strong duality holds if there is a non-vertical supporting hyperplane to A at (0, p★) ▶ for convex problem, A is convex, hence has supporting hyperplane at (0, p★) ▶ Slater’s condition: if there exist (˜u,˜t)∈A with ˜u < 0, then supporting hyperplane at (0, p★)
must be non-vertical
Convex Optimization Boyd and Vandenberghe 5.17

Page 20

Outline
Lagrangian and dual function Lagrange dual problem
KKT conditions
Sensitivity analysis Problem reformulations Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.18

Page 21

Complementary slackness
▶ assume strong duality holds, x★ is primal optimal, ( ★, ★) is dual optimal
f0(x★) = g( ★ ,
★) = inf x
( f0(x) +
m
∑︁
i=1

★ i fi(x) + p
∑︁
i=1

★ i hi(x)
) ≤ f0(x★) +
m
∑︁
i=1

★ i fi(x★) + p
∑︁
i=1

★ i hi(x★)
≤ f0(x★)
▶ hence, the two inequalities hold with equality ▶ x★ minimizes L(x, ★, ★) ▶ ★
i
fi(x★) = 0 for i = 1,..., m (known as complementary slackness):
★ i > 0 =⇒ fi(x★) = 0,
fi(x★) < 0 =⇒ ★
i = 0
Convex Optimization Boyd and Vandenberghe 5.19

Page 22

Karush-Kuhn-Tucker (KKT) conditions
the KKT conditions (for a problem with differentiable fi, hi) are
1. primal constraints: fi(x) ≤ 0, i = 1,..., m, hi(x) = 0, i = 1,..., p 2. dual constraints: ⪰ 0 3. complementary slackness: ifi(x) = 0, i = 1,..., m 4. gradient of Lagrangian with respect to x vanishes:
∇f0(x) +
m
∑︁
i=1
i∇fi(x) +
p
∑︁
i=1
i∇hi(x) = 0 if strong duality holds and x, , are optimal, they satisfy the KKT conditions
Convex Optimization Boyd and Vandenberghe 5.20

Page 23

KKT conditions for convex problem
if ˜x, ˜ , ˜ satisfy KKT for a convex problem, then they are optimal:
▶ from complementary slackness: f0(˜x) = L(˜x, ˜ , ˜ ) ▶ from 4th condition (and convexity): g( ˜ , ˜ ) = L(˜x, ˜ , ˜ )
hence, f0(˜x) = g( ˜ , ˜ ) if Slater’s condition is satisfied, then x is optimal if and only if there exist , that satisfy KKT conditions
▶ recall that Slater implies strong duality, and dual optimum is attained ▶ generalizes optimality condition ∇f0(x) = 0 for unconstrained problem
Convex Optimization Boyd and Vandenberghe 5.21

Page 24

Outline
Lagrangian and dual function Lagrange dual problem KKT conditions
Sensitivity analysis
Problem reformulations Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.22

Page 25

Perturbation and sensitivity analysis
(unperturbed) optimization problem and its dual minimize f0(x) subject to fi(x) ≤ 0, i = 1,..., m hi(x) = 0, i = 1,..., p maximize g( , ) subject to ⪰ 0 perturbed problem and its dual minimoize f0(x) subject to fi(x) ≤ ui, i = 1,..., m hi(x) = vi, i = 1,..., p maximize g( , ) − uT − vT subject to ⪰ 0
▶ x is primal variable; u, v are parameters ▶ p★(u, v) is optimal value as a function of u, v ▶ p★(0, 0) is optimal value of unperturbed problem
Convex Optimization Boyd and Vandenberghe 5.23

Page 26

Global sensitivity via duality
▶ assume strong duality holds for unperturbed problem, with ★, ★ dual optimal ▶ apply weak duality to perturbed problem:
p★(u, v) ≥ g( ★ ,
★) − uT

★ − vT

★ = p★(0, 0) − uT

★ − vT

★
▶ implications
– if ★
i
large: p★ increases greatly if we tighten constraint i (ui < 0)
– if ★
i
small: p★ does not decrease much if we loosen constraint i (ui > 0)
– if ★
i
large and positive: p★ increases greatly if we take vi < 0
– if ★
i
large and negative: p★ increases greatly if we take vi > 0
– if ★
i
small and positive: p★ does not decrease much if we take vi > 0
– if ★
i
small and negative: p★ does not decrease much if we take vi < 0
Convex Optimization Boyd and Vandenberghe 5.24

Page 27

Local sensitivity via duality
if (in addition) p★(u, v) is differentiable at (0, 0), then
★ i = −
p★(0, 0) ui ,
★ i = −
p★(0, 0) vi proof (for ★
i
): from global sensitivity result, p★(0, 0) ui = lim
t↘0
p★(tei, 0) − p★(0, 0) t ≥ − ★
i
p★(0, 0) ui = lim
t↗0
p★(tei, 0) − p★(0, 0) t ≤ − ★
i
hence, equality p★(u) for a problem with one (inequality) constraint:
u p
★(u)
p
★(0) − ★
u u = 0
Convex Optimization Boyd and Vandenberghe 5.25

Page 28

Outline
Lagrangian and dual function Lagrange dual problem KKT conditions Sensitivity analysis
Problem reformulations
Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.26

Page 29

Duality and problem reformulations
▶ equivalent formulations of a problem can lead to very different duals ▶ reformulating primal problem can be useful when dual is difficult to derive, or uninteresting
common reformulations
▶ introduce new variables and equality constraints ▶ make explicit constraints implicit or vice-versa ▶ transform objective or constraint functions, e.g., replace f0(x) by (f0(x)) with convex,
increasing
Convex Optimization Boyd and Vandenberghe 5.27

Page 30

Introducing new variables and equality constraints
▶ unconstrained problem: minimize f0(Ax + b) ▶ dual function is constant: g = infx L(x) = infx f0(Ax + b) = p★ ▶ we have strong duality, but dual is quite useless ▶ introduce new variable y and equality constraints y = Ax + b
minimize f0(y) subject to Ax + b − y = 0
▶ dual of reformulated problem is
maximize bT − f∗
0
( ) subject to AT = 0
▶ a nontrivial, useful dual (assuming the conjugate f∗
0
is easy to express)
Convex Optimization Boyd and Vandenberghe 5.28

Page 31

Example: Norm approximation
▶ minimize ∥Ax − b∥ ▶ reformulate as minimize ∥y∥ subject to y = Ax − b ▶ recall conjugate of general norm:
∥z∥∗ = { 0 ∥z∥∗ ≤ 1 ∞ otherwise
▶ dual of (reformulated) norm approximation problem:
maximize bT subject to AT = 0, ∥ ∥∗ ≤ 1
Convex Optimization Boyd and Vandenberghe 5.29

Page 32

Outline
Lagrangian and dual function Lagrange dual problem KKT conditions Sensitivity analysis Problem reformulations
Theorems of alternatives
Convex Optimization Boyd and Vandenberghe 5.30

Page 33

Theorems of alternatives
▶ consider two systems of inequality and equality constraints ▶ called weak alternatives if no more than one system is feasible ▶ called strong alternatives if exactly one of them is feasible ▶ examples: for any a ∈ R, with variable x ∈ R,
– x > a and x ≤ a − 1 are weak alternatives – x > a and x ≤ a are strong alternatives
▶ a theorem of alternatives states that two inequality systems are (weak or strong)
alternatives
▶ can be considered the extension of duality to feasibility problems
Convex Optimization Boyd and Vandenberghe 5.31

Page 34

Feasibility problems
▶ consider system of (not necessarily convex) inequalities and equalities
fi(x) ≤ 0, i = 1,..., m, hi(x) = 0, i = 1,..., p
▶ express as feasibility problem
minimize 0 subject to fi(x) ≤ 0, i = 1,..., m, hi(x) = 0, i = 1,..., p
▶ if system if feasible, p★ = 0; if not, p★ = ∞
Convex Optimization Boyd and Vandenberghe 5.32

Page 35

Duality for feasibility problems
▶ dual function of feasibility problem is g( , ) = infx
(m
i=1 ifi(x) + p i=1
ihi(x) )
▶ for ⪰ 0, we have g( , ) ≤ p★ ▶ it follows that feasibility of the inequality system
⪰ 0, g( , ) > 0 implies the original system is infeasible
▶ so this is a weak alternative to original system ▶ it is strong if fi convex, hi affine, and a constraint qualification holds ▶ g is positive homogeneous so we can write alternative system as
⪰ 0, g( , ) ≥ 1
Convex Optimization Boyd and Vandenberghe 5.33

Page 36

Example: Nonnegative solution of linear equations
▶ consider system
Ax = b, x ⪰ 0
▶ dual function is g( , ) =
{ − Tb AT = −∞ otherwise
▶ can express strong alternative of Ax = b, x ⪰ 0 as
AT ⪰ 0,
Tb ≤ −1
(we can replace Tb ≤ −1 with Tb = −1)
Convex Optimization Boyd and Vandenberghe 5.34

Page 37

Farkas’ lemma
▶ Farkas’ lemma:
Ax ⪯ 0, cTx < 0 and ATy + c = 0, y ⪰ 0 are strong alternatives
▶ proof: use (strong) duality for (feasible) LP
minimize cTx subject to Ax ⪯ 0
Convex Optimization Boyd and Vandenberghe 5.35

Page 38

Investment arbitrage
▶ we invest xj in each of n assets 1,..., n with prices p1,..., pn ▶ our initial cost is pTx ▶ at the end of the investment period there are only m possible outcomes i = 1,..., m ▶ Vij is the payoff or final value of asset j in outcome i ▶ first investment is risk-free (cash): p1 = 1 and Vi1 = 1 for all i ▶ arbitrage means there is x with pTx < 0, Vx ⪰ 0 ▶ arbitrage means we receive money up front, and our investment cannot lose ▶ standard assumption in economics: the prices are such that there is no arbitrage
Convex Optimization Boyd and Vandenberghe 5.36

Page 39

Absence of arbitrage
▶ by Farkas’ lemma, there is no arbitrage ⇐⇒ there exists y ∈ Rm
+ with VTy = p
▶ since first column of V is 1, we have 1Ty = 1 ▶ y is interpreted as a risk-neutral probability on the outcomes 1,..., m ▶ VTy are the expected values of the payoffs under the risk-neutral probability ▶ interpretation of VTy = p:
asset prices equal their expected payoff under the risk-neutral probability
▶ arbitrage theorem: there is no arbitrage ⇔ there exists a risk-neutral probability
distribution under which each asset price is its expected payoff
Convex Optimization Boyd and Vandenberghe 5.37

Page 40

Example
V =         1.0 0.5 0.0 1.0 0.8 0.0 1.0 1.0 1.0 1.0 1.3 4.0         , p =       1.0 0.9 0.3       , ˜p =       1.0 0.8 0.7      
▶ with prices p, there is an arbitrage
x =       6.2 −7.7 1.5       , pTx = −0.2, 1Tx = 0, Vx =         2.35 0.04 0.00 2.19        
▶ with prices ˜p, there is no arbitrage, with risk-neutral probability
y =         0.36 0.27 0.26 0.11         VTy =       1.0 0.8 0.7      
Convex Optimization Boyd and Vandenberghe 5.38

Convex Optimization

Recent Documents:

Recent Search: