L. Vandenberghe
ECE236C (Spring 2022)
5. Conjugate functions
• closed functions
• conjugate function
• duality
5.1
Closed set
a set
is
closed if it contains its boundary:
�� , �� ¯
=⇒
¯ ��
Operations that preserve closedness
• the intersection of (finitely or infinitely many) closed sets is closed
• the union of a finite number of closed sets is closed
• inverse under linear mapping: { |
�� } is closed if
is closed
Conjugate functions
5.2
Image under linear mapping
the image of a closed set under a linear mapping is not necessarily closed
Example
= {(1,2) ��
R
2
+ | 1 2 �� 1},
= [ 1 0 ] ,
=
R++
Sufficient condition:
is closed if
•
is closed and convex
• and
does not have a recession direction in the nullspace of ,
i.e.,
= 0, ˆ �� , ˆ +
�� for all �� 0
=⇒
= 0
in particular, this holds for any matrix if
is bounded
Conjugate functions
5.3
Closed function
Definition: a function is closed if its epigraph is a closed set
Examples
• () = −log(1 −
2
) with dom = { | || < 1}
• () = log with dom =
R+ and (0) = 0
• indicator function of a closed set :
() =
{ 0
��
+�� otherwise
Not closed
• () = log with dom =
R++, or with dom =
R+ and (0) = 1
• indicator function of a set
if
is not closed
Conjugate functions
5.4
Properties
Sublevel sets:
is closed if and only if all its sublevel sets are closed
Minimum: if is closed with bounded sublevel sets then it has a minimizer
Common operations on convex functions that preserve closedness
•
sum:
= 1 + 2 is closed if 1 and 2 are closed
•
composition with affine mapping: () = ( + ) is closed if is closed
•
supremum: () = sup
() is closed if each function
is closed
in each case, we assume dom
∅
Conjugate functions
5.5
Outline
• closed functions
•
conjugate function
• duality
Conjugate function
the
conjugate of a function is
∗
() = sup
��dom
( − ())
∗ is closed and convex (even when is not)
Fenchel��s inequality: the definition implies that
() +
∗
() ��
for all ,
this is an extension to non-quadratic convex of the inequality
1
2
+
1
2
��
Conjugate functions
5.6
Quadratic function
() =
1
2
+
+
Strictly convex case ( ≻ 0)
∗
() =
1
2( − )
−1
( − ) −
General convex case ( ≽ 0)
∗
() =
1
2( − )
†
( − ) − ,
dom ∗
= range() +
Conjugate functions
5.7
Negative entropy and negative logarithm
Negative entropy
() = ��
=1
log
∗
() = ��
=1
−1
Negative logarithm
() = −��
=1
log
∗
() = −��
=1
log(− ) −
Matrix logarithm
() = −log det
(dom =
S
++)
∗
() = −log det(−) −
Conjugate functions
5.8
Indicator function and norm
Indicator of convex set : conjugate is the
support function of
() =
{ 0
��
+��
∗
() = sup
��
Indicator of convex cone : conjugate is indicator of polar (negative dual) cone
∗
() = −∗ () = ∗ (−) =
{ 0
�� 0 ∀ ��
+�� otherwise
Norm: conjugate is indicator of unit ball for dual norm
() =
∗
() =
{ 0
∗ �� 1
+�� ∗ > 1
(see next page)
Conjugate functions
5.9
Proof: recall the definition of dual norm
∗ = sup
��1
to evaluate ∗
() = sup ( − ) we distinguish two cases
• if
∗ �� 1, then (by definition of dual norm)
�� for all
and equality holds if = 0; therefore sup ( − ) = 0
• if
∗ > 1, there exists an with
�� 1,
> 1; then
∗
() ��
( )− = ( − )
and right-hand side goes to infinity if �� ��
Conjugate functions
5.10
Calculus rules
Separable sum
(1,2) = (1) + (2)
∗
(1, 2) =
∗
(1) +
∗
(2)
Scalar multiplication ( > 0)
() =
()
∗
() =
∗
(/)
() =
(/)
∗
() =
∗
()
• the operation () =
(/) is sometimes called ��right scalar multiplication��
• a convenient notation is =
for the function ( )() =
(/)
• conjugates can be written concisely as ( )
∗ =
∗ and ( )
∗ = ∗
Conjugate functions
5.11
Calculus rules
Addition to affine function
() = () +
+
∗
() =
∗
( − ) −
Translation of argument
() = ( − )
∗
() =
+
∗
()
Composition with invertible linear mapping: if is square and nonsingular,
() = ( )
∗
() =
∗
(
−
)
Infimal convolution
() = inf
+=
(() + ())
∗
() =
∗
() +
∗
()
Conjugate functions
5.12
The second conjugate
∗∗
() = sup
��dom ∗
( −
∗
())
•
∗∗ is closed and convex
• from Fenchel��s inequality,
−
∗
() �� () for all and ; therefore
∗∗
() �� () for all
equivalently, epi ⊆ epi ∗∗ (for any )
• if is closed and convex, then
∗∗
() = () for all
equivalently, epi = epi ∗∗ (if is closed and convex); proof on next page
Conjugate functions
5.13
Proof (by contradiction): assume is closed and convex, and epi ∗∗
epi
suppose ( ,
∗∗
()) epi ; then there is a strict separating hyperplane:
[
] [
−
−
∗∗
()
]
�� < 0
for all ( , ) �� epi
holds for some , , with �� 0 ( > 0 gives a contradiction as �� ��)
• if < 0, define = /(−) and maximize left-hand side over ( , ) �� epi :
∗
() −
+
∗∗
() �� /(−) < 0
this contradicts Fenchel��s inequality
• if = 0, choose ˆ �� dom ∗ and add small multiple of (ˆ,−1) to ( , ):
[ +
ˆ
−
] [
−
−
∗∗
()
]
�� +
( ∗
(ˆ) −
ˆ + ∗∗
()
)
< 0
now apply the argument for < 0
Conjugate functions
5.14
Conjugates and subgradients
if is closed and convex, then
��
() ⇐⇒
��
∗
() ⇐⇒
= () +
∗
()
Proof. if ��
(), then ∗
() = sup ( − ()) =
− (); hence
∗
() = sup ( − ())
��
− ()
=
( − ) − () +
=
∗
() + ( − )
this holds for all ; therefore, ��
∗
()
reverse implication ��
∗
() =⇒ ��
() follows from ∗∗ =
Conjugate functions
5.15
Example
−1
1
() = [−1,1]()
∗() = ||
1
−1
()
1
−1
∗()
Conjugate functions
5.16
Strongly convex function
Definition (page 1.18) is -strongly convex (for ·) if dom is convex and
( + (1 − )) ��
()+(1 − ) () − 2 (1 − ) −
2
for all , �� dom and �� [0,1]
First-order condition
• if is -strongly convex, then
() �� () + ( − ) + 2
−
2
for all , �� dom , ��
()
• for differentiable this is the inequality (4) on page 1.19
Conjugate functions
5.17
Proof
• recall the definition of directional derivative (page 2.28 and 2.29):
( , − ) = inf>0
( + ( − )) − ()
and the infimum is approached as �� 0
• if is -strongly convex and subdifferentiable at , then for all �� dom ,
( , − ) ��
inf
��(0,1]
(1 − ) () +
()−(/2)(1 − ) −
2
− ()
=
() − () − 2
−
2
• from page 2.31, the directional derivative is the support function of
():
( − ) ��
sup
˜�� ()
˜ ( − )
=
(; − )
��
() − () − 2
−
2
Conjugate functions
5.18
Conjugate of strongly convex function
assume is closed and strongly convex with parameter > 0 for the norm ·
•
∗ is defined for all (
i.e., dom ∗ =
R )
•
∗ is differentiable everywhere, with gradient
∇
∗
() = argmax ( − ())
• ∇
∗ is Lipschitz continuous with constant 1/ for the dual norm · ∗:
∇
∗
()−∇
∗
( ) ��
1
−
∗ for all and
Conjugate functions
5.19
Proof: if is strongly convex and closed
•
− () has a unique maximizer for every
• maximizes
− () if and only if ��
(); from page 5.15
��
()
⇐⇒
��
∗
() = {∇
∗
()}
hence ∇
∗
() = argmax ( − ())
• from first-order condition on page 5.17: if ��
(),
��
( ):
( ) ��
() + ( − ) + 2
−
2
() ��
( )+( ) ( − ) + 2
−
2
combining these inequalities shows
−
2
�� ( − ) ( − )�� −
∗
−
• now substitute = ∇
∗
() and
= ∇
∗
( )
Conjugate functions
5.20
Outline
• closed functions
• conjugate function
•
duality
Duality
primal:
minimize
() + ( )
dual:
maximize −
∗
() −
∗
(− )
• follows from Lagrange duality applied to reformulated primal
minimize
() + ()
subject to
=
dual function for the formulated problem is:
inf
, ( () +
+ () −
) = −
∗
(− ) −
∗
()
• Slater��s condition (for convex , ): strong duality holds if there exists an ˆ with
ˆ �� int dom ,
ˆ �� int dom
this also guarantees that the dual optimum is attained if optimal value is finite
Conjugate functions
5.21
Set constraint
minimize
()
subject to
− ��
Primal and dual problem
primal:
minimize
() + ( − )
dual:
maximize − −
∗
() −
∗
(− )
Examples
constraint
set
support function ∗
()
equality
=
{0}
0
norm inequality
−
�� 1 unit ·-ball
∗
conic inequality
≼
−
∗ ()
Conjugate functions
5.22
Norm regularization
minimize
()+ −
• take () =
−
in general problem
minimize
() + ( )
• conjugate of · is indicator of unit ball for dual norm
∗
() =
+ () where = { | ∗ �� 1}
• hence, dual problem can be written as
maximize − −
∗
(− )
subject to
∗ �� 1
Conjugate functions
5.23
Optimality conditions
minimize
() + ()
subject to
=
assume , are convex and Slater��s condition holds
Optimality conditions: is optimal if and only if there exists a such that
1. primal feasibility: �� dom and =
�� dom
2.
and =
are minimizers of the Lagrangian () +
+ () −
:
− ��
(),
��
( )
if is closed, this can be written symmetrically as
− ��
(),
��
∗
()
Conjugate functions
5.24
References
• J.-B. Hiriart-Urruty, C. Lemar��chal,
Convex Analysis and Minimization Algoritms
(1993), chapter X.
• D.P. Bertsekas, A. Nedić, A.E. Ozdaglar,
Convex Analysis and Optimization
(2003), chapter 7.
• R. T. Rockafellar,
Convex Analysis (1970).
Conjugate functions
5.25