Untitled_Free Document Search and Download

Page 1

Applications of Convex Optimization in
Image Processing
Anirudh K. Agrawal (11098) Department of Electrical Engineering Indian Institute of Technology kanirudh@iitk.ac.in
Vivek Kumar (11821)
Department of Electrical Engineering Indian Institute of Technology viveks@iitk.ac.in
,
Abstract We planning to highlight the different areas of image processing where convex optimization finds its uses. Some of the general areas where we apply convex optimization are denoising deblurring of images, segmentation, 3d reconstruction and face recognition. In the following report we will discuss different areas of Image Processing in relation to convex optimization, namely de-noising and face-recognition. To solve the image denoising problem we discuss an efficient iterative algorithm to solve a certain class of linearly constrained non-convex optimization problems, which is used to implement non-convex norms for regularization tasks in image processing and computer vision and compare different variants of it. We also discuss Augmented La- grangian Methods (ALM) for fast and scalable {1-minimization in real-world Robust Face Recognition and also compares several other popular l₁-minimization algorithm in sparse representation based classification. In this report we verify the results obtained by the authors and present our own comments.
I. INTRODUCTION
In many problems related to Image Processing and Com- puter Vision we find many optimization problems, many of which are of convex nature. We can use the various Convex optimization techniques to solve them.
There are also many computer vision problems which falls under linearly constrained convex plus concave optimization problems which are actually non-convex problems but can be efficiently minimized using state-of-the-art algorithms for convex optimization. We will study how to efficiently solve a wide class of optimization problems including l2-lp and l1-lp problems with 0 < p < 1 which are non-convex in nature. The model considered is a linearly constrained minimization on a finite dimensional Hilbert space H of the form
Ax = b,
min F(x), s.t.
xЄH
(3)
with F : H → R being a sum of two Lipschitz continuous
terms
F(x) := F1(x) + F₂(|x|)
(4)
We assume that F is bounded below, F₁: H→ RU{∞} is a proper convex and F2 : H+ → R is concave and increasing. Here, H+ denotes the non-negative orthant of the space H;
For solving the Face Recognition problem a sparsity based increasingness and the absolute value |x| are to be understood
classification framework has been used which boils down to solving a l₁ minimization problem as which can be written as follows
min ||x||1 subject to b = Ax
T
coordinate-wise. The linear constraint Ax = b is given by a linear operator A: H→ H1. For the study purpose, we take formulation of F₁ and F₂ as
(1)
F₁(x) = ||Tx — g||2, and
F₂(x) = \||x||2,p₁ (5)
where,
But in practical cases b has noise in it, so we solve a unconstrained Basis pursuit denoising problem which can be written as follows
min
-
·||b – Ax||2 + λ||x||1
X
2
(2)
=
Σ¿(xi) + ε)² is a non-convex norm for 0 < p < 1, λ = R+, T is a linear operator, and g is a vector to be approximated. To solve the optimization problem given in (3) the following iterative algorithm is proposed in the paper:
Here the matrix 'A' is made for each class, whose column vectors are the images of size(w x h) and 'y' is the test image. We already know that minimizing the l₁ produces a sparse results, this finds applications in image compression also. Although the above problem can be solved as a simple linear program due to high dimensionality of images the time and computational complexity is vary high. Many efficient methods have been developed over the years, we will discuss some of them and compare them based on their computational complexity.
xk+1
=
argmin F(x)
:=
Ax-b
argminF₁(x) + ||w*.x||1,
Ax=b
(6)
(7)
where OF2 denotes the superdifferential of the concave func- tion F2. This iterative algorithm proceeds by iteratively solving l₁ problems which approximate the original problem. We have worked on the implementation of above algorithm in image denoising by Rudin, Osher, and Fatemi (ROF) model and iteratively reweighted l₁ (IRL1) algorithm and compared the results for both.

Page 2

II. IMAGE DENOISING
The general problem (3) is formulated for specific computer vision applications as:
min F(x):
Ax=b
= min F₁(x) + F₂(|x|)
Ax-b
:= min||Tx - g|| +ATF2(|x|),
Ax=b
(8)
(9)
where F₂ : H+ → H+ is a coordinate-wise increasing and concave funtion. A : H → H∞,TH → H₁ are linear operators acting between finite dimensional Hilbert spaces H₁ or H1. The weight A € H+ has non-negative entries. The data-term is the convex lp-norm with q≥ 1. Various prototypes for F2(x) are
│Xi│→ (|Xi│+ε)P or |xi| → log(1+B\æil), Vi, (10)
i.e., the regularized 1-norm, 0 < p < 1,ɛ € R+, or a convex-log function. The p-norm becomes Lipschitz by the ɛ-regularization and the log-function naturally is Lips- chitz.Algorithm (7) simplifies to
xk+1
=
argmin||Txg||2 + ||diag(A)(w*.x||1,
Ax-b
B. Iterative Reweighted l₁ Algorithm
Paper [6] describes an iterative reweighted l₁ algorithm for non-convex lp pseudo-norms, p Є (0,1), which are non- differentiable at zero. It solves a sequence of non-smooth l₁ problems and can be seen as counterpart to IRLS (Iteratively Reweighted Least Squares) algorithm. Originally, it was pro- posed to improve the sparsity properties in l₁ regularized compressed sensing problems, but this algorithm found its uses in various computer vision algorithms. In the formulation of the IRL1 algorithm, we have minimization problem as explained in equations (10), (11) and (12) as
|2i|→log(1+B|xi), Vi,
k+1 = argmin||Txg||2 + ||diag(A)(w*.x||1,
Ax=b
where the weights given by the superdifferential of F2 are
B
wk
p
-
or wķ ພ
=
(1 + B|xi)
The results for both the algorithms discussed above are pro- (11) vided in the Section 4 (Results and Simulations).
where the weights given by the superdifferential of F½ are
Ρ
wk
or wķ
=
(|xk|+ε)¹-p
B (1 + ẞ|xi|)'
III. 1 MINIMIZATION ALGORITHMS Here we are gonna discuss various l₁ minimization tech- (12) niques.
Further in the report, we will describe the two main algorithms for Image denoising which are optimized using the above described algorithm and do their comparison.
A. ROF Image Denoising Model
Now, we'll touch upon ROF image model [5] which aims at obtaining a clean image x from a noisy image ƒ : N → R, where N is bounder open subset of R2. For solving this problem they proposed the following minimization problem:
=
u = argmin||p + X||ƒ – u||2.
Ա
(13)
=
Here F₁(u) = |x|p and F2(u) = \||ƒ—u||2 and subdifferential 0F2 2λ(u – ƒ). Formally, we can write dF2(u) -div(V), and we can write the Euler-lagrange differential equation for the RUdin Osher Fatemi model as follows:
0 = -div
Vu |vu|
+2λ(uf)
or equivalently,
1
uft
u = ƒ + 2x div (V)
21
A. Homotopy Methods
The method [4] uses the observation that while solving (2) the objective function undergoes a homotopy from the ell2 constraint to the ell₁ objective as A decreases, i.e when > → 0, * converges to the solution of (1). Also we can show that the solution path = {x* : \ € [0, ∞)} is a piecewise constant function of A. Therefore if we can the breakpoints that lead to changes in the support of xi.e either removal of a previously non-zero coefficients or addition of a new non-zero coefficient, we can construct a decreasing sequence of X.
In order to computer the gradient of objective function of the problem defined in (2), let us define a subdifferential of ||x||1 defined as follows:
Ші
=
u(x) = 8||x||1
=
Setting initial value x(0)
X
sgn(xi) xi#0
0
Uż Є [−1, 1] Xi
=
(16)
= 0 we can iterate w.r.t a non-zero
(14) A, and using the condition 0 € OF(x) we derive the following
(15)
Here, The parameter determines the denoising level. if \ is small, then it is equivalent to penalizing the fidelity term more and the minimizer u is very close to original image. On the other hand, for large λ the minimizer is a cartoon representation of the image ƒ.
c(x) = Ab – A¹ Ax € \u(x)
=
(17)
A sparse support set T {i: c) (x) = \} can be generated according to (16) at the k-th iteration. Then we can compute the update directions for non-zero coefficients of xk. The newton update equations derived involve only the non-zero coefficients of x*, which will usually be very small number since x is sparse. The complexity of Homotopy Algorithms is bounded by O(dm²+dmn) which is much better than Interior
Point Methods.

Page 3

B. Templates for Convex Cone Solvers (TFOCS)
It is a recently developed novel method [3] which can be used to solve the general constrained problem of the type
min Ax+bЄK
f(x)
where A is a linear operator and K is a convex cone.We can observe that that equation (1) and (2) fall into the above cat- egory. TFOCS proposes to use the soft-thersholding operator to solve the dual problem of l₁ minimization.
In this algorithm we wish to use the generalized project gradient descent to obtain the solution of dual problem of (1) and in turn get the primal solution. But if the dual cost function is not smooth the gradient descent cannot be applied. In order to overcome the above problem we use a smoothing term (x) to the primal problem. For the (1) problem we get the following problem
XC
min ||||1 +μφ(α) subject to b = Ax (18)
here μ > 0 is the smoothing parameter and ☀(x) is strongly convex function. (x) is usually taken as (x) = ||x-xo||2 , xo has to be specified. We can form the Lagrangian for as follows
-
Cu(a, 0) = |lx|| + |x – xo} – 07 (Ax – b)
н
2
here is the Lagrangian multiplier.
=
μ
preferred because of its smoothness.
The Lagrangian of (20) is given by the following
£g(2,0) = g(e) + \\h()+0h()
(21)
here Є Rm is a Lagrange multiplier. L(x, 0) is the Augmented Lagrangian of (1). It can be shown that there exists 0* € Rm and έ* € R such that
X*
x* = arg min (x, 0*)
Ꮖ
Vε > &*
(22)
To find the optimal solution for (1) we can use the method of multipliers for minimizing the augmented Lagrangian function L(x, 0). We get the following iterative scheme for solving (22)
xk+1 =
Øk+1
=
argminÎL¸x (x, Ok) 0k+$kh(k+1)
(23)
(24)
here {k} is a pre-defined positive sequence. Convergence is guaranteed provided that {0} is a bounded sequence and {k} is sufficiently large after a certain index.
The choice of {k} is problem dependent. Higher the value of έ more difficult it is to minimize L (x, Øk). For experiments of synthetic data we let it be that έk → ∞ while (19) for real face data we chose some fixed k = § ·
We can define the dual problem for above as maxLµ(0), where Lu(0) min Lµ(x, 0). We also note that Lμ(0) is smooth concave function whose gradient can be calculated and is given by
▼Lµ(0) = b − Аîµ(0)
D. ALM for dual problems (DALM)
The concepts described above can also be dual problem of (1) which is
y subj. toА¹y Є B∞
b
Y
(25)
here B∞ = {x € R" : ||x||∞ ≤1. Writing the augmented
Using first-order project gradient methods we can update the Lagrangian for above we get
primal and dual variables as follows
Xk
=
soft(xo+ (AT0k, 1)
0k+1 = 0k + tk (b – Axk)
A||
where {tr} are the step-sizes satisfying tATA for all k. Note that for a small μ the solution to () reduces to the solution of (1). The values of µ and xo determine the number of iterations.
C. Augmented Lagrangian Methods (PALM)
We will discuss here a special class of first-order methods called the Augmented Lagrangian Methods [2] We know that Lagrangian eliminates the equality constraints by including a penalty term in the cost function for the minimization problem. Let us consider a modified cost function for (1)
x
min_9(x)+ ||h(x)||3 subj.to_h(x) = 0
here g(x)
=
(20)
|||||1 and h(x) = b - Ax, observe that for g and h are continuous and convex functions. (20) has the same optimal solution as that of (1) for any § > 0 . Although we can use other kinds of penalty functions by the quadratic is
B
miny,z – b¹y — x¹ (z — A¹y) + ||z – A¹y||½
–
2 subj. to z Є Boo
here x is the Lagrange Multiplier of the dual problem. For ease of computation we adopt an alternation strategy to solve the above problem. In this we iteratively minimize the cost function with respect to on variable while holding the rest
constant.
Given (xk, yk) the update rule for z is
Zk+1 = Pâ÷ (A¹yk +xk/B)
(26)
here PB is the projection operator onto B. Given (xk, Zk+1) minimizing w.r.t to y we get a Least Squares problem which can be solved as follows:
-
-
BAATу = ẞAzk+1 − (Axk − b)
and the update equation for x is
xk+1 = Xk - ẞ(2k+1 − ATуk+1
(27)
(28)
The dual algorithm always converges because all the subprob- lems are solved exactly.

Page 4

Relative Error of x
((a)) Original Image
WILSON LAW FIRM, PLC
Century Title Services, Inc.
4725 GARSE MILL ROAD
((b)) After adding gaussian noise (PSNR=20.14)
((c)) Denoising using ROF model (PSNR=26.08)
((d)) Denoising using IRL1 (PSNR=24.80)
Fig. 1: Image denoising problem. Denoising the image using ROF model (100 iterations, time elapsed:1.13 sec) and IRL1 algorithm (40 iterations,time elapsed:24.57 sec)
0
10°
10-5
10-15
PALM TFOCS Homotopy DALM
100
10°
PALM
10-5
TFOCS Homotopy DALM
PALM
10-5
TFOCS Homotopy DALM
10-1
10-15
10-15
10-20
0
1
2
10-20
3
4
5
0
1
2
3
4
10-20
5
Time(s)
6
7
8
0
1
2
3
4
5
Time(s)
6
7
8
((a)) m = 800,n=1000,d=100
Time(s)
((c)) m =
800,n=2000,d=200
((b)) m = 800,n=1000,d=100
Fig. 2: Relative error v/s CPU time
IV. RESULTS AND SIMULATIONS
A. Image Denoising
Figure 1 shows original image (Figure 1(a)) in consideration and the noisy image by adding a gaussian noise of σ = 25. The peak signal to noise ration (PSNR) for the noisy image is 20.143045 db.
1) ROF Image Denoising Model: ROF algorithm was run for noisy image for λ = 10 and 100 iterations (figure 1(c)). The peak SNR of denoised Image by ROF denoising model is: 26.088432 db and the time taken for 100 iterations is 24.57 seconds. On increasing the number of iterations the PSNR value increases and saturates after around 2000 iterations.

Page 5

2) Iterative Reweighted l₁ Algorithm: The output of de- noised image by IRL1 algorithm is shown in figure 1(d). Peak SNR of denoised image by IRL1 algorithm found out to be 24.80 db after 40 iterations which is less than as compared to by ROF model. Also the time taken IRL1 is much higer (24.57 seconds).
B. l1 Minimization algorithm comparison
-
To compare the various 1 minimization algorithms we plot solve the (2) problem for each of them and plot the relative error r(x) ||xo||2/||xo||2 w.r.t to number of iterations. We have tested the algorithms by generating and sparse signal and under determined linear system based on Gaussian Distribution. The matrix A of size m x n ( m¡ n) is generated by setting each entry a i.i.d gaussian. We have normalized in column to have unit 2 norm. The support of xo is chosen based on a uniform distribution in the interval [-10,10]. The support of xo is also chosen randomly with d i.e there are d non-zero entries in xo. We plot the the relative error for different values of d and n keeping m = 800 fixed as done in the paper [2].From Fig. 2 we can see that DALM has the best performance among all other methods discussed above.
=
Fig. 2 c) that in terms on time complexity and scaling in the size of support of x. Though, we still have to test the above algorithms on real face data.
REFERENCES
[1] Ochs. P; Dosovitskiy. A; Brox. T;Pock. T;"An iterated L1 Algorithm for Non-Smooth Non-Convex Optimization in Computer Vision"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013. [2] Yang, A.Y.; Zihan Zhou; Balasubramanian, A.G.; Sastry, S.S.; Yi Ma, "Fast l₁ -Minimization Algorithms for Robust Face Recognition," Image Processing, IEEE Transactions on, vol.22, no.8, pp.3234,3246, Aug. 2013 [3] S. R. Becker, E. J. Cands, and M. Grant, Templates for convex cone problems with applications to sparse signal recovery, Math. Program. Compon., vol. 3, no. 3, pp. 165218, 2010.
[4] D. Malioutov, M. Cetin, and A. Willsky, Homotopy continuation for sparse signal representation, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Mar. 2005, pp. 733736.
[5] Rudin, Leonid I., Stanley Osher, and Emad Fatemi. "Nonlinear total vari- ation based noise removal algorithms.” Physica D: Nonlinear Phenomena 60.1 (1992): 259-268.
[6] Candes, Emmanuel J., Michael B. Wakin, and Stephen P. Boyd. "Enhanc- ing sparsity by reweighted 1 minimization.” Journal of Fourier analysis and applications 14.5-6 (2008): 877-905.
Time (s)
1
2
5
6
PALM
TFOCS
Homotopy
DALM
0
100 120 140 160
180
200 220 240 260 280 300
d
=
Fig. 3: Variation of Time Taken v/s d
In the above Fig. 3 we calculate the time is takes for rk (x) ||x − x0||2/||x0||2 < 10-5 and plot that against the size of support of x. As can be seen from Fig. 3 we can see that we increase the size of support of x, Homotopy performance degrades strongly. While DALM,PALM,TFOCS have comparable performances. From the these two we can say DALM works best among
V. CONCLUSION
We studied an efficient iterative algorithm to solve certain class of linearly constrain non-convex problems and used it to implement two image denoising algorithms ROF image denoising and iterative reweighted l₁ algorithm and compared the results for both of them.
We presented various el₁ minimzation approaches that can be used in face recognition. From our simulation results we found that DALM has the best performance. WE can observe from

Untitled

Recent Documents:

Recent Search: