A proximal center-based decomposition method for multi-agent convex optimization


Proceedings of the 47th IEEE Conference on Decision and Control, Cancun, Mexico, Dec. 9-11, 2008

Ion Necoara and Johan A.K. Suykens
(The authors are with the Katholieke Universiteit Leuven, Department of Electrical Engineering, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium. {ion.necoara,johan.suykens}@esat.kuleuven.be)

Abstract: In this paper we develop a new dual decomposition method for optimizing a sum of convex objective functions corresponding to multiple agents but with coupled constraints. In our method we define a smooth Lagrangian, using a smoothing technique developed by Nesterov, which preserves the separability of the problem. With this approach we propose a new decomposition method (the proximal center method) for which efficiency estimates are derived and which improves the bounds on the number of iterations of the classical dual gradient scheme by an order of magnitude. In this method every agent optimizes an objective function that is the sum of its own objective function and a smoothing term, while the coordination between agents is performed via the Lagrange multipliers corresponding to the coupled constraints. Applications of the new method to distributed model predictive control and network optimization problems are also illustrated.

I. INTRODUCTION

Recent advances in computer power have led to the derivation of many parallel and distributed computation methods for solving large-scale optimization problems (e.g. [1]). For separable convex problems, i.e. problems with a separable objective function but coupled constraints (they arise in many fields of engineering, e.g. network optimization [9], [14] and distributed model predictive control (MPC) [13]), many researchers have proposed dual decomposition algorithms such as the dual subgradient method [1], the alternating direction method [1], [4], [5], [12], the proximal method of multipliers [2], the partial inverse method [11], etc. In general, these methods are based on alternating minimization of an (augmented) Lagrangian in a Gauss-Seidel fashion, followed by a steepest ascent update for the multipliers. However, the step-size parameter, which has a very strong influence on the convergence rate of these methods, is very difficult to tune, and the minimization in a Gauss-Seidel fashion slows down the convergence towards an optimum. Moreover, these methods do not provide any complexity estimates for the general case (linear convergence is obtained e.g. for strongly convex functions).

The purpose of this paper is to propose a new decomposition method for separable convex optimization problems that overcomes the disadvantages mentioned above. Based on a smoothing technique recently developed by Nesterov in [8], we obtain a smooth Lagrangian that preserves the separability of the problem. Using this smooth Lagrangian, we derive a new dual decomposition method in which the corresponding parameters are selected optimally and are thus straightforward to tune. In contrast to the dual gradient update for the multipliers used by most decomposition methods in the literature, our method uses an optimal first-order oracle scheme (see e.g. [7], [8]) for updating the multipliers. Therefore, we derive for the new method an efficiency estimate for the general case which improves the complexity of the classical dual gradient method (i.e. the steepest ascent update) by an order of magnitude.
To the best of our knowledge these are the first efficiency estimate results for a dual decomposition method applied to separable non-strongly convex programs. The new algorithm is suitable for solving distributed model predictive control or network optimization problems since it is highly parallelizable and can thus be effectively implemented on parallel processors.

The remainder of this paper is organized as follows. Section II contains the problem formulation, followed by a brief description of some of the dual decomposition methods derived in the literature. The main results of the paper are presented in Section III, where we describe our new decomposition method and its main properties. In Section IV we show that network optimization and distributed model predictive control problems corresponding to linear systems with interacting subsystem dynamics and decoupled costs can be recast in the framework of separable convex problems to which our algorithm applies. Finally, numerical simulations on some test problems are included.

II. PRELIMINARIES

A. Splitting methods for separable convex programs

An important application of convex duality theory is in decomposition algorithms for solving large-scale problems with special structure. This type of problem arises e.g. in the context of large-scale networks consisting of multiple agents with different objectives, or in the context of distributed MPC for linear systems with coupled subsystem dynamics. The following separable convex program will be considered in this paper:

$$f^* = \min_{x \in X,\, z \in Z} \{ \phi_1(x) + \phi_2(z) : Ax + Bz = b \}, \qquad (1)$$

where $\phi_1 : \mathbb{R}^m \to \mathbb{R}$ and $\phi_2 : \mathbb{R}^p \to \mathbb{R}$ are continuous convex functions on $X$ and $Z$, respectively, $A$ is an $n \times m$ matrix, $B$ an $n \times p$ matrix, and $b \in \mathbb{R}^n$. In this paper we do not assume $\phi_1$ and/or $\phi_2$ to be strongly convex or even smooth. Moreover, we assume that $X \subseteq \mathbb{R}^m$ and $Z \subseteq \mathbb{R}^p$ are compact convex sets. We use different norms on $\mathbb{R}^n$, $\mathbb{R}^m$ and $\mathbb{R}^p$, not necessarily the Euclidean ones. However, for simplicity of notation we do not use indices to specify the norms on $\mathbb{R}^n$, $\mathbb{R}^m$ and $\mathbb{R}^p$, since it is clear from the context which space and which norm are being used.
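To make the structure of (1) concrete, the following small Python sketch sets up a hypothetical two-agent instance with quadratic objectives and box-shaped sets; all dimensions and data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# A toy instance of the separable problem (1), only to make its structure concrete:
# two agents with convex quadratic objectives, compact box-shaped sets X and Z, and a
# coupling equality constraint Ax + Bz = b.  All dimensions and data are assumptions.
rng = np.random.default_rng(0)
m, p, n = 4, 3, 2                          # sizes of x, z and of the coupling constraint
A = rng.standard_normal((n, m))
B = rng.standard_normal((n, p))
b = rng.standard_normal(n)

phi1 = lambda x: 0.5 * x @ x               # agent 1's local objective (convex quadratic)
phi2 = lambda z: 0.5 * z @ z               # agent 2's local objective
in_X = lambda x: bool(np.all(np.abs(x) <= 1.0))   # X = [-1, 1]^m
in_Z = lambda z: bool(np.all(np.abs(z) <= 1.0))   # Z = [-1, 1]^p
# Problem (1): minimize phi1(x) + phi2(z) over x in X, z in Z, subject to A x + B z = b.
```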

Remark 2.1: Note that the method developed in this paper can treat both coupled inequalities $Ax + Bz \le b$ and/or equalities $Ax + Bz = b$ (see [6]). Moreover, we can consider any finite sum of functions $\phi_i$. However, for simplicity of exposition we restrict ourselves to problems of the form (1).

Let $\langle \cdot, \cdot \rangle$ denote a scalar product on the Euclidean space $\mathbb{R}^n$. By forming the Lagrangian corresponding to the linear constraints (with Lagrange multipliers $\lambda \in \mathbb{R}^n$), i.e.

$$L_0(x, z, \lambda) = \phi_1(x) + \phi_2(z) + \langle \lambda, Ax + Bz - b \rangle,$$

and using the dual decomposition method, one arrives at the following decomposition algorithm:

Algorithm 2.2 ([1]): for $k \ge 0$ do
1. given $\lambda^k$, minimize the Lagrangian
$$(x^{k+1}, z^{k+1}) = \arg\min_{x \in X,\, z \in Z} L_0(x, z, \lambda^k),$$
or equivalently minimize over $x$ and $z$ independently:
$$x^{k+1} = \arg\min_{x \in X} [\phi_1(x) + \langle \lambda^k, Ax \rangle], \qquad z^{k+1} = \arg\min_{z \in Z} [\phi_2(z) + \langle \lambda^k, Bz \rangle]$$
2. update the multipliers by the iteration
$$\lambda^{k+1} = \lambda^k + c_k (Ax^{k+1} + Bz^{k+1} - b),$$
where $c_k$ is a positive step size.

The following assumption holds throughout the paper:

Assumption 2.3: Constraint qualification holds for (1).

It can be shown that Algorithm 2.2 is convergent under Assumption 2.3 and the assumption that both $\phi_1$ and $\phi_2$ are strongly convex (the latter guarantees that the minimizer $(x^{k+1}, z^{k+1})$ is unique). In fact, under the assumption of strong convexity it follows that the dual function

$$f_0(\lambda) = \min_{x \in X,\, z \in Z} \phi_1(x) + \phi_2(z) + \langle \lambda, Ax + Bz - b \rangle$$

is differentiable [10], and thus Algorithm 2.2 is the gradient method with step size $c_k$ for maximizing the dual function. However, for many interesting problems, especially those arising from transformations that lead to decomposition, the functions $\phi_1$ and $\phi_2$ are not strongly convex. There are some methods (e.g. the alternating direction method [1], [5], [12], the proximal point method [2], the partial inverse method [11]) that overcome this difficulty based on alternating minimization of the augmented Lagrangian in a Gauss-Seidel fashion, followed by a steepest ascent update of the multipliers. A computational drawback of these schemes is that the prox-term $\frac{c}{2}\|Ax + Bz - b\|^2$, defined with the Euclidean norm and present in the augmented Lagrangian, is not separable in $x$ and $z$. Another disadvantage is that in general they cannot deal with coupled inequalities. Moreover, these schemes were shown to be very sensitive to the value of the parameter $c$, and obtaining the best convergence rate in practice is difficult. Some heuristics for choosing $c$ can be found in the literature [2], [5], [12]. Note that alternating direction method variants which allow for inexact minimization were proposed e.g. in [2]. A closely related method is the partial inverse of a monotone operator developed in [11].
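As a concrete illustration, the following minimal Python sketch runs Algorithm 2.2 on a toy instance of (1) with quadratic objectives and box sets, so that step 1 has a closed form; the data, the box sets and the diminishing step-size rule are assumptions made only for the sketch.

```python
import numpy as np

# Minimal sketch of Algorithm 2.2 (dual decomposition with a gradient/subgradient
# multiplier update) on a toy instance of (1).  Data, sets and step sizes are assumptions.
rng = np.random.default_rng(1)
m, p, n = 4, 3, 2
A, B, b = rng.standard_normal((n, m)), rng.standard_normal((n, p)), rng.standard_normal(n)

def box_qp_argmin(lin):
    # argmin of 0.5*||y||^2 + <lin, y> over the box [-1, 1]^dim (closed form, then clip)
    return np.clip(-lin, -1.0, 1.0)

lam = np.zeros(n)
for k in range(300):
    x = box_qp_argmin(A.T @ lam)            # step 1: the two minimizations decouple
    z = box_qp_argmin(B.T @ lam)
    r = A @ x + B @ z - b                   # (sub)gradient of the dual function at lam
    lam = lam + (1.0 / (k + 1)) * r         # step 2: ascent update with step size c_k
print("coupling residual ||Ax + Bz - b|| =", np.linalg.norm(r))
```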
B. Accelerated scheme for smooth maximization

In this section we briefly describe an accelerated scheme for smooth convex functions developed by Nesterov in [7], [8]. Let $f$ be a given concave and differentiable function on a closed convex set $\hat{Q} \subseteq \mathbb{R}^n$ whose gradient is Lipschitz continuous with Lipschitz constant $L$.

Definition 2.4 ([8]): We define a prox-function $d$ of the set $\hat{Q}$ as a function with the following properties: $d$ is continuous, strongly convex on $\hat{Q}$ with convexity parameter $\sigma$, and $u^0$ is the center of the set $\hat{Q}$, i.e. $u^0 = \arg\min_{x \in \hat{Q}} d(x)$, with $d(u^0) = 0$.

The goal is to find an approximate solution of the convex optimization problem $x^* = \arg\max_{x \in \hat{Q}} f(x)$. In Nesterov's scheme three sequences of points from $\hat{Q}$ are updated recursively: $\{u^k\}_{k \ge 0}$, $\{x^k\}_{k \ge 0}$ and $\{v^k\}_{k \ge 0}$. The algorithm can be described as follows:

Algorithm 2.5 ([8]): for $k \ge 0$ do
1. compute $\nabla f(u^k)$
2. find $\hat{x}^k = \arg\max_{x \in \hat{Q}} \{ \langle \nabla f(u^k), x - u^k \rangle - \frac{L}{2}\|x - u^k\|^2 \}$ and define $x^k = \arg\max_{x \in \{\hat{x}^k, x^{k-1}, u^k\}} f(x)$
3. find $v^k = \arg\max_{x \in \hat{Q}} \{ -\frac{L}{\sigma} d(x) + \sum_{l=0}^{k} \frac{l+1}{2} \langle \nabla f(u^l), x - u^l \rangle \}$
4. set $u^{k+1} = \frac{k+1}{k+3} x^k + \frac{2}{k+3} v^k$.

The main property of Algorithm 2.5 is the following relation [8]:

$$\frac{(k+1)(k+2)}{4}\, f(x^k) \;\ge\; \max_{x \in \hat{Q}} \Big\{ -\frac{L}{\sigma} d(x) + \sum_{l=0}^{k} \frac{l+1}{2} \big[ f(u^l) + \langle \nabla f(u^l), x - u^l \rangle \big] \Big\}. \qquad (2)$$

Using this property, Nesterov proves in [8] that the efficiency estimate of Algorithm 2.5 is of order $O(\sqrt{L/\epsilon})$, which is better by an order of magnitude than that of the corresponding pure gradient method for the same smooth problem.
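The next sketch renders Algorithm 2.5 in Python for the unconstrained Euclidean case $\hat{Q} = \mathbb{R}^n$ with prox-function $d(u) = \frac{1}{2}\|u\|^2$ (so $\sigma = 1$ and steps 2-3 have closed forms); the concave quadratic test function is an assumption made only for the sketch.

```python
import numpy as np

# Schematic rendering of Algorithm 2.5 for maximizing a concave function f with an
# L-Lipschitz gradient over Q_hat = R^n, using d(u) = 0.5*||u||^2 (sigma = 1), so that
# steps 2 and 3 are explicit.  The quadratic test function below is an assumption.
rng = np.random.default_rng(2)
n = 5
W = rng.standard_normal((n, n))
H = W @ W.T + np.eye(n)                 # f(u) = -0.5*u'Hu + q'u, grad f(u) = -Hu + q
q = rng.standard_normal(n)
f = lambda u: -0.5 * u @ H @ u + q @ u
grad_f = lambda u: -H @ u + q
L = float(np.linalg.eigvalsh(H).max())  # Lipschitz constant of grad f

u = np.zeros(n)                         # u^0: center of the prox-function
x = u.copy()
wsum = np.zeros(n)                      # accumulates sum_l (l+1)/2 * grad f(u^l)
for k in range(100):
    g = grad_f(u)                                   # step 1
    x_hat = u + g / L                               # step 2: prox-gradient point
    x = max((x_hat, x, u), key=f)                   #         best of {x_hat^k, x^{k-1}, u^k}
    wsum += 0.5 * (k + 1) * g
    v = wsum / L                                    # step 3 (closed form for d = 0.5||.||^2)
    u = (k + 1) / (k + 3) * x + 2 / (k + 3) * v     # step 4
print("f(x^k) =", f(x), "   f(x*) =", f(np.linalg.solve(H, q)))
```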

III. A NEW DECOMPOSITION METHOD BASED ON SMOOTHING THE LAGRANGIAN

In this section we propose a new method to smooth the Lagrangian of (1), inspired by [8]. This smoothing technique preserves the separability of the problem and, moreover, the corresponding parameters are easy to tune. Since separability is preserved under this smoothing technique, we derive a new dual decomposition method in which the multipliers are updated according to Algorithm 2.5. Moreover, we obtain an efficiency estimate of the new method for the general case. Note that with our method we can treat both coupled inequalities ($Ax + Bz \le b$) and/or equalities ($Ax + Bz = b$).

A. Smoothing the Lagrangian

Let $d_X$ and $d_Z$ be two prox-functions for the compact convex sets $X$ and $Z$, with convexity parameters $\sigma_X$ and $\sigma_Z$, respectively. Denote $x^0 = \arg\min_{x \in X} d_X(x)$ and $z^0 = \arg\min_{z \in Z} d_Z(z)$. Since $X$ and $Z$ are compact and $d_X$ and $d_Z$ are continuous, we can choose finite positive constants $D_X \ge \max_{x \in X} d_X(x)$ and $D_Z \ge \max_{z \in Z} d_Z(z)$. We also introduce the notation $\|A\| = \max_{\|\lambda\|=1,\,\|x\|=1} \langle \lambda, Ax \rangle$. Note that $\|A\| = \max_{\|x\|=1} \|Ax\|^*$ and $\|Ax\|^* \le \|A\|\,\|x\|$ for all $x$, where $\|s\|^* = \max_{\|x\|=1} \langle s, x \rangle$ is the dual norm of the norm used on $\mathbb{R}^n$ [10]. Similarly for $B$. Let us introduce the following function:

$$f_c(\lambda) = \min_{x \in X,\, z \in Z} \phi_1(x) + \phi_2(z) + \langle \lambda, Ax + Bz - b \rangle + c\,\big( d_X(x) + d_Z(z) \big), \qquad (3)$$

where $c$ is a positive smoothness parameter that will be defined in the sequel. Note that the objective function in (3) is separable in $x$ and $z$, i.e.

$$f_c(\lambda) = -\langle \lambda, b \rangle + \min_{x \in X} [\phi_1(x) + \langle \lambda, Ax \rangle + c\, d_X(x)] + \min_{z \in Z} [\phi_2(z) + \langle \lambda, Bz \rangle + c\, d_Z(z)]. \qquad (4)$$

Denote by $x(\lambda)$ and $z(\lambda)$ the optimal solutions of the minimization problems in $x$ and $z$, respectively. The function $f_c$ has the following smoothness properties:

Theorem 3.1 ([6]): The function $f_c$ is concave and continuously differentiable at any $\lambda \in \mathbb{R}^n$. Moreover, its gradient $\nabla f_c(\lambda) = A x(\lambda) + B z(\lambda) - b$ is Lipschitz continuous with Lipschitz constant

$$L_c = \frac{\|A\|^2}{c\,\sigma_X} + \frac{\|B\|^2}{c\,\sigma_Z}.$$

The following inequalities also hold:

$$f_c(\lambda) \ge f_0(\lambda) \ge f_c(\lambda) - c\,(D_X + D_Z) \quad \forall \lambda \in \mathbb{R}^n. \qquad (5)$$

Proof: The proof follows from standard convexity arguments and can be found in [6].

B. A proximal center-based decomposition method

In this section we derive a new decomposition method based on the smoothing technique described in Section III-A. The new algorithm, called here the proximal center algorithm, has the nice feature that the coordination between agents involves the maximization of a smooth convex objective function (i.e. one with Lipschitz continuous gradient). Moreover, the resource allocation stage consists of each agent solving, in parallel, a minimization problem with a strongly convex objective. The new method belongs to the class of two-level algorithms [3] and is particularly suitable for separable convex problems where the minimizations over $x$ and $z$ in (4) are easily carried out.

Let us apply Nesterov's accelerated method described in Algorithm 2.5 to the concave function $f_c$, which has a Lipschitz continuous gradient:

$$\max_{\lambda \in Q} f_c(\lambda), \qquad (6)$$

where $Q$ is a given closed convex set in $\mathbb{R}^n$ that contains at least one optimal multiplier $\lambda^* \in \Lambda^*$ (here $\Lambda^*$ denotes the set of optimal multipliers). Note that $Q \subseteq \mathbb{R}^n$ in the presence of linear equalities (i.e. $Ax + Bz - b = 0$); $Q \subseteq \mathbb{R}^n_+$, where $\mathbb{R}_+$ denotes the set of nonnegative real numbers, in the presence of linear inequalities (i.e. $Ax + Bz - b \le 0$); and $Q \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}_+$, where $n_1 + n_2 = n$, when both equalities and inequalities are present. Note that according to Algorithm 2.5 we also need to choose a prox-function $d_Q$ for the set $Q$, with convexity parameter $\sigma_Q$ and center $u^0$. The proximal center algorithm can be described as follows:

Algorithm 3.2: for $k \ge 0$ do
1. given $u^k$, compute in parallel
$$x^{k+1} = \arg\min_{x \in X} [\phi_1(x) + \langle u^k, Ax \rangle + c\, d_X(x)], \qquad z^{k+1} = \arg\min_{z \in Z} [\phi_2(z) + \langle u^k, Bz \rangle + c\, d_Z(z)]$$
2. compute $f_c(u^k)$ and $\nabla f_c(u^k) = A x^{k+1} + B z^{k+1} - b$
3. find $\hat{\lambda}^k = \arg\max_{\lambda \in Q} \{ \langle \nabla f_c(u^k), \lambda - u^k \rangle - \frac{L_c}{2}\|\lambda - u^k\|^2 \}$ and define $\lambda^k = \arg\max_{\lambda \in \{\hat{\lambda}^k, \lambda^{k-1}, u^k\}} f_c(\lambda)$
4. find $v^k = \arg\max_{\lambda \in Q} \{ -\frac{L_c}{\sigma_Q} d_Q(\lambda) + \sum_{l=0}^{k} \frac{l+1}{2} \langle \nabla f_c(u^l), \lambda - u^l \rangle \}$
5. set $u^{k+1} = \frac{k+1}{k+3} \lambda^k + \frac{2}{k+3} v^k$.

The proximal center algorithm is suitable for decomposition since it is highly parallelizable: the agents can solve their corresponding minimization problems in parallel. This is an advantage of our method over most of the alternating direction methods, which are based on Gauss-Seidel iterations and do not have this feature.
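To make the scheme concrete, here is a compact Python sketch of Algorithm 3.2 on a toy instance of (1), specialized to Euclidean norms, unit boxes $X$ and $Z$ with quadratic prox-terms $d_X(x)=\frac12\|x\|^2$, $d_Z(z)=\frac12\|z\|^2$, and $Q=\mathbb{R}^n$ with $d_Q(\lambda)=\frac12\|\lambda\|^2$ (the setting analyzed later in Section III-D). The problem data and the target accuracy are assumptions, and the best-of-three selection of $\lambda^k$ in step 3 is omitted for brevity.

```python
import numpy as np

# Sketch of the proximal center method (Algorithm 3.2) on a toy instance of (1):
# quadratic phi_1, phi_2, unit boxes X, Z with d_X = 0.5||x||^2, d_Z = 0.5||z||^2
# (sigma_X = sigma_Z = 1), Euclidean norms, Q = R^n with d_Q = 0.5||lambda||^2.
# Problem data and the target accuracy eps are assumptions; the best-of-three
# selection of lambda^k in step 3 is omitted to keep the sketch short.
rng = np.random.default_rng(3)
m, p, n = 4, 3, 2
A, B, b = rng.standard_normal((n, m)), rng.standard_normal((n, p)), rng.standard_normal(n)

eps = 0.05
D_X, D_Z = m / 2.0, p / 2.0                          # max of the prox-terms over unit boxes
rho = np.linalg.norm(A, 2) ** 2 + np.linalg.norm(B, 2) ** 2   # ||A||^2/sig_X + ||B||^2/sig_Z
c = eps / (D_X + D_Z)                                # parameter choice of Section III-D
L_c = rho / c                                        # Lipschitz constant of grad f_c
K = int(np.ceil(2.0 * np.sqrt(rho * (D_X + D_Z)) / eps))      # iterations from Theorem 3.7

def inner(lin, c):
    # argmin over the unit box of 0.5*(1 + c)*||y||^2 + <lin, y>  (closed form, then clip)
    return np.clip(-lin / (1.0 + c), -1.0, 1.0)

u = np.zeros(n)
wsum, sx, sz = np.zeros(n), np.zeros(m), np.zeros(p)
for k in range(K):
    x = inner(A.T @ u, c)                            # step 1: parallel, strongly convex
    z = inner(B.T @ u, c)
    g = A @ x + B @ z - b                            # step 2: grad f_c(u^k)
    lam = u + g / L_c                                # step 3 (closed form for Q = R^n)
    wsum += 0.5 * (k + 1) * g
    v = wsum / L_c                                   # step 4 (closed form for d_Q)
    u = (k + 1) / (k + 3) * lam + 2 / (k + 3) * v    # step 5
    sx, sz = sx + (k + 1) * x, sz + (k + 1) * z
x_hat, z_hat = 2 * sx / (K * (K + 1)), 2 * sz / (K * (K + 1))  # averages from Theorem 3.4
print("residual of the averaged primal point:", np.linalg.norm(A @ x_hat + B @ z_hat - b))
```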
We now derive a lower bound on the value of the objective function which will be used frequently in the sequel:

Lemma 3.3 ([6]): For any $\lambda^* \in \Lambda^*$ and $\hat{x} \in X$, $\hat{z} \in Z$, the following lower bound on the primal gap holds:

$$[\phi_1(\hat{x}) + \phi_2(\hat{z})] - f^* \;\ge\; -\langle A\hat{x} + B\hat{z} - b, \lambda^* \rangle \;\ge\; -\|\lambda^*\| \, \|A\hat{x} + B\hat{z} - b\|.$$

The previous lemma shows that if $\|A\hat{x} + B\hat{z} - b\| \le \epsilon_c$, then the primal gap is bounded: for all $\hat{\lambda} \in Q$,

$$-\epsilon_c \|\lambda^*\| \;\le\; \phi_1(\hat{x}) + \phi_2(\hat{z}) - f^* \;\le\; \phi_1(\hat{x}) + \phi_2(\hat{z}) - f_0(\hat{\lambda}). \qquad (7)$$

Therefore, if we are able to derive an upper bound $\epsilon$ on the duality gap and $\epsilon_c$ on the coupling constraints for some given $\hat{\lambda}$ and $\hat{x} \in X$, $\hat{z} \in Z$, then we can conclude that $(\hat{x}, \hat{z})$ is an $(\epsilon, \epsilon_c)$-solution of problem (1), since in this case $-\epsilon_c \|\lambda^*\| \le \phi_1(\hat{x}) + \phi_2(\hat{z}) - f^* \le \epsilon$ for all $\lambda^* \in \Lambda^*$. The next theorem derives an upper bound on the duality gap for our method.

Theorem 3.4: Assume that there exists a closed convex set $Q$ that contains a $\lambda^* \in \Lambda^*$. Then, after $k$ iterations we obtain an approximate solution to problem (1),

$$(\hat{x}, \hat{z}) = \sum_{l=0}^{k} \frac{2(l+1)}{(k+1)(k+2)} \, (x^{l+1}, z^{l+1}) \quad \text{and} \quad \hat{\lambda} = \lambda^k,$$

which satisfies the following bound on the duality gap:

$$0 \;\le\; [\phi_1(\hat{x}) + \phi_2(\hat{z})] - f_0(\hat{\lambda}) \;\le\; c\,(D_X + D_Z) - \max_{\lambda \in Q} \Big\{ -\frac{4 L_c}{\sigma_Q (k+1)^2} d_Q(\lambda) + \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \Big\}. \qquad (8)$$

Proof: For an arbitrary $c$, inequality (2) applied to $f_c$ gives that after $k$ iterations the Lagrange multiplier $\hat{\lambda}$ satisfies

$$\frac{(k+1)(k+2)}{4}\, f_c(\hat{\lambda}) \;\ge\; \max_{\lambda \in Q} \Big\{ -\frac{L_c}{\sigma_Q} d_Q(\lambda) + \sum_{l=0}^{k} \frac{l+1}{2} \big[ f_c(u^l) + \langle \nabla f_c(u^l), \lambda - u^l \rangle \big] \Big\}.$$

In view of the previous inequality we have:

$$f_c(\hat{\lambda}) \;\ge\; \max_{\lambda \in Q} \Big\{ -\frac{4 L_c}{\sigma_Q (k+1)^2} d_Q(\lambda) + \frac{2}{(k+1)(k+2)} \sum_{l=0}^{k} (l+1) \big[ f_c(u^l) + \langle \nabla f_c(u^l), \lambda - u^l \rangle \big] \Big\}.$$

Now, we replace $f_c(u^l)$ and $\nabla f_c(u^l)$ with the expressions given in step 2 of Algorithm 3.2:

$$\frac{2}{(k+1)(k+2)} \sum_{l=0}^{k} (l+1) \big[ f_c(u^l) + \langle \nabla f_c(u^l), \lambda - u^l \rangle \big]$$
$$= \frac{2}{(k+1)(k+2)} \sum_{l=0}^{k} (l+1) \big[ \phi_1(x^{l+1}) + \phi_2(z^{l+1}) + c\,( d_X(x^{l+1}) + d_Z(z^{l+1}) ) + \langle A x^{l+1} + B z^{l+1} - b, \lambda \rangle \big]$$
$$\ge \frac{2}{(k+1)(k+2)} \sum_{l=0}^{k} (l+1) \big[ \phi_1(x^{l+1}) + \phi_2(z^{l+1}) + \langle A x^{l+1} + B z^{l+1} - b, \lambda \rangle \big]$$
$$\ge \phi_1(\hat{x}) + \phi_2(\hat{z}) + \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \quad \forall \lambda \in Q.$$

The first inequality follows from the definition of a prox-function and the second from the convexity of $\phi_1$ and $\phi_2$. Using the last relation and (5) we derive the bound (8) on the duality gap.

We now show how to construct the set $Q$ and how to choose the smoothness parameter $c$ optimally. In the next two sections we discuss two cases, depending on the choices of $Q$ and $d_Q$.

C. Efficiency estimates for compact Q

Let $D_Q$ be a positive constant satisfying

$$\max_{\lambda \in Q} d_Q(\lambda) \le D_Q. \qquad (9)$$

Note that we can choose $D_Q$ finite whenever $Q$ is compact. In this section we specialize the result of Theorem 3.4 to the case where $Q$ has the form $Q = \{\lambda \in \mathbb{R}^n : \|\lambda\| \le R\}$.

Theorem 3.5 ([6]): Assume that $\Lambda^*$ is nonempty and bounded. Then, the sequence $\{\lambda^k\}_{k \ge 0}$ generated by Algorithm 3.2 is bounded.

Since Assumption 2.3 holds, $\Lambda^*$ is nonempty. Conditions under which $\Lambda^*$ is bounded can be found in [10] (page 301). Under the assumptions of Theorem 3.5, it follows that there exists $R > 0$ sufficiently large such that the set $Q = \{\lambda \in \mathbb{R}^n : \|\lambda\| \le R\}$ contains $\Lambda^*$, and thus we can assume $D_Q$ to be finite. Note that similar arguments were used to prove convergence of two-level algorithms for convex optimization problems (see e.g. [3]). Denote

$$\alpha = 4 \sqrt{ D_Q (D_X + D_Z) \Big( \frac{\|A\|^2}{\sigma_Q \sigma_X} + \frac{\|B\|^2}{\sigma_Q \sigma_Z} \Big) }.$$

The next theorem shows how to choose the smoothness parameter $c$ optimally and provides the complexity estimates of our method when $Q$ is a ball. Our proof follows a reasoning similar to that of [8], but requires more involved arguments.

Theorem 3.6: Assume that there exists $R$ such that the set $Q = \{\lambda \in \mathbb{R}^n : \|\lambda\| \le R\}$ contains a $\lambda^* \in \Lambda^*$. Taking

$$c = \frac{2}{k+1} \sqrt{ \frac{D_Q}{D_X + D_Z} \Big( \frac{\|A\|^2}{\sigma_Q \sigma_X} + \frac{\|B\|^2}{\sigma_Q \sigma_Z} \Big) },$$

then after $k$ iterations we obtain an approximate solution to problem (1), $(\hat{x}, \hat{z}) = \sum_{l=0}^{k} \frac{2(l+1)}{(k+1)(k+2)} (x^{l+1}, z^{l+1})$ and $\hat{\lambda} = \lambda^k$, which satisfies the following bounds on the duality gap and the constraints:

$$0 \le [\phi_1(\hat{x}) + \phi_2(\hat{z})] - f_0(\hat{\lambda}) \le \frac{\alpha}{k+1}, \qquad \|A\hat{x} + B\hat{z} - b\| \le \frac{\alpha}{(R - \|\lambda^*\|)(k+1)}.$$

Proof: Using (9) and the form of $Q$ we obtain

$$\max_{\lambda \in Q} \Big\{ -\frac{4 L_c}{\sigma_Q (k+1)^2} d_Q(\lambda) + \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \Big\} \;\ge\; -\frac{4 L_c D_Q}{\sigma_Q (k+1)^2} + \max_{\|\lambda\| \le R} \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \;=\; -\frac{4 L_c D_Q}{\sigma_Q (k+1)^2} + R\, \|A\hat{x} + B\hat{z} - b\|.$$

In view of the previous relation and Theorem 3.4 we obtain the following duality gap:

$$[\phi_1(\hat{x}) + \phi_2(\hat{z})] - f_0(\hat{\lambda}) \;\le\; c\,(D_X + D_Z) + \frac{4 L_c D_Q}{\sigma_Q (k+1)^2} - R\, \|A\hat{x} + B\hat{z} - b\| \;\le\; c\,(D_X + D_Z) + \frac{4 L_c D_Q}{\sigma_Q (k+1)^2}.$$

Minimizing the right-hand side of this inequality over $c$ yields the above expression for $c$ and the upper bound on the duality gap stated in the theorem. Moreover, for the constraints, using Lemma 3.3 and inequality (7) we have

$$(R - \|\lambda^*\|)\, \|A\hat{x} + B\hat{z} - b\| \;\le\; c\,(D_X + D_Z) + \frac{4 L_c D_Q}{\sigma_Q (k+1)^2},$$

and replacing the $c$ derived above we also obtain the bound on the constraint violation.
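As a small worked example of the parameter choice in Theorem 3.6, the following helper computes $c$ and the iteration bound for the ball $Q=\{\|\lambda\|\le R\}$ with the quadratic prox $d_Q(\lambda)=\frac12\|\lambda\|^2$ (so $\sigma_Q=1$ and $D_Q=R^2/2$); the numerical inputs are assumptions.

```python
import numpy as np

# Parameter choice of Theorem 3.6 for Q = {||lambda|| <= R} with d_Q = 0.5*||lambda||^2,
# hence sigma_Q = 1 and D_Q = R^2/2.  The numerical inputs below are assumptions.
def proximal_center_parameters(normA, normB, sigma_X, sigma_Z, D_X, D_Z, R, eps):
    D_Q, sigma_Q = R ** 2 / 2.0, 1.0
    ratio = normA ** 2 / (sigma_Q * sigma_X) + normB ** 2 / (sigma_Q * sigma_Z)
    k_plus_1 = int(np.ceil(4.0 * np.sqrt(D_Q * (D_X + D_Z) * ratio) / eps))   # iterations
    c = (2.0 / k_plus_1) * np.sqrt(D_Q * ratio / (D_X + D_Z))                 # smoothness
    return c, k_plus_1

c, iters = proximal_center_parameters(normA=2.0, normB=1.5, sigma_X=1.0, sigma_Z=1.0,
                                      D_X=2.0, D_Z=1.5, R=10.0, eps=1e-2)
print("c =", c, " iterations =", iters)   # duality gap is then at most alpha/(k+1) <= eps
```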
From Theorem 3.6 we obtain that the complexity of finding an $(\epsilon, \epsilon_c)$-approximation of the optimum $f^*$, when the set $Q$ is a ball, is

$$k + 1 = 4 \sqrt{ D_Q (D_X + D_Z) \Big( \frac{\|A\|^2}{\sigma_Q \sigma_X} + \frac{\|B\|^2}{\sigma_Q \sigma_Z} \Big) } \, \frac{1}{\epsilon},$$

i.e. the efficiency estimate of our scheme is of order $O(\frac{1}{\epsilon})$, better than that of most non-smooth optimization schemes such as the subgradient method, whose efficiency estimate is of order $O(\frac{1}{\epsilon^2})$ (see e.g. [7]). The dependence of the parameters $c$ and $L_c$ on $\epsilon$ is as follows: $c = \frac{\epsilon}{2(D_X + D_Z)}$ and $L_c = \big( \frac{\|A\|^2}{\sigma_X} + \frac{\|B\|^2}{\sigma_Z} \big) \frac{2(D_X + D_Z)}{\epsilon}$. The main advantage of our scheme is that it is fully automatic: the parameter $c$ is chosen unambiguously, which is crucial for justifying the convergence properties of Algorithm 3.2. Another advantage of the proximal center method is that we are free in the choice of the norms on the spaces $\mathbb{R}^n$, $\mathbb{R}^m$ and $\mathbb{R}^p$, while most decomposition schemes are based on the Euclidean norm. Thus, we can choose the norms that make the ratios $\frac{\|A\|^2}{\sigma_Q \sigma_X}$ and $\frac{\|B\|^2}{\sigma_Q \sigma_Z}$ as small as possible.

D. Efficiency estimates for the Euclidean norm

In this section we assume the Euclidean norm on $\mathbb{R}^n$.

Theorem 3.7: Assume that $Q = \mathbb{R}^n$ and $d_Q(\lambda) = \frac{1}{2}\|\lambda\|^2$, with the Euclidean norm on $\mathbb{R}^n$. Taking $c = \frac{\epsilon}{D_X + D_Z}$ and

$$k + 1 = 2 \sqrt{ \Big( \frac{\|A\|^2}{\sigma_X} + \frac{\|B\|^2}{\sigma_Z} \Big) (D_X + D_Z) } \, \frac{1}{\epsilon},$$

then after $k$ iterations the duality gap is less than $\epsilon$ and the constraints satisfy $\|A\hat{x} + B\hat{z} - b\| \le \epsilon \big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + 2} \big)$.

Proof: Note that for these choices of $Q$ and $d_Q$ we have $\sigma_Q = 1$, and thus

$$\max_{\lambda \in \mathbb{R}^n} \Big\{ -\frac{4 L_c}{\sigma_Q (k+1)^2} d_Q(\lambda) + \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \Big\} = \frac{(k+1)^2}{8 L_c} \|A\hat{x} + B\hat{z} - b\|^2.$$

We obtain the following bound on the duality gap (Theorem 3.4):

$$[\phi_1(\hat{x}) + \phi_2(\hat{z})] - f_0(\hat{\lambda}) \;\le\; c\,(D_X + D_Z) - \frac{(k+1)^2}{8 L_c} \|A\hat{x} + B\hat{z} - b\|^2 \;\le\; c\,(D_X + D_Z).$$

It follows that taking $c = \frac{\epsilon}{D_X + D_Z}$, the duality gap is less than $\epsilon$. For the constraints, using Lemma 3.3 and inequality (7), we get that $y = \|A\hat{x} + B\hat{z} - b\|$ satisfies the second-order inequality

$$\frac{(k+1)^2}{8 L_c}\, y^2 - \|\lambda^*\|\, y - \epsilon \le 0.$$

Therefore, $\|A\hat{x} + B\hat{z} - b\|$ must be less than the largest root of the corresponding second-order equation, i.e.

$$\|A\hat{x} + B\hat{z} - b\| \le \Big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + \frac{\epsilon (k+1)^2}{2 L_c}} \Big) \frac{4 L_c}{(k+1)^2}.$$

After some long but straightforward computations we get that after $k$ iterations, with $k$ defined as in the theorem, we also have $\|A\hat{x} + B\hat{z} - b\| \le \epsilon \big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + 2} \big)$.

Remark 3.8: When coupling inequalities $Ax + Bz - b \le 0$ are present, we choose $Q = \mathbb{R}^n_+$. Using the same reasoning as before we get that

$$\max_{\lambda \ge 0} \Big\{ -\frac{4 L_c}{\sigma_Q (k+1)^2} d_Q(\lambda) + \langle A\hat{x} + B\hat{z} - b, \lambda \rangle \Big\} = \frac{(k+1)^2}{8 L_c} \big\| [A\hat{x} + B\hat{z} - b]_+ \big\|^2,$$

where $[y]_+$ denotes the projection of $y \in \mathbb{R}^n$ onto $\mathbb{R}^n_+$. Taking for $c$ and $k$ the same values as in Theorem 3.7, we can conclude that after $k$ iterations the duality gap is less than $\epsilon$, and using a modified version of Lemma 3.3 (i.e. a generalized version of the Cauchy-Schwarz inequality: $\langle y, \lambda \rangle \le \|[y]_+\| \, \|\lambda\|$ for any $y \in \mathbb{R}^n$, $\lambda \ge 0$) the constraint violation satisfies $\big\| [A\hat{x} + B\hat{z} - b]_+ \big\| \le \epsilon \big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + 2} \big)$.
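For the inequality-constrained case of Remark 3.8, with $Q=\mathbb{R}^n_+$ and the quadratic prox $d_Q(\lambda)=\frac12\|\lambda\|^2$, the multiplier steps of Algorithm 3.2 stay explicit and only pick up a projection onto the nonnegative orthant; the short sketch below spells this out (the arguments $g_k$ and the weighted gradient sum are placeholders for the quantities produced in step 2).

```python
import numpy as np

# Multiplier steps of Algorithm 3.2 when the coupling is Ax + Bz <= b (Remark 3.8):
# Q = R^n_+ and d_Q = 0.5*||lambda||^2, so steps 3 and 4 reduce to projected updates.
# g_k and weighted_grad_sum stand for grad f_c(u^k) and sum_l (l+1)/2 * grad f_c(u^l).

def multiplier_step3(u_k, g_k, L_c):
    # argmax over lambda >= 0 of <g_k, lambda - u_k> - (L_c/2)*||lambda - u_k||^2
    return np.maximum(u_k + g_k / L_c, 0.0)

def multiplier_step4(weighted_grad_sum, L_c):
    # argmax over lambda >= 0 of -(L_c/2)*||lambda||^2 + <weighted_grad_sum, lambda>
    return np.maximum(weighted_grad_sum / L_c, 0.0)
```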
IV. APPLICATIONS

A. Applications with separable structure

In this section we briefly discuss some of the applications to which our method can be applied. The first application that we discuss here is the control of large-scale systems with interacting subsystem dynamics. A distributed MPC framework is appealing in this context since it allows us to design local subsystem-based controllers that take care of the interactions between different subsystems and of the physical constraints. We assume that the overall system model can be decomposed into $M$ appropriate subsystem models:

$$x_i(k+1) = \sum_{j \in N(i)} \big( A_{ij} x_j(k) + B_{ij} u_j(k) \big), \qquad i = 1, \dots, M,$$

where $N(i)$ denotes the set of subsystems that interact with the $i$th subsystem, including itself. The control and state sequences must satisfy local constraints $x_i(k) \in \Omega_i$ and $u_i(k) \in U_i$ for all $i = 1, \dots, M$ and $k \ge 0$, where the sets $\Omega_i$ and $U_i$ are usually convex compact sets with the origin in their interior. In general the control objective is to steer the state of the system to the origin, or to any other set point, in a best way. Performance is expressed via a stage cost, which in this paper we assume to have the following form (see also [13]): $\sum_{i=1}^{M} \ell_i(x_i, u_i)$, where usually $\ell_i$ is a convex quadratic function, but not necessarily strictly convex. Let $N$ denote the prediction horizon. In MPC we must solve at each step $k$, given $x_i(k) = x_i$, a finite-horizon optimal control problem of the following form:

$$\min_{x_l^i,\, u_l^i} \; \sum_{i=1}^{M} \sum_{l=0}^{N-1} \ell_i(x_l^i, u_l^i) \qquad (10)$$
$$\text{s.t.: } x_0^i = x_i, \quad x_{l+1}^i = \sum_{j \in N(i)} \big( A_{ij} x_l^j + B_{ij} u_l^j \big),$$
$$x_N^i \in \Omega_i, \quad x_l^i \in \Omega_i, \quad u_l^i \in U_i, \quad l = 0, \dots, N-1, \; i = 1, \dots, M.$$

Note that a similar formulation of distributed MPC for coupled linear subsystems with decoupled costs was given in [13], but without state constraints (i.e. without imposing $x_i(k) \in \Omega_i$).

Let us introduce $x^i = (x_0^i, \dots, x_N^i, u_0^i, \dots, u_{N-1}^i)$, $X_i = \Omega_i^{N+1} \times U_i^N$ and $\psi_i(x^i) = \sum_{l=0}^{N-1} \ell_i(x_l^i, u_l^i)$. Then the control problem (10) can be recast as a separable convex program:

$$\min_{x^i \in X_i} \Big\{ \sum_{i=1}^{M} \psi_i(x^i) : \sum_{i=1}^{M} C_i x^i - \gamma = 0 \Big\}, \qquad (11)$$

where the matrices $C_i$ are defined appropriately (a concrete construction is sketched below).

Network optimization furnishes another area in which our Algorithm 3.2 leads to a new method of solution. In this application the convex optimization problem has the following form [9], [14]:

$$\min_{x^i \in X_i} \Big\{ \sum_{i=1}^{M} \psi_i(x^i) : \sum_{i=1}^{M} C_i x^i - \gamma = 0, \; \sum_{i=1}^{M} D_i x^i - \beta \le 0 \Big\}, \qquad (12)$$

where the $X_i$ are compact convex sets (in general balls) in $\mathbb{R}^m$, the $\psi_i$ are non-strictly convex functions and $M$ denotes the number of agents in the network. Note that (11) is a particular case of the separable convex problem (12).

In [13] the optimization problem (10) (or equivalently (11)) was solved in a decentralized fashion by iterating the Jacobi algorithm $p_{\max}$ times [1]. However, there is no theoretical guarantee for the Jacobi algorithm on how good the approximation of the optimum is after $p_{\max}$ iterations, and moreover strictly convex functions $\psi_i$ are needed to prove asymptotic convergence to the optimum. If instead we solve (10) using our Algorithm 3.2, we have a guaranteed upper bound on the approximation after $p_{\max}$ iterations (see Theorem 3.4).
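To make the recast (11) concrete, the following sketch builds the stacked per-agent variables $x^i$ and the coupling matrices $C_i$ for a hypothetical chain of $M$ subsystems, where subsystem $i$ is influenced only by itself and by subsystem $i-1$; the topology, dimensions and system matrices are assumptions made only for this illustration.

```python
import numpy as np

# Hypothetical illustration of the recast (11): stacking the distributed MPC problem (10)
# into  min sum_i psi_i(x^i)  s.t.  sum_i C_i x^i - gamma = 0,  x^i in X_i,  for a chain
# of M subsystems with dynamics x_i(k+1) = A_{ii} x_i(k) + A_{i,i-1} x_{i-1}(k) + B_{ii} u_i(k).
# Topology, dimensions and system matrices are assumptions made only for this sketch.
M, N = 3, 5                          # number of subsystems, prediction horizon
nx, nu = 2, 1                        # local state and input dimensions
rng = np.random.default_rng(4)
A = {(i, j): 0.1 * rng.standard_normal((nx, nx))
     for i in range(M) for j in (i - 1, i) if j >= 0}
B = {(i, i): rng.standard_normal((nx, nu)) for i in range(M)}
x_init = [rng.standard_normal(nx) for _ in range(M)]

ni = (N + 1) * nx + N * nu           # length of x^i = (x_0^i..x_N^i, u_0^i..u_{N-1}^i)
x_cols = lambda l: slice(l * nx, (l + 1) * nx)
u_cols = lambda l: slice((N + 1) * nx + l * nu, (N + 1) * nx + (l + 1) * nu)

n_eq = M * nx + M * N * nx           # initial conditions plus dynamics over the horizon
C = [np.zeros((n_eq, ni)) for _ in range(M)]
gamma = np.zeros(n_eq)

row = 0
for i in range(M):                   # rows enforcing x_0^i = x_i(0)
    C[i][row:row + nx, x_cols(0)] = np.eye(nx)
    gamma[row:row + nx] = x_init[i]
    row += nx
for i in range(M):                   # rows enforcing x_{l+1}^i = sum_j A_ij x_l^j + B_ii u_l^i
    for l in range(N):
        C[i][row:row + nx, x_cols(l + 1)] = np.eye(nx)
        for j in (i - 1, i):
            if (i, j) in A:
                C[j][row:row + nx, x_cols(l)] -= A[(i, j)]
        C[i][row:row + nx, u_cols(l)] -= B[(i, i)]
        row += nx
# The local objective psi_i(x^i) = sum_l l_i(x_l^i, u_l^i) and the set X_i = Omega_i^{N+1} x U_i^N
# stay with each agent; only C_i and gamma enter the coupling handled by the multipliers.
```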

In [9], [14] the optimization problem (12) is solved using the dual subgradient method described in Algorithm 2.2. From the simulation tests of Section IV-B we see that our Algorithm 3.2 is superior to Algorithm 2.2.

Let us briefly describe the main ingredients of Algorithm 3.2 for problem (12). Let $d_{X_i}$ be properly chosen prox-functions, according to the structure of the sets $X_i$ and the norm used on $\mathbb{R}^m$ (see footnote 1). In this case the smooth dual function $f_c$ has the form

$$f_c(\lambda) = \sum_{i=1}^{M} \min_{x^i \in X_i} \big[ \psi_i(x^i) + \langle \lambda_1, C_i x^i \rangle + \langle \lambda_2, D_i x^i \rangle + c\, d_{X_i}(x^i) \big] - \langle \lambda_1, \gamma \rangle - \langle \lambda_2, \beta \rangle.$$

Moreover, $Q \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}_+$, where $n_1$ and $n_2$ denote the numbers of equality and inequality constraints. It is worth noting that in this case all the minimization problems involved in Algorithm 3.2 are decomposable in $x^i$, and thus the agents can solve the optimization problem (12) in a decentralized fashion.

Footnote 1: For example, if $X_i = \{x \in \mathbb{R}^m : \|x - x_0\|_2 \le R\}$, where $\|\cdot\|_2$ denotes the Euclidean norm, then it is natural to take $d_{X_i}(x) = \frac{1}{2}\|x - x_0\|_2^2$. If $X_i = \{x \in \mathbb{R}^m : x \ge 0, \sum_{i=1}^{m} x^{(i)} = 1\}$, then the norm $\|x\| = \sum_{i=1}^{m} |x^{(i)}|$ and the entropy prox-function $d_{X_i}(x) = \ln m + \sum_{i=1}^{m} x^{(i)} \ln x^{(i)}$ are more suitable (see [8] for more details).
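The two prox-function choices mentioned in footnote 1 can be written out directly; the sketch below also records the corresponding constants $D_{X_i}$ that enter the complexity bounds (the $\frac12$ scaling for the ball case is a standard choice assumed here).

```python
import numpy as np

# The two prox-function choices of footnote 1, written out as a sketch.  The 1/2 scaling
# for the ball case is a standard choice assumed here.
def d_ball(x, x0):
    # X_i = {x : ||x - x0||_2 <= R}: d(x) = 0.5*||x - x0||^2, strongly convex with
    # parameter 1 w.r.t. the Euclidean norm, center x0; over the ball, D_{X_i} = R^2/2.
    return 0.5 * float(np.sum((x - x0) ** 2))

def d_simplex(x):
    # X_i = {x >= 0, sum x = 1} with the 1-norm: entropy d(x) = ln(m) + sum_i x_i*ln(x_i),
    # strongly convex with parameter 1 w.r.t. ||.||_1, centered at the uniform point
    # (where it vanishes); over the simplex, D_{X_i} = ln(m), attained at the vertices.
    m = x.size
    xlogx = np.where(x > 0, x * np.log(np.clip(x, 1e-300, None)), 0.0)
    return float(np.log(m) + np.sum(xlogx))
```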
B. Computational results

We conclude this paper with the results of computational experiments on a randomly generated set of problems of the form (12), where the $\psi_i$ are convex quadratic functions (but not strictly convex) and the $X_i$ are balls in $\mathbb{R}^m$ defined with the Euclidean norm. Similarly, $\mathbb{R}^n$ (corresponding to the Lagrange multipliers) is endowed with the Euclidean norm.

[Table: number of iterations, with the achieved accuracy in brackets, of Algorithm 3.2 versus Algorithm 2.2 for $m = 50, 100, 1000$ and $M = 10$ agents. The recoverable entries show Algorithm 2.2 stopping at 5000 (0.31), 5000 (0.49) and 10000 (0.57) iterations, while the accuracies recorded for Algorithm 3.2 range from 0.0011 to 0.01.]

In the table we display the number of iterations of Algorithms 2.2 and 3.2 for different values of $m = 50, 100$ and $1000$, and for $M = 10$ agents. For $m = 50$ and $m = 100$ the maximum number of iterations that we allow is 5000; for $m = 1000$ we iterate at most 10000 times. We also display in brackets the corresponding accuracy. We see that the accuracy of the approximation of the optimum is much better with our Algorithm 3.2 than with Algorithm 2.2.

V. CONCLUSIONS

A new decomposition method in convex programming is developed in this paper using the framework of dual decomposition. Our method combines the low computational cost per iteration of the gradient method with the efficiency of structural optimization for solving non-smooth separable convex programs. Although our decomposition method resembles proximal-based methods, it differs from them both in the computational steps and in the choice of the parameters of the method. Contrary to most proximal-based methods, which force the next iterates to be close to the previous ones, our method uses fixed centers in the prox-terms, which leads to more freedom in the next iterates. Another advantage of our proximal center method is that it is fully automatic, i.e. the parameters of the scheme are chosen optimally, which is crucial for justifying the convergence properties of the new scheme. We presented efficiency estimates of the new decomposition method for general separable convex problems. We also showed that our algorithm can be used to solve network optimization and distributed model predictive control problems. The computations on some test problems confirm that the proposed method works well in practice.

Acknowledgment. We acknowledge the financial support by Research Council K.U. Leuven: GOA AMBioRICS, CoE EF/05/006, OT/03/12, PhD/postdoc & fellow grants; Flemish Government: FWO PhD/postdoc grants, FWO projects G, G, G, G; Research communities (ICCoS, ANMMM, MLDM); AWI: BIL/05/43; IWT: PhD grants; Belgian Federal Science Policy Office: IUAP DYSCO.

REFERENCES

[1] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Englewood Cliffs, NJ, 1989.
[2] G. Chen and M. Teboulle. A proximal-based decomposition method for convex minimization problems. Mathematical Programming (A), 64:81-101, 1994.
[3] G. Cohen. Optimization by decomposition and coordination: A unified approach. IEEE Transactions on Automatic Control, AC-23(2), 1978.
[4] J. Eckstein and D. Bertsekas. On the Douglas-Rachford splitting method and the proximal point method for maximal monotone operators. Mathematical Programming (A), 55(3), 1992.
[5] S. Kontogiorgis, R. De Leone, and R. Meyer. Alternating direction splittings for block angular parallel optimization. Journal of Optimization Theory and Applications, 90(1):1-29, 1996.
[6] I. Necoara and J.A.K. Suykens. Application of a smoothing technique to decomposition in convex programming. Technical Report 08-07, ESAT-SISTA, K.U. Leuven (Leuven, Belgium), January 2008.
[7] Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Boston, 2004.
[8] Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming (A), 103(1):127-152, 2005.
[9] D. Palomar and M. Chiang. Alternative distributed algorithms for network utility maximization: Framework and applications. IEEE Transactions on Automatic Control, 52(12), 2007.
[10] R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematics Series. Princeton University Press, 1970.
[11] J. E. Spingarn. Applications of the method of partial inverses to convex programming: decomposition. Mathematical Programming (A), 32, 1985.
[12] P. Tseng. Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM Journal on Control and Optimization, 29(1), 1991.
[13] A. Venkat, I. Hiskens, J. Rawlings, and S. Wright. Distributed MPC strategies with application to power system automatic generation control. IEEE Transactions on Control Systems Technology, to appear.
[14] L. Xiao, M. Johansson, and S. Boyd. Simultaneous routing and resource allocation via dual decomposition. IEEE Transactions on Communications, 52(7), 2004.


More information

Modified Augmented Lagrangian Coordination and Alternating Direction Method of Multipliers with

Modified Augmented Lagrangian Coordination and Alternating Direction Method of Multipliers with Modified Augmented Lagrangian Coordination and Alternating Direction Method of Multipliers with Parallelization in Non-hierarchical Analytical Target Cascading Yongsu Jung Department of Mechanical Engineering,

More information

Convex Analysis and Minimization Algorithms I

Convex Analysis and Minimization Algorithms I Jean-Baptiste Hiriart-Urruty Claude Lemarechal Convex Analysis and Minimization Algorithms I Fundamentals With 113 Figures Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona

More information

Convex Optimization / Homework 2, due Oct 3

Convex Optimization / Homework 2, due Oct 3 Convex Optimization 0-725/36-725 Homework 2, due Oct 3 Instructions: You must complete Problems 3 and either Problem 4 or Problem 5 (your choice between the two) When you submit the homework, upload a

More information

A Comparison of Mixed-Integer Programming Models for Non-Convex Piecewise Linear Cost Minimization Problems

A Comparison of Mixed-Integer Programming Models for Non-Convex Piecewise Linear Cost Minimization Problems A Comparison of Mixed-Integer Programming Models for Non-Convex Piecewise Linear Cost Minimization Problems Keely L. Croxton Fisher College of Business The Ohio State University Bernard Gendron Département

More information

Lecture 10: SVM Lecture Overview Support Vector Machines The binary classification problem

Lecture 10: SVM Lecture Overview Support Vector Machines The binary classification problem Computational Learning Theory Fall Semester, 2012/13 Lecture 10: SVM Lecturer: Yishay Mansour Scribe: Gitit Kehat, Yogev Vaknin and Ezra Levin 1 10.1 Lecture Overview In this lecture we present in detail

More information

Lecture 5: Duality Theory

Lecture 5: Duality Theory Lecture 5: Duality Theory Rajat Mittal IIT Kanpur The objective of this lecture note will be to learn duality theory of linear programming. We are planning to answer following questions. What are hyperplane

More information

Algorithms for convex optimization

Algorithms for convex optimization Algorithms for convex optimization Michal Kočvara Institute of Information Theory and Automation Academy of Sciences of the Czech Republic and Czech Technical University kocvara@utia.cas.cz http://www.utia.cas.cz/kocvara

More information

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms By:- Nitin Kamra Indian Institute of Technology, Delhi Advisor:- Prof. Ulrich Reude 1. Introduction to Linear

More information

3 Interior Point Method

3 Interior Point Method 3 Interior Point Method Linear programming (LP) is one of the most useful mathematical techniques. Recent advances in computer technology and algorithms have improved computational speed by several orders

More information

Comparison of Interior Point Filter Line Search Strategies for Constrained Optimization by Performance Profiles

Comparison of Interior Point Filter Line Search Strategies for Constrained Optimization by Performance Profiles INTERNATIONAL JOURNAL OF MATHEMATICS MODELS AND METHODS IN APPLIED SCIENCES Comparison of Interior Point Filter Line Search Strategies for Constrained Optimization by Performance Profiles M. Fernanda P.

More information

Part 4. Decomposition Algorithms Dantzig-Wolf Decomposition Algorithm

Part 4. Decomposition Algorithms Dantzig-Wolf Decomposition Algorithm In the name of God Part 4. 4.1. Dantzig-Wolf Decomposition Algorithm Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Introduction Real world linear programs having thousands of rows and columns.

More information

DISTRIBUTED ALGORITHMS FOR BASIS PURSUIT. Institute for Systems and Robotics / IST, Lisboa, Portugal

DISTRIBUTED ALGORITHMS FOR BASIS PURSUIT. Institute for Systems and Robotics / IST, Lisboa, Portugal DISTRIBUTED ALGORITHMS FOR BASIS PURSUIT João F C Mota, João M F Xavier, Pedro M Q Aguiar, Markus Püschel Institute for Systems and Robotics / IST, Lisboa, Portugal {jmota,jxavier,aguiar}@isristutlpt Department

More information

Dual Subgradient Methods Using Approximate Multipliers

Dual Subgradient Methods Using Approximate Multipliers Dual Subgradient Methods Using Approximate Multipliers Víctor Valls, Douglas J. Leith Trinity College Dublin Abstract We consider the subgradient method for the dual problem in convex optimisation with

More information