Augmented Lagrangian Methods


Augmented Lagrangian Methods
Mário A. T. Figueiredo (Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal)
Stephen J. Wright (Computer Sciences Department, University of Wisconsin, Madison, WI, USA)
HIM, Bonn, January 2016

Minimization with Linear Constraints: Basics
Consider the linearly constrained problem
  min_x f(x)  s.t.  Ax = b,
where f : R^n → R is smooth. How do we recognize that a point x* is a solution of this problem? (Such optimality conditions can provide the foundation for algorithms.)
The Karush-Kuhn-Tucker (KKT) condition is a first-order necessary condition: if x* is a local solution, there exists a vector of Lagrange multipliers λ* ∈ R^m such that
  ∇f(x*) = −A^T λ*,  Ax* = b.
When f is smooth and convex, these conditions are also sufficient. (In fact, it is enough for f to be convex on the null space of A.)
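As a quick sanity check of these conditions, here is a small numerical example (hypothetical, not from the slides): with f(x) = (1/2)||x − c||^2, the KKT conditions form a single linear system in (x*, λ*).

```python
# Minimal numerical check of the KKT conditions (hypothetical example):
# minimize f(x) = 0.5*||x - c||^2  subject to  A x = b.
# With L(x, lam) = f(x) + lam^T (A x - b), the conditions are
#   grad f(x*) + A^T lam* = 0,   A x* = b,
# which here is one linear system in (x*, lam*).
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)

# Assemble and solve the KKT system  [[I, A^T], [A, 0]] [x; lam] = [c; b].
K = np.block([[np.eye(n), A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([c, b]))
x_star, lam_star = sol[:n], sol[n:]

# Verify both KKT conditions numerically.
print(np.linalg.norm((x_star - c) + A.T @ lam_star))  # stationarity ~ 0
print(np.linalg.norm(A @ x_star - b))                 # feasibility  ~ 0
```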

Minimization with Linear Constraints
Define the Lagrangian function
  L(x, λ) := f(x) + λ^T (Ax − b).
The KKT conditions can be written in terms of L as
  ∇_x L(x*, λ*) = 0,  ∇_λ L(x*, λ*) = 0.
Suppose now that f is convex but not smooth. The first-order optimality conditions (necessary and sufficient) are that there exists λ* ∈ R^m such that
  −A^T λ* ∈ ∂f(x*),  Ax* = b,
where ∂f is the subdifferential. In terms of the Lagrangian, we have
  0 ∈ ∂_x L(x*, λ*),  ∇_λ L(x*, λ*) = 0.

Augmented Lagrangian Methods
With f proper, lower semicontinuous, and convex, consider
  min_x f(x)  s.t.  Ax = b.
The augmented Lagrangian is (with ρ > 0)
  L(x, λ; ρ) := f(x) + λ^T (Ax − b) + (ρ/2) ‖Ax − b‖_2^2,
where the first two terms form the Lagrangian and the last term is the augmentation.
The basic augmented Lagrangian method (a.k.a. method of multipliers) is (Hestenes, 1969; Powell, 1969):
  x_k = arg min_x L(x, λ_{k−1}; ρ);
  λ_k = λ_{k−1} + ρ(A x_k − b).
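A minimal sketch of this iteration, assuming a strictly convex quadratic f (a hypothetical example; the matrices Q, A and vectors c, b below are made up), so that each x-subproblem reduces to a linear solve:

```python
# Method of multipliers sketch (hypothetical quadratic example):
# minimize 0.5*x^T Q x + c^T x  subject to  A x = b.
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
M = rng.standard_normal((n, n)); Q = M.T @ M + np.eye(n)   # strictly convex quadratic
c = rng.standard_normal(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)

rho = 10.0
lam = np.zeros(m)
for k in range(50):
    # x_k = arg min_x L(x, lam; rho):  (Q + rho*A^T A) x = -c - A^T lam + rho*A^T b
    x = np.linalg.solve(Q + rho * A.T @ A, -c - A.T @ lam + rho * A.T @ b)
    # multiplier update
    lam = lam + rho * (A @ x - b)

print("constraint violation:", np.linalg.norm(A @ x - b))
```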

A Favorite Derivation
...more or less rigorous for convex f. Write the problem as
  min_x max_λ f(x) + λ^T (Ax − b).
Obviously, the max with respect to λ is +∞ unless Ax = b, so this is equivalent to the original problem.
This equivalence is not very useful computationally: the max_λ function is highly nonsmooth with respect to x. Smooth it by adding a proximal-point term that penalizes deviations from a prior estimate λ̄:
  min_x max_λ { f(x) + λ^T (Ax − b) − (1/(2ρ)) ‖λ − λ̄‖^2 }.
Maximization with respect to λ is now trivial (a concave quadratic), yielding
  λ = λ̄ + ρ(Ax − b).

A Favorite Derivation (Cont.)
Inserting λ = λ̄ + ρ(Ax − b) leads to
  min_x f(x) + λ̄^T (Ax − b) + (ρ/2) ‖Ax − b‖^2 = L(x, λ̄; ρ).
Hence we can view the augmented Lagrangian process as:
  minimize L(x, λ̄; ρ) over x to get a new x;
  shift the prior on λ by updating to the latest max: λ̄ ← λ̄ + ρ(Ax − b);
  repeat until convergence.
Add subscripts, and we recover the augmented Lagrangian algorithm of the first slide! We can also increase ρ (to sharpen the effect of the prox term), if needed.

Inequality Constraints, Nonlinear Constraints
The same derivation can be used for inequality constraints:
  min_x f(x)  s.t.  Ax ≥ b.
Apply the same reasoning to the constrained min-max formulation:
  min_x max_{λ≥0} f(x) − λ^T (Ax − b).
After the prox term is added, the optimal λ can be found in closed form (as for prox operators). This leads to the update formula
  λ ← max( λ̄ + ρ(b − Ax), 0 ).
This derivation extends immediately to nonlinear constraints c(x) = 0 or c(x) ≥ 0.

Explicit Constraints, Inequality Constraints
There may be other constraints on x (such as x ∈ Ω) that we prefer to handle explicitly in the subproblem. For the formulation
  min_x f(x)  s.t.  Ax = b, x ∈ Ω,
the min step can enforce x ∈ Ω explicitly:
  x_k = arg min_{x ∈ Ω} L(x, λ_{k−1}; ρ);
  λ_k = λ_{k−1} + ρ(A x_k − b).
This gives an alternative way to handle inequality constraints: introduce slacks s, and enforce them explicitly. That is, replace
  min_x f(x)  s.t.  c(x) ≥ 0
by
  min_x f(x)  s.t.  c(x) = s,  s ≥ 0.

Explicit Constraints, Inequality Constraints (Cont.)
The augmented Lagrangian is now
  L(x, s, λ; ρ) := f(x) + λ^T (c(x) − s) + (ρ/2) ‖c(x) − s‖_2^2.
Enforce s ≥ 0 explicitly in the subproblem:
  (x_k, s_k) = arg min_{x,s} L(x, s, λ_{k−1}; ρ)  s.t.  s ≥ 0;
  λ_k = λ_{k−1} + ρ(c(x_k) − s_k).
There are good algorithmic options for dealing with the bound constraints s ≥ 0 (gradient projection and its enhancements). This approach is used in the Lancelot code (Conn et al., 1992).
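Below is a minimal sketch of this slack-variable scheme on a hypothetical problem (minimize (1/2)||x − c||^2 subject to Gx ≥ h, with made-up G, h, c). For simplicity the (x, s) subproblem is solved inexactly by a few alternating-minimization sweeps rather than by gradient projection; the s-step is just a clipped (projected) update.

```python
# Slack-variable augmented Lagrangian sketch (hypothetical example):
# minimize 0.5*||x - c||^2  subject to  G x >= h,  rewritten as  G x - h = s, s >= 0.
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 4
G = rng.standard_normal((m, n)); h = rng.standard_normal(m)
c = rng.standard_normal(n)

rho = 5.0
lam = np.zeros(m)
s = np.zeros(m)
for k in range(100):
    # inexact joint minimization of L(x, s, lam; rho) over x and s >= 0
    for _ in range(10):
        # x-step: (I + rho*G^T G) x = c - G^T lam + rho*G^T (h + s)
        x = np.linalg.solve(np.eye(n) + rho * G.T @ G,
                            c - G.T @ lam + rho * G.T @ (h + s))
        # s-step: projection of G x - h + lam/rho onto the nonnegative orthant
        s = np.maximum(G @ x - h + lam / rho, 0.0)
    # multiplier update
    lam = lam + rho * (G @ x - h - s)

print("min component of G x - h (should be >= ~0):", (G @ x - h).min())
```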

Quick History of Augmented Lagrangian
Dates from at least 1969: Hestenes, Powell.
Developments in the 1970s and early 1980s by Rockafellar, Bertsekas, and others.
Lancelot code for nonlinear programming: Conn, Gould, Toint, around 1992 (Conn et al., 1992).
Lost favor somewhat as an approach for general nonlinear programming during the next 15 years.
Recent revival in the context of sparse optimization and its many applications, in conjunction with splitting / coordinate descent.

Alternating Direction Method of Multipliers (ADMM)
Consider now problems with a separable objective of the form
  min_{x,z} f(x) + h(z)  s.t.  Ax + Bz = c,
for which the augmented Lagrangian is
  L(x, z, λ; ρ) := f(x) + h(z) + λ^T (Ax + Bz − c) + (ρ/2) ‖Ax + Bz − c‖_2^2.
Standard AL would minimize L(x, z, λ; ρ) with respect to (x, z) jointly. However, since x and z are coupled in the quadratic term, separability is lost.
In ADMM, we minimize over x and z separately and sequentially:
  x_k = arg min_x L(x, z_{k−1}, λ_{k−1}; ρ);
  z_k = arg min_z L(x_k, z, λ_{k−1}; ρ);
  λ_k = λ_{k−1} + ρ(A x_k + B z_k − c).
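A minimal sketch of the three ADMM steps on a hypothetical two-block problem where both subproblems have closed forms (A = I, B = −I, c = 0):

```python
# ADMM sketch (hypothetical example):
# minimize 0.5*||x - a||^2 + 0.5*||z - d||^2  subject to  x - z = 0.
import numpy as np

rng = np.random.default_rng(3)
n = 5
a, d = rng.standard_normal(n), rng.standard_normal(n)

rho = 1.0
z = np.zeros(n); lam = np.zeros(n)
for k in range(200):
    # x-step: arg min_x 0.5*||x - a||^2 + lam^T x + (rho/2)*||x - z||^2
    x = (a - lam + rho * z) / (1.0 + rho)
    # z-step: arg min_z 0.5*||z - d||^2 - lam^T z + (rho/2)*||x - z||^2
    z = (d + lam + rho * x) / (1.0 + rho)
    # multiplier step
    lam = lam + rho * (x - z)

print(np.allclose(x, (a + d) / 2, atol=1e-6))  # both copies converge to the average
```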

ADMM
Main features of ADMM:
Does one cycle of block-coordinate descent in (x, z).
The minimizations over x and z add only a quadratic term to f and h, respectively; this usually does not alter the cost of those minimizations much.
Can perform the (x, z) minimizations inexactly.
Can add explicit (separated) constraints: x ∈ Ω_x, z ∈ Ω_z.
Many (many!) recent applications to compressed sensing, image processing, matrix completion, sparse principal component analysis, ...
ADMM has a rich collection of antecedents, dating even to the 1950s (operator splitting). For a comprehensive recent survey, including a diverse collection of machine learning applications, see Boyd et al. (2011).

ADMM for Consensus Optimization
Given the unconstrained (but separable) problem
  min_x Σ_{i=1}^m f_i(x),
form m copies x^i of x, with the original x as a master variable:
  min_{x, x^1, ..., x^m} Σ_{i=1}^m f_i(x^i)  subject to  x^i − x = 0,  i = 1, 2, ..., m.
Apply ADMM, with z = (x^1, x^2, ..., x^m). We get
  L(x, x^1, ..., x^m, λ^1, ..., λ^m; ρ) = Σ_{i=1}^m [ f_i(x^i) + (λ^i)^T (x^i − x) + (ρ/2) ‖x^i − x‖_2^2 ].
The minimization w.r.t. z = (x^1, x^2, ..., x^m) is separable:
  x^i_k = arg min_{x^i} f_i(x^i) + (λ^i_{k−1})^T (x^i − x_{k−1}) + (ρ_k/2) ‖x^i − x_{k−1}‖_2^2,  i = 1, 2, ..., m.
It can be implemented in parallel.

Consensus, Continued
The minimization w.r.t. x can be done explicitly, by averaging:
  x_k = (1/m) Σ_{i=1}^m ( x^i_k + (1/ρ_k) λ^i_{k−1} ).
The update to each λ^i can also be done in parallel, once the new x_k is known (and communicated):
  λ^i_k = λ^i_{k−1} + ρ_k (x^i_k − x_k),  i = 1, 2, ..., m.
If the initial multipliers satisfy Σ_{i=1}^m λ^i_0 = 0, one can see that Σ_{i=1}^m λ^i_k = 0 at all iterations k, so the update for x_k simplifies to
  x_k = (1/m) Σ_{i=1}^m x^i_k.
Gather-scatter implementation.
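A minimal sketch of this gather-scatter scheme, assuming hypothetical local objectives f_i(x) = (1/2)||x − a_i||^2 so that each local update has a closed form:

```python
# Consensus ADMM sketch (hypothetical example): m agents, each with local data a_i.
import numpy as np

rng = np.random.default_rng(4)
m, n = 10, 3
a = rng.standard_normal((m, n))       # local data held by each agent

rho = 1.0
xbar = np.zeros(n)
lam = np.zeros((m, n))                # sum over i is zero and stays zero
for k in range(200):
    # local updates (parallel across agents i):
    # x^i = arg min f_i(x^i) + lam_i^T (x^i - xbar) + (rho/2)*||x^i - xbar||^2
    xi = (a - lam + rho * xbar) / (1.0 + rho)
    # gather: the master update reduces to an average because the lam_i sum to zero
    xbar = xi.mean(axis=0)
    # scatter: multiplier updates (also parallel)
    lam = lam + rho * (xi - xbar)

print(np.allclose(xbar, a.mean(axis=0), atol=1e-6))  # consensus at the global average
```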

ADMM for Awkward Intersections
The feasible set is sometimes an intersection of two or more convex sets that are easy to handle separately (e.g., projections are easily computable), but whose intersection is more difficult to work with.
Example: optimization over the cone of doubly nonnegative matrices,
  min_X f(X)  s.t.  X ⪰ 0 (positive semidefinite),  X ≥ 0 (elementwise).
General form:
  min_x f(x)  s.t.  x ∈ Ω_i,  i = 1, 2, ..., m.
Again, use a different copy x^i for each set, and constrain them all to be the same:
  min_{x, x^1, ..., x^m} f(x)  s.t.  x^i ∈ Ω_i,  x − x^i = 0,  i = 1, 2, ..., m.

ADMM for Awkward Intersections (Cont.)
Separable minimizations over the Ω_i, i = 1, 2, ..., m:
  x^i_k = arg min_{x^i ∈ Ω_i} (λ^i_{k−1})^T (x_{k−1} − x^i) + (ρ_k/2) ‖x_{k−1} − x^i‖_2^2,  i = 1, 2, ..., m.
Optimize over the master variable x (unconstrained, with a quadratic added to f):
  x_k = arg min_x f(x) + Σ_{i=1}^m [ (λ^i_{k−1})^T (x − x^i_k) + (ρ_k/2) ‖x − x^i_k‖_2^2 ].
Update multipliers:
  λ^i_k = λ^i_{k−1} + ρ_k (x_k − x^i_k),  i = 1, 2, ..., m.
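A minimal sketch of this copies scheme on the doubly nonnegative example above, with the hypothetical choice f(X) = (1/2)||X − C||_F^2; each copy update is then the projection Proj_{Ω_i}(x_{k−1} + λ^i_{k−1}/ρ):

```python
# Copies scheme sketch (hypothetical example): project a symmetric matrix C onto
# {X PSD} ∩ {X >= 0 elementwise}, one copy per set.
import numpy as np

def proj_psd(M):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    w, Q = np.linalg.eigh((M + M.T) / 2)
    return Q @ np.diag(np.maximum(w, 0)) @ Q.T

def proj_nonneg(M):
    """Project onto the elementwise nonnegative matrices."""
    return np.maximum(M, 0)

rng = np.random.default_rng(5)
n = 6
C = rng.standard_normal((n, n)); C = (C + C.T) / 2

rho = 1.0
projections = [proj_psd, proj_nonneg]
m = len(projections)
X = C.copy()
Lam = [np.zeros((n, n)) for _ in range(m)]
for k in range(300):
    # copy updates (separable, could run in parallel): X^i = Proj_{Omega_i}(X + Lam^i/rho)
    Xi = [P(X + L / rho) for P, L in zip(projections, Lam)]
    # master update: arg min_X 0.5*||X - C||_F^2 + sum_i <Lam^i, X - X^i> + (rho/2)*||X - X^i||_F^2
    X = (C - sum(Lam) + rho * sum(Xi)) / (1.0 + m * rho)
    # multiplier updates
    Lam = [L + rho * (X - Z) for L, Z in zip(Lam, Xi)]

print("min eigenvalue:", np.linalg.eigvalsh(X).min(), " min entry:", X.min())
```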

ADMM: A Simpler Form
Often a simpler version is enough:
  min_{x,z} f(x) + h(z)  s.t.  Ax = z,
which is equivalent to min_x f(x) + h(Ax), often the form of interest.
In this case, ADMM can be written as
  x_k = arg min_x f(x) + (ρ/2) ‖Ax − z_{k−1} − d_{k−1}‖_2^2
  z_k = arg min_z h(z) + (ρ/2) ‖A x_k − z − d_{k−1}‖_2^2
  d_k = d_{k−1} − (A x_k − z_k),
the so-called scaled version (Boyd et al., 2011).
Updating z_k is a proximity computation: z_k = prox_{h/ρ}( A x_k − d_{k−1} ).
Updating x_k may be hard: if f is quadratic, it involves a matrix inversion; if f is not quadratic, it may be as hard as the original problem.
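A minimal sketch of this scaled ADMM, assuming the hypothetical choices f(x) = (1/2)||x − y||^2 and h(v) = τ||v||_1: the x-update is a linear solve (the "matrix inversion" for quadratic f) and the z-update is soft thresholding, the prox of h/ρ.

```python
# Scaled ADMM sketch (hypothetical example): minimize 0.5*||x - y||^2 + tau*||A x||_1.
import numpy as np

def soft(v, t):
    """Soft-thresholding: prox of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(6)
n, p = 8, 12
A = rng.standard_normal((p, n))
y = rng.standard_normal(n)
tau, rho = 0.5, 1.0

z = np.zeros(p); d = np.zeros(p)
for k in range(300):
    # x-step: (I + rho*A^T A) x = y + rho*A^T (z + d)
    x = np.linalg.solve(np.eye(n) + rho * A.T @ A, y + rho * A.T @ (z + d))
    # z-step: prox_{h/rho} applied to A x - d
    z = soft(A @ x - d, tau / rho)
    # scaled dual update
    d = d - (A @ x - z)

print("residual ||Ax - z||:", np.linalg.norm(A @ x - z))
```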

ADMM: Convergence
Consider the problem min_x f(x) + h(Ax), where f and h are lower semicontinuous, proper, convex functions and A has full column rank.
The ADMM algorithm presented on the previous slide converges (for any ρ > 0) to a solution x*, if one exists; otherwise it diverges. This is a cornerstone result by Eckstein and Bertsekas (1992).
As in IST/FBS/PGA, convergence is still guaranteed with inexactly solved subproblems, as long as the errors are absolutely summable.
The recent explosion of interest in ADMM is clear in the citation records of the review paper of Boyd et al. (2011) (2800 citations and counting) and of the paper by Eckstein and Bertsekas (1992).

ADMM for a More General Problem
Consider the problem
  min_{x ∈ R^n} Σ_{j=1}^J g_j(H^{(j)} x),
where H^{(j)} ∈ R^{p_j × n} and g_1, ..., g_J are l.s.c. proper convex functions.
Map it into min_x f(x) + h(Ax) as follows (with p = p_1 + ... + p_J):
  f(x) = 0,
  A = [H^{(1)}; ...; H^{(J)}] ∈ R^{p × n},
  h : R^{p_1 + ... + p_J} → R,  h(z) = Σ_{j=1}^J g_j(z^{(j)}),  where z = (z^{(1)}, ..., z^{(J)}).
This leads to a convenient version of ADMM.

ADMM for a More General Problem (Cont.)
The resulting instance of ADMM:
  x_k = arg min_x ‖Ax − z_{k−1} − d_{k−1}‖_2^2
      = ( Σ_{j=1}^J (H^{(j)})^T H^{(j)} )^{−1} Σ_{j=1}^J (H^{(j)})^T ( z^{(j)}_{k−1} + d^{(j)}_{k−1} )
  z^{(1)}_k = arg min_u g_1(u) + (ρ/2) ‖u − H^{(1)} x_k + d^{(1)}_{k−1}‖^2 = prox_{g_1/ρ}( H^{(1)} x_k − d^{(1)}_{k−1} )
  ...
  z^{(J)}_k = arg min_u g_J(u) + (ρ/2) ‖u − H^{(J)} x_k + d^{(J)}_{k−1}‖^2 = prox_{g_J/ρ}( H^{(J)} x_k − d^{(J)}_{k−1} )
  d_k = d_{k−1} − (A x_k − z_k)
Key features: the matrices are handled separately from the prox operators; the prox operators are decoupled (can be computed in parallel); the x-update requires a matrix inversion (which can be a curse or a blessing).
(Afonso et al., 2010; Setzer et al., 2010; Combettes and Pesquet, 2011)
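A minimal sketch of this general instance with J = 2 and hypothetical choices g_1(v) = (1/2)||v − y||^2 (with H^(1) a made-up measurement matrix B) and g_2(v) = τ||v||_1 (with H^(2) a made-up transform W); both prox operators are closed form.

```python
# General ADMM instance sketch (hypothetical example, J = 2).
import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(7)
n, p1, p2 = 8, 10, 8
B = rng.standard_normal((p1, n)); y = rng.standard_normal(p1)
W = rng.standard_normal((p2, n))
tau, rho = 0.2, 1.0

H = [B, W]
z = [np.zeros(p1), np.zeros(p2)]
d = [np.zeros(p1), np.zeros(p2)]
# Precompute sum_j H_j^T H_j (the "matrix inversion" part; a blessing when it is easy to invert).
M = sum(Hj.T @ Hj for Hj in H)
for k in range(300):
    # x-step: solve (sum_j H_j^T H_j) x = sum_j H_j^T (z_j + d_j)
    x = np.linalg.solve(M, sum(Hj.T @ (zj + dj) for Hj, zj, dj in zip(H, z, d)))
    # z-steps: decoupled prox operators (could run in parallel)
    z[0] = (y + rho * (B @ x - d[0])) / (1.0 + rho)   # prox_{g_1/rho}(Bx - d_1)
    z[1] = soft(W @ x - d[1], tau / rho)              # prox_{g_2/rho}(Wx - d_2)
    # scaled dual updates, blockwise
    d = [dj - (Hj @ x - zj) for Hj, zj, dj in zip(H, z, d)]

print("residuals:", [np.linalg.norm(Hj @ x - zj) for Hj, zj in zip(H, z)])
```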

Example: Image Restoration using SALSA
[Slides 42-47: image restoration examples obtained with SALSA; figures not reproduced in this transcription.]

ADMM for the Morozov Formulation
[Slides 48-51: content not reproduced in this transcription.]

ADMM for Sparse Inverse Covariance
The problem is
  max_{X ⪰ 0} log det(X) − ⟨X, S⟩ − τ ‖X‖_1.
Reformulate as
  max_{X ⪰ 0} log det(X) − ⟨X, S⟩ − τ ‖Z‖_1  s.t.  X − Z = 0.
The subproblems are:
  X_k := arg max_X log det(X) − ⟨X, S⟩ − ⟨U_k, X − Z_{k−1}⟩ − (ρ_k/2) ‖X − Z_{k−1}‖_F^2
       = arg max_X log det(X) − ⟨X, S⟩ − (ρ_k/2) ‖X − Z_{k−1} + U_k/ρ_k‖_F^2;
  Z_k := prox_{(τ/ρ_k)‖·‖_1}( X_k + U_k/ρ_k );
  U_{k+1} := U_k + ρ_k (X_k − Z_k).

Solving for X
Get the optimality condition for the X subproblem by using ∇_X log det(X) = X^{−1}, valid when X is s.p.d. Thus
  X^{−1} − S − ρ_k (X − Z_{k−1} + U_k/ρ_k) = 0,
which is equivalent to
  X^{−1} − ρ_k X − (S − ρ_k Z_{k−1} + U_k) = 0.
Form the eigendecomposition
  S − ρ_k Z_{k−1} + U_k = Q Λ Q^T,
where Q is n × n orthogonal and Λ is diagonal with elements λ_i. Seek X of the form Q Λ̃ Q^T, where Λ̃ is diagonal with elements λ̃_i. We must have
  1/λ̃_i − ρ_k λ̃_i − λ_i = 0,  i = 1, 2, ..., n.
Take the positive roots:
  λ̃_i = [ −λ_i + sqrt(λ_i^2 + 4 ρ_k) ] / (2 ρ_k),  i = 1, 2, ..., n.
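A minimal sketch of the full ADMM loop for this problem on hypothetical data (a random empirical covariance S and made-up τ, ρ), with the X-update computed via the eigenvalue formula just derived:

```python
# Sparse inverse covariance ADMM sketch (hypothetical data).
import numpy as np

def soft(V, t):
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

rng = np.random.default_rng(8)
n, N = 5, 200
samples = rng.standard_normal((N, n))
S = np.cov(samples, rowvar=False)        # empirical covariance
tau, rho = 0.1, 1.0

Z = np.eye(n); U = np.zeros((n, n))
for k in range(100):
    # X-step: eigendecompose S - rho*Z + U, then map eigenvalues through
    # lam_tilde = (-lam + sqrt(lam^2 + 4*rho)) / (2*rho)
    w, Q = np.linalg.eigh(S - rho * Z + U)
    w_new = (-w + np.sqrt(w**2 + 4.0 * rho)) / (2.0 * rho)
    X = Q @ np.diag(w_new) @ Q.T
    # Z-step: soft thresholding, the prox of (tau/rho)*||.||_1
    Z = soft(X + U / rho, tau / rho)
    # multiplier step
    U = U + rho * (X - Z)

print("||X - Z||_F:", np.linalg.norm(X - Z))
```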

References
Afonso, M., Bioucas-Dias, J., and Figueiredo, M. (2010). Fast image recovery using variable splitting and constrained optimization. IEEE Transactions on Image Processing, 19.
Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1).
Combettes, P. and Pesquet, J.-C. (2011). Signal recovery by proximal forward-backward splitting. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer.
Conn, A., Gould, N., and Toint, P. (1992). LANCELOT: A Fortran Package for Large-Scale Nonlinear Optimization (Release A). Springer Verlag, Heidelberg.
Eckstein, J. and Bertsekas, D. (1992). On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming, 55.
Hestenes, M. (1969). Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4.
Powell, M. (1969). A method for nonlinear constraints in minimization problems. In Fletcher, R., editor, Optimization. Academic Press, New York.
Setzer, S., Steidl, G., and Teuber, T. (2010). Deblurring Poissonian images by split Bregman techniques. Journal of Visual Communication and Image Representation, 21.
