Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers


1 Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
Stephen Boyd, MIIS, Xi'an, 1/7/12
source: Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers (Boyd, Parikh, Chu, Peleato, Eckstein)

2 Goals
- robust methods for arbitrary-scale optimization
- machine learning/statistics with huge data-sets
- dynamic optimization on large-scale networks
- decentralized optimization: devices/processors/agents coordinate to solve a large problem by passing relatively small messages

3 Outline
- Dual decomposition
- Method of multipliers
- Alternating direction method of multipliers
- Common patterns
- Examples
- Consensus and exchange
- Conclusions

4 Outline (section: Dual decomposition)

5 Dual problem
- convex equality constrained optimization problem: minimize f(x) subject to Ax = b
- Lagrangian: L(x,y) = f(x) + y^T(Ax - b)
- dual function: g(y) = inf_x L(x,y)
- dual problem: maximize g(y)
- recover x* = argmin_x L(x, y*)

6 Dual ascent
- gradient method for the dual problem: y^{k+1} = y^k + α^k ∇g(y^k)
- ∇g(y^k) = A x̃ - b, where x̃ = argmin_x L(x, y^k)
- dual ascent method:
  x^{k+1} := argmin_x L(x, y^k)           // x-minimization
  y^{k+1} := y^k + α^k (Ax^{k+1} - b)     // dual update
- works, with lots of strong assumptions
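As a concrete reference point, here is a minimal Python/NumPy sketch of the dual ascent iteration. The names (argmin_L, alpha, iters) are illustrative assumptions, not from the slides; the x-minimization oracle is supplied by the user.

```python
import numpy as np

def dual_ascent(argmin_L, A, b, y0, alpha=1.0, iters=100):
    """Dual ascent sketch: argmin_L(y) must return argmin_x L(x, y)."""
    y = y0
    for _ in range(iters):
        x = argmin_L(y)                  # x-minimization
        y = y + alpha * (A @ x - b)      # gradient ascent step on the dual
    return x, y
```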

7 Dual decomposition
- suppose f is separable: f(x) = f_1(x_1) + ... + f_N(x_N), x = (x_1, ..., x_N)
- then L is separable in x: L(x,y) = L_1(x_1,y) + ... + L_N(x_N,y) - y^T b, with L_i(x_i,y) = f_i(x_i) + y^T A_i x_i
- x-minimization in dual ascent splits into N separate minimizations, x_i^{k+1} := argmin_{x_i} L_i(x_i, y^k), which can be carried out in parallel

8 Dual decomposition
- dual decomposition (Everett, Dantzig, Wolfe, Benders ...)
  x_i^{k+1} := argmin_{x_i} L_i(x_i, y^k), i = 1, ..., N
  y^{k+1} := y^k + α^k (Σ_{i=1}^N A_i x_i^{k+1} - b)
- scatter y^k; update x_i in parallel; gather A_i x_i^{k+1}
- solve a large problem by iteratively solving subproblems (in parallel); the dual variable update provides coordination
- works, with lots of assumptions; often slow
- see the sketch below for the scatter/gather structure
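A hypothetical sketch of the decomposed iteration, assuming argmin_blocks[i](y) returns argmin_{x_i} L_i(x_i, y). The block updates are independent, so a real implementation would run them in parallel (scatter y, gather A_i x_i); the serial list comprehension here just marks that structure.

```python
import numpy as np

def dual_decomposition(argmin_blocks, A_blocks, b, y0, alpha=1.0, iters=100):
    y = y0
    for _ in range(iters):
        xs = [amin(y) for amin in argmin_blocks]                # parallelizable
        r = sum(Ai @ xi for Ai, xi in zip(A_blocks, xs)) - b    # gather residual
        y = y + alpha * r                                       # price update
    return xs, y
```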

9 Outline (section: Method of multipliers)

10 Method of multipliers
- a method to robustify dual ascent
- use the augmented Lagrangian (Hestenes, Powell 1969), with ρ > 0:
  L_ρ(x,y) = f(x) + y^T(Ax - b) + (ρ/2)||Ax - b||_2^2
- method of multipliers (Hestenes, Powell; analysis in Bertsekas 1982):
  x^{k+1} := argmin_x L_ρ(x, y^k)
  y^{k+1} := y^k + ρ(Ax^{k+1} - b)
- (note the specific dual update step length ρ)
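A minimal concrete instance, under assumed problem data (names are illustrative): minimize (1/2)||Cx - d||_2^2 subject to Ax = b by the method of multipliers, where the x-update is a linear solve.

```python
import numpy as np

def mm_eq_constrained_ls(C, d, A, b, rho=1.0, iters=200):
    y = np.zeros(A.shape[0])
    M = C.T @ C + rho * A.T @ A            # Hessian of L_rho in x
    for _ in range(iters):
        rhs = C.T @ d - A.T @ y + rho * A.T @ b
        x = np.linalg.solve(M, rhs)        # x-update: argmin_x L_rho(x, y)
        y = y + rho * (A @ x - b)          # dual update with step length rho
    return x, y
```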

11 Method of multipliers: dual update step
- optimality conditions (for differentiable f): Ax* - b = 0, ∇f(x*) + A^T y* = 0 (primal and dual feasibility)
- since x^{k+1} minimizes L_ρ(x, y^k):
  0 = ∇_x L_ρ(x^{k+1}, y^k)
    = ∇f(x^{k+1}) + A^T( y^k + ρ(Ax^{k+1} - b) )
    = ∇f(x^{k+1}) + A^T y^{k+1}
- dual update y^{k+1} = y^k + ρ(Ax^{k+1} - b) makes (x^{k+1}, y^{k+1}) dual feasible
- primal feasibility achieved in the limit: Ax^{k+1} - b → 0

12 Method of multipliers (compared to dual decomposition)
- good news: converges under much more relaxed conditions (f can be nondifferentiable, take on the value +∞, ...)
- bad news: the quadratic penalty destroys splitting of the x-update, so we can't do decomposition

13 Outline (section: Alternating direction method of multipliers)

14 Alternating direction method of multipliers
- a method with the good robustness of the method of multipliers, and which can support decomposition
- "robust dual decomposition" or "decomposable method of multipliers"
- proposed by Gabay, Mercier, Glowinski, Marrocco in 1976

15 Alternating direction method of multipliers
- ADMM problem form (with f, g convex):
  minimize f(x) + g(z)
  subject to Ax + Bz = c
- two sets of variables, with separable objective
- augmented Lagrangian: L_ρ(x,z,y) = f(x) + g(z) + y^T(Ax + Bz - c) + (ρ/2)||Ax + Bz - c||_2^2
- ADMM:
  x^{k+1} := argmin_x L_ρ(x, z^k, y^k)            // x-minimization
  z^{k+1} := argmin_z L_ρ(x^{k+1}, z, y^k)        // z-minimization
  y^{k+1} := y^k + ρ(Ax^{k+1} + Bz^{k+1} - c)     // dual update

16 Alternating direction method of multipliers
- if we minimized over x and z jointly, this would reduce to the method of multipliers
- instead, we do one pass of a Gauss-Seidel method
- we get splitting since we minimize over x with z fixed, and vice versa

17 ADMM and optimality conditions
- optimality conditions (for the differentiable case):
  primal feasibility: Ax + Bz - c = 0
  dual feasibility: ∇f(x) + A^T y = 0, ∇g(z) + B^T y = 0
- since z^{k+1} minimizes L_ρ(x^{k+1}, z, y^k), we have
  0 = ∇g(z^{k+1}) + B^T y^k + ρ B^T(Ax^{k+1} + Bz^{k+1} - c)
    = ∇g(z^{k+1}) + B^T y^{k+1}
- so with the ADMM dual variable update, (x^{k+1}, z^{k+1}, y^{k+1}) satisfies the second dual feasibility condition
- primal and first dual feasibility are achieved as k → ∞

18 ADMM with scaled dual variables
- combine linear and quadratic terms in the augmented Lagrangian:
  L_ρ(x,z,y) = f(x) + g(z) + y^T(Ax + Bz - c) + (ρ/2)||Ax + Bz - c||_2^2
             = f(x) + g(z) + (ρ/2)||Ax + Bz - c + u||_2^2 + const.
  with u^k = (1/ρ)y^k
- ADMM (scaled dual form):
  x^{k+1} := argmin_x ( f(x) + (ρ/2)||Ax + Bz^k - c + u^k||_2^2 )
  z^{k+1} := argmin_z ( g(z) + (ρ/2)||Ax^{k+1} + Bz - c + u^k||_2^2 )
  u^{k+1} := u^k + (Ax^{k+1} + Bz^{k+1} - c)
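A scaled-form ADMM skeleton in NumPy, under assumed interfaces (all names illustrative): prox_x(v) returns argmin_x f(x) + (ρ/2)||Ax - v||_2^2 and prox_z(w) returns argmin_z g(z) + (ρ/2)||Bz - w||_2^2; these two oracles are the problem-specific pieces.

```python
import numpy as np

def admm_scaled(prox_x, prox_z, A, B, c, z0, u0, iters=100):
    z, u = z0, u0
    for _ in range(iters):
        x = prox_x(c - B @ z - u)        # x-update
        z = prox_z(c - A @ x - u)        # z-update
        u = u + A @ x + B @ z - c        # scaled dual update
    return x, z, u
```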

19 Convergence
- assume (very little!): f, g convex, closed, proper; L_0 has a saddle point
- then ADMM converges:
  iterates approach feasibility: Ax^k + Bz^k - c → 0
  objective approaches the optimal value: f(x^k) + g(z^k) → p*

20 Related algorithms
- operator splitting methods (Douglas, Peaceman, Rachford, Lions, Mercier, 1950s-1979)
- proximal point algorithm (Rockafellar 1976)
- Dykstra's alternating projections algorithm (1983)
- Spingarn's method of partial inverses (1985)
- Rockafellar-Wets progressive hedging (1991)
- proximal methods (Rockafellar, many others, 1976-present)
- Bregman iterative methods (2008-present)
- most of these are special cases of the proximal point algorithm

21 Outline (section: Common patterns)

22 Common patterns
- x-update step requires minimizing f(x) + (ρ/2)||Ax - v||_2^2 (with v = -Bz^k + c - u^k, which is constant during the x-update)
- similar for the z-update
- several special cases come up often; can simplify the update by exploiting structure in these cases

23 Decomposition
- suppose f is block-separable: f(x) = f_1(x_1) + ... + f_N(x_N), x = (x_1, ..., x_N)
- and A is conformably block separable, i.e., A^T A is block diagonal
- then the x-update splits into N parallel updates of x_i

24 Proximal operator
- consider the x-update when A = I:
  x^+ = argmin_x ( f(x) + (ρ/2)||x - v||_2^2 ) = prox_{f,ρ}(v)
- some special cases:
  f = I_C (indicator function of set C): x^+ := Π_C(v) (projection onto C)
  f = λ||·||_1 (l_1 norm): x_i^+ := S_{λ/ρ}(v_i) (soft thresholding), where S_a(v) = (v - a)_+ - (-v - a)_+
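Two of these proximal operators written out in NumPy; the function names are illustrative, but the formulas are exactly the ones above.

```python
import numpy as np

def soft_threshold(v, a):
    """S_a(v) = (v - a)_+ - (-v - a)_+, elementwise: prox of a*||.||_1."""
    return np.maximum(v - a, 0.0) - np.maximum(-v - a, 0.0)

def project_box(v, lo, hi):
    """Projection onto the box C = [lo, hi]: prox of the box indicator I_C."""
    return np.clip(v, lo, hi)
```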

25 Quadratic objective
- f(x) = (1/2) x^T P x + q^T x + r
- x^+ := (P + ρ A^T A)^{-1} (ρ A^T v - q)
- use the matrix inversion lemma when computationally advantageous:
  (P + ρ A^T A)^{-1} = P^{-1} - ρ P^{-1} A^T (I + ρ A P^{-1} A^T)^{-1} A P^{-1}
- (direct method) cache the factorization of P + ρ A^T A (or I + ρ A P^{-1} A^T)
- (iterative method) warm start, early stopping, reducing tolerances
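A sketch of the direct method with a cached factorization (illustrative names, assuming SciPy is available): factor P + ρA^T A once, then every x-update is a cheap pair of triangular solves.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def make_quadratic_x_update(P, q, A, rho):
    chol = cho_factor(P + rho * A.T @ A)   # factor once, reuse every iteration
    def x_update(v):
        # x+ = (P + rho A^T A)^{-1} (rho A^T v - q)
        return cho_solve(chol, rho * A.T @ v - q)
    return x_update
```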

26 Smooth objective
- f smooth: can use standard methods for smooth minimization
  gradient, Newton, or quasi-Newton methods
  preconditioned CG, limited-memory BFGS (scale to very large problems)
- can exploit warm starts
- early stopping, with tolerances decreasing as ADMM proceeds

27 Outline (section: Examples)

28 Constrained convex optimization
- consider ADMM for the generic problem: minimize f(x) subject to x ∈ C
- ADMM form: take g to be the indicator of C:
  minimize f(x) + g(z)
  subject to x - z = 0
- algorithm:
  x^{k+1} := argmin_x ( f(x) + (ρ/2)||x - z^k + u^k||_2^2 )
  z^{k+1} := Π_C(x^{k+1} + u^k)
  u^{k+1} := u^k + x^{k+1} - z^{k+1}

29 Lasso
- lasso problem: minimize (1/2)||Ax - b||_2^2 + λ||x||_1
- ADMM form:
  minimize (1/2)||Ax - b||_2^2 + λ||z||_1
  subject to x - z = 0
- ADMM:
  x^{k+1} := (A^T A + ρI)^{-1} (A^T b + ρz^k - y^k)
  z^{k+1} := S_{λ/ρ}(x^{k+1} + y^k/ρ)
  y^{k+1} := y^k + ρ(x^{k+1} - z^{k+1})
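A compact NumPy transcription of these three updates, with the factorization of A^T A + ρI cached; ρ and the iteration count are illustrative defaults, not tuned values from the slides.

```python
import numpy as np

def lasso_admm(A, b, lam, rho=1.0, iters=100):
    n = A.shape[1]
    x, z, y = np.zeros(n), np.zeros(n), np.zeros(n)
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))  # factor once
    Atb = A.T @ b
    soft = lambda v, a: np.maximum(v - a, 0) - np.maximum(-v - a, 0)
    for _ in range(iters):
        rhs = Atb + rho * z - y
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # x-update
        z = soft(x + y / rho, lam / rho)                   # z-update
        y = y + rho * (x - z)                              # dual update
    return z
```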

30 Lasso example
- example with dense A ∈ R^{1500×5000} (1500 measurements; 5000 regressors)
- computation times:
  factorization (same as ridge regression): 1.3s
  subsequent ADMM iterations: 0.03s
  lasso solve (about 50 ADMM iterations): 2.9s
  full regularization path (30 λ's): 4.4s
- not bad for a very short Matlab script

31 Sparse inverse covariance selection
- S: empirical covariance of samples from N(0, Σ), with Σ^{-1} sparse (i.e., a Gaussian Markov random field)
- estimate Σ^{-1} via l_1 regularized maximum likelihood:
  minimize Tr(SX) - log det X + λ||X||_1
- methods: COVSEL (Banerjee et al 2008), graphical lasso (FHT 2008)

32 Sparse inverse covariance selection via ADMM
- ADMM form:
  minimize Tr(SX) - log det X + λ||Z||_1
  subject to X - Z = 0
- ADMM:
  X^{k+1} := argmin_X ( Tr(SX) - log det X + (ρ/2)||X - Z^k + U^k||_F^2 )
  Z^{k+1} := S_{λ/ρ}(X^{k+1} + U^k)
  U^{k+1} := U^k + (X^{k+1} - Z^{k+1})

33 Analytical solution for the X-update
- compute the eigendecomposition ρ(Z^k - U^k) - S = QΛQ^T
- form the diagonal matrix X̃ with X̃_ii = ( λ_i + sqrt(λ_i^2 + 4ρ) ) / (2ρ)
- let X^{k+1} := Q X̃ Q^T
- cost of the X-update is one eigendecomposition
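The analytical X-update transcribed into NumPy (function name illustrative): eigendecompose ρ(Z - U) - S, transform the eigenvalues, and rebuild the matrix.

```python
import numpy as np

def cov_x_update(S, Z, U, rho):
    lam, Q = np.linalg.eigh(rho * (Z - U) - S)            # eigendecomposition
    x_tilde = (lam + np.sqrt(lam**2 + 4.0 * rho)) / (2.0 * rho)
    return (Q * x_tilde) @ Q.T                            # Q diag(x_tilde) Q^T
```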

34 Sparse inverse covariance selection example
- Σ^{-1} is 1000×1000 with 10^4 nonzeros
- graphical lasso (Fortran): 20 seconds to 3 minutes
- ADMM (Matlab): 3 to 10 minutes (depends on the choice of λ)
- very rough experiment, but with no special tuning, ADMM is in the ballpark of recent specialized methods
- (for comparison, COVSEL takes 25+ min when Σ^{-1} is a 400×400 tridiagonal matrix)

35 Outline (section: Consensus and exchange)

36 Consensus optimization
- want to solve a problem with N objective terms: minimize Σ_{i=1}^N f_i(x)
- e.g., f_i is the loss function for the ith block of training data
- ADMM form:
  minimize Σ_{i=1}^N f_i(x_i)
  subject to x_i - z = 0
- x_i are local variables; z is the global variable
- x_i - z = 0 are the consistency or consensus constraints
- can add regularization using a g(z) term

37 Consensus optimization via ADMM
- L_ρ(x,z,y) = Σ_{i=1}^N ( f_i(x_i) + y_i^T(x_i - z) + (ρ/2)||x_i - z||_2^2 )
- ADMM:
  x_i^{k+1} := argmin_{x_i} ( f_i(x_i) + y_i^{kT}(x_i - z^k) + (ρ/2)||x_i - z^k||_2^2 )
  z^{k+1} := (1/N) Σ_{i=1}^N ( x_i^{k+1} + (1/ρ)y_i^k )
  y_i^{k+1} := y_i^k + ρ(x_i^{k+1} - z^{k+1})
- with regularization, the averaging in the z-update is followed by prox_{g,ρ}

38 Consensus optimization via ADMM
- using Σ_{i=1}^N y_i^k = 0, the algorithm simplifies to:
  x_i^{k+1} := argmin_{x_i} ( f_i(x_i) + y_i^{kT}(x_i - x̄^k) + (ρ/2)||x_i - x̄^k||_2^2 )
  y_i^{k+1} := y_i^k + ρ(x_i^{k+1} - x̄^{k+1})
  where x̄^k = (1/N) Σ_{i=1}^N x_i^k
- in each iteration:
  gather x_i^k and average to get x̄^k
  scatter the average x̄^k to the processors
  update y_i locally (in each processor, in parallel)
  update x_i locally
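A serial sketch of this simplified consensus loop, assuming local_argmins[i](xbar, yi) returns argmin_{x_i} f_i(x_i) + yi^T(x_i - xbar) + (ρ/2)||x_i - xbar||_2^2 (all names illustrative). In a real deployment each list comprehension would run on its own processor, with the averaging as the gather step.

```python
import numpy as np

def consensus_admm(local_argmins, n, rho=1.0, iters=100):
    N = len(local_argmins)
    xs = [np.zeros(n) for _ in range(N)]
    ys = [np.zeros(n) for _ in range(N)]
    for _ in range(iters):
        xbar = sum(xs) / N                                    # gather + average
        ys = [y + rho * (x - xbar) for x, y in zip(xs, ys)]   # local y-updates
        xs = [amin(xbar, y)                                   # local x-updates
              for amin, y in zip(local_argmins, ys)]
    return xbar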

39 Statistical interpretation
- f_i is the negative log-likelihood for parameter x given the ith data block
- x_i^{k+1} is the MAP estimate under prior N(x̄^k + (1/ρ)y_i^k, ρI)
- prior mean is the previous iteration's consensus, shifted by the "price" of processor i disagreeing with the previous consensus
- processors only need to support a Gaussian MAP method; the type or number of data in each block is not relevant
- consensus protocol yields the global maximum-likelihood estimate

40 Consensus classification
- data (examples): (a_i, b_i), i = 1, ..., N, with a_i ∈ R^n, b_i ∈ {-1, +1}
- linear classifier sign(a^T w + v), with weight w, offset v
- margin for the ith example is b_i(a_i^T w + v); want the margin to be positive
- loss for the ith example is l(b_i(a_i^T w + v)), where l is the loss function (hinge, logistic, probit, exponential, ...)
- choose w, v to minimize (1/N) Σ_{i=1}^N l(b_i(a_i^T w + v)) + r(w), where r(w) is a regularization term (l_2, l_1, ...)
- split the data and use ADMM consensus to solve

41 Consensus SVM example
- hinge loss l(u) = (1 - u)_+ with l_2 regularization
- baby problem with n = 2, N = 400 to illustrate
- examples split into 20 groups, in the worst possible way: each group contains only positive or negative examples

42-44 Iteration
- [figure slides: consensus SVM example at successive ADMM iterations, with the local classifiers converging to the consensus classifier]

45 Distributed lasso example
- example with dense A ∈ R^{400000×8000} (roughly 30 GB of data)
- distributed solver written in C using MPI and GSL
- no optimization or tuned libraries (like ATLAS, MKL)
- split into 80 subsystems across 10 (8-core) machines on Amazon EC2
- computation times:
  loading data: 30s
  factorization: 5m
  subsequent ADMM iterations: 0.5-2s
  lasso solve (about 15 ADMM iterations): 5-6m

46 Exchange problem
- minimize Σ_{i=1}^N f_i(x_i) subject to Σ_{i=1}^N x_i = 0
- another canonical problem, like consensus; in fact, it's the dual of consensus
- can interpret as N agents exchanging n goods to minimize a total cost
- (x_i)_j ≥ 0 means agent i receives (x_i)_j of good j from the exchange
- (x_i)_j < 0 means agent i contributes |(x_i)_j| of good j to the exchange
- constraint Σ_{i=1}^N x_i = 0 is the equilibrium or market clearing constraint
- optimal dual variable y* is a set of valid prices for the goods
- suggests real or virtual cash payment (y*)^T x_i by agent i

47 Exchange ADMM
- solve as a generic constrained convex problem with constraint set
  C = {x ∈ R^{nN} | x_1 + x_2 + ... + x_N = 0}
- scaled form:
  x_i^{k+1} := argmin_{x_i} ( f_i(x_i) + (ρ/2)||x_i - x_i^k + x̄^k + u^k||_2^2 )
  u^{k+1} := u^k + x̄^{k+1}
- unscaled form:
  x_i^{k+1} := argmin_{x_i} ( f_i(x_i) + y^{kT} x_i + (ρ/2)||x_i - (x_i^k - x̄^k)||_2^2 )
  y^{k+1} := y^k + ρ x̄^{k+1}
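A scaled-form sketch, assuming prox_agents[i](v) returns argmin_{x_i} f_i(x_i) + (ρ/2)||x_i - v||_2^2 (agent i's prox; all names illustrative). Driving the average x̄ to zero is the market-clearing residual.

```python
import numpy as np

def exchange_admm(prox_agents, n, rho=1.0, iters=100):
    N = len(prox_agents)
    xs = [np.zeros(n) for _ in range(N)]
    u = np.zeros(n)
    for _ in range(iters):
        xbar = sum(xs) / N
        xs = [prox(x - xbar - u)          # x_i-update at v = x_i - xbar - u
              for prox, x in zip(prox_agents, xs)]
        u = u + sum(xs) / N               # u += xbar^{k+1}
    return xs
```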

48 Interpretation as tâtonnement process
- tâtonnement process: iteratively update prices to clear the market
- work toward equilibrium by increasing/decreasing prices of goods based on excess demand/supply
- dual decomposition is the simplest tâtonnement algorithm
- ADMM adds proximal regularization:
  incorporates agents' prior commitment to help clear the market
  gives far more robust convergence than dual decomposition

49 Distributed dynamic energy management
- N devices exchange power in time periods t = 1, ..., T
- x_i ∈ R^T is the power flow profile for device i
- f_i(x_i) is the cost of profile x_i (and encodes constraints)
- x_1 + ... + x_N = 0 is the energy balance (in each time period)
- dynamic energy management problem is an exchange problem
- exchange ADMM gives a distributed method for dynamic energy management:
  each device optimizes its own profile, with quadratic regularization for coordination
  residual (energy imbalance) is driven to zero

50 Generators
- [figure: three example generators; left: generator costs/limits; right: ramp constraints]
- can add a cost for power changes

51 Fixed loads
- [figure: two example fixed loads]
- cost is +∞ for not supplying the load; zero otherwise

52 Shiftable load
- [figure: shiftable load profile]
- total energy consumed over an interval must exceed a given minimum level
- limits on energy consumed in each period
- cost is +∞ for violating constraints; zero otherwise

53 Battery energy storage system
- [figure: black: battery charge, red: charge/discharge profile]
- energy store with maximum capacity, charge/discharge limits
- cost is +∞ for violating constraints; zero otherwise

54 Electric vehicle charging system
- [figure: black: desired charge profile; blue: charge profile]
- shortfall cost for not meeting the desired charge

55 HVAC
- thermal load (e.g., room, refrigerator) with temperature limits
- [figure: magenta: ambient temperature; blue: load temperature; red: cooling energy profile]
- cost is +∞ for violating constraints; zero otherwise

56 External tie
- buy/sell energy from/to the external grid at price p_ext(t) ± γ(t)
- [figure: solid: p_ext(t); dashed: p_ext(t) ± γ(t)]

57 Smart grid example
- 10 devices:
  3 generators
  2 fixed loads
  1 shiftable load
  1 EV charging system
  1 battery
  1 HVAC system
  1 external tie

58-69 Convergence
- [figure slides stepping through successive ADMM iterations k; left: solid: optimal generator profile, dashed: profile at the kth iteration; right: residual vector x̄^k, driven to zero]

70 Outline (section: Conclusions)

71 Summary and conclusions
- ADMM is the same as, or closely related to, many methods with other names
- has been around since the 1970s
- gives simple single-processor algorithms that can be competitive with the state of the art
- can be used to coordinate many processors, each solving a substantial problem, to solve a very large problem
