Lecture 4 Duality and Decomposition Techniques


Jie Lu (jielu@kth.se), Richard Combes, Alexandre Proutiere
Automatic Control, KTH, September 19, 2013

Lagrange Duality

Consider the primal problem

    minimize f(x)  subject to  g(x) ≤ 0,  h(x) = 0.

Lagrangian function:

    L(x, λ, μ) = f(x) + λᵀg(x) + μᵀh(x),  with multipliers λ ≥ 0.

Dual Function and Dual Problem

Dual function:

    d(λ, μ) = inf_x L(x, λ, μ).

Lagrange dual problem:

    maximize d(λ, μ)  subject to  λ ≥ 0.

Strong Duality

Weak duality: d* ≤ p* (always holds).
Strong duality: d* = p* (holds, e.g., for convex problems satisfying Slater's condition).

Optimality Conditions (Under Strong Duality)

From now on, suppose strong duality holds. Then (x*, λ*, μ*) are primal-dual optimal if and only if x* is primal feasible, λ* ≥ 0, x* minimizes L(x, λ*, μ*) over x, and complementary slackness λ*ᵀg(x*) = 0 holds.

Primal Function

Consider the previous problem with inequality constraints only. The primal function is

    p(u) = inf { f(x) : g(x) ≤ u },

the optimal value as a function of the constraint perturbation u.

Decomposition Techniques

Basic idea: decompose one complex problem into many small ones. A coordinator exchanges information with simple subproblems, which can be solved in parallel.

The Trivial Case

Separable objectives and constraints:

    minimize Σᵢ fᵢ(xᵢ)  subject to  xᵢ ∈ Xᵢ,  i = 1, …, n.

This trivially separates into n decoupled subproblems that can be solved in parallel and combined.

More Interesting Ones

Problems with coupling constraints:

    minimize Σᵢ fᵢ(xᵢ)  subject to  Σᵢ gᵢ(xᵢ) ≤ c,  xᵢ ∈ Xᵢ.

Problems with coupled objectives:

    minimize f(x₁, …, xₙ).

Coupled objectives can be cast as a problem with coupling constraints by introducing local copies of the shared variables and adding consistency constraints.

Dual Decomposition

Basic idea: decouple the problem by relaxing the coupling constraints. The dual function

    d(λ) = Σᵢ inf_{xᵢ∈Xᵢ} ( fᵢ(xᵢ) + λᵀgᵢ(xᵢ) ) − λᵀc

is additive (hence it can be evaluated in parallel), and the dual problem

    maximize d(λ)  subject to  λ ≥ 0

has simple constraints. A subgradient of the dual function at λ is Σᵢ gᵢ(xᵢ(λ)) − c, where xᵢ(λ) solves subproblem i. The dual problem can be solved using the subgradient projection method.
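To make the mechanics concrete, here is a minimal Python sketch of dual decomposition with the subgradient projection method. The problem data (quadratic subproblem objectives fᵢ(xᵢ) = (xᵢ − aᵢ)² and a single coupling constraint Σᵢ xᵢ ≤ c) are hypothetical and chosen so each subproblem has a closed-form solution; it is not the example from the slides.

    import numpy as np

    # Dual-decomposition sketch: minimize sum_i (x_i - a_i)^2
    # subject to the coupling constraint sum_i x_i <= c.
    # Relaxing the constraint with multiplier lam >= 0 gives the separable
    # Lagrangian sum_i [ (x_i - a_i)^2 + lam * x_i ] - lam * c.

    a = np.array([3.0, 1.0, 2.0])   # per-subproblem data (hypothetical)
    c = 4.0                          # shared resource budget
    lam = 0.0                        # dual variable (price)
    alpha = 0.1                      # subgradient step size

    for k in range(200):
        # Each subproblem is solved independently (in parallel in principle):
        # argmin_x (x - a_i)^2 + lam * x  =>  x_i = a_i - lam / 2
        x = a - lam / 2.0
        # Subgradient of the dual function at lam is the constraint residual.
        g = x.sum() - c
        # Projected subgradient ascent on the dual (projection onto lam >= 0).
        lam = max(0.0, lam + alpha * g)

    print("x =", x, " sum =", x.sum(), " lambda =", lam)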

Example of Dual Decomposition

(Worked example on the slide: problem statement, optimal value, and primal optimum.)

Network Utility Maximization (NUM)

Adjust end-to-end rates to share limited network capacity fairly:

    maximize Σₛ uₛ(xₛ)  subject to link capacity constraints,

where uₛ is a concave utility function. Fairness is obtained by an appropriate choice of utility functions, e.g., uₛ(xₛ) = log xₛ for proportional fairness.

Network Utility Maximization (NUM)

Rewrite the problem in vector form:

    maximize Σₛ uₛ(xₛ)  subject to  Rx ≤ c,  x ≥ 0,

where R is the routing matrix: row i of R indicates which flows share link i. This is our standard form with coupling constraints Rx − c ≤ 0.

Dual Decomposition for NUM

Form the Lagrangian

    L(x, λ) = Σₛ uₛ(xₛ) − λᵀ(Rx − c).

The dual function is additive: each source s can adjust its rate based on feedback of the aggregate price of the links along its path. Use the dual subgradient projection method

    λ⁺ = [λ + α(Rx(λ) − c)]₊,

where the link prices can be computed from locally available information at the router queues.
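A sketch of how this looks for NUM with logarithmic utilities (proportional fairness). The routing matrix, capacities, and step size below are hypothetical; each source's rate update maximize log(xₛ) − qₛxₛ has the closed form xₛ = 1/qₛ.

    import numpy as np

    # Dual decomposition for NUM: maximize sum_s log(x_s) subject to R x <= c.
    R = np.array([[1, 1, 0],    # link 1 carries flows 1 and 2
                  [0, 1, 1]])   # link 2 carries flows 2 and 3
    c = np.array([1.0, 2.0])    # link capacities (hypothetical)
    lam = np.ones(2)            # link prices (dual variables)
    alpha = 0.05                # step size

    for k in range(2000):
        q = R.T @ lam                      # aggregate path price per source
        x = 1.0 / np.maximum(q, 1e-9)      # argmax_x log(x) - q_s x  =>  x = 1/q
        # Each link updates its price using only locally observed rates:
        lam = np.maximum(0.0, lam + alpha * (R @ x - c))

    print("rates:", x, " prices:", lam)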

Drawback of Dual Decomposition

Suppose strong duality holds. The dual iterates converge to a dual optimum; however, in general, the primal iterates are only sub-optimal and may violate the constraints. Feasibility and primal optimality are recovered only in the limit. This raises two questions: how can we evaluate the primal optimality and feasibility of each primal iterate, and how fast do the primal iterates converge to the primal optimum?

Evaluating Optimality of the Primal Iterates

With a strongly convex objective function (modulus σ > 0) and linear constraints, the dual function is differentiable and its gradient is Lipschitz continuous (with Lipschitz constant ‖A‖²/σ for constraint matrix A). This gives a guaranteed convergence rate for the dual iterates. The results allow the linear constraints to be both inequalities and equalities.

Primal Convergence of the Running Average

Form the running average of the primal iterates,

    x̄ₖ = (1/k) Σᵢ₌₁ᵏ xᵢ.

Using the subgradient projection method, one can bound the suboptimality and constraint violation of x̄ₖ. Note: here L is not a Lipschitz constant, but an upper bound on the constraint violation of the primal iterates. For more details, see A. Nedić and A. Ozdaglar, "Approximate Primal Solutions and Rate Analysis for Dual Subgradient Methods," SIAM Journal on Optimization, 19(4):1757-1780, 2009.
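A small sketch of primal recovery by averaging, reusing the hypothetical quadratic example from the dual decomposition sketch above:

    import numpy as np

    # Running average of primal iterates alongside dual subgradient steps:
    # min sum_i (x_i - a_i)^2 s.t. sum_i x_i <= c, subproblem x_i = a_i - lam/2.
    a = np.array([3.0, 1.0, 2.0]); c = 4.0
    lam, alpha = 0.0, 0.1
    x_avg = np.zeros(3)

    for k in range(1, 201):
        x = a - lam / 2.0
        x_avg += (x - x_avg) / k          # incremental running average
        lam = max(0.0, lam + alpha * (x.sum() - c))

    print("averaged iterate:", x_avg, " violation:", max(0.0, x_avg.sum() - c))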

Example

(Same simple example as before, illustrating the convergence of the running average.)

Augmented Lagrangian

Consider

    minimize f(x)  subject to  Ax = b.

The augmented Lagrangian is

    L_ρ(x, μ) = f(x) + μᵀ(Ax − b) + (ρ/2)‖Ax − b‖²,

where ρ > 0 is the penalty parameter. L_ρ is the unaugmented Lagrangian of the equivalent problem

    minimize f(x) + (ρ/2)‖Ax − b‖²  subject to  Ax = b.

Method of Multipliers

The dual function associated with the augmented Lagrangian is differentiable, so dual methods converge under more general conditions. The method of multipliers iterates

    x⁺ = argmin_x L_ρ(x, μ),
    μ⁺ = μ + ρ(Ax⁺ − b),

i.e., the dual gradient method applied to the dual associated with the augmented Lagrangian (with step size ρ; why?). The objective f does not need to be strongly convex to guarantee convergence, and both dual and primal convergence rates can be derived.
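A minimal sketch of the method of multipliers for an equality-constrained quadratic problem (the data A, b, d and penalty ρ below are hypothetical); note the multiplier step size equals ρ:

    import numpy as np

    # Method of multipliers for: minimize 0.5*||x - d||^2 s.t. A x = b.
    np.random.seed(0)
    A = np.random.randn(2, 4)
    b = np.random.randn(2)
    d = np.random.randn(4)
    rho = 1.0
    mu = np.zeros(2)
    n = 4

    for k in range(100):
        # x-update: minimize the augmented Lagrangian in x (a linear solve here):
        # (x - d) + A^T mu + rho * A^T (A x - b) = 0
        H = np.eye(n) + rho * A.T @ A
        x = np.linalg.solve(H, d - A.T @ mu + rho * A.T @ b)
        # Multiplier update with step size rho (the dual gradient step).
        mu = mu + rho * (A @ x - b)

    print("constraint residual:", np.linalg.norm(A @ x - b))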

Alternating Direction Method of Multipliers (ADMM)

Consider

    minimize f(x) + g(z)  subject to  Ax + Bz = c.

The augmented Lagrangian is

    L_ρ(x, z, μ) = f(x) + g(z) + μᵀ(Ax + Bz − c) + (ρ/2)‖Ax + Bz − c‖².

Drawback of the method of multipliers: the quadratic penalty couples x and z, so the joint minimization is not parallelizable. ADMM instead alternates:

    x⁺ = argmin_x L_ρ(x, z, μ),
    z⁺ = argmin_z L_ρ(x⁺, z, μ),
    μ⁺ = μ + ρ(Ax⁺ + Bz⁺ − c).
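A minimal ADMM sketch for a lasso-style splitting, minimize ½‖Ax − b‖² + τ‖z‖₁ subject to x − z = 0 (the data below is hypothetical). The x-update is a linear solve and the z-update is the soft-thresholding prox of the ℓ₁ norm:

    import numpy as np

    # ADMM for f(x) = 0.5*||A x - b||^2, g(z) = tau*||z||_1, constraint x = z.
    np.random.seed(1)
    A = np.random.randn(20, 5)
    b = np.random.randn(20)
    tau, rho = 0.1, 1.0
    n = 5
    x = np.zeros(n); z = np.zeros(n); mu = np.zeros(n)
    H = A.T @ A + rho * np.eye(n)    # fixed matrix for the x-update

    for k in range(200):
        # x-update: minimize f(x) + mu^T x + (rho/2)||x - z||^2
        x = np.linalg.solve(H, A.T @ b + rho * z - mu)
        # z-update: soft-thresholding (the prox of the l1 norm)
        v = x + mu / rho
        z = np.sign(v) * np.maximum(np.abs(v) - tau / rho, 0.0)
        # multiplier update with step size rho
        mu = mu + rho * (x - z)

    print("x:", np.round(x, 3))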

Primal Decomposition

Basic idea: the coordinator directly allocates the primal variables (resources) to the subproblems. Feasibility of the primal iterates is guaranteed throughout.

Primal Decomposition

Consider the resource allocation problem

    minimize Σᵢ fᵢ(xᵢ)  subject to  gᵢ(xᵢ) ≤ cᵢ,  Σᵢ cᵢ ≤ c.

Rewrite it as a master problem over the resource allocation:

    minimize Σᵢ pᵢ(cᵢ)  subject to  Σᵢ cᵢ ≤ c,

where pᵢ(cᵢ) is the optimal value of subproblem i given allocation cᵢ.

Primal Decomposition

pᵢ(cᵢ) is the primal function of subproblem i. It is convex because the subproblem is convex. A subgradient of pᵢ at cᵢ is −λᵢ*(cᵢ), where λᵢ*(cᵢ) is a dual optimal solution of subproblem i; pᵢ is differentiable if the optimal Lagrange multiplier is unique. Hence, the coordinator can use the subgradient projection method to allocate resources.
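A minimal primal decomposition sketch, again with hypothetical quadratic subproblems min (xᵢ − aᵢ)² s.t. xᵢ ≤ cᵢ: the coordinator steps the allocations along the subgradient built from the subproblems' multipliers, then projects back onto {Σᵢ cᵢ = c}:

    import numpy as np

    # Primal decomposition: minimize sum_i (x_i - a_i)^2 s.t. x_i <= c_i,
    # sum_i c_i = c_total. The coordinator adjusts the allocations c_i.
    a = np.array([3.0, 1.0, 2.0])
    c_total = 4.0
    ci = np.full(3, c_total / 3)   # initial (feasible) resource allocation
    alpha = 0.2

    def subproblem(ai, ci_):
        # min (x - ai)^2 s.t. x <= ci_ : returns solution and optimal multiplier
        if ai <= ci_:
            return ai, 0.0                  # constraint inactive
        return ci_, 2.0 * (ai - ci_)        # active: lambda* = 2(ai - ci)

    for k in range(200):
        lam = np.array([subproblem(a[i], ci[i])[1] for i in range(3)])
        # -lam_i is a subgradient of p_i at c_i; step, then project back
        # onto the hyperplane {sum c_i = c_total}.
        ci = ci + alpha * lam
        ci -= (ci.sum() - c_total) / 3.0

    x = np.array([subproblem(a[i], ci[i])[0] for i in range(3)])
    print("allocations:", ci, " x:", x)

Note how every iterate respects the total budget, in contrast to dual decomposition, where feasibility is only recovered in the limit.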

Distributed Primal Decomposition

Primal decomposition is advantageous when the primal function is additive, p(c) = Σᵢ pᵢ(cᵢ); then the projection in the coordinator's update can be computed in a distributed manner (more on this in the next lecture).

Modeling for Decomposition

Clever introduction of new variables enables distribution via dual or primal decomposition. Example: transforming coupling variables into coupling constraints. Example: converting a problem that is a clear candidate for dual decomposition into the standard form for primal decomposition.

Summary

Duality:
- strong duality; primal and dual optimality
- primal function

Decomposition: subdivide a large problem into many small ones
- coupling constraints, coupling variables

Dual decomposition:
- relax coupling constraints to make the dual function additive/distributable
- dual problem possibly non-smooth; might require a central coordinator
- primal iterates not always well-behaved; might need primal recovery

Primal decomposition:
- coordinator directly allocates resources to the subsystems
- feasibility of primal iterates guaranteed
- can sometimes be distributed (more in the next lecture)

References

Duality theory:
- D. P. Bertsekas, Nonlinear Programming. Belmont, MA: Athena Scientific, 1999.
- D. P. Bertsekas, A. Nedić, and A. Ozdaglar, Convex Analysis and Optimization. Belmont, MA: Athena Scientific, 2003.
- S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY: Cambridge University Press, 2004.

Decomposition techniques:
- B. Johansson, P. Soldati, and M. Johansson, "Mathematical decomposition techniques for distributed cross-layer optimization of data networks," IEEE Journal on Selected Areas in Communications, vol. 24, no. 8, pp. 1535-1547, August 2006.

ADMM:
- S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers," Foundations and Trends in Machine Learning, 3(1):1-122, 2011.