Fast-Lipschitz Optimization


Fast-Lipschitz Optimization DREAM Seminar Series University of California at Berkeley September 11, 2012 Carlo Fischione ACCESS Linnaeus Center, Electrical Engineering KTH Royal Institute of Technology Stockholm, Sweden web: http://www.ee.kth.se/~carlofi e-mail: carlofi@kth.se

Optimization is pervasive over networks Parallel computing Smart grids Environmental monitoring Smart buildings Intelligent transportation systems Industrial control

Optimization over networks Optimization needs fast solver algorithms of low complexity Ø Time-varying networks, little time to compute the solution Ø Distributed computations Ø E.g., networks of parallel processors, cross-layer networking, distributed detection, estimation, content distribution,... Parallel and distributed computation Ø Fundamental theory for optimization over networks Ø Drawback over energy-constrained wireless networks: the cost of communication is not considered An alternative theory is needed Ø In a number of cases, Fast-Lipschitz optimization

Outline Motivating example: distributed detection Definition of Fast-Lipschitz optimization Computation of the optimal solution Problems in canonical form Examples Conclusions

Distributed binary detection measurements at node i hypothesis testing with S measurements and threshold x i probability of false alarm probability of misdetection A threshold that minimizes the probability of false alarm maximizes the probability of misdetection. How to choose the thresholds optimally when nodes exchange opinions?
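The false-alarm/misdetection tradeoff on this slide can be sketched numerically. The sketch below is an illustrative assumption, not the slide's exact model: it takes the sample mean of S unit-variance Gaussian measurements (mean 0 under H0, mean mu under H1) and compares it to a threshold x; raising x lowers the false-alarm probability but raises the misdetection probability.

```python
import math

def q(z):
    """Gaussian tail probability Q(z) = P(N(0,1) > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def false_alarm(x, S):
    # P(sample mean of S N(0,1) samples exceeds threshold x) under H0
    return q(x * math.sqrt(S))

def misdetection(x, S, mu):
    # P(sample mean of S N(mu,1) samples falls below x) under H1
    return 1.0 - q((x - mu) * math.sqrt(S))

# Raising the threshold reduces false alarm but increases misdetection.
S, mu = 10, 1.0
pfa_lo, pmd_lo = false_alarm(0.3, S), misdetection(0.3, S, mu)
pfa_hi, pmd_hi = false_alarm(0.7, S), misdetection(0.7, S, mu)
assert pfa_hi < pfa_lo and pmd_hi > pmd_lo
```

This is exactly why the thresholds must be chosen jointly when nodes exchange opinions: no single threshold minimizes both error probabilities.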

Threshold optimization in distributed detection How to solve the problem by distributed operations among the nodes? The problem is convex Ø Lagrangian methods (interior point) can be applied Ø Drawback: too much message passing (of Lagrangian multipliers) among nodes to compute the optimal solution iteratively An alternative method: Fast-Lipschitz optimization C. Fischione, Fast-Lipschitz Optimization with Wireless Sensor Networks Applications, IEEE TAC, 2011

Outline Motivating example: distributed detection Definition of Fast-Lipschitz optimization Computation of the optimal solution Problems in canonical form Examples Conclusions

The Fast-Lipschitz optimization nonempty compact set containing the vertices of the constraints

Computation of the solution Centralized optimization Ø Problem solved by a central processor Network of n nodes Distributed optimization Ø Decision variables and constraints are associated with nodes that cooperate to compute the solution in parallel

Pareto Optimal Solution

Notation Gradient Norm infinity: sum along a row Norm 1: sum along a column

Qualifying conditions Now that we have introduced basic notation and concepts, we give some conditions under which a problem is Fast-Lipschitz

Qualifying conditions Functions may be non-convex

Outline Motivating example: distributed detection Definition of Fast-Lipschitz optimization Computation of the optimal solution Problems in canonical form Examples Conclusions

Optimal Solution The Pareto optimal solution is given simply by a set of (in general nonlinear) equations. Solving a set of equations is much easier than solving an optimization problem by traditional Lagrangian methods!
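The point that the optimum is obtained by solving the constraint equations can be sketched in a few lines. The constraint functions f below are an illustrative assumption (chosen to be a contraction, consistent with the qualifying conditions), so plain fixed-point iteration on x = f(x) converges:

```python
import math

# Hypothetical constraint functions f; under the qualifying conditions
# (here: a contraction), the optimum solves x = f(x), i.e. all
# constraints hold with equality.
def f(x):
    x1, x2 = x
    return (0.5 + 0.2 * math.cos(x2), 0.5 + 0.2 * math.sin(x1))

x = (0.0, 0.0)
for _ in range(100):
    x = f(x)  # repeated substitution converges to the fixed point

# x now satisfies the constraints at equality (up to tolerance)
fx = f(x)
assert abs(x[0] - fx[0]) < 1e-9 and abs(x[1] - fx[1]) < 1e-9
```

No multipliers, no dual updates: the solver is just an equation solver.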

Lagrangian methods Let's have a closer look at the Lagrangian methods, which are normally used to solve optimization problems. Lagrangian methods are essential for solving, for example, convex problems

Lagrangian methods G. L. Lagrange, 1736-1813: "The methods I set forth require neither constructions nor geometric or mechanical considerations. They require only algebraic operations subject to a systematic and uniform course."

Lagrangian methods Theorem: Consider a feasible F-Lipschitz problem. Then, the KKT conditions are necessary and sufficient. Ø KKT conditions: Lagrangian Lagrangian methods to compute the solution
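The KKT conditions referred to on this slide can be written out for a generic Fast-Lipschitz form; the notation below (maximize f_0(x) subject to x ≤ f(x), with a particular Jacobian convention) is an assumption for illustration, and the exact statement is in Fischione, IEEE TAC 2011:

```latex
% Generic KKT conditions for the assumed Fast-Lipschitz form
%   maximize f_0(x)  subject to  x \le f(x)
\begin{align*}
  \nabla f_0(x^\star) + \big(\nabla f(x^\star) - I\big)\,\lambda^\star &= 0
      && \text{(stationarity)}\\
  x^\star &\le f(x^\star) && \text{(primal feasibility)}\\
  \lambda^\star &\ge 0 && \text{(dual feasibility)}\\
  \lambda_i^\star \big(x_i^\star - f_i(x^\star)\big) &= 0 \quad \forall i
      && \text{(complementary slackness)}
\end{align*}
```

The theorem says that for a feasible Fast-Lipschitz problem these conditions are not only necessary but also sufficient, so any point satisfying them is optimal.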

Lagrangian methods Lagrangian Lagrangian methods need 1. a central computation of the Lagrangian function 2. an endless collect-and-broadcast iterative message passing of primal and dual variables Fast-Lipschitz methods avoid the central computation and substantially reduce the collect-and-broadcast procedure

The Fast-Lipschitz optimization Non-Convex Optimization Convex Optimization Fast-Lipschitz Optimization Interference Function Optimization Geometric Optimization Fast-Lipschitz optimization problems can be convex, geometric, quadratic, interference-function,...

Fast-Lipschitz methods Let us see how a Fast-Lipschitz problem is solved without Lagrangian methods

Centralized optimization The optimal solution is given by iterative methods for solving systems of nonlinear equations (e.g., Newton methods); a matrix is used to ensure convergence and to maximize convergence speed. Many other methods are available, e.g., second-order methods
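A minimal centralized sketch, under the same illustrative assumptions as before: at the optimum the constraints hold with equality, so we solve g(x) = x - f(x) = 0 by a damped Newton iteration. The constraint functions f and the scalar step size gamma (standing in for the convergence-speed matrix mentioned on the slide) are assumptions:

```python
import numpy as np

def f(x):
    # hypothetical constraint functions (a contraction)
    return np.array([0.5 + 0.2 * np.cos(x[1]), 0.5 + 0.2 * np.sin(x[0])])

def jacobian(g, x, eps=1e-7):
    """Forward-difference Jacobian of g at x."""
    n = len(x)
    J = np.zeros((n, n))
    gx = g(x)
    for j in range(n):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (g(xp) - gx) / eps
    return J

g = lambda x: x - f(x)   # root of g is the Fast-Lipschitz optimum
gamma = 1.0              # damping factor (illustrative stand-in)
x = np.zeros(2)
for _ in range(20):
    x = x - gamma * np.linalg.solve(jacobian(g, x), g(x))

assert np.allclose(x, f(x), atol=1e-8)  # constraints active at the optimum
```

Any off-the-shelf nonlinear-equation solver would do equally well; the point is that no dual variables are ever formed.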

Distributed optimization
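The distributed variant can be sketched as each node repeatedly updating its own variable through its own constraint function, using only the latest values heard from its neighbors (a synchronous, Jacobi-style round structure). The three-node network and the local constraint functions below are illustrative assumptions:

```python
import math

# Hypothetical 3-node network: node i only talks to its neighbors.
neighbors = {0: [1], 1: [0, 2], 2: [1]}

def f_i(i, x):
    # hypothetical local constraint: a contracted function of the
    # average of the neighbors' values (Lipschitz constant 0.3)
    avg = sum(x[j] for j in neighbors[i]) / len(neighbors[i])
    return 1.0 + 0.3 * math.sin(avg)

x = [0.0, 0.0, 0.0]
for _ in range(200):
    x = [f_i(i, x) for i in range(3)]  # one round of purely local updates

# every node's constraint holds with equality at the common fixed point
assert all(abs(x[i] - f_i(i, x)) < 1e-9 for i in range(3))
```

Each round exchanges only the decision variables themselves, with no multiplier traffic, which is where the communication savings over Lagrangian methods come from.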

Outline Motivating example: distributed detection Definition of Fast-Lipschitz optimization Computation of the optimal solution Problems in canonical form Examples Conclusions

Problems in canonical form Canonical form Bertsekas, Nonlinear Programming, 2004 Fast-Lipschitz form

Problems in canonical form

Fast-Lipschitz Matlab Toolbox M. Leithe, Introducing a Matlab Toolbox for Fast-Lipschitz optimization, Master Thesis KTH, 2011

Outline Motivating example: distributed detection Definition of Fast-Lipschitz optimization Computation of the optimal solution Problems in canonical form Examples Conclusions

Example 1: from canonical to Fast-Lipschitz The problem is both convex and Fast-Lipschitz: Off-diagonal monotonicity Diagonal dominance The optimal solution is trivially given by the constraints at equality

Example 2: hidden Fast-Lipschitz Non-Fast-Lipschitz A simple variable transformation gives a Fast-Lipschitz form

Threshold optimization in distributed detection How to solve the problem by parallel and distributed operations among the nodes? The problem is convex Ø Lagrangian methods (interior point methods) could be applied Ø Drawback: too much message passing (of Lagrangian multipliers) among nodes to compute the optimal solution iteratively An alternative method: F-Lipschitz optimization

Distributed detection: Fast-Lipschitz vs. Lagrangian methods (interior point). [Bar chart, 10-node network, comparing the number of iterations and the number of function evaluations; Fast-Lipschitz requires far fewer of both.]

Summary Do the inequality constraints satisfy equality at the optimum? If yes, compute the solution by Fast-Lipschitz methods; if no, compute the solution by Lagrangian methods. Fast-Lipschitz optimization: a class of problems for which all the constraints are active at the optimum Optimum: the solution to the set of equations given by the constraints No Lagrangian methods, which are computationally expensive, particularly over wireless networks

Conclusions Existing methods for optimization over networks are too expensive Proposed the Fast-Lipschitz optimization Ø Applied to distributed detection, and many other cases Fast-Lipschitz optimization is a panacea in many cases, but a theory of fast parallel and distributed computation is still lacking How to generalize it to Ø static optimization? Ø dynamic optimization? Ø stochastic optimization? Ø game-theoretic extensions?

Selected bibliography
M. Jacobsson, C. Fischione, A Comparative Analysis of the Fast-Lipschitz Convergence Speed, to appear, IEEE CDC, 2012.
M. Jacobsson, C. Fischione, On Some Extensions of Fast-Lipschitz Optimization for Convex and Non-convex Problems, to appear, IFAC NecSys, 2012.
C. Fischione, F-Lipschitz Optimization with Wireless Sensor Networks Applications, IEEE Transactions on Automatic Control, 2011.
A. Speranzon, C. Fischione, K. H. Johansson, A. Sangiovanni-Vincentelli, A Distributed Minimum Variance Estimator for Sensor Networks, IEEE Journal on Selected Areas in Communications, special issue on Control and Communications, vol. 26, no. 4, pp. 609-621, May 2008.