
10-725/36-725: Convex Optimization                                          Spring 2015

                            Lecture 15: Log Barrier Method
Lecturer: Ryan Tibshirani                  Scribes: Pradeep Dasigi, Mohammad Gowayyed

Note: LaTeX template courtesy of UC Berkeley EECS dept.

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

15.1 Introduction

Previously, we looked at Newton's method for minimizing twice differentiable convex functions subject to equality constraints. One of the limitations of this method is that it cannot deal with inequality constraints. The barrier method is a way to address this issue. Formally, consider the problem

    \min_x  f(x)
    subject to  h_i(x) \le 0,  i = 1, \dots, m
                Ax = b

where f and h_1, \dots, h_m are all convex and twice differentiable functions, all with domain R^n. The log barrier is defined as

    \phi(x) = -\sum_{i=1}^m \log(-h_i(x))

It can be seen that the domain of the log barrier is the set of strictly feasible points, {x : h_i(x) < 0, i = 1, ..., m}. Note that the equality constraints are left alone for the rest of this chapter: they do not need a barrier, since Newton's method can handle equality-constrained problems directly.

15.1.1 Approximation of Indicator Functions

The idea behind the definition of the log barrier is that it is a smooth approximation of the indicator functions of the constraints. More precisely, the problem with inequality constraints asks us to minimize

    f(x) + \sum_{i=1}^m I_{\{h_i(x) \le 0\}}(x)

and we approximate it as follows:

    \min_x  f(x) - \frac{1}{t} \sum_{i=1}^m \log(-h_i(x))

for a parameter t > 0. As t approaches infinity, the approximation becomes closer to the indicator function, as shown in Figure 15.1. Also, for any value of t, the barrier approaches infinity as any of the constraints approaches being violated.
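As a quick numerical illustration (my own sketch, not part of the original notes), the snippet below evaluates the scaled barrier term -(1/t) log(-u) for a scalar constraint value u < 0 at increasing t. It shows how the approximation flattens toward the indicator on the strictly feasible side while still blowing up near the boundary u = 0.

    # Scaled log barrier -(1/t) * log(-u) as a smooth approximation of I{u <= 0}.
    import numpy as np

    def scaled_log_barrier(u, t):
        """Return -(1/t) * log(-u) for u < 0; +inf once the constraint is violated."""
        u = np.asarray(u, dtype=float)
        out = np.full_like(u, np.inf)
        mask = u < 0
        out[mask] = -np.log(-u[mask]) / t
        return out

    u = np.linspace(-3.0, -0.01, 5)
    for t in (1, 10, 100):
        print(f"t = {t:>3}:", np.round(scaled_log_barrier(u, t), 4))
    # As t grows, the values shrink toward 0 at fixed strictly feasible u,
    # mimicking the indicator, while still blowing up near the boundary u = 0.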

15.1.2 Log Barrier Calculus

The gradient and Hessian of the log barrier, given below, will be useful in Newton's method with the log barrier:

    \nabla \phi(x) = -\sum_{i=1}^m \frac{1}{h_i(x)} \nabla h_i(x)

    \nabla^2 \phi(x) = \sum_{i=1}^m \frac{1}{h_i(x)^2} \nabla h_i(x) \nabla h_i(x)^T - \sum_{i=1}^m \frac{1}{h_i(x)} \nabla^2 h_i(x)

15.2 Central Path

Now that we have defined the log barrier, we can rewrite our problem as

    \min_x  t f(x) + \phi(x)
    subject to  Ax = b

The central path of this problem is defined as x^*(t), the solution of the problem viewed as a function of t > 0. The necessary and sufficient conditions for x^*(t) to be optimal, given by the KKT conditions, are

    Ax^*(t) = b,  h_i(x^*(t)) < 0,  i = 1, ..., m        (primal feasibility)

    t \nabla f(x^*(t)) - \sum_{i=1}^m \frac{1}{h_i(x^*(t))} \nabla h_i(x^*(t)) + A^T w = 0        (stationarity)

where w is the dual variable used for defining the Lagrangian with the equality constraint.

15.2.1 Example

Consider the barrier problem for a linear program,

    \min_x  t c^T x - \sum_{i=1}^m \log(e_i - d_i^T x)

where the barrier function corresponds to the polyhedral constraint Dx \le e, with d_i the i-th row of D. Figure 15.2 shows the contours of the log barrier function at different values of t as dashed lines. It can be seen that the gradient of the log barrier at each point on the central path is parallel to c; equivalently, the contour of the barrier through x^*(t) is tangent to the hyperplane c^T x = c^T x^*(t). This can also be seen by applying the stationarity condition:

    0 = t c + \sum_{i=1}^m \frac{1}{e_i - d_i^T x^*(t)} d_i

As we move along the central path, the contour gets closer to the actual polyhedron. Because of this geometry, the log barrier method is called an interior point method: we start at an interior point of the feasible set and move along the central path to get to the solution.
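As a small sanity check on the barrier calculus above (my own sketch, not from the notes), the snippet below evaluates the gradient and Hessian of the log barrier for the polyhedral constraint Dx <= e from the example, i.e. h_i(x) = d_i^T x - e_i, and verifies the gradient against finite differences.

    import numpy as np

    def barrier_grad_hess(D, e, x):
        s = e - D @ x                          # slacks e_i - d_i^T x, must stay > 0
        assert np.all(s > 0), "x must be strictly feasible"
        grad = D.T @ (1.0 / s)                 # -sum_i (1/h_i) grad h_i = sum_i d_i / s_i
        hess = D.T @ np.diag(1.0 / s**2) @ D   # sum_i d_i d_i^T / s_i^2 (h_i affine, so no curvature term)
        return grad, hess

    # Finite-difference check of the gradient at a random strictly feasible point.
    rng = np.random.default_rng(0)
    D = rng.standard_normal((8, 3)); x0 = np.zeros(3); e = D @ x0 + 1.0   # slacks all equal 1
    g, H = barrier_grad_hess(D, e, x0)
    phi = lambda x: -np.sum(np.log(e - D @ x))
    eps = 1e-6
    g_fd = np.array([(phi(x0 + eps*np.eye(3)[i]) - phi(x0 - eps*np.eye(3)[i])) / (2*eps) for i in range(3)])
    print(np.allclose(g, g_fd, atol=1e-4))     # should print True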

Figure 15.1: As t approaches infinity, the approximation becomes closer to the indicator function.

Figure 15.2: The contours of the log barrier function at different values of t, shown as dashed lines.

15.2.2 Dual Points from Central Path

Given x^*(t) and the corresponding w, we can define feasible dual points for the original problem (not the barrier problem):

    u_i^*(t) = -\frac{1}{t\, h_i(x^*(t))},  i = 1, \dots, m,        v^*(t) = \frac{w}{t}

We can prove that these are indeed dual feasible. First, u_i^*(t) > 0, since h_i(x^*(t)) < 0 for all i. Next, if we plug the definitions of u^*(t) and v^*(t) into the stationarity condition for the central path and divide the whole expression by t, we get

    \nabla f(x^*(t)) + \sum_{i=1}^m u_i^*(t) \nabla h_i(x^*(t)) + A^T v^*(t) = 0

which shows that x^*(t) minimizes the Lagrangian L(x, u^*(t), v^*(t)) over x, so (u^*(t), v^*(t)) lies in the domain of the dual function g(u, v). We can now compute the minimum of the Lagrangian as

    g(u^*(t), v^*(t)) = f(x^*(t)) + \sum_{i=1}^m u_i^*(t)\, h_i(x^*(t)) + v^*(t)^T (A x^*(t) - b)

From the definition of u^*(t), the second term in the above expression equals -m/t, and from primal feasibility the third term is 0. Hence we have

    g(u^*(t), v^*(t)) = f(x^*(t)) - \frac{m}{t}

That is, m/t is the duality gap, and

    f(x^*(t)) - f^* \le \frac{m}{t}

This proves that as t approaches infinity, we approach the solution. The bound above can also be used as a stopping criterion in our algorithm.

15.2.3 Perturbed KKT Conditions

We can think of the central path solution x^*(t) and the corresponding dual solution (u^*(t), v^*(t)) as solving the following conditions:

    \nabla f(x^*(t)) + \sum_{i=1}^m u_i^*(t) \nabla h_i(x^*(t)) + A^T v^*(t) = 0        (stationarity)

    u_i^*(t)\, h_i(x^*(t)) = -\frac{1}{t},  i = 1, \dots, m        (perturbed complementary slackness)

    h_i(x^*(t)) \le 0,  i = 1, \dots, m,    A x^*(t) = b        (primal feasibility)

    u_i^*(t) \ge 0,  i = 1, \dots, m        (dual feasibility)

It can be seen that only the complementary slackness condition got perturbed, since the original condition would have been u_i h_i(x) = 0. Hence, the central path can be interpreted as solving something very close to the KKT conditions, and as t approaches infinity the perturbed condition approaches the original one.
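As a tiny worked check of the dual-point construction (my own example, not one from the notes), consider min x subject to -x <= 0. The barrier problem min_x t x - log(x) has solution x^*(t) = 1/t, and with m = 1 the dual point built above certifies a duality gap of exactly m/t:

    import numpy as np

    f_star = 0.0                      # optimal value of min x subject to x >= 0
    for t in (1.0, 10.0, 100.0):
        x_t = 1.0 / t                 # minimizer of t*x - log(x): set derivative t - 1/x = 0
        h = -x_t                      # h(x) = -x < 0, strictly feasible
        u_t = -1.0 / (t * h)          # dual point from the central path; here u*(t) = 1
        g_val = 0.0 if np.isclose(u_t, 1.0) else -np.inf   # dual function g(u) = min_x (1-u)x
        gap = x_t - g_val             # f(x*(t)) - g(u*(t))
        print(f"t = {t:6.1f}   f(x*(t)) - f* = {x_t - f_star:.4f}   duality gap = {gap:.4f}   m/t = {1/t:.4f}")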

15.3 Algorithm for Barrier Method

Let us now define an algorithm for solving the problem via the barrier problem

    \min_x  t f(x) + \phi(x)
    subject to  Ax = b

Given a desired level of accuracy ε, we could try to solve this in one shot by setting t = m/ε and directly applying Newton's method. However, the required t would be huge, and it might make the algorithm numerically unstable. This is the reason why interior point methods instead traverse the central path, increasing the value of t and running Newton's method at each step. This idea of solving by traversing the central path is very similar to using warm starts, or more generally to the idea of traversing a solution path.

Formally, the barrier method is the following (a code sketch for the polyhedral case is given below):

- Start with a value t = t^{(0)} > 0, and solve the barrier problem using Newton's method to get x^{(0)} = x^*(t^{(0)}).
- For a barrier parameter μ > 1 and k = 1, 2, 3, ..., repeat:
    - Solve the barrier problem at t = t^{(k)} using Newton's method initialized at x^{(k-1)}, to produce x^{(k)} = x^*(t^{(k)}).
    - Stop if m/t^{(k)} ≤ ε.
    - Else, update t^{(k+1)} = μ t^{(k)}.

The first step in the outer loop is called the centering step, since it brings x^{(k)} onto the central path.

15.3.1 Practical Considerations

For the above algorithm, we have to choose t^{(0)} and μ. If μ is too small, many outer iterations might be needed; if it is too big, each centering step might need many Newton iterations. Hence, we need to choose the value from a good range. The considerations for choosing t^{(0)} are similar: if it is too small, many outer iterations are needed, and if it is too big, many Newton iterations are needed for the first centering step.

15.4 Convergence Analysis of the Barrier Method

Assuming that we can solve the centering steps exactly, the following theorem holds immediately. If we cannot, there are other, more complex bounds.

Theorem 15.1 The barrier method after k centering steps satisfies

    f(x^{(k)}) - f^* \le \frac{m}{\mu^k t^{(0)}}

where f(x^{(k)}) is the objective value at the k-th iterate, f^* is the optimal objective value, μ is the factor by which we multiply t at every step, and m is the number of constraints.

So, in order to reach a desired accuracy level of ε, the theorem says we need

    \left\lceil \frac{\log\big(m/(t^{(0)} \epsilon)\big)}{\log \mu} \right\rceil + 1

centering steps. This is considered a linear convergence rate in terms of the number of outer iterations required by the barrier method. There are other, more fine-grained bounds, but this is the most basic one.
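To make the algorithm concrete, here is a compact sketch (my own implementation, not code from the notes) for the special case of an LP, min c^T x subject to Dx <= e, with no equality constraints. The centering step is damped Newton with backtracking line search; the parameter choices t0 = 1, μ = 10, and the backtracking constants are illustrative defaults, not values prescribed by the notes.

    import numpy as np

    def newton_center(c, D, e, x, t, tol=1e-8, max_iter=50):
        """Approximately minimize t*c^T x - sum(log(e - Dx)) from a strictly feasible x."""
        for _ in range(max_iter):
            s = e - D @ x
            grad = t * c + D.T @ (1.0 / s)
            hess = D.T @ ((1.0 / s**2)[:, None] * D)
            dx = np.linalg.solve(hess, -grad)
            decrement = -grad @ dx                     # squared Newton decrement
            if decrement / 2 <= tol:
                break
            # backtracking line search, keeping the iterate strictly feasible
            step = 1.0
            while np.any(e - D @ (x + step * dx) <= 0):
                step *= 0.5
            obj = lambda z: t * c @ z - np.sum(np.log(e - D @ z))
            while obj(x + step * dx) > obj(x) + 0.25 * step * grad @ dx:
                step *= 0.5
            x = x + step * dx
        return x

    def barrier_method(c, D, e, x0, t0=1.0, mu=10.0, eps=1e-6):
        """Outer loop: increase t by factor mu until the duality gap bound m/t <= eps."""
        m = D.shape[0]
        x, t = x0, t0
        while True:
            x = newton_center(c, D, e, x, t)           # centering step, warm-started at x
            if m / t <= eps:
                return x
            t *= mu

    # Example: minimize c^T x over the box -1 <= x_i <= 1 (x = 0 is strictly feasible).
    n = 3
    c = np.array([1.0, -2.0, 0.5])
    D = np.vstack([np.eye(n), -np.eye(n)]); e = np.ones(2 * n)
    x_hat = barrier_method(c, D, e, x0=np.zeros(n))
    print(np.round(x_hat, 4))   # approaches the vertex -sign(c) = [-1, 1, -1]

Note how each centering step is warm-started at the previous iterate x^{(k-1)}; this is exactly what keeps the number of Newton iterations per outer iteration small.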

The bound in Theorem 15.1 counts only the outer iterations, but one might ask: how many total Newton iterations are required? The answer, under sufficiently general conditions, is still on the order of log(1/ε) inner iterations to get an ε-accurate solution. There is also a much tighter bound that can be derived under self-concordance, which we will discuss when we talk about interior point methods. That bound gets rid of some of the parameters appearing here; it is not of the same form, but it gives a more precise answer about the number of Newton steps required.

Figure 15.3 shows the progress of the barrier method for a linear program with a growing number of constraints. We want to see how the convergence of the barrier method depends on the number of constraints. From the bound, the number of centering steps should depend logarithmically on m; but what about the number of Newton iterations? We can see that increasing the number of constraints from 50 to 500 only required about 10 additional Newton steps, which is roughly logarithmic, as mentioned earlier.

15.5 Feasibility Methods

Now, let us see how to obtain a point for the initial centering step. Remember that the algorithm requires an initial strictly feasible point: if the point is not strictly feasible, the barrier is infinite. We can get such a point by solving another problem,

    \min_{x,s}  s
    subject to  h_i(x) \le s,  i = 1, \dots, m
                Ax = b

We can think of s as the maximum violation of feasibility. If we can solve this problem to a solution with s < 0, then the corresponding x is strictly feasible for the original problem. If the optimal s is positive, the original problem has no feasible point at all. So, in total we run two interior point methods: the first (phase I) finds a strictly feasible point to start with, and the second (phase II) actually optimizes the original objective. The first problem is much easier.
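As a small illustration (my own construction, not from the notes), here is how the phase-I problem looks for the polyhedral case Dx <= e without equality constraints: introducing the variable s gives another LP in the augmented variable z = (x, s), for which a strictly feasible starting point is trivial to build. The commented-out lines show how it could be fed to the barrier_method sketch from Section 15.3.

    import numpy as np

    def phase_one_data(D, e, x_guess=None):
        """Build the phase-I LP  min s  s.t.  d_i^T x - e_i <= s  in z = (x, s)."""
        m, n = D.shape
        x0 = np.zeros(n) if x_guess is None else x_guess
        c_aug = np.zeros(n + 1); c_aug[-1] = 1.0          # objective: minimize s
        D_aug = np.hstack([D, -np.ones((m, 1))])          # d_i^T x - s <= e_i, i.e. D_aug z <= e
        s0 = np.max(D @ x0 - e) + 1.0                     # any s0 above the max violation works
        z0 = np.append(x0, s0)                            # strictly feasible start for phase I
        return c_aug, D_aug, e, z0

    # c_aug, D_aug, e_aug, z0 = phase_one_data(D, e)
    # z_hat = barrier_method(c_aug, D_aug, e_aug, z0)     # barrier_method: sketch from Section 15.3
    # x_strict = z_hat[:-1] if z_hat[-1] < 0 else None    # strictly feasible start for phase II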

Figure 15.3: The progress of the barrier method for a linear program with a growing number of constraints.