Ellipsoid Algorithm — 15-853: Algorithms in the Real World (Linear and Integer Programming II)


Ellipsoid Algorithm
15-853: Algorithms in the Real World
Linear and Integer Programming II: ellipsoid algorithm, interior point methods

Ellipsoid Algorithm
First polynomial-time algorithm for linear programming (Khachian, 1979).
Solves: find x subject to Ax ≤ b, i.e., find a feasible solution.
Run time: O(n^4 L), where L = number of bits needed to represent A and b.
Problem in practice: it essentially always takes this much time.

Reduction from general case
To solve: maximize z = c^T x subject to Ax ≤ b, x ≥ 0.
Convert to either
  find x, y subject to: Ax ≤ b, -x ≤ 0, -yA ≤ -c^T, -y ≤ 0, -c^T x + yb ≤ 0
(primal feasibility, dual feasibility, and matching objective values force optimality), OR
  find x subject to: Ax ≤ b, -c^T x ≤ -z', x ≥ 0
where z' is the current guess of z; binary search on z' converges on the proper answer (a sketch of this binary search follows this slide group).

Ellipsoid Algorithm
Consider a sequence of smaller and smaller ellipsoids, each containing the feasible region F.
For iteration k: c_k = center of E_k.
Eventually c_k has to be inside F, and we are done.
[Figure: ellipsoid E_k with center c_k and feasible region F inside]
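The binary-search reduction can be made concrete with a small sketch. The oracle feasible(A_ub, b_ub) is a hypothetical stand-in for any feasibility procedure (such as the ellipsoid algorithm), and the bracketing bounds lo and hi are assumed to be supplied by the caller.

```python
# Sketch of the binary-search reduction from optimization to feasibility.
# `feasible(A_ub, b_ub)` is a hypothetical oracle that reports whether
# {x : A_ub x <= b_ub} is nonempty (e.g., an ellipsoid-method feasibility test).
import numpy as np

def maximize_via_feasibility(A, b, c, feasible, lo, hi, tol=1e-6):
    """Binary search on a guess z' for max c^T x s.t. Ax <= b, x >= 0."""
    n = A.shape[1]
    while hi - lo > tol:
        z_guess = (lo + hi) / 2.0
        # Stack the constraints Ax <= b, -x <= 0, -c^T x <= -z'.
        A_ub = np.vstack([A, -np.eye(n), -c.reshape(1, -1)])
        b_ub = np.concatenate([b, np.zeros(n), [-z_guess]])
        if feasible(A_ub, b_ub):
            lo = z_guess      # a solution with value >= z' exists
        else:
            hi = z_guess      # z' is too ambitious
    return lo
```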

Ellipsoid Algorithm
To find the next, smaller ellipsoid:
- find the most violated constraint a_k
- find the smallest ellipsoid that contains the half of the current ellipsoid satisfying that constraint
(the update formulas are sketched in code after this slide group)
[Figure: center c_k, most violated constraint a_k, feasible region F]

Interior Point Methods
Travel through the interior with a combination of
1. an optimization term (moves toward the objective)
2. a centering term (keeps away from the boundary)
[Figure: a path through the interior of the feasible region in the (x_1, x_2) plane]
Used since the 1950s for nonlinear programming. Karmarkar proved a variant is polynomial time in 1984.

Methods
- Affine scaling: simplest, but no known time bounds
- Potential reduction: O(nL) iterations
- Central trajectory: O(n^(1/2) L) iterations
The time for each iteration involves solving a linear system, so each iteration takes polynomial time. The real-world time depends heavily on the matrix structure.

Example times, central trajectory method (Lustig, Marsten, and Shanno 1994):

                       fuel     continent   car       initial
  size (K)             13x31K   9x57K       43x107K   19x12K
  non-zeros            186K     189K        183K      80K
  iterations           66       64          53        58
  time (sec)           2364     771         645       9252
  Cholesky non-zeros   1.2M     0.3M        0.2M      6.7M

Time depends on the number of Cholesky non-zeros (i.e., the fill).
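The shrinking step is shown only as a picture in the slides. The following is a minimal sketch of the standard central-cut ellipsoid update (the formulas are not from the slides), assuming E = {x : (x - c)^T Q^{-1} (x - c) ≤ 1} and that the constraint a^T x ≤ β is violated at the center c, so the half where a^T x > a^T c is discarded.

```python
# Minimal sketch of one central-cut ellipsoid step (standard formulas,
# assumed rather than taken from the slides).
import numpy as np

def ellipsoid_update(c, Q, a):
    """Shrink E around the half-ellipsoid satisfying a^T x <= a^T c."""
    n = len(c)
    Qa = Q @ a
    aQa = a @ Qa                      # a^T Q a
    c_new = c - (1.0 / (n + 1)) * Qa / np.sqrt(aQa)
    Q_new = (n * n / (n * n - 1.0)) * (Q - (2.0 / (n + 1)) * np.outer(Qa, Qa) / aQa)
    return c_new, Q_new
```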

Assumptions
We are trying to solve the standard-form problem:
  minimize z = c^T x
  subject to Ax = b, x ≥ 0

Outline
1. Centering Methods Overview
2. Picking a direction to move toward the optimal
3. Staying on the Ax = b hyperplane (projection)
4. General method
5. Example: Affine scaling
6. Example: potential reduction
7. Example: log barrier

Centering: option 1
The analytical center: minimize y = -Σ_{i=1}^n lg x_i.
y goes to ∞ as x approaches any boundary (a small numeric illustration follows this slide group).

Centering: option 2
Elliptical scaling.
[Figure: Dikin ellipsoid centered at (c_1, c_2) inside constraints x_1, ..., x_5]
The idea is to bias the space based on the ellipsoid. More on this later.
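A quick numeric illustration (not from the slides) of why the analytical-center term acts as a centering term: -Σ lg x_i grows without bound as any coordinate of x approaches the boundary, so minimizing it pushes the iterate toward the middle of the region.

```python
# Demo: the centering term y = -sum_i lg(x_i) blows up near the boundary x_i = 0.
import numpy as np

def centering_term(x):
    return -np.sum(np.log2(x))

for x in [np.array([0.5, 0.5]),
          np.array([0.9, 0.1]),
          np.array([0.999, 0.001])]:
    print(x, centering_term(x))   # grows without bound as x nears the boundary
```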

Finding the Optimal Solution
Let's say f(x) is the combination of the centering term c(x) and the optimization term z(x) = c^T x. We would like f(x) to have the same minimum over the feasible region as z(x), but it can otherwise be quite different. In particular, c(x), and hence f(x), need not be linear.
Goal: find the minimum of f(x) over the feasible region, starting at some interior point x_0.
We can do this by taking a sequence of steps toward the minimum. How do we pick a direction for a step?

Picking a direction: steepest descent
Option 1: find the steepest-descent direction at x_0 by taking the gradient:
  d = -∇f(x_0)
Problem: the gradient might be changing rapidly, so local steepest descent might not give us a good direction.

Picking a direction: Newton's method
Consider the truncated Taylor series
  f(x) ≈ f(x_0) + f'(x_0)(x - x_0) + (1/2) f''(x_0)(x - x_0)^2.
To find the minimum of f(x), take the derivative and set it to 0, which gives the step
  x = x_0 - f'(x_0) / f''(x_0).
In matrix form, for arbitrary dimension:
  x = x_0 - H_f(x_0)^{-1} ∇f(x_0),   where H_f is the Hessian of f.
(A sketch of both direction choices follows this slide group.)

Next Step?
Now that we have a direction, what do we do? The direction may move away from feasible solutions. We need to adjust the direction to force Ax = b to hold.
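The two direction choices can be contrasted in a short sketch. This is an illustrative example rather than course code: grad_f and hess_f are assumed to be callables supplied by the caller, and the sample f mixes a linear optimization term with a log centering term.

```python
# Steepest-descent direction vs. Newton direction for a smooth f.
import numpy as np

def steepest_descent_direction(grad_f, x0):
    return -grad_f(x0)

def newton_direction(grad_f, hess_f, x0):
    # Solve H d = -grad rather than forming the inverse explicitly.
    return np.linalg.solve(hess_f(x0), -grad_f(x0))

# Example: f(x) = c^T x - sum(ln x_i), a typical optimization + centering mix.
c = np.array([1.0, 2.0])
grad_f = lambda x: c - 1.0 / x          # gradient of f
hess_f = lambda x: np.diag(1.0 / x**2)  # Hessian of f
x0 = np.array([0.5, 0.5])
print(steepest_descent_direction(grad_f, x0))
print(newton_direction(grad_f, hess_f, x0))
```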

Remaining on the support plane
Constraint: Ax = b.
A is an m × (n+m) matrix. The equation describes an n-dimensional hyperplane in an (n+m)-dimensional space. The hyperplane basis is the null space of A.
A defines the slope; b defines an offset.
[Figure: parallel lines x_1 + 2x_2 = 3 and x_1 + 2x_2 = 4 in the (x_1, x_2) plane]

Projection
Need to project our direction onto the plane defined by the null space of A:
  d = c_n = c - A^T (A A^T)^{-1} A c = (I - A^T (A A^T)^{-1} A) c = Pc
  P = I - A^T (A A^T)^{-1} A   is the projection matrix.
We want to calculate Pc.

Calculating Pc
  Pc = (I - A^T (A A^T)^{-1} A) c = c - A^T w,   where   A^T w = A^T (A A^T)^{-1} A c,
giving
  A A^T w = A A^T (A A^T)^{-1} A c = Ac.
So all we need to do is solve for w in
  A A^T w = Ac.
This can be solved with a sparse solver. This is the workhorse of the interior-point methods (a short sketch follows this slide group). Note that A A^T will be more dense than A.

Next step?
We now have a direction c and its projection d onto the constraint plane defined by Ax = b. What do we do now?
To decide how far to go we can find the minimum of f(x) along the line defined by d. Not too hard if f(x) is reasonably nice (e.g., has one minimum along the line).
Alternatively we can go some fraction of the way to the boundary (e.g., 90%).
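As noted above, the projection never needs P explicitly: one solve with A A^T suffices. A minimal dense sketch follows (a real implementation would use a sparse Cholesky factorization of A A^T; the small A and c below are made-up examples).

```python
# Compute the projected direction d = Pc by solving A A^T w = A c.
import numpy as np

def project_onto_nullspace(A, c):
    """Return d = (I - A^T (A A^T)^{-1} A) c without forming P explicitly."""
    w = np.linalg.solve(A @ A.T, A @ c)   # in practice: a sparse Cholesky solve
    return c - A.T @ w

A = np.array([[1.0, 2.0, 1.0]])           # one equality constraint, three variables
c = np.array([3.0, 1.0, 2.0])
d = project_onto_nullspace(A, c)
print(d, A @ d)                           # A @ d is (numerically) zero
```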

General Interior Point Method
Pick a start point x_0.
Factor A A^T.
Repeat until done (within some threshold):
- decide on a function f(x) to optimize (might be the same for all iterations)
- select a direction d based on f(x) (e.g., with Newton's method)
- project d onto the null space of A (using the factored A A^T and solving a linear system)
- decide how far to go along that direction
Caveat: every method is slightly different.

Affine Scaling Method
A biased steepest descent. On each iteration solve:
  minimize c^T y
  subject to Ay = 0, y^T D^{-2} y ≤ 1   (the Dikin ellipsoid; D is a diagonal matrix determined by the current point)
Note that:
1. y is in the null space of A and can therefore be used as the direction d.
2. we are optimizing in the desired direction c.
What does the Dikin ellipsoid do for us?

Affine Scaling
Intuition by picture: Ax = b is a slice of the ellipsoid, and c' = Pc.
[Figure: Dikin ellipsoid with current point x, projected direction c', and biased direction y]
Note that y is biased away from the boundary.

How to compute
By the substitution of variables y = D y':
  minimize c^T D y'
  subject to A D y' = 0, y'^T D^T D^{-2} D y' ≤ 1   (i.e., y'^T y' ≤ 1)
The sphere y'^T y' ≤ 1 is unbiased, so we project the direction (c^T D)^T = Dc onto the null space of B = AD:
  y' = (I - B^T (B B^T)^{-1} B) Dc
and
  y = D y' = D (I - B^T (B B^T)^{-1} B) Dc.
As before, solve for w in B B^T w = B Dc, giving
  y = D (Dc - B^T w) = D^2 (c - A^T w)
(a sketch follows this slide group).
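A minimal sketch of the direction computation just described, assuming D = diag(x) for the current strictly interior point x (the slides only say that D depends on the current iterate, so this choice is the standard one rather than quoted from them). For the minimization problem, the step is taken along -y.

```python
# One affine-scaling direction, assuming D = diag(x).
import numpy as np

def affine_scaling_direction(A, c, x):
    D = np.diag(x)
    B = A @ D
    w = np.linalg.solve(B @ B.T, B @ (D @ c))   # solve B B^T w = B D c
    y = D @ (D @ c - B.T @ w)                   # y = D (Dc - B^T w) = D^2 (c - A^T w)
    return y

A = np.array([[1.0, 1.0, 1.0]])                 # made-up equality constraint
x = np.array([0.2, 0.3, 0.5])                   # strictly interior point
c = np.array([1.0, 2.0, 3.0])
y = affine_scaling_direction(A, c, x)
print(y, A @ y)                                 # A @ y is (numerically) zero
# To decrease c^T x, step along -y (a fraction of the way to the boundary).
```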

Affine Interior Point Method
Pick a start point x_0.
Symbolically factor A A^T.
Repeat until done (within some threshold):
- B = A D_i
- solve B B^T w = B D_i c for w (reuse the symbolically factored A A^T; B B^T has the same non-zero structure)
- d = D_i (D_i c - B^T w)
- move in direction d a fraction α of the way to the boundary (something like α = 0.96 is used in practice)
Note that D_i changes on each iteration since it depends on x_i.

Central Trajectory (log barrier)
Dates back to the 1950s for nonlinear problems. On step k:
1. minimize c^T x - μ_k Σ_{j=1}^n ln(x_j)  subject to  Ax = b, x > 0
2. select μ_{k+1} ≤ μ_k
Each minimization can be done with a constrained Newton step (a sketch of one such step appears at the end of this section). μ_k needs to approach zero to terminate. A primal-dual version using higher-order approximations is currently the best interior-point method in practice.

Potential Reduction Method
  minimize z = q ln(c^T x - yb) - Σ_{j=1}^n ln(x_j)
  subject to Ax = b, x ≥ 0, yA + s = c (the dual problem), s ≥ 0
The first term of z is the optimization term and the second term is the centering term. The objective function is not linear, so use hill climbing or a Newton step to optimize. The duality gap (c^T x - yb) goes to 0 near the solution.

Summary of Algorithms
1. Actual algorithms used in practice are very sophisticated.
2. Practice matches theory reasonably well.
3. Interior-point methods dominate when
   A. n is large,
   B. the Cholesky factors are small (i.e., low fill), or
   C. the problem is highly degenerate.
4. Simplex dominates when starting from a previous solution very close to the final solution.
5. The ellipsoid algorithm is not currently practical.
6. Large problems can take hours or days to solve. Parallelism is very important.
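The central trajectory slide above says each minimization can be done with a constrained Newton step. Below is a minimal sketch of one such step for minimize c^T x - μ Σ_j ln(x_j) subject to Ax = b, taken from a strictly feasible interior point. The KKT-system formulation and the step damping are standard choices assumed here, not necessarily what the course's or production codes do.

```python
# One constrained Newton step for the log-barrier subproblem
#   minimize c^T x - mu * sum(ln x_j)  subject to  Ax = b,
# starting from a strictly feasible x (so only Ax = b must be preserved).
import numpy as np

def barrier_newton_step(A, c, x, mu):
    m, n = A.shape
    g = c - mu / x                     # gradient of the barrier objective
    H = np.diag(mu / x**2)             # Hessian of the barrier objective
    # KKT system [H A^T; A 0][dx; lam] = [-g; 0] keeps the step inside Ax = b.
    K = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-g, np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]                     # the Newton direction dx

# Example with a made-up problem: damp the step so x stays strictly positive.
A = np.array([[1.0, 1.0, 1.0]])
x = np.array([0.2, 0.3, 0.5])          # feasible: coordinates sum to 1
c = np.array([1.0, 2.0, 3.0])
dx = barrier_newton_step(A, c, x, mu=0.1)
alpha = 1.0
while np.any(x + alpha * dx <= 0):
    alpha *= 0.5
x_new = x + alpha * dx
print(x_new, A @ x_new)                # still satisfies Ax = b
```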