POLYHEDRAL GEOMETRY

Mathematical Programming

Niels Lauritzen

7.9.2007

Convex functions and sets

Recall that a subset $C \subseteq \mathbb{R}^n$ is convex if $\lambda x + (1 - \lambda) y \in C$ for every $x, y \in C$ and $0 \le \lambda \le 1$. A function $f : \mathbb{R}^n \to \mathbb{R}$ is called convex if it satisfies
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$$
for every $x, y \in \mathbb{R}^n$ and $0 \le \lambda \le 1$. One of the fundamental facts following almost straight from the definitions is the following exercise.

EXERCISE A local minimum for a convex function $f : \mathbb{R}^n \to \mathbb{R}$ on a convex set $C \subseteq \mathbb{R}^n$ is a global minimum on $C$. To be more precise: suppose you have a point $z \in C$ such that $f(z) \le f(x)$ for every $x \in C$ in a little ball around $z$. Then $f(z) \le f(x)$ for every $x \in C$.

This result may come as a surprise: to find a global minimum for a convex function on a convex set, all you need to do is move towards a local minimum.

A deeper result on closed convex sets is the following fundamental separation theorem.

Theorem 1. Let $C \subseteq \mathbb{R}^n$ be a closed convex set. If $w \notin C$, then there exists a hyperplane $L = \{x \in \mathbb{R}^n \mid \alpha x = \beta\}$ separating $C$ from $w$ in the sense that $\alpha w < \beta$ and $\alpha x > \beta$ for every $x \in C$.

Let us sketch the proof. The translation $T(x) = x - w$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ shows that it suffices to prove the theorem for $T(C)$ (which is again a closed convex set) and $T(w) = 0$. So we may assume that $C$ is a closed convex set and $w = 0$. Since $C$ is closed we may find $v_0 \in C$ such that
$$\|v_0\| = \inf\{\|v\| \mid v \in C\}.$$
A good candidate for $L$ is the hyperplane through $v_0/2$ with normal vector $v_0$, i.e.
$$L = \{x \in \mathbb{R}^n \mid v_0 \cdot (x - v_0/2) = 0\},$$
or $\alpha = v_0$ and $\beta = \|v_0\|^2/2$. To show that this works, consider $v \in C$. Since $\{(1 - \lambda) v_0 + \lambda v \mid 0 \le \lambda \le 1\} \subseteq C$, we get
$$\|v_0\|^2 \le \|(1 - \lambda) v_0 + \lambda v\|^2 \qquad (1)$$
for $0 \le \lambda \le 1$. Now expand (1) into a second degree polynomial in $\lambda$ and let $\lambda \to 0^+$ to conclude that $v_0 \cdot v \ge \|v_0\|^2 > 0$.
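The expansion in the last step is short enough to write out in full; the following derivation (my own filling-in, using only the definitions above) closes the sketch.

```latex
\|(1-\lambda)v_0 + \lambda v\|^2
  = \|v_0 + \lambda(v - v_0)\|^2
  = \|v_0\|^2 + 2\lambda\, v_0 \cdot (v - v_0) + \lambda^2 \|v - v_0\|^2 .
```

Combining this with (1), subtracting $\|v_0\|^2$ from both sides and dividing by $\lambda > 0$ gives $0 \le 2\, v_0 \cdot (v - v_0) + \lambda \|v - v_0\|^2$; letting $\lambda \to 0^+$ yields $v_0 \cdot (v - v_0) \ge 0$, i.e. $v_0 \cdot v \ge \|v_0\|^2$.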
Polyhedra

The set of solutions to a system of linear inequalities is called a polyhedron. Recall that in shorthand notation a linear inequality $a_1 x_1 + \cdots + a_n x_n \le b$ in $n$ variables is written as $ax \le b$, where $a$ is the row $n$-vector $(a_1, \dots, a_n)$ and $x = (x_1, \dots, x_n)^t$ is the column $n$-vector consisting of the variables. When we study the common solutions to a system of linear inequalities
$$a_1 x \le b_1, \; \dots, \; a_m x \le b_m$$
we represent this in shorthand as
$$P = \{x \in \mathbb{R}^n \mid Ax \le b\},$$
where $A$ is the $m \times n$-matrix with rows $a_1, \dots, a_m$ and $b$ is the column $m$-vector with entries $b_1, \dots, b_m$; i.e. a polyhedron can be expressed as the set $P$ above. In this note we will have a closer look at the geometry of the set $P$.

The set of solutions to one linear inequality $ax \le b$ is called an (affine) half space. So a polyhedron is simply an intersection of finitely many half spaces. In this context linear programming can be expressed in fancy language as the problem of optimizing a linear function over a polyhedron, i.e.

Maximize $cx$ subject to $x \in P$,

where $c$ is a row $n$-vector.

EXERCISE Prove that a polyhedron is a convex set.

Fourier-Motzkin and applications

What is the geometric interpretation of Fourier-Motzkin elimination? Well, consider the projection $\pi : \mathbb{R}^n \to \mathbb{R}^{n-1}$ that forgets the last coordinate, i.e. $\pi(x_1, \dots, x_n) = (x_1, \dots, x_{n-1})$. Given a polyhedron $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ we consider
$$\pi(P) = \{(x_1, \dots, x_{n-1}) \in \mathbb{R}^{n-1} \mid \exists x_n \in \mathbb{R} : (x_1, \dots, x_{n-1}, x_n) \in P\}.$$
If you look carefully at the mechanics of Fourier-Motzkin elimination you will see that eliminating $x_n$ from $Ax \le b$ corresponds to computing the projection $\pi(P)$. In fact not only are we computing the projection, we are also giving explicit inequalities defining $\pi(P)$ and thereby proving that $\pi(P)$ is a polyhedron when $P$ is. This may sound innocent, but try proving from scratch that the projection of a polyhedron is a polyhedron.

Theorem 2. The projection of a polyhedron is a polyhedron.

From this we have some surprising corollaries.
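The mechanics of one elimination step are easy to make concrete. The sketch below (the function name and the NumPy representation are my own choices, not from the notes) pairs every inequality with a positive $x_n$-coefficient against every one with a negative coefficient, exactly as Fourier-Motzkin prescribes:

```python
import numpy as np

def eliminate_last(A, b):
    """One Fourier-Motzkin step: project {x | Ax <= b} onto the
    first n-1 coordinates by eliminating the last variable."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    pos = [i for i in range(len(b)) if A[i, -1] > 0]
    neg = [i for i in range(len(b)) if A[i, -1] < 0]
    zer = [i for i in range(len(b)) if A[i, -1] == 0]
    rows, rhs = [], []
    # inequalities not involving x_n survive unchanged
    for i in zer:
        rows.append(A[i, :-1])
        rhs.append(b[i])
    # pair every upper bound on x_n with every lower bound
    for i in pos:
        for j in neg:
            rows.append(A[i, :-1] / A[i, -1] - A[j, :-1] / A[j, -1])
            rhs.append(b[i] / A[i, -1] - b[j] / A[j, -1])
    return np.array(rows), np.array(rhs)
```

Projecting the triangle $x_1 + x_2 \le 1$, $x_1 \ge 0$, $x_2 \ge 0$ onto the first coordinate returns the two inequalities $-x_1 \le 0$ and $x_1 \le 1$, i.e. the interval $[0, 1]$.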
Corollary 3. The image $\varphi(P)$ of a polyhedron $P \subseteq \mathbb{R}^m$ under a linear map $\varphi : \mathbb{R}^m \to \mathbb{R}^n$ is a polyhedron.

For the proof, let $\varphi$ be represented by a matrix $A$. Then the statement is that
$$Q = \{y \in \mathbb{R}^n \mid \exists x \in P : y = Ax\}$$
is a polyhedron. But $Q$ is the projection onto the last $n$ coordinates of the polyhedron
$$\{(x, y) \in \mathbb{R}^{m+n} \mid y = Ax,\ x \in P\}$$
in $\mathbb{R}^{m+n}$.

Corollary 4. The convex hull of finitely many points is a (bounded) polyhedron.

This follows, since the convex hull of $m$ points in $\mathbb{R}^n$ may be viewed as the image of the polytope
$$\{(\lambda_1, \dots, \lambda_m) \in \mathbb{R}^m \mid \lambda_i \ge 0,\ \lambda_1 + \cdots + \lambda_m = 1\}$$
under an $n \times m$ matrix.

Extreme points, vertices and bfs

In the following $C$ denotes a convex set. A point $z \in C$ is called an extreme point if $z \notin [x, y]$ for every $x, y \in C \setminus \{z\}$, i.e. $z$ is always an end point of the line segments you can draw inside $C$. Similarly, $z$ is called a vertex if there exists $c$ with $cz > cx$ for every $x \in C \setminus \{z\}$, i.e. $z$ is the unique optimum for some linear function given by $c$. Our intuition seems to tell us that vertices and extreme points are the same.

EXERCISE Give an example of a convex set without vertices and extreme points. It is not too hard to see that a vertex has to be an extreme point in a convex set. In fact you should prove this! What about an extreme point? Does it have to be a vertex? Make some drawings in attacking this exercise. Also give an example of a convex set with infinitely many vertices.

Of course we are interested in vertices and extreme points of polyhedra. Here the situation is easier than for general convex sets. However, while the definition of vertices and extreme points is pleasing from a pure mathematical viewpoint, in computational life it does not help us. We need an algebraic characterization of these notions. First we introduce a bit of notation. The matrix $A$ has $m$ rows. For a subset $I \subseteq \{1, \dots, m\}$ we let $A_I$ denote the submatrix of $A$ consisting of the rows with indices in $I$. Now we define, for $z \in P$,
$$I_z = \{i \mid 1 \le i \le m \text{ and } a_i z = b_i\} \subseteq \{1, 2, \dots, m\}.$$
The book calls $I_z$ the set of active constraints at $z \in P$.
We call $z$ a basic feasible solution (bfs) if $\{a_i \mid i \in I_z\}$ contains $n$ linearly independent vectors or, what amounts to the same thing: the matrix $A_{I_z}$ has full rank $n$. We call $z$ degenerate if $z$ is a bfs with $|I_z| > n$.

Theorem 5. Let $P$ be a polyhedron. Then extreme points, vertices and basic feasible solutions are one and the same thing.
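The algebraic characterization is easy to check by machine. The following sketch (function names and the NumPy representation are my own) computes the active set $I_z$ and tests whether $A_{I_z}$ has full rank $n$:

```python
import numpy as np

def active_set(A, b, z, tol=1e-9):
    """Indices i with a_i z = b_i (the active constraints at z)."""
    A, b, z = np.asarray(A, float), np.asarray(b, float), np.asarray(z, float)
    return [i for i in range(len(b)) if abs(A[i] @ z - b[i]) <= tol]

def is_bfs(A, b, z, tol=1e-9):
    """z is a basic feasible solution iff z lies in P = {x | Ax <= b}
    and the active-constraint matrix A_{I_z} has rank n."""
    A, b, z = np.asarray(A, float), np.asarray(b, float), np.asarray(z, float)
    if np.any(A @ z > b + tol):          # z must be feasible
        return False
    I = active_set(A, b, z, tol)
    return len(I) > 0 and np.linalg.matrix_rank(A[I]) == A.shape[1]
```

For the unit square $0 \le x_1, x_2 \le 1$ the corner $(0, 0)$ is a bfs, while at $(1/2, 0)$ only one constraint is active, so the rank condition fails.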
This result is really nice! You immediately get an upper bound on how many vertices a polyhedron can have, given its representation as $P = \{x \mid Ax \le b\}$ (how?).

We already know that a vertex is an extreme point. Let us show that an extreme point must be a bfs. We prove that if $z \in P$ is not a bfs, then it cannot be an extreme point: if $z$ is not a bfs, then there exists a non-zero $x \in \mathbb{R}^n$ such that $A_{I_z} x = 0$ (why exactly is this?). We can use this $x$ to cook up a line segment containing $z$ in its interior. In fact, for $\varepsilon > 0$ small we can prove, by looking at the inequalities defining $P$, that $z \pm \varepsilon x \in P$. Therefore $z \in [z - \varepsilon x, z + \varepsilon x]$, since
$$z = \tfrac{1}{2}(z - \varepsilon x) + \tfrac{1}{2}(z + \varepsilon x),$$
and $z$ cannot be an extreme point. This shows that an extreme point has to be a bfs.

Let us finally show that a bfs is a vertex. Given a bfs $z \in P$ we need to find a linear function given by $c \in \mathbb{R}^n$ such that $cz > cx$ for every $x \in P \setminus \{z\}$. Here the little trick is to put
$$c := \sum_{i \in I_z} a_i.$$
Then
$$cx = \sum_{i \in I_z} a_i x \le \sum_{i \in I_z} b_i = cz$$
for $x \in P$. If $cx = cz$, then $a_i x = a_i z = b_i$ for every $i \in I_z$ and therefore $x = z$, since $A_{I_z}$ has full rank $n$.

EXERCISE The unit cube $U$ in $\mathbb{R}^n$ is the polyhedron given by the inequalities
$$0 \le x_1 \le 1, \quad \dots, \quad 0 \le x_n \le 1.$$
Write $U$ as $\{x \in \mathbb{R}^n \mid Ax \le b\}$ for suitable $A$ and $b$ and show that $U$ has $2^n$ vertices.

Why vertices and extreme points?

Why are we interested in vertices and extreme points? Well, in a linear program

Maximize $cx$ subject to $x \in P$, (1)

where $P$ is a polyhedron, the following fundamental fact holds: if (1) has an optimal solution, i.e. if there exists $z \in P$ with $cz \ge cx$ for every $x \in P$, then such an optimal solution exists among the vertices of $P$! This means that a potentially infinite problem
has been converted into a finite one, since $P$ has finitely many vertices! All we have to do is search among the vertices in a clever way (the simplex algorithm!). Let us see why this is so. Consider a polyhedron $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ and suppose we are given $c \in \mathbb{R}^n$ and $z \in P$ such that $cz \ge cx$ for every $x \in P$. We have to prove the existence of an extreme point $w \in P$ with $cw = cz$. Let $M := cz$ and consider the polyhedron
$$Q := \{y \in P \mid cy = M\}.$$
Let $w$ be an extreme point of $Q$. Then $w$ is also an extreme point of $P$ and thereby a vertex of $P$: if $w$ were not an extreme point of $P$, then $w = \lambda u + (1 - \lambda) v$ for some $u, v \in P \setminus \{w\}$ with $0 < \lambda < 1$; but $cu \le M$, $cv \le M$ and $\lambda\, cu + (1 - \lambda)\, cv = cw = M$ force $cu = cv = M$, so $u, v \in Q$ and $w$ would not be an extreme point of $Q$ either.

There is a tremendous gap in the above argument! How do we know that $Q$ contains an extreme point? This is where we need the following.

Existence of extreme points

Recall that a line in $\mathbb{R}^n$ has the form $\{x + \lambda d \mid \lambda \in \mathbb{R}\}$ for suitable $x, d \in \mathbb{R}^n$ with $d \ne 0$. With this definition we can state the following fundamental result.

Theorem 6. A polyhedron $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ contains an extreme point if and only if it does not contain a line. Moreover, $P$ contains an extreme point if and only if $A$ has full rank $n$.

The most beautiful part of the proof is showing that a polyhedron not containing a line must have an extreme point: start with a point $z \in P$. If $z$ is not extreme, then there exists a non-zero vector $d$ with $A_{I_z} d = 0$. Now move along the line $z + \lambda d$, varying $\lambda$ to the value of smallest absolute value at which a constraint outside $I_z$ becomes binding. This results in a new point $z' = z + \lambda' d \in P$ with the property that $A_{I_{z'}}$ has bigger rank than $A_{I_z}$. Continue this procedure with $z'$, unless of course $A_{I_{z'}}$ already has maximal rank. The only thing used in this argument is that $P$ does not contain a line. The proof that a polyhedron with an extreme point does not contain a line is a bit easier.

Neighboring vertices

Consider two vertices $z$ and $z'$ of a polyhedron $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$. Then $z$ and $z'$ are connected by an edge (we have not formally defined edges) if and only if $\{a_i \mid i \in I_z \cap I_{z'}\}$ contains $n - 1$ linearly independent rows.
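The rank-increasing step in the proof can be sketched numerically. The code below (names my own; a sketch assuming some constraint eventually stops the ray, which is guaranteed when $P$ contains no line) moves along $z + \lambda d$ to the first point where a constraint outside $I_z$ becomes binding:

```python
import numpy as np

def step_to_binding(A, b, z, d, tol=1e-9):
    """Move from z along the line z + lam*d (both directions allowed)
    until some constraint a_i x <= b_i becomes binding; return the new
    point.  Assumes P = {x | Ax <= b} contains no line, so at least
    one constraint stops the ray."""
    A, b, z, d = (np.asarray(v, float) for v in (A, b, z, d))
    slack = b - A @ z          # nonnegative for z in P
    Ad = A @ d
    lams = []
    for s, r in zip(slack, Ad):
        if r > tol:            # this constraint binds at positive lam
            lams.append(s / r)
        elif r < -tol:         # this one binds at negative lam
            lams.append(s / r)
    lam = min(lams, key=abs)   # smallest absolute value, as in the proof
    return z + lam * d
```

Starting at the interior point $(1/4, 1/2)$ of the unit square with direction $d = (1, 0)$, the nearest constraint along the line is $x_1 \ge 0$, so the step lands on $(0, 1/2)$, where one more constraint is active.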
Polytopes

A polytope is a bounded polyhedron. We have the following result.

Theorem 7. A polytope is the convex hull of its extreme points.

Let $P$ be a polytope and let $C$ be the convex hull of its extreme points (why is $C \ne \emptyset$?). Clearly $C \subseteq P$. Suppose that $x \in P \setminus C$. Then we can find a hyperplane separating $x$ from $C$, i.e. there exists $c \in \mathbb{R}^n$ such that $cx < cz$ for every $z \in C$. But there exists an extreme point $z_0$ of $P$ such that $cz_0 = \min\{cy \mid y \in P\}$. This is a contradiction, since $z_0 \in C$ and therefore $cz_0 \le cx < cz_0$.

The standard form of polyhedra

Every polyhedron, i.e. every set of solutions to a system of linear inequalities, may be transcribed into the form
$$P = \{x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0\}$$
by adding (slack) variables. This form turns out to be particularly convenient for enumerating the vertices and, moreover, for moving from one vertex to its neighboring vertices. If $A$ has $m$ rows we are in a way forcing the first $m$ conditions of the polyhedron to be binding. This gives a nice combinatorial algorithm for computing vertices, and eventually it leads to the famous simplex algorithm.

Consider a vertex $z \in P$, i.e. $Az = b$ and $z \ge 0$. Let $z_{i_1}, \dots, z_{i_k}$ denote the non-zero entries of $z = (z_1, \dots, z_n)^t \in \mathbb{R}^n$. Then
$$z_{i_1} A_{i_1} + \cdots + z_{i_k} A_{i_k} = b, \qquad (2)$$
where $A_j$ denotes column $j$ of $A$. The columns $A_{i_1}, \dots, A_{i_k}$ have to be linearly independent. Suppose on the contrary that there exist $\lambda_1, \dots, \lambda_k$, not all zero, such that
$$\lambda_1 A_{i_1} + \cdots + \lambda_k A_{i_k} = 0.$$
Then
$$(z_{i_1} + \lambda_1) A_{i_1} + \cdots + (z_{i_k} + \lambda_k) A_{i_k} = b,$$
contradicting that (2) has a unique solution.

This is the advantage of the standard form: you can find all the vertices easily. Suppose that the rank of the matrix is $m$, i.e. the maximal number of linearly independent columns of $A$ is $m$. Then one can search through the at most $\binom{n}{m}$ choices of $m$ linearly independent columns and find an optimum this way. The algorithm is to pick $m$ linearly independent columns, solve the equation $Az = b$ with the remaining variables set to $0$, and check that $z \ge 0$.
EXAMPLE The two-dimensional polyhedron given by
$$x_1 + 2x_2 \le 3$$
$$2x_1 + x_2 \le 3$$
$$x_1 \ge 0$$
$$x_2 \ge 0$$
lifts to the standard form
$$x_1 + 2x_2 + x_3 = 3$$
$$2x_1 + x_2 + x_4 = 3,$$
where $x_1, x_2, x_3, x_4 \ge 0$. Here we try out the $\binom{4}{2} = 6$ combinations of $2$ columns out of $4$ in the matrix
$$A = \begin{pmatrix} 1 & 2 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{pmatrix}$$
to find the basic feasible solutions of
$$\begin{pmatrix} 1 & 2 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}.$$
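Carrying out this search by machine takes only a few lines of linear algebra. The sketch below (the function name and the duplicate-removal step are my own choices) tries all $\binom{n}{m}$ column choices, solves for the basic variables and keeps the nonnegative solutions:

```python
from itertools import combinations
import numpy as np

def standard_form_bfs(A, b, tol=1e-9):
    """Enumerate basic feasible solutions of {x | Ax = b, x >= 0} by
    trying every choice of m columns of A (m = number of rows)."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    m, n = A.shape
    sols = []
    for cols in combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < tol:   # columns not independent
            continue
        xB = np.linalg.solve(B, b)        # basic variables
        if np.all(xB >= -tol):            # feasibility: x >= 0
            x = np.zeros(n)
            x[list(cols)] = xB
            sols.append(x)
    # merge duplicates arising from degenerate vertices
    unique = {tuple(np.round(x, 9)) for x in sols}
    return sorted(unique)
```

For the example above this finds the four basic feasible solutions $(1, 1, 0, 0)$, $(3/2, 0, 3/2, 0)$, $(0, 3/2, 0, 3/2)$ and $(0, 0, 3, 3)$; their first two coordinates are exactly the vertices of the original two-dimensional polyhedron.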