Alternating Projections


Stephen Boyd and Jon Dattorro
EE392o, Stanford University
Autumn, 2003

1 Alternating projection algorithm

Alternating projections is a very simple algorithm for computing a point in the intersection of some convex sets, using a sequence of projections onto the sets. Like a gradient or subgradient method, alternating projections can be slow, but the method can be useful when we have some efficient method, such as an analytical formula, for carrying out the projections. In these notes, we use only the Euclidean norm, Euclidean distance, and Euclidean projection.

Suppose $C$ and $D$ are closed convex sets in $\mathbf{R}^n$, and let $P_C$ and $P_D$ denote projection onto $C$ and $D$, respectively. The algorithm starts with any $x_0 \in C$, and then alternately projects onto $D$ and $C$:
$$y_k = P_D(x_k), \qquad x_{k+1} = P_C(y_k), \qquad k = 0, 1, 2, \ldots$$
This generates a sequence of points $x_k \in C$ and $y_k \in D$. A basic result, due to Cheney and Goldstein [CG59], is the following. If $C \cap D \neq \emptyset$, then the sequences $x_k$ and $y_k$ both converge to a point $x^\star \in C \cap D$. Roughly speaking, alternating projections finds a point in the intersection of the sets, provided they intersect. Note that we are not claiming the algorithm produces a point in $C \cap D$ in a finite number of steps; we are claiming that the sequence $x_k$ (which lies in $C$) satisfies $\mathrm{dist}(x_k, D) \to 0$, and likewise for $y_k$. A simple example is illustrated in figure 1.

Alternating projections is also useful when the sets do not intersect. In this case we can prove the following. Assume the distance between $C$ and $D$ is achieved (i.e., there exist points in $C$ and $D$ whose distance is $\mathrm{dist}(C, D)$). Then $x_k \to x^\star \in C$ and $y_k \to y^\star \in D$, where $\|x^\star - y^\star\|_2 = \mathrm{dist}(C, D)$. In other words, alternating projections yields a pair of points in $C$ and $D$ that have minimum distance. In this case, alternating projections also yields (in the limit) a hyperplane that separates $C$ and $D$. A simple example is illustrated in figure 2.

There are many variations on and extensions of the basic alternating projections algorithm. For example, we can find a point in the intersection of $k > 2$ convex sets, by projecting onto $C_1$, then $C_2$, ..., then $C_k$, and then repeating the cycle of $k$ projections. (This is called the sequential or cyclic projection algorithm, instead of alternating projection.) We will describe several other extensions below.
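To make the iteration concrete, here is a minimal numerical sketch (not from the original notes; the sets, data, and function names below are illustrative assumptions) of alternating projections between two sets with simple analytical projections, a Euclidean ball and a halfspace, written in Python with numpy.

```python
# Minimal sketch (illustrative data, not from the notes): alternating
# projections between a Euclidean ball C and a halfspace D in R^2,
# both of which have simple analytical projections.
import numpy as np

def proj_ball(x, center, radius):
    """Project x onto the ball {z : ||z - center||_2 <= radius}."""
    d = x - center
    nrm = np.linalg.norm(d)
    return x if nrm <= radius else center + radius * d / nrm

def proj_halfspace(x, a, b):
    """Project x onto the halfspace {z : a^T z <= b}, assuming ||a||_2 = 1."""
    viol = a @ x - b
    return x if viol <= 0 else x - viol * a

# Example sets: C = unit ball at the origin, D = {z : z_1 <= -0.5}.
center, radius = np.zeros(2), 1.0
a, b = np.array([1.0, 0.0]), -0.5

x = np.array([0.9, 0.3])                 # any starting point x_0 in C
for k in range(100):
    y = proj_halfspace(x, a, b)          # y_k = P_D(x_k)
    x = proj_ball(y, center, radius)     # x_{k+1} = P_C(y_k)

print(x, np.linalg.norm(x), a @ x - b)   # x is (nearly) in both sets
```

With this data the two sets intersect, so both sequences converge to a point of the intersection, as the Cheney and Goldstein result above guarantees.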

Figure 1: First few iterations of the alternating projection algorithm. Both sequences are converging to the point $x^\star \in C \cap D$.

Figure 2: First few iterations of the alternating projection algorithm, for a case in which $C \cap D = \emptyset$. The sequence $x_k$ is converging to $x^\star \in C$, and the sequence $y_k$ is converging to $y^\star \in D$, where $\|x^\star - y^\star\|_2 = \mathrm{dist}(C, D)$.

2 Convergence proof

We will prove convergence of the alternating projections method for the case $C \cap D \neq \emptyset$. The proof of convergence for the case $C \cap D = \emptyset$ is similar (and can be found in Cheney and Goldstein, for example). See also Bauschke and Borwein [BB96], Bregman [Bre67], and Akgül [Akg84] for related material.

Let $x$ be any point in the intersection $C \cap D$. We claim that each projection brings the point closer to $x$. To see this, we first observe that since $y_k$ is the projection of $x_k$ onto $D$, we have
$$D \subseteq \{ z \mid (x_k - y_k)^T (z - y_k) \leq 0 \}.$$
In other words, the halfspace passing through $y_k$, with outward normal $x_k - y_k$, contains $D$. This follows from the optimality conditions for Euclidean projection, and is very easy to show directly in any case: if any point of $D$ were on the other side of the hyperplane, a small step from $y_k$ towards that point would give a point in $D$ that is closer to $x_k$ than $y_k$, which is impossible.

Now we note that
$$\|x_k - x\|_2^2 = \|x_k - y_k + y_k - x\|_2^2 = \|x_k - y_k\|_2^2 + \|y_k - x\|_2^2 + 2(x_k - y_k)^T(y_k - x) \geq \|x_k - y_k\|_2^2 + \|y_k - x\|_2^2,$$
using our observation above (with $z = x \in D$, the cross term is nonnegative). Thus we have
$$\|y_k - x\|_2^2 \leq \|x_k - x\|_2^2 - \|y_k - x_k\|_2^2. \qquad (1)$$
This shows that $y_k$ is closer to $x$ than $x_k$ is. In a similar way we can show that
$$\|x_{k+1} - x\|_2^2 \leq \|y_k - x\|_2^2 - \|x_{k+1} - y_k\|_2^2, \qquad (2)$$
i.e., $x_{k+1}$ is closer to $x$ than $y_k$ is.

We can make several conclusions. First, all the iterates are no farther from $x$ than $x_0$:
$$\|x_k - x\|_2 \leq \|x_0 - x\|_2, \qquad \|y_k - x\|_2 \leq \|x_0 - x\|_2, \qquad k = 1, 2, \ldots$$
In particular, we conclude that the sequences $x_k$ and $y_k$ are bounded. Therefore the sequence $x_k$ has an accumulation point $\bar x$. Since $C$ is closed, and $x_k \in C$, we have $\bar x \in C$. We are going to show that $\bar x \in D$, and that the sequences $x_k$ and $y_k$ both converge to $\bar x$.

From (1) and (2), we find that the sequence
$$\|x_0 - x\|_2, \ \|y_0 - x\|_2, \ \|x_1 - x\|_2, \ \|y_1 - x\|_2, \ \ldots$$
is decreasing, and so converges. We then conclude from (1) and (2) that $\|y_k - x_k\|_2$ and $\|x_{k+1} - y_k\|_2$ must converge to zero. A subsequence of $x_k$ converges to $\bar x$. From
$$\mathrm{dist}(x_k, D) = \|x_k - y_k\|_2 \to 0$$

and closedness of $D$, we conclude that $\bar x \in D$. Thus, $\bar x \in C \cap D$. Since $\bar x$ is in the intersection, we can take $x = \bar x$ above (since $x$ was any point in the intersection) to find that the distance of both $x_k$ and $y_k$ to $\bar x$ is decreasing. Since a subsequence converges to zero, we conclude that $\|x_k - \bar x\|_2$ and $\|y_k - \bar x\|_2$ both converge to zero.

3 Extensions and variations

One simple variation starts with $x_0 \in C$ and $y_0 \in D$. We then find the average, $z_0 = (x_0 + y_0)/2$, and set $x_1 = P_C(z_0)$, $y_1 = P_D(z_0)$. We then repeat. In this algorithm, we average the current pair of points (one in $C$, one in $D$), and then project onto $C$ and $D$ respectively. In fact, this is the same as applying the alternating projection algorithm in the product space, using the sets
$$C \times D, \qquad \{(u, v) \in \mathbf{R}^n \times \mathbf{R}^n \mid u = v\}.$$

It is also possible to use under projection or over projection. We set, for example,
$$y_k = \theta P_D(x_k) + (1 - \theta) x_k,$$
where $\theta \in (0, 1)$ for under projection, and $\theta \in (1, 2)$ for over projection. In under projection, we step only the fraction $\theta$ of the way from the current point to $D$; in over projection we move farther than the projection. Over projection is sometimes used to find a point in the intersection in a finite number of steps (assuming the intersection has nonempty interior). In both cases, the important part is that each step brings the point closer to $x$, for any point $x \in C \cap D$.

When sequential projection is done on more than two sets, the order does not have to be cyclic. We can project the current point onto any of the sets it is not in, provided every set is projected onto infinitely often.

4 Example: SDP feasibility

As an example we consider the problem of finding $X \in \mathbf{S}^n$ that satisfies
$$X \succeq 0, \qquad \mathbf{Tr}(A_i X) = b_i, \quad i = 1, \ldots, m,$$
where $A_i \in \mathbf{S}^n$. Here we take $C$ to be the positive semidefinite cone $\mathbf{S}^n_+$, and we take $D$ to be the affine set in $\mathbf{S}^n$ defined by the linear equalities. The Euclidean norm here is the Frobenius norm.

The projection of the iterate $Y_k$ onto $C$ can be found from the eigenvalue decomposition $Y_k = \sum_{i=1}^n \lambda_i q_i q_i^T$ (see [BV03, §8.1.1]):
$$P_C(Y_k) = \sum_{i=1}^n \max\{0, \lambda_i\}\, q_i q_i^T.$$
The projection of the iterate $X_k$ onto the affine set is also easy to work out:
$$P_D(X_k) = X_k - \sum_{i=1}^m u_i A_i, \qquad (3)$$

where the $u_i$ are found from the normal equations
$$G u = \big(\mathbf{Tr}(A_1 X_k) - b_1, \ldots, \mathbf{Tr}(A_m X_k) - b_m\big), \qquad G_{ij} = \mathbf{Tr}(A_i A_j). \qquad (4)$$
Alternating projections converges to a point in the intersection if it is nonempty; otherwise it converges to a positive semidefinite matrix and a symmetric matrix satisfying the linear equalities that are at minimum Frobenius distance from each other.

Now we consider a numerical example, which is a positive semidefinite matrix completion problem [BV03, exer. 4.47]. We are given a matrix in $\mathbf{S}^n$ with some of its entries (including all of its diagonal entries) fixed, and the others to be found. The goal is to find values for the other entries so that the (completed) matrix is positive semidefinite. For this problem, orthogonal projection onto the equality constraints is extremely easy: we simply take the matrix $X$ and set its fixed entries back to the fixed values given. Thus, we alternate between eigenvalue decomposition and truncation, and re-setting the fixed entries back to their required values.

As a specific example we consider
$$X = \begin{bmatrix} 4 & 3 & ? & 2 \\ 3 & 4 & 3 & ? \\ ? & 3 & 4 & 3 \\ 2 & ? & 3 & 4 \end{bmatrix},$$
where the question marks denote the entries to be determined. We initialize with $Y_0 = X$, taking the unknown entries as 0.

To track convergence of the algorithm, we plot $d_k = \|X_k - Y_{k-1}\|_F$, which is the square root of the sum of the squares of the negative eigenvalues of $Y_{k-1}$, and also the distance between $Y_{k-1}$ and the positive semidefinite cone. We also plot $\tilde d_k = \|Y_k - X_k\|_F$, which is the square root of the sum of the squares of the adjustments made to the fixed entries of $X_k$, and also the distance between $X_k$ and the affine set defined by the linear equalities. These are plotted in figure 3. In this example, both unknown entries converge quickly to the same value, 1.5858 (rounded), which yields a positive semidefinite completion of the original matrix. The plots show that convergence is linear.

5 Example: Relaxation method for linear inequalities

As another simple application, suppose we want to find a point in the polyhedron
$$P = \{x \mid a_i^T x \leq b_i, \ i = 1, \ldots, m\},$$
assuming it is nonempty. (This can of course be done using linear programming.) We will use alternating projections onto the $m$ halfspaces $a_i^T x \leq b_i$ to do this. By scaling we can assume, without loss of generality, that $\|a_i\|_2 = 1$. The projection of a point $z$ onto the halfspace $H_i$ defined by $a_i^T x \leq b_i$ is then given by
$$P_i(z) = \begin{cases} z, & a_i^T z \leq b_i \\ z - (a_i^T z - b_i) a_i, & a_i^T z > b_i. \end{cases}$$
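As a sketch of how the projection $P_i$ above might be applied cyclically (this code is not part of the original notes; the random problem data and helper names are assumptions), one sweep-based implementation in Python/numpy could look as follows.

```python
# Minimal sketch (assumed names, random placeholder data): project onto a
# single halfspace {x : a^T x <= b} with ||a|| = 1, and sweep cyclically
# through all m halfspaces until the maximum violation is small.
import numpy as np

def proj_halfspace(z, a, b):
    viol = a @ z - b
    return z if viol <= 0 else z - viol * a      # P_i(z) from the formula above

rng = np.random.default_rng(0)
n, m = 100, 1000
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)    # normalize so ||a_i|| = 1
b = A @ rng.standard_normal(n) + 0.1             # feasible by construction

x = np.zeros(n)
for sweep in range(50):                          # repeated cyclic sweeps
    for i in range(m):
        x = proj_halfspace(x, A[i], b[i])
    r = np.max(np.maximum(0.0, A @ x - b))       # maximum constraint violation
    if r < 1e-8:
        break
print(sweep, r)
```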

Figure 3: Convergence of alternating projections for a matrix completion problem. $d_k$ (solid line) gives the distance from $Y_{k-1}$ (which satisfies the equality constraints) to the positive semidefinite cone; $\tilde d_k$ (dashed) gives the distance from $X_k$ (which is positive semidefinite) to the affine set defined by the equality constraints.
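The alternate-truncate-and-reset iteration whose convergence figure 3 tracks can be sketched in a few lines of numpy. This is an illustrative reconstruction, not the authors' code; the iteration count and helper names are assumptions.

```python
# Sketch of the matrix completion iteration described above: alternate between
# eigenvalue truncation (projection onto the PSD cone) and resetting the fixed
# entries (projection onto the equality constraints).
import numpy as np

X_fixed = np.array([[4., 3., 0., 2.],
                    [3., 4., 3., 0.],
                    [0., 3., 4., 3.],
                    [2., 0., 3., 4.]])
fixed_mask = np.array([[1, 1, 0, 1],
                       [1, 1, 1, 0],
                       [0, 1, 1, 1],
                       [1, 0, 1, 1]], dtype=bool)   # True where the entry is given

Y = X_fixed.copy()                                  # Y_0: unknown entries set to 0
for k in range(100):
    lam, Q = np.linalg.eigh(Y)                      # eigenvalue decomposition of Y_k
    X = (Q * np.maximum(lam, 0.0)) @ Q.T            # X_{k+1} = P_C(Y_k): drop negative eigenvalues
    Y = X.copy()
    Y[fixed_mask] = X_fixed[fixed_mask]             # Y_{k+1}: reset fixed entries
print(np.round(Y, 4))                               # unknown entries approach roughly 1.5858
```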

Figure 4: Maximum constraint violation versus iteration for the relaxation method for solving linear inequalities, for a problem with $n = 100$ variables and $m = 1000$ inequalities.

By cycling through these projections, we generate a sequence that converges to a point in $P$. This very simple algorithm for solving a set of linear inequalities is called the relaxation method for linear inequalities (see [Agm54, Ere65]). It was used, for example, to find a separating hyperplane between two sets of points. In this context it was dubbed the perceptron algorithm (see, e.g., [WW96]).

To demonstrate convergence, we find a feasible point in a polyhedron in $\mathbf{R}^{100}$ defined by $m = 1000$ randomly chosen linear inequalities $a_i^T x \leq b_i$, $i = 1, \ldots, m$, where $\|a_i\|_2 = 1$. A starting point not in the polyhedron is selected. To show convergence, we plot the maximum constraint violation,
$$r_k = \max_{i=1,\ldots,m} \max\{0, \ a_i^T x_k - b_i\},$$
versus iteration number, in figure 4.

6 Example: Row and column sum bounds

We consider the problem of finding a matrix $X \in \mathbf{R}^{m \times n}$ whose row sums and column sums must lie in specified ranges, i.e.,
$$\sum_{j=1}^n X_{ij} \in [\underline r_i, \overline r_i], \quad i = 1, \ldots, m, \qquad \sum_{i=1}^m X_{ij} \in [\underline c_j, \overline c_j], \quad j = 1, \ldots, n.$$

Here $\underline r_i$ and $\overline r_i$ are given lower and upper bounds for the sum of the $i$th row, and $\underline c_j$ and $\overline c_j$ are given lower and upper bounds for the sum of the $j$th column. We can, of course, solve this feasibility problem using linear programming.

Alternating projections for this problem is very simple. To project a matrix onto the row sum range set, we can project each row separately. Projecting a vector $z$ onto the set $\{z \mid \alpha \leq \mathbf{1}^T z \leq \beta\}$ is easy: the projection is
$$P(z) = \begin{cases} z, & \alpha \leq \mathbf{1}^T z \leq \beta \\ z - ((\mathbf{1}^T z - \beta)/n)\mathbf{1}, & \mathbf{1}^T z > \beta \\ z - ((\mathbf{1}^T z - \alpha)/n)\mathbf{1}, & \mathbf{1}^T z < \alpha, \end{cases}$$
where $n$ is the length of $z$.

Alternating projections proceeds by alternately projecting the rows onto their required ranges (which can be done in parallel), and projecting the columns onto their required ranges (which also can be done in parallel).
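A minimal sketch of these row and column projections (again not from the original notes; numpy assumed, with placeholder bounds and function names) might look as follows.

```python
# Sketch (assumed names, placeholder data): project each row sum into
# [r_lo, r_hi] and each column sum into [c_lo, c_hi] by shifting every
# entry of the row (or column) by the same amount.
import numpy as np

def proj_sum_range(v, lo, hi):
    """Project vector v onto {z : lo <= sum(z) <= hi}."""
    s = v.sum()
    if s > hi:
        return v - (s - hi) / v.size
    if s < lo:
        return v + (lo - s) / v.size
    return v

def project_rows(X, lo, hi):
    return np.vstack([proj_sum_range(X[i], lo[i], hi[i]) for i in range(X.shape[0])])

m, n = 4, 6
r_lo, r_hi = np.full(m, 2.0), np.full(m, 3.0)    # row sum ranges (placeholders)
c_lo, c_hi = np.full(n, 1.0), np.full(n, 2.5)    # column sum ranges (placeholders)

X = np.zeros((m, n))
for k in range(200):
    X = project_rows(X, r_lo, r_hi)              # all rows, independently (parallelizable)
    X = project_rows(X.T, c_lo, c_hi).T          # all columns, via the transpose
print(X.sum(axis=1), X.sum(axis=0))              # row and column sums land in their ranges
```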

References

[Agm54] S. Agmon. The relaxation method for linear inequalities. Canadian Journal of Mathematics, 6:382-392, 1954.

[Akg84] M. Akgül. Topics in Relaxation and Ellipsoidal Methods, volume 97 of Research Notes in Mathematics. Pitman, 1984.

[BB96] H. Bauschke and J. Borwein. On projection algorithms for solving convex feasibility problems. SIAM Review, 38(3):367-426, 1996.

[Bre67] L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200-217, 1967.

[BV03] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2003.

[CG59] W. Cheney and A. Goldstein. Proximity maps for convex sets. Proceedings of the AMS, 10:448-450, 1959.

[Ere65] I. Eremin. The relaxation method of solving systems of inequalities with convex functions on the left sides. Soviet Mathematics Doklady, 6:219-222, 1965.

[WW96] B. Widrow and E. Wallach. Adaptive Inverse Control. Prentice-Hall, 1996.