CSE 599: Interplay between Convex Optimization and Geometry (Winter 2018)
Lecturer: Yin Tat Lee
Lecture 1: Introduction

Disclaimer: Please tell me about any mistakes you notice.

1.1 Course Information

Objective: Recently, there has been a lot of progress on faster algorithms for various optimization problems in theory, especially in combinatorial optimization. For example, the fastest known algorithms for maximum flow, matroid intersection, and submodular minimization have been improved for the first time in decades. On the theory side, we will cover some of the optimization techniques used in those results and, more importantly, some of the basic mathematical tools underlying them. On the application side, many of these algorithms are very different from what people are using and have the potential to be disruptive.

Course requirement: 2 problem sets, consisting of exercises from the lecture notes (50%), and a final project (50%).

Course project: It can be anything related to convexity. For example:
1. A research project related to convexity. It can be any open problem mentioned in class, any problem you are interested in, or any part of your own ongoing research.
2. A survey-type report on something related to convexity.
3. An implementation project on some ideas learned from the class or, more generally, on geometric methods for convex optimization.
4. Others.
(Optional) I encourage you to send me a 1-page project proposal by Feb 21, describing what you want to do and why you think it is interesting.

Prerequisites: I try to be as self-contained as possible. However, I assume you are familiar with linear algebra, multivariate calculus, probability, inequalities, and algorithms. Most importantly, I expect you to love mathematics!

1.2 Course Overview

Convexity is a main source of easy problems. To exaggerate, some(?) even think that an optimization problem is easy if and only if it can be rewritten as a convex problem. See Figure 1.1 for a list of convex problems of increasing difficulty.

In this course, we will first cover cutting plane methods. These methods show that all convex optimization problems can be solved in polynomial time (under some oracle assumptions). Next, we will cover first-order methods. Although they do not appear in Figure 1.1, they are the methods of choice in many machine learning applications; we will cover the variants that have had more impact on theory. Then, we will cover different methods for solving linear systems, including the Cholesky decomposition for Laplacian matrices. Finally, we will talk about interior point methods.

Figure 1.1: A list of convex problems of increasing difficulty.

Although convex optimization has been studied for more than half a century, there are still many basic open problems. Here is an example:

Problem. Can we solve a random sparse linear system with $O(n)$ non-zeros faster than $n^2$ time? (One can do it in $O(n^2 \log(n/\epsilon))$ time by Chebyshev polynomials. Not 100% sure.)

In every lecture, I will give some open problem(s). Some are famous and some are not. If you are interested in a problem, feel free to talk to me; I have some thoughts on many of them. If you make enough progress on any problem (not exercise) in the lecture notes, then you will get a 100% score for the course and will not need to do any problem set or project. (I took a theory course in undergrad with a similar policy. I solved a problem and got a STOC paper.)

1.3 Convex sets

Definition 1.3.1. A set $K \subseteq \mathbb{R}^n$ is convex if for any $x, y \in K$ and any $0 \leq \theta \leq 1$, we have $\theta x + (1-\theta) y \in K$. A set $K$ is a polyhedron if $K = \{x \in \mathbb{R}^n : Ax \leq b\}$ for some matrix $A$ and some vector $b$.

The main reason convex sets are so nice is that we can always use a hyperplane to separate a point outside the set from the set itself. Such a hyperplane allows us to do binary search to find a point in a convex set. As we will see in the next lecture, such a hyperplane can be computed efficiently.

Theorem 1.3.2. Let $K$ be a convex set in $\mathbb{R}^n$ and $y \notin K$. Then, we can find a $\theta \in \mathbb{R}^n$ such that
$$\langle \theta, y \rangle \geq \max_{x \in K} \langle \theta, x \rangle.$$
If $K$ is closed, the inequality is strict.

Remark. In infinite dimensions, this theorem is called the Hahn-Banach theorem and is useful in functional analysis.

Proof. Let $x^*$ be a point in $K$ closest to $y$, namely $x^* \in \operatorname{argmin}_{x \in K} \|x - y\|^2$. Using the convexity of $K$, for any $x \in K$ and any $0 \leq t \leq 1$, we have that
$$\|(1-t)x^* + tx - y\|^2 \geq \|x^* - y\|^2.$$
Expanding the left-hand side, we have
$$\|(1-t)x^* + tx - y\|^2 = \|x^* - y + t(x - x^*)\|^2 = \|x^* - y\|^2 + 2t \langle x^* - y, x - x^* \rangle + O(t^2).$$
Taking $t \to 0^+$, we have that $\langle x^* - y, x - x^* \rangle \geq 0$ for all $x \in K$. Hence, $\theta = y - x^*$ satisfies $\langle \theta, x \rangle \leq \langle \theta, x^* \rangle = \langle \theta, y \rangle - \|y - x^*\|^2 \leq \langle \theta, y \rangle$ for all $x \in K$; when $K$ is closed, $\|y - x^*\| > 0$ and the inequality is strict.
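As a quick numeric illustration of the theorem, here is a minimal Python sketch (my own example, taking $K$ to be the Euclidean unit ball, chosen because the closest point has the closed form $x^* = y/\|y\|$):

```python
import numpy as np

# Minimal check of Theorem 1.3.2 for K = Euclidean unit ball.
rng = np.random.default_rng(0)
y = rng.normal(size=5)
y = 2.0 * y / np.linalg.norm(y)      # rescale so ||y|| = 2, hence y is outside K

x_star = y / np.linalg.norm(y)       # closest point of K to y
theta = y - x_star                   # separating direction from the proof

# For the unit ball, max over x in K of <theta, x> equals ||theta||.
max_over_K = np.linalg.norm(theta)
print(theta @ y, max_over_K)         # the first value is strictly larger
assert theta @ y > max_over_K
```

Here $\max_{x \in K} \langle \theta, x \rangle = \|\theta\|$ for the unit ball, which is what the assertion compares against.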

Another application of this theorem is that it shows polyhedra are essentially as general as convex sets. Throughout this course, many theorems about polyhedra can be generalized to convex sets for this reason.

Corollary 1.3.3. Any closed convex set $K$ can be written as an intersection of half-spaces as follows:
$$K = \bigcap_{\theta \in \mathbb{R}^n} \Big\{ x : \langle \theta, x \rangle \leq \max_{y \in K} \langle \theta, y \rangle \Big\}.$$
Namely, any closed convex set is a polyhedron with possibly infinitely many constraints. (Formally, a polyhedron has only finitely many constraints.)

There are many important convex sets; here I only list some that appear in this course.

Example. Convex sets: polytopes $\{x : Ax \leq b\}$, ellipsoids $\{x : x^\top A x \leq 1\}$, the positive semidefinite cone $\{A \in \mathbb{R}^{n \times n} : A \succeq 0\}$, and norm balls $\{x : \|x\|_p \leq 1\}$ for any $p \geq 1$.

1.4 Convex functions

Definition 1.4.1. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is convex if for all $x, y \in \mathbb{R}^n$ and $0 \leq t \leq 1$,
$$f(tx + (1-t)y) \leq t f(x) + (1-t) f(y).$$

Certainly, the most important class of convex functions comes from convex sets :).

Definition 1.4.2. Given a convex set $K$, we define the indicator function
$$l_K(x) = \begin{cases} 0 & \text{if } x \in K, \\ +\infty & \text{else.} \end{cases}$$

If we let $\operatorname{dom} f \stackrel{\text{def}}{=} \{x \in \mathbb{R}^n : f(x) < +\infty\}$, then the definition shows that $\operatorname{dom} f$ is a convex set. By looking at the set of points above the graph, we obtain a convex set called the epigraph.

Definition 1.4.3. The epigraph of $f$ is $\operatorname{epi} f \stackrel{\text{def}}{=} \{(x, t) \in \mathbb{R}^n \times \mathbb{R} : t \geq f(x)\}$.

Exercise 1.4.4. A function $f$ is convex if and only if $\operatorname{epi} f$ is a convex set.

This characterization shows that $\min_x f(x)$ is the same as $\min_{(x,t) \in \operatorname{epi} f} t$. Therefore, convex optimization is the same as finding an extreme point of a convex set. Another important feature of convex functions is the following:

Exercise 1.4.5. Every sublevel set $\{x \in \mathbb{R}^n : f(x) \leq t\}$ of a convex function is convex.

In particular, this shows that the set of minimizers is connected, and hence a local minimum is the same as a global minimum. We note that the converse is not true: we call a function quasi-convex if every sublevel set is convex.
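Definition 1.4.1 can also be tested empirically by sampling random chords and checking the defining inequality. A minimal Python sketch of my own (random search can only refute convexity, never prove it):

```python
import numpy as np

def probably_convex(f, dim, trials=10_000, seed=0):
    """Search for a chord violating Definition 1.4.1.

    Returns False if some sampled chord has
    f(t*x + (1-t)*y) > t*f(x) + (1-t)*f(y);
    returns True if no violation was found (evidence, not proof).
    """
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.normal(size=(2, dim))
        t = rng.uniform()
        if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + 1e-9:
            return False
    return True

# log-sum-exp is convex; a sine wave is not.
print(probably_convex(lambda x: np.log(np.sum(np.exp(x))), dim=3))  # True
print(probably_convex(lambda x: np.sin(x[0]), dim=3))               # False
```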

Similar to convex sets, convex functions admit a separation theorem analogous to Theorem 1.3.2. This shows that binary search on a convex function is also possible.

Theorem 1.4.6. Let $f$ be a continuously differentiable function. Then, $f$ is convex if and only if
$$f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle \quad \text{for all } x, y.$$

Proof. Fix any $x, y \in \mathbb{R}^n$ and let $g(t) = f(x + t(y - x))$. Suppose $f$ is convex; then so is $g$. Hence, we have
$$g(t) \leq (1-t) g(0) + t g(1),$$
which implies that
$$g(1) \geq g(0) + \frac{g(t) - g(0)}{t}.$$
Taking $t \to 0^+$, we have that $g(1) \geq g(0) + g'(0)$. In other words, $f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle$. Since we will not use the converse, we leave it as an exercise.

Checking whether a function is convex can be difficult in general. However, if the function is twice differentiable, it can be done by simple calculus as follows.

Theorem 1.4.7. Let $f$ be a twice continuously differentiable function on $\mathbb{R}^n$. Then, $f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x \in \mathbb{R}^n$.

Remark. The Hessian $\nabla^2 f$ of $f$ is the $n \times n$ matrix defined by $(\nabla^2 f)_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$. We write $A \succeq 0$ if $\theta^\top A \theta \geq 0$ for every $\theta \in \mathbb{R}^n$.

Proof. Suppose that $\nabla^2 f(x) \succeq 0$ for all $x$. Fix any $x, y \in \mathbb{R}^n$ and let $g(t) = f(x + t(y - x))$. Then, we have
$$g(t) = g(0) + \int_0^t g'(s)\,ds = g(0) + t g'(0) + \int_0^t \int_0^s g''(u)\,du\,ds = g(0) + t g'(0) + \int_0^t (t - s)\, g''(s)\,ds.$$
Similarly, we have
$$g(1) = g(0) + g'(0) + \int_0^1 (1 - s)\, g''(s)\,ds.$$
Therefore, we have that
$$g(t) - (1-t) g(0) - t g(1) = \int_0^t (t - s)\, g''(s)\,ds - t \int_0^1 (1 - s)\, g''(s)\,ds \leq 0,$$
where we used that $g''(s) = (y - x)^\top \nabla^2 f(x + s(y - x))\,(y - x) \geq 0$ and that $t - s \leq t(1 - s)$ for $0 \leq s \leq t \leq 1$. That is, $f((1-t)x + ty) \leq (1-t) f(x) + t f(y)$, so $f$ is convex. Since we will not use the converse, we leave it as an exercise.
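Theorem 1.4.7 also gives a practical numerical test: form the Hessian at a few random points and check that its smallest eigenvalue is nonnegative. A minimal Python sketch of my own, using a finite-difference Hessian and log-sum-exp as the test function (we will verify its convexity analytically in Exercise 1.4.8 below):

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4 * h * h)
    return H

f = lambda x: np.log(np.sum(np.exp(x)))  # log-sum-exp
rng = np.random.default_rng(1)
for _ in range(3):
    x = rng.normal(size=4)
    # Smallest eigenvalue should be >= 0, up to finite-difference error.
    print(np.linalg.eigvalsh(hessian_fd(f, x)).min())
```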

Here are some convex functions; for a longer list, take a look at the "Convex function" article on Wikipedia.

Example. Convex functions: $|x|$, $x_+ = \max(x, 0)$, $e^x$, $|x|^a$ for $a \geq 1$, $-\log x$, $x \log x$, $\|x\|_p$ for $p \geq 1$, $(x, y) \mapsto x^2/y$ for $y > 0$, $A \mapsto -\log\det A$ on positive definite matrices, $(x, Y) \mapsto x^\top Y^{-1} x$ for $Y \succ 0$, $\log \sum_i e^{x_i}$, and $-(\prod_i x_i)^{1/n}$ on the positive orthant.

Here is an example of how to check that a function is convex. Sometimes, checking that the Hessian is positive semidefinite requires some inequalities.

Exercise 1.4.8. Prove that the function $f(x) = \log \sum_i e^{x_i}$ is convex.

Proof. Note that $\frac{\partial f}{\partial x_i} = \frac{e^{x_i}}{\sum_i e^{x_i}}$. Hence, we have
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{e^{x_i} 1_{i=j}}{\sum_i e^{x_i}} - \frac{e^{x_i + x_j}}{(\sum_i e^{x_i})^2}.$$
For any $\theta \in \mathbb{R}^n$, we have that
$$\theta^\top \nabla^2 f(x)\, \theta = \frac{\sum_i e^{x_i} \theta_i^2}{\sum_i e^{x_i}} - \frac{\sum_{i,j} e^{x_i + x_j} \theta_i \theta_j}{(\sum_i e^{x_i})^2}.$$
The right-hand side is non-negative because, by the Cauchy-Schwarz inequality,
$$\sum_{i,j} e^{x_i + x_j} \theta_i \theta_j = \Big(\sum_i e^{x_i} \theta_i\Big)^2 \leq \Big(\sum_i e^{x_i}\Big) \Big(\sum_i e^{x_i} \theta_i^2\Big).$$

1.5 Convex Optimization

In this course, we focus on problems of the form $\min_{x \in \mathbb{R}^n} f(x)$ where $f$ is convex. This includes constrained problems $\min_{g(x) \leq c} f(x)$ for convex $g$, because the function $f(x) + l_{\{g(x) \leq c\}}(x)$ is convex.

Example. Convex problems: linear regression $\min_x \|Ax - b\|_p$ for $p \geq 1$, linear programming $\min_{Ax \leq b} c^\top x$, semidefinite programming $\min_{\langle A_i, X \rangle = b_i,\, X \succeq 0} \langle C, X \rangle$, logistic regression $\min_x \sum_i \log(1 + e^{-y_i (a_i^\top x + b_i)})$, the shortest path problem, the maximum flow problem, matroid intersection, ....

For convex problems, it is easy to check whether a given point is optimal. This is useful not just for computational purposes, but also for constructing objects and proving all kinds of theorems. I will give an illustration of this in the next section.

Theorem 1.5.1 (Optimality condition). Let $f$ be a continuously differentiable convex function. Then, $x$ is a minimizer of $f$ if and only if $\nabla f(x) = 0$.

Proof. If $\nabla f(x) \neq 0$, then
$$f(x - \epsilon \nabla f(x)) = f(x) - \epsilon \|\nabla f(x)\|^2 + o(\epsilon) < f(x)$$
for small enough $\epsilon > 0$. Hence, such a point cannot be a minimizer. On the other hand, if $\nabla f(x) = 0$, Theorem 1.4.6 shows that
$$f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle = f(x) \quad \text{for all } y.$$
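The optimality condition also certifies the output of algorithms: once $\nabla f(x) \approx 0$, Theorem 1.5.1 guarantees global optimality. A minimal gradient descent sketch of my own in Python, on a least-squares objective (the step size $1/L$ and the tolerance are standard but arbitrary choices):

```python
import numpy as np

# Minimize f(x) = 0.5 * ||Ax - b||^2, whose gradient is A^T (Ax - b).
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)
grad = lambda x: A.T @ (A @ x - b)

x = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, where L = ||A||^2 is the smoothness constant
while np.linalg.norm(grad(x)) > 1e-8:    # stop once the optimality condition (nearly) holds
    x -= step * grad(x)

# By Theorem 1.5.1, a (near-)zero gradient certifies (near-)global optimality.
print(np.linalg.norm(grad(x)))                               # ~ 1e-8
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```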

The study of algebraic properties of convex functions and convex sets is a huge subject, of which we can only touch the surface today. For example, many of the results here can be generalized to non-differentiable settings; if you are interested, please refer to [3]. In a later lecture, we will cover some geometric properties of convex functions, which is again another huge subject.

1.5.1 Example: Matrix Balancing

We are given a positive matrix $A$. The goal is to find a diagonal matrix $D$ such that the sum of the $i$-th row of $D A D^{-1}$ equals the sum of the $i$-th column of $D A D^{-1}$. This is one of the basic operations in numerical linear algebra used to normalize a matrix. It turns out that such a rescaling exists and can be found by considering the convex function
$$f(x) = \sum_{i,j=1}^n A_{ij} e^{x_i - x_j}.$$
If we fix the first coordinate $x_1 = 0$, it is easy to see that the function blows up when $\|x\|$ is large; in particular, $f(x) \geq e^{\|x\|_\infty / n} \min_{ij} A_{ij}$. Therefore, a minimizer $x^*$ must exist. By the optimality condition, we have $\frac{\partial f}{\partial x_i}(x^*) = 0$ for all $i \neq 1$, which implies
$$\sum_{j=1}^n A_{ij} e^{x_i^* - x_j^*} - \sum_{j=1}^n A_{ji} e^{x_j^* - x_i^*} = 0.$$
(Although we fixed $x_1 = 0$, it is easy to recover the condition for $i = 1$ by summing the conditions over all $i \neq 1$.) Therefore, each row sum of the matrix with entries $A_{ij} e^{x_i^* - x_j^*}$ equals the corresponding column sum. The rescaling we want is simply $D_{ii} = e^{x_i^*}$.

One extra benefit of this proof is that it also gives an algorithm for finding such a rescaling, which eventually leads to a nearly linear time algorithm [1]. This has been a hot topic in the theory community recently because of its relation to bipartite matching, polynomial identity testing, communication complexity, .... It was even used recently to solve a decades-old open problem in operator theory [2].

References

[1] Michael B. Cohen, Aleksander Madry, Dimitris Tsipras, and Adrian Vladu. Matrix scaling and balancing via box constrained Newton's method and interior point methods. arXiv preprint arXiv:1704.02310, 2017.

[2] Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, and Akshay Ramachandran. The Paulsen problem, continuous operator scaling, and smoothed analysis. arXiv preprint arXiv:1710.02587, 2017.

[3] Ralph Tyrell Rockafellar. Convex Analysis. Princeton University Press, 2015.