Lecture 4: Convexity

Similar documents
Lecture 2: August 29, 2018

Lecture 2: August 29, 2018

COM Optimization for Communications Summary: Convex Sets and Convex Functions

Simplex Algorithm in 1 Slide

Lecture 2: August 31

CS675: Convex and Combinatorial Optimization Spring 2018 Convex Sets. Instructor: Shaddin Dughmi

CS675: Convex and Combinatorial Optimization Fall 2014 Convex Functions. Instructor: Shaddin Dughmi

Convexity I: Sets and Functions

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 2. Convex Optimization

Convex sets and convex functions

CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 4: Convex Sets. Instructor: Shaddin Dughmi

Convex sets and convex functions

Lecture 5: Properties of convex sets

Tutorial on Convex Optimization for Engineers

Convex Optimization. 2. Convex Sets. Prof. Ying Cui. Department of Electrical Engineering Shanghai Jiao Tong University. SJTU Ying Cui 1 / 33

2. Convex sets. x 1. x 2. affine set: contains the line through any two distinct points in the set

Convex Optimization. Convex Sets. ENSAE: Optimisation 1/24

2. Convex sets. affine and convex sets. some important examples. operations that preserve convexity. generalized inequalities

Affine function. suppose f : R n R m is affine (f(x) =Ax + b with A R m n, b R m ) the image of a convex set under f is convex

Convexity Theory and Gradient Methods

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 6

Introduction to Modern Control Systems

Convex Optimization - Chapter 1-2. Xiangru Lian August 28, 2015

Convex Sets (cont.) Convex Functions

Convex Sets. CSCI5254: Convex Optimization & Its Applications. subspaces, affine sets, and convex sets. operations that preserve convexity

2. Convex sets. affine and convex sets. some important examples. operations that preserve convexity. generalized inequalities

Convexity: an introduction

Convex Optimization Lecture 2

Lecture 2 Convex Sets

Lecture: Convex Sets

Convex Sets. Pontus Giselsson

Lecture 2 - Introduction to Polytopes

Mathematical Programming and Research Methods (Part II)

EC 521 MATHEMATICAL METHODS FOR ECONOMICS. Lecture 2: Convex Sets

60 2 Convex sets. {x a T x b} {x ã T x b}

Lecture 5: Duality Theory

Math 5593 Linear Programming Lecture Notes

Lecture 1: Introduction

Lecture 2 September 3

Numerical Optimization

Lecture 15: Log Barrier Method

CMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro

Conic Duality. yyye

Lecture 3: Convex sets

Introduction to Convex Optimization. Prof. Daniel P. Palomar

Convex Optimization M2

Convex Geometry arising in Optimization

Convex Optimization. Erick Delage, and Ashutosh Saxena. October 20, (a) (b) (c)

Lecture 19: Convex Non-Smooth Optimization. April 2, 2007

This lecture: Convex optimization Convex sets Convex functions Convex optimization problems Why convex optimization? Why so early in the course?

Aspects of Convex, Nonconvex, and Geometric Optimization (Lecture 1) Suvrit Sra Massachusetts Institute of Technology

Combinatorial Geometry & Topology arising in Game Theory and Optimization

Lecture 2. Topology of Sets in R n. August 27, 2008

Lecture 18: March 23

CS675: Convex and Combinatorial Optimization Spring 2018 Consequences of the Ellipsoid Algorithm. Instructor: Shaddin Dughmi

FACES OF CONVEX SETS

Chapter 4 Concepts from Geometry

M3P1/M4P1 (2005) Dr M Ruzhansky Metric and Topological Spaces Summary of the course: definitions, examples, statements.

However, this is not always true! For example, this fails if both A and B are closed and unbounded (find an example).

LECTURE 10 LECTURE OUTLINE

Locally convex topological vector spaces

LECTURE 7 LECTURE OUTLINE. Review of hyperplane separation Nonvertical hyperplanes Convex conjugate functions Conjugacy theorem Examples

Convexity and Optimization

Linear programming and duality theory

DM545 Linear and Integer Programming. Lecture 2. The Simplex Method. Marco Chiarandini

IE 521 Convex Optimization

Convexity and Optimization

Topology 550A Homework 3, Week 3 (Corrections: February 22, 2012)

Differential Geometry: Circle Patterns (Part 1) [Discrete Conformal Mappinngs via Circle Patterns. Kharevych, Springborn and Schröder]

Lecture 19 Subgradient Methods. November 5, 2008

MATH 890 HOMEWORK 2 DAVID MEREDITH

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions.

16.410/413 Principles of Autonomy and Decision Making

Math 414 Lecture 2 Everyone have a laptop?

Week 5. Convex Optimization

Homework 1 (a and b) Convex Sets and Convex Functions

of Convex Analysis Fundamentals Jean-Baptiste Hiriart-Urruty Claude Lemarechal Springer With 66 Figures

CME307/MS&E311 Theory Summary

Applied Lagrange Duality for Constrained Optimization

Some Advanced Topics in Linear Programming

CME307/MS&E311 Optimization Theory Summary

Convexity. 1 X i is convex. = b is a hyperplane in R n, and is denoted H(p, b) i.e.,

MTAEA Convexity and Quasiconvexity

Lecture 6: Faces, Facets

Topological properties of convex sets

Convex Optimization MLSS 2015

Convex Optimization. Chapter 1 - chapter 2.2

Lecture notes for Topology MMA100

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization. Author: Martin Jaggi Presenter: Zhongxing Peng

CS 435, 2018 Lecture 2, Date: 1 March 2018 Instructor: Nisheeth Vishnoi. Convex Programming and Efficiency

Characterizing Improving Directions Unconstrained Optimization

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

LECTURE 13: SOLUTION METHODS FOR CONSTRAINED OPTIMIZATION. 1. Primal approach 2. Penalty and barrier methods 3. Dual approach 4. Primal-dual approach

Point-Set Topology 1. TOPOLOGICAL SPACES AND CONTINUOUS FUNCTIONS

Applied Integer Programming

Division of the Humanities and Social Sciences. Convex Analysis and Economic Theory Winter Separation theorems

Chapter 4 Convex Optimization Problems

Lecture 3. Corner Polyhedron, Intersection Cuts, Maximal Lattice-Free Convex Sets. Tepper School of Business Carnegie Mellon University, Pittsburgh

Advanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs

Transcription:

10-725: Convex Optimization Fall 2013 Lecture 4: Convexity Lecturer: Barnabás Póczos Scribes: Jessica Chemali, David Fouhey, Yuxiong Wang Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor. The goals of this lecture are to: 1. give definitions that are important to convexity as well as examples of convex sets and basic properties; 2. define convex functions and their properties, as well as some examples. 4.1 Basic Definitions We begin by formalizing a few mathematical objects that we will use throughout the lecture: Definition 4.1 A line passing through x 1 and x 2 R n forms the set {x R n x = θx 1 + (1 θ)x 2, θ R}. Definition 4.2 The line segment passing through x 1 and x 2 is defined similarly, but with θ confined to [0, 1], or {x R n x = θx 1 + (1 θ)x 2, θ [0, 1]}. Figure 4.1: Line and line segment. Definition 4.3 A Euclidean ball with radius r centered at x 0, B(x o, r), is: {x R n x x 0 2 r} Definition 4.4 An L p ball is the equivalent, but for the L p distance, or x p = ( n i=1 x i p ) 1/p {x R n x x 0 p r} Definition 4.5 A half-space HS(a, b) is a set of points on one side of a hyperplane: {x R n a T x b}. 4-1

4-2 Lecture 4: Convexity Figure 4.2: Halfspace. Definition 4.6 A hyperplane HS(a, b) is defined similarly, but with equality, or {x R n a T x = b}. Figure 4.3: Hyperplane. Definition 4.7 A cone is any set such that if x C and θ 0 then θx C. We can immediately use these to give a definition of an affine set. Specifically, a set C is affine if for any points in it, the line through them is also contained in C. For example, the affine hull of a circle in R 2 is R 2 and the affine hull of a line in R 2 is R. Written more formally, for x 1, x 2 C and θ C, θx 1 + (1 θ)x 2 C. This definition also leads to the definition of the affine hull of a set, which is the smallest affine set containing C. The affine hull can be written out explicitly as: { } k Aff[C] = θ 1 x 1 + + θ k x k x i C, θ i = 1 Similarly, a conic hull is i=1 Cone[C] = {x x = θ 1 x 1 +... + θ k x k, θ i 0, x i C}

Lecture 4: Convexity 4-3 Figure 4.4: Cone. Figure 4.5: Conic Hull. Before we proceed with the remaining definitions, we also need to give a few topological definitions. Given a set C, Definition 4.8 We say that x is on the boundary of C, or C, if for some small enough ɛ > 0, the epsilonradius ball centered at x covers both inside and outside the set, or B(x, ɛ) C and B(x, ɛ) C c. Definition 4.9 We say that x is in the interior of C if ɛ > 0 s.t. B(x, ɛ) C. Definition 4.10 We say that the relative interior of C, rel int C is the same as the interior, but with the ball restricted to the affine hull of C. Specifically, x is in rel int C if B(x, ɛ) Aff[C] C. This definition is more natural in some ways. Consider a line segment in a 2D-space. No points in the space are in the interior the line segment since any epsilon ball contains points off the line; however, if we restrict our consideration to the affine hull of the line segment (the equivalent line), some points are. We can then use set operations to take these and generate more definitions and properties of C: 1. The closure of C, cl C, is C C. 2. The relative boundary of C, rel C, is cl C\rel int C. 3. C is closed if C C 4. C is open if C C =

4-4 Lecture 4: Convexity 5. C is compact if it is closed and bounded in R n. 4.2 Convex sets 4.2.1 Definitions Definition 4.11 A set C R n is convex if for any two points in C, the line segment joining them is contained in C. Formally, it is convex if and only if for all x 1, x 2 C and θ [0, 1], θx 1 + (1 θ)x 2 C. Figure 4.6: Convex sets. Definition 4.12 A convex set is strictly convex if for any two points in the set in general position, the line segment less the endpoints is contained in int C. Formally, if for all x 1, x 2 C with x 1 x 2 and θ (0, 1) θx 1 + (1 θ)x 2 int C. Definition 4.13 The convex hull of a set C is the set of all convex combinations of the points in C conv[c] = {θ 1 x 1 + + θ k x k x i C, θ i 0 i = 1,..., k, k θ i = 1, k Z + } Here, we are constraining θ i in both ways that we did before: θ i 0 as in the conic hull, and k i=1 θ i = 1 as in the affine hull. Just as in the affine hull case, the conv[c] is the smallest convex set that also contains C. Note that: conv[c] is convex and C conv[c]. i=1 Figure 4.7: Convex hull.

Lecture 4: Convexity 4-5 4.2.2 Examples Example 4.14 Many of the objects defined earlier are convex sets: lines, line segments, hyperplanes and half spaces. Additionally, the empty set and singleton sets {x} are convex, as are complete spaces R d. L p Balls are convex for p 1, but are not for 0 < p < 1. Example 4.15 A more complex example of a convex set is a polyhedron. This, at least in this course, is a solution set to a finite set number of linear equalities and inequalities (i.e., {x Ax b, Cx = d}. If the solution set is bounded, we call it a polytope. As we will see later, this is easy to show as convex, as it is a an intersection of halfspaces (from Ax b) and hyperplanes (from Cx = d), each of which are convex. Example 4.16 One can also construct convex sets via the convex combination of points. The definition of the convex hull gives it for a finite set of points. One can also define it for an infinite sum: given x 1, x 2,... C and θ 1, θ 2,... 0 such that i=1 θ i = 1, then i=1 θ ix i C so long as the series converges. Similarly, one can use a density function p, and p(x)xdx C if the integral exists. C Example 4.17 We can extend this to more complex domains, such as matrices. For instance, the set of n n positive semi-definite (PSD) matrices form a convex cone which we denote S n +. Recall that a matrix A is PSD if and only if A = A T and x T Ax 0 for any x R n ; equivalently a symmetric matrix is PSD if all its eigenvalues are positive. This may also be written as A 0, which suggests a partial ordering defined by the PSD property, namely, M N if and only if M N is PSD. We can prove convexity by linearity: given PSD matrices A and B, x T [θa + (1 θ)b]x] = θx T Ax + (1 θ)x T Bx 0 as it is a convex sum of non-negative terms (since A and B are PSD). 4.2.3 Representations We can represent a convex set in two equivalent ways. Definition 4.18 The primal representation represents a convex set C using its convex hull: a convex combination of its points. The number of points might be finite (like in the case of a polyhedron) or infinite (like the case of a circle). Definition 4.19 The dual representation represents a closed set C as the intersection of all the closed halfspaces containing it. 4.2.4 Convexity Preserving Operations One easy way to show that a set is convex is to construct it from convex sets via convexity preserving operations. Here are a few. Given convex sets C, D R n, b R n, and A R m n, α R, the following sets are convex: 1. Translation: C + b = {x + b : x C} 2. Scaling: αc = {αx : x C} 3. Intersection: C D; this can also be extended to an infinite number of sets.

4-6 Lecture 4: Convexity Figure 4.8: Dual representation as Intersection of hyperplanes. 4. Affine: AC + b = {Ax + b x C} R m 5. Set sum: C + D = {x + y x C, y D}. 6. Direct sum: C D = {(x, y) R n+m, x C, y D} A more involved operation is perspective projection: suppose C R n R ++ is convex, then P (C) is also convex, where P is perspective projection, or: P (x) = P (x 1,..., x n, t) = (x 1 /t, x 2 /t,..., x n /t) R n This can be extended further to linear-fractional functions. Theorem: image of linear-fractional functions. Let A R m n, c R n, and b R m, d R. Then, define f as: f(x) = Ax + b c T x + d, dom f = {x ct x + d > 0} Then f(x) is also convex. Example 4.20 Let C = {p ij u = u i [0, 1], v = v j [0, 1], i [1, m], j [1, n]}. The set C represents the joint probability distribution of discrete variables u and v. Let D = {f ij i [1, m], j [1, n]}, where f ij = P r(u = u i v = v j ) = p ij i p ij By the linear-fractionals theorem above: if C is convex, then D is also convex. 4.2.5 Separating hyperplanes Theorem: Let C and D be two convex sets such that C D =. Then a T R n, b R, s.t x C a T x b, x D a T x b.

Lecture 4: Convexity 4-7 Strict inequalities do not always hold even when C and D do not intersect. For example, when points of both C and D are points of the separating hyperplane. Figure 4.9: Separating hyperplane. Definition 4.21 Strong Separation: Let C and D be two disjoint convex sets. We say that they are stongly separated if they can be shifted by a small amount and stay separated by a hyperplane H(a, b). Formally: ɛ > 0 s.t. a T (C + B(0, ɛ)) > b and a T (D + B(0, ɛ)) < b. Definition 4.22 Proper Separation: C and D are properly separated, if they are separated by hyperplane H(a, b) and it is not the case that C H(a, b) and D H(a, b). Definition 4.23 Strict Separation: C and D are strictly separated by hyperplane H(a,b) if a T x < b and a T y > b x C and y D. Theorem 4.24 Strong Separation Theorem: If C and D are non-empty convex sets in R n, cl C cl D =, and one of them is bounded, then there is a hyperplane that separates them strongly. Example 4.25 Why do we need boundedness? We can construct one example to show that without the boundedness property of at least one of the sets, there s no strong separation. Let C be the epigraph of f(x) = 1 x and D be the set of all points below the line y = 0 in the 2D euclidean space. Then because f(x) tends to 0, the line y = 0 does not strongly separate the two sets. Theorem 4.26 Another version of the theorem is that C and D are strongly separated iff the distance between any two points x C and y D is greater than 0. In the example above the distance between C and D decreases to 0 as x tends to infinity. Theorem 4.27 Supporting plane theorem: For any point x 0 at the boundary of a convex set, a hyperplane that lies entirely on one side of the set. There is a very useful partial converse for this theorem: If a set C is closed and has a non-empty interior, and a supporting hyperplane for all its boundary points, then C is convex.

4-8 Lecture 4: Convexity Figure 4.10: Supporting hyperplane. 4.2.6 How to prove that a set is convex Use the definition. Represent it as a convex hull. Represent it as an intersection of halfspaces. Use the supporting hyperplane converse theorem. Construct C from convex sets using convexity-preserving operations. 4.3 Convex Function 4.3.1 Definitions Definition 4.28 A function f : R n R is convex if domf is a convex set and if for all x, y domf, and θ with 0 θ 1, we have f(θx + (1 θ)y) θf(x) + (1 θ)f(y). Definition 4.29 A function f is strictly convex if whenever x y, and 0 < θ < 1, strict inequality holds, that is, we have f(θx + (1 θ)y) < θf(x) + (1 θ)f(y). Example 4.30 x 4 is strictly convex. Definition 4.31 f is concave if f is convex. The geometric interpretation of convex functions is shown in Fig.??. The chord from x to y (i.e., line segment) between any two points on the graph lies above the graph of f.

Lecture 4: Convexity 4-9 Figure 4.11: Graph of a convex function 4.3.2 Strongly Convex Function Definition 4.32 A differentiable function f is called m strongly convex if m > 0 and An equivalent condition is ( f(x) f(y)) T (x y) m x y 2 2, x, y domf f(y) f(x) + f(x) T (y x) + m 2 y x 2 2, x, y domf It is not necessary for a function to be differentiable. We could have the definition without gradient. Definition 4.33 A function f is called m strongly convex if m > 0 and for 0 t 1 f(tx + (1 t)y) tf(x) + (1 t)f(y) 1 2 mt(1 t) x y 2 2, x, y domf If the function is twice continuously differentiable, we could have the definition with Hessian matrix. Definition 4.34 f is called m stronly convex if m > 0 and 2 f(x) mi, x, y domf A strongly convex function is also strictly convex, but not vice-versa. Example 4.35 Convex functions x p, p 1, x R. Max function: f(x) = max(x 1,..., x n ), x R n. Norms: Every norm on R n is convex.

4-10 Lecture 4: Convexity Figure 4.12: Epigraph of a function f, shown shaded. The lower boundary, shown darker, is the graph of f. Example 4.36 Concave functions Geometric mean: f(x) = ( n i=1 x i) 1/n isconcave, x R n ++. Log determinant: logdet(x)isconcave, X S n ++. 4.3.3 Extended-value Extension It is often convenient to extend a convex function to all of R n by defining its value to be outside its domain. Definition 4.37 f : R n R { } is extended-value extension of f: { f(x) x domf f(x) = x / domf The extension f is defined on all R n, and takes values in R { }. This does not change its convexity Theorem 4.38 f is convex f is convex f(θx + (1 θ)y) θ f(x) + (1 θ) f(y), 0 θ 1 4.3.4 Epigraph Definition 4.39 The epigraph of a function f : R n R is defined as epif = {(x, t) x domf, f(x) t} The definition is illustrated in Fig.??. The link between convex sets and convex functions is via the epigraph: Theorem 4.40 f is convex epif is a convex set

Lecture 4: Convexity 4-11 Figure 4.13: 0th order property. Figure 4.14: 1st order property: If f is convex and differentiable, then f(y) f(x) + f(x) T (y x). 4.3.5 Convex Function Properties 4.3.5.1 0th Order Characterization A function is convex if and only if when restricted to any line that intersects its domain is convex. That is: f : R n R is convex g(x) = f(x + tv) is convex, domg = {t x + tv domf}, x domf, v R n, This is useful, because we only need to check the convexity of 1D functions, as Fig.?? shows. 4.3.5.2 1st Order Characterization Let f be a differentiable function, domf is open and convex, then we have f is convex f(y) f(x) + f(x) T (y x) It is illustrated as Fig.?? shows. The inequality states that for a convex function, the first-order Taylor approximation is a global underestimator of the function. Conversely, if the first-order Taylor approximation of a function is always a global underestimator of the function, then the function is convex.

4-12 Lecture 4: Convexity The inequality also shows that if f(x) = 0, then for y domf, f(y) f(x), and x is a global minimizer of the function. 4.3.5.3 2nd Order Characterization Let f be twice differentiable, domf is open, then we have f is convex 2 f(x) 0, x domf If 2 f(x) > 0, x domf, f is strictly convex. The converse is not true. For example, the function f : R R given by f(x) = x 4 is strictly convex but has zero second derivative at x = 0 4.3.6 Jensen s Inequality The basic inequality, f(θx + (1 θ)y) θf(x) + (1 θ)f(y), is sometimes called Jensen s inequality. Now we extended it to convex combinations of more than two points. If x is a random variable such that x domf with probability one, and f is convex, then we have provided the expectations exist. f(ex) Ef(x), 4.3.7 Convexity-preserving function operations 1. Nonnegative Weighted Sum: If f 1, f 2 are convex, ω i 0 h(x) = ω 1 f 1 (x) + ω 2 f 2 (x) is convex. 2. Pointwise Max/Sup: If f, g are convex m(x) = max{f(x), g(x)} is convex. 3. Extension of Pointwise Max/Sup: If f(x, y) is convex in x for each y g(x) = sup y C f(x, y) is convex in x, provided g(x) > for some x. 4. Affine Map: If f : R n R is convex g(x) = f(ax + b) is convex, where A R n m, b R n. 5. Composition: If f, g are convex, and g is non-decreasing h(x) = g(f(x)) is convex. 6. Perspective Map: If f(x) is convex g(x, t) = tf(x/t) is convex.

Lecture 4: Convexity 4-13 4.3.8 How to prove a function is convex Use definition directly. Prove that epigraph is convex vis set methods. 0th, 1st, 2nd order convexity properties. Construct f from simpler convex functions using convexity-preserving operations. 4.4 Summary 4.4.1 Convex Sets Representation: convex hull, intersect hyperplanes. Supporting, separating hyperplanes. Operations that preserve convexity. 4.4.2 Convex Functions Epigraph. 0 orders, 1st order, 2nd order conditions. Operations that preserve convexity.