Tutorial on Convex Optimization for Engineers M.Sc. Jens Steinwandt Communications Research Laboratory Ilmenau University of Technology PO Box 100565 D-98684 Ilmenau, Germany jens.steinwandt@tu-ilmenau.de January 2012
Course mechanics strongly based on the advanced course Convex Optimization I by Prof. Stephen Boyd at Stanford University, CA info, slides, video lectures, textbook, software on web page: www.stanford.edu/class/ee364a/ mandatory assignment (30 % of the final grade) Prerequisites working knowledge of linear algebra
Course objectives to provide you with overview of convex optimization working knowledge on convex optimization in particular, to provide you with skills about how to recognize convex problems model nonconvex problems as convex get a feel for easy and difficult problems efficiently solve the problems using optimization tools
Outline 1. Introduction 2. Convex sets 3. Convex functions 4. Convex optimization problems 5. Lagrangian duality theory 6. Disciplined convex programming and CVX
1. Introduction
Mathematical optimization (mathematical) optimization problem minimize f 0 (x) subject to f i (x) b i, i = 1,...,m x = (x 1,...,x n ): optimization variables f 0 : R n R: objective function f i : R n R, i = 1,...,m: constraint functions optimal solution x has smallest value of f 0 among all vectors that satisfy the constraints Introduction 1 2
Solving optimization problems general optimization problem very difficult to solve methods involve some compromise, e.g., very long computation time, or not always finding the solution exceptions: certain problem classes can be solved efficiently and reliably least-squares problems linear programming problems convex optimization problems Introduction 1 4
Least-squares minimize Ax b 2 2 solving least-squares problems analytical solution: x = (A T A) 1 A T b reliable and efficient algorithms and software computation time proportional to n 2 k (A R k n ); less if structured a mature technology using least-squares least-squares problems are easy to recognize a few standard techniques increase flexibility (e.g., including weights, adding regularization terms) Introduction 1 5
Linear programming solving linear programs no analytical formula for solution minimize c T x subject to a T i x b i, i = 1,...,m reliable and efficient algorithms and software computation time proportional to n 2 m if m n; less with structure a mature technology using linear programming not as easy to recognize as least-squares problems a few standard tricks used to convert problems into linear programs (e.g., problems involving l 1 - or l -norms, piecewise-linear functions) Introduction 1 6
Convex optimization problem minimize f 0 (x) subject to f i (x) b i, i = 1,...,m objective and constraint functions are convex: f i (αx+βy) αf i (x)+βf i (y) if α+β = 1, α 0, β 0 includes least-squares problems and linear programs as special cases Introduction 1 7
Why is convex optimization so essential? convex optimization formulation always achieves global minimum, no local traps certificate for infeasibility can be solved by polynomial time complexity algorithms (e.g., interior point methods) highly efficient software available (e.g., SeDuMi, CVX) the dividing line between easy and difficult problems (compare with solving linear equations) = whenever possible, always go for convex formulation Introduction
Brief history of convex optimization theory (convex analysis): ca1900 1970 algorithms 1947: simplex algorithm for linear programming (Dantzig) 1960s: early interior-point methods (Fiacco & McCormick, Dikin,... ) 1970s: ellipsoid method and other subgradient methods 1980s: polynomial-time interior-point methods for linear programming (Karmarkar 1984) late 1980s now: polynomial-time interior-point methods for nonlinear convex optimization (Nesterov & Nemirovski 1994) applications before 1990: mostly in operations research; few in engineering since 1990: many new applications in engineering (control, signal processing, communications, circuit design,... ); new problem classes (semidefinite and second-order cone programming, robust optimization) Introduction 1 15
2. Convex sets
Subspaces a space S R n is a subspace if x 1, x 2 S, θ 1, θ 2 R = θ 1x 1 + θ 2x 2 S geometrically: plane through x 1, x 2 S and the origin representation: range(a) = {Aw w R q } (A = [a 1,... a q]) = {w 1a 1 + w qa q w i R} = span {a 1, a 2,..., a q} null(a) = {x Bx = 0} (B = [b 1,... b p] T ) = { x b T 1 x = 0,..., b T p x = 0 } Convex sets
Affine set line through x 1, x 2 : all points x = θx 1 +(1 θ)x 2 (θ R) θ = 1.2 x 1 θ = 1 θ = 0.6 x 2 θ = 0 θ = 0.2 affine set: contains the line through any two distinct points in the set example: solution set of linear equations {x Ax = b} (conversely, every affine set can be expressed as solution set of system of linear equations) Convex sets 2 2
Convex set line segment between x 1 and x 2 : all points with 0 θ 1 x = θx 1 +(1 θ)x 2 convex set: contains line segment between any two points in the set x 1,x 2 C, 0 θ 1 = θx 1 +(1 θ)x 2 C examples (one convex, two nonconvex sets) Convex sets 2 3
Convex combination and convex hull convex combination of x 1,..., x k : any point x of the form with θ 1 + +θ k = 1, θ i 0 x = θ 1 x 1 +θ 2 x 2 + +θ k x k convex hull convs: set of all convex combinations of points in S Convex sets 2 4
Convex cone conic (nonnegative) combination of x 1 and x 2 : any point of the form with θ 1 0, θ 2 0 x = θ 1 x 1 +θ 2 x 2 x 1 0 x 2 convex cone: set that contains all conic combinations of points in the set Convex sets 2 5
Hyperplanes and halfspaces hyperplane: set of the form {x a T x = b} (a 0) a x 0 x a T x = b halfspace: set of the form {x a T x b} (a 0) a x 0 a T x b a T x b a is the normal vector hyperplanes are affine and convex; halfspaces are convex Convex sets 2 6
Euclidean balls and ellipsoids (Euclidean) ball with center x c and radius r: B(x c,r) = {x x x c 2 r} = {x c +ru u 2 1} ellipsoid: set of the form {x (x x c ) T P 1 (x x c ) 1} with P S n ++ (i.e., P symmetric positive definite) x c other representation: {x c +Au u 2 1} with A square and nonsingular Convex sets 2 7
Norm balls and norm cones norm: a function that satisfies x 0; x = 0 if and only if x = 0 tx = t x for t R x+y x + y notation: is general (unspecified) norm; symb is particular norm norm ball with center x c and radius r: {x x x c r} 1 norm cone: {(x,t) x t} Euclidean norm cone is called secondorder cone norm balls and cones are convex t 0.5 1 0 0 x 2 1 1 x 1 0 1 Convex sets 2 8
Polyhedra solution set of finitely many linear inequalities and equalities Ax b, Cx = d (A R m n, C R p n, is componentwise inequality) a 1 a2 a 5 P a 3 a 4 polyhedron is intersection of finite number of halfspaces and hyperplanes Convex sets 2 9
Positive semidefinite cone notation: S n is set of symmetric n n matrices S n + = {X S n X 0}: positive semidefinite n n matrices X S n + z T Xz 0 for all z S n + is a convex cone S n ++ = {X S n X 0}: positive definite n n matrices 1 example: [ x y y z ] S 2 + z 0.5 0 1 0 y 1 0 x 0.5 1 Convex sets 2 10
Operations that preserve convexity practical methods for establishing convexity of a set C 1. apply definition x 1,x 2 C, 0 θ 1 = θx 1 +(1 θ)x 2 C 2. show that C is obtained from simple convex sets (hyperplanes, halfspaces, norm balls,... ) by operations that preserve convexity intersection affine functions perspective function linear-fractional functions Convex sets 2 11
Examples of convex sets linear subspace: S = {x Ax = 0} is a convex cone affine subspace: S = {x Ax = b} is a convex set polyhedral set: S = {x Ax b} is a convex set PSD matrix cone: S = {A A is symmetric, A 0} is convex second order cone: S = {(t, x) t x } is convex intersection intersection of linear subspaces affine subspaces convex cones convex sets is also a linear subspace affine subspace convex cone convex set example: a polyhedron is intersection of a finite number of halfspaces Convex sets
3. Convex functions
Definition f : R n R is convex if domf is a convex set and f(θx+(1 θ)y) θf(x)+(1 θ)f(y) for all x,y domf, 0 θ 1 (x,f(x)) (y,f(y)) f is concave if f is convex f is strictly convex if domf is convex and f(θx+(1 θ)y) < θf(x)+(1 θ)f(y) for x,y domf, x y, 0 < θ < 1 Convex functions 3 2
Examples on R convex: affine: ax+b on R, for any a,b R exponential: e ax, for any a R powers: x α on R ++, for α 1 or α 0 powers of absolute value: x p on R, for p 1 negative entropy: xlogx on R ++ concave: affine: ax+b on R, for any a,b R powers: x α on R ++, for 0 α 1 logarithm: logx on R ++ Convex functions 3 3
Examples on R n and R m n affine functions are convex and concave; all norms are convex examples on R n affine function f(x) = a T x+b norms: x p = ( n i=1 x i p ) 1/p for p 1; x = max k x k examples on R m n (m n matrices) affine function f(x) = tr(a T X)+b = m i=1 n A ij X ij +b j=1 spectral (maximum singular value) norm f(x) = X 2 = σ max (X) = (λ max (X T X)) 1/2 Convex functions 3 4
First-order condition f is differentiable if domf is open and the gradient f(x) = exists at each x domf ( f(x), f(x),..., f(x) ) x 1 x 2 x n 1st-order condition: differentiable f with convex domain is convex iff f(y) f(x)+ f(x) T (y x) for all x,y domf f(y) f(x) + f(x) T (y x) (x,f(x)) first-order approximation of f is global underestimator Convex functions 3 7
Second-order conditions f is twice differentiable if domf is open and the Hessian 2 f(x) S n, exists at each x domf 2 f(x) ij = 2 f(x) x i x j, i,j = 1,...,n, 2nd-order conditions: for twice differentiable f with convex domain f is convex if and only if 2 f(x) 0 for all x domf if 2 f(x) 0 for all x domf, then f is strictly convex Convex functions 3 8
Examples quadratic function: f(x) = (1/2)x T Px+q T x+r (with P S n ) f(x) = Px+q, 2 f(x) = P convex if P 0 least-squares objective: f(x) = Ax b 2 2 f(x) = 2A T (Ax b), 2 f(x) = 2A T A convex (for any A) quadratic-over-linear: f(x,y) = x 2 /y 2 2 f(x,y) = 2 y 3 [ y x ][ y x ] T 0 f(x,y) 1 0 2 2 convex for y > 0 1 y 0 2 x 0 Convex functions 3 9
Epigraph and sublevel set α-sublevel set of f : R n R: C α = {x domf f(x) α} sublevel sets of convex functions are convex (converse is false) epigraph of f : R n R: epif = {(x,t) R n+1 x domf, f(x) t} epif f f is convex if and only if epif is a convex set Convex functions 3 11
Properties of convex functions convexity over all lines: f (x) is convex f (x 0 + th) is convex in t for all x 0, h positive multiple: f (x) is convex = αf (x) is convex for all α 0 sum of convex functions: f 1(x), f 2(x) convex = f 1(x) + f 2(x) is convex pointwise maximum: f 1(x), f 2(x) convex = max{f 1(x), f 2(x)} is convex affine transformation of domain: f (x) is convex = f (Ax + b) is convex Convex functions
Operations that preserve convexity practical methods for establishing convexity of a function 1. verify definition (often simplified by restricting to a line) 2. for twice differentiable functions, show 2 f(x) 0 3. show that f is obtained from simple convex functions by operations that preserve convexity nonnegative weighted sum composition with affine function pointwise maximum and supremum composition minimization perspective Convex functions 3 13
Quasiconvex functions f : R n R is quasiconvex if domf is convex and the sublevel sets are convex for all α S α = {x domf f(x) α} β α a b c f is quasiconcave if f is quasiconvex f is quasilinear if it is quasiconvex and quasiconcave Convex functions 3 23