Machine Learning for Signal Processing Lecture 4: Optimization

1 Machine Learning for Signal Processing Lecture 4: Optimization 13 Sep 2015 Instructor: Bhiksha Raj (slides largely by Najim Dehak, JHU) /

2 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

3 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

4 A problem we recently saw M = N = S The projection matrix P is the matrix that minimizes the total error between the projected matrix S and the original matrix M /

5 The projection problem S = PM. For individual vectors in the spectrogram, S_i = P M_i. The total projection error is E = Σ_i ||M_i - P M_i||^2. The projection matrix projects onto the space of notes in N: P = NC. The problem of finding P: Minimize E = Σ_i ||M_i - P M_i||^2 such that P = NC. This is a problem of constrained optimization

6 Optimization Optimization is finding the "best" value of a function, e.g. its minimum. The number of turning points of a function depends on the order of the function. Not all turning points are minima: they can be minima, maxima, or inflection points. (Figure: a curve f(x) with a global maximum, an inflection point, a global minimum, and a local minimum; the goal is min_x f(x).)

7 Examples of Optimization : Multivariate functions Find the optimal point in these functions /

8 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

9 Simple Approach: Turning Point (figure: f(x) vs. x) The minimum of the function is always a turning point: the function increases on either side. How to identify these turning points?

10 The derivative of a curve (figure: y vs. x with small increments δy and δx) The derivative of a curve is a multiplicative factor explaining how much y changes in response to a very small change in x: δy = (dy/dx) δx. For scalar functions of scalar variables it is often expressed as dy/dx or as f'(x). We have all learned how to compute derivatives in basic calculus

11 The derivative of a curve: positive derivative (dy/dx > 0), negative derivative (dy/dx < 0), zero derivative (dy/dx = 0). In upward-rising regions of the curve the derivative is positive: a small increase in x causes y to increase. In downward-falling regions the derivative is negative. At turning points the derivative is 0. Assumption: the function is differentiable at the turning point

12 Finding the minimum of a function (figure: f(x) with dy/dx = 0 at the bottom) Find the value x at which f'(x) = 0: solve df(x)/dx = 0. The solution is a turning point. But is it a minimum?

13 Turning Points Both maxima and minima have zero derivative /

14 Derivatives of a curve (figure: f(x) vs. x) Both maxima and minima are turning points

15 Derivatives of a curve (figure: f(x) and f'(x) vs. x) Both maxima and minima are turning points. Both maxima and minima have zero derivative

16 Derivative of the derivative of the curve (figure: f(x), f'(x), f''(x)) Both maxima and minima are turning points. Both maxima and minima have zero derivative. The second derivative f''(x) is -ve at maxima and +ve at minima! At maxima the derivative goes from +ve to -ve, so the derivative decreases as x increases. At minima the derivative goes from -ve to +ve and increases as x increases

17 Soln: Finding the minimum or maximum of a function Find the value x at which f'(x) = 0: solve df(x)/dx = 0. The solution x_soln is a turning point. Check the double derivative at x_soln: compute f''(x_soln) = df'(x_soln)/dx. If f''(x_soln) is positive, x_soln is a minimum; otherwise it is a maximum

18 What about functions of multiple variables? The optimum point is still a turning point: shifting in any direction away from a minimum will increase the value, and for smooth functions a minuscule shift will not result in any change at all (to first order). We must find a point where shifting in any direction by a microscopic amount will not change the value of the function

19 A brief note on derivatives of multivariate functions /

20 The Gradient of a scalar function f(X) The gradient ∇f(X) of a scalar function f(X) of a multi-variate input X is a multiplicative factor that gives us the change in f(X) for tiny variations in X: δf(X) = ∇f(X)^T δX

21 Gradients of scalar functions with multi-variate inputs Consider f(X) = f(x_1, x_2, ..., x_n). The gradient is the vector of partial derivatives: ∇f(X) = [∂f(X)/∂x_1, ∂f(X)/∂x_2, ..., ∂f(X)/∂x_n]. Check: δf(X) = ∇f(X)^T δX = (∂f(X)/∂x_1) δx_1 + (∂f(X)/∂x_2) δx_2 + ... + (∂f(X)/∂x_n) δx_n
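As a quick illustration (not part of the original slides), the sketch below compares an analytic gradient against central finite differences for an arbitrary two-variable function; the function g and the step h are my own choices.

```python
import numpy as np

def g(x):
    # Example scalar function of a vector input (arbitrary choice).
    return x[0]**2 + 3*x[0]*x[1] + np.sin(x[1])

def grad_g(x):
    # Analytic gradient: vector of partial derivatives.
    return np.array([2*x[0] + 3*x[1], 3*x[0] + np.cos(x[1])])

def numerical_gradient(f, x, h=1e-6):
    # Approximate each partial derivative with a central difference.
    grad = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2 * h)
    return grad

x = np.array([0.7, -1.2])
print(grad_g(x))                   # analytic gradient
print(numerical_gradient(g, x))    # finite-difference approximation (should agree closely)
```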

22 A well-known vector property u · v = |u||v| cos θ The inner product between two vectors of fixed lengths is maximum when the two vectors are aligned

23 Properties of Gradient δf(X) = ∇f(X)^T δX is the inner product between ∇f(X) and δX. Fixing the length of δX, e.g. |δX| = 1, δf(X) is maximum if δX = c ∇f(X) (and zero if δX is perpendicular to ∇f(X)). The function f(X) increases most rapidly if the input increment δX is perfectly aligned with ∇f(X). The gradient is the direction of fastest increase in f(X)

24 Gradient (figure) Gradient vector ∇f(x)

25 Gradient (figure) Gradient vector ∇f(x). Moving in this direction increases f(X) fastest

26 Gradient (figure) Gradient vector ∇f(x). Moving in the direction of ∇f(x) increases f(x) fastest; moving in the opposite direction decreases f(x) fastest

27 Gradient Gradient here is 0 Gradient here is /

28 Properties of Gradient: 2 The gradient vector ∇f(x) is perpendicular to the level curve

29 Derivatives of vector function of vector input Y = f(X), with X = [x_1, x_2, ...] and Y = [y_1, y_2, ...]. The gradient ∇f(X) of a vector function f(X) of a multi-variate input X is a multiplicative factor that gives us the change in Y for tiny variations in X: δY = ∇f(X) δX

30 Gradient of vector function of vector input ∇f(X) is the m×n matrix of partial derivatives whose (i, j)-th entry is ∂y_i/∂x_j. Properties and interpretations are similar to the case of scalar functions of vector inputs

31 The Hessian The Hessian of a function f(x_1, x_2, ..., x_n) is the matrix of second derivatives: ∇^2 f(x_1, ..., x_n) is the n×n matrix whose (i, j)-th entry is ∂^2 f / (∂x_i ∂x_j), i.e. the diagonal holds ∂^2 f/∂x_1^2, ..., ∂^2 f/∂x_n^2 and the off-diagonal entries hold the mixed partial derivatives

32 Returning to direct optimization /

33 Finding the minimum of a scalar function of a multi-variate input The optimum point is a turning point: the gradient will be 0

34 Unconstrained Minimization of function (Multivariate) 1. Solve for the X where the gradient equals zero: ∇f(X) = 0. 2. Compute the Hessian matrix ∇^2 f(X) at the candidate solution and verify that: the Hessian is positive definite (eigenvalues positive) to identify local minima; the Hessian is negative definite (eigenvalues negative) to identify local maxima

35 Unconstrained Minimization of function (Example) Minimize f(x_1, x_2, x_3) = (x_1)^2 + x_1(1 - x_2) + (x_2)^2 - x_2 x_3 + (x_3)^2 + x_3. Gradient: ∇f = [2x_1 + 1 - x_2, -x_1 + 2x_2 - x_3, -x_2 + 2x_3 + 1]^T

36 Unconstrained Minimization of function (Example) Set the gradient to null: ∇f = 0 ⇒ [2x_1 + 1 - x_2, -x_1 + 2x_2 - x_3, -x_2 + 2x_3 + 1]^T = [0, 0, 0]^T. Solving the system of 3 equations with 3 unknowns gives x = [x_1, x_2, x_3]^T = [-1, -1, -1]^T

37 Unconstrained Minimization of function (Example) Compute the Hessian matrix: ∇^2 f = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]. Evaluate the eigenvalues of the Hessian matrix: λ_1 = 3.414, λ_2 = 0.586, λ_3 = 2. All the eigenvalues are positive ⇒ the Hessian matrix is positive definite. The point x = [-1, -1, -1]^T is a minimum
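As an added illustration, here is a minimal numerical check of this example, assuming the function, gradient, Hessian, and stationary point as reconstructed on the slides above: the gradient vanishes at x = [-1, -1, -1] and all Hessian eigenvalues are positive.

```python
import numpy as np

def grad_f(x):
    x1, x2, x3 = x
    # Gradient of f = x1^2 + x1(1 - x2) + x2^2 - x2*x3 + x3^2 + x3
    return np.array([2*x1 + 1 - x2,
                     -x1 + 2*x2 - x3,
                     -x2 + 2*x3 + 1])

# Hessian of f (constant, because f is quadratic)
H = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])

x_star = np.array([-1., -1., -1.])
print(grad_f(x_star))          # -> [0. 0. 0.]
print(np.linalg.eigvalsh(H))   # -> approx [0.586, 2.0, 3.414], all positive
```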

38 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

39 Closed Form Solutions are not always available Often it is not possible to simply solve ∇f(X) = 0: the function to minimize/maximize may have an intractable form. In these situations, iterative solutions are used: begin with a guess for the optimal X and refine it iteratively until the correct value is obtained

40 Iterative solutions (figure: successive estimates x_0, x_1, x_2, ... on f(x)) Start from an initial guess X_0 for the optimal X. Update the guess towards a (hopefully) better value of f(X). Stop when f(X) no longer decreases. Problems: Which direction to step in? How big must the steps be?

41 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum. Based on the first-order derivatives (gradient) and in some cases the second-order derivatives (Hessian). Newton's method is based on both first and second derivatives. Gradient descent is based only on the first derivative

42 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum. Based on the first-order derivatives (gradient) and in some cases the second-order derivatives (Hessian). Newton's method is based on both first and second derivatives. Gradient descent is based only on the first derivative

43 Newton's Method The Newton method is based on an iterative step that uses the first and second derivatives of f(x): x_{k+1} = x_k - f'(x_k)/f''(x_k), where k is the current iteration. The iterations continue until we achieve the stopping criterion |x_{k+1} - x_k| < ε

44 Newton's method to find the zero of a function Initialize the estimate. Approximate the function by its tangent at the current estimate. Update the estimate to the location where the tangent becomes 0. Iterate

45 Newton's Method to optimize a function (figure: f(x) and its derivative f'(x)) Apply Newton's method to the derivative of the function! The derivative goes to 0 at the optimum. Algorithm: Initialize x_0. k-th iteration: approximate f'(x) by its tangent at x_k; find the location x_intersect where the tangent goes to 0; set x_{k+1} = x_intersect. Iterate

46 Newton's method for univariate functions Apply Newton's algorithm to find the zero of the derivative f'(x): x_{k+1} = x_k - f'(x_k)/f''(x_k), where k is the current iteration. The iterations continue until we achieve the stopping criterion |x_{k+1} - x_k| < ε

47 Newton's method for multivariate functions 1. Select an initial starting point X_0. 2. Evaluate the gradient ∇f(X_k) and Hessian ∇^2 f(X_k) at X_k. 3. Calculate the new X_{k+1} using X_{k+1} = X_k - [∇^2 f(X_k)]^(-1) ∇f(X_k). 4. Repeat steps 2 and 3 until convergence
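A minimal sketch (not from the slides) of this multivariate Newton iteration, reusing the quadratic example from the direct-optimization slides; because that f is quadratic, the iteration reaches the optimum in a single step, as the next slide notes.

```python
import numpy as np

def grad_f(x):
    # Gradient of the earlier example f(x1, x2, x3).
    return np.array([2*x[0] + 1 - x[1],
                     -x[0] + 2*x[1] - x[2],
                     -x[1] + 2*x[2] + 1])

def hess_f(x):
    # Hessian of the example (constant for a quadratic).
    return np.array([[ 2., -1.,  0.],
                     [-1.,  2., -1.],
                     [ 0., -1.,  2.]])

def newton(grad, hess, x0, tol=1e-8, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Newton step: solve Hessian * step = gradient (avoid explicit inversion).
        step = np.linalg.solve(hess(x), grad(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

print(newton(grad_f, hess_f, x0=[5.0, 5.0, 5.0]))   # -> [-1. -1. -1.]
```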

48 Newton's Method Newton's approach is based on the computation of both gradient and Hessian: fast to converge (few iterations) but slow to compute. (Figure: from x_0, Newton's method arrives at the optimum of a quadratic in a single step.) This method is very sensitive to the initial point: if the initial point is very far from the optimal point, the optimization process may not converge

49 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum. Based on the first-order derivatives (gradient) and in some cases the second-order derivatives (Hessian). Newton's method is based on both first and second derivatives. Gradient descent is based only on the first derivative

50 The Approach of Gradient Descent Iterative solution: Start at some point. Find the direction in which to shift this point to decrease error; this can be found from the derivative of the function: a positive derivative means moving left decreases error, a negative derivative means moving right decreases error. Shift the point in this direction

51 The Approach of Gradient Descent Iterative solution: trivial algorithm. Initialize x_0. While f'(x_k) ≠ 0: if sign(f'(x_k)) is positive, x_{k+1} = x_k - step; else x_{k+1} = x_k + step. But what must step be to ensure we actually get to the optimum?

52 The Approach of Gradient Descent Iterative solution: trivial algorithm. Initialize x_0. While f'(x_k) ≠ 0: x_{k+1} = x_k - sign(f'(x_k)) · step. Identical to the previous algorithm

53 The Approach of Gradient Descent Iterative solution: trivial algorithm. Initialize x_0. While f'(x_k) ≠ 0: x_{k+1} = x_k - η_k f'(x_k). η_k is the step size. What must the step size be?

54 Gradient descent/ascent (multivariate) The gradient descent/ascent method finds the minimum or maximum of a function f iteratively. To find a maximum, move in the direction of the gradient: x_{k+1} = x_k + η_k ∇f(x_k). To find a minimum, move exactly opposite the direction of the gradient: x_{k+1} = x_k - η_k ∇f(x_k). What is the step size η_k?

55 1. Fixed step size Use a fixed value for η_k
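A sketch (my own, not from the slides) of fixed-step gradient descent on a two-variable quadratic, my reconstruction of the example on the following slide; the starting point and step sizes are arbitrary choices, meant only to show how the step size changes the behaviour of the iteration.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x1, x2) = x1^2 + x1*x2 + 4*x2^2
    return np.array([2*x[0] + x[1], x[0] + 8*x[1]])

def gradient_descent(grad, x0, eta, tol=1e-8, max_iter=10_000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # convergence: gradient near zero
            break
        x = x - eta * g                  # fixed-step update
    return x

print(gradient_descent(grad_f, x0=[3.0, 3.0], eta=0.1))   # converges smoothly to ~[0, 0]
print(gradient_descent(grad_f, x0=[3.0, 3.0], eta=0.24))  # larger step: zig-zags along the steep direction, converges slowly
```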

56 Influence of step size: example (constant step size) on a quadratic f(x_1, x_2) = x_1^2 + x_1 x_2 + 4 x_2^2, starting from the initial point x_initial = [3, ...]^T (figure shows the descent trajectory)

57 2. Backtracking line search for step size Two parameters: α (typically 0.5) and β (typically 0.8). At each iteration, estimate the step size as follows: Set η_k = 1. Update η_k = βη_k until f(x_k - η_k ∇f(x_k)) ≤ f(x_k) - αη_k ||∇f(x_k)||^2. Update x_{k+1} = x_k - η_k ∇f(x_k). Intuitively: at each iteration, take a unit step size and keep shrinking it until we arrive at a place where the function value f(x_k - η_k ∇f(x_k)) actually decreases sufficiently w.r.t. f(x_k)

58 2. Backtracking line search for step size Keep shrinking the step size till we find a good one

59 2. Backtracking line search for step size (figure: step from x_k to x_{k+1}) Keep shrinking the step size till we find a good one. Update the estimate to the position at the converged step size

60 2. Backtracking line search for step size At each iteration, estimate the step size as follows: Set η_k = 1. Update η_k = βη_k until f(x_k - η_k ∇f(x_k)) ≤ f(x_k) - αη_k ||∇f(x_k)||^2. Update x_{k+1} = x_k - η_k ∇f(x_k). Figure shows the actual evolution of x_k
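A sketch (my own illustration) of gradient descent with the backtracking line search just described, using α = 0.5 and β = 0.8 as in the slides; the test objective is an arbitrary choice.

```python
import numpy as np

def f(x):
    # Arbitrary smooth test objective.
    return x[0]**2 + x[0]*x[1] + 4*x[1]**2

def grad_f(x):
    return np.array([2*x[0] + x[1], x[0] + 8*x[1]])

def backtracking_step(f, grad_f, x, alpha=0.5, beta=0.8):
    # Start with a unit step and shrink until sufficient decrease is achieved.
    eta, g = 1.0, grad_f(x)
    while f(x - eta * g) > f(x) - alpha * eta * np.dot(g, g):
        eta *= beta
    return eta

x = np.array([3.0, 3.0])
for _ in range(100):
    eta = backtracking_step(f, grad_f, x)
    x = x - eta * grad_f(x)
print(x)   # -> close to [0, 0]
```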

61 3. Full line search for step size At each iteration, scan for the η_k that minimizes f(x_k - η_k ∇f(x_k)). Update x_{k+1} = x_k - η_k ∇f(x_k)

62 3. Full line search for step size At each iteration, scan for the η_k that minimizes f(x_k - η_k ∇f(x_k)). This can be computed by solving d f(x_k - η_k ∇f(x_k)) / dη_k = 0. Update x_{k+1} = x_k - η_k ∇f(x_k)

63 Gradient descent convergence criteria The gradient descent algorithm converges when one of the following criteria is satisfied: |f(x_{k+1}) - f(x_k)| < ε_1, or ||∇f(x_k)|| < ε_2

64 Gradient descent vs. Newton's Gradient descent is typically much slower to converge than Newton's, but much faster to compute. (Figure: from x_0, Newton's method vs. gradient descent.) Newton's method is exponentially faster for convex problems, although derivatives and Hessians may be hard to derive, and it may not converge for non-convex problems

65 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

66 Online Optimization Often our objective function is an error, the cumulative error from many signals, e.g. E(W) = Σ_x ||y - f(x, W)||^2. Optimization will find the W that minimizes the total error across all x. What if we wanted to update our parameters after each input x instead of waiting for all of them to arrive?

67 A problem we saw (figure: music M, score S, notes N = ?) Given the music M and the score S of only four of the notes, but not the notes themselves, find the notes: M = NS, so N = M Pinv(S)

68 The Actual Problem (figure: music M, score S, notes N = ?) Given the music M and the score S, find a matrix N such that the error of reconstruction E = Σ_i ||M_i - N S_i||^2 is minimized. This is a standard optimization problem. The solution gives us N = M Pinv(S)

69 The Actual Problem (figure: music M, score S, notes N = ?) Given the music M and the score S, find a matrix N such that the error of reconstruction E = Σ_i ||M_i - N S_i||^2 is minimized. This is a standard optimization problem. The solution gives us N = M Pinv(S). This requires seeing all of M and S to estimate N

70 Online Updates (figure: successive estimates N_0, N_1, N_2, N_3, N_4) What if we want to update our estimate of the notes after every input, i.e. after observing each vector of music and its score? A situation that arises in many similar problems

71 Incremental Updates Easy solution: To obtain the k-th estimate N_k, minimize the error on the k-th input. The error on the k-th input is E_k = ||M_k - N S_k||^2. Differentiating it gives us ∇_N E_k = -2 (M_k - N S_k) S_k^T. Update the parameter to move against this gradient: N_{k+1} = N_k + η (M_k - N_k S_k) S_k^T. η must typically be very small to prevent the updates from being influenced entirely by the latest observation

72 Online update: Non-quadratic functions The earlier problem has a linear predictor as the underlying model, M_k = N S_k, and the squared error E_k = ||M_k - N S_k||^2 is quadratic and differentiable. We often have non-linear predictors: Y_k = g(W X_k), with E_k = ||Y_k - g(W X_k)||^2. The derivative of the error w.r.t. W is often ugly or intractable. For such problems we will use the following generalization of the online update rule for linear predictors (this is the Widrow-Hoff rule): W_{k+1} = W_k + η (Y_k - g(W_k X_k)) X_k^T
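A sketch (my own, on synthetic data) of the online update for the linear case: each incoming (score, music) pair triggers one small update N <- N + η (M_k - N S_k) S_k^T. The dimensions, step size η, and number of iterations are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "true" notes matrix and a stream of (score, music) column pairs.
N_true = rng.normal(size=(20, 5))               # 20-dim music vectors, 5 notes
N_est = np.zeros_like(N_true)
eta = 0.01                                      # small step so no single input dominates

for _ in range(20_000):
    s = rng.normal(size=(5, 1))                 # one score column S_k
    m = N_true @ s                              # corresponding music column M_k
    err = m - N_est @ s                         # prediction error on this input only
    N_est += eta * err @ s.T                    # online update on the latest observation
print(np.max(np.abs(N_est - N_true)))           # residual error, small after many updates
```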

73 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

74 A problem we recently saw M = N = S The projection matrix P is the matrix that minimizes the total error between the projected matrix S and the original matrix M /

75 CONSTRAINED optimization Recall the projection problem: Find P such that we minimize E = Σ_i ||M_i - P M_i||^2 AND such that the projection is composed of the notes in N: P = NC. This is a problem of constrained optimization

76 Optimization problem with constraints Finding the minimum of a function f: R^N → R subject to constraints: min_x f(x) s.t. g_i(x) ≤ 0, i = 1, ..., k, and h_j(x) = 0, j = 1, ..., l. The constraints define a feasible region, which is nonempty

77 Optimization without constraints No constraints: min_{x,y} f(x, y) = x^2 + y^2. (Figure: the best minimum point, at the origin.)

78 Optimization with constraints With constraints: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4

79 Optimization with constraints Minima with and without constraints: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4. (Figure: the minimum without constraints is at the origin; the minimum with constraints lies on the constraint boundary.)

80 Solving for constrained optimization: the method of Lagrangians Consider a function f(x, y) that must be maximized w.r.t. (x, y) subject to g(x, y) = c. Note, we're using a maximization example to go with the figures, which have been obtained from Wikipedia

81 The Lagrange Method The purple surface is f(x, y), which must be maximized. The red curve is the constraint g(x, y) = c; all solutions must lie on this curve. Problem: Find the position of the largest f(x, y) on the red curve!

82 The Lagrange Method maximize f(x, y) s.t. g(x, y) = c. The dotted lines are constant-value contours f(x, y) = C: f(x, y) has the same value C at all points on a contour. The constrained optimum will be at the point where the highest constant-value contour touches the red curve; it will be tangential to the red curve

83 The Lagrange Method maximize f(x, y) s.t. g(x, y) = c. The constrained optimum is where the highest constant-value contour is tangential to the red curve: the gradient of f(x, y) = C will be parallel to the gradient of g(x, y) = c

84 The Lagrange Method maximize f(x, y) s.t. g(x, y) = c. At the optimum: ∇f(x, y) = λ ∇g(x, y) and g(x, y) = c. Find (x, y) that satisfies both conditions

85 The Lagrange Method ∇f(x, y) = λ ∇g(x, y), g(x, y) = c: find (x, y) that satisfies both conditions. Combine the two into one function: L(x, y, λ) = f(x, y) - λ(g(x, y) - c). Optimize it for x, y, λ. Solving for (x, y): ∇_{x,y} L(x, y, λ) = 0 ⇒ ∇f(x, y) = λ ∇g(x, y). Solving for λ: ∂L(x, y, λ)/∂λ = 0 ⇒ g(x, y) = c

86 The Lagrange Method ∇f(x, y) = λ ∇g(x, y), g(x, y) = c: find (x, y) that satisfies both conditions. Combine the two into one function: L(x, y, λ) = f(x, y) - λ(g(x, y) - c). Optimize it for x, y, λ. Solving for (x, y): ∇_{x,y} L(x, y, λ) = 0 ⇒ ∇f(x, y) = λ ∇g(x, y). Solving for λ: ∂L(x, y, λ)/∂λ = 0 ⇒ g(x, y) = c. Formally: to maximize f(x, y), solve max_{x,y} min_λ L(x, y, λ); to minimize f(x, y), solve min_{x,y} max_λ L(x, y, λ)

87 Generalizes to inequality constraints Optimization problem with constraints: min_x f(x) s.t. g_i(x) ≤ 0, i = 1, ..., k, and h_j(x) = 0, j = 1, ..., l. Lagrange multipliers λ_i ≥ 0, ν ∈ R: L(x, λ, ν) = f(x) + Σ_{i=1}^{k} λ_i g_i(x) + Σ_{j=1}^{l} ν_j h_j(x). The necessary condition: ∇L(x, λ, ν) = 0, i.e. ∂L/∂x = 0, ∂L/∂λ = 0, ∂L/∂ν = 0

88 Lagrange multiplier example min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4. Lagrange multiplier: L = x^2 + y^2 + λ(2x + y + 4). Evaluate ∇L(x, y, λ) = 0, i.e. ∂L/∂x = 0, ∂L/∂y = 0, ∂L/∂λ = 0

89 Optimization with constraints Lagrange multiplier results: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4. Minimum with constraints: (x*, y*, f(x*, y*)) = (-8/5, -4/5, 16/5)
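A sketch (my own) that reproduces this example symbolically with sympy, assuming the constraint is active at the optimum (2x + y = -4), which the figure indicates.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
# Lagrangian with the active constraint 2x + y + 4 = 0
L = x**2 + y**2 + lam*(2*x + y + 4)

# Stationarity in x, y and the constraint itself (derivative w.r.t. lambda).
sol = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.diff(L, lam)], [x, y, lam])
print(sol)                          # {x: -8/5, y: -4/5, lambda: 8/5}
print((x**2 + y**2).subs(sol))      # 16/5, the constrained minimum value
```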

90 An Alternate Approach: Projected Gradients Feasible Set The constraints specify a feasible set The region of the space where the solution can lie /

91 An Alternate Approach: Projected Gradients Feasible Set From the current estimate, take a step using the conventional gradient descent approach If the update is inside the feasible set, no further action is required /

92 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, /

93 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set /

94 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set And move the updated estimate to this new point /

95 The method of projected gradients min_x f(x) s.t. g_i(x) ≤ 0, i = 1, ..., k. The constraints specify a feasible set: the region of the space where the solution can lie. Update the current estimate using the conventional gradient descent approach. If the update is inside the feasible set, no further action is required. If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set, and move the updated estimate to this new point. The closest point projects the update onto the feasible set. For many problems, however, finding this projection can be difficult or intractable
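A sketch (my own) of projected gradient descent for a case where the projection is easy: the feasible set is a box (an arbitrary choice), so the Euclidean projection is just a clip; the objective is the x^2 + y^2 example from earlier.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x, y) = x^2 + y^2
    return 2 * x

def project_box(x, lo, hi):
    # Euclidean projection onto the box lo <= x <= hi (closed form: clip each coordinate).
    return np.clip(x, lo, hi)

lo, hi = np.array([1.0, -2.0]), np.array([3.0, 2.0])   # feasible box
x = np.array([2.5, 1.5])
for _ in range(200):
    x = x - 0.1 * grad_f(x)          # ordinary gradient step
    x = project_box(x, lo, hi)       # project back onto the feasible set
print(x)                             # -> approximately [1, 0], the closest feasible point to the origin
```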

96 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

97 Regularization Sometimes we have additional regularization requirements on the parameters; note these are not hard constraints. E.g.: Minimize f(X) while requiring that the length ||X||_2 is also minimum. Minimize f(X) while requiring that ||X||_1 is also minimal. Minimize f(X) such that g(X) is maximum. We will encounter problems where such requirements are logical

98 Contour Plot of a Quadratic Objective Left: actual 3D plot of f(x), x = [x_1, x_2]. Right: constant-value contours; the innermost contour has the lowest value. Unconstrained/unregularized solution: the center of the innermost contour

99 Examples of regularization (Image credit: Tibshirani) Left: L_1 regularization. Find x that minimizes f(x) while also minimizing ||x||_1; ||x||_1 = const is a diamond, so find the x that also minimizes the diameter of the diamond. Right: L_2 or Tikhonov regularization. Also minimize ||x||_2; ||x||_2 = const is a circle (sphere), so find the x that also minimizes the diameter of the circle

100 Regularization The problem: multiple simultaneous objectives. Minimize f(X) while also minimizing g_1(X), g_2(X), ...; these are regularizers. Solution: Define L(X) = f(X) + λ_1 g_1(X) + λ_2 g_2(X) + ... λ_1, λ_2 etc. are regularization parameters; these are set and not estimated, unlike Lagrange multipliers. Minimize L(X)

101 Contour Plot of a Quadratic Objective Left: actual 3D plot of f(x), x = [x_1, x_2]. Right: equal-value contours of f(x); the innermost contour has the lowest value

102 With L_1 regularization (figure: contours of f(x) and of f(x) + ||x||_1) The L_1-regularized objective is f(x) + λ||x||_1, shown for different values of the regularization parameter. Note: the minimum value occurs on the x_1 axis for λ = 1, a sparse solution

103 L_2 and L_1-L_2 regularization (figure: contours of f(x) + ||x||_2^2 and of f(x) + ||x||_1 + ||x||_2^2) The L_2-regularized objective f(x) + λ||x||_2^2 results in a shorter optimum. The L_1-L_2-regularized objective results in a sparse, short optimum. λ = 1 for both regularizers in the example

104 Sparse signal reconstruction (figure panels: minimum square error vs. regularization) A signal x with only a few non-zero components is reconstructed from 50 noisy measurements b = Ax + ε

105 Signal reconstruction Minimum square error signal reconstruction: least squares problem min_x ||Ax - b||_2^2

106 L2-Regularization Signal reconstruction: least squares problem min_x ||Ax - b||_2^2 + γ||x||_2^2

107 L1-Regularization Signal reconstruction: least squares problem min_x ||Ax - b||_2^2 + γ||x||_1
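A sketch (my own) contrasting the L2- and L1-regularized reconstructions above: the ridge problem has a closed form, while the L1 problem is solved here with a simple proximal-gradient (ISTA) loop. The problem sizes, noise level, and γ values are arbitrary choices; the point is only that the L1 solution comes out sparse while the L2 solution does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 50, 200, 10                        # 50 measurements, 200-dim signal, 10 non-zeros
A = rng.normal(size=(n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[rng.choice(d, k, replace=False)] = rng.normal(size=k)
b = A @ x_true + 0.01 * rng.normal(size=n)   # noisy measurements b = Ax + eps

# L2 (ridge): closed form of min ||Ax - b||^2 + gamma2 ||x||_2^2
gamma2 = 0.1
x_l2 = np.linalg.solve(A.T @ A + gamma2 * np.eye(d), A.T @ b)

# L1 (lasso) via ISTA: min ||Ax - b||^2 + gamma1 ||x||_1
gamma1 = 0.05
step = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)  # 1/L, L = Lipschitz constant of the quadratic term's gradient
x_l1 = np.zeros(d)
for _ in range(2000):
    z = x_l1 - step * 2 * A.T @ (A @ x_l1 - b)                        # gradient step on the quadratic term
    x_l1 = np.sign(z) * np.maximum(np.abs(z) - step * gamma1, 0.0)    # soft-thresholding (prox of the L1 term)

print(np.sum(np.abs(x_l2) > 1e-3), "nonzeros in the L2 solution")    # many nonzeros (dense)
print(np.sum(np.abs(x_l1) > 1e-3), "nonzeros in the L1 solution")    # few nonzeros (sparse)
```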

108 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods: Newton's method, Gradient methods 4. Online optimization 5. Constrained optimization: Lagrange's method, Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals

109 Convex optimization problems A convex optimization problem is defined by a convex objective function, convex inequality constraints, and affine equality constraints: min_x f_0(x) (convex function) s.t. f_i(x) ≤ 0 (convex sets), h_j(x) = 0 (affine)

110 Convex Sets A set C ⊆ R^n is convex if for each x, y ∈ C and α ∈ [0, 1], αx + (1 - α)y ∈ C. (Figure: a non-convex set and a convex set.)

111 Convex functions A function f: R^N → R is convex if for each x, y ∈ domain(f) and α ∈ [0, 1]: f(αx + (1 - α)y) ≤ αf(x) + (1 - α)f(y). (Figure: for a convex function the chord αf(x) + (1 - α)f(y) lies above the curve; for a non-convex function it does not.)

112 Concave functions A function f: R^N → R is concave if for each x, y ∈ domain(f) and α ∈ [0, 1]: f(αx + (1 - α)y) ≥ αf(x) + (1 - α)f(y). (Figure: for a concave function the chord αf(x) + (1 - α)f(y) lies below the curve.)

113 First order convexity conditions A differentiable function f: R^N → R is convex if and only if for all x, y ∈ domain(f) the following condition is satisfied: f(y) ≥ f(x) + ∇f(x)^T (y - x). (Figure: the tangent f(x) + ∇f(x)^T (y - x) at (x, f(x)) is a lower bound on f.)

114 Second order convexity conditions A twice-differentiable function f: R^N → R is convex if and only if for all x ∈ domain(f) the Hessian is positive semidefinite: ∇^2 f(x) ⪰ 0
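A small numerical sketch (my own, on a convex function of my own choosing) checking both conditions: the first-order lower bound holds at sampled point pairs, and the Hessian eigenvalues are non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # A convex function: quadratic term plus a log-sum-exp term.
    return x @ x + np.log(np.sum(np.exp(x)))

def grad_f(x):
    p = np.exp(x) / np.sum(np.exp(x))
    return 2 * x + p

# First-order condition: f(y) >= f(x) + grad_f(x)^T (y - x) for random x, y.
ok = all(f(y) >= f(x) + grad_f(x) @ (y - x) - 1e-9
         for x, y in (rng.normal(size=(2, 4)) for _ in range(1000)))
print("first-order condition holds on samples:", ok)

# Second-order condition: Hessian 2I + diag(p) - p p^T is positive semidefinite.
x = rng.normal(size=4)
p = np.exp(x) / np.sum(np.exp(x))
H = 2 * np.eye(4) + np.diag(p) - np.outer(p, p)
print("Hessian eigenvalues:", np.linalg.eigvalsh(H))   # all >= 0
```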

115 Properties of Convex Optimization For convex objectives over convex feasible sets, the optimum value is unique There are no local minima/maxima that are not also the global minima/maxima Any gradient-based solution will find this optimum eventually Primary problem: speed of convergence to this optimum /

116 Lagrange multiplier duality Optimization problem with constraints: min_x f(x) s.t. g_i(x) ≤ 0, i = {1, ..., k}, and h_j(x) = 0, j = {1, ..., l}. Lagrange multipliers λ_i ≥ 0, ν ∈ R: L(x, λ, ν) = f(x) + Σ_{i=1}^{k} λ_i g_i(x) + Σ_{j=1}^{l} ν_j h_j(x). The dual function: inf_x L(x, λ, ν) = inf_x { f(x) + Σ_{i=1}^{k} λ_i g_i(x) + Σ_{j=1}^{l} ν_j h_j(x) }

117 Lagrange multiplier duality The original optimization problem: min_x { sup_{λ≥0, ν} L(x, λ, ν) }. The dual optimization: max_{λ≥0, ν} { inf_x L(x, λ, ν) }. Property of the dual for convex functions: sup_{λ≥0, ν} { inf_x L(x, λ, ν) } = f(x*)

118 Lagrange multiplier duality Previous example: f(x, y) is convex and the constraint function is convex: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4

119 Lagrange multiplier duality Primal system: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4. Lagrange multiplier: L = x^2 + y^2 + λ(2x + y + 4). ∂L/∂x = 2x + 2λ = 0 ⇒ x = -λ; ∂L/∂y = 2y + λ = 0 ⇒ y = -λ/2. Dual system: max_λ w(λ) = -(5/4)λ^2 + 4λ s.t. λ ≥ 0. Property: w(λ*) = f(x*, y*)

120 Lagrange multiplier duality Dual system: max_λ w(λ) = -(5/4)λ^2 + 4λ s.t. λ ≥ 0. This is a concave function: the convex function becomes a concave function in the dual problem. dw/dλ = -(5/2)λ + 4 = 0 ⇒ λ* = 8/5

121 Lagrange multiplier duality Primal system: min_{x,y} f(x, y) = x^2 + y^2 s.t. 2x + y ≤ -4. Dual system: max_λ w(λ) = -(5/4)λ^2 + 4λ s.t. λ ≥ 0. Evaluating w(λ*) = f(x*, y*): x* = -8/5, y* = -4/5, λ* = 8/5. f(x*, y*) = (-8/5)^2 + (-4/5)^2 = 16/5, and w(λ*) = -(5/4)(8/5)^2 + 4(8/5) = 16/5
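A small sketch (my own) verifying strong duality numerically for this example: maximizing w(λ) = -(5/4)λ^2 + 4λ over a grid of λ ≥ 0 gives the same value 16/5 as the constrained primal optimum; the grid resolution is an arbitrary choice.

```python
import numpy as np

lam = np.linspace(0, 5, 100_001)
w = -1.25 * lam**2 + 4 * lam           # dual function w(lambda) = -(5/4) lambda^2 + 4 lambda
i = np.argmax(w)
print(lam[i], w[i])                     # -> lambda* = 1.6 (= 8/5), w(lambda*) = 3.2 (= 16/5)

x_star, y_star = -1.6, -0.8             # primal solution from the Lagrange conditions
print(x_star**2 + y_star**2)            # -> 3.2, equal to the dual optimum (strong duality)
```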
