Machine Learning for Signal Processing Lecture 4: Optimization
|
|
- Arleen Cain
- 6 years ago
- Views:
Transcription
1 Machine Learning for Signal Processing Lecture 4: Optimization 13 Sep 2015 Instructor: Bhiksha Raj (slides largely by Najim Dehak, JHU) /
2 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
3 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
4 A problem we recently saw M = N = S The projection matrix P is the matrix that minimizes the total error between the projected matrix S and the original matrix M /
5 The projection problem S = PM For individual vectors in the spectrogram S i = PM i Total projection error is E = M i PM i 2 i The projection matrix projects onto the space of notes in N P = NC The problem of finding P : Minimize E = M i PM i 2 i such that P = NC This is a problem of constrained optimization /
6 Optimization Optimization is finding the best value of a function (which can be the best minimum) The number of turning points of a function depend on the order of the function. Not all turning points are minima. They can be minima, maxima or inflection points. f (x) f(x) global maximum min x f (x) inflection point global minimum local minimum x
7 Examples of Optimization : Multivariate functions Find the optimal point in these functions /
8 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
9 Simple Approach: Turning Point f(x) X The minimum of the function is always a turning point The function increases on either side How to identify these turning points? /
10 The derivative of a curve y y x x The derivative α of a curve is a multiplicative factor explaining how much y changes in response to a very small change in x y = x For scalar functions of scalar variables, often expressed as dy y = dy dx x y = f (x) x dx or as f (x) We have all learned how to compute derivatives in basic calculus /
11 Positive derivative dy dx > The derivative of a Curve Negative derivative dy dx < Zero derivative dy dx = In upward-rising regions of the curve, the derivative is positive Small increase in X cause Y to increase In downward-falling regions, the derivative is negative At turning points, the derivative is 0 Assumption: the function is differentiable at the turning point /
12 Finding the minimum of a function f(x) dy dx = 0 x Find the value x at which f (x) = 0 Solve df(x) dx = 0 The solution is a turning point But is it a minimum? /
13 Turning Points Both maxima and minima have zero derivative /
14 Derivatives of a curve f(x) x Both maxima and minima are turning points /
15 Derivatives of a curve f (x) f(x) x Both maxima and minima are turning points Both maxima and minima have zero derivative /
16 Derivative of the derivative of the curve f (x) f (x) f(x) x Both maxima and minima are turning points Both maxima and minima have zero derivative The second derivative f (x) is ve at maxima and +ve at minima! At maxima the derivative goes from +ve to ve, so the derivative decreases as x increases At minima the derivative goes from ve to +ve and increases as x increases /
17 Soln: Finding the minimum or f(x) maximum of a function dy dx = 0 Find the value x at which f (x) = 0: Solve df(x) dx = 0 The solution x soln is a turning point Check the double derivative at x soln : compute f x soln = df (x soln) dx If f x soln is positive x soln is a minimum, otherwise it is a maximum x /
18 What about functions of multiple variables? The optimum point is still turning point Shifting in any direction will increase the value For smooth functions, miniscule shifts will not result in any change at all We must find a point where shifting in any direction by a microscopic amount will not change the value of the function /
19 A brief note on derivatives of multivariate functions /
20 The Gradient of a scalar function f(x) The Gradient f(x) of a scalar function f(x) of a multi-variate input X is a multiplicative factor that gives us the change in f(x) for tiny variations in X X f(x) = f(x) T X /
21 Gradients of scalar functions with multi-variate inputs Consider f X = f x 1, x 2,, x n f(x) f(x) = x 1 f(x) x 2 f(x) Check: f X = f X T X = f(x) x x 1 + f(x) 1 x 2 x n x f(x) x n x n
22 A well-known vector property u. v = u v cosθ The inner product between two vectors of fixed lengths is maximum when the two vectors are aligned /
23 f X Properties of Gradient = f X T X The inner product between f X and X Fixing the length of X E.g. X = 1 f X is max if X = c f X f X, X = 0 The function f(x) increases most rapidly if the input increment X is perfectly aligned to f X The gradient is the direction of fastest increase in f(x) /
24 Gradient Gradient vector f(x) /
25 Gradient Gradient vector f(x) Moving in this direction increases f X fastest /
26 Gradient Gradient vector f(x) f(x) Moving in this direction decreases f X fastest Moving in this direction increases f(x) fastest /
27 Gradient Gradient here is 0 Gradient here is /
28 Properties of Gradient: 2 The gradient vector f(x) is perpendicular to the level curve 29
29 Derivatives of vector function of vector input Y = f(x) X = x 1 x 2 Y = y 1 y 2 X f(x) The Gradient f(x) of a scalar function f(x) of a multi-variate input X is a multiplicative factor that gives us the change in f(x) for tiny variations in X f(x) = f(x) T X /
30 Gradient of vector function of vector input / x n x x X n m n n n n x y x y x y x y x y x y x y x y x y X f ) ( y m y y X f.. ) ( 2 1 Properties and interpretations are similar to the case of scalar functions of vector inputs
31 The Hessian The Hessian of a function f(x 1, x 2,, x n ) is given by the second derivative é ê ê ê ê ê ê Ñ 2 f (x 1,..., x n ) := ê ê ê ê ê ê ê ë 2 f x f x 1 x f 2 f.. 2 x 2 x 1 x 2 2 f x 1 x n 2 f x 2 x n f 2 f.. x n x 1 x n x 2 2 f x n / ù ú ú ú ú ú ú ú ú ú ú ú ú ú û
32 Returning to direct optimization /
33 Finding the minimum of a scalar function of a multi-variate input The optimum point is a turning point the gradient will be /
34 Unconstrained Minimization of function (Multivariate) 1. Solve for the X where the gradient equation equals to zero f ( X ) 2. Compute the Hessian Matrix 2 f(x) at the candidate solution and verify that Hessian is positive definite (eigenvalues positive) -> to identify local minima Hessian is negative definite (eigenvalues negative) -> to identify local maxima /
35 Unconstrained Minimization of Minimize function (Example) f (x 1, x 2, x 3 ) = (x 1 ) 2 + x 1 (1- x 2 )-(x 2 ) 2 - x 2 x 3 +(x 3 ) 2 + x 3 Gradient é ê Ñf = ê ê ëê 2x x 2 -x 1 + 2x 2 - x 3 -x 2 + 2x 3 +1 ù ú ú ú ûú /
36 Unconstrained Minimization of Set the gradient to null é ê Ñf = 0Þ ê ê ëê function (Example) 2x x 2 -x 1 + 2x 2 - x 3 -x 2 + 2x 3 +1 ù é ú ê ú = ê ú ê ûú ë Solving the 3 equations system with 3 unknowns é ê x = ê ê ëê x 1 x 2 x 3 ù é ú ê ú = ê ú ê ûú ë ù ú ú ú û ù ú ú ú û /
37 Unconstrained Minimization of function (Example) Compute the Hessian matrix Evaluate the eigenvalues of the Hessian matrix l 1 = 3.414, l 2 = 0.586, l 3 = 2 All the eigenvalues are positives => the Hessian matrix is positive definite The point é ê x = ê ê ëê x 1 x 2 x 3 ù é ú ê ú = ê ú ê ûú ë ù ú ú ú û é ê Ñ 2 f = ê ê ë is a minimum / ù ú ú ú û
38 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
39 Closed Form Solutions are not always available f(x) Often it is not possible to simply solve f X = 0 The function to minimize/maximize may have an intractable form In these situations, iterative solutions are used Begin with a guess for the optimal X and refine it iteratively until the correct value is obtained X /
40 Iterative solutions f(x) X X x 0 x 1 x 2 x 5 x 3 X 0 2 x X 4 1 Iterative solutions Start from an initial guess X 0 for the optimal X Update the guess towards a (hopefully) better value of f(x) Stop when f(x) no longer decreases Problems: Which direction to step in How big must the steps be /
41 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum Based on the first order derivatives (gradient) and in some cases the second order derivatives (Hessian). Newton s method is based on both first and second derivatives Gradient descent is based only on the first derivative /
42 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum Based on the first order derivatives (gradient) and in some cases the second order derivatives (Hessian). Newton s method is based on both first and second derivatives Gradient descent is based only on the first derivative /
43 Newton s Method The Newton Method is based on an iterative step to minimize the derivative f (x) x k+1 = x k - f (xk ) f (x k ) k is the current iteration The iterations continue until we achieve the stopping criterion x k+1 x k < ε /
44 Newton s method to find the zero of a function Newton s method to find the zero of a function Initialize estimate Approximate function by the tangent at initial value Update estimate to location where tangent becomes 0 Iterate /
45 Newton s Method to optimize a function y f(x) x f (x) Apply Newton s method to the derivative of the function! The derivative goes to 0 at the optimum Algorithm: Initialize x 0 K th iteration: Approximate f (x) by the tangent at x k Find the location x intersect where the tangent goes to 0. Set x k+1 = x intersect Iterate /
46 Newton s method for univariate functions Apply Newton s algorithm to find the zero of the derivative f (x) x k+1 = x k - f (xk ) f (x k ) k is the current iteration The iterations continue until we achieve the stopping criterion x k+1 x k < ε /
47 Newton s method for multivariate functions 1. Select an initial starting point X 0 2. Evaluate the gradient f X k and Hessian 2 f X k at X k 3. Calculate the new X k+1 using the following X X 2 k 1 f ( X ). f ( X ) k 1 k k 4. Repeat Steps 2 and 3 until convergence /
48 Newton s Method Newton s approach is based on the computation of both gradient and Hessian Fast to converge (few iterations) Slow to compute Newton s method (arrives at optimum in a single step) x 0 This method is very sensitive to the initial point If the initial point is very far from the optimal point, the optimization process may not converge /
49 Descent methods Iterative solutions that attempt to descend the function in steps to arrive at the minimum Based on the first order derivatives (gradient) and in some cases the second order derivatives (Hessian). Newton s method is based on both first and second derivatives Gradient descent is based only on the first derivative /
50 The Approach of Gradient Descent Iterative solution: Start at some point Find direction in which to shift this point to decrease error This can be found from the derivative of the function A positive derivative moving left decreases error A negative derivative moving right decreases error Shift point in this direction
51 The Approach of Gradient Descent Iterative solution: Trivial algorithm Initialize x 0 While f x k 0 If sign f x k is positive: Else x k+1 = x k step x k+1 = x k + step But what must step be to ensure we actually get to the optimum?
52 The Approach of Gradient Descent Iterative solution: Trivial algorithm Initialize x 0 While f x k 0 x k+1 = x k sign f x k. step Identical to previous algorithm
53 The Approach of Gradient Descent Iterative solution: Trivial algorithm Initialize x 0 Whilef x k 0 x k+1 = x k η k f x k η k is the step size What must the step size be?
54 Gradient descent/ascent (multivariate) The gradient descent/ascent method to find the minimum or maximum of a function f iteratively To find a maximum move in the direction of the gradient x k+1 = x k + η k f x k To find a minimum move exactly opposite the direction of the gradient x k+1 = x k η k f x k What is the step size η k /
55 1. Fixed step size Fixed step size Use fixed value for η k /
56 f Influence of step size example (constant step size) é 2 2 ( x1, x2) ( x1 ) x1x2 4( x2) x initial = ê 3 ë ù ú û x 0 x /
57 2. Backtracking line search for step size Two parameters α (typically 0.5) and β (typically 0.8) At each iteration, estimate step size as follows: Set η k = 1 Update η k = βη k until f x k η k f x k f x k αη k f x k 2 Update x k+1 = x k η k f x k Intuitively: At each iteration Take a unit step size and keep shrinking it until we arrive at a place where the function f x k η k f x k actually decreases sufficiently w.r.t f x k 77
58 2. Backtracking line search for step size Keep shrinking step size till we find a good one 78
59 2. Backtracking line search for step size X k+1 X k Keep shrinking step size till we find a good one Update estimate to the position at the converged step size 79
60 2. Backtracking line search for step size At each iteration, estimate step size as follows: Set η k = 1 Update η k = βη k until f x k η k f x k f x k αη k f x k 2 Update x k+1 = x k η k f x k Figure shows actual evolution of x k /
61 3. Full line search for step size At each iteration scan for η k that minimizes f x k η k f x k Update x k = x k η k f x k /
62 3. Full line search for step size At each iteration scan for η k that minimizes f x k η k f x k Can be computed by solving df x k η k f x k dη k = 0 Update x k = x k η k f x k /
63 Gradient descent convergence criteria The gradient descent algorithm converges when one of the following criteria is satisfied f (x k+1 )- f (x k ) <e 1 Or Ñf (x k ) < e /
64 Gradient descent vs. Newton s Gradient descent is typically much slower to converge than Newton s But much faster to compute x 0 Newton s method Gradient descent Newton s method is exponentially faster for convex problems Although derivatives and Hessians may be hard to derive May not converge for non-convex problems 97
65 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
66 Online Optimization Often our objective function is an error The error is the cumulative error from many signals E.g. E(W) = y f(x, W) 2 x Optimization will find the W that minimizes total error across all x What if wanted to update our parameters after each input x instead of waiting for all of them to arrive? /
67 A problem we saw M = S = N =? Given the music M and the score S of only four of the notes, but not the notes themselves, find the notes M = NS N = MPinv(S) /
68 The Actual Problem M = S = N =? Given the music M and the score S find a matrix N such the error of reconstruction E = M i NS i 2 i is minimized This is a standard optimization problem The solution gives us N = MPinv(S) /
69 The Actual Problem M = S = N =? Given the music M and the score S find a matrix N such the error of reconstruction E = M i NS i 2 i is minimized This is a standard optimization problem The solution gives us N = MPinv(S) This requires seeing all of M and S to estimate N /
70 Online Updates M = S = N =? N 0 N 1 N 2 N 3 N 4 What if we want to update our estimate of the notes after every input After observing each vector of music and its score A situation that arises in many similar problems /
71 Incremental Updates Easy solution: To obtain the k th estimate N k, minimize the error on the k th input The error on the k th input is: E k = M K NS K 2 Differentiating it gives us N = 2 M K NS K M K T Update the parameter to move in the direction of this update N k+1 = N k T + η M K NS K M K η must typically be very small to prevent the updates from being influenced entirely by the latest observation /
72 Online update: Non-quadratic functions The earlier problem has a linear predictor as the underlying model M k = NS k The squared error E k = M k NS k 2 is quadratic and differentiable We often have non-linear predictors Y k = g WX k E k = Y k g WX k 2 The derivative of the error w.r.t W is often ugly or intractable For such problems we will use the following generalization of the online update rule for linear predictors This is the Widrow-Hoff rule W k+1 = W k + η Y k g W k X k Y k T /
73 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
74 A problem we recently saw M = N = S The projection matrix P is the matrix that minimizes the total error between the projected matrix S and the original matrix M /
75 CONSTRAINED optimization Recall the projection problem: Find P such that we minimize E = M i PM i 2 i AND such that the projection is composed of the notes in N P = NC This is a problem of constrained optimization /
76 Optimization problem with constraints Finding the minimum of a function f: R N R subject to constraints min f (x) x s.t. g i (x) 0 h j (x) = 0 i = 1,..., k { } j = 1,...,l { } Constraints define a feasible region, which is nonempty /
77 Optimization without constraints No Constraints min x f (x, y, z)= x 2 + y 2 Best minimum point /
78 Optimization with constraints With Constraints min x,y f (x, y)= x 2 + y 2 s.t. 2x + y /
79 Optimization with constraints Minima w/ and w/o constraints min x,y f (x, y)= x 2 + y 2 s.t. 2x + y -4 Minimum Without constraints Minimum With constraints /
80 Solving for constrained optimization: the method of Lagrangians Consider a function f(x, y) that must be maximized w.r.t (x, y) subject to g(x, y) = c Note, we re using a maximization example to go with the figures that have been obtained from Wikipedia /
81 The Lagrange Method Purple surface is f(x, y) Must be maximized Red curve is constraint g(x, y) = c All solutions must line on this curve Problem: Find the position of the largest f(x, y) on the red curve! /
82 The Lagrange Method maximize f x, y, s. t. g x, y = c Dotted lines are constant-value contours f(x, y) = C f x, y has the same value C at all points on a contour The constrained optimum will be at the point where the highest constant-value contour touches the red curve It will be tangential to the red curve 115
83 The Lagrange Method maximize f x, y, s. t. g x, y = c The constrained optimum is where the highest constant-value contour is tangential to the red curve The gradient of f(x, y) = C will be parallel to the gradient of g(x, y) = c 116
84 The Lagrange Method maximize f x, y, s. t. g x, y = c At the optimum f x, y = λ g x, y g x, y = c Find x, y that satisfies both above conditions 117
85 The Lagrange Method f x, y g x, y = λ g x, y = c Find x, y that satisfies both above conditions Combine the above two into one equation L x, y, λ = f x, y λ(g x, y c) Optimize it for x, y, λ Solving for (x, y), Solving for λ x,y L x, y, λ =0 f x, y = λ g x, y L x, y, λ λ = 0 g x, y = c /
86 The Lagrange Method f x, y g x, y = λ g x, y = c Find x, y that satisfies both above conditions Combine the above two into one equation L x, y, λ = f x, y λ(g x, y c) Optimize it for x, y, λ Solving for (x, y), Formally: Solving for λ x,y L x, y, λ =0 f x, y = λ g x, y to maximize f(x, y): max L x, y, λ λ x,y to minimize f(x, y): min min λ L(x, y, λ) max L(x, y, λ) = 0 g x, y = c x,y λ /
87 Generalizes to inequality constraints Optimization problem with constraints min s. t. g ( x) Lagrange multipliers l i ³ 0,n Î Â L(x, l,n) = f (x)+ å l i g i (x)+ n j h j (x) i=1 The necessary condition ÑL(x, l,n) = 0 Û L x x h j f ( x) i ( x) 0 i 0 k j 1,..., k 1,..., l l å j=1 L L = 0, = 0, l n = /
88 Lagrange multiplier example min x, y s. t.2x f ( x, y) x y 4 2 y 2 Lagrange multiplier L x 2 y 2 (2x y 4) Evaluate ÑL(x, l,n) = 0 Û L x L L = 0, = 0, l n = /
89 Optimization with constraints Lagrange Multiplier results min x,y f (x, y)= x 2 + y 2 s.t. 2x + y -4 Minimum With constraints (-8/5,-4/5,16/5) /
90 An Alternate Approach: Projected Gradients Feasible Set The constraints specify a feasible set The region of the space where the solution can lie /
91 An Alternate Approach: Projected Gradients Feasible Set From the current estimate, take a step using the conventional gradient descent approach If the update is inside the feasible set, no further action is required /
92 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, /
93 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set /
94 An Alternate Approach: Projected Gradients Feasible Set If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set And move the updated estimate to this new point /
95 The method of projected gradients min x s. t. g f ( x) i ( x) 0 i 1,..., k The constraints specify a feasible set The region of the space where the solution can lie Update current estimate using the conventional gradient descent approach If the update is inside the feasible set, no further action is required If the update falls outside the feasible set, find the closest point to the update on the boundary of the feasible set And move the updated estimate to this new point The closest point projects the update onto the feasible set For many problems, however, finding this projection can be difficult or intractable /
96 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
97 Regularization Sometimes we have additional regularization on the parameters E.g. Note these are not hard constraints Minimize f(x) while requiring that the length X 2 is also minimum Minimize f(x) while requiring that X 1 is also minimal Minimize f(x) such that g(x) is maximum We will encounter problems where such requirements are logical /
98 Contour Plot of a Quadratic Objective f(x) f(x) x 2 x 2 x 1 x 1 Left: Actual 3D plot x = [x 1, x 2 ] Right: constant-value contours Innermost contour has lowest value Unconstrained/unregularized solution: The center of the innermost contour /
99 Examples of regularization Image Credit: Tibshirani Left: L 1 regularization, find x that minimizes f(x) o Also minimize x 1 o x 1 = const is a diamond o Find x that also minimizes diameter of diamond Right: L 2 or Tikhonov regularization o Also minimize x 2 o x 2 = const is a circle (sphere) o Find x that also minimizes diameter of circle 11755/
100 Regularization The problem: multiple simultaneous objectives Minimize f(x) Also minimize g 1 X, g 2 X, These are regularizers Solution: Define L X = f X + λ 1 g 1 X + λ 2 g 2 X + λ 1, λ 2 etc are regularization parameters. These are set and not estimated Unlike Lagrange multipliers Minimize L X /
101 Contour Plot of a Quadratic Objective f(x) f(x) x 2 x 2 x 1 x 1 Left: Actual 3D plot x = [x 1, x 2 ] Right: equal-value contours of f(x) Innermost contour has lowest value /
102 With L 1 regularization f(x) x 1 f(x) + x 1 x 2 x 2 x 1 L 1 regularized objective f x + λ x 1, for different values of regularization parameter Note: Minimum value occurs on x 1 axis for = 1 Sparse solution / x 1
103 L 2 and L 1 -L 2 regularization f(x) + x 2 2 f(x) + x 1 + x 2 2 x 2 x 2 x 1 L 2 regularized objective f x + λ x 2 results in shorter optimum L 1 -L 2 regularized objective results in sparse, short optimum = 1 for both regularizers in example x /
104 Sparse signal reconstruction Minimum Square Error Signal x of length non-zero component Regularization Reconstructing the original signal from noisy 50 measurements b = Ax + ε /
105 Signal reconstruction Minimum Square Error Signal reconstruction Least square problem min Ax - b /
106 L2-Regularization Signal reconstruction Least squares problem min Ax - b 2 2 +g x /
107 L1-Regularization Signal reconstruction Least square problem min Ax - b 2 2 +g x /
108 Index 1. The problem of optimization 2. Direct optimization 3. Descent methods Newton s method Gradient methods 4. Online optimization 5. Constrained optimization Lagrange s method Projected gradients 6. Regularization 7. Convex optimization and Lagrangian duals /
109 Convex optimization Problems An convex optimization problem is defined by convex objective function Convex inequality constraints Affine equality constraints h j min f 0 (x) (convex function) x s.t. f i (x) 0 (convex sets) h j (x) = 0 (Affine) f i /
110 Convex Sets a set is convex, if for each x, y Î C and a Î 0,1 C Î Â n [ ] then ax +(1-a)y Î C x Non Convex Convex y y x /
111 Convex functions A function f: R N R is convex if for each x, y domain(f) and α [0,1] f (ax +(1-a)y) a f (x)+(1-a) f (y) Convex f (x) Non Convex f (y) a f (x)+(1-a) f (y) a f (x)+(1-a) f (y) f (y) f (x) /
112 Concave functions A function f: R N R is convex if for each x, y domain(f) and α [0,1] f (ax +(1-a)y) ³a f (x)+(1-a) f (y) f (y) Concave a f (x)+(1-a) f (y) /
113 First order convexity conditions A differentiable function f: R N R is convex if and only if for x, y domain(f) the following condition is satisfied f (y) ³ f (x)+ñf (x) t (y - x) Lower Bound (x, f (x)) f (x)+ Ñf (x) T (y - x) /
114 Second order convexity conditions A twice-differentiable function f: R N R is convex if and only if for all x, y domain(f) the Hessian is superior or equal to zero Ñ 2 f (x) ³ /
115 Properties of Convex Optimization For convex objectives over convex feasible sets, the optimum value is unique There are no local minima/maxima that are not also the global minima/maxima Any gradient-based solution will find this optimum eventually Primary problem: speed of convergence to this optimum /
116 Lagrange multiplier duality Optimization problem with constraints min f (x) x s.t. g i (x) 0 h j (x) = 0 i = { 1,..., k} j = { 1,...,l } Lagrange multipliers l i ³ 0,n Î Â L(x, l,n) = f (x)+ å l i g i (x)+ n j h j (x) The Dual function inf x L(x, l,n) = inf x k i=1 l å j=1 ì k l ï üï í f (x)+ å l i g i (x)+ ån j h j (x) ý îï i=1 j=1 þï /
117 Lagrange multiplier duality The Original optimization problem min{ sup L(x, l,n) } x The Dual optimization Property of the Dual for convex function sup l³0,n l³0,n { inf L(x, l,n) } = f (x * ) x l³0,n max{ inf L(x, l,n) } x /
118 Lagrange multiplier duality Previous Example f x, y is convex Constraint function is convex min x,y f (x, y)= x 2 + y 2 s.t. 2x + y /
119 Lagrange multiplier duality Primal system min x,y f (x, y)= x 2 + y 2 s.t. 2x + y -4 Lagrange Multiplier L = x 2 + y 2 + l(2x + y - 4) L = 2x + 2l = 0 Þ x = -l x L y = 2y + l = 0 Þ y = - l 2 Dual system max l w(l) = 5 4 l 2 + 4l s.t. l ³ 0 Property w(l*) = f (x*, y*) /
120 Lagrange multiplier duality Dual system max l w(l) = 5 4 l 2 + 4l s.t. l ³ 0 Concave function Convex function become concave function in dual problem w x = l + 4 = 0 Þ l* = /
121 Lagrange multiplier duality Primal system min x,y Evaluating f (x, y)= x 2 + y 2 s.t. 2x + y -4 w(l*) = f (x*, y*) Dual system max l w(l) = 5 4 l 2 + 4l s.t. l ³ 0 x* = - 8 5, y* = l* = 8 5 æ f (x*, y*) = - 8 ö ç è 5ø 2 æ ö ç è 5ø 2 w(l*) = - 5 æ8 ç ö 2 4 è 5ø f (x*, y*) = 16 5 w(l*) = /
Lecture 2 Optimization with equality constraints
Lecture 2 Optimization with equality constraints Constrained optimization The idea of constrained optimisation is that the choice of one variable often affects the amount of another variable that can be
More informationConvexity Theory and Gradient Methods
Convexity Theory and Gradient Methods Angelia Nedić angelia@illinois.edu ISE Department and Coordinated Science Laboratory University of Illinois at Urbana-Champaign Outline Convex Functions Optimality
More informationProgramming, numerics and optimization
Programming, numerics and optimization Lecture C-4: Constrained optimization Łukasz Jankowski ljank@ippt.pan.pl Institute of Fundamental Technological Research Room 4.32, Phone +22.8261281 ext. 428 June
More information1. Show that the rectangle of maximum area that has a given perimeter p is a square.
Constrained Optimization - Examples - 1 Unit #23 : Goals: Lagrange Multipliers To study constrained optimization; that is, the maximizing or minimizing of a function subject to a constraint (or side condition).
More informationIn other words, we want to find the domain points that yield the maximum or minimum values (extrema) of the function.
1 The Lagrange multipliers is a mathematical method for performing constrained optimization of differentiable functions. Recall unconstrained optimization of differentiable functions, in which we want
More informationA Short SVM (Support Vector Machine) Tutorial
A Short SVM (Support Vector Machine) Tutorial j.p.lewis CGIT Lab / IMSC U. Southern California version 0.zz dec 004 This tutorial assumes you are familiar with linear algebra and equality-constrained optimization/lagrange
More informationChapter 3 Numerical Methods
Chapter 3 Numerical Methods Part 1 3.1 Linearization and Optimization of Functions of Vectors 1 Problem Notation 2 Outline 3.1.1 Linearization 3.1.2 Optimization of Objective Functions 3.1.3 Constrained
More informationLecture 2 September 3
EE 381V: Large Scale Optimization Fall 2012 Lecture 2 September 3 Lecturer: Caramanis & Sanghavi Scribe: Hongbo Si, Qiaoyang Ye 2.1 Overview of the last Lecture The focus of the last lecture was to give
More informationSparse Optimization Lecture: Proximal Operator/Algorithm and Lagrange Dual
Sparse Optimization Lecture: Proximal Operator/Algorithm and Lagrange Dual Instructor: Wotao Yin July 2013 online discussions on piazza.com Those who complete this lecture will know learn the proximal
More informationEC5555 Economics Masters Refresher Course in Mathematics September Lecture 6 Optimization with equality constraints Francesco Feri
EC5555 Economics Masters Refresher Course in Mathematics September 2013 Lecture 6 Optimization with equality constraints Francesco Feri Constrained optimization The idea of constrained optimisation is
More informationLECTURE 13: SOLUTION METHODS FOR CONSTRAINED OPTIMIZATION. 1. Primal approach 2. Penalty and barrier methods 3. Dual approach 4. Primal-dual approach
LECTURE 13: SOLUTION METHODS FOR CONSTRAINED OPTIMIZATION 1. Primal approach 2. Penalty and barrier methods 3. Dual approach 4. Primal-dual approach Basic approaches I. Primal Approach - Feasible Direction
More informationSecond Midterm Exam Math 212 Fall 2010
Second Midterm Exam Math 22 Fall 2 Instructions: This is a 9 minute exam. You should work alone, without access to any book or notes. No calculators are allowed. Do not discuss this exam with anyone other
More informationLocal and Global Minimum
Local and Global Minimum Stationary Point. From elementary calculus, a single variable function has a stationary point at if the derivative vanishes at, i.e., 0. Graphically, the slope of the function
More informationComputational Methods. Constrained Optimization
Computational Methods Constrained Optimization Manfred Huber 2010 1 Constrained Optimization Unconstrained Optimization finds a minimum of a function under the assumption that the parameters can take on
More informationREVIEW I MATH 254 Calculus IV. Exam I (Friday, April 29) will cover sections
REVIEW I MATH 254 Calculus IV Exam I (Friday, April 29 will cover sections 14.1-8. 1. Functions of multivariables The definition of multivariable functions is similar to that of functions of one variable.
More informationUnconstrained Optimization Principles of Unconstrained Optimization Search Methods
1 Nonlinear Programming Types of Nonlinear Programs (NLP) Convexity and Convex Programs NLP Solutions Unconstrained Optimization Principles of Unconstrained Optimization Search Methods Constrained Optimization
More informationLecture 15: Log Barrier Method
10-725/36-725: Convex Optimization Spring 2015 Lecturer: Ryan Tibshirani Lecture 15: Log Barrier Method Scribes: Pradeep Dasigi, Mohammad Gowayyed Note: LaTeX template courtesy of UC Berkeley EECS dept.
More informationLagrange multipliers October 2013
Lagrange multipliers 14.8 14 October 2013 Example: Optimization with constraint. Example: Find the extreme values of f (x, y) = x + 2y on the ellipse 3x 2 + 4y 2 = 3. 3/2 1 1 3/2 Example: Optimization
More informationCharacterizing Improving Directions Unconstrained Optimization
Final Review IE417 In the Beginning... In the beginning, Weierstrass's theorem said that a continuous function achieves a minimum on a compact set. Using this, we showed that for a convex set S and y not
More information3.3 Optimizing Functions of Several Variables 3.4 Lagrange Multipliers
3.3 Optimizing Functions of Several Variables 3.4 Lagrange Multipliers Prof. Tesler Math 20C Fall 2018 Prof. Tesler 3.3 3.4 Optimization Math 20C / Fall 2018 1 / 56 Optimizing y = f (x) In Math 20A, we
More informationLagrange multipliers 14.8
Lagrange multipliers 14.8 14 October 2013 Example: Optimization with constraint. Example: Find the extreme values of f (x, y) = x + 2y on the ellipse 3x 2 + 4y 2 = 3. 3/2 Maximum? 1 1 Minimum? 3/2 Idea:
More informationLecture 18: March 23
0-725/36-725: Convex Optimization Spring 205 Lecturer: Ryan Tibshirani Lecture 8: March 23 Scribes: James Duyck Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have not
More informationCOMS 4771 Support Vector Machines. Nakul Verma
COMS 4771 Support Vector Machines Nakul Verma Last time Decision boundaries for classification Linear decision boundary (linear classification) The Perceptron algorithm Mistake bound for the perceptron
More informationShiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 2. Convex Optimization
Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 2 Convex Optimization Shiqian Ma, MAT-258A: Numerical Optimization 2 2.1. Convex Optimization General optimization problem: min f 0 (x) s.t., f i
More informationCPSC 340: Machine Learning and Data Mining. Robust Regression Fall 2015
CPSC 340: Machine Learning and Data Mining Robust Regression Fall 2015 Admin Can you see Assignment 1 grades on UBC connect? Auditors, don t worry about it. You should already be working on Assignment
More informationContents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.
page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5
More informationApplied Lagrange Duality for Constrained Optimization
Applied Lagrange Duality for Constrained Optimization Robert M. Freund February 10, 2004 c 2004 Massachusetts Institute of Technology. 1 1 Overview The Practical Importance of Duality Review of Convexity
More informationLecture 12: convergence. Derivative (one variable)
Lecture 12: convergence More about multivariable calculus Descent methods Backtracking line search More about convexity (first and second order) Newton step Example 1: linear programming (one var., one
More informationLecture 7: Support Vector Machine
Lecture 7: Support Vector Machine Hien Van Nguyen University of Houston 9/28/2017 Separating hyperplane Red and green dots can be separated by a separating hyperplane Two classes are separable, i.e., each
More informationIntroduction to Modern Control Systems
Introduction to Modern Control Systems Convex Optimization, Duality and Linear Matrix Inequalities Kostas Margellos University of Oxford AIMS CDT 2016-17 Introduction to Modern Control Systems November
More informationPRIMAL-DUAL INTERIOR POINT METHOD FOR LINEAR PROGRAMMING. 1. Introduction
PRIMAL-DUAL INTERIOR POINT METHOD FOR LINEAR PROGRAMMING KELLER VANDEBOGERT AND CHARLES LANNING 1. Introduction Interior point methods are, put simply, a technique of optimization where, given a problem
More informationSolution Methods Numerical Algorithms
Solution Methods Numerical Algorithms Evelien van der Hurk DTU Managment Engineering Class Exercises From Last Time 2 DTU Management Engineering 42111: Static and Dynamic Optimization (6) 09/10/2017 Class
More informationMATH 19520/51 Class 10
MATH 19520/51 Class 10 Minh-Tam Trinh University of Chicago 2017-10-16 1 Method of Lagrange multipliers. 2 Examples of Lagrange multipliers. The Problem The ingredients: 1 A set of parameters, say x 1,...,
More informationInverse and Implicit functions
CHAPTER 3 Inverse and Implicit functions. Inverse Functions and Coordinate Changes Let U R d be a domain. Theorem. (Inverse function theorem). If ϕ : U R d is differentiable at a and Dϕ a is invertible,
More informationConvexization in Markov Chain Monte Carlo
in Markov Chain Monte Carlo 1 IBM T. J. Watson Yorktown Heights, NY 2 Department of Aerospace Engineering Technion, Israel August 23, 2011 Problem Statement MCMC processes in general are governed by non
More informationCMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro
CMU-Q 15-381 Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization Teacher: Gianni A. Di Caro GLOBAL FUNCTION OPTIMIZATION Find the global maximum of the function f x (and
More informationEllipsoid Algorithm :Algorithms in the Real World. Ellipsoid Algorithm. Reduction from general case
Ellipsoid Algorithm 15-853:Algorithms in the Real World Linear and Integer Programming II Ellipsoid algorithm Interior point methods First polynomial-time algorithm for linear programming (Khachian 79)
More informationChapter II. Linear Programming
1 Chapter II Linear Programming 1. Introduction 2. Simplex Method 3. Duality Theory 4. Optimality Conditions 5. Applications (QP & SLP) 6. Sensitivity Analysis 7. Interior Point Methods 1 INTRODUCTION
More informationData Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs)
Data Mining: Concepts and Techniques Chapter 9 Classification: Support Vector Machines 1 Support Vector Machines (SVMs) SVMs are a set of related supervised learning methods used for classification Based
More informationEXTRA-CREDIT PROBLEMS ON SURFACES, MULTIVARIABLE FUNCTIONS AND PARTIAL DERIVATIVES
EXTRA-CREDIT PROBLEMS ON SURFACES, MULTIVARIABLE FUNCTIONS AND PARTIAL DERIVATIVES A. HAVENS These problems are for extra-credit, which is counted against lost points on quizzes or WebAssign. You do not
More informationConstrained Optimization
Constrained Optimization Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Constrained Optimization 1 / 46 EC2040 Topic 5 - Constrained Optimization Reading 1 Chapters 12.1-12.3
More information14.5 Directional Derivatives and the Gradient Vector
14.5 Directional Derivatives and the Gradient Vector 1. Directional Derivatives. Recall z = f (x, y) and the partial derivatives f x and f y are defined as f (x 0 + h, y 0 ) f (x 0, y 0 ) f x (x 0, y 0
More informationLecture 25 Nonlinear Programming. November 9, 2009
Nonlinear Programming November 9, 2009 Outline Nonlinear Programming Another example of NLP problem What makes these problems complex Scalar Function Unconstrained Problem Local and global optima: definition,
More informationLecture 19: Convex Non-Smooth Optimization. April 2, 2007
: Convex Non-Smooth Optimization April 2, 2007 Outline Lecture 19 Convex non-smooth problems Examples Subgradients and subdifferentials Subgradient properties Operations with subgradients and subdifferentials
More informationToday. Golden section, discussion of error Newton s method. Newton s method, steepest descent, conjugate gradient
Optimization Last time Root finding: definition, motivation Algorithms: Bisection, false position, secant, Newton-Raphson Convergence & tradeoffs Example applications of Newton s method Root finding in
More informationIntroduction to Machine Learning
Introduction to Machine Learning Maximum Margin Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationModern Methods of Data Analysis - WS 07/08
Modern Methods of Data Analysis Lecture XV (04.02.08) Contents: Function Minimization (see E. Lohrmann & V. Blobel) Optimization Problem Set of n independent variables Sometimes in addition some constraints
More informationWinter 2012 Math 255 Section 006. Problem Set 7
Problem Set 7 1 a) Carry out the partials with respect to t and x, substitute and check b) Use separation of varibles, i.e. write as dx/x 2 = dt, integrate both sides and observe that the solution also
More informationConvex Optimization and Machine Learning
Convex Optimization and Machine Learning Mengliu Zhao Machine Learning Reading Group School of Computing Science Simon Fraser University March 12, 2014 Mengliu Zhao SFU-MLRG March 12, 2014 1 / 25 Introduction
More information5 Day 5: Maxima and minima for n variables.
UNIVERSITAT POMPEU FABRA INTERNATIONAL BUSINESS ECONOMICS MATHEMATICS III. Pelegrí Viader. 2012-201 Updated May 14, 201 5 Day 5: Maxima and minima for n variables. The same kind of first-order and second-order
More informationMathematical Programming and Research Methods (Part II)
Mathematical Programming and Research Methods (Part II) 4. Convexity and Optimization Massimiliano Pontil (based on previous lecture by Andreas Argyriou) 1 Today s Plan Convex sets and functions Types
More informationSection 4: Extreme Values & Lagrange Multipliers.
Section 4: Extreme Values & Lagrange Multipliers. Compiled by Chris Tisdell S1: Motivation S2: What are local maxima & minima? S3: What is a critical point? S4: Second derivative test S5: Maxima and Minima
More informationConvex Optimization MLSS 2015
Convex Optimization MLSS 2015 Constantine Caramanis The University of Texas at Austin The Optimization Problem minimize : f (x) subject to : x X. The Optimization Problem minimize : f (x) subject to :
More informationLagrange multipliers. Contents. Introduction. From Wikipedia, the free encyclopedia
Lagrange multipliers From Wikipedia, the free encyclopedia In mathematical optimization problems, Lagrange multipliers, named after Joseph Louis Lagrange, is a method for finding the local extrema of a
More informationSupport Vector Machines. James McInerney Adapted from slides by Nakul Verma
Support Vector Machines James McInerney Adapted from slides by Nakul Verma Last time Decision boundaries for classification Linear decision boundary (linear classification) The Perceptron algorithm Mistake
More informationConvexity and Optimization
Convexity and Optimization Richard Lusby Department of Management Engineering Technical University of Denmark Today s Material Extrema Convex Function Convex Sets Other Convexity Concepts Unconstrained
More information16.410/413 Principles of Autonomy and Decision Making
16.410/413 Principles of Autonomy and Decision Making Lecture 17: The Simplex Method Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology November 10, 2010 Frazzoli (MIT)
More informationISM206 Lecture, April 26, 2005 Optimization of Nonlinear Objectives, with Non-Linear Constraints
ISM206 Lecture, April 26, 2005 Optimization of Nonlinear Objectives, with Non-Linear Constraints Instructor: Kevin Ross Scribe: Pritam Roy May 0, 2005 Outline of topics for the lecture We will discuss
More informationMEI Desmos Tasks for AS Pure
Task 1: Coordinate Geometry Intersection of a line and a curve 1. Add a quadratic curve, e.g. y = x² 4x + 1 2. Add a line, e.g. y = x 3 3. Select the points of intersection of the line and the curve. What
More informationKernel Methods & Support Vector Machines
& Support Vector Machines & Support Vector Machines Arvind Visvanathan CSCE 970 Pattern Recognition 1 & Support Vector Machines Question? Draw a single line to separate two classes? 2 & Support Vector
More information30. Constrained Optimization
30. Constrained Optimization The graph of z = f(x, y) is represented by a surface in R 3. Normally, x and y are chosen independently of one another so that one may roam over the entire surface of f (within
More informationR f da (where da denotes the differential of area dxdy (or dydx)
Math 28H Topics for the second exam (Technically, everything covered on the first exam, plus) Constrained Optimization: Lagrange Multipliers Most optimization problems that arise naturally are not unconstrained;
More informationPhysics 235 Chapter 6. Chapter 6 Some Methods in the Calculus of Variations
Chapter 6 Some Methods in the Calculus of Variations In this Chapter we focus on an important method of solving certain problems in Classical Mechanics. In many problems we need to determine how a system
More informationNAME: Section # SSN: X X X X
Math 155 FINAL EXAM A May 5, 2003 NAME: Section # SSN: X X X X Question Grade 1 5 (out of 25) 6 10 (out of 25) 11 (out of 20) 12 (out of 20) 13 (out of 10) 14 (out of 10) 15 (out of 16) 16 (out of 24)
More informationMATH2111 Higher Several Variable Calculus Lagrange Multipliers
MATH2111 Higher Several Variable Calculus Lagrange Multipliers Dr. Jonathan Kress School of Mathematics and Statistics University of New South Wales Semester 1, 2016 [updated: February 29, 2016] JM Kress
More informationConstrained extrema of two variables functions
Constrained extrema of two variables functions Apellidos, Nombre: Departamento: Centro: Alicia Herrero Debón aherrero@mat.upv.es) Departamento de Matemática Aplicada Instituto de Matemática Multidisciplnar
More informationINTRODUCTION TO LINEAR AND NONLINEAR PROGRAMMING
INTRODUCTION TO LINEAR AND NONLINEAR PROGRAMMING DAVID G. LUENBERGER Stanford University TT ADDISON-WESLEY PUBLISHING COMPANY Reading, Massachusetts Menlo Park, California London Don Mills, Ontario CONTENTS
More informationMath 113 Calculus III Final Exam Practice Problems Spring 2003
Math 113 Calculus III Final Exam Practice Problems Spring 23 1. Let g(x, y, z) = 2x 2 + y 2 + 4z 2. (a) Describe the shapes of the level surfaces of g. (b) In three different graphs, sketch the three cross
More information(1) Given the following system of linear equations, which depends on a parameter a R, 3x y + 5z = 2 4x + y + (a 2 14)z = a + 2
(1 Given the following system of linear equations, which depends on a parameter a R, x + 2y 3z = 4 3x y + 5z = 2 4x + y + (a 2 14z = a + 2 (a Classify the system of equations depending on the values of
More informationConstrained and Unconstrained Optimization
Constrained and Unconstrained Optimization Carlos Hurtado Department of Economics University of Illinois at Urbana-Champaign hrtdmrt2@illinois.edu Oct 10th, 2017 C. Hurtado (UIUC - Economics) Numerical
More informationIntroduction to Optimization
Introduction to Optimization Constrained Optimization Marc Toussaint U Stuttgart Constrained Optimization General constrained optimization problem: Let R n, f : R n R, g : R n R m, h : R n R l find min
More informationUNIT 2 LINEAR PROGRAMMING PROBLEMS
UNIT 2 LINEAR PROGRAMMING PROBLEMS Structure 2.1 Introduction Objectives 2.2 Linear Programming Problem (LPP) 2.3 Mathematical Formulation of LPP 2.4 Graphical Solution of Linear Programming Problems 2.5
More informationLecture 10: SVM Lecture Overview Support Vector Machines The binary classification problem
Computational Learning Theory Fall Semester, 2012/13 Lecture 10: SVM Lecturer: Yishay Mansour Scribe: Gitit Kehat, Yogev Vaknin and Ezra Levin 1 10.1 Lecture Overview In this lecture we present in detail
More informationConvex Optimization. Erick Delage, and Ashutosh Saxena. October 20, (a) (b) (c)
Convex Optimization (for CS229) Erick Delage, and Ashutosh Saxena October 20, 2006 1 Convex Sets Definition: A set G R n is convex if every pair of point (x, y) G, the segment beteen x and y is in A. More
More informationModule 1 Lecture Notes 2. Optimization Problem and Model Formulation
Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization
More informationConvexity and Optimization
Convexity and Optimization Richard Lusby DTU Management Engineering Class Exercises From Last Time 2 DTU Management Engineering 42111: Static and Dynamic Optimization (3) 18/09/2017 Today s Material Extrema
More informationLecture 2: August 31
10-725/36-725: Convex Optimization Fall 2016 Lecture 2: August 31 Lecturer: Lecturer: Ryan Tibshirani Scribes: Scribes: Lidan Mu, Simon Du, Binxuan Huang 2.1 Review A convex optimization problem is of
More informationNeural Networks: Optimization Part 1. Intro to Deep Learning, Fall 2018
Neural Networks: Optimization Part 1 Intro to Deep Learning, Fall 2018 1 Story so far Neural networks are universal approximators Can model any odd thing Provided they have the right architecture We must
More informationConstrained Optimization and Lagrange Multipliers
Constrained Optimization and Lagrange Multipliers MATH 311, Calculus III J. Robert Buchanan Department of Mathematics Fall 2011 Constrained Optimization In the previous section we found the local or absolute
More informationSupport Vector Machines.
Support Vector Machines srihari@buffalo.edu SVM Discussion Overview 1. Overview of SVMs 2. Margin Geometry 3. SVM Optimization 4. Overlapping Distributions 5. Relationship to Logistic Regression 6. Dealing
More informationPractice problems from old exams for math 233 William H. Meeks III December 21, 2009
Practice problems from old exams for math 233 William H. Meeks III December 21, 2009 Disclaimer: Your instructor covers far more materials that we can possibly fit into a four/five questions exams. These
More informationSupport Vector Machines.
Support Vector Machines srihari@buffalo.edu SVM Discussion Overview. Importance of SVMs. Overview of Mathematical Techniques Employed 3. Margin Geometry 4. SVM Training Methodology 5. Overlapping Distributions
More informationDavid G. Luenberger Yinyu Ye. Linear and Nonlinear. Programming. Fourth Edition. ö Springer
David G. Luenberger Yinyu Ye Linear and Nonlinear Programming Fourth Edition ö Springer Contents 1 Introduction 1 1.1 Optimization 1 1.2 Types of Problems 2 1.3 Size of Problems 5 1.4 Iterative Algorithms
More informationThe Alternating Direction Method of Multipliers
The Alternating Direction Method of Multipliers With Adaptive Step Size Selection Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein October 8, 2015 1 / 30 Introduction Presentation Outline 1 Convex
More information. Tutorial Class V 3-10/10/2012 First Order Partial Derivatives;...
Tutorial Class V 3-10/10/2012 1 First Order Partial Derivatives; Tutorial Class V 3-10/10/2012 1 First Order Partial Derivatives; 2 Application of Gradient; Tutorial Class V 3-10/10/2012 1 First Order
More informationLecture 2: August 29, 2018
10-725/36-725: Convex Optimization Fall 2018 Lecturer: Ryan Tibshirani Lecture 2: August 29, 2018 Scribes: Adam Harley Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have
More informationLecture 19: November 5
0-725/36-725: Convex Optimization Fall 205 Lecturer: Ryan Tibshirani Lecture 9: November 5 Scribes: Hyun Ah Song Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have not
More informationThe big picture and a math review
The big picture and a math review Intermediate Micro Topic 1 Mathematical appendix of Varian The big picture Microeconomics Study of decision-making under scarcity Individuals households firms investors
More informationData Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
Data Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Search & Optimization Search and Optimization method deals with
More informationMultivariate Calculus Review Problems for Examination Two
Multivariate Calculus Review Problems for Examination Two Note: Exam Two is on Thursday, February 28, class time. The coverage is multivariate differential calculus and double integration: sections 13.3,
More informationLagrange Multipliers. Lagrange Multipliers. Lagrange Multipliers. Lagrange Multipliers. Lagrange Multipliers. Lagrange Multipliers
In this section we present Lagrange s method for maximizing or minimizing a general function f(x, y, z) subject to a constraint (or side condition) of the form g(x, y, z) = k. Figure 1 shows this curve
More informationComputational Optimization. Constrained Optimization Algorithms
Computational Optimization Constrained Optimization Algorithms Same basic algorithms Repeat Determine descent direction Determine step size Take a step Until Optimal But now must consider feasibility,
More informationMATH Lagrange multipliers in 3 variables Fall 2016
MATH 20550 Lagrange multipliers in 3 variables Fall 2016 1. The one constraint they The problem is to find the extrema of a function f(x, y, z) subject to the constraint g(x, y, z) = c. The book gives
More informationGraphs of Exponential
Graphs of Exponential Functions By: OpenStaxCollege As we discussed in the previous section, exponential functions are used for many realworld applications such as finance, forensics, computer science,
More informationCSE 554 Lecture 6: Fairing and Simplification
CSE 554 Lecture 6: Fairing and Simplification Fall 2012 CSE554 Fairing and simplification Slide 1 Review Iso-contours in grayscale images and volumes Piece-wise linear representations Polylines (2D) and
More informationMEI GeoGebra Tasks for AS Pure
Task 1: Coordinate Geometry Intersection of a line and a curve 1. Add a quadratic curve, e.g. y = x 2 4x + 1 2. Add a line, e.g. y = x 3 3. Use the Intersect tool to find the points of intersection of
More informationNonlinear Programming
Nonlinear Programming SECOND EDITION Dimitri P. Bertsekas Massachusetts Institute of Technology WWW site for book Information and Orders http://world.std.com/~athenasc/index.html Athena Scientific, Belmont,
More informationConvex Programs. COMPSCI 371D Machine Learning. COMPSCI 371D Machine Learning Convex Programs 1 / 21
Convex Programs COMPSCI 371D Machine Learning COMPSCI 371D Machine Learning Convex Programs 1 / 21 Logistic Regression! Support Vector Machines Support Vector Machines (SVMs) and Convex Programs SVMs are
More informationMultivariate Numerical Optimization
Jianxin Wei March 1, 2013 Outline 1 Graphics for Function of Two Variables 2 Nelder-Mead Simplex Method 3 Steepest Descent Method 4 Newton s Method 5 Quasi-Newton s Method 6 Built-in R Function 7 Linear
More informationSYSTEMS OF NONLINEAR EQUATIONS
SYSTEMS OF NONLINEAR EQUATIONS Widely used in the mathematical modeling of real world phenomena. We introduce some numerical methods for their solution. For better intuition, we examine systems of two
More information