LECTURE 18 - OPTIMIZATION

CHRIS JOHNSON

Date: February 28, 2014.

Abstract. In this lecture we'll extend the optimization techniques you learned in your first-semester calculus class to optimize functions of multiple variables. In particular, we'll see how the tangent planes of a function's graph allow us to search for local maxima and minima. We'll also describe a second-derivative test for functions of two variables, and discuss some conditions on the domain of a multivariable function that guarantee the existence of extrema.

1. Maxima & Minima

Given a function of two variables, $f(x, y)$, we say that the point $(x_0, y_0)$ is a local maximum of $f$ if there exists some $\epsilon > 0$ such that $f(x_0, y_0) \geq f(x, y)$ for all $(x, y)$ within distance $\epsilon$ of $(x_0, y_0)$. (For this lecture we will assume all functions are differentiable unless stated otherwise.) Graphically this means that the point $(x_0, y_0, f(x_0, y_0))$ occurs at the top of a hill on the surface $z = f(x, y)$ (or the surface could be flat near this point).

A local minimum is defined the same way, except we require $f(x_0, y_0) \leq f(x, y)$ for all $(x, y)$ within distance $\epsilon$ of $(x_0, y_0)$. So instead of being at the top of a hill, $(x_0, y_0, f(x_0, y_0))$ is at the bottom of a valley.

If $f(x_0, y_0) \geq f(x, y)$ for all $(x, y)$ in the domain of $f$, then we say that $(x_0, y_0)$ is a global maximum. If $f(x_0, y_0) \leq f(x, y)$ for all $(x, y)$ in the domain of $f$, then we say that $(x_0, y_0)$ is a global minimum. Sometimes the words absolute maximum and absolute minimum are used instead. Graphically, global maxima occur at the peaks of the tallest hills, and global minima at the bottoms of the deepest valleys.

Notice that local and global extrema are not unique in general, and that global extrema are also local extrema.

The observation that extrema occur at hills and valleys suggests a nice way of looking for extrema: find all of the hills and valleys. Geometrically, this means we want to find all of the places where the tangent plane to the graph of the function is horizontal (parallel to the $xy$-plane). This is the same thing as saying that the normal vector of this tangent plane
points straight up, or straight down. Since we know from the previous lecture on tangent planes that the normal vector of the tangent plane of $z = f(x, y)$ at the point $(x_0, y_0)$ is given by $\langle f_x(x_0, y_0), f_y(x_0, y_0), -1 \rangle$, this is equivalent to asking that $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$.

Theorem 1.1. If $f(x, y)$ is differentiable and $(x_0, y_0)$ is a local extremum, then $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$.

It turns out that in order to guarantee a point is a global extremum, and not just a local one, we'll need a few more conditions on the function. For this reason we'll only search for local maxima and minima for the time being.

If a function isn't differentiable at a point in its domain (which effectively means the function's graph has a sharp point), it could be that this non-differentiable point is an extremum of the function. As a simple example, consider the cone $z = \sqrt{x^2 + y^2}$. The function isn't differentiable at the point $(0, 0)$, but this is clearly a local (in fact global) minimum of the function.

We will call a point $(x_0, y_0)$ a critical point of $f(x, y)$ if any of the following conditions are met:
(1) $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$, or
(2) $f_x(x_0, y_0)$ does not exist, or
(3) $f_y(x_0, y_0)$ does not exist.

Example 1.1. Find all of the critical points of $f(x, y) = x^2 + xy + y^2 + y$.

First we calculate the partial derivatives,
\[ f_x(x, y) = 2x + y, \qquad f_y(x, y) = x + 2y + 1. \]
These partial derivatives exist, and are continuous, for all $(x, y)$ since they're polynomials. So to find the critical points, we now need to look for the solutions to the system
\[ 2x + y = 0, \qquad x + 2y + 1 = 0. \]
Moving the 1 in the second equation to the right-hand side, this is equivalent to solving the system
\[ 2x + y = 0, \qquad x + 2y = -1. \]
Let's multiply the second equation by 2 and subtract it from the first:
\[ 2x + y - 2(x + 2y) = -3y. \]
Since the left-hand side of the first equation equals $0$ and the left-hand side of the second equation equals $-1$, we must have
\[ -3y = 0 - 2(-1) = 2. \]
Hence $y = -2/3$. Plugging $y = -2/3$ back into the first equation, we find
\[ 2x - 2/3 = 0 \implies 2x = 2/3 \implies x = 1/3. \]
Thus $(1/3, -2/3)$ is the only critical point of the function.

Example 1.2. Find all of the critical points of the function $f(x, y) = \sin(x) \sin(y)$.

Taking partial derivatives we have
\[ f_x(x, y) = \cos(x) \sin(y), \qquad f_y(x, y) = \sin(x) \cos(y). \]
We need the $x$'s and $y$'s which simultaneously make both of these quantities zero. Notice
\[ \cos\theta = 0 \text{ if } \theta = (2k + 1)\tfrac{\pi}{2} \text{ for } k \in \mathbb{Z}, \qquad \sin\theta = 0 \text{ if } \theta = k\pi \text{ for } k \in \mathbb{Z}. \]
Since $\sin$ and $\cos$ never vanish at the same angle, both partial derivatives vanish exactly when either $\cos(x) = \cos(y) = 0$ or $\sin(x) = \sin(y) = 0$. So the points where $f_x(x, y) = f_y(x, y) = 0$ are
\[ \{((2k + 1)\tfrac{\pi}{2}, (2j + 1)\tfrac{\pi}{2})\} \cup \{(k\pi, j\pi)\} = \{(m\tfrac{\pi}{2}, n\tfrac{\pi}{2}) \mid m, n \in \mathbb{Z}, \text{ with } m \text{ and } n \text{ both odd, or both even}\}. \]

Once we know where all of the critical points of $f$ are, we need some way of determining whether these points are maxima, minima, or neither. If the function $f(x, y)$ has continuous second-order partial derivatives, then we can use the multivariable version of the second derivative test.

Theorem 1.2 (Second Derivative Test). If $f(x, y)$ has continuous second-order partial derivatives and if $(x_0, y_0)$ is a critical point of $f$, then define
\[ D(x_0, y_0) = \det \begin{pmatrix} f_{xx}(x_0, y_0) & f_{xy}(x_0, y_0) \\ f_{yx}(x_0, y_0) & f_{yy}(x_0, y_0) \end{pmatrix} = f_{xx}(x_0, y_0) f_{yy}(x_0, y_0) - f_{xy}(x_0, y_0) f_{yx}(x_0, y_0) = f_{xx}(x_0, y_0) f_{yy}(x_0, y_0) - f_{xy}(x_0, y_0)^2, \]
where the last equality uses $f_{xy} = f_{yx}$, which holds because the second-order partial derivatives are continuous. Then
(1) If $D(x_0, y_0) > 0$ and $f_{xx}(x_0, y_0) > 0$, then $(x_0, y_0)$ is a local minimum.
(2) If $D(x_0, y_0) > 0$ and $f_{xx}(x_0, y_0) < 0$, then $(x_0, y_0)$ is a local maximum.
(3) If $D(x_0, y_0) < 0$, then $(x_0, y_0)$ is neither a maximum nor a minimum. (In this case we say that $(x_0, y_0)$ is a saddle point.)
(4) If $D(x_0, y_0) = 0$, then the test is inconclusive.

Remark. We won't prove the second derivative test, but will mention the underlying idea. Recall that the second derivative of a function of a single variable measures the function's concavity at a point. The three-dimensional analogue of concavity is called curvature. Curvature at a point on a surface is determined by looking at the curvatures of curves on the surface which pass through the point. The largest and smallest curvatures of these curves (called the principal curvatures) are then multiplied together, and this product is the curvature of the surface. A surface has positive curvature if the principal curvatures have the same sign; negative curvature if the principal curvatures have opposite signs; and zero curvature if one of the principal curvatures is zero (in this case the surface is flat in at least one direction near the point). Intuitively, surfaces of positive curvature look like a bowl; surfaces of negative curvature look like a Pringles chip; and surfaces of zero curvature contain a straight line.

The $D$ that appears in the second-derivative test is (essentially) the curvature of the surface. So when the curvature is positive ($D > 0$) we have a bowl, and it's only a question of whether the bowl opens up or down ($f_{xx}(x_0, y_0)$ is positive or negative, respectively).

Example 1.3. Find the maxima and minima of the function
\[ f(x, y) = y^3 + 3x^2 y - 6x^2 - 6y^2 + 2. \]
First we have to find the critical points, which means calculating the partial derivatives.
\[ f_x(x, y) = 6xy - 12x = 6x(y - 2), \qquad f_y(x, y) = 3y^2 + 3x^2 - 12y = 3(y^2 - 4y + x^2). \]
Notice these partial derivatives exist (and are continuous) everywhere, so the only critical points are the solutions to the system $f_x(x, y) = f_y(x, y) = 0$. If $f_x(x, y) = 0$, then either $x = 0$ or $y = 2$. We'll plug each of these into $y^2 - 4y + x^2 = 0$.

If $x = 0$, then $y^2 - 4y + x^2 = 0$ becomes $y^2 - 4y = y(y - 4) = 0$, and so $y = 0$ or $y = 4$. Thus $(0, 0)$ and $(0, 4)$ are two of our critical points.

If $y = 2$, then $y^2 - 4y + x^2 = 0$ becomes $-4 + x^2 = 0$, which means $x^2 = 4$, so $x = \pm 2$. Hence $(2, 2)$ and $(-2, 2)$ are our other critical points.
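As a quick numerical cross-check of the algebra above, the following Python sketch evaluates the two partial derivatives of Example 1.3 at the four candidate points and confirms that both vanish there. The names `f_x`, `f_y`, and `is_critical` are illustrative choices rather than anything from the lecture, and the tolerance `tol` is an assumption meant only to absorb floating-point rounding.

```python
def f_x(x, y):
    """Partial derivative in x of f(x, y) = y^3 + 3x^2 y - 6x^2 - 6y^2 + 2."""
    return 6 * x * y - 12 * x


def f_y(x, y):
    """Partial derivative in y of the same function."""
    return 3 * y**2 + 3 * x**2 - 12 * y


def is_critical(x, y, tol=1e-12):
    """Return True when both partial derivatives vanish (up to tol) at (x, y)."""
    return abs(f_x(x, y)) <= tol and abs(f_y(x, y)) <= tol


# The four candidate points found by hand in Example 1.3.
candidates = [(0, 0), (0, 4), (2, 2), (-2, 2)]

for point in candidates:
    print(point, is_critical(*point))
```

A check like this only confirms that the listed points are critical; it does not show that the list is complete, which is what the case analysis on $x = 0$ and $y = 2$ establishes.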
We have four critical points: $(0, 0)$, $(0, 4)$, $(2, 2)$, and $(-2, 2)$. To determine which are maxima and which are minima, we apply our second derivative test, which means we need to calculate $D(x, y)$ at each point. In our situation,
\[ D(x, y) = \det \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} = \det \begin{pmatrix} 6y - 12 & 6x \\ 6x & 6y - 12 \end{pmatrix} = (6y - 12)^2 - 36x^2. \]
Now we calculate this value for each of our critical points.

For $(0, 0)$ we have $D(0, 0) = (-12)^2 - 36 \cdot 0^2 = 144$. So $D > 0$ and we do have a local extremum. To see if it's a local max or a local min, we calculate $f_{xx}(0, 0) = -12 < 0$, so $(0, 0)$ is a local maximum.

For $(0, 4)$ we have $D(0, 4) = (24 - 12)^2 - 36 \cdot 0^2 = 144$, and $f_{xx}(0, 4) = 12 > 0$, so $(0, 4)$ is a local minimum.

For $(2, 2)$ we have $D(2, 2) = (12 - 12)^2 - 36 \cdot 4 = -144$, and so $(2, 2)$ is a saddle point.

For $(-2, 2)$ we have $D(-2, 2) = (12 - 12)^2 - 36 \cdot 4 = -144$, and so $(-2, 2)$ is also a saddle point.

2. Finding Global Extrema

In general a local extremum doesn't have to be a global extremum. You might like to say that the largest local maximum is a global maximum, and the smallest local minimum is a global minimum, but this isn't necessarily the case.

[Insert picture here.]

What we'd like is some condition that will guarantee that a function does in fact have global extrema, and furthermore that the largest local maximum is a global maximum. Again, this doesn't have to happen in general, so we're looking for special cases where it does.

What prevents us from making such a statement is that the domain of a function could be unbounded, and the outputs of the function could become arbitrarily large. So what we'd like to do is to consider functions whose domain is small enough that this sort of thing can't happen. Unfortunately this is a little bit more involved than you might think, so first we need to make a few definitions.

We'll say that a subset $S \subseteq \mathbb{R}^2$ of the plane is bounded if points of $S$ can't become arbitrarily far away from the origin.
Equivalently, all of S can be fit into a circle of finite radius.
A point $P$ is called a boundary point of a set $S \subseteq \mathbb{R}^2$ if every disc centered at $P$ contains both points that are inside of $S$ and points that are outside of $S$. The collection of all boundary points of $S$ is called the boundary of $S$ and denoted $\partial S$.

Finally we'll say that a set $S$ which contains all of its boundary points is closed. For example, the set $S = \{(x, y) \mid x^2 + y^2 \leq 1\}$ is closed, but the set $S = \{(x, y) \mid x^2 + y^2 < 1\}$ is not closed.

Remark. If the complement of $S$ is closed (that is, if the collection of all points that are in $\mathbb{R}^2$ but not in $S$ is closed), then we say that $S$ is open. Equivalently, for every point in $S$, we can put a small disc around that point, and the disc lies entirely in $S$. Notice that there are sets that are neither open nor closed, as well as sets that are both open and closed.

If a set is both closed and bounded, we say the set is compact. Intuitively, compact sets are small; in particular, they are small enough to ensure that any continuous function whose domain is compact has global extrema.

Theorem 2.1 (Extreme Value Theorem). If $f(x, y)$ is a continuous function, and if the domain of $f$ is compact, then $f$ has both a global maximum and a global minimum.
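Theorem 2.1 guarantees that the extrema exist but does not say where they are. One rough way to get a feel for this on a closed rectangle is to sample the function on a grid that includes the boundary and record the smallest and largest sampled values. The Python sketch below does this; the helper name `grid_extrema`, the sample function $x^2 + y^2$, and the grid resolution `n` are all assumptions made for illustration, and in general a grid search only approximates the true extrema.

```python
def grid_extrema(f, x_min, x_max, y_min, y_max, n=10):
    """Sample f on an (n+1) x (n+1) grid covering the closed rectangle
    [x_min, x_max] x [y_min, y_max], boundary points included.

    Returns ((min_value, min_point), (max_value, max_point)).
    """
    samples = []
    for i in range(n + 1):
        x = x_min + (x_max - x_min) * i / n
        for j in range(n + 1):
            y = y_min + (y_max - y_min) * j / n
            samples.append((f(x, y), (x, y)))
    # Tuples compare by value first, so min/max pick the extreme sampled values.
    return min(samples), max(samples)


def paraboloid(x, y):
    """Illustrative sample function, not taken from the lecture."""
    return x * x + y * y


smallest, largest = grid_extrema(paraboloid, -1, 1, -1, 1, n=10)
print("sampled minimum:", smallest)
print("sampled maximum:", largest)
```

For this sample function the sampled minimum is 0 at the origin and the sampled maximum is 2 at a corner of the square, which agree with the true extrema. Note that the maximum sits on the boundary, which is why it matters that the domain is closed.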
3. Exercises

Exercise 1. Find the global maxima and minima of the function
\[ f(x, y) = 6x^2 y - 3x^2 - 12y^2 + 7 \]
on the rectangle $[-1, 1] \times [-1, 1]$.

Exercise 2. Explain why the function $f(x, y) = \frac{1 + x^3}{1 - y}$ does not have a global maximum on the disc $C = \{(x, y) \mid x^2 + y^2 < 1\}$. Note that $f(x, y)$ is defined on all of $C$. Does this contradict Theorem 2.1?