Outline: 1. Chain rules; 2. Directional derivative; 3. Gradient vector field; 4. Most rapid increase; 5. Implicit function theorem and implicit differentiation; 6. Lagrange multipliers; 7. Second derivative test.
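Theorem. Suppose that $w = f(x, y, z)$ is a differentiable function, where $x = x(u, v)$, $y = y(u, v)$ and $z = z(u, v)$ are differentiable functions of $u$ and $v$. Then the composite function $w(u, v) = f(x(u, v), y(u, v), z(u, v))$ is differentiable in $u$ and $v$, and its partial derivatives are given by
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial u}, \qquad \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial v}.$$
Remark. This formula is very important in the theory of surface integrals.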
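Theorem (Chain Rule for Coordinate Changes). Suppose that $S = f(x, y, z)$ is a differentiable function, where $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ are differentiable functions of the variables $u$, $v$ and $w$. Then the composite function $S(u, v, w) = f(x(u, v, w), y(u, v, w), z(u, v, w))$ is differentiable in $u$, $v$ and $w$, and its partial derivatives are given by
$$\frac{\partial S}{\partial u} = \frac{\partial S}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial S}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial S}{\partial z}\frac{\partial z}{\partial u},$$
$$\frac{\partial S}{\partial v} = \frac{\partial S}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial S}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial S}{\partial z}\frac{\partial z}{\partial v},$$
$$\frac{\partial S}{\partial w} = \frac{\partial S}{\partial x}\frac{\partial x}{\partial w} + \frac{\partial S}{\partial y}\frac{\partial y}{\partial w} + \frac{\partial S}{\partial z}\frac{\partial z}{\partial w}.$$
Remark. This formula is very important in the theory of inverse functions and in integration theory.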
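Example. In spherical coordinates, the parameters $(\rho, \theta, \phi)$ represent $(x, y, z)$ as follows: $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$, $z = \rho \cos\phi$, with $\rho \ge 0$, $0 \le \theta \le 2\pi$ and $0 \le \phi \le \pi$. Define $S(x, y, z) = \sqrt{x^2 + y^2 + z^2}$. Evaluate the partial derivative $\partial S/\partial \rho$ in two ways.
Solution. (i) Since $S(\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi) = \rho$, we have $\partial S/\partial \rho = 1$ for any choice of the parameters involved.
(ii) The second method is to apply the chain rule. We have
$$\frac{\partial S}{\partial x} = \frac{x}{\sqrt{x^2+y^2+z^2}} = \sin\phi\cos\theta, \qquad \frac{\partial x}{\partial \rho} = \frac{\partial}{\partial \rho}(\rho \sin\phi\cos\theta) = \sin\phi\cos\theta,$$
$$\frac{\partial S}{\partial y} = \frac{y}{\sqrt{x^2+y^2+z^2}} = \sin\phi\sin\theta, \qquad \frac{\partial y}{\partial \rho} = \frac{\partial}{\partial \rho}(\rho \sin\phi\sin\theta) = \sin\phi\sin\theta,$$
$$\frac{\partial S}{\partial z} = \frac{z}{\sqrt{x^2+y^2+z^2}} = \cos\phi, \qquad \frac{\partial z}{\partial \rho} = \frac{\partial}{\partial \rho}(\rho \cos\phi) = \cos\phi.$$
Therefore
$$\frac{\partial S}{\partial \rho} = \frac{\partial S}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial S}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial S}{\partial z}\frac{\partial z}{\partial \rho} = (\sin\phi\cos\theta)^2 + (\sin\phi\sin\theta)^2 + (\cos\phi)^2 = \sin^2\phi\,(\cos^2\theta + \sin^2\theta) + \cos^2\phi = 1.$$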
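The two methods in this example can also be compared numerically. The following is a minimal sketch using only the Python standard library; the function names (`to_cartesian`, `dS_drho_chain`, `dS_drho_numeric`) and the sample point are illustrative choices, not part of the notes.

```python
import math

def S(x, y, z):
    # S(x, y, z) = sqrt(x^2 + y^2 + z^2), as in the example
    return math.sqrt(x * x + y * y + z * z)

def to_cartesian(rho, theta, phi):
    # Spherical to Cartesian coordinates
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def dS_drho_chain(rho, theta, phi):
    # Chain rule: S_x * x_rho + S_y * y_rho + S_z * z_rho (needs rho > 0)
    x, y, z = to_cartesian(rho, theta, phi)
    r = S(x, y, z)
    grad = (x / r, y / r, z / r)
    d_pos = (math.sin(phi) * math.cos(theta),
             math.sin(phi) * math.sin(theta),
             math.cos(phi))
    return sum(g * d for g, d in zip(grad, d_pos))

def dS_drho_numeric(rho, theta, phi, h=1e-6):
    # Central difference in rho, holding theta and phi fixed
    upper = S(*to_cartesian(rho + h, theta, phi))
    lower = S(*to_cartesian(rho - h, theta, phi))
    return (upper - lower) / (2 * h)

print(dS_drho_chain(2.0, 0.7, 1.1), dS_drho_numeric(2.0, 0.7, 1.1))
```

Both printed values should be close to 1 for any sample point with $\rho > 0$.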
Theorem (Chain Rule in 2 variables). Suppose that $f(x, y)$ is a differentiable real-valued function defined on a planar domain $D$, and that $\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j}$ is a differentiable curve in $D$. Then $g(t) = f(x(t), y(t))$ is a real-valued function of $t$, and its derivative is given by
$$g'(t) = \frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x}(\mathbf{r}(t))\frac{dx}{dt} + \frac{\partial f}{\partial y}(\mathbf{r}(t))\frac{dy}{dt} = f_x(\mathbf{r}(t))\,x'(t) + f_y(\mathbf{r}(t))\,y'(t).$$
Theorem (Chain Rule in 3 variables). Suppose that $f(x, y, z)$ is a differentiable real-valued function defined on a domain $D$ in $\mathbb{R}^3$, and that $x = x(t)$, $y = y(t)$ and $z = z(t)$ describe a differentiable curve in $D$. One can think of a particle moving in $D$ whose position $(x(t), y(t), z(t))$ changes with $t$, so that it traces out the path $\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}$ in $D$. Then $g(t) = f(x(t), y(t), z(t))$ is a real-valued function of $t$, and the chain rule tells us that
$$g'(t) = \frac{d}{dt} f(x(t), y(t), z(t)) = \frac{\partial f}{\partial x}(\mathbf{r}(t))\frac{dx}{dt} + \frac{\partial f}{\partial y}(\mathbf{r}(t))\frac{dy}{dt} + \frac{\partial f}{\partial z}(\mathbf{r}(t))\frac{dz}{dt} = f_x(\mathbf{r}(t))\,x'(t) + f_y(\mathbf{r}(t))\,y'(t) + f_z(\mathbf{r}(t))\,z'(t).$$
Partial Derivatives. Suppose that $z = f(x, y)$ is a function defined in a domain $D$, and $P(a, b)$ is a point in $D$. Recall the partial derivatives
$$f_x(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h}, \qquad f_y(a, b) = \lim_{k \to 0} \frac{f(a, b + k) - f(a, b)}{k}.$$
The limits are taken along lines parallel to the coordinate axes.
Directional Derivative. Through the point $P(a, b)$ choose any direction $\mathbf{u} = (h, k) = h\mathbf{i} + k\mathbf{j}$, and consider the straight line through $P$ in the direction $\mathbf{u}$, given by $\mathbf{r}(t) = (a + ht, b + kt)$. The rate of change of $g(t) = f(\mathbf{r}(t))$ at $t = 0$ is
$$g'(0) = \lim_{t \to 0} \frac{f(\mathbf{r}(t)) - f(\mathbf{r}(0))}{t} = \lim_{t \to 0} \frac{f(a + th, b + tk) - f(a, b)}{t}.$$
Suppose that $f$ is differentiable. Then it follows from the (multivariate) chain rule that
$$g'(0) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = f_x(P)\,h + f_y(P)\,k = \big(f_x(a, b)\mathbf{i} + f_y(a, b)\mathbf{j}\big) \cdot (h\mathbf{i} + k\mathbf{j}) = \nabla f(a, b) \cdot \mathbf{u},$$
where $\nabla f$ is the vector-valued function $f_x\mathbf{i} + f_y\mathbf{j}$, called the gradient of $f$ at the point $(x, y)$. Note that $g'(0)$ depends on the curve through $P(a, b)$ only via its tangent vector $\mathbf{r}'(0)$.
To simplify the notation further, one requires the direction vector $\mathbf{u}$ to be a unit vector.
Definition (Directional derivative). The resulting derivative $g'(0)$ is called the directional derivative $D_{\mathbf{u}} f$ of $f$ along the direction $\mathbf{u}$, and we write $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ to represent the rate of change of $f$ in the unit direction $\mathbf{u}$.
Remark. In general, if $f = f(x_1, \dots, x_n)$ is a function of $n$ variables, one defines (i) the gradient of $f$ to be $\nabla f = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right)$, and (ii) the directional derivative $D_{\mathbf{u}} f$ by $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$.
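Example. Evaluate the directional derivative of $f(x, y) = xe^y + \cos(xy)$ at the point $P_0(2, 0)$ in the direction $3\mathbf{i} - 4\mathbf{j}$.
Remark. In the accompanying figure, the blue curves are level curves of $f$ at different values.
Solution. Let $\mathbf{u} = \dfrac{3\mathbf{i} - 4\mathbf{j}}{\sqrt{3^2 + 4^2}} = \dfrac{3}{5}\mathbf{i} - \dfrac{4}{5}\mathbf{j}$. Then
$$\nabla f(2, 0) = (f_x(2, 0), f_y(2, 0)) = \big(e^y - y\sin(xy),\; xe^y - x\sin(xy)\big)\Big|_{(x,y)=(2,0)} = (1, 2).$$
It follows that $D_{\mathbf{u}} f(2, 0) = \nabla f(2, 0) \cdot \mathbf{u} = \dfrac{3}{5} - \dfrac{8}{5} = -1$.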
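As a quick check of this computation, the sketch below evaluates $\nabla f \cdot \mathbf{u}$ and compares it with a finite-difference estimate of the rate of change along the line through $P_0$. The helper names (`directional_derivative`, `numeric_rate`) and the step size are assumptions made for illustration.

```python
import math

def f(x, y):
    return x * math.exp(y) + math.cos(x * y)

def grad_f(x, y):
    # Partial derivatives computed by hand, as in the solution
    fx = math.exp(y) - y * math.sin(x * y)
    fy = x * math.exp(y) - x * math.sin(x * y)
    return (fx, fy)

def directional_derivative(grad, direction):
    # Normalise the direction, then take the dot product with the gradient
    norm = math.hypot(direction[0], direction[1])
    return (grad[0] * direction[0] + grad[1] * direction[1]) / norm

def numeric_rate(x, y, direction, t=1e-6):
    # Central difference of g(t) = f(P + t*u) at t = 0
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm
    return (f(x + t * ux, y + t * uy) - f(x - t * ux, y - t * uy)) / (2 * t)

exact = directional_derivative(grad_f(2.0, 0.0), (3.0, -4.0))
approx = numeric_rate(2.0, 0.0, (3.0, -4.0))
print(exact, approx)
```

The two numbers should agree closely, both being near $-1$.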
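Proposition. The greatest rate of change of a scalar function $f$, i.e. the maximum directional derivative, takes place in the direction of, and has the magnitude of, the vector $\nabla f$.
Proof. For any nonzero vector $\mathbf{v}$, the directional derivative of $f$ along the direction $\mathbf{v}$ at a point $P$ in the domain of $f$ is given by
$$D_{\mathbf{v}} f(P) := \left\langle \nabla f(P), \frac{\mathbf{v}}{\|\mathbf{v}\|} \right\rangle = \|\nabla f(P)\| \cos\theta,$$
where $\theta$ is the angle between the vectors $\nabla f(P)$ and $\mathbf{v}$. Hence $D_{\mathbf{v}} f(P)$ attains its maximum (minimum) value if and only if $\cos\theta = 1$ ($-1$), if and only if $\mathbf{v}$ points in the direction of $\nabla f(P)$ ($-\nabla f(P)$). In this case we have $D_{\mathbf{v}} f(P) = \|\nabla f(P)\|$ ($-\|\nabla f(P)\|$).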
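Example. (a) Find the directional derivative of $f(x, y, z) = 2x^3y - 3y^2z$ at $P(1, 2, -1)$ in the direction $\mathbf{v} = 2\mathbf{i} - 3\mathbf{j} + 6\mathbf{k}$. (b) In what direction from $P$ is the directional derivative a maximum? (c) What is the magnitude of the maximum directional derivative?
Solution. (a) $\nabla f(P) = \big(6x^2y\,\mathbf{i} + (2x^3 - 6yz)\,\mathbf{j} - 3y^2\,\mathbf{k}\big)\big|_{(1, 2, -1)} = 12\mathbf{i} + 14\mathbf{j} - 12\mathbf{k}$. Then the directional derivative of $f$ along the direction $\mathbf{v}$ is
$$D_{\mathbf{v}} f = \nabla f \cdot \frac{\mathbf{v}}{\|\mathbf{v}\|} = \left\langle 12\mathbf{i} + 14\mathbf{j} - 12\mathbf{k},\; \frac{2\mathbf{i} - 3\mathbf{j} + 6\mathbf{k}}{\sqrt{2^2 + 3^2 + 6^2}} \right\rangle = \frac{24 - 42 - 72}{7} = -\frac{90}{7}.$$
(b) $D_{\mathbf{v}} f(P)$ is a maximum (minimum) if and only if $\mathbf{v}$ ($-\mathbf{v}$) points in the direction of $\nabla f(P) = 12\mathbf{i} + 14\mathbf{j} - 12\mathbf{k}$.
(c) The maximum value of $D_{\mathbf{v}} f(P)$ is
$$\left\langle \nabla f(P), \frac{\nabla f(P)}{\|\nabla f(P)\|} \right\rangle = \frac{\|\nabla f(P)\|^2}{\|\nabla f(P)\|} = \|\nabla f(P)\| = \|12\mathbf{i} + 14\mathbf{j} - 12\mathbf{k}\| = \sqrt{144 + 196 + 144} = 22.$$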
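The three answers can be checked numerically, and the claim that no direction beats the gradient direction can be probed by sampling. This is only a sketch: the helper names and the random sampling (with a fixed seed) are illustrative assumptions, and sampling is evidence rather than a proof.

```python
import math
import random

def grad_f(x, y, z):
    # Gradient of f(x, y, z) = 2x^3 y - 3y^2 z, computed by hand
    return (6 * x * x * y, 2 * x ** 3 - 6 * y * z, -3 * y * y)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return tuple(p / n for p in a)

g = grad_f(1, 2, -1)               # gradient at P(1, 2, -1)
d_v = dot(g, unit((2, -3, 6)))     # part (a)
max_rate = norm(g)                 # part (c)

# Sample random unit directions; none should exceed the gradient norm
random.seed(0)
samples = [dot(g, unit(tuple(random.gauss(0, 1) for _ in range(3))))
           for _ in range(1000)]
print(g, d_v, max_rate, max(samples))
```

The largest sampled rate stays below $\|\nabla f(P)\|$, as the Proposition on the greatest rate of change predicts.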
Proposition. Let $C : \mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}$ be a curve lying on the level surface $S : f(x, y, z) = c$ for some $c$, i.e. $c = f(\mathbf{r}(t)) = f(x(t), y(t), z(t))$ for all $t$. Then the gradient vector $\nabla f$ of $f$ is perpendicular to the tangent vector $\mathbf{r}'(t)$ at $\mathbf{r}(t)$ for all $t$, i.e. $\nabla f(\mathbf{r}(t)) \perp \mathbf{r}'(t)$.
Proof. Define the composite function $g(t) = f(x(t), y(t), z(t))$. By the given condition, $g(t) = c$ is a constant function, so differentiating the identity $c = g(t) = f(x(t), y(t), z(t))$ gives
$$0 = g'(t) = f_x \frac{dx}{dt} + f_y \frac{dy}{dt} + f_z \frac{dz}{dt} = \langle \nabla f(\mathbf{r}(t)), \mathbf{r}'(t) \rangle \quad \text{for all } t.$$
So $\nabla f \perp \mathbf{r}'(t)$ at $\mathbf{r}(t)$ for all $t$.
Proposition. Let $f(x, y, z)$ be a differentiable function defined in $\mathbb{R}^3$, and let $S : f(x, y, z) = k$ be a level surface for some constant $k$, i.e. $S = \{ (x, y, z) \mid f(x, y, z) = k \}$. Suppose that $P(a, b, c)$ is a point on $S$ such that the gradient vector $\nabla f(a, b, c)$ is not zero. Then the equation of the tangent plane of $S$ at $P$ is given by $\langle \nabla f(a, b, c), (x - a, y - b, z - c) \rangle = 0$, i.e.
$$f_x(a, b, c)(x - a) + f_y(a, b, c)(y - b) + f_z(a, b, c)(z - c) = 0.$$
Remark. For any level surface $S$ defined by a scalar function $f$, the tangent plane of $S$ at a point $P$ of $S$ is spanned by the tangent vectors at $P$ of the curves contained in $S$ that pass through $P$. The result above tells us that the normal direction to the tangent plane of $S$ at any point $P$ of $S$ is parallel to $\nabla f(P)$.
Let $f(x, y)$ be a differentiable function defined on the $xy$-plane. For any real number $k$, recall that the level curve of $f$ for $k$ is the set $\{ (x, y) \mid f(x, y) = k \}$. When the value $k$ changes, the level curve changes gradually.
Let $f(x, y) = x^2 - 7xy + 2y^2$, defined on the $xy$-plane. In the accompanying figure, the blue curves represent the level curves $C_k : f(x, y) = k$ for various values of $k$, and the red arrows represent the gradient vector field $\nabla f(a, b) = (f_x(a, b), f_y(a, b))$, which is normal to the tangent vector of the level curve $C_k$ through $P(a, b)$.
Proposition. Let $C_k : f(x, y) = k$ be a fixed level curve with a point $P(a, b)$ in $C_k$. If $\nabla f(a, b) \neq (0, 0)$, then the equation of the tangent line of $C_k$ at $P$ is given by $\nabla f(a, b) \cdot (x - a, y - b) = 0$, i.e.
$$f_x(a, b)(x - a) + f_y(a, b)(y - b) = 0.$$
Example. Let $F(x, y, z) = x^\alpha + y^\alpha + z^\alpha$, where $\alpha$ is a non-zero number. Determine the equation of the tangent plane of the level surface $S : F(x, y, z) = k$ at a point $P(a, b, c)$ in $S$, where $k$ is a positive constant.
Solution. The normal direction $\mathbf{N}$ of the tangent plane of $S$ at $P$ is parallel to $\nabla F(x, y, z) = \alpha\,(x^{\alpha - 1}, y^{\alpha - 1}, z^{\alpha - 1})$ evaluated at $P(a, b, c)$, so we may take $\mathbf{N} = (a^{\alpha - 1}, b^{\alpha - 1}, c^{\alpha - 1})$. It follows that the equation of the tangent plane of $S$ at $P(a, b, c)$ is
$$0 = \langle \mathbf{N}, (x - a, y - b, z - c) \rangle = a^{\alpha - 1}(x - a) + b^{\alpha - 1}(y - b) + c^{\alpha - 1}(z - c),$$
that is,
$$a^{\alpha - 1}x + b^{\alpha - 1}y + c^{\alpha - 1}z = a^\alpha + b^\alpha + c^\alpha = F(a, b, c) = k.$$
Example. Show that the surface $S : x^2 - 2yz + y^3 = 4$ is perpendicular to every member of the family of surfaces $S_a : x^2 + 1 = (2 - 4a)y^2 + az^2$ at the point of intersection $P(1, -1, 2)$.
Solution. Let the defining equations of the level surfaces $S$ and $S_a$ be $F(x, y, z) = x^2 - 2yz + y^3 - 4 = 0$ and $G(x, y, z) = x^2 + 1 - (2 - 4a)y^2 - az^2 = 0$. Then
$$\nabla F(x, y, z) = 2x\,\mathbf{i} + (3y^2 - 2z)\,\mathbf{j} - 2y\,\mathbf{k}, \qquad \nabla G(x, y, z) = 2x\,\mathbf{i} - 2(2 - 4a)y\,\mathbf{j} - 2az\,\mathbf{k}.$$
Thus the normals to the two surfaces at $P(1, -1, 2)$ are $\mathbf{N}_1 = 2\mathbf{i} - \mathbf{j} + 2\mathbf{k}$ and $\mathbf{N}_2 = 2\mathbf{i} + 2(2 - 4a)\mathbf{j} - 4a\mathbf{k}$. Since
$$\mathbf{N}_1 \cdot \mathbf{N}_2 = (2)(2) - 2(2 - 4a) - (2)(4a) = 0,$$
it follows that $\mathbf{N}_1$ and $\mathbf{N}_2$ are perpendicular for all $a$, and so the required result follows.
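The orthogonality can be spot-checked for several values of the parameter $a$. The sketch below hard-codes the two gradients from the solution; the function names and the sample values of $a$ are arbitrary illustrative choices.

```python
def grad_F(x, y, z):
    # Gradient of F = x^2 - 2yz + y^3 - 4
    return (2 * x, 3 * y * y - 2 * z, -2 * y)

def grad_G(x, y, z, a):
    # Gradient of G = x^2 + 1 - (2 - 4a) y^2 - a z^2
    return (2 * x, -2 * (2 - 4 * a) * y, -2 * a * z)

P = (1, -1, 2)
n1 = grad_F(*P)
# Dot product of the two normals at P for a few sample values of a
dots = [sum(p * q for p, q in zip(n1, grad_G(*P, a)))
        for a in (-3, 0, 0.5, 2, 10)]
print(n1, dots)
```

Every entry of `dots` should vanish, matching $\mathbf{N}_1 \cdot \mathbf{N}_2 = 0$ for all $a$.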
Implicit Functions. Given a relation between two variables expressed by an equation of the form $f(x, y) = k$, we often want to solve for $y$. That is, for each given $x$ in some interval, we expect to find one and only one value $y = \varphi(x)$ that satisfies the relation. The function $\varphi$ is thus implicit in the relation; geometrically, the locus of the equation $f(x, y) = k$ is a level curve in the $(x, y)$-plane that serves as the graph of the function $y = \varphi(x)$.
Example. Let $C : x^2 + y^2 = 1$ be the level curve defined by the function $f(x, y) = x^2 + y^2$. Is it possible that this level curve $C$ in the $xy$-plane is the graph of some "nice" function? If $y = g(x)$, then $1 = x^2 + (g(x))^2$, hence $g(x)^2 = 1 - x^2$, so $g(x) = \pm\sqrt{1 - x^2}$. Although this gives a possible representation $y = g(x)$, usually called an "explicit function", neither branch of $g$ is differentiable at $x = \pm 1$. By contrast, we say that $y$ is defined implicitly by the equation $f(x, y) = 1$.
The most familiar example of an implicitly defined function is provided by $f(x, y) = x^2 + y^2$. The locus or level curve $f(x, y) = k$ is a circle of radius $\sqrt{k}$ if $k > 0$, and we can view it as the union of the graphs of two different functions, $y = \varphi_\pm(x) = \pm\sqrt{k - x^2}$. We can take either $P_+(0, +\sqrt{k})$ or $P_-(0, -\sqrt{k})$ as a fixed point of the level curve, so that $\varphi_\pm$ respectively defines a function such that (i) its graph passes through the point $P_\pm(0, \pm\sqrt{k})$, and (ii) its graph lies completely on the level curve, i.e. all the points $(x, \varphi_\pm(x))$ lie on the level curve: $f(x, \varphi_\pm(x)) = k$ for all $x \in \operatorname{dom}(\varphi_\pm)$.
Things fail completely if we choose the point $P(\sqrt{k}, 0)$ instead. A function can take only one value at each $x$, so although we can write $y = +\sqrt{k - x^2}$ for $-\sqrt{k} \le x \le \sqrt{k}$, its graph cannot be extended to any larger domain, in particular not to an open interval around $x = \sqrt{k}$, while still satisfying condition (ii), namely that all points of the graph lie on the level curve. Moreover, the function $y = +\sqrt{k - x^2}$ has no derivative at $x = \sqrt{k}$, which can be checked directly.
Implicit Function Theorem I. Let $C : f(x, y) = k$ be a level curve defined by a differentiable scalar function $f$ of 2 variables. Suppose $P(a, b)$ is a point on $C$ such that $f_y(P) \neq 0$. Then there exist $\delta > 0$ and a differentiable function $g$ defined on an interval $I = (a - \delta, a + \delta)$ such that (i) $f(x, g(x)) = k$ for all $x \in I$, with $g(a) = b$, i.e. $y = g(x)$ is an explicit function defined by the level curve $C$; and (ii)
$$g'(x) = -\frac{f_x(x, g(x))}{f_y(x, g(x))} \quad \text{for all } x \in I.$$
Remark. (i) In general, we cannot write down the explicit function $g$. (ii) One can interchange the roles of $x$ and $y$ if $f_x(P) \neq 0$.
Remark. Recall that the level surface associated to a scalar function $f$ and a fixed number $k$ is the set $\{ (x, y, z) \mid f(x, y, z) = k \}$. In general, this set is not expected to have any nice structure. However, we have the following important theorem.
Implicit Function Theorem II. Let $S : F(x, y, z) = k$ be a level surface defined by a differentiable scalar function $F$, and suppose that $P(a, b, c)$ is a point on the level surface, i.e. $F(a, b, c) = k$. Suppose that $F_z(P) \neq 0$. Then there exist $\delta > 0$ and a differentiable function $z = g(x, y)$ defined on the open disc $B((a, b), \delta)$ such that (i) $F(x, y, g(x, y)) = k$ for all $(x, y) \in B((a, b), \delta)$, with $g(a, b) = c$; and (ii)
$$g_x(x, y) = -\frac{F_x(x, y, g(x, y))}{F_z(x, y, g(x, y))}, \qquad g_y(x, y) = -\frac{F_y(x, y, g(x, y))}{F_z(x, y, g(x, y))} \quad \text{for all } (x, y) \in B((a, b), \delta).$$
Remark. To obtain the formulas in part (ii) of Implicit Function Theorem II, differentiate the identity $F(x, y, g(x, y)) = k$ with respect to $x$ and $y$ respectively by means of the chain rule. For $x$ we have
$$\frac{\partial F}{\partial x}(x, y, g(x, y)) + \frac{\partial F}{\partial z}(x, y, g(x, y))\,\frac{\partial g}{\partial x}(x, y) = \frac{\partial}{\partial x}(k) = 0,$$
so $g_x = -F_x / F_z$ wherever $F_z \neq 0$. The formula for $g_y$ follows in the same way.
Example. Let $z = z(x, y)$ be implicitly defined by $ze^{xz} = 2z + y + 1$. Find $z_x$ at the point $(x, y, z) = (0, 0, -1)$.
Solution. Write $z(x, y)$ instead of $z$, and differentiate the identity $z(x, y)e^{xz(x, y)} = 2z(x, y) + y + 1$ with respect to $x$:
$$z_x e^{xz} + ze^{xz}(xz_x + z) = (ze^{xz})_x = (2z + y + 1)_x = 2z_x,$$
hence $z_x\,(e^{xz} + xze^{xz} - 2) = -z^2 e^{xz}$. At $(x, y, z) = (0, 0, -1)$ we have $z_x\,(1 + 0 - 2) = -(-1)^2$, i.e. $z_x(0, 0) = 1$.
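One way to sanity-check an implicit derivative is to solve the equation numerically for $z$ at nearby values of $x$ and take a difference quotient. The sketch below does this with Newton's method started at $z = -1$; the function names, the iteration count and the step size are illustrative assumptions rather than part of the notes.

```python
import math

def solve_z(x, y, z0=-1.0, steps=50):
    # Newton's method on H(z) = z*exp(x*z) - 2z - y - 1, starting near z = -1
    z = z0
    for _ in range(steps):
        h = z * math.exp(x * z) - 2 * z - y - 1
        dh = math.exp(x * z) + x * z * math.exp(x * z) - 2
        z -= h / dh
    return z

def z_x_formula(x, y, z):
    # Implicit differentiation: z_x = -z^2 e^{xz} / (e^{xz} + x z e^{xz} - 2)
    e = math.exp(x * z)
    return -z * z * e / (e + x * z * e - 2)

step = 1e-4
numeric = (solve_z(step, 0.0) - solve_z(-step, 0.0)) / (2 * step)
formula = z_x_formula(0.0, 0.0, -1.0)
print(numeric, formula)
```

Both values should be close to 1, the value found in the solution.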
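Example. Suppose that the level surface $S : F(x, y, z) = 0$ implicitly defines the explicit functions $x = x(y, z)$, $y = y(x, z)$ and $z = z(x, y)$, where $F$ is a differentiable function with $F_x$, $F_y$ and $F_z$ all non-zero on $S$. Then $\dfrac{\partial x}{\partial y}\,\dfrac{\partial y}{\partial z}\,\dfrac{\partial z}{\partial x} = -1$.
Solution. It follows from the implicit function theorem that $\dfrac{\partial x}{\partial y} = -\dfrac{F_y}{F_x}$ for all $(x, y, z)$ in $S$. Similarly, $\dfrac{\partial y}{\partial z} = -\dfrac{F_z}{F_y}$ and $\dfrac{\partial z}{\partial x} = -\dfrac{F_x}{F_z}$ for all $(x, y, z)$ in $S$. It follows that
$$\frac{\partial x}{\partial y}\,\frac{\partial y}{\partial z}\,\frac{\partial z}{\partial x} = \left(-\frac{F_y}{F_x}\right)\left(-\frac{F_z}{F_y}\right)\left(-\frac{F_x}{F_z}\right) = -1 \quad \text{for all } (x, y, z) \text{ in } S.$$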
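For a concrete instance of this identity, one can take the hypothetical surface $F(x, y, z) = xy - z = 0$ (chosen here only for illustration), solve it explicitly for each variable, and multiply finite-difference estimates of the three partial derivatives. The helper `partial` and the sample point are assumptions.

```python
def x_of(y, z):
    return z / y      # solve xy = z for x

def y_of(x, z):
    return z / x      # solve xy = z for y

def z_of(x, y):
    return x * y      # solve xy = z for z

def partial(fn, args, index, h=1e-6):
    # Central difference in the argument at position `index`
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (fn(*up) - fn(*down)) / (2 * h)

x0, y0 = 2.0, 3.0
z0 = z_of(x0, y0)
x_y = partial(x_of, (y0, z0), 0)   # dx/dy with z held fixed
y_z = partial(y_of, (x0, z0), 1)   # dy/dz with x held fixed
z_x = partial(z_of, (x0, y0), 0)   # dz/dx with y held fixed
product = x_y * y_z * z_x
print(product)
```

The product should be close to $-1$, not $+1$ as a naive cancellation of differentials would suggest.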
Theorem. Let $\mathbf{r}(t) = (x(t), y(t), z(t))$ be a differentiable curve on the level surface $S : f(x, y, z) = c$. Then the tangent vector $\mathbf{r}'(t)$ of the curve is normal to the gradient $\nabla f$ at the corresponding point of $S$. Consequently, $\nabla f$ is a normal vector of the tangent plane of the level surface $S$ at any point $P(x, y, z)$ of $S$.
Proof. The result follows by differentiating the identity $c = f(x(t), y(t), z(t))$, valid for all $t$ in the domain of $\mathbf{r}$, with the help of the chain rule:
$$0 = \frac{d}{dt}(c) = \frac{d}{dt} f(x(t), y(t), z(t)) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = \nabla f \cdot \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) = \nabla f \cdot \mathbf{r}'(t),$$
so $\nabla f$ is normal to the tangent vector $\mathbf{r}'(t)$ of the curve.
Example. Determine the extrema of the function $z = z(x, y)$ defined implicitly by the equation $3x^2 + 2y^2 + z^2 + 8yz - z + 8 = 0$.
Solution. Define $F(x, y, z) = 3x^2 + 2y^2 + z^2 + 8yz - z + 8$, so the graph of the function $z = z(x, y)$ is part of the level surface $S$ associated to the equation $F(x, y, z) = 0$, which we sometimes denote by $S : F(x, y, z) = 0$. It follows that $F(x, y, z(x, y)) = 0$ for all $(x, y)$ in the (unspecified) domain of $z(x, y)$; we think of this equality as an identity in $x$ and $y$. Differentiating with respect to $x$ by means of the chain rule, we have
$$0 = \frac{\partial}{\partial x}(0) = \frac{\partial}{\partial x} F(x, y, z(x, y)) = F_x \frac{\partial x}{\partial x} + F_z \frac{\partial z}{\partial x} = F_x + F_z z_x,$$
and similarly for $y$, so that
$$z_x(x, y) = -\frac{F_x(x, y, z(x, y))}{F_z(x, y, z(x, y))} = -\frac{F_x}{F_z}, \qquad z_y(x, y) = -\frac{F_y(x, y, z(x, y))}{F_z(x, y, z(x, y))} = -\frac{F_y}{F_z}.$$
Note that the assumption $F_z \neq 0$ for all $(x, y)$ in the domain of $z = z(x, y)$ is necessary; it can be checked explicitly once $F_z$ is known.
Solution (continued). With $F(x, y, z) = 3x^2 + 2y^2 + z^2 + 8yz - z + 8$ we have
$$z_x = -\frac{F_x}{F_z} = -\frac{6x}{2z + 8y - 1}, \qquad z_y = -\frac{F_y}{F_z} = -\frac{4y + 8z}{2z + 8y - 1}.$$
To locate the extreme values of $z = z(x, y)$, both partial derivatives $z_x$ and $z_y$ must vanish, i.e. $(6x, 4y + 8z) = (0, 0)$, where $(x, y, z)$ is a point of the level surface $S : F(x, y, z) = 0$. Hence $x = 0$ and $y = -2z$. Then
$$0 = F(0, -2z, z) = 2(-2z)^2 + z^2 + 8(-2z)z - z + 8 = -7z^2 - z + 8 = -(7z + 8)(z - 1),$$
so $z = 1$ or $z = -\tfrac{8}{7}$. Hence $P(0, -2, 1)$ and $Q(0, \tfrac{16}{7}, -\tfrac{8}{7})$ are the only critical points of the function $z = z(x, y)$. However, $z = z(x, y)$ is not explicitly determined yet. One can use the quadratic formula to express $z$ in terms of $x$ and $y$, which gives two branches, one through $P$ and one through $Q$. At a critical point, differentiating $F_x + F_z z_x = 0$ and $F_y + F_z z_y = 0$ once more gives $z_{xx} = -F_{xx}/F_z$, $z_{xy} = -F_{xy}/F_z$ and $z_{yy} = -F_{yy}/F_z$, with $F_{xx} = 6$, $F_{xy} = 0$, $F_{yy} = 4$. Since $F_z(P) = -15 < 0$ and $F_z(Q) = 15 > 0$, the value $z = 1$ is a local minimum on the branch through $P$, and $z = -\tfrac{8}{7}$ is a local maximum on the branch through $Q$.
Remark. In the last part we skip some details, but the gap can be filled in after we learn the second derivative test.
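The critical points can be verified in exact arithmetic, and the second derivatives of the implicit function at those points can be computed from $F$. The sketch below assumes the relations $z_{xx} = -F_{xx}/F_z$, $z_{xy} = -F_{xy}/F_z$, $z_{yy} = -F_{yy}/F_z$, which hold at points where $z_x = z_y = 0$ (differentiate $F_x + F_z z_x = 0$ once more); the function names are illustrative.

```python
from fractions import Fraction as Fr

def F(x, y, z):
    return 3 * x * x + 2 * y * y + z * z + 8 * y * z - z + 8

def F_x(x, y, z):
    return 6 * x

def F_y(x, y, z):
    return 4 * y + 8 * z

def F_z(x, y, z):
    return 2 * z + 8 * y - 1

P = (Fr(0), Fr(-2), Fr(1))
Q = (Fr(0), Fr(16, 7), Fr(-8, 7))

def second_derivatives(pt):
    # Valid only at a critical point of z(x, y); here F_xx = 6, F_xy = 0, F_yy = 4
    fz = F_z(*pt)
    return (Fr(-6) / fz, Fr(0), Fr(-4) / fz)   # (z_xx, z_xy, z_yy)

for name, pt in (("P", P), ("Q", Q)):
    print(name, F(*pt), F_x(*pt), F_y(*pt), second_derivatives(pt))
```

Positive $z_{xx}$ and $z_{yy}$ with $z_{xy} = 0$ indicate a local minimum, and negative values a local maximum, in line with the second derivative test.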
Theorem (Lagrange multiplier). Let $f(x, y)$ and $g(x, y)$ be functions with continuous first-order partial derivatives. If the maximum (minimum) value of $f$ subject to the condition (constraint) given by a level curve $C : g(x, y) = 0$ occurs at a point $P$ where $\nabla g(P) \neq 0$, then $\nabla f(P) = \lambda \nabla g(P)$ for some real number $\lambda$.
Remarks. 1. The last condition means that the two vectors $\nabla f(P)$ and $\nabla g(P)$ are parallel. In other words, at the point where $f$ attains its maximum, the level curve of $f$ is tangent to the constraint curve.
2. The equation $\nabla f(P) = \lambda \nabla g(P)$ gives a necessary condition for finding the point $P$, though $\lambda$ is also an unknown: $f_x(x, y) = \lambda g_x(x, y)$, $f_y(x, y) = \lambda g_y(x, y)$, $g(x, y) = 0$.
3. The analogous condition $\nabla f(P) = \lambda \nabla g(P)$ works for functions of any number of variables, and the constant $\lambda$ is called a multiplier.
Example. Determine the minimum value of $x^2 + y^2$ subject to the constraint $xy = 1$.
Solution. Let $f(x, y) = x^2 + y^2$ be the objective function and $g(x, y) = xy$ the constraint function, with the level curve $C : g(x, y) = 1$. Though $C$ is not a bounded set, one can impose the additional restriction $x^2 + y^2 \le M$ (for $M$ large enough), which yields a curve $C_M$ that is closed and bounded. As $C_M$ is closed and bounded in $\mathbb{R}^2$, the continuous function $f$ attains its minimum on $C_M$ at some point of $C_M$. In fact, the minimum always occurs at the same two points, whatever $M$ is. One applies the Lagrange multiplier condition $\nabla f(x, y) = \lambda \nabla g(x, y)$ to locate these extremum points.
Solution (continued). (One should keep in mind that the Lagrange condition is only a necessary condition, not a sufficient one.) We have $(2x, 2y) = (\lambda y, \lambda x)$ and $xy = 1$. From the last equation, $x \neq 0$ and $y \neq 0$, so $2x = \lambda y$ gives $\lambda = 2x/y$. Substituting, we have $2y = (2x/y)x$ and hence $y^2 = x^2$, i.e. $y = \pm x$. But $xy = 1$, so $x = y = \pm 1$, and the possible points for the extreme values of $f$ are $(1, 1)$ and $(-1, -1)$. The minimum value is $f(1, 1) = f(-1, -1) = 2$.
Remark. There is no maximum value for $f$ here, since the constraint $xy = 1$ allows $x$ or $y$ to become arbitrarily large, and hence $f(x, y) = x^2 + y^2$ can be made arbitrarily large.
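The Lagrange conditions at the two candidate points, and the claim that 2 is the minimum along the constraint, can be checked with a short script. This is a sketch: the parametrisation $y = 1/x$ of one branch of the curve and the sampling grid are illustrative choices.

```python
def f(x, y):
    return x * x + y * y

# Candidate points and multiplier from the solution
candidates = [(1.0, 1.0), (-1.0, -1.0)]
lam = 2.0

# Residuals of grad f = lam * grad g and of the constraint xy = 1,
# using grad f = (2x, 2y) and grad g = (y, x)
residuals = [(2 * x - lam * y, 2 * y - lam * x, x * y - 1)
             for x, y in candidates]

# Sample the branch x > 0 of the constraint curve via y = 1/x
ts = [0.1 * k for k in range(1, 101)]
values = [f(t, 1 / t) for t in ts]
print(residuals, min(values), max(values))
```

All residuals should be zero, the smallest sampled value should be 2, and the sampled values grow without bound as $x$ approaches 0 or becomes large.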
Steps for applying Lagrange multipliers. To find the maximum and minimum values of $f(x, y, z)$ subject to the constraint defined by the level surface $S : g(x, y, z) = k$, assume that these extreme values exist on $S$ (whether they do depends on the surface $S$, for instance on whether it is closed and bounded).
1. Find all values of $x$, $y$, $z$ and $\lambda$ such that $\nabla f = \lambda \nabla g$ and $g(x, y, z) = k$, i.e.
$$f_x(x, y, z) = \lambda g_x(x, y, z) \quad (1), \qquad f_y(x, y, z) = \lambda g_y(x, y, z) \quad (2), \qquad f_z(x, y, z) = \lambda g_z(x, y, z) \quad (3), \qquad g(x, y, z) = k \quad (4).$$
2. Evaluate $f$ at all the points $(x, y, z)$ that result from step 1. The largest of these values is the maximum value of $f$; the smallest is the minimum value of $f$.
Example. Use Lagrange multipliers to find the point $(x, y, z)$ at which $x^2 + y^2 + z^2$ is minimal subject to $x + 2y + 3z = 1$.
Solution. Let $f(x, y, z) = x^2 + y^2 + z^2$ and $g(x, y, z) = x + 2y + 3z$ be the objective function and the constraint function respectively. We want to locate the point $P(x, y, z)$ on the plane $x + 2y + 3z = 1$ such that $\nabla f = \lambda \nabla g$ for some $\lambda$, i.e. $(2x, 2y, 2z) = \lambda(1, 2, 3)$. Then
$$1 = x + 2y + 3z = \frac{\lambda}{2} + 2\lambda + 3 \cdot \frac{3\lambda}{2} = 7\lambda,$$
i.e. $\lambda = \tfrac{1}{7}$, and hence $(x, y, z) = \left(\tfrac{\lambda}{2}, \lambda, \tfrac{3\lambda}{2}\right) = \left(\tfrac{1}{14}, \tfrac{1}{7}, \tfrac{3}{14}\right)$.
Remark. Why does the point $(x, y, z) = \left(\tfrac{1}{14}, \tfrac{1}{7}, \tfrac{3}{14}\right)$ give the minimum of $f$? Consider the moving point $(x, y, z) = (3t + 1, 0, -t)$ lying on the plane $x + 2y + 3z = 1$. Then $f(3t + 1, 0, -t) = (3t + 1)^2 + 0^2 + (-t)^2 = (3t + 1)^2 + t^2 \ge t^2$, which is unbounded, so $f$ does not have any maximum value on the plane. However, one can prove that the absolute minimum value of $f$ does exist by means of Cauchy's inequality; we skip the proof of this fact.
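The same computation works for any plane $\mathbf{n} \cdot (x, y, z) = d$: substituting $(x, y, z) = \tfrac{\lambda}{2}\mathbf{n}$ into the constraint gives $\lambda = 2d / \|\mathbf{n}\|^2$. The sketch below applies this with exact fractions to $\mathbf{n} = (1, 2, 3)$, $d = 1$ and compares the result with a few points of the line $(3t + 1, 0, -t)$ from the Remark. Variable names are illustrative.

```python
from fractions import Fraction as Fr

n = (Fr(1), Fr(2), Fr(3))   # normal vector of the plane x + 2y + 3z = 1
d = Fr(1)

# From (x, y, z) = (lam/2) * n and n . (x, y, z) = d
lam = 2 * d / sum(c * c for c in n)
point = tuple(lam / 2 * c for c in n)

def f(p):
    return sum(c * c for c in p)

# A few other points on the plane, taken from the line in the Remark
others = [(3 * t + 1, Fr(0), -t)
          for t in (Fr(-2), Fr(-1, 2), Fr(0), Fr(1), Fr(5))]
print(lam, point, f(point), [f(p) for p in others])
```

Each sampled value of $f$ on the plane is at least $f(\text{point}) = \tfrac{1}{14}$, consistent with the candidate being the minimum.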
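Example. A rectangular box is placed on the $xy$-plane so that one vertex is at the origin and the opposite vertex lies in the plane $Ax + By + Cz = 1$, where $A$, $B$ and $C$ are positive. Find the maximum volume of such a box.
Solution. It follows from the given condition that the box has dimensions $x \times y \times z$, with $x, y, z > 0$ satisfying $Ax + By + Cz = 1$. The volume is $V(x, y, z) = xyz$, subject to the constraint set $D = \{ (x, y, z) \mid Ax + By + Cz = 1 \text{ and } x, y, z \ge 0 \}$, which is a closed and bounded subset of $\mathbb{R}^3$; hence the volume function $V$ attains both a maximum and a minimum on $D$. The minimum volume is obviously 0, and we use a Lagrange multiplier to find the maximum volume as follows:
$$(yz, xz, xy) = \nabla V(x, y, z) = \lambda \nabla(Ax + By + Cz - 1) = (\lambda A, \lambda B, \lambda C).$$
If $\lambda = 0$, then at least one of $x$, $y$ and $z$ is zero; in this case $V(x, y, z) = 0$, which is not the maximum. Assume $\lambda \neq 0$. Then
$$x^2 = \frac{(xy)(xz)}{yz} = \frac{(\lambda C)(\lambda B)}{\lambda A} = \lambda \frac{BC}{A}, \quad \text{i.e. } x = \sqrt{\frac{BC}{A}}\sqrt{\lambda}.$$
Similarly, $y = \sqrt{\dfrac{AC}{B}}\sqrt{\lambda}$ and $z = \sqrt{\dfrac{AB}{C}}\sqrt{\lambda}$. At last, we have
$$1 = Ax + By + Cz = \left(A\sqrt{\frac{BC}{A}} + B\sqrt{\frac{AC}{B}} + C\sqrt{\frac{AB}{C}}\right)\sqrt{\lambda} = 3\sqrt{ABC}\,\sqrt{\lambda},$$
so $\lambda = \dfrac{1}{9ABC}$. Then $x = \dfrac{1}{3A}$, $y = \dfrac{1}{3B}$, $z = \dfrac{1}{3C}$, and the maximum volume is $V = \dfrac{1}{27ABC}$.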
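The closed-form answer can be compared with a brute-force search over the constraint set for sample coefficients. This is only a sketch: the values $A = 1$, $B = 2$, $C = 3$, the grid resolution and the function names are arbitrary illustrative choices.

```python
def best_box(A, B, C):
    # Closed-form solution from the Lagrange conditions
    lam = 1.0 / (9 * A * B * C)
    x, y, z = 1.0 / (3 * A), 1.0 / (3 * B), 1.0 / (3 * C)
    return lam, (x, y, z), x * y * z

def grid_max_volume(A, B, C, n=200):
    # Brute force: scan x in [0, 1/A], y in [0, 1/B], z fixed by the plane
    best = 0.0
    for i in range(n + 1):
        x = i / (n * A)
        for j in range(n + 1):
            y = j / (n * B)
            z = (1 - A * x - B * y) / C
            if z >= 0:
                best = max(best, x * y * z)
    return best

A, B, C = 1.0, 2.0, 3.0
lam, (x, y, z), v_star = best_box(A, B, C)
v_grid = grid_max_volume(A, B, C)
print(lam, (x, y, z), v_star, v_grid)
```

The grid maximum should come out slightly below the closed-form value $1/(27ABC)$, since the grid does not hit the optimal point exactly.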
Example. Let $\mathbf{r}(t) = (a + ht, b + kt)$ be the line in the $xy$-plane passing through the point $P(a, b)$. Let $f$ be a function defined in a domain $D$ containing $P$ with continuous second-order partial derivatives, and suppose that $P$ is a critical point of $f$, i.e. $\nabla f(P) = 0$. Let $g(t) = f(\mathbf{r}(t))$. (i) Evaluate the second derivative of $g$ at $t = 0$; and (ii) determine the sign of $g''(0)$ provided that $f_{xx}(a, b) > 0$ and $f_{xx}(a, b)f_{yy}(a, b) - (f_{xy}(a, b))^2 > 0$, for $(h, k) \neq (0, 0)$.
Solution. (i) Let $A = f_{xx}(a, b)$, $B = f_{xy}(a, b)$ and $C = f_{yy}(a, b)$. It follows from the chain rule that $g'(t) = f_x(a + ht, b + kt)\,h + f_y(a + ht, b + kt)\,k$, and hence
$$g''(t) = f_{xx}(a + ht, b + kt)\,h^2 + f_{xy}(a + ht, b + kt)\,hk + f_{yx}(a + ht, b + kt)\,kh + f_{yy}(a + ht, b + kt)\,k^2.$$
In particular, at $t = 0$, $g''(0) = f_{xx}(a, b)h^2 + 2f_{xy}(a, b)hk + f_{yy}(a, b)k^2 = Ah^2 + 2Bhk + Ck^2$.
(ii) As $A = f_{xx}(a, b) > 0$ and $AC - B^2 > 0$, for every $s \in \mathbb{R}$ we have
$$l(s) = As^2 + 2Bs + C = \frac{1}{A}(A^2s^2 + 2ABs + B^2) + C - \frac{B^2}{A} = \frac{1}{A}(As + B)^2 + \frac{AC - B^2}{A} \ge \frac{AC - B^2}{A} > 0.$$
Hence $g''(0) = Ah^2 + 2Bhk + Ck^2 = k^2\big(A(h/k)^2 + 2B(h/k) + C\big) = k^2\,l(h/k) > 0$ for all $(h, k) \in \mathbb{R}^2$ with $k \neq 0$. If $k = 0$, then $g''(0) = Ah^2 > 0$ for all $h \neq 0$.
Proposition (Maximum-Minimum Test for Quadratic Functions). Let $g(x, y) = Ax^2 + 2Bxy + Cy^2$, where $A$, $B$, $C$ are constants.
1. If $AC - B^2 > 0$ and $A > 0$ [respectively $A < 0$], then $g(x, y)$ has a minimum [respectively maximum] at $(0, 0)$.
2. If $AC - B^2 < 0$, then $g(x, y)$ takes both positive and negative values near $(0, 0)$, so $(0, 0)$ is not a local extremum for $g$.
Proof. To prove these assertions, we consider the two cases separately.
(1) If $AC - B^2 > 0$, then $A$ cannot be zero (why?), so
$$g(x, y) = A\left(x^2 + \frac{2B}{A}xy + \frac{C}{A}y^2\right) = A\left(x^2 + \frac{2B}{A}xy + \frac{B^2}{A^2}y^2 + \frac{C}{A}y^2 - \frac{B^2}{A^2}y^2\right) = A\left(x + \frac{B}{A}y\right)^2 + \frac{1}{A}(AC - B^2)y^2. \tag{1}$$
Both terms on the right are either zero or have the same sign as $A$, and they are both zero only when $x + \frac{B}{A}y = 0$ and $y = 0$, i.e. $(x, y) = (0, 0)$. Thus $(0, 0)$ is a minimum point for $g$ if $A > 0$ (since $g(x, y) > 0$ if $(x, y) \neq (0, 0)$) and a maximum point if $A < 0$ (since $g(x, y) < 0$ if $(x, y) \neq (0, 0)$). This completes the proof of (1).
Proof (continued). (2) If $AC - B^2 < 0$ and $A \neq 0$, then formula (1), $g(x, y) = A\left(x + \frac{B}{A}y\right)^2 + \frac{1}{A}(AC - B^2)y^2$, still applies, but now the two terms on the right-hand side have opposite signs. By suitable choices of $x$ and $y$ (try it!), we can make either term zero and the other nonzero. If $A = 0$, then $B \neq 0$ (since $AC - B^2 = -B^2 < 0$) and $g(x, y) = y(2Bx + Cy)$, so we can again achieve both signs.
Remark. In case (2), $(0, 0)$ is called a saddle point for $g(x, y)$.
Theorem (Second Derivatives Test). Suppose the second partial derivatives of $f(x, y)$ are continuous on a disk with center $(a, b)$, and suppose that $\nabla f(a, b) = (0, 0)$, i.e. $(a, b)$ is a critical point of $f$. Let $D = D(a, b) = f_{xx}(a, b)f_{yy}(a, b) - [f_{xy}(a, b)]^2$.
1. If $D > 0$ and $f_{xx}(a, b) > 0$, then $f(a, b)$ is a local minimum;
2. If $D > 0$ and $f_{xx}(a, b) < 0$, then $f(a, b)$ is a local maximum;
3. If $D < 0$, then $f(a, b)$ is neither a local maximum nor a local minimum.
Example. Determine the nature of the critical points of $f(x, y) = x^3 + y^3 - 6xy$.
Solution. As $\nabla f(x, y) = (3x^2 - 6y, 3y^2 - 6x)$, the critical points satisfy $x^2 = 2y$ and $y^2 = 2x$. It follows that $x^4 = 4y^2 = 4 \cdot 2x = 8x$, i.e. $0 = x^4 - 8x = x(x^3 - 2^3) = x(x - 2)(x^2 + 2x + 4)$. As $x^2 + 2x + 4 = (x + 1)^2 + 3 > 0$, we have $x = 0$ or $x = 2$. So $2y = 0^2$ or $2^2$, and the critical points of $f$ are $(0, 0)$ and $(2, 2)$.
Next we apply the second derivative test. We have $f_{xx} = 6x$, $f_{yy} = 6y$, $f_{xy} = -6$, and the discriminant is $\Delta(x, y) = f_{xx}f_{yy} - f_{xy}^2 = (6x)(6y) - (-6)^2 = 36(xy - 1)$. Then $\Delta(0, 0) = -36 < 0$, while $\Delta(2, 2) = 36(4 - 1) = 108 > 0$ with $f_{xx}(2, 2) = 12 > 0$. Hence $(0, 0)$ is a saddle point of $f$, whereas $(2, 2)$ is a local minimum point of $f$.
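The second derivative test is mechanical enough to write as a small function. The sketch below applies it to this example; the function names and the `"inconclusive"` label for the case $D = 0$ (which the theorem does not cover) are illustrative assumptions.

```python
def classify(fxx, fyy, fxy):
    # Second derivative test at a critical point
    disc = fxx * fyy - fxy ** 2
    if disc > 0 and fxx > 0:
        return "local minimum"
    if disc > 0 and fxx < 0:
        return "local maximum"
    if disc < 0:
        return "saddle point"
    return "inconclusive"

def hessian_entries(x, y):
    # Second partials of f(x, y) = x^3 + y^3 - 6xy: (f_xx, f_yy, f_xy)
    return 6 * x, 6 * y, -6

def grad(x, y):
    return (3 * x * x - 6 * y, 3 * y * y - 6 * x)

critical_points = [(0, 0), (2, 2)]
results = {pt: classify(*hessian_entries(*pt)) for pt in critical_points}
print(results)
```

The printed classification should match the conclusion of the example: a saddle point at $(0, 0)$ and a local minimum at $(2, 2)$.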
Taylor's Formula for $f(x, y)$ at the Point $(a, b)$
Theorem. Suppose $f(x, y)$ and its partial derivatives through order $n + 1$ are continuous throughout an open rectangular region $R$ centered at a point $(a, b)$. Then, throughout $R$,
$$f(a + h, b + k) = f(a, b) + (hf_x + kf_y)\big|_{(a, b)} + \frac{1}{2!}(h^2f_{xx} + 2hkf_{xy} + k^2f_{yy})\big|_{(a, b)} + \frac{1}{3!}(h^3f_{xxx} + 3h^2kf_{xxy} + 3hk^2f_{xyy} + k^3f_{yyy})\big|_{(a, b)} + \cdots + \frac{1}{n!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^{n} f\,\Big|_{(a, b)} + \frac{1}{(n + 1)!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^{n + 1} f\,\Big|_{(a + ch, b + ck)}$$
for some $c \in (0, 1)$. Keeping only the terms up to $(hf_x + kf_y)|_{(a, b)}$ gives the linear (1st order) approximation; keeping the terms up to $\frac{1}{2!}(h^2f_{xx} + 2hkf_{xy} + k^2f_{yy})|_{(a, b)}$ gives the 2nd order approximation.
Remarks on Taylor's Theorem. 1. The proof applies the chain rule and the $n$-th Taylor polynomial with remainder to the function $g(t) = f(a + ht, b + kt)$ of one variable.
2. If one has an estimate of the last (remainder) term, for example an upper bound, then one can approximate the given function by polynomials in 2 variables with a controlled error.
3. The theorem can easily be generalized to functions of $n$ variables for $n \ge 1$. Though this topic is not treated in this book, its applications are important in other courses, so we include the result in these notes for the benefit of students.
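To see the 2nd order approximation at work, the sketch below expands the hypothetical test function $f(x, y) = x^3 + y^3 - 6xy$ (chosen only for illustration) about $(a, b) = (2, 2)$. Because $f$ is a cubic polynomial with $f_{xxy} = f_{xyy} = 0$, the error of the 2nd order approximation is exactly the third-order term $h^3 + k^3$.

```python
def f(x, y):
    return x ** 3 + y ** 3 - 6 * x * y

def taylor2(h, k, a=2.0, b=2.0):
    # First and second partial derivatives of f at (a, b), computed by hand
    fx = 3 * a * a - 6 * b
    fy = 3 * b * b - 6 * a
    fxx, fxy, fyy = 6 * a, -6.0, 6 * b
    linear = h * fx + k * fy
    quadratic = 0.5 * (h * h * fxx + 2 * h * k * fxy + k * k * fyy)
    return f(a, b) + linear + quadratic

h, k = 0.1, -0.05
error = f(2.0 + h, 2.0 + k) - taylor2(h, k)
print(error, h ** 3 + k ** 3)
```

The two printed numbers should agree up to rounding, which illustrates how the size of the next-order term controls the approximation error.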