An Introduction to the Directional Derivative and the Gradient Math Insight

The directional derivative

Let the function $f(x,y)$ be the height of a mountain range at each point $\mathbf{x}=(x,y)$. If you stand at some point $\mathbf{x}=\mathbf{a}$, the slope of the ground in front of you will depend on the direction you are facing. It might slope steeply up in one direction, be relatively flat in another direction, and slope steeply down in yet another direction.

The partial derivatives [1] of $f$ give the slope in the positive $x$ direction and the slope in the positive $y$ direction. We can generalize the partial derivatives to calculate the slope in any direction. The result is called the directional derivative.

The first step in taking a directional derivative is to specify the direction. One way to specify a direction is with a vector $\mathbf{u}=(u_1,u_2)$ that points in the direction in which we want to compute the slope. For simplicity, we will insist that $\mathbf{u}$ is a unit vector [2]. We write the directional derivative of $f$ in the direction $\mathbf{u}$ at the point $\mathbf{a}$ as $D_{\mathbf{u}}f(\mathbf{a})$. We could define it with a limit definition just as an ordinary derivative or a partial derivative [3]. However, it turns out that for differentiable [4] $f(x,y)$, we won't need to worry about that definition.

The concept of the directional derivative is simple: $D_{\mathbf{u}}f(\mathbf{a})$ is the slope of $f(x,y)$ when standing at the point $\mathbf{a}$ and facing the direction given by $\mathbf{u}$. If $x$ and $y$ were given in meters, then $D_{\mathbf{u}}f(\mathbf{a})$ would be the change in height per meter as you moved in the direction given by $\mathbf{u}$ when you are at the point $\mathbf{a}$.

Note that $D_{\mathbf{u}}f(\mathbf{a})$ is a number, not a matrix. In fact, the directional derivative is the same as a partial derivative if $\mathbf{u}$ points in the positive $x$ or positive $y$ direction.

In the following image, the height $f(x,y)$ of a mountain range is shown as a level curve plot [5]. You can recognize two steep mountain peaks by the closely spaced circular level curves. If $\mathbf{u}$ points straight east ($\theta=0$ in the image), then $\mathbf{u}$ points in the positive $x$ direction ($\mathbf{u}=(1,0)$) so that $D_{\mathbf{u}}f(\mathbf{a}) = \frac{\partial f}{\partial x}(\mathbf{a})$. Similarly, when $\mathbf{u}$ points straight north ($\theta=\pi/2$), then $\mathbf{u}$ points in the positive $y$ direction ($\mathbf{u}=(0,1)$) so that $D_{\mathbf{u}}f(\mathbf{a}) = \frac{\partial f}{\partial y}(\mathbf{a})$.

Directional derivative on a mountain shown as level curves. The height of a mountain range described by a function $f(x,y)$ is shown as a level curve plot.

If you make $\mathbf{u}$ point in a direction parallel to the level curve, what happens to $D_{\mathbf{u}}f(\mathbf{a})$? (Since the height is constant along a level curve, you should be able to infer what the slope in that direction should be.) What happens to $D_{\mathbf{u}}f(\mathbf{a})$ when you turn $\mathbf{u}$ to point in the opposite direction (i.e., add or subtract $\pi$ from $\theta$)?

To help you visualize what is going on in case you are not yet comfortable with level curve plots, a second applet, below, duplicates the above applet but with a mesh plot of the surface $z=f(x,y)$. In this view, the steepness may be easier to see. However, this view is a little misleading for two reasons. First, the dark red dot now floats on the surface of the mountain. Hence, the dark red dot is no longer $\mathbf{a}$, which for this example is really a point in two dimensions. Second, the light green vector is now a three-dimensional vector that points up or down the mountain. The light green vector is no longer exactly the direction vector $\mathbf{u}$, which for this example is really a two-dimensional vector. Nonetheless, this second view further illustrates the concepts of the directional derivative. You can use it to help you understand what is happening in the above level curve plot.

Directional derivative on a mountain shown as mesh plot. The height of a mountain range described by a function $f(x,y)$ is shown as a mesh plot.

The gradient

In most cases, there is one direction $\mathbf{u}$ where the directional derivative $D_{\mathbf{u}}f(\mathbf{a})$ is the largest. This is the uphill direction. (In some cases, such as when you are at the top of a mountain peak or at the lowest point in a valley, this might not be true.) Let's call this direction of maximal slope $\mathbf{m}$. Both the direction $\mathbf{m}$ and the maximal directional derivative $D_{\mathbf{m}}f(\mathbf{a})$ are captured by something called the gradient [6] of $f$ and denoted by $\nabla f(\mathbf{a})$. The gradient is a vector that points in the direction of $\mathbf{m}$ and whose magnitude is $D_{\mathbf{m}}f(\mathbf{a})$. In math, we can write this as

$$\frac{\nabla f(\mathbf{a})}{\|\nabla f(\mathbf{a})\|} = \mathbf{m} \quad\text{and}\quad \|\nabla f(\mathbf{a})\| = D_{\mathbf{m}}f(\mathbf{a}).$$

The below image illustrates the gradient, as well as its relationship to the directional derivative. The definition of $\theta$ is different from that of the above applets. Here $\theta$ is the angle between the gradient and the vector $\mathbf{u}$.

When $\theta=0$, $\mathbf{u}$ points in the same direction as the gradient (and is hidden in the image).

Gradient and directional derivative on a mountain shown as level curves. The height of a mountain range described by a function $f(x,y)$ is shown as a level curve plot. The height $f(\mathbf{a})$ is shown on the bottom cyan slider labeled by $f$. The direction of steepest increase of $f$ is given by the gradient vector $\nabla f(\mathbf{a})$ (the dark blue vector is ten times longer than the actual gradient). The actual length of the gradient $\|\nabla f(\mathbf{a})\|$ is shown by the dark blue line on the middle (light green) slider. The light green line on that slider indicates the value of the directional derivative $D_{\mathbf{u}}f(\mathbf{a})$, where $\mathbf{u}$ is represented by the light green vector coming out of $\mathbf{a}$. The direction of $\mathbf{u}$ is controlled by $\theta$, where $\theta$ is the angle between $\nabla f(\mathbf{a})$ and $\mathbf{u}$.

Notice how the dark blue gradient vector always points up the mountains (in fact, the gradient is always perpendicular to the level curves). When the level curves are close together, the gradient is large. What happens to the gradient at the tops of the mountains?

Note that when $\theta=0$ (or $\theta=2\pi$), the directional derivative $D_{\mathbf{u}}f(\mathbf{a})$ (shown by the light green line on the middle slider) and the magnitude of the gradient $\|\nabla f(\mathbf{a})\|$ (shown by the dark blue line on the middle slider) are identical, i.e., $D_{\mathbf{u}}f(\mathbf{a}) = \|\nabla f(\mathbf{a})\|$. When $\theta=\pi$, then $\mathbf{u}$ points in the opposite direction of the gradient, and $D_{\mathbf{u}}f(\mathbf{a}) = -\|\nabla f(\mathbf{a})\|$. For what values of $\theta$ is $D_{\mathbf{u}}f(\mathbf{a})=0$?

By moving $\mathbf{a}$ (the dark red point) around and changing $\theta$, I hope you can convince yourself that, for a fixed $\mathbf{a}$, the maximal value of $D_{\mathbf{u}}f(\mathbf{a})$ occurs when $\mathbf{u}$ and $\nabla f(\mathbf{a})$ point in the same direction (i.e., when $\theta=0$ or $\theta=2\pi$), and the minimum value occurs when $\mathbf{u}$ and $\nabla f(\mathbf{a})$ point in opposite directions (i.e., when $\theta=\pi$). Hence $D_{\mathbf{u}}f(\mathbf{a})$ always lies between $-\|\nabla f(\mathbf{a})\|$ and $\|\nabla f(\mathbf{a})\|$.

It turns out that the relationship between the gradient and the directional derivative can be summarized by the equation

$$D_{\mathbf{u}}f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u} = \|\nabla f(\mathbf{a})\| \, \|\mathbf{u}\| \cos\theta = \|\nabla f(\mathbf{a})\| \cos\theta,$$

where $\theta$ is the angle between $\mathbf{u}$ and the gradient. (Recall that $\mathbf{u}$ is a unit vector, meaning that $\|\mathbf{u}\|=1$.)

The image is repeated using a plot of $z=f(x,y)$, below. Although its steepness may be easier to see, recall from the above discussion that the dark red point is no longer really $\mathbf{a}$ and the light green vector is no longer really $\mathbf{u}$. Similarly, since the dark blue vector points up the mountain, it is no longer really the gradient $\nabla f(\mathbf{a})$, which, for a function $f(x,y)$ of two variables, is a two-dimensional vector. Despite its shortcomings, this image may help you see how the gradient always points in the direction where the mountain rises most steeply.

Gradient and directional derivative on a mountain shown as mesh plot. The dark blue vector points in the direction of the gradient. The magnitude of the gradient is shown by the dark blue line on the light green slider. The light green vector points at an angle $\theta$ from the gradient; the directional derivative in that direction is shown by the light green line on the light green slider. The dark blue and the light green vectors are shown as three-dimensional vectors tilting up or down the mountain, and hence are not exactly the two-dimensional vectors $\nabla f$ or the $\mathbf{u}$ of $D_{\mathbf{u}}f$.
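If you would like to test the cosine relationship without the applet, here is a minimal numerical sketch. It assumes a made-up height function `f` (a single smooth peak, not the mountain range in the images) and estimates slopes with a central difference; the names `directional_derivative`, `uphill_angle`, and `max_slope` are illustrative only. The code finds the uphill direction m by scanning all directions, then compares the slope at an angle θ away from m with the maximal slope times cos θ.

```python
import math

def f(x, y):
    # Hypothetical "mountain": one smooth peak centred at the origin.
    return 3.0 * math.exp(-(x**2 + y**2) / 4.0)

def directional_derivative(f, a, angle, h=1e-5):
    """Central-difference estimate of the slope of f at the point a when
    facing `angle` (radians, counterclockwise from the positive x direction)."""
    ux, uy = math.cos(angle), math.sin(angle)  # unit vector u
    ax, ay = a
    return (f(ax + h * ux, ay + h * uy) - f(ax - h * ux, ay - h * uy)) / (2 * h)

a = (1.0, 0.5)

# Scan directions in steps of 0.1 degree to find the uphill direction m
# and the maximal slope D_m f(a).
n = 3600
angles = [2 * math.pi * k / n for k in range(n)]
slopes = [directional_derivative(f, a, t) for t in angles]
k_max = max(range(n), key=lambda k: slopes[k])
uphill_angle, max_slope = angles[k_max], slopes[k_max]

# Turn an angle theta away from m and compare with max_slope * cos(theta).
for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    measured = directional_derivative(f, a, uphill_angle + theta)
    predicted = max_slope * math.cos(theta)
    print(f"theta={theta:.3f}  measured={measured:+.5f}  predicted={predicted:+.5f}")
```

The two columns should agree up to the resolution of the direction scan. Because the scan only locates m to within a fraction of a degree, the slope at θ=π/2 comes out small but not exactly zero.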

But what exactly is the gradient?

This page was designed to give you an intuitive feel for what the directional derivative and gradient are. But we have not yet said what exactly the gradient is. The above formula for the directional derivative is nice, but it's not very useful if you don't know how to calculate $\nabla f$. Fortunately, the end result is fairly simple, as the gradient [7] is just a reformulation of the matrix of partial derivatives [8]. You can check out a simple derivation of the gradient [9] to see why this is true. Once you know how to calculate the gradient [10], you can follow these examples [11].
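As a rough preview of what that calculation looks like, the sketch below treats the gradient of a function of two variables as the vector of its two partial derivatives, and computes a directional derivative as the dot product of that vector with a unit vector u. The height function `f`, the point `a`, and the helper names are assumptions chosen only for illustration, and the partial derivatives are estimated numerically rather than derived by hand.

```python
import math

def f(x, y):
    # Hypothetical height function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def gradient(f, a, h=1e-5):
    """Estimate the partial derivatives (df/dx, df/dy) at a by central differences."""
    ax, ay = a
    df_dx = (f(ax + h, ay) - f(ax - h, ay)) / (2 * h)
    df_dy = (f(ax, ay + h) - f(ax, ay - h)) / (2 * h)
    return (df_dx, df_dy)

def directional_derivative(f, a, u, h=1e-5):
    """Slope of f at a in the direction of u, computed as gradient dot u.
    The direction u is rescaled to unit length first."""
    norm = math.hypot(u[0], u[1])
    ux, uy = u[0] / norm, u[1] / norm
    gx, gy = gradient(f, a, h)
    return gx * ux + gy * uy

a = (1.0, 2.0)
grad = gradient(f, a)  # analytically (2xy, x^2 + cos y) for this f

print("gradient at a:", grad)
print("slope facing (3, 4):", directional_derivative(f, a, (3.0, 4.0)))
print("maximal slope (length of gradient):", math.hypot(grad[0], grad[1]))
```

Rescaling u inside the function is a convenience so that any nonzero direction can be passed in; the formula itself requires a unit vector.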

Notes and Links:

1. http://mathinsight.org/partial_derivative_introduction
2. http://mathinsight.org/unit_vector_definition
3. http://mathinsight.org/partial_derivative_limit_definition
4. http://mathinsight.org/differentiability_multivariable_introduction
5. http://mathinsight.org/level_sets#level_curves
6. http://mathinsight.org/gradient_vector
7. http://mathinsight.org/gradient_vector
8. http://mathinsight.org/derivative_matrix
9. http://mathinsight.org/directional_derivative_gradient_derivation
10. http://mathinsight.org/directional_derivative_gradient_derivation
11. http://mathinsight.org/directional_derivative_gradient_examples