Lagrange Multipliers and Problem Formulation

Similar documents
Bounded, Closed, and Compact Sets

2 Second Derivatives. As we have seen, a function f (x, y) of two variables has four different partial derivatives: f xx. f yx. f x y.

5 Day 5: Maxima and minima for n variables.

Math 21a Homework 22 Solutions Spring, 2014

we wish to minimize this function; to make life easier, we may minimize

Math 233. Lagrange Multipliers Basics

GCSE-AS Mathematics Bridging Course. Chellaston School. Dr P. Leary (KS5 Coordinator) Monday Objectives. The Equation of a Line.

Math 213 Exam 2. Each question is followed by a space to write your answer. Please write your answer neatly in the space provided.

Constrained Optimization and Lagrange Multipliers

Math 209 (Fall 2007) Calculus III. Solution #5. 1. Find the minimum and maximum values of the following functions f under the given constraints:

by Kevin M. Chevalier

Table of Laplace Transforms

Optimizations and Lagrange Multiplier Method

MEI Desmos Tasks for AS Pure

Math 233. Lagrange Multipliers Basics

3.3 Optimizing Functions of Several Variables 3.4 Lagrange Multipliers

MATH2111 Higher Several Variable Calculus Lagrange Multipliers

Basics of Computational Geometry

CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12

TABLE 2: Mathematics College Readiness Standards for Score Range 13 15

College Technical Mathematics 1

The following information is for reviewing the material since Exam 3:

. Tutorial Class V 3-10/10/2012 First Order Partial Derivatives;...

College Technical Mathematics 1

1. The Pythagorean Theorem

Trigonometry Summer Assignment

Foundations for Functions Knowledge and Skills: Foundations for Functions Knowledge and Skills:

Math 126 Number Theory

1. Suppose that the equation F (x, y, z) = 0 implicitly defines each of the three variables x, y, and z as functions of the other two:

Chapter 7: Analytic Trigonometry

Trigonometry. 9.1 Radian and Degree Measure

Linear algebra deals with matrixes: two-dimensional arrays of values. Here s a matrix: [ x + 5y + 7z 9x + 3y + 11z

d f(g(t), h(t)) = x dt + f ( y dt = 0. Notice that we can rewrite the relationship on the left hand side of the equality using the dot product: ( f

Lagrange multipliers October 2013

Solution 2. ((3)(1) (2)(1), (4 3), (4)(2) (3)(3)) = (1, 1, 1) D u (f) = (6x + 2yz, 2y + 2xz, 2xy) (0,1,1) = = 4 14

1. Use the Trapezium Rule with five ordinates to find an approximate value for the integral

This blog addresses the question: how do we determine the intersection of two circles in the Cartesian plane?

CW High School. Advanced Math A. 1.1 I can make connections between the algebraic equation or description for a function, its name, and its graph.

Section 4.1: Introduction to Trigonometry

Parallel and perspective projections such as used in representing 3d images.

Local and Global Minimum

Lagrange multipliers 14.8

A small review, Second Midterm, Calculus 3, Prof. Montero 3450: , Fall 2008

, etc. Let s work with the last one. We can graph a few points determined by this equation.

Total. Math 2130 Practice Final (Spring 2017) (1) (2) (3) (4) (5) (6) (7) (8)

Generating Functions

LECTURE 18 - OPTIMIZATION

R f da (where da denotes the differential of area dxdy (or dydx)

Rectangular Coordinates in Space

A Quick Review of Trigonometry

2 Computation with Floating-Point Numbers

Albertson AP Calculus AB AP CALCULUS AB SUMMER PACKET DUE DATE: The beginning of class on the last class day of the first week of school.

List of NEW Maths content

MATH. 2153, Spring 16, MWF 12:40 p.m. QUIZ 1 January 25, 2016 PRINT NAME A. Derdzinski Show all work. No calculators. The problem is worth 10 points.

2012 Mathematics. Intermediate 2 Units 1, 2 and 3, Paper 2. Finalised Marking Instructions

Integrated Algebra 2 and Trigonometry. Quarter 1

Limits. f(x) and lim. g(x) g(x)

MATH STUDENT BOOK. 12th Grade Unit 4

Vector Calculus: Understanding the Cross Product

Math 241, Final Exam. 12/11/12.

AP Calculus AB. Table of Contents. Slide 1 / 180. Slide 2 / 180. Slide 3 / 180. Review Unit

Summer Review for Students Entering Pre-Calculus with Trigonometry. TI-84 Plus Graphing Calculator is required for this course.

Final Exam: Precalculus

13.7 LAGRANGE MULTIPLIER METHOD

Year 8 Mathematics Curriculum Map

UPCAT Reviewer Booklet

Unit 3, Lesson 1.3 Special Angles in the Unit Circle

RATIONAL FUNCTIONS Introductory Material from Earl Please read this!

Analytic Spherical Geometry:

TEACHER CERTIFICATION STUDY GUIDE KNOWLEDGE OF MATHEMATICS THROUGH SOLVING...1

Geometry. Cluster: Experiment with transformations in the plane. G.CO.1 G.CO.2. Common Core Institute

Mathematics Background

Summer Review for Students Entering Pre-Calculus with Trigonometry. TI-84 Plus Graphing Calculator is required for this course.

Paul's Online Math Notes Calculus III (Notes) / Applications of Partial Derivatives / Lagrange Multipliers Problems][Assignment Problems]

Mathematics in Orbit

MEI GeoGebra Tasks for AS Pure

1.5 Equations of Lines and Planes in 3-D

2.2 Limit of a Function and Limit Laws

function at the given point. Thus, for our previous example we would find that lim f(x) = 0

Inverse and Implicit functions

Prentice Hall Algebra Correlated to: ACT College Readiness Standards for Mathematics

SNAP Centre Workshop. Introduction to Trigonometry

Syllabus Form 3. Objectives Students will be able to: Mathematics Department. Revision of numbers. Form 3 Syllabus Page 1 of 7

Pre-Calculus Summer Assignment

Calculus I Review Handout 1.3 Introduction to Calculus - Limits. by Kevin M. Chevalier

Trigonometry made somewhat easier

Grad operator, triple and line integrals. Notice: this material must not be used as a substitute for attending the lectures

MATHEMATICS 105 Plane Trigonometry

Welcome. Please Sign-In

Increasing/Decreasing Behavior

1

Outcomes List for Math Multivariable Calculus (9 th edition of text) Spring

Calculus III. Math 233 Spring In-term exam April 11th. Suggested solutions

AP Calculus AB. Table of Contents. Slide 1 / 180. Slide 2 / 180. Slide 3 / 180. Review Unit

Geometry I Can Statements I can describe the undefined terms: point, line, and distance along a line in a plane I can describe the undefined terms:

Multivariate Calculus Review Problems for Examination Two

NARROW CORRIDOR. Teacher s Guide Getting Started. Lay Chin Tan Singapore

Chislehurst and Sidcup Grammar School Mathematics Department Year 9 Programme of Study

Introduction to Functions of Several Variables

Solving Trigonometric Equations

Transcription:

Lagrange Multipliers and Problem Formulation Steven J. Miller Department of Mathematics and Statistics Williams College Williamstown, MA 01267 Abstract The method of Lagrange Multipliers (and its generalizations) provide answers to numerous important tractable optimization problems in a variety of subjects, ranging from physics to economics to information theory. Below we discuss five different formulations for a military problem (which can be re-interpreted as a problem in business). We will show how the answer to this problem depends on our choice of measuring. This illustrates a very powerful and important lesson: depending on your evaluation criteria, you can end up with very different answers. Updated February 2016, thanks to Stewart A. Levin for pointing out several typos. Contents 1 Description of the Problem 2 2 Interpreting the problem 2 3 Solving the models 5 3.1 First Model......................................... 6 3.2 Second Model........................................ 8 3.3 Third Model......................................... 8 3.4 Model 4........................................... 10 3.5 Model 5........................................... 10 4 Comparing the Models Solutions 10 5 Connection to the rest of the course 11 E-mail: sjm1@williams.edu 1

1 Description of the Problem Consider the following problem from Military Theory. Imagine that the Earth is sphere of radius 1, and that we are trying to determine where to place a military installation in order to minimize the average deployment time to three trouble spots, located at P 1 = (5/, 12/, 0), P 2 = (12/, 5/, 0), P 3 = (3/, 4/, 12/). Assuming we may build our base anywhere on the Earth, where should we place it? While stated as a military exercise, the above problem is very important in business. Instead of three trouble spots imagine we have three markets. Now our goal is to locate our factory or distribution center in order to minimize shipping costs. 2 Interpreting the problem There are two competing influences in building a mathematical model. The first is that we desire a model which captures as many features as possible about the system of interest, while the second is that we wish the model to be mathematically tractable and hence solvable. In general, the more features the model captures, the harder it will be to solve it. For this problem, we need to decide on how to measure the distance between points on the surface of the Earth. Assume our base is located at (x, y, z) with x 2 + y 2 + z 2 = 1. We give five different ways to measure the distance. 1. We use the sum of the distance-squared, where we compute distance by burrowing through the Earth: D 1 (x, y, z) = (x, y, z) P i 2. Expanding, we find D 1 (x, y, z) = (x 5/12) 2 + (y 12/) 2 + z 2 + (x 12/) 2 + (y 5/12) 2 + z 2 + (x 3/) 2 + (y 4/) 2 + (z 12/) 2. There are several advantages of using this measure of distance. The first is that we have a very nice formula (essentially the Pythagorean Theorem) for computing the distance between two points. The second is that since we are squaring the distances before adding, our distance function is a polynomial of degree two in x, y and z. This means D 1 will be a nice function, and the algebra won t be too bad. The disadvantage of this method, of course, is that we cannot burrow through the Earth! 2

2. We use the sum of the distances, where we compute distance by burrowing through the Earth: Expanding, we find D 2 (x, y, z) = (x, y, z) P i. D 1 (x, y, z) = (x 5/12) 2 + (y 12/) 2 + z 2 + (x 12/) 2 + (y 5/12) 2 + z 2 + (x 3/) 2 + (y 4/) 2 + (z 12/) 2. There are two problems with this method. The first is that the square-roots lead to a function which will have a very messy derivative, and the second of course is that this is burrowing through the Earth. 3. We use the sum of 2(1 cos ψ i ), where ψ i is the angle between the point P i and our base at (x, y, z): D 3 (x, y, z) = 2(1 cos ψ i ). There are a lot of advantages of this. The first is that the cosine of an angle is easily computed using the dot product. As all vectors are unit vectors, we have (x, y, z) P i = (x, y, z) P i cos ψ i ; thus for the first point we have 5x + 12y = cos ψ 1. The second advantage is that, for ψ i small, 1 cos ψ i ψi 2 /2; this follows from the Taylor series of cosine (cos θ = 1 θ 2 /2! + θ 4 /4! ). As we are multiplying by 2, we see that for small ψ i each summand is about ψi 2 ; this is why we chose to look at 2(1 cos ψ i ). Note that the closer two points are, the smaller their angle and thus the closer the cosine will be to 1; this shows our quantity is a reasonable measure of distance. For a circle of unit radius, the length of an arc equals the angle in radians, and thus this should be a reasonable approach. Explicitly, we have ( 5x D 3 (x, y, z) = 6 2 + 12y + 12x + 5y + 3x + 4y + 12z ). Note that D 3 is a linear function in the unknown location (x, y, z); thus we expect this will be a very easy function to optimize. 4. We use the sum of the square of the sines of the angles ψ i (with ψ i as above). Explicitly, D 4 (x, y, z) = sin 2 ψ i. 3

From the Taylor expansion of sine (sin θ = θ θ 3 /3! + ), we see for small ψ i that sin 2 ψ i ψ 2 i, and thus we expect the answer to be similar to the previous method. Why are we using this, and more importantly, why are we squaring each sine? Unlike the dot product, our formula for sines is a bit more involved, coming from the cross product. We have as every vector has length 1, we have (x, y, z) P i = (x, y, z) P i sin ψ i ; (x, y, z) P i = sin ψ i. The absolute value function is a horrible nightmare computationally. It is not differentiable, and thus if we use it we cannot apply any of the techniques of calculus. Thus we must remove the absolute value at all costs. We can do this by squaring the above, and obtain (x, y, z) P i 2 = sin 2 ψ i. While the formula for the square of the sine is a bit more involved, it will be a quadratic in x, y and z and thus its gradient won t be too bad. For example, ( (x, y, z) (5/, 12/, 0) = 12z, 5z, 12x 5y ), and thus (x, y, z) (5/, 12/, 0) 2 = Adding the contributions from the other two points yields ( 12x 5y ) 2 + z 2. D 4 (x, y, z) = 329x 2 264xy + 322y 2 72xz 96yz + 363z 2. Note again how nice the above function looks. 5. We use the sum of the distances of the arcs on the surface of the Earth. We know from geometry that, for a circle of radius 1, the length of the arc equals the measure of the angle in radians. Thus the angle ψ i between (x, y, z) and P i satisfies cos ψ i = (x, y, z) P i, or Thus the distance is ψ i = arccos ((x, y, z) P i ). D 5 (x, y, z) = arccos ((x, y, z) P i ). 4

The advantage of this formula is that it is the correct measure of distance; the disadvantage is that it involves the function arccosine. The derivative of arccos t is (1 t 2 ) 1/2. Similar to the second approach, this will lead to very messy derivatives and it won t be possible to get a nice, clean closed form expression for the solution. Explicitly, our function is D 5 (x, y, z) = arccos ( 5x + 12y ) + arccos ( 12x + 5y ) ( 3x + arccos + 4y + 12z ). We now try and use the method of Lagrange Multipliers to find the optimal site for our base. As our constraint function is g(x, y, z) = x 2 + y 2 + z 2 = 1, we have Thus we are trying to solve for k {1, 2, 3, 4, 5}, or ( g)(x, y, z) = 2(x, y, z). ( D k )(x, y, z) = λ( g)(x, y, z), g(x, y, z) = 1 ( D k )(x, y, z) = 2λ(x, y, z), g(x, y, z) = 1 for k {1, 2, 3, 4, 5}. The natural question to ask is how our choice of measuring distance affects the final answer; do the five different methods lead to our base being in approximately the same place, or are there significant differences in its location? Remark 2.1. Note that it is possible to generalize this problem by incorporating weights. For example, it may be the case that the three trouble spots are not equally unstable, and thus we are less concerned with one of them. Or perhaps the three points represent markets for our product, and one market is not as important as another and buys significantly less. Similar to our many choices for measuring the distance, we have many ways of including weights. One possibility is to have linear weights, such as D 2;weighted (x, y, z) = w i (x, y, z) P i ; for example, we might take w 1 = 6, w 2 = 3 and w 1 = 1, indicating that the first location is significantly more important than the second which is significantly more important than the third. Such linear weights lead to trivial modifications in these functions, and similar arguments will lead to their solutions. 3 Solving the models We now describe how to solve the five different models. Our goal is to see how significant these different formulations are. Remember the constraint is g(x, y, z) = x 2 + y 2 + z 2 = 1, ( g)(x, y, z) = 2(x, y, z). 5

3.1 First Model Recall Expanding, we find D 1 (x, y, z) = (x, y, z) P i 2. D 1 (x, y, z) = (x 5/12) 2 + (y 12/) 2 + z 2 + (x 12/) 2 + (y 5/12) 2 + z 2 + (x 3/) 2 + (y 4/) 2 + (z 12/) 2. We thus find that ( D 1 )(x, y, z) = ( 40 + 78x, 42 + 78y, 24 + 78z). Setting D 1 = λ g yields ( 40 + 78x, 42 + 78y, 24 + 78z) = 2λ(x, y, z). As happens for many problems like this, it helps to break the analysis into two cases: λ = 0 and λ 0. If λ = 0 we find ( 40 + 78x, 42 + 78y, 24 + 78z) = (0, 0, 0), which yields x = 40/78, y = 42/78, z = 24/78; however, this point must also satisfy our constraint. Substituting we find (40/78) 2 + (42/78) 2 + (24/78) 2 = 985/1521 1; as this point does not satisfy the constraint, it is neither a maximum nor a minimum. Remember that the Method of Lagrange Multipliers finds candidates for maxima and minima; just because a point is a candidate does not ensure that it must be a local max or a local min. We now turn to the other case, λ 0. We thus have the following system of equations to solve: 40 + 78x = 2λx 42 + 78y = 2λy 24 + 78z = 2λz x 2 + y 2 + z 2 = 1. As λ 0, we may take ratios of the various equations. The advantage of doing this is that the λ s disappear. It s worth thinking about the seeming absurdity of our approach: we introduce λ to help solve the problem, and then proceed to eliminate it without ever finding its value! It s a placeholder whose 6

existence helps us find the max/min. (Note: there are many problems where the value of λ is important, as it has a physical interpretation.) Taking ratios yields 40 + 78x 42 + 78y = 2λx 2λy = x y. When we cross multiply, we are quite fortunate: 40y + 78xy = 42x + 78xy. We have the same number of xy on both sides, which cancels, and we find Taking the ratio of the first and the third yields Cross multiplying yields Canceling the 78xz from both sides yields y = 42x 40. 40 + 78x 24 + 78z = 2λx 2λz = x z. 40z + 78xz = 24x + 78xz. z = 24x 40. We now substitute into our constraint; as we can write y and z in terms of x, we will have one equation in one unknown: x 2 + y 2 + z 2 = 1 becomes x 2 + ( ) 2 42 x 2 + 40 ( ) 2 24 x 2 = 1, 40 or 197 80 x2 = 1, which implies 80 400 x = ± 197 = ± 985 0.637253, where we have written our answer with the following denominator to facilitate comparisons with other methods. Now that we have x it is easy to find y and z. While there are two different choices for x, a moments thought shows that the minimum distance has to be when we take the positive square root. The reason is our three trouble points are all in the first octant of the Earth (they all have non-negative coordinates), and 7

thus the minimum distance will be from the point with three non-negative coordinates and not the point with three non-positive coordinates. We thus find the optimal location, using the above method to measure distance, is ( ) 400 420 240 (x, y, z) = 985, 985, (0.637253, 0.669116, 0.382352). 985 3.2 Second Model We encourage the reader to find the gradient of D 2 (x, y, z). The first two terms of D 2 / x are x 3 (x 3/) 2 + (y 4/) 2 + (z 12/) + x + 5 2 (x 5/) 2 + (y 4/) 2 + (z 12/) +. 2 Using numerical approximation techniques, Mathematica computes that D 2 is minimized, subject to the constraint, when (x, y, z) = (0.658882, 0.723174, 0.207108). 3.3 Third Model This is perhaps the easiest of all cases, as we have D 3 (x, y, z) = 78 40x 42y 24z. This leads to Setting D 3 = λ g yields ( D 3 )(x, y, z) = ( 40, 42, 24). ( 40, 42, 24) = 2λ(x, y, z) and x 2 + y 2 + z 2 = 1. Again, in problems like this it is often convenient to break the analysis into two cases, when λ = 0 and when λ 0. If λ = 0 then we quickly see no choice of x, y, z works. Thus we may assume λ 0. We have 40 = 2λx 42 = 2λy 24 = 2λz x 2 + y 2 + z 2 = 1. 8

Taking the ratio of the first and the second equation gives or which yields 40 42 = 2λx 2λy, 40 42 = x y, y = 42x 40. If we take the ratio of the first and the third we similarly find 40 24 = 2λx 2λz, or z = 24x 40. Note that these are exactly the relation between x and y that we found in Model 1, and thus the solution will also be ( ) 400 420 240 (x, y, z) = 985, 985, (0.637253, 0.669116, 0.382352). 985 As Models 1 and 3 give the same location, it is worth thinking about why this is. We can use the law of cosines to determine the distance (burrowing through the Earth) from (x, y, z) to P i. Letting ψ i denote the angle between these two vectors and d i the distance between them, and noting that the two vectors have unit length, we have In other words, and d 2 i = (x, y, z) 2 + P i 2 2 (x, y, z) P i cos ψ i = 2 2 cos ψ i d 2 i = 2(1 cos ψ i ), D 1 (x, y, z) = D 3 (x, y, z). While the algebra after computing the gradients in Model 1 and Model 3 is the same, I think the algebra is significantly easier in Model 3 in the beginning. 9

3.4 Model 4 Recall which leads to the gradient D 4 (x, y, z) = 329x 2 264xy + 322y 2 72xz 96yz + 363z 2, ( D 4 )(x, y, z) = (658x 264y 72z, 264x + 644y 96z, 72x 96y + 726z). While we again have four unknowns in four equations, the algebra becomes a bit more tedious and involved here. If you know linear algebra, it is not that bad to find the solutions. As linear algebra is not a pre-requisite for the course, we will simply state what you would find (the closed form solution is a pain to write down). 3.5 Model 5 (x, y, z) (0.659301, 0.688772, 0.301521) Due to the presence of the arccosine function, D 5 (x, y, z) has a very messy derivative, and we must resort to numerical approximations. This show that the optimal location is approximately (x, y, z) (0.659441, 0.7781, 0.2359). 4 Comparing the Models Solutions We compare the five models: Model 1 Model 2 Model 3 Model 4 Model 5 x.637.659.637.659.659 y.669.723.669.689.714 z.382.207.382.302.236 Note five different models returned four different answers! Two of the five models (numbers 2 and 5) had to be solved numerically; the others could be solved using elementary techniques. The fifth model is the best way to measure distance, but not the easiest to work with. Of the models where we can solve everything in closed form, it s clear that Model 4 is closer to the true solution than Model 1 (which is the same as Model 3). Again, the very important takeaway for this problem is that the answer to a problem can depend on how you choose to ask the question. Different notions of distance lead to different locations. 10

5 Connection to the rest of the course This material is a natural stepping stone to the Method of Least Squares. The purpose of the Method of Least Squares is to find the best values of parameters in a theory given experimental observations. The main idea is that we have an error function which depends on the unknown parameters, and we must minimize it, with the minimum occurring when the gradient vanishes. We thus need to decide on what is the best way to measure errors. We cannot do just the amount we are off, as that is a signed quantity. We could use the absolute value of the difference between theory and experiment, but the absolute value function is not differentiable. We thus use the square of the difference. This turns out to be mathematically tractable, and leads to very nice closed form expressions. 11