Chapter 3 Numerical Methods

Chapter 3 Numerical Methods, Part 1
3.1 Linearization and Optimization of Functions of Vectors

Problem Notation

Outline: 3.1.1 Linearization; 3.1.2 Optimization of Objective Functions; 3.1.3 Constrained Optimization; Summary

Motivation: A small number of numerical methods occur frequently: roots of nonlinear equations, optimization, and integration of differential equations. You can't use MATLAB to control the robot, so you need to know how to implement these methods and how to cast a problem in standard form.

Motivation: These techniques will be used for control, perception, position estimation, and mapping. Specifically: computing wheel velocities, inverting dynamic models, generating trajectories, tracking features in an image, constructing globally consistent maps, identifying dynamic models, and calibrating cameras.

Outline: 3.1.1 Linearization; 3.1.2 Optimization of Objective Functions; 3.1.3 Constrained Optimization; Summary

3.1.1.1 Taylor Series (About Any Point)

Scalar function of a scalar:

  f(x + Δx) = f(x) + (df/dx) Δx + (1/2!) (d²f/dx²) Δx² + (1/3!) (d³f/dx³) Δx³ + ...

Scalar function of a vector:

  f(x + Δx) = f(x) + (∂f/∂x) Δx + (1/2!) Δxᵀ (∂²f/∂x²) Δx + ...

Vector function of a vector:

  f(x + Δx) = f(x) + (∂f/∂x) Δx + ...

where the second- and higher-order terms involve derivative arrays of progressively higher dimension.

3.1.1.2 Taylor Remainder Theorem

  f(x + Δx) = T_n(Δx) + R_n(Δx)

  R_n(Δx) = ∫ from x to x+Δx of [ (x + Δx − ζ)ⁿ / n! ] f⁽ⁿ⁺¹⁾(ζ) dζ

Provided the (n+1)th derivative is bounded, the error in a truncated Taylor series is of order (Δx)ⁿ⁺¹. When Δx is small, (Δx)ⁿ⁺¹ is very small.

3.1.1.3 Linearization

  f(x + Δx) ≈ f(x) + (∂f/∂x) Δx

This is accurate to first order, meaning the error is second order.
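As a quick numerical check (the test function sin(x) is chosen here for illustration and is not from the slides), the linearization error should shrink by roughly a factor of 100 each time Δx shrinks by a factor of 10:

```python
# Verify that the linearization error f(x+dx) - [f(x) + f'(x) dx]
# is second order: shrinking dx by 10x shrinks the error by ~100x.
import math

def f(x):
    return math.sin(x)

def df(x):
    return math.cos(x)

x = 1.0
errors = []
for dx in (1e-1, 1e-2, 1e-3):
    err = abs(f(x + dx) - (f(x) + df(x) * dx))
    errors.append(err)

# Each ratio should be close to 100 (second-order error).
print(errors[0] / errors[1], errors[1] / errors[2])
```

The ratios are not exactly 100 because the neglected third-order term still contributes slightly at the larger step sizes.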

3.1.1.4 Gradients and Level Curves and Surfaces

The set of solutions of g(x) = c forms an (n−1)-dimensional subset of Rⁿ called a level surface (a set of level curves). Let the set of solutions be parameterized locally in 1D by the scalar s to form the level curve x(s). Each x(s) is a level curve along which g(x) is constant; you could plot x₁(s) vs. x₂(s), for example. By the chain rule applied along the constraint:

  (d/ds) g(x(s)) = (∂g/∂x) (dx/ds) = 0

since g(x(s)) = c is constant. Hence the gradient is normal to all the level curves.

3.1.1.5 Example

[Figure: a circular constraint of radius R in the (x₁, x₂) plane, with the level curve parameterized by arc length s.] The constraint gradient points radially outward; the level-curve tangent points along the circle; and the tangent is orthogonal to the gradient.
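A minimal sketch of this example, assuming (from the figure) that the constraint is a circle g(x) = x₁² + x₂² − R² = 0 of radius R parameterized by arc length s:

```python
# Check that the constraint gradient is orthogonal to the level-curve
# tangent for an assumed circular constraint x1^2 + x2^2 = R^2.
import math

R = 2.0
s = 0.7  # arc length along the circle

# Level curve parameterized by arc length s
x1 = R * math.cos(s / R)
x2 = R * math.sin(s / R)

grad = (2 * x1, 2 * x2)                      # constraint gradient
tang = (-math.sin(s / R), math.cos(s / R))   # level-curve tangent dx/ds

dot = grad[0] * tang[0] + grad[1] * tang[1]
print(dot)  # ~0: the gradient is normal to the level curve
```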

3.1.1.6 Jacobians and Level Surfaces

For m constraints g(x) = c, the set of solutions forms an (n−m)-dimensional subset of Rⁿ called a level surface (a set of level curves). Let the set of solutions be parameterized by the vector s to form x(s); varying one component of s at a time, x(s₁) and x(s₂) trace different level curves on the surface. By the chain rule:

  (∂/∂s) g(x(s)) = (∂g/∂x) (∂x/∂s) = [0]

Hence the rows of the Jacobian are orthogonal to the tangents to the level curves.

Jacobian and Tangent Plane

Consider again:

  (∂g/∂x) (∂x/∂s) = [0]

The set of all linear combinations of the surface tangents (the columns of ∂x/∂s) is called the tangent plane. Therefore, the constraint tangent plane lies in the nullspace of the constraint Jacobian.
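A small numerical illustration with a hypothetical linear constraint (not from the slides): a basis for the tangent plane can be read off the SVD of the constraint Jacobian, and the Jacobian annihilates every tangent direction.

```python
# The tangent plane of c(x) = 0 is the nullspace of the Jacobian c_x.
import numpy as np

# One hypothetical constraint in R^3: c(x) = x1 + 2*x2 + 3*x3 - 6 = 0
def c_jacobian(x):
    return np.array([[1.0, 2.0, 3.0]])  # constant 1 x 3 Jacobian

x_feasible = np.array([1.0, 1.0, 1.0])
J = c_jacobian(x_feasible)

# Nullspace basis from the SVD: right singular vectors beyond rank 1
_, svals, Vt = np.linalg.svd(J)
null_basis = Vt[1:]   # 2 tangent-plane directions in R^3

# The Jacobian maps every tangent direction to (numerically) zero
resid = float(np.abs(J @ null_basis.T).max())
print(resid)
```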

Outline: 3.1.1 Linearization; 3.1.2 Optimization of Objective Functions; 3.1.3 Constrained Optimization; Summary

3.1.2 Optimization of Objective Functions

  optimize over x: f(x), x ∈ Rⁿ

f is always a scalar, called the objective, cost, or utility function. The max(f) problem is equivalent to min(−f), so we will treat all problems as minimization.

Convexity and Global Minima

A convex (concave) function lies uniformly on or below (above) any chord joining two points of its graph. For such functions, local and global minima are the same.

3.1.2.1 Local Minima

Suppose we compute a minimum over a specified region. Which one you get often depends on where you start searching. Any global minimum (not on a boundary of the region) must also be a local minimum. Therefore finding a local minimum is the more fundamental problem.

3.1.2.1 Notation

This section will henceforth use shorthand in which a subscript denotes differentiation: f_x(x) ≡ ∂f/∂x. So f_x() is a Jacobian, or a gradient if f is scalar-valued.

3.1.2.1 First Order (Necessary) Conditions

Consider satisfying the weaker condition of being a local minimum. In a given neighborhood, a Taylor series approximation applies. Consider the change Δf after moving a distance Δx from a local minimum x*:

  Δf = f(x* + Δx) − f(x*) ≈ f_x(x*) Δx

where f_x(x*) is the objective gradient.

3.1.2.1 First Order (Necessary) Conditions

From the last slide, Δf = f_x(x*) Δx. If x* is a local minimum then Δf ≥ 0 for an arbitrary Δx. But if Δf(Δx) > 0 then Δf(−Δx) would be < 0. Contradiction. Hence we must have equality:

  Δf = f_x(x*) Δx = 0 for all Δx

But since Δx is arbitrary, we must have:

  f_x(x*) = 0ᵀ

at a local minimum.
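A minimal sketch (the quadratic objective is chosen here for illustration, not taken from the slides): following the negative gradient drives f_x toward zero, landing at the unconstrained minimum.

```python
# Gradient descent on a toy objective; at convergence the first-order
# condition f_x(x*) = 0 holds and x* is the minimizer.
import numpy as np

def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])

x = np.array([5.0, 5.0])
for _ in range(200):
    x = x - 0.1 * grad_f(x)   # step along the negative gradient

print(x, grad_f(x))  # gradient ~ 0 at the minimizer (1, -0.5)
```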

3.1.2.1 First Order (Necessary) Conditions

Note: the gradient exists in the x1-x2 plane, not in the x1-x2-f volume.

3.1.2.2 Second Order (Sufficient) Conditions

At a local minimum, a perturbation in any direction satisfies:

  f(x* + Δx) = f(x*) + f_x(x*) Δx = f(x*)

since f_x(x*) = 0. Therefore, the order-2 Taylor series has no linear term at this point. Instead, it looks like (implicit layout):

  f(x + Δx) ≈ f(x) + (1/2!) Δxᵀ (∂²f/∂x²) Δx

3.1.2.2 Second Order (Sufficient) Conditions

In explicit vector layout this is:

  Δf = f(x + Δx) − f(x) = (1/2) Δxᵀ F_xx(x) Δx

So the optimality condition is:

  Δf = (1/2) Δxᵀ F_xx(x) Δx ≥ 0 for all Δx

In other words, the Hessian must be positive semi-definite. The sufficient condition is:

  Δxᵀ F_xx(x) Δx > 0

i.e. the Hessian must be positive definite.
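A quick second-order check on an illustrative quadratic (not from the slides): positive eigenvalues of the Hessian confirm positive definiteness, and hence a strict local minimum.

```python
# f(x) = x1^2 + x1*x2 + 2*x2^2 has its minimum at the origin; its
# (constant) Hessian must therefore be positive definite.
import numpy as np

F_xx = np.array([[2.0, 1.0],
                 [1.0, 4.0]])   # Hessian of the quadratic above

eigvals = np.linalg.eigvalsh(F_xx)
print(eigvals)  # both eigenvalues positive -> positive definite
```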

Feasible Points

Consider a set of m nonlinear constraints on the n elements of x:

  c(x) = 0, c ∈ Rᵐ, x ∈ Rⁿ

Any point that satisfies these is called a feasible point. Suppose the point x′ is feasible.

3.1.2.3 Feasible Perturbations

Consider a feasible perturbation Δx, that is, one which remains feasible to first order:

  c(x′ + Δx) ≈ c(x′) + c_x(x′) Δx = 0

The Jacobian c_x is nonsquare, m by n. Since x′ is feasible, c(x′) = 0. Hence:

  c_x(x′) Δx = 0

for a feasible perturbation from a feasible point.

3.1.2.4 Constraint Nullspace

From the last slide, c_x(x′) Δx = 0. The Jacobian is composed of m gradients (its rows):

  { c¹_x(x′), c²_x(x′), ..., cᵐ_x(x′) }

The feasibility condition requires the perturbation to lie in the nullspace of the constraints:

  N(c_x) = { Δx : c_x(x′) Δx = 0 }

3.1.2.4 Constraint Tangent Plane

The constraint nullspace is also known as the constraint tangent plane.

Conditions for Optimality and Feasibility

Outline: 3.1.1 Linearization; 3.1.2 Optimization of Objective Functions; 3.1.3 Constrained Optimization; Summary

3.1.3 Constrained Optimization

  optimize over x: f(x), x ∈ Rⁿ
  subject to: c(x) = 0, c ∈ Rᵐ

There are n variables in f() subject to m < n constraints. This is also known as nonlinear programming. If there is no f(), it is constraint satisfaction or root finding. If there are no constraints, it is unconstrained optimization.

3.1.3.1 First Order (Necessary) Conditions

There are n−m degrees of freedom left after the constraints are enforced. We cannot just set the objective gradient to zero, because such a point may not be feasible. Define a local minimum to mean that, for a feasible perturbation from a feasible point, the objective must not decrease:

  c_x(x*) Δx = 0 for all feasible Δx
  f_x(x*) Δx = 0 for all feasible Δx

Lagrange Multipliers

First, a feasible perturbation lies in the constraint tangent plane:

  c_x(x*) Δx = 0 for all feasible Δx

Second, the objective gradient is orthogonal to an arbitrary vector in the tangent plane:

  f_x(x*) Δx = 0 for all feasible Δx

The rows of c_x(x*) span the space of all vectors orthogonal to the constraint tangent plane. Hence, f_x() must be some linear combination of them:

  f_x(x*) = λᵀ c_x(x*)
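A worked instance (the problem is chosen here for illustration): for minimize x1² + x2² subject to x1 + x2 = 1, the necessary conditions f_xᵀ + c_xᵀλ = 0 and c(x) = 0 are linear, so they can be solved directly as one system.

```python
# Solve the Lagrange necessary conditions for
#   min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
# as a single linear system in (x1, x2, lambda):
#   2*x1 + lam = 0,  2*x2 + lam = 0,  x1 + x2 = 1
import numpy as np

K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])

x1, x2, lam = np.linalg.solve(K, rhs)
print(x1, x2, lam)  # x* = (0.5, 0.5), lambda = -1
```

At the solution the objective gradient (1, 1) is indeed a multiple of the constraint gradient (1, 1), exactly as the slide requires.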

Constrained Local Minimum

[Figure: level curves of the objective (f₁ ... f₄) crossing the constraint curve c(x) = 0.]

Moving left or right along the constraint increases the objective. The objective gradient is orthogonal to the level curves of both the constraints and the objective, and the level curves of constraints and objective are mutually tangent.

Necessary Conditions

Reverse the sign of the (unknown) multipliers to get:

  f_xᵀ(x*) + c_xᵀ(x*) λ = 0
  c(x) = 0                      (Eqn. A)

It is customary to define the Lagrangian:

  l(x, λ) = f(x) + λᵀ c(x)

Necessary Conditions

Define the Lagrangian: l(x, λ) = f(x) + λᵀ c(x). The necessary conditions (Eqn. A) can then be written as:

  ∂l(x, λ)/∂x ᵀ = 0
  ∂l(x, λ)/∂λ ᵀ = 0

These are the necessary conditions for a constrained local minimum.

3.1.3.2 Solving NLP Problems

Four basic techniques:
Substitution: substitute the constraints into the objective and solve the resulting (n−m)-dof unconstrained problem.
Descent: follow the negative gradient of f() while remaining feasible.
Lagrange Multipliers: solve the n+m (linearized) necessary conditions directly.
Penalty Function: convert the constraints to a cost and solve the resulting unconstrained problem.

3.1.3.5 Example: Penalty Function

Consider the problem. Suppose the constraints are not satisfied, and define the residual. Define a weight matrix R and form the new cost function.

Example: Penalty Function

From the last slide: differentiate f(x) with respect to x and solve. Hence: the solution depends on the weight matrix.
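Since the slide's equations are not reproduced in this transcription, the sketch below applies the penalty idea to an assumed toy problem (minimize x1² + x2² subject to x1 + x2 = 1), with a scalar weight R standing in for the weight matrix:

```python
# Penalty method: replace the constraint with a quadratic cost
#   f'(x) = x1^2 + x2^2 + R * (x1 + x2 - 1)^2
# and solve the unconstrained problem by setting its gradient to zero,
# which here is the linear system (1+R) x1 + R x2 = R (and symmetrically).
import numpy as np

def penalty_min(R):
    A = np.array([[1.0 + R, R],
                  [R, 1.0 + R]])
    b = np.array([R, R])
    return np.linalg.solve(A, b)

for R in (1.0, 10.0, 1000.0):
    x = penalty_min(R)
    print(R, x)  # approaches the constrained solution (0.5, 0.5) as R grows
```

This also shows the slide's point that the solution depends on the weight: for finite R the constraint is only approximately satisfied, and the answer tends to the true constrained minimum only as R grows.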

Outline: 3.1.1 Linearization; 3.1.2 Optimization of Objective Functions; 3.1.3 Constrained Optimization; Summary

Summary

Linearization, derived from the Taylor series, is the fundamental basis of many numerical methods. The gradient vanishes at a local minimum of an unconstrained objective. The gradients of a set of constraints are orthogonal to every vector in the constraint tangent plane.

Summary

A constrained minimum occurs when the objective tangent plane is tangent to the constraint tangent plane. Equivalently, it occurs when the objective gradient is a linear combination of the constraint gradients. The Lagrange multipliers are the unknown projections of the objective gradient onto the constraint gradients (the disallowed directions).