Optimal Control Techniques for Dynamic Walking Optimization in Robotics & Biomechanics IWR, University of Heidelberg Presentation partly based on slides by Sebastian Sager, Moritz Diehl and Peter Riede Dynamic Walking 2011 Jena
Why do we do this tutorial? There are many optimal control problems in walking robots One often reads statements like "We use SNOPT (fmincon, etc.) to optimize robot motions" But there is a lot more to optimal control than to nonlinear optimization!
Introduction Katja Mombaur
Nonlinear Optimization Problem A famous nonlinear example function: the Rosenbrock function (Fletcher's "banana valley" function)
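For concreteness, the banana valley function can be written down in a few lines (a minimal Python sketch, not part of the original slides); its long curved valley is what makes it a classic stress test for descent methods:

```python
# Rosenbrock "banana valley" function
# f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2, global minimum at (1, 1)
def rosenbrock(x, y):
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

print(rosenbrock(1.0, 1.0))  # 0.0 at the minimum
print(rosenbrock(0.0, 0.0))  # 1.0 a short way up the valley wall
```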
Important property of nonlinear optimization problems The optimization is performed in finite dimensional space The result is a point in finite dimensional space This will change for the next problem class
Optimal control problems Optimal control = Optimal choice of inputs for a dynamic system
Optimal control problems Optimization problems including dynamics e.g. optimal control problems State and control variables are functions in time (infinite-dimensional variables)
Optimal control methods Transformation from optimal control to nonlinear optimization problem Optimal control method Nonlinear Optimization Methods
MUSCOD, an efficient tool for optimal control problems Developed by Bock and co-workers (IWR, University of Heidelberg) Original version MUSCOD I in 1981 (Fortran 77) MUSCOD II developed in 1995 (C) Since then continuous extensions and updates (e.g. mixed-integer optimal control, NMPC, ...) Also contains efficient integrators with sensitivity generation Unfortunately not yet publicly available, but maybe in the near future If you are interested, contact us
Optimal Control
Optimal control problem (simplified) Lagrange type / Mayer type objective function Process dynamics Initial and final constraints
Optimal control problem (more complex) e.g. for the optimization of dynamic robot gaits: Multiphase problems with many additional constraints But we will stay with the simple problems for now
3 different approaches for optimal control problems Reminder of the key problem: how to handle the infinite dimensionality of the states x(t) and controls u(t)? Dynamic Programming / Hamilton-Jacobi-Bellman equation Indirect Methods / calculus of variations / Pontryagin Maximum Principle Direct Methods (discretization of controls), the only approach treated in this talk
Direct methods for optimal control problems Common idea: replace the control functions u by a discretization (= a finite-dimensional parameter vector) Also called first-discretize-then-optimize methods The infinite dimensionality of the controls is resolved But what about the states x(t)?
Three different methods for state discretization Direct Collocation Direct Single Shooting Direct Multiple Shooting
Direct Single Shooting
NLP in Direct Single Shooting
Numerical example
Single shooting optimization for x_0 = 0.05
Single shooting iteration Solution
Direct Single Shooting: Pros and Cons + Concept easy to understand + Can use state-of-the-art ODE or DAE integrators + (Comparatively) few optimization variables even for large ODE/DAE systems + Need only an initial guess for the controls q - Cannot use knowledge of x in the initialization (e.g. in tracking problems) - Often does not work when the initial guesses for the controls are far off: the trajectory explodes - Unstable systems are difficult/impossible to treat These drawbacks are addressed by multiple shooting Often used in engineering applications, e.g. in the packages gOPT (PSE), ADOPT (Marquardt), ...
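The single shooting idea can be sketched in a few lines of Python with SciPy standing in for a dedicated package; the scalar dynamics x' = x + u, the horizon, and the terminal constraint are hypothetical stand-ins for the slide example, and explicit Euler replaces a proper integrator:

```python
import numpy as np
from scipy.optimize import minimize

N, T = 20, 1.0          # 20 piecewise-constant controls on [0, T]
h = T / N

def simulate(q, x0=0.0):
    # single shooting: one forward sweep from the fixed initial value
    x = x0
    for u in q:
        x = x + h * (x + u)      # hypothetical dynamics x' = x + u, explicit Euler
    return x

def objective(q):
    return h * np.sum(q ** 2)    # discretized integral of u^2

# terminal constraint x(T) = 1, enforced through the forward simulation
cons = [{"type": "eq", "fun": lambda q: simulate(q) - 1.0}]
res = minimize(objective, np.zeros(N), constraints=cons, method="SLSQP")
print(res.success, simulate(res.x))
```

Note that only the controls q are optimization variables; the states appear nowhere in the NLP, which is exactly why a good state initialization cannot be used.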
Direct Multiple Shooting Bock, Plitt, 1981 Discretize controls piecewise on a coarse grid Use functions which have only local support, e.g. piecewise constant or piecewise linear functions
Direct Multiple Shooting Bock, Plitt, 1981 Split long integration interval into many shorter ones Use initial values of all integration intervals as free variables and shoot a new integration from each initial value
Direct Multiple Shooting Bock, Plitt, 1981 Introduce constraints to close the gaps ( continuity conditions ) Also use the integrators to compute the objective function
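The three steps above (coarse control grid, free interval start values, continuity constraints) can be sketched as follows; the scalar dynamics x' = x + u and the boundary values are hypothetical, and SciPy's SLSQP stands in for the tailored, structure-exploiting SQP used in MUSCOD:

```python
import numpy as np
from scipy.optimize import minimize

N, T = 10, 1.0
h = T / N

def step(s, u):
    return s + h * (s + u)       # one Euler step of the hypothetical dynamics x' = x + u

# variables z = [s_0..s_N, u_0..u_{N-1}]: node states AND piecewise-constant controls
def split(z):
    return z[: N + 1], z[N + 1 :]

def objective(z):
    _, u = split(z)
    return h * np.sum(u ** 2)

def defects(z):
    # continuity conditions that close the gaps between shooting intervals
    s, u = split(z)
    return np.array([step(s[i], u[i]) - s[i + 1] for i in range(N)])

cons = [
    {"type": "eq", "fun": defects},
    {"type": "eq", "fun": lambda z: np.array([split(z)[0][0],          # s_0 = 0
                                              split(z)[0][N] - 1.0])}, # s_N = 1
]
# unlike single shooting, the state trajectory itself can be initialized
z0 = np.concatenate([np.linspace(0.0, 1.0, N + 1), np.zeros(N)])
res = minimize(objective, z0, constraints=cons, method="SLSQP")
```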
Discretized Optimal Control Problem
How to solve the NLP
The Lagrangian Function for the NLP Lagrangian function of the constrained NLP Equality multipliers λ_i may have either sign at a solution Inequality multipliers µ_i cannot be negative (cf. shadow prices) For inactive constraints, the multipliers µ_i are zero
First order optimality conditions Karush-Kuhn-Tucker necessary conditions (KKT conditions): x* is feasible There exist λ*, µ* such that the gradient of the Lagrangian vanishes, and the complementarity condition holds, i.e. µ_i* = 0 or h_i(x*) = 0 for all i
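As a tiny illustration (not from the slides), the KKT stationarity and feasibility conditions can be checked by hand for the equality-constrained problem min x² + y² subject to x + y = 1, whose solution is x = y = 1/2 with multiplier λ = -1 (sign convention L = f + λ g):

```python
import numpy as np

# KKT check for: min x^2 + y^2  s.t.  g(x, y) = x + y - 1 = 0
x = np.array([0.5, 0.5])
lam = -1.0                       # equality multiplier, may have either sign
grad_f = 2.0 * x                 # gradient of the objective
grad_g = np.array([1.0, 1.0])    # gradient of the equality constraint
stationarity = grad_f + lam * grad_g
print(stationarity)              # [0. 0.]: gradient of the Lagrangian vanishes
print(x.sum() - 1.0)             # 0.0: x is feasible
```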
Newton's method for unconstrained NLP Idea: calculate zeros of the gradient equation to satisfy the first order optimality conditions Taylor series: Newton iteration: x_{k+1} = x_k - (inverse of the Hessian) * (gradient of the Lagrangian function)
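A bare Newton iteration on the banana valley function shows the scheme in action; this is a plain sketch with an analytic gradient and Hessian and no step size control, so each step solves the linear system H d = -g:

```python
import numpy as np

def grad(v):
    # analytic gradient of f(x, y) = (1 - x)^2 + 100 (y - x^2)^2
    x, y = v
    return np.array([-2.0 * (1.0 - x) - 400.0 * x * (y - x ** 2),
                     200.0 * (y - x ** 2)])

def hess(v):
    # analytic Hessian of the same function
    x, y = v
    return np.array([[2.0 - 400.0 * (y - x ** 2) + 800.0 * x ** 2, -400.0 * x],
                     [-400.0 * x, 200.0]])

v = np.array([-1.2, 1.0])        # classical hard starting point in the valley
for _ in range(20):
    v = v - np.linalg.solve(hess(v), grad(v))   # full (undamped) Newton step
print(v)                         # converges to the minimum (1, 1)
```

With the full step the early iterates jump around wildly, which is exactly why practical codes use damped steps (see the globalization remarks below on this slide deck).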
SQP - applying Newton's method to the KKT conditions Newton's method finds zeros of nonlinear equation systems
SQP, sequential quadratic programming This is equivalent to solving the following quadratic programming problem (QP), built with the Hessian of the Lagrangian function, in each step of the SQP Use an active-set strategy to determine the active inequality constraints Practical SQP: use an update formula for the Hessian, use step size adaptation
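In practice one rarely codes SQP by hand; SciPy's SLSQP method is an off-the-shelf SQP implementation (QP subproblem, quasi-Newton Hessian update, line search). A minimal sketch on the banana valley function with a hypothetical linear inequality constraint that cuts off the unconstrained minimum:

```python
import numpy as np
from scipy.optimize import minimize

# minimize the Rosenbrock function subject to x + y <= 1.5
# (the unconstrained minimum (1, 1) violates this, so the constraint is active)
f = lambda v: (1.0 - v[0]) ** 2 + 100.0 * (v[1] - v[0] ** 2) ** 2
cons = [{"type": "ineq", "fun": lambda v: 1.5 - v[0] - v[1]}]  # SciPy wants g(v) >= 0

res = minimize(f, x0=[0.0, 0.0], constraints=cons, method="SLSQP")
print(res.x)   # ends up on the boundary x + y = 1.5
```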
Example - Newton s method Behavior for banana valley function
Example - comparison between methods Newton's method is much faster than the steepest descent method Convergence roughly linear for the first 15-20 iterations, since the step length α_k < 1 Convergence roughly quadratic for the last iterations, with step length α_k = 1
Essential steps in the NLP solution Sequential quadratic programming (SQP) methods are very efficient methods to solve nonlinear programming problems They use a QP approximation of the NLP in every iteration SQP = Newton's method applied to the KKT conditions Reduction of computation times by computing Hessians via update formulas Globalization of convergence by using damped Newton steps Important to note: iterates can be infeasible; only the solution must be feasible
Back to the optimal control problem
Direct multiple shooting Bock, Plitt, 1981 Simultaneous optimization and simulation Discretized optimal control problem (NLP) Variables: discretized states s_i, discretized controls u_i, parameters p, phase durations h_i On each multiple shooting interval an initial value problem is solved (integration + derivative generation): from s_i, u_i, p, h the integrator returns x_e,i plus sensitivities Solution of the NLP by an efficient SQP method, of the initial value problems by efficient integrators
Trajectory sensitivity generation: black box - a bad idea
Trajectory sensitivity generation: IND a good idea! IND = internal numerical differentiation (Bock, 1981)
Direct Multiple Shooting Bock, Plitt, 1981 Result of the discretization: a large-scale nonlinear programming problem (NLP) Special structure originating from the discretization (in the KKT matrix: Hessian of the Lagrangian function, constraint Jacobians) MUSCOD: solution with a structure-exploiting, tailored sequential quadratic programming (SQP) method and special condensing techniques
Direct Multiple Shooting example Same example as before, same initialization: u = 0 3 iterations, compared to 7 for single shooting
Direct Multiple Shooting: Pros and Cons
Demonstration of example optimal control problem solution with MUSCOD-II A simple example
A simple optimal control problem A car is supposed to travel 300 m in 32 seconds Its initial and final velocity are zero The acceleration can vary between -2 m/s² and 1 m/s² The maximum velocity is 30 m/s Optimization goal: minimize energy in terms of the acceleration squared
A simple optimal control problem Mathematical formulation: subject to the constraints: Objective function (Lagrange type) lfcn Dynamic process model ffcn Initial and final constraints rdfcn Bounds (data file)
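MUSCOD itself is not publicly available, so as a rough stand-in the car problem can be transcribed directly: full discretization with explicit Euler on a 32-interval grid, with SciPy's SLSQP as the NLP solver. Grid size and integrator are arbitrary choices for this sketch, not what MUSCOD does internally:

```python
import numpy as np
from scipy.optimize import minimize

N, T = 32, 32.0          # 32 control intervals over the 32 second horizon
h = T / N
nx = N + 1

def unpack(z):
    # z = [x_0..x_N, v_0..v_N, a_0..a_{N-1}]: positions, velocities, accelerations
    return z[:nx], z[nx:2 * nx], z[2 * nx:]

def objective(z):
    a = unpack(z)[2]
    return h * np.sum(a ** 2)            # "energy": integral of acceleration squared

def eq_constraints(z):
    x, v, a = unpack(z)
    dyn_x = x[1:] - x[:-1] - h * v[:-1]  # x' = v  (explicit Euler)
    dyn_v = v[1:] - v[:-1] - h * a       # v' = a
    bc = np.array([x[0], v[0], x[-1] - 300.0, v[-1]])   # boundary conditions
    return np.concatenate([dyn_x, dyn_v, bc])

bnds = ([(None, None)] * nx              # positions unbounded
        + [(None, 30.0)] * nx            # v <= 30 m/s
        + [(-2.0, 1.0)] * N)             # -2 <= a <= 1 m/s^2
z0 = np.concatenate([np.linspace(0.0, 300.0, nx),
                     np.full(nx, 300.0 / T), np.zeros(N)])
res = minimize(objective, z0, bounds=bnds,
               constraints=[{"type": "eq", "fun": eq_constraints}],
               method="SLSQP", options={"maxiter": 500})
x, v, a = unpack(res.x)
print(res.success, x[-1], v.max())
```

Since the dynamics are linear and the objective quadratic, this transcription is a convex QP and the stand-in solver handles it comfortably; the acceleration bound a <= 1 m/s² is active at the start of the motion.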
MUSCOD-II Model file Macros
1   # of model stages
0   # of global parameters
0   # of coupled constraints
0   among them, # of equality constraints
2   # of differential state variables
0   # of algebraic state variables
1   # of controls
0   # of local parameters
2   # of start point constraints
2   among them, # of equality constraints
2   # of end point constraints
2   among them, # of equality constraints
MUSCOD II source file Objective function: Integral term, Lagrange type Dynamical equations (right hand side)
MUSCOD II source file Constraint equations
MUSCOD-II source file // Define optimal control problem // Define model dimensions // Define model stage with index, dimensions of // state and control variables, pointers to // right hand side and objective functions, ... etc.
MUSCOD-II source file // Define multipoint constraints (at the start point, interior points, or the end point of the last phase)
MUSCOD-II data file Define number of multiple shooting intervals (also equals number of control intervals) Define start values, min and max values, and scaling factors for phase time flag to fix variable
MUSCOD-II data file Define the type of state initialization: s_spec = 0: start values given at every point 1: start value generation by linear interpolation 2: start value generation by forward integration Define start values, scaling factors, min and max values for the states
MUSCOD-II data file Type of control discretization u_ Control variable start values, scaling, min and max
MUSCOD-II data file Scaling of constraints: Scaling, min and max of objective function
Results
Same problem with different objective function Mathematical formulation: subject to the constraints: Objective function (Mayer type) mfcn Dynamic process model ffcn Initial and final constraints rdfcn Bounds (data file)
Results for time minimization
Demonstration of example optimal control problem solution with MUSCOD-II: A bipedal running robot
A bipedal running robot Running motion: alternation between single-leg contact phases and flight phases Touchdown Leg 1 / Phase 1: Contact / Takeoff / Phase 2: Flight / Touchdown Leg 2 + leg shift
A bipedal running robot 5 DOF in flight, 4 DOF in contact 4 actuators: trunk, upper legs Lower legs (massless), only active in the contact phase
Model of bipedal running robot Flight phase
Contact phase Spring damper forces Coupling between state variables
Model of bipedal running robot Contact phase
Form of the model equations required for the optimal control code MUSCOD: ẋ = ffcn(...), where a is the solution of the linear system shown on the previous slides, also satisfying the constraint invariants
Switching functions Switching function for lift-off: leg spring at rest length Must also satisfy the condition: positive vertical velocity (moving up) Switching function for touchdown: height of the contact point = zero Must also satisfy the condition: negative vertical velocity at the contact point
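Switching functions of this kind are what ODE event detection implements; a minimal sketch with SciPy on a one-dimensional flight phase (point mass under gravity, with made-up initial height and velocity), where touchdown is the root of the height with a downward-velocity direction condition:

```python
import numpy as np
from scipy.integrate import solve_ivp

def flight(t, y):
    # y = [height of the contact point, vertical velocity]
    return [y[1], -9.81]

def touchdown(t, y):
    return y[0]                  # switching function: height = 0
touchdown.terminal = True        # stop the integration at the event
touchdown.direction = -1         # only trigger while moving downward

sol = solve_ivp(flight, [0.0, 10.0], [1.0, 2.0], events=touchdown,
                rtol=1e-10, atol=1e-12)
t_td = sol.t_events[0][0]
print(t_td)                      # positive root of 1 + 2 t - 0.5 * 9.81 * t^2
```

The direction condition matters: without it the integrator would already fire on an upward crossing if the trajectory started below the ground.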
Touchdown discontinuity & Periodicity Inelastic impact: there is a jump in all model velocities at touchdown compute velocity after touchdown: Periodicity constraints on all positions (except forward direction) and velocities
The same robot performing somersaults... Mombaur et al. 2005 Only requires a modification of the periodicity constraints: +360° in torso and leg angles per step
... and even open-loop stable flip-flops Requires another modification of the periodicity constraints: +180° in torso and leg angles per step
Literature
Thank you for your attention! www.orb.uni-hd.de Katja Mombaur Pictures taken at the Musée de l'Automate, Souillac