Fachgebiet Simulation und Optimale Prozesse, Fakultät für Informatik und Automatisierung, Institut für Automatisierungs- und Systemtechnik

Laboratory experiment OPT-1: Nonlinear Optimization

Responsible professor: Prof. Dr.-Ing. habil. P. Li
Responsible for lab experiment: Dr.-Ing. S. Hopfgarten

Name, surname / Matrikel (registration) no. / Coworker / Date, mark, signature
1 Aim

The lab experiment deepens the knowledge from the corresponding lectures and exercises and illustrates the solution of unconstrained nonlinear optimization problems

min_x f(x),  x ∈ R^n,  f: R^n → R^1

with different methods. Based on the software package MATLAB¹ it permits the investigation of the properties of numerical methods of unconstrained nonlinear optimization. These methods can be evaluated on either prepared test functions or user-defined optimization problems with regard to effort, convergence rate, and other criteria. A visualization program with a graphical user interface is provided, allowing a 3D representation of the cost function and an isolines diagram for two-dimensional optimization problems (n = 2). Start points can be selected graphically or entered as values. Search paths of different algorithms, or of multiple runs of the same algorithm, can be compared. The graphical illustration of the iterative procedures facilitates the evaluation.

2 Realisation of the lab experiment

The software package MATLAB is the basis of this lab exercise. This package enables scientific and engineering numerical computations (numerical analysis, matrix calculations, signal processing, graphical illustrations, etc.) in an easy-to-use environment. The matrix is the basic data element (with, in general, complex elements). Problems, expressions, algorithms, etc. can be written in a notation close to mathematical notation.
In the framework of this lab experiment the following derivative-free and gradient-based numerical methods of unconstrained optimization are made available:

Gradient-based methods:
- Gradient method (steepest descent)
- Conjugate gradient method according to Fletcher-Reeves, Polak-Ribiere, Hestenes-Stiefel
- Quasi-Newton method according to Wolfe (rank-1 update), Davidon-Fletcher-Powell (rank-2 update), Broyden-Fletcher-Goldfarb-Shanno (rank-2 update), each with (approximately) exact line search
- Quasi-Newton method according to Broyden-Fletcher-Goldfarb-Shanno with Armijo step-size rule

Derivative-free methods:
- Gauss-Seidel method (coordinate search method with line search)

¹ MATLAB is a registered trademark of The MathWorks, Inc.
- Hooke-Jeeves method (pattern search)
- Rosenbrock method (rotating coordinates)
- Nelder-Mead simplex search method

Evolutionary strategies:
- Single-mutant (1+1) evolutionary strategy according to Schwefel [6]
- Multiple-mutant (5/5,20) evolutionary strategy according to Rechenberg [7]
- Cascaded (1,5(5/5,20)) evolutionary strategy according to Rechenberg [7]

Hybrid methods:
- Hybrid of (1,5) evolutionary strategy and Rosenbrock method (combined by the method of direct integration)
- Hybrid of (1,5) evolutionary strategy and simplex method according to Nelder-Mead (combined by the method of direct integration)

Besides these optimization methods a set of test functions is implemented, e.g.:
- f(x) = (1/2) x^T P x,  P a symmetric (n, n) matrix
- f(x) = (x_1^2 + x_2^2 - 2x_1)^2 + (1/4) x_1  (function of Zettl)
- f(x) = 100 (x_2 - x_1^2)^2 + (x_1 - 1)^2  (Rosenbrock valley)
- f(x) = sum_{i=1}^n x_i^10

See the appendix for more details about the implemented search methods, a complete list of test functions, and hints regarding the graphical user interface.

3 Preparation (written homework)

3.1 Establish a positive definite, a negative definite, and an indefinite quadratic form, respectively, for the two-dimensional case (x ∈ R^2)!

3.2 Calculate the location and the type of the stationary points for the following cost functions:
a) f(x) = 100 (x_2 - x_1^2)^2 + (x_1 - 1)^2  (Rosenbrock valley)
b) f(x) = x_1 exp(-x_1^2 - x_2^2)  (problem "Nice")
c) f(x) = 2x_1^3 - 3x_1^2 - 6x_1 x_2 (x_1 - x_2 - 1)  (problem "Fletcher25")

3.3 Repeat the theoretical fundamentals, procedure, and essential properties of selected numerical methods for unconstrained optimization (derivative-free methods: Gauss-Seidel, Hooke-Jeeves; gradient-based methods: gradient method, conjugate gradient method, Quasi-Newton method)!
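For task 3.1, the definiteness of a chosen quadratic form f(x) = (1/2) x^T P x can be cross-checked in MATLAB via the eigenvalues of P. The following sketch uses an arbitrary illustrative matrix, not one prescribed by the task:

```matlab
% Classify the quadratic form f(x) = 0.5*x'*P*x by the eigenvalues of P.
% The matrix P below is only an illustrative choice (homework values differ).
P = [2 0; 0 -3];
ev = eig(P);                      % eigenvalues of the symmetric matrix P
if all(ev > 0)
    disp('positive definite')
elseif all(ev < 0)
    disp('negative definite')
elseif all(ev >= 0) || all(ev <= 0)
    disp('semidefinite')
else
    disp('indefinite')
end
```

For P = [2 0; 0 -3] the eigenvalues are 2 and -3, so this example form is indefinite.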
3.4 As the result of a theoretical process analysis of a given system, a static behaviour (static characteristic curve)

ŷ = (1 - a_1 u)^(a_2 - 1)

with the unknown parameters a_1 and a_2 was determined. Using the measured values

u_i  |   2      5     10     20     30     50
ŷ_i  | 0.9427 0.8616 0.7384 0.5362 0.3739 0.3096

a_1 and a_2 are to be calculated by means of the least-squares method. For this optimization problem, formulate a suitable cost function f(x) with x = [a_1 a_2]^T and a corresponding MATLAB M file, which for the "Nice" problem looks as follows:

function f=f_nice(x)
f=x(1)*exp(-x(1)^2-x(2)^2)

with x as optimization variable and f as cost function value!

4 Execution of the laboratory experiment

All following investigations are performed by means of the MATLAB program opt1 (visualization, user interface), see appendix. Please use Table 1 from the appendix for the evaluation of the convergence behaviour of the numerical methods!

4.1 Display the 3D graphs corresponding to the quadratic forms established under 3.1! For that purpose, load the cost function f_quad (data set Quad.mat) and modify the parameter P1 (Hessian matrix) according to your choice (homework)! In addition, investigate a positive-semidefinite and a negative-semidefinite quadratic form!

4.2 Solve the following two-dimensional quadratic optimization problems (cost function f_quad and data set Quad.mat, resp.; parameter P1: Hessian matrix) by means of the Gauss-Seidel method, the gradient method (steepest descent), the conjugate gradient method, and the Quasi-Newton method (BFGS), starting from different start points! Answer the questions below!

a) P1 = [7 0; 0 7]   b) P1 = [2 0; 0 8]   c) P1 = [4 3; 3 4]

Proposed start points:
α) x0 = [1 0]^T   β) x0 = [0 4]^T   γ) x0 = [1 2]^T   δ) x0 = [1.5 2]^T

How do different start points influence the convergence behaviour of the gradient, conjugate gradient, and Quasi-Newton methods?
What influence has the position of the principal axes of the isolines relative to the coordinate system on the convergence behaviour of the Gauss-Seidel method?

4.3 Investigate the behaviour of selected derivative-free and gradient-based methods on the following simple non-quadratic optimization problems:
a) 3.2a (Rosenbrock valley; cost function f_rose, data set Rose.mat); start points: [-1,0]^T, [-1,1]^T, [1,-1]^T
b) 3.2b (cost function f_nice, data set Nice.mat); start point: [0.3,0.3]^T
c) problem according to Zettl (cost function f_zettl, data set Zettl.mat); start points: [2,0.25]^T, [1.2,0]^T

Put together the advantages and disadvantages of the investigated methods and derive recommendations for their usage!

4.4 Solve the model-building problem 3.4 by means of a method of your choice! Write an M file for the cost function and, if necessary, the gradient evaluation! Display the identified static characteristic curve together with the measurements!

4.5 Test selected optimization methods on pathological cost functions!
a) f(x) = sum_{i=1}^n x_i^10 (f_10, data set F_10.mat)
b) f(x) = sum_{i=1}^n |x_i| (f_abs, data set Abs.mat)
c) cost function f_patho, data set Patho.mat

References

[1] P. Li. Lecture "Steady-state optimization". TU Ilmenau.
[2] Taschenbuch Elektrotechnik. 1. Auflage, Berlin 1977, Bd. 2; 3. Auflage, Berlin 1987, Bd. 1.
[3] R. Fletcher. Practical Methods of Optimization. Vol. 1: Unconstrained Optimization. Wiley, Chichester 1980.
[4] The MathWorks, Inc., Natick, Massachusetts: Using MATLAB, 2000.
[5] The MathWorks, Inc., Natick, Massachusetts: Optimization Toolbox for Use with MATLAB, 2000.
[6] H.-P. Schwefel. Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie. Birkhäuser, Basel 1977.
[7] I. Rechenberg. Evolutionsstrategie '94. frommann-holzboog, Stuttgart 1994.
[8] T. Bäck. Handbook of Evolutionary Computation. Institute of Physics Publishing, Bristol 1997.

A Appendix: Table 1

See next page. Appendix B (MATLAB programs) is not immediately necessary for performing the laboratory experiment.
For a deeper understanding of the structure and the calling conventions of the optimization routines (also for more than two optimization variables), the cost function and gradient calculation procedures, examples of procedure calls, and the visualization, the appendix delivers useful hints and can be consulted if needed.
method | start point | optimal solution | opt. cost f. value | no. of iterations | CPU time | no. of c. f. evaluations | no. of grad. evaluations

Table 1: Table for the evaluation of convergence behaviour
B Appendix: MATLAB programs

B.1 Optimization routines

ovmeth: gradient-based search methods with (approximately) exact line search: gradient method (steepest descent); conjugate gradient method according to Fletcher-Reeves, Polak-Ribiere, Hestenes-Stiefel; Quasi-Newton method according to Wolfe (rank-1 update), Davidon-Fletcher-Powell (rank-2 update), Broyden-Fletcher-Goldfarb-Shanno (rank-2 update)
ovbfgs: Quasi-Newton method according to Broyden-Fletcher-Goldfarb-Shanno with Armijo step-size rule
ovevol: single-mutant (1+1) evolutionary strategy according to Schwefel
ovevol520: (5/5,20) evolutionary strategy according to Rechenberg
ovfmins: simplex method of Nelder-Mead, corresponds to fmins from the MATLAB Optimization Toolbox [5]
ovgs: Gauss-Seidel method (coordinate search method with line search)
ovhoje: Hooke-Jeeves method (pattern search)
ovrose: Rosenbrock method (rotating coordinate system)
oveses: cascaded (1,5(5/5,20)) evolutionary strategy according to Rechenberg
ovesrose: hybrid method ((1,5) evolutionary strategy and Rosenbrock method, combined by the direct integration method)
ovesfmins: hybrid method ((1,5) evolutionary strategy and simplex method according to Nelder-Mead, combined by direct integration)

The parameter lists of the optimization routines are unified as far as possible and correspond to those of the MATLAB Optimization Toolbox [5]. For all methods they contain the following parameters:

fun: cost function procedure; either the name of an M file (e.g. f_rose) calculating the cost function value at the given point (f=fun(x)), or the cost function as a character string of MATLAB statements (e.g. x(1)^2+2*x(2)^4 with the optimization variable x)
x: start point (column or row vector)
options: specification of truncation thresholds, parameters for the methods, etc. options is a vector of length 18; it is sufficient to give only the values differing from the standard values (given in square brackets below); options is completed up to length 18.
options(1): control of output (-1: none, 0: standard, 1: iteration course numerically) [0]
options(2): truncation threshold (change of variables) [1.0E-4]
options(3): truncation threshold (change of cost function; and gradient norm for gradient-based methods) [1.0E-4]
options(4): not used
options(5): not used
options(6): method variant (ovmeth: calculation of the search direction, start approximation of the Hessian matrix) [0]
options(7): line search algorithm (step-length calculation, method dependent) [0]
options(8): cost function value at the point x after truncation
options(9): test of the gradient calculation (0: no test, 1: check of the gradient calculated with gradfun against a difference approximation) [0]
options(10): no. of cost function evaluations
options(11): no. of gradient calculations
options(14): maximum no. of iterations [100]
options(16): step-length factor (method dependent)
options(17): step-length factor (method dependent)
options(18): start step length for the line search [0] (only for gradient-based methods)
gradfun: procedure for the calculation of the gradient of the cost function (column vector), see fun
P1,...,P10: up to 10 parameters (matrices) passed on to the cost function and gradient calculation procedures. (They serve to avoid global variables.)

Return parameters of the optimization routines:

x: solution vector, i.e. the value of the optimization variables after truncation of the iterations
options: see above
xpath: search path. The matrix xpath contains the optimization variables, the cost function value, the CPU time, and the cumulative no. of cost function evaluations at each iteration.

B.2 Prepared cost function and gradient evaluation procedures

The names of the M files start with f_ (cost function evaluation procedure) and df_ (gradient evaluation procedure), respectively. The data set name given in brackets is used in the graphical user interface.

f_abs (Abs.mat): sum of absolute values: f(x) = sum_{i=1}^n |x_i|
f_ackley (Ackley.mat): problem of Ackley: f(x) = 20 + e - 20 exp(-0.2 ||x|| / sqrt(n)) - exp((sum_{i=1}^n cos 2π x_i) / n)
f_beale (Beale.mat): problem of Beale: f(x) = (1.5 - x_1(1 - x_2))^2 + (2.25 - x_1(1 - x_2^2))^2 + (2.625 - x_1(1 - x_2^3))^2
(Flet23.mat): problem of Fletcher, p. 23: f(x) = 2x_1^3 - 3x_1^2 - 6x_1 x_2 (x_1 - x_2 - 1)
(Flet25.mat): problem of Fletcher, p. 25: f(x) = 2x_1^2 + x_2^2 - 2x_1 x_2 + 2x_1^3 + x_1^4
(Flet59.mat): problem of Fletcher, p. 59: f(x) = ((x_2 - 1)^2 + (2x_1 - 1)^2 + (2x_2 - 1)^2 - 2/3)^2
f_kowa: parameter estimation problem with the least-squares method (n = 4)
(Kursaw.mat): problem of Kursawe: f(x) = sum_{i=1}^n (|x_i|^0.8 + 5 sin x_i^3)
f_leon (Leon.mat): problem of Leon: f(x) = 100 (x_2 - x_1^3)^2 + (x_1 - 1)^2
f_nice (F_Nice.mat, Nice.mat): f(x) = x_1 exp(-x_1^2 - x_2^2)
f_patho (Patho.mat): non-differentiable cost function: f(x) = (1/2) max{|x_1|, |x_2|} + min{[x_1] - x_1, [x_2] - x_2}
f_quad (Quad.mat): quadratic cost function (with cost function parameter matrix P): f(x) = (1/2) x^T P x
(Peaks.mat): f(x) = 3(1 - x_1)^2 exp(-x_1^2 - (x_2 + 1)^2) - 10 (x_1/5 - x_1^3 - x_2^5) exp(-x_1^2 - x_2^2) - (1/3) exp(-(x_1 + 1)^2 - x_2^2) + 0.1 (x_1^2 + x_2^2)
f_rast (Rast.mat): problem of Rastrigin: f(x) = 10 n + sum_{i=1}^n (x_i^2 - 10 cos(2π x_i))
f_regler: control problem example (design of a PD controller)
f_rose (Rose.mat): problem of Rosenbrock (Rosenbrock valley, banana function): f(x) = 100 (x_2 - x_1^2)^2 + (x_1 - 1)^2
f_foxholes (Foxholes.mat): problem of Shekel (Shekel's foxholes):
f(x) = [1/K + sum_{j=1}^{25} 1/(c_j + sum_{i=1}^{2} (x_i - a_ij)^6)]^(-1),  K = 500,  c_j = j,
(a_ij) = [-32 -16 0 16 32 -32 ... 0 16 32; -32 -32 -32 -32 -32 -16 ... 32 32 32]
(Sixhump.mat): six-hump camel-back function: f(x) = (4 - 2.1 x_1^2 + x_1^4/3) x_1^2 + x_1 x_2 + (-4 + 4 x_2^2) x_2^2 + 1.0316
f_walsh (Walsh.mat): model-building problem of Walsh
f_zettl (Zettl.mat): problem of Zettl: f(x) = (x_1^2 + x_2^2 - 2x_1)^2 + (1/4) x_1
f_10 (F_10.mat): 10th power: f(x) = sum_{i=1}^n x_i^10

B.3 Examples of procedure calls

B.3.1 Cost function

The optimization routines require a MATLAB M file for the evaluation of the cost function, getting the optimization variable x as an argument and delivering the cost function value f(x) as the result.

function f=f_nice(x)
f=x(1)*exp(-x(1)^2-x(2)^2);

The name of this M file (f_nice) has to be given in the call of the optimization routine in the MATLAB Command Window, if the graphical user interface is not used.

>> x0=[-1 -1]';
>> ovfmins('f_nice',x0)

x0 is the start point for the simplex search method according to Nelder-Mead used here. Alternatively, the cost function can also be entered as a MATLAB statement in a character string. The identifier x must be used for the optimization variable.

>> ovfmins('x(1)*exp(-x(1)^2-x(2)^2)',x0)
B.3.2 Gradient

Some of the implemented optimization algorithms use the gradient ∇f(x) of the cost function to determine the search direction. The gradient calculation can be done in a MATLAB M file. The optimization variable x is given as an argument to the gradient calculation procedure, which delivers the n-dimensional column vector as the result.

function df=df_nice(x)
df=exp(-x(1)^2-x(2)^2)*[1-2*x(1)^2; -2*x(1)*x(2)];

>> x0=[1 1]';
>> ovbfgs('f_nice',x0,[],'df_nice')

Alternatively, the gradient can also be entered as a MATLAB statement in a character string. The identifier x must be used for the optimization variable.

>> ovbfgs('x(1)*exp(-x(1)^2-x(2)^2)',x0,[],...
'exp(-x(1)^2-x(2)^2)*[1-2*x(1)^2; -2*x(1)*x(2)]')

If no gradient is given in the call of a gradient-based optimization routine, the derivatives needed are calculated approximately by finite differences.

>> ovbfgs('x(1)*exp(-x(1)^2-x(2)^2)',x0)

B.3.3 Cost function parameters

In many cases the cost function depends on additional parameters besides the optimization variables. The parameters themselves are not optimized, but their influence on the optimal solution is of interest. To avoid global variables in such cases, up to 10 such cost function parameters can be given directly to the cost function as additional arguments at the end of the parameter list.

function f=f_nice(x,p)
if nargin<2, p=0; end
f=x(1)*exp(-x(1)^2-x(2)^2)+p/2*(x(1)^2+x(2)^2);

>> x0=[1 1]';
>> p=0.1;
>> ovfmins('f_nice',x0,[],p)

If the cost function (or the gradient) is given as a MATLAB statement in character string form, the identifiers for the cost function parameters must be P1, P2, etc.

>> ovfmins('x(1)*exp(-x(1)^2-x(2)^2)+P1/2*(x(1)^2+x(2)^2)',x0,[],p)
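As a concrete application of cost function parameters, the least-squares problem of task 3.4 can be sketched as follows. The file name f_lsq and the start point are illustrative choices, and the model expression is the assumed static characteristic from the task sheet:

```matlab
function f=f_lsq(x,u,y)
% Least-squares cost for task 3.4: x=[a1 a2]', with the measured data u, y
% passed as cost function parameters (avoids global variables).
% The model yhat = (1 - a1*u).^(a2 - 1) is the assumed characteristic curve.
yhat=(1-x(1)*u).^(x(2)-1);
f=sum((yhat-y).^2);
```

A possible call, passing the measurements as parameters:

>> u=[2 5 10 20 30 50]; y=[0.9427 0.8616 0.7384 0.5362 0.3739 0.3096];
>> x0=[0 1]';
>> ovfmins('f_lsq',x0,[],u,y)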
B.4 Visualization / graphical user interface

3D graphs and search paths of the solution routines can be visualized for optimization problems with two variables (n = 2). For this purpose the graphical user interface opt1 is available and can be started from the MATLAB Command Window.

>> cd OPT1
>> opt1('english')

The graphical user interface consists of 4 windows. The optimization problem to be investigated is defined by entering in the window "Optimization problem":

- the cost function (M file or MATLAB statement)
- the gradient (M file or MATLAB statement)
- cost function parameters
- the graphical display area (grid points in the (x_1, x_2) plane)
- the isolines to be displayed (no. of isolines or vector of cost function levels)

The gradient calculation can be validated by comparison with a numerical approximation of the gradient. If no gradient is given and the cost function is entered as a MATLAB statement, the gradient is computed symbolically; otherwise the gradient-based routines use an approximate (finite-difference) gradient calculation. Further dialogue elements permit saving and loading of a prepared optimization problem, and closing of the program.
The cost function is visualized in the window "Cost function" in a (pseudo-)3D manner. The colour map, the type of display, and the horizontal and vertical view angles can be modified by the corresponding dialogue elements. Optionally, the search path can be displayed. The isolines and the search paths of optimization runs are shown in the window "Cost function levels" as functions of the optimization variables.
Up to 4 optimization runs, identifiable by different colours, can be selected in the window "Optimization runs". The method used, the options for the routine call, and the start point can be selected. The start point can be set numerically or graphically (in the window "Cost function levels"). After termination of an optimization run the solution found is displayed numerically, and the iteration course, i.e. the dependence of the cost function value on the no. of iterations, on the no. of function evaluations, and on the CPU time, is displayed in diagrams.
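Without the graphical user interface, an investigation such as 4.2 can also be run directly from the Command Window. A minimal sketch, under the assumption that f_quad accepts the parameter matrix P as described in B.2 and that the derivative-free routines take cost function parameters after the options argument, as ovfmins does in B.3.3:

```matlab
% Compare two derivative-free routines on the quadratic cost f_quad
% (cf. task 4.2); P and x0 are illustrative values, not prescribed ones.
P = [7 0; 0 7];                        % Hessian matrix of the quadratic form
x0 = [1 2]';                           % illustrative start point
x_nm = ovfmins('f_quad', x0, [], P);   % Nelder-Mead simplex search
x_gs = ovgs('f_quad', x0, [], P);      % Gauss-Seidel coordinate search
```

Comparing the returned solutions and the entries recorded in Table 1 for both runs shows how the two search strategies differ on the same problem.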