Search direction improvement for gradient-based optimization problems


S. Ganguly & W. L. Neu
Aerospace and Ocean Engineering, Virginia Tech, USA

Abstract

Most gradient-based optimization algorithms calculate the search vector using the gradient or the Hessian of the objective function. This causes the optimization algorithm to perform poorly in cases where the dimensionality of the objective function is less than that of the problem. Though some methods, like the Modified Method of Feasible Directions, tend to overcome this shortcoming, they again perform poorly in situations of competing constraints. This paper introduces a simple modification in the calculation of the search vector that not only provides significant improvements in the solutions of optimization problems but also helps to reduce or, in some cases, overcome the problem of competing constraints.

Keywords: optimization, multidisciplinary design optimization, gradient-based algorithm, method of feasible directions, search direction

1 Introduction

An optimization problem can be defined, in terms of a set of decision variables $X$, as:

  Minimize the objective function: $F(X)$,
  Subject to: $g_j(X) \le 0$, $j = 1, \dots, m$   (inequality constraints),          (1)
  $h_k(X) = 0$, $k = 1, \dots, l$   (equality constraints),
  $X_i^l \le X_i \le X_i^u$, $i = 1, \dots, n$   (side constraints).

A number of numerical algorithms have been devised to solve this problem. Gradient-based algorithms are based on the following recursive equation:

  $X^q = X^{q-1} + \alpha S^q$          (2)

where $X^{q-1}$ and $X^q$ are the vectors of the decision variables in the $(q-1)$th and $q$th iterations respectively.
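To make the notation of eqn (1) concrete, the standard form can be written down directly as code. The following minimal Python sketch uses an invented two-variable problem; the objective, constraints and bounds are illustrative placeholders, not functions from this paper:

```python
import numpy as np

# Hypothetical instance of the standard form in eqn (1):
# two decision variables, one inequality, one equality, plus side constraints.
def F(X):                                 # objective F(X)
    return (X[0] - 1.0) ** 2 + X[1] ** 2

def g(X):                                 # inequality constraints g_j(X) <= 0
    return np.array([X[0] + X[1] - 2.0])

def h(X):                                 # equality constraints h_k(X) = 0
    return np.array([X[0] - X[1]])

bounds = [(-5.0, 5.0), (-5.0, 5.0)]       # side constraints X_i^l <= X_i <= X_i^u
```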

Considering the optimization problem in an $n$-dimensional space with each dimension corresponding to a decision variable, $S$ represents the search vector; it determines the direction of change in the decision-variable space. Each point in the decision space represents a vector of decision variables. $\alpha$ is a scalar which represents the desired amount of change in the decision variables along the $S$ vector. Thus the optimization problem can be split into finding the search vector and calculating the parameter $\alpha$. This paper focuses on finding an improved search vector.

2 Finding the search vector

For unconstrained problems the search vector in the $q$th iteration, $S^q$, is usually calculated by a standard method, such as Steepest Descent ($S^q = -\nabla F(X^{q-1})$), Conjugate Direction ($S^q = -\nabla F(X^{q-1}) + \beta S^{q-1}$, where $\beta = \|\nabla F(X^{q-1})\|^2 / \|\nabla F(X^{q-2})\|^2$) or Newton's method ($S^q = -[H(X^{q-1})]^{-1} \nabla F(X^{q-1})$). In many situations, particularly in cases of competing constraints as discussed below, the calculation of an appropriate search direction can be a challenge for an optimization algorithm.

Most engineering design problems are nonlinear constrained problems. Gradient-based optimization algorithms generally use a combination of unconstrained and constrained search methods. A method for unconstrained optimization is used to calculate the search direction to arrive at a point in the solution space where one or more constraints are active or violated. For a constraint to be active, its function value must lie within a specified range. Once a constraint is active or violated, the search direction is obtained by one of the methods used for constrained optimization. The constrained search-direction method gives the algorithm the ability to move along a constraint boundary to arrive at a better solution point.

One of the simpler direction-finding methods is Zoutendijk's [1] Method of Feasible Directions. Here, a sub-optimization problem, referred to as the Direction Finding Sub-problem (DFS), finds a usable and feasible search direction when a constraint is active. A usable search direction is a direction which improves the objective function value, whereas a feasible search direction maintains feasibility or reduces constraint violation. For all points in the solution space that are within the active region, search directions are calculated using the DFS. When a constraint is active, the DFS is mathematically represented as:

  Minimize: $\nabla F(X^{q-1})^T S^q$   (usability condition),
  Subject to: $\nabla g_j(X^{q-1})^T S^q \le 0$   (feasibility condition),          (3)
  $(S^q)^T S^q \le 1$   (bounds on $S$).
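For reference, the three unconstrained search directions named at the start of this section are one-liners in numpy; the DFS of eqn (3) is treated with the sub-problem sketched in the next part of this section. This is a minimal sketch, assuming the caller supplies gradients and the Hessian, and the Fletcher-Reeves form of $\beta$:

```python
import numpy as np

def steepest_descent(grad_q1):
    """S^q = -grad F(X^{q-1})."""
    return -grad_q1

def conjugate_direction(grad_q1, grad_q2, S_prev):
    """S^q = -grad F(X^{q-1}) + beta * S^{q-1}, with the Fletcher-Reeves ratio
    beta = ||grad F(X^{q-1})||^2 / ||grad F(X^{q-2})||^2."""
    beta = np.dot(grad_q1, grad_q1) / np.dot(grad_q2, grad_q2)
    return -grad_q1 + beta * S_prev

def newton_direction(grad_q1, hess_q1):
    """S^q = -[H(X^{q-1})]^{-1} grad F(X^{q-1})."""
    return -np.linalg.solve(hess_q1, grad_q1)
```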

Fig. 1 demonstrates the DFS in the active region for a two-dimensional case.

Figure 1: DFS in the active region at point $X^0$ in a 2-D solution space, showing the usable and feasible sectors bounded by $\nabla F(X^0)$ and $\nabla g(X^0)$.

When a constraint is violated, the DFS finds a search direction that rapidly decreases the constraint violation while also trying to reduce the objective function. The DFS is given by:

  Minimize: $\nabla F(X^{q-1})^T S^q - \Phi W$          (4)
  Subject to: $\nabla g_j(X^{q-1})^T S^q + \theta_j W \le 0$ for active and violated constraints,          (5)
  $(S^q)^T S^q + W^2 \le 1$.          (6)

Here $\Phi$ is a large positive number which makes the second term in eqn (4) dominant. $W$ is an artificial variable that is used only to formulate the DFS and has no physical significance. In order to minimize eqn (4), $W$ tends to increase. If $\theta_j$ is positive, as $W$ increases, the first term in eqn (5) is driven more negative, i.e., $S^q$ is pushed more in the direction opposite that of $\nabla g_j$. This, in effect, drives the search direction to a direction that reduces constraint violation. Further details can be found in Vanderplaats [2].

It can be shown that the algorithmic map of the feasible directions method is not closed. Wolfe [3] showed through a counterexample that Zoutendijk's algorithm, given by eqn (3), does not converge to a Karush-Kuhn-Tucker point. Topkis and Veinott's [4] modification to the feasible direction algorithm guarantees convergence to a Fritz-John point. Their DFS is represented as:

  Minimize: $z$
  Subject to: $\nabla F(X^{q-1})^T S^q - z \le 0$,
  $\nabla g_j(X^{q-1})^T S^q - z \le -g_j(X^{q-1})$,          (7)
  $(S^q)^T S^q \le 1$,

where $z = \max\{\nabla F(X^{q-1})^T S^q,\ \nabla g_j(X^{q-1})^T S^q\}$. Here, both active and non-active constraints are involved in the direction-finding algorithm. Although this method derives a direction that incorporates all the dimensions of the problem, it fails when the decision space contains regions of competing constraints, as defined in the next section.
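The DFS of eqns (4) to (6) is itself a small nonlinear program in the unknowns $(S^q, W)$ and can be prototyped with an off-the-shelf solver. The sketch below uses scipy's SLSQP; the value of $\Phi$ and the push-off factors $\theta_j$ are caller-supplied assumptions, and this illustrates the formulation rather than the algorithm actually used inside DOT:

```python
import numpy as np
from scipy.optimize import minimize

def direction_finding_subproblem(grad_F, grad_gs, thetas, phi=1000.0):
    # Sketch of the DFS of eqns (4)-(6); the unknowns are z = (S, W).
    # grad_gs: gradients of the active and violated constraints.
    n = grad_F.size

    def objective(z):                        # eqn (4): grad F . S - Phi * W
        return grad_F @ z[:n] - phi * z[n]

    cons = [{"type": "ineq",                 # eqn (6): S.S + W^2 <= 1
             "fun": lambda z: 1.0 - z[:n] @ z[:n] - z[n] ** 2}]
    for gg, th in zip(grad_gs, thetas):      # eqn (5): grad g_j . S + theta_j W <= 0
        cons.append({"type": "ineq",
                     "fun": lambda z, gg=gg, th=th: -(gg @ z[:n] + th * z[n])})

    bounds = [(None, None)] * n + [(0.0, None)]   # W is a nonnegative artificial variable
    res = minimize(objective, np.zeros(n + 1), method="SLSQP",
                   bounds=bounds, constraints=cons)
    return res.x[:n], res.x[n]               # search direction S^q, and W
```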

3 Problems with gradient-based algorithms

In some optimization problems, the dimensionality of the objective function is less than that of the problem; in other words, some of the decision variables do not appear in the objective function. These decision variables increase the dimensionality of the problem through the constraints. In such cases, the search vector has a zero component for the decision variables that are not explicitly present in the objective function until a constraint that explicitly contains these decision variables becomes active. Simply stated, the search direction cannot see the dimensions added to the problem by decision variables not explicitly present in the objective function until a constraint that is a function of these decision variables becomes active. This often forces the algorithm to terminate before reaching even a local minimum.

A more important problem, which inevitably results in the premature termination of the optimization process, is the problem of competing constraints. Fig. 2 demonstrates a situation of competing constraints in a two-dimensional case. A region of competing constraints exists if the solution space has an equality and an inequality constraint placed in such a way that the initial set of design variables satisfies the inequality constraint but violates the equality constraint. If the equality constraint is represented as two inequality constraints, or if the decision variables are split into dependent and independent variables (reduced gradient method, see [2]), the optimizer will attempt to satisfy the equality constraint, resulting in a solution point that violates the inequality constraint. The optimizer's attempt at reducing the violation of the inequality constraint will then tend to violate the equality. This leads to oscillations about the two constraints, as illustrated by the trajectory of solid arrows in the figure, until the optimizer finds no feasible solution and terminates. We say that the optimization process is trapped within the region of competing constraints: the solution does not follow a constraint to find a feasible solution. If the objective function does not contain the decision variables which are present in the competing constraints (the dimensionality issue above), the optimizer terminates even earlier. The objective of this work is to modify the search vector such that the solution follows a trajectory similar to that described by the dashed arrows.
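The dimensionality issue raised at the start of this section is easy to verify numerically: if the objective depends only on $x_1$, every gradient-based direction has a zero $x_2$ component, so the iterates cannot move in $x_2$ until a constraint containing $x_2$ becomes active. A toy check, with functions invented for illustration:

```python
import numpy as np

def F(X):                       # objective sees only x1
    return X[0]

def grad_F(X):                  # analytic gradient: (1, 0)
    return np.array([1.0, 0.0])

X = np.array([3.0, 35.0])       # arbitrary starting point
S = -grad_F(X)                  # steepest-descent direction
print(S)                        # the x2 component is zero: the search
                                # cannot "see" the x2 dimension yet
```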

Figure 2: Example of competing constraints for the problem: minimize $x_1$ subject to the inequality constraint $g(x_1, x_2) \le 0$ and the equality constraint $h(x_1, x_2) = 0$; the region $g < 0$ and the starting point are marked. Solid arrows show a typical Method of Feasible Directions solution path; dashed arrows show the desired path.

4 Modification of the search direction

The above-mentioned problems can be reduced or overcome by involving the constraints in the calculation of the search vector even when there are no active or violated constraints. Experience shows that the DFS is the most computationally intensive part of the optimization process. If a constraint becomes active, the solution follows the constraint boundary until it reaches a local minimum; this means that after a constraint becomes active, the DFS is solved every iteration, a behaviour that makes the algorithm computationally expensive. It is desirable to find a simpler algorithm that causes the search vector to turn along the contours of the constraints, in the direction that improves the value of the objective function, well before the constraints become active or violated.

Consider the two-dimensional optimization problem of fig. 2. If the equality constraint is represented by two inequality constraints, then the search vector is calculated from the DFS given by eqns (4) to (6). Notice that, for the example given above, the first term in eqn (4) has a zero $x_2$ component. Thus the search vector will try to reduce the constraint violation only by moving in a direction opposite that of the gradient of the violated constraint, with no heed to the direction of increasing $x_2$. The solid arrows in the diagram show this behavior. Once in the region of competing constraints, the optimizer finds no feasible solution and terminates. We can also see that the smaller the starting value of $x_2$, the more difficult it becomes for the optimizer to get through the region of competing constraints; this algorithm can only be successful if started at a large enough value of $x_2$.

Now, if we were to modify the search direction so that the optimizer followed a path similar to that of the dashed arrows, we could reasonably reduce the chances of being trapped in the region of competing constraints, even for low starting values of $x_2$.

We can also imagine that even when there is no region of competing constraints, by taking the path of the dashed arrows the optimizer can actually improve the rate at which it reaches the optimum. Clearly, for the optimizer to follow this path, it needs a search direction that has a component along the constraint contours as it reduces the objective function. This turning of the direction vector along the constraint contour needs to be activated from a position well before the constraint is active. In other words, the solution should move up or down a constraint, depending on which direction improves the objective, from a point well before the constraint is active. Fig. 3 shows the basis of the proposed modification.

Figure 3: Modification of the search direction: the original direction $S$ is turned toward the contour of the constraint $g(x_1, x_2) = 0$ by a turning vector $T$, giving the modified direction $S'$.

To turn the search vector toward a constraint contour at a point where the constraint is not active, a vector is added to the search vector that is normal to the search vector but has a magnitude proportional to the projection of the gradient of the constraint onto the search vector. For the problem stated earlier, where the objective function is not a function of $x_2$, the search direction is now aware of the full dimensionality of the problem and can move the solution along the dashed-arrow path of fig. 2. The turning vector is constructed as follows (refer to fig. 4 for a two-dimensional visualization). The unit vector in the direction of $S$ is $\hat{s} = S / \|S\|$. The projection of the gradient of the constraint onto the search vector is $R_1 = \nabla g \cdot \hat{s}$. The projection of the gradient of the constraint onto the hyperplane normal to the search vector is $R_2 = \nabla g - R_1 \hat{s}$. The unit vector in the direction of $R_2$ is then $\hat{r}_2 = R_2 / \|R_2\|$. Note that as the search direction approaches a direction normal to the gradient of the constraint, less correction to the search direction is needed, and vice versa. Thus the correction vector should be proportional to the magnitude $R_1$ but in the direction of $R_2$. Hence we define the turning vector as $T = R_1 \hat{r}_2$.
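The construction above is only a handful of vector operations. A direct numpy transcription follows, assuming $\nabla g$ is available at the current point; the degenerate case where $S$ is parallel to $\nabla g$ is handled by returning a zero vector, a choice the paper does not specify:

```python
import numpy as np

def turning_vector(S, grad_g):
    """Turning vector T = R1 * r2_hat (see fig. 4).
    S: current search direction; grad_g: gradient of the constraint."""
    s_hat = S / np.linalg.norm(S)      # unit vector along S
    R1 = grad_g @ s_hat                # scalar projection of grad g onto S
    R2 = grad_g - R1 * s_hat           # component of grad g normal to S
    norm_R2 = np.linalg.norm(R2)
    if norm_R2 < 1e-12:                # S parallel to grad g: no turn defined
        return np.zeros_like(S)
    r2_hat = R2 / norm_R2              # unit vector along R2
    return R1 * r2_hat                 # magnitude R1, direction r2_hat
```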

Figure 4: Construction of the turning vector $T$ and the modified search direction $S'$ from $S$, $\nabla g$, $R_1$ and $R_2$.

The new search direction is then

  $S' = S + \lambda T$          (8)

where $\lambda$ is a scalar multiplier that adjusts the magnitude of the correction depending on the position of the solution point. Typically, we would like $\lambda$ to have smaller values for points far away from the active region and larger values for points closer to the active region. It is proposed that:

  $\lambda = 0$ for $g(x_1, x_2) \le C_1 C_T$,
  $\lambda = 1 - g(x_1, x_2)/(C_1 C_T)$ for $C_1 C_T < g(x_1, x_2) \le C_T$,          (9)
  $\lambda = C_2 \lvert g(x_1, x_2) \rvert$ for $g(x_1, x_2) > C_T$,

where $C_T$ is a small negative number called the constraint tolerance. When a constraint takes a value greater than $C_T$, it is considered active. $C_1$ and $C_2$ are tuning constants; in the examples that follow, setting each to 50 was found to work well.

The formulation can be generalized to the case of multiple constraints, $g_i(X)$, each producing a turning vector, $T_i$, and an associated multiplier, $\lambda_i$. The modified search direction is then:

  $S' = S + \sum_i \lambda_i T_i$          (10)

5 Implementation of the search direction modification

The modification proposed above was implemented in the commercial optimizer DOT (Design Optimization Tools), version 4.20, produced and marketed by Vanderplaats Research and Development, Inc. This implementation was solely for academic purposes. Each of the examples below used DOT's implementation of the Modified Method of Feasible Directions.
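Before turning to the examples, eqns (8) to (10) can be sketched as code. The piecewise schedule below follows the reconstruction of eqn (9) given above, with a representative value assumed for the constraint tolerance $C_T$; DOT's internal implementation is not public, so this is only an illustrative prototype:

```python
import numpy as np

def lambda_multiplier(g_val, CT=-0.03, C1=50.0, C2=50.0):
    # Schedule of eqn (9). CT is the (small, negative) constraint tolerance,
    # assumed here to be -0.03; C1 and C2 are tuning constants (50 worked
    # well in the paper's examples).
    if g_val <= C1 * CT:                   # far from the active region
        return 0.0
    if g_val <= CT:                        # approaching: grows from 0 toward 1
        return 1.0 - g_val / (C1 * CT)
    return C2 * abs(g_val)                 # active region, g > CT

def modified_direction(S, g_vals, grad_gs):
    # Eqn (10): S' = S + sum_i lambda_i T_i, one turning vector per constraint.
    s_hat = S / np.linalg.norm(S)
    S_new = S.astype(float).copy()
    for g_val, grad_g in zip(g_vals, grad_gs):
        R1 = grad_g @ s_hat                # projection of grad g_i onto S
        R2 = grad_g - R1 * s_hat           # component normal to S
        norm_R2 = np.linalg.norm(R2)
        if norm_R2 > 1e-12:
            S_new += lambda_multiplier(g_val) * (R1 / norm_R2) * R2
    return S_new
```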

5.1 A simple two-dimensional example

Consider the problem in two design variables, $x$ and $y$, stated as:

  Minimize: $x$
  Subject to: $y - 2.8x^2 - 10 = 0$   (equality constraint),
  $y - 4.5x - 14 \le 0$   (inequality constraint),
  $0 \le x \le 5$, $0 \le y \le 50$   (side constraints).

Fig. 5 shows the progression of the solution using the original Modified Method of Feasible Directions algorithm, starting from the point (3, 35). The solution becomes trapped in the region of competing constraints and terminates before finding an optimum.

Figure 5: Performance of the original DOT implementation of the Modified Method of Feasible Directions on the example problem. Feasible and infeasible labels pertain to the inequality constraint.

Fig. 6 demonstrates how the performance of DOT changes when the search direction modification is incorporated into the algorithm. The solution quickly converges to the optimum, (0, 10), within the specified tolerance.

Figure 6: Same as fig. 5 but with modification of the search direction.
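The example is easy to check numerically. The snippet below encodes the constraints as written above and confirms that the reported optimum (0, 10) is feasible, while the starting point (3, 35) sits essentially on the equality but violates the inequality; the coefficients are taken from the reconstructed problem statement, so treat them as approximate:

```python
import numpy as np

def h(v):                       # equality constraint: y - 2.8 x^2 - 10 = 0
    x, y = v
    return y - 2.8 * x**2 - 10.0

def g(v):                       # inequality constraint: y - 4.5 x - 14 <= 0
    x, y = v
    return y - 4.5 * x - 14.0

start = np.array([3.0, 35.0])
opt = np.array([0.0, 10.0])     # optimum reported in the paper

print(h(start), g(start))       # approx -0.2 and 7.5: near the equality,
                                # inequality violated (competing region)
print(h(opt), g(opt))           # 0.0 and -4.0: both constraints satisfied
```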

5.2 Multidisciplinary design optimization of a containership

The search direction modification was implemented in the multidisciplinary optimization of a containership design. The details of the containership synthesis model are extensive and are summarized in Neu et al. [5]. The objective function for the optimization is the required freight rate, a measure of how much it costs the owner to operate the ship per freight unit. The eight design variables are ship geometry and performance characteristics. Among the design variables is the amount of ballast that the ship carries. The ballast does not enter directly into the objective function calculation but is important for the stability of the ship. One of the ten constraints is that the ship must have a certain minimum stability. Adding ballast increases the stability, while stacking containers on the deck of the ship reduces it. By increasing the amount of ballast, the ship can increase the amount of freight it carries while still meeting the stability constraint, thereby reducing the required freight rate.

Using the unmodified search direction algorithm, the optimizer terminated with a relatively small amount of ballast, yielding a design with a significantly larger than optimal required freight rate. The optimizer became trapped in the competing constraints of minimum stability and the requirement that the weight of the ship equal the weight of the water it displaces. After the search direction modification was applied to the stability constraint, the optimizer was able to follow the stability constraint to a design with a significantly larger amount of ballast and a smaller required freight rate. A comparison of the iteration histories of the objective function values for the two cases is shown in fig. 7.

Figure 7: Iteration history of the containership optimization with and without the search direction modification.

6 Conclusion

This paper has addressed a class of nonlinear, constrained optimization problems, with regions of competing constraints, that is particularly difficult to solve using conventional gradient-based algorithms.

In conventional gradient-based algorithms, constraints are involved in calculating the search direction only if they are active. This may cause optimization problems with regions of competing constraints to terminate prematurely. The problem is compounded if some decision variables are not explicitly present in the objective function. The practice of starting from various points within the decision space may help avoid regions of competing constraints, but as the nonlinearity and the dimensionality of a problem grow, the chances of finding a starting point that avoids these regions may be small. Involving the constraints in calculating the search direction at every iteration of the optimization process is a more effective way of solving this class of problems; however, solving a direction-finding sub-optimization problem at each step can be expensive.

A modification of the search direction is proposed such that, with little computational effort, the solution path is partly driven by the constraints before any constraint is active or violated. This not only avoids regions of competing constraints, which can lead to premature termination, but also captures the full dimensionality of the problem in calculating the search direction. The modification was implemented in a standard commercial gradient-based optimization program and applied to a simple two-dimensional problem to illustrate its mechanics. The same modification was then found to produce superior results for a multidimensional, multidisciplinary design optimization problem. It should be noted that in the MDO problem, the presence of competing constraints was identified a posteriori, and the modification was applied only to the constraints identified as competing. It remains a topic for further research to automate the identification of competing constraints and to use just those constraints to modify the search direction.

References

[1] Zoutendijk, G., Methods of Feasible Directions, Elsevier: Amsterdam, Netherlands, 1960.
[2] Vanderplaats, G.N., Numerical Optimization Techniques for Engineering Design, 3rd ed., Vanderplaats Research & Development, Inc.: Colorado Springs, 2001.
[3] Wolfe, P., On the convergence of gradient methods under constraint. IBM Journal of Research and Development, 16, pp. 407-411, 1972.
[4] Topkis, D.M. & Veinott, A.F., On the convergence of some feasible direction algorithms for nonlinear programming. SIAM Journal on Control, 5, pp. 268-279, 1967.
[5] Neu, W.L., Mason, W.H., Ni, S., Lin, Z., Dasgupta, A. & Chen, Y., A multidisciplinary design optimization scheme for containerships. 8th AIAA Symposium on Multidisciplinary Analysis and Optimization, Long Beach, CA, 6-8 September 2000.