A Truncated Newton Method in an Augmented Lagrangian Framework for Nonlinear Programming

Gianni Di Pillo (dipillo@dis.uniroma1.it), Giampaolo Liuzzi (liuzzi@iasi.cnr.it), Stefano Lucidi (lucidi@dis.uniroma1.it), Laura Palagi (palagi@dis.uniroma1.it)

Technical Report DIS 09-07

Abstract

In this paper we propose a primal-dual algorithm for the solution of general nonlinear programming problems. The core of the method is a local algorithm which relies on a truncated procedure for the computation of a search direction, and is thus suitable for large-scale problems. The truncated direction produces a sequence of points which converges locally to a KKT pair with superlinear convergence rate. The local algorithm is globalized by means of a suitable merit function which is able to measure and enforce progress of the iterates towards a KKT pair without deteriorating the local efficiency. In particular, we adopt the exact augmented Lagrangian function introduced in [9], which allows us to guarantee the boundedness of the sequence produced by the algorithm and which has strong connections with the above-mentioned truncated direction. The resulting overall algorithm is globally and superlinearly convergent under mild assumptions.

Keywords: constrained optimization, nonlinear programming algorithms, large-scale optimization, truncated Newton-type algorithms, exact augmented Lagrangian functions.

AMS subject classification: 90C30, 65K05

This work has been supported by the MIUR-PRIN 2005 Research Program on New Problems and Innovative Methods in Nonlinear Optimization.

Università di Roma La Sapienza - Dipartimento di Informatica e Sistemistica Antonio Ruberti - via Ariosto, 25-00185 Roma, Italy.
CNR - Consiglio Nazionale delle Ricerche, IASI - Istituto di Analisi dei Sistemi ed Informatica A. Ruberti, Viale Manzoni 30-00185 Roma, Italy.
1 Introduction

In this paper we are interested in the solution of smooth constrained optimization problems of the type:

    min f(x)
    s.t. g(x) ≤ 0,                                                  (1)

where x ∈ IR^n and f : IR^n → IR, g : IR^n → IR^m are three times continuously differentiable functions. For simplicity we consider only inequality constraints. However, the extension to equality constrained problems requires only technical effort (see, for example, the analysis of [8]).

With reference to Problem (1), we denote by F the feasible set and by L(x, λ) the Lagrangian function

    L(x, λ) = f(x) + λ^T g(x),

where λ ∈ IR^m is the Karush-Kuhn-Tucker (KKT) multiplier. A pair (x̄, λ̄) ∈ IR^{n+m} such that

    ∇_x L(x̄, λ̄) = 0,   λ̄^T g(x̄) = 0,   g(x̄) ≤ 0,   λ̄ ≥ 0,      (2)

is said to be a KKT pair for Problem (1). At a given point x ∈ IR^n, not necessarily feasible, we associate the index sets:

    A_0(x) = {j : g_j(x) = 0},    N_0(x) = {j : g_j(x) < 0}.

Moreover, in correspondence of a KKT pair (x̄, λ̄) we consider also the index set:

    A_+(x̄, λ̄) = {j ∈ A_0(x̄) : λ̄_j > 0}.

The strict complementarity condition holds at (x̄, λ̄) if A_+(x̄, λ̄) = A_0(x̄). The linear independence constraint qualification (LICQ) holds at x if the gradients ∇g_j(x) with j ∈ A_0(x) are linearly independent. Under LICQ, the KKT conditions (2) are necessary optimality conditions for Problem (1).
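For concreteness, the KKT conditions (2) can be monitored numerically through a residual that vanishes exactly at a KKT pair. The following is a minimal sketch, not part of the algorithm developed in this paper; the callables grad_f, g and jac_g and their signatures are illustrative assumptions:

```python
import numpy as np

def kkt_residual(x, lam, grad_f, g, jac_g):
    """Norm of the KKT residual for min f(x) s.t. g(x) <= 0.

    grad_f(x): gradient of f;  g(x): vector of constraint values;
    jac_g(x): m-by-n Jacobian of g.  (Names are illustrative.)
    """
    gx = np.asarray(g(x))
    stat = grad_f(x) + jac_g(x).T @ lam      # stationarity: grad_x L(x, lam)
    comp = lam * gx                          # componentwise complementarity
    feas = np.maximum(gx, 0.0)               # primal feasibility violation
    dual = np.maximum(-lam, 0.0)             # dual feasibility violation
    return np.linalg.norm(np.concatenate([stat, comp, feas, dual]))
```

For instance, for min (x - 1)^2 subject to x ≤ 0, the residual is zero at the KKT pair x̄ = 0, λ̄ = 2.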
At a KKT pair (x̄, λ̄), the strong second order sufficient condition (SSOSC) holds if

    z^T ∇²_x L(x̄, λ̄) z > 0,   for all z ≠ 0 such that ∇g_j(x̄)^T z = 0, j ∈ A_+(x̄, λ̄).

Our aim in this paper is to define a globally and superlinearly convergent algorithm suitable to tackle large dimensional instances of Problem (1), requiring a limited computational cost per iteration. The algorithm considered is a primal-dual method, in the sense that it generates a sequence of pairs (x^k, λ^k) that can be proved, under suitable assumptions, to converge to a KKT pair (x̄, λ̄) of Problem (1). The points x^k may be infeasible, but they belong, for all k, to a predetermined open set S containing the feasible set F. Thus, the proposed method can be viewed as a shifted-barrier infeasible primal-dual method.

The starting point of the paper is the design of a local algorithm, which can be shown to be efficient both in terms of convergence rate and of computational cost per iteration. The local algorithm that we design is a modification of a well-known local algorithm, introduced in [1, 6], where a Newton-type direction d^k is obtained by the exact solution of one linear system per iteration. In [13], this local algorithm has been proved to be quadratically convergent, thus qualifying as an efficient method for the solution of Problem (1), at least locally. In this paper, we show that the linear system can be solved inexactly by making use of a truncated procedure suitable for large-scale problems. The inexact local algorithm can be proved to be superlinearly convergent under LICQ and SSOSC, without requiring the strict complementarity condition.

The above local algorithm is then stabilized so as to guarantee that the iterates stay bounded and that the superlinear convergence rate is retained.
This task can be accomplished by using a merit function [30, 17, 16, 4, 28, 18, 5], which combines the minimization of the objective function with that of the constraint violation, or a filter technique [15, 14, 29], which keeps the minimization of the objective function separate from that of the constraint violation.
Our globalization strategy consists in a line-search procedure using a merit function. In particular, as merit function we use the exact augmented Lagrangian function L_a(x, λ; ε) proposed in [9]. The function L_a(x, λ; ε) depends on the penalty parameter ε > 0 and has some remarkable properties. Indeed, it has compact level sets and it enjoys strong exactness properties provided that ε is smaller than a threshold value ε* > 0, whose existence is proven. We note that, even though the threshold value ε* is not known, it is possible to devise rules which, in a finite number of steps, find a value ε which is guaranteed to be smaller than ε*.

In order to link the local algorithm with the globalization strategy, we show that the truncated direction d^k used in the local algorithm and the merit function L_a can be strongly related. Indeed, we show that, in a neighborhood of a KKT pair and for ε sufficiently small, d^k is a good descent direction for the merit function, so that we can compute the new trial point by performing an Armijo-type line-search along this direction. The relevant fact is that, near a KKT pair satisfying the LICQ and the SSOSC, the line-search always accepts the unit stepsize. Thus, eventually, the iterates are produced by the pure local algorithm, which is why the overall algorithm retains a superlinear convergence rate.

The overall primal-dual augmented Lagrangian algorithm can be seen as an iterative process in which the local algorithm is controlled by means of the merit function L_a with a line-search procedure. The line-search is performed along a direction p^k which is either the truncated direction d^k, whenever d^k turns out to be a good descent direction for L_a, or an alternative gradient-related descent direction z^k for L_a, whenever d^k turns out to be a poor descent direction for L_a.
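An Armijo-type backtracking line-search of the kind mentioned above can be sketched as follows. This is a generic textbook sketch, not the exact procedure of the paper; the merit-function handle and the parameter values (gamma, sigma) are illustrative assumptions:

```python
import numpy as np

def armijo_linesearch(merit, dir_deriv, w, d, alpha0=1.0,
                      gamma=1e-4, sigma=0.5, max_backtracks=30):
    """Armijo backtracking along direction d from point w.

    merit(w): merit-function value at w; dir_deriv: directional
    derivative of the merit function along d at w, assumed < 0
    (d is a descent direction).  Parameters are illustrative.
    """
    phi0 = merit(w)
    alpha = alpha0
    for _ in range(max_backtracks):
        # Armijo sufficient-decrease test
        if merit(w + alpha * np.asarray(d)) <= phi0 + gamma * alpha * dir_deriv:
            return alpha            # sufficient decrease achieved
        alpha *= sigma              # backtrack
    return alpha
```

Note that the unit stepsize alpha0 = 1 is tried first, which is what allows the pure local step to be accepted near a solution.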
Furthermore, during the iteration process, the algorithm also checks the correctness of the penalty parameter ε, which is decreased, on the basis of a technical condition, until a value below the threshold is attained. We remark that, even though the problem functions are required to be three times continuously differentiable for analytical purposes, third-order derivatives are never required in the computations.

We describe and analyze the building blocks of the overall algorithm according to the following organization. In Section 2 we describe the inexact local algorithm for Problem (1). We show that the truncated direction produced by the algorithm is well defined, in the sense that it goes to zero if and only if the sequence {(x^k, λ^k)} approaches a KKT pair. Furthermore, we show that the local algorithm produces a sequence of pairs {(x^k, λ^k)} converging superlinearly to a KKT pair of Problem (1), without requiring the strict complementarity condition. In Section 3 we recall the exact augmented Lagrangian function L_a and we analyze its relationships with the truncated direction d^k. In Section 4 we describe the overall primal-dual Augmented Lagrangian Algorithm model (ALA) and we study its convergence properties. Finally, in Section 5 we present some preliminary numerical results.

We conclude this section by introducing some notation. Given a vector v ∈ IR^p, we denote by the uppercase V the diagonal matrix V = diag_{1≤i≤p}{v_i}. Given two vectors u, v ∈ IR^p, the operation max{u, v} is intended component-wise, namely max{u, v} denotes the vector with components max{u_i, v_i}. Moreover, we denote v_+ = max{v, 0}. Let K ⊆ {1, ..., p} be an index subset; we denote by v_K the subvector of v with components v_i such that i ∈ K. We denote by ‖v‖_q the ℓ_q norm of the vector v and, when q is not specified, we intend q = 2. Given a matrix Q, we denote by ‖Q‖ the induced ℓ_2 norm. For the sake of simplicity, given a function w(x) and a point x^k, we shall occasionally denote by w^k the function evaluated at x^k, that is, w(x^k).

2 The local algorithm

2.1 Introduction

We consider a primal-dual local algorithm described by the iteration:

    x^{k+1} = x^k + d^k_x,    λ^{k+1} = λ^k + d^k_λ,              (3)

where d^k = (d^k_x, d^k_λ) ∈ IR^n × IR^m is the search direction.
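To illustrate the truncated (inexact) solution of a Newton-type linear system of the kind underlying the direction d^k, here is a generic conjugate-gradient sketch with a residual-based stopping rule. It assumes a symmetric positive definite coefficient matrix and is not the specific system or truncation rule used by the algorithm in this paper:

```python
import numpy as np

def truncated_cg(A, b, eta=0.5, max_iter=200):
    """Approximately solve A d = b by conjugate gradients, truncated
    as soon as ||b - A d|| <= eta * ||b|| (inexact-Newton-style rule).

    A is assumed symmetric positive definite; illustrative only.
    """
    d = np.zeros_like(b)
    r = b.copy()                 # residual b - A d (d = 0 initially)
    p = r.copy()
    tol = eta * np.linalg.norm(b)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break                # truncation: accuracy eta reached
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        d = d + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return d
```

With a loose forcing term eta, only a few matrix-vector products per outer iteration are needed, which is the source of the low per-iteration cost on large-scale problems.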