International Bank for Reconstruction and Development

Reduced Jacobians and the Solution of Large Nonlinear Simultaneous Equations

by

Johannes Bisschop and Alexander Meeraus

Technical Note No. 4
Research Project 670-24
June 1976

Development Research Center
World Bank
1818 H Street, N.W.
Washington, D.C. 20433


Reduced Jacobians and the Solution of Large Nonlinear Simultaneous Equations

by Johannes Bisschop and Alexander Meeraus

ABSTRACT: The Hellerman-Rarick preassigned pivot procedure and a partitioning algorithm are used to construct reduced Jacobian matrices for the solution of large sparse nonlinear systems of equations.

1. Introduction

The study of large sparse linear systems of equations has resulted in computational procedures which can handle problems involving several thousands of equations. These procedures are based on structure-recognition methods which perform row and column interchanges such that the permuted matrix is block lower triangular. A method frequently employed in commercial linear programming codes is the Hellerman-Rarick preassigned pivot procedure (from here on referred to as the HR procedure) [3]. This method carries out the triangularization of the matrix not only between diagonal blocks (or bumps), but also within these blocks. Each bump is almost strictly lower triangular except for some columns which still contain nonzero elements above the diagonal. These columns are referred to as spikes. Even though this algorithm is heuristic in nature, it tends to give very good results in terms of speed and in minimizing the number of spikes. We will study the specific structure identified by the HR procedure, and show how it can be used in finding solutions to large systems of nonlinear equations. In the event the methodology is applied to a linear system, the algorithm degenerates to the solution procedure especially designed for linear systems as described in [1]. Several papers have been written on the solution of large nonlinear systems (see e.g. [4], [6], and [7]). Our work is somewhat related to that of Kevorkian and Snoek [4] and Steward [7], but our implementation of their basic ideas is quite different. The emphasis of the paper is mostly on structure identification and its importance for solving large sparse systems of nonlinear equations.

2. The Basic Problem

Consider the system of nonlinear equations $f(x) = 0$, where $f$ is an $n \times 1$ vector-valued function and $x$ is an $n \times 1$ vector of unknowns. We assume that $f$ has continuous partial derivatives. The incidence matrix $A$ associated with the function $f$ has elements

$$a_{ij} = \begin{cases} 1 & \text{if variable } x_j \text{ occurs in equation } i \\ 0 & \text{otherwise.} \end{cases}$$

It is assumed that $n$ is large and that $A$ is sparse. We will use the HR procedure to identify the bumps and the spikes in the matrix $A$. For a complete discussion of the HR procedure the reader is referred to the original paper by Hellerman and Rarick [3]. The diagonal blocks of $A$ determine the sets of equations and variables in $f(x) = 0$ that need to be solved simultaneously. It is this observation that allows us to focus the major part of this paper on single bumps.

Consider a bump $B$ in the matrix $A$ of size $(m+q) \times (m+q)$ containing $q$ spikes. A reorganization of the rows and columns of $B$ provides the partitioned matrix

$$B = \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix}$$

where $B_1$ consists of all rows and columns of $B$ minus those corresponding to the spikes. The remaining $q$ rows and columns, in their original order, make up the matrices $B_2$, $B_3$ and $B_4$.
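To make the partitioning concrete, the following sketch (ours, not part of the original note) reorders a small dense bump whose spike columns are assumed to have already been identified, e.g. by the HR procedure of [3], which is not reproduced here. The function name and the example matrix are illustrative only.

```python
import numpy as np

def partition_bump(B, spikes):
    """Symmetrically permute an (m+q) x (m+q) bump incidence matrix so that
    the q spike rows/columns come last, yielding the block form
        [[B1, B2],
         [B3, B4]]
    in which B1 (the non-spike part) is lower triangular.  The spike
    indices are assumed to have been found beforehand."""
    n = B.shape[0]
    non_spikes = [j for j in range(n) if j not in spikes]
    order = non_spikes + list(spikes)   # spikes moved to the end, original order kept
    P = B[np.ix_(order, order)]         # apply the row and column permutation
    m = len(non_spikes)
    B1, B2 = P[:m, :m], P[:m, m:]
    B3, B4 = P[m:, :m], P[m:, m:]
    assert np.array_equal(B1, np.tril(B1)), "B1 should be lower triangular"
    return B1, B2, B3, B4

# Illustrative 4x4 bump with a single spike in column 2 (0-based):
# only that column has nonzero entries above the diagonal.
B = np.array([[1, 0, 1, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 0],
              [1, 0, 1, 1]])
B1, B2, B3, B4 = partition_bump(B, spikes=[2])
```

In an actual implementation the permutation would of course be applied to a sparse data structure rather than to a dense array, but the resulting blocks $B_1, \ldots, B_4$ are the same.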
By construction, the matrix $B_1$ is lower triangular with ones on the diagonal. Let the relevant equations and variables in $f(x) = 0$ be ordered as in the partitioned matrix $B$. Then one may consider the following system of equations:

$$f_1(x_1, x_2) = 0$$
$$f_2(x_1, x_2) = 0$$

Here $f_1$ is $m \times 1$, $f_2$ is $q \times 1$, $x_1$ is $m \times 1$, and $x_2$ is $q \times 1$. Assume any initial guess for the vector $(x_1, x_2)$, say $(\bar{x}_1, \bar{x}_2)$, and solve the triangular system $f_1(x_1, \bar{x}_2) = 0$ by sequentially solving one-dimensional nonlinear equations, using the elements of $\bar{x}_1$ as starting values. These equations usually require a few Newton steps, and may sometimes be solved in closed form. Note that we implicitly assume the initial guess $\bar{x}_2$ to be such that the system $f_1(x_1, \bar{x}_2) = 0$ does have a solution. While computing the solution $\bar{x}_1$, one may also compute the matrices $\partial f_1 / \partial x_1$ and $\partial f_1 / \partial x_2$, of sizes $m \times m$ and $m \times q$ respectively, where all derivatives are evaluated at the current solution $(\bar{x}_1, \bar{x}_2)$. We also assume a nonvanishing determinant for the lower triangular matrix $\partial f_1 / \partial x_1$. Then, using the Implicit Function Theorem, there exists a function $x_1(x_2)$ in a neighborhood of $(x_1, x_2) = (\bar{x}_1, \bar{x}_2)$ such that $F_1(x_2) \equiv f_1(x_1(x_2), x_2) = 0$ in this neighborhood. By definition,

$$\frac{\partial F_1(x_2)}{\partial x_2} = \frac{\partial f_1}{\partial x_1} \frac{\partial x_1}{\partial x_2} + \frac{\partial f_1}{\partial x_2} = 0 .$$

Since $\partial f_1 / \partial x_1$ is lower triangular and assumed nonsingular, one may solve explicitly for the $m \times q$ matrix $\partial x_1 / \partial x_2$.

If $(\bar{x}_1, \bar{x}_2)$ is not yet a solution of the system $f_2(x_1, x_2) = 0$, then the estimate of $x_2$ can be updated using a Newton step on the equation $F_2(x_2) \equiv f_2(x_1(x_2), x_2) = 0$. Here

$$\frac{\partial F_2}{\partial x_2} = \frac{\partial f_2}{\partial x_1} \frac{\partial x_1}{\partial x_2} + \frac{\partial f_2}{\partial x_2} ,$$

evaluated at $(x_1(\bar{x}_2), \bar{x}_2)$. In this manner the only matrix to be explicitly inverted is of size $q \times q$ rather than $(m+q) \times (m+q)$. The above procedure corresponds to inverting the augmented Jacobian matrix $J$:

$$J = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} \\[2ex] \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} \end{bmatrix} = \begin{bmatrix} J_1 & J_2 \\ J_3 & J_4 \end{bmatrix}, \qquad b = \begin{bmatrix} 0 \\ -f_2(\bar{x}_1, \bar{x}_2) \end{bmatrix} .$$

The solution $\hat{x} = J^{-1} b$, where $\hat{x} = (\hat{x}_1, \hat{x}_2)^T$, can be written as

$$\hat{x}_2 = -(J_4 - J_3 J_1^{-1} J_2)^{-1} f_2(\bar{x}_1, \bar{x}_2) ,$$
$$\hat{x}_1 = -J_1^{-1} J_2 \hat{x}_2 ,$$

where the lower triangular matrix $J_1^{-1}$ need not be calculated explicitly. The estimate $\bar{x}_2$ is updated via the equation $\bar{x}_2 = \bar{x}_2 + \hat{x}_2$, and the triangular system $f_1(x_1, \bar{x}_2) = 0$ is solved again using the elements of $\bar{x}_1 + \hat{x}_1$ as starting values.

In the special event that $J$ is a constant matrix for all values of $x$, i.e. the systems $f_1$ and $f_2$ are linear systems of equations, the updated pair $(x_1, x_2) = (\bar{x}_1 + \hat{x}_1, \bar{x}_2 + \hat{x}_2)$ is the exact solution, provided the point-derivatives determining the matrix $J$ are exact. This may be verified as follows. Let the systems involving $f_1$ and $f_2$ be represented by the equations

$$J_1 x_1 + J_2 x_2 = b_1 ,$$
$$J_3 x_1 + J_4 x_2 = b_2 .$$

The initial estimate for $x_2$ is $\bar{x}_2$, and $\bar{x}_1$ is computed via the equation $\bar{x}_1 = J_1^{-1}(b_1 - J_2 \bar{x}_2)$. Let $\bar{b}_2 = J_3 \bar{x}_1 + J_4 \bar{x}_2 - b_2$. If $(\bar{x}_1, \bar{x}_2)$ is not a solution to both sets of equations, $\bar{b}_2$ is not the null vector. The Newton step involves the computation of

$$\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix} = - \begin{bmatrix} J_1 & J_2 \\ J_3 & J_4 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ \bar{b}_2 \end{bmatrix} ,$$

where

$$\hat{x}_2 = -(J_4 - J_3 J_1^{-1} J_2)^{-1} \bar{b}_2 , \qquad \hat{x}_1 = -J_1^{-1} J_2 \hat{x}_2 .$$

Then the pair $(x_1, x_2) = (\bar{x}_1 + \hat{x}_1, \bar{x}_2 + \hat{x}_2)$ satisfies the initial linear equations:

$$\begin{aligned}
J_1(\bar{x}_1 + \hat{x}_1) + J_2(\bar{x}_2 + \hat{x}_2)
&= J_1 \bar{x}_1 + J_2 \bar{x}_2 + J_1 \hat{x}_1 + J_2 \hat{x}_2 \\
&= (b_1 - J_2 \bar{x}_2) + J_2 \bar{x}_2 - J_2 \hat{x}_2 + J_2 \hat{x}_2 \\
&= b_1 ,
\end{aligned}$$

$$\begin{aligned}
J_3(\bar{x}_1 + \hat{x}_1) + J_4(\bar{x}_2 + \hat{x}_2)
&= J_3 \bar{x}_1 + J_4 \bar{x}_2 + J_3 \hat{x}_1 + J_4 \hat{x}_2 \\
&= J_3 J_1^{-1} b_1 + (J_4 - J_3 J_1^{-1} J_2)\bar{x}_2 + (J_4 - J_3 J_1^{-1} J_2)\hat{x}_2 \\
&= J_3 J_1^{-1} b_1 + (J_4 - J_3 J_1^{-1} J_2)\bar{x}_2 - \bar{b}_2 \\
&= J_3 J_1^{-1} b_1 + (J_4 - J_3 J_1^{-1} J_2)\bar{x}_2 + b_2 - J_3 J_1^{-1}(b_1 - J_2 \bar{x}_2) - J_4 \bar{x}_2 \\
&= b_2 .
\end{aligned}$$

The Newton method together with reduced Jacobians can thus be used to solve the nonlinear system $f(x) = 0$.
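The iteration can be summarized in a short numerical sketch. The following Python fragment is ours and is only illustrative: the toy three-equation system, the function names, and the use of SciPy's triangular solver are assumptions, not part of the original note; the HR analysis is taken as given in the form of the $f_1$/$f_2$ split.

```python
import numpy as np
from scipy.linalg import solve_triangular

# Illustrative split system (ours): x1 = (u, v), x2 = (w,).
# f1 is triangular in x1 for fixed x2; f2 is the single spike equation.
# Exact solution: u = v = w = 1.
def f1(x1, x2):
    u, v = x1; (w,) = x2
    return np.array([u - w**2,             # equation 1: involves u only
                     v**3 + u - 2.0*w])    # equation 2: involves u and v

def f2(x1, x2):
    u, v = x1; (w,) = x2
    return np.array([u + v + w - 3.0])

def jacobians(x1, x2):
    """J1 = df1/dx1 (lower triangular), J2 = df1/dx2, J3 = df2/dx1, J4 = df2/dx2."""
    u, v = x1; (w,) = x2
    J1 = np.array([[1.0, 0.0], [1.0, 3.0*v**2]])
    J2 = np.array([[-2.0*w], [-2.0]])
    J3 = np.array([[1.0, 1.0]])
    J4 = np.array([[1.0]])
    return J1, J2, J3, J4

def solve_f1_sequentially(x1, x2, steps=15):
    """Solve the triangular system f1(x1, x2) = 0 one scalar Newton
    iteration at a time, using the current x1 as starting values."""
    x1 = x1.astype(float)
    for i in range(len(x1)):
        for _ in range(steps):
            r = f1(x1, x2)[i]
            d = jacobians(x1, x2)[0][i, i]   # diagonal element of df1/dx1
            x1[i] -= r / d
    return x1

def reduced_jacobian_newton(x1, x2, iters=25, tol=1e-12):
    for _ in range(iters):
        x1 = solve_f1_sequentially(x1, x2)                # enforce f1(x1, x2) = 0
        r2 = f2(x1, x2)
        if np.max(np.abs(r2)) < tol:
            break
        J1, J2, J3, J4 = jacobians(x1, x2)
        dx1_dx2 = -solve_triangular(J1, J2, lower=True)   # dx1/dx2 = -J1^{-1} J2
        reduced = J4 + J3 @ dx1_dx2                       # q x q matrix J4 - J3 J1^{-1} J2
        x2_hat = np.linalg.solve(reduced, -r2)            # Newton step on F2(x2) = 0
        x1 = x1 + dx1_dx2 @ x2_hat                        # starting values for next f1 solve
        x2 = x2 + x2_hat
    return x1, x2

x1, x2 = reduced_jacobian_newton(np.array([0.5, 0.5]), np.array([0.5]))
print(x1, x2)   # converges to (1, 1) and (1,)
```

Only the $q \times q$ reduced matrix is factored; $J_1$ enters solely through triangular solves, mirroring the derivation above.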
As the number of spikes in any given bump is usually small in most large sparse matrices, the computational work per iteration is reduced to the inversion of a sequence of small matrices.

3. Summary and Conclusions

In this paper we have shown how reduced Jacobians can be used in solving systems of simultaneous equations whenever their incidence matrix can be partitioned as described in the previous section. The methodology is efficient not only for nonlinear systems but also for linear systems. This observation takes on importance if one notes that many large nonlinear systems are mostly linear, with the exception of a relatively small number of equations. It is important to note that the inversion of the reduced Jacobian matrix can be made efficient if one takes advantage of the nested structure displayed by the spikes in a typical bump. We refer to [1], where the major part of the paper focuses on the nested bordered block lower triangular form of the reduced Jacobian matrix. It is also interesting to note that quasi-Newton type methods can be used in revising the reduced Jacobian matrix at every iteration (see e.g. [2] and [6]). In conclusion, it should be observed that an effective implementation of the basic ideas presented in this paper is strongly dependent on a well-implemented Newton-type method and an efficient way of generating exact partial derivatives. Both these topics will be studied in a separate discussion paper.

References

[1] Bisschop, J., and A. Meeraus, "A Recursive Form of the Inverse of General Sparse Matrices", DRC Technical Note No. 1, 1976.

[2] Broyden, C.G., "Quasi-Newton, or Modification Methods", in Numerical Solution of Systems of Nonlinear Algebraic Equations, edited by G.B. Byrne and C.A. Hall, Academic Press: New York, 1973.

[3] Hellerman, E., and D. Rarick, "Reinversion with the Preassigned Pivot Procedure", Math. Programming 2 (1971), pp. 195-216.

[4] Kevorkian, A.K., and J. Snoek, "Theory and Applications in Solving Large Sets of Nonlinear Simultaneous Equations", in Decomposition in Large Scale Systems, edited by D.M. Himmelblau, North-Holland Publishing Company: Amsterdam, 1973.

[5] Kron, G., Diakoptics, MacDonald: London, 1956.

[6] Schubert, L.K., "Modification of a Quasi-Newton Method for Nonlinear Equations with a Sparse Jacobian", Mathematics of Computation (1970), pp. 27-30.

[7] Steward, D.V., "Partitioning and Tearing Systems of Equations", SIAM J. Numer. Anal., Ser. B, 2 (1965), pp. 345-365.