DOMAIN ANALYSIS AND EXACT POINT-DERIVATIVE GENERATION FOR LARGE NONLINEAR PROGRAMMING SYSTEMS

by Johannes Bisschop and Alexander Meeraus

Technical Note No. 7
Research Project 671-58
July 1978

Development Research Center
World Bank
1818 H Street, N.W.
Washington, D.C. 20433

Preliminary and Confidential: Not for quotation or attribution without prior clearance from the authors. Views expressed are those of the authors and do not necessarily reflect those of the World Bank.

ABSTRACT

The role of domain analysis and the automatic generation of exact point-derivatives in the solution of large nonlinear systems of algebraic equations is examined. The paper provides an informal technical discussion of these two topics, and emphasizes the importance of a software system with algebraic manipulative capabilities.

Key Words: Nonlinear Algebraic Equations, Large-scale Programming, Domain Analysis, Exact Point-Derivatives

1. Introduction

Most large nonlinear programming problems formulated today tend to become linear programming problems incorporating piecewise approximations of simple nonlinearities. For some problems the nonlinearities involve several variables at once, and piecewise linear approximations become prohibitive, as the required number of linear approximation variables is an exponential function of the number of variables determining a specific nonlinearity. Those problems can be solved with current (experimental) nonlinear programming codes such as the LSGRG code developed by Lasdon, Jain and Saunders. There still is, however, a large gap between the reliability associated with commercial linear programming codes and that associated with existing nonlinear programming codes. Technical difficulties form the major reason for the existence of this gap, as nonlinear problems are so much harder to solve than linear ones.

One of the main differences between linear and nonlinear systems of algebraic equations is their domain of definition. Linear equations are always defined for all points of the Euclidean space E in which they are contained, while nonlinear equations are sometimes defined only on (possibly unconnected) subsets of E. Similarly, the solution space of a set of linear algebraic equations is always convex, while this is not true for nonlinear systems of equations, where examples of non-convex, non-connected solution sets are easily constructed. These differences become immediately apparent to anyone involved in developing a solution algorithm for nonlinear programming problems. Most algorithms require gradient information, and have to search along a direction within the domain of definition. It is for this reason that an efficient internal generation of both feasible domains and exact partial derivatives takes on importance in the development of large-scale nonlinear programming codes.

Characteristics of special classes of nonlinear problems also affect the design of nonlinear programming codes. In most engineering applications the feasible solution set is well within the limits of the domain of definition of the algebraic equations used in the mathematical model. Infeasibilities permitted during the solution process stay within the domain of definition, and no premature termination of the solution algorithm occurs. This is not the case for many economic models, where the feasible solution set and the domain of definition for each variable often coincide.
Any code designed to solve this class of problems must recognize specific simple model constraints as definite boundaries that cannot be crossed at any stage of the algorithm.

Although we are interested in large nonlinear programming problems in general, we have a special interest in large nonlinear economic models. They have certain characteristics that make them suitable candidates for large-scale codes. They tend to be sparse; have few nonlinear equations; have simple nonlinearities such as the multiplication of two variables, or unknowns raised to a fractional power, or simple logarithms; and have non-negative domains as their natural property.

Most nonlinear equations in economic models are composed with the operators +, -, *, /, **, ln, and e. For conceptual reasons equations may be represented by a tree, and written as a sequence of simple operations involving two operands and one operator. Consider the following illustration, where all the symbols on the left side are unknowns in a model.

Equation

    x ** a + p/q + ln(k) = g

Tree

                 g
                 |
                 +
               /   \
              +     ln
             / \      \
           **   /      k
           / \ / \
          x  a p  q

Sequence of simple operations

    T1 = x ** a
    T2 = p/q            These are intermediate
    T3 = T1 + T2        equations
    T4 = ln(k)
    g  = T3 + T4

Representing the tree as a sequence of simple operations allows for automatic domain analysis. In the above representation a system could detect on the basis of the operators that

    T1 = x ** a    requires    x > ε, or, 0 ≤ x ≤ ε and a > 0
    T2 = p/q       requires    q > ε, or, q < -ε
    T4 = ln(k)     requires    k > ε

where ε > 0 is some pre-specified tolerance level.

On the basis of our experience with nonlinear models and nonlinear solution algorithms, we know that these intermediate equations are very important. Solution algorithms have sometimes failed simply because they searched in a direction that did not consider explicitly the various domain constraints imposed by the intermediate equations. There are two extremes that will remedy this problem. Either a model builder is required to enter all intermediate equations together with the associated domain constraints (to be entered as hard constraints and never to be violated), or a modeling system assumes this role automatically. We feel strongly that the initial model representation should look as natural as possible, and that any unnecessary impositions on the model builder should be avoided. We therefore support the idea that the initial model formulation is a compact version of a larger model to be solved, where the latter, blown-up version is generated internally by a machine.

The representation of algebraic equations in terms of a sequence of simple operations (intermediate equations) can provide another major advantage in large nonlinear systems. Let us illustrate this with an example. Assume that the equation x ** a + p/q + ln(k) = g with x, a, p, q, and k as variables is part of a triangular structure not uncommon to large sparse problems. In other words, there are five equations in terms of these variables such that their incidence matrix (indicating which variable is in which equation) takes on a lower triangular form with nonzero diagonal terms. One may visualize this as follows.

                   x     a     p     k     q     Right-hand side
    Equation 1     x                                  (x)
    Equation 2    (x)    x                            (x)
    Equation 3    (x)   (x)    x                      (x)
    Equation 4    (x)   (x)   (x)    x                (x)
    Equation 5     x     x     x     x     x           g

       x    occurrence
      (x)   possible occurrence
     blank  no occurrence

In such a system one can solve first for x, then for a given x, then for p given x and a, etc. If the equation x ** a + p/q + ln(k) = g is the last equation of the system with g prespecified, one can perform a Newton step to solve for q after one has solved for x, a, p, and k.
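Before turning to how the Newton step can be avoided, the following is a minimal sketch of what automatic domain analysis over the intermediate equations might look like in code. It is our own illustration, not part of any system described in this note; Python is used purely for exposition, and the names INTERMEDIATE, check_domains and the tolerance EPS are invented.

    import math

    EPS = 1.0e-6  # hypothetical pre-specified tolerance level epsilon

    # Sequence of intermediate equations for  x ** a + p/q + ln(k) = g,
    # each written as (result, operator, first operand, second operand).
    INTERMEDIATE = [
        ("T1", "**", "x", "a"),
        ("T2", "/",  "p", "q"),
        ("T3", "+",  "T1", "T2"),
        ("T4", "ln", "k", None),
        ("g",  "+",  "T3", "T4"),
    ]

    def check_domains(point):
        """Walk the intermediate equations at a trial point; return the list
        of violated domain constraints (empty if the point is in the domain).
        Only **, / and ln generate constraints; + is defined everywhere."""
        values = dict(point)
        violations = []
        for result, op, u, v in INTERMEDIATE:
            a = values[u]
            b = values[v] if v is not None else None
            if op == "**" and not (a > EPS or (0.0 <= a <= EPS and b > 0.0)):
                violations.append(f"{result} = {u}**{v}: need {u} > eps, or 0 <= {u} <= eps with {v} > 0")
            elif op == "/" and not abs(b) > EPS:
                violations.append(f"{result} = {u}/{v}: {v} must lie outside [-eps, eps]")
            elif op == "ln" and not a > EPS:
                violations.append(f"{result} = ln({u}): {u} must be > eps")
            if violations:
                return violations            # stop before evaluating an undefined operation
            if op == "+":
                values[result] = a + b
            elif op == "/":
                values[result] = a / b
            elif op == "**":
                values[result] = a ** b
            else:                            # "ln"
                values[result] = math.log(a)
        return violations

    # A trial point with q = 0 trips the constraint generated by T2 = p/q.
    print(check_domains({"x": 2.0, "a": 1.5, "p": 4.0, "q": 0.0, "k": 3.0}))

A solution algorithm could apply such a check to every trial point so that the search never leaves the domain of definition, which is exactly the role the intermediate equations play in the discussion above.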
The iterative Newton step, however, can be replaced with one single function evaluation if one rearranges the equation and expresses q as a function of x, a, p, k, and g. This can be done by inverting some of the intermediate equations, and re-ordering them such that q is evaluated as the last equation. For the above example the re-ordered set of intermediate equations is

    T1 = x ** a     x > ε, or, 0 ≤ x ≤ ε and a > 0
    T4 = ln(k)      k > ε
    T3 = g - T4
    T2 = T3 - T1
    q  = p/T2       T2 > ε or T2 < -ε

Note that we have a new domain constraint on T2 that was not there previously, while the explicit domain constraint on q has disappeared.

The above example shows that if an equation can be rearranged after one has detected a certain structure induced by the occurrence or non-occurrence of variables in equations, simple function evaluations sometimes suffice for the calculation of variables which occur as nonlinear terms in the original equations. As most solution algorithms have to take advantage of sparsity in order to become efficient for large problems, this observation becomes important.

The following table enumerates all possible intermediate equations, their associated inverses, and their respective domains of definition.

    Equation      Domain                     Inversion          Domain
    z = x + y                                x = z - y
                                             y = z - x
    z = x - y                                x = z + y
                                             y = x - z
    z = x * y                                x = z/y            y < -ε or y > ε
                                             y = z/x            x < -ε or x > ε
    z = x/y       y < -ε or y > ε            x = z * y
                                             y = x/z            z < -ε or z > ε
    z = x ** y    x > ε, or,                 x = z ** (1/y)     z > ε and (y < -ε or y > ε), or,
                  0 ≤ x ≤ ε and y > 0                           0 ≤ z ≤ ε and y > 0
    z = e ** x                               x = ln(z)          z > ε
    z = ln(x)     x > ε                      x = e ** z

Appendix A describes how the above inversion formulas can be used while rearranging algebraic equations at a symbolic level. One may wonder if it is always possible to rearrange nonlinear equations as desired. For arbitrary nonlinear equations the answer is negative. It should be noted, however, that if a variable occurs only once in the original equation, and the operators +, -, *, /, **, ln, and e are used, it is always possible not only to express one variable in terms of the remaining ones, but also to have this task performed automatically by a machine. Although these conditions are only sufficient and not necessary, they have been satisfied for all economic models we have observed thus far.

3. Exact Point-Derivatives

Derivative information (gradients, Hessians) is important in many nonlinear programming algorithms. It is either supplied by the model builder or generated internally by a machine. For model builders this task becomes prohibitive whenever their models are large. An internal generation of derivative information is, therefore, imperative. There are several ways to accomplish this task. Most solution routines have an option to use first differences in the numerical approximation of derivatives. These approximations usually serve their purpose very well, but sometimes bring numerical trouble (inaccuracies) when they are taken near the boundary of their domain of definition. A second method is to have the machine take derivatives at a symbolic level, and to compute point evaluations of these expressions. In general these generated strings of symbols tend to become quite cumbersome, and require substantial storage. A third possibility is the evaluation of exact point-derivatives. This method requires the tree representation of equations (i.e. the intermediate equations) and a set of rules. Its major advantages are that no extra storage is needed for generated strings, and no numerical approximations are involved.
Consider the function f = ln(x**3 * y**2), which has the intermediate equations

    T1 = x ** 3
    T2 = y ** 2
    T3 = T1 * T2
    f  = ln(T3)

One may think of T1, T2, T3 and f as functions of x and y with associated gradients and Hessians. As the machine goes through the intermediate equations it can build up not only a point evaluation of the function f, but also a point evaluation of the gradient and Hessian of f. Using the rules of Appendix B and letting x = 2 and y = 3, the gradient of f can be computed as follows.

    T1 = x ** 3     ∇T1 = (T1 * 3/x, 0)           =  8        (12, 0)           Using Rule 4c
    T2 = y ** 2     ∇T2 = (0, T2 * 2/y)           =  9        (0, 6)            Using Rule 4c
    T3 = T1 * T2    ∇T3 = T2(12, 0) + T1(0, 6)    = 72        (108, 48)         Using Rule 2a
    f  = ln(T3)     ∇f  = (1/T3)(108, 48)         = 4.2767    (1.5, 0.6667)     Using Rule 7

The above example illustrates how a small set of rules can be employed to build up exact gradient information for a large class of algebraic functions. Appendix B develops the necessary notation, and enumerates the rules by which gradients and Hessians can be calculated.

This note has emphasized domain analysis and the generation of exact partial derivatives as two important components in the development of large-scale nonlinear programming codes. A special compiler is needed for the implementation of these components as they both operate on the tree structure associated with algebraic equations. We feel, however, that the extra effort required for building a compiler is a relatively small price for the gain in reliability that one should expect from any code incorporating the ideas of this paper.

Appendix A

Symbolic Rearrangement of Algebraic Equations

Although it is not our intent to describe a detailed algorithm for a machine to rearrange equations, we do want to give an example to describe how this might be accomplished.

Consider the following intermediate equations for the algebraic equation x ** a + p/q + ln(k) = g of Section II, and assume that we want to express q as a function of the remaining variables.

    T1 = x ** a     x > ε, or, 0 ≤ x ≤ ε and a > 0
    T2 = p/q        q > ε or q < -ε
    T3 = T1 + T2
    T4 = ln(k)      k > ε
    g  = T3 + T4

Search for the variable q on the right-hand side (it should be found only once), invert for q, cross out the equation, and start a new list with the inverted equation at the bottom.

    T1 = x ** a
    T3 = T1 + T2
    T4 = ln(k)
    g  = T3 + T4
                        New list:   q = p/T2

Search for T2 on the right-hand side (it occurs only once), invert for T2, cross out the equation, and add the inverted equation next to the bottom of the new list.

    T1 = x ** a
    T4 = ln(k)
    g  = T3 + T4
                        New list:   T2 = T3 - T1
                                    q  = p/T2

Search for T3 on the right-hand side, invert for T3, cross out the equation, and add the inverted equation to the new list.

    T1 = x ** a
    T4 = ln(k)
                        New list:   T3 = g - T4
                                    T2 = T3 - T1
                                    q  = p/T2

Search for T4 on the right-hand side. As it cannot be found, the rearrangement process is complete, and the remaining intermediate equations are added to the new list in their original order. The final new list is as follows.

    T1 = x ** a     x > ε, or, 0 ≤ x ≤ ε and a > 0
    T4 = ln(k)      k > ε
    T3 = g - T4
    T2 = T3 - T1
    q  = p/T2       T2 > ε or T2 < -ε
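The note deliberately stops short of a detailed algorithm; purely as our own illustration, the following Python sketch mechanizes the search-invert-reorder steps above for this example. The names (invert, rearrange) and the representation of an intermediate equation as a (result, operator, operand, operand) tuple are invented, only the operators needed for the example are handled, and the stopping test is simplified: the chain of symbols being traced (q, T2, T3, ...) ends as soon as the current symbol no longer occurs on any remaining right-hand side. For this example the sketch reproduces the final list shown above.

    def invert(lhs, op, u, v, target):
        """Rewrite  lhs = u <op> v  so that `target` (= u or v) is on the left."""
        if op == "+":
            other = v if target == u else u
            return (target, "-", lhs, other)       # target = lhs - other
        if op == "/":                              # lhs = u / v
            if target == u:
                return (u, "*", lhs, v)            # u = lhs * v
            return (v, "/", u, lhs)                # v = u / lhs
        if op == "ln":                             # lhs = ln(u)
            return (u, "exp", lhs, None)           # u = e ** lhs
        if op == "**":                             # lhs = u ** v
            if target == u:
                return (u, "**", lhs, f"1/{v}")    # u = lhs ** (1/v)
            raise NotImplementedError("inverting for the exponent needs logarithms")
        raise NotImplementedError(op)

    def rearrange(equations, target):
        """Repeatedly find `target` on a right-hand side, invert that equation,
        prepend it to the new list, and trace upward; finally prepend the
        untouched equations in their original order."""
        remaining = list(equations)
        new_list = []
        while True:
            hits = [eq for eq in remaining if target in eq[2:]]
            if not hits:                           # target not on any right-hand side
                return remaining + new_list
            lhs, op, u, v = hits[0]
            remaining.remove(hits[0])
            new_list.insert(0, invert(lhs, op, u, v, target))
            target = lhs                           # continue the chain upward

    # Intermediate equations for  x ** a + p/q + ln(k) = g  (g is given).
    eqs = [("T1", "**", "x", "a"), ("T2", "/", "p", "q"),
           ("T3", "+", "T1", "T2"), ("T4", "ln", "k", None),
           ("g",  "+", "T3", "T4")]
    for eq in rearrange(eqs, "q"):
        print(eq)       # T1, T4, T3 = g - T4, T2 = T3 - T1, q = p/T2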
Appendix B

Rules for Point Derivatives

The purpose of this appendix is to summarize the various rules that one needs to consider in the automatic generation of exact first and second order partial derivatives. The following notation will be used throughout the sequel.

    α, β           refer to scalars
    a, b and c     refer to functions of n variables x1, x2, ..., xn
    ∇              refers to a row vector of partial derivative operators
    ∇'             refers to the transpose of ∇

    ∇a  = ( ∂a/∂x1, ∂a/∂x2, ..., ∂a/∂xn )

    ∇²a = ∇'(∇a) = the n-by-n matrix whose (i,j) element is ∂²a/∂xi∂xj

∇a is called the gradient of the function a, and ∇²a is called the Hessian of the function a.

    ∇a × ∇b = (∇a)'∇b = the n-by-n matrix whose (i,j) element is (∂a/∂xi)(∂b/∂xj)

∇a × ∇b is called the cross product of ∇a and ∇b. The following properties hold for the cross product operator.

    i)   αβ( ∇a × ∇b ) = α∇a × β∇b                          (Associative Law)
    ii)  ( ∇a + ∇b ) × ∇c = ∇a × ∇c + ∇b × ∇c               (Distributive Law)
    iii) ∇'[ a∇b ] = a∇²b + ∇a × ∇b

The remaining portion of this appendix gives the rules required for the differentiation of functions of several variables. It should be noted that whenever the function is defined on some domain of definition, so are its derivative formulas. One exception is c = a ** b (rules 4a, 4b, 4c): if 0 ≤ a ≤ ε and b > 0, then c and ∇c are both zero, and the formulas for ∇c and ∇²c are to be ignored. For notational convenience we write ln²(a) and tan²(a) for [ln(a)]² and [tan(a)]².

Rules for Point Derivatives

    1)  c = a + b     ∇c = ∇a + ∇b
                      ∇²c = ∇²a + ∇²b

    2a) c = ab        ∇c = b∇a + a∇b
                      ∇²c = b∇²a + a∇²b + (∇a × ∇b + ∇b × ∇a)

    2b) c = αa        ∇c = α∇a
                      ∇²c = α∇²a

    3a) c = a/b       ∇c = (1/b)∇a - (a/b²)∇b = (1/b)(∇a - c∇b)
                      ∇²c = (1/b)∇²a - (a/b²)∇²b - (1/b²)(∇a × ∇b + ∇b × ∇a) + (2a/b³)∇b × ∇b
                          = (1/b)[ ∇²a - c∇²b - (∇c × ∇b + ∇b × ∇c) ]

    3b) c = α/b       ∇c = -(α/b²)∇b = -(c/b)∇b
                      ∇²c = -(α/b²)∇²b + (2α/b³)∇b × ∇b
                          = -(1/b)[ c∇²b + (∇c × ∇b + ∇b × ∇c) ]

    3c) c = a/β       ∇c = (1/β)∇a
                      ∇²c = (1/β)∇²a

    4a) c = a ** b    ∇c = b a**(b-1) ∇a + a**b ln(a) ∇b = c[ (b/a)∇a + ln(a)∇b ]
                      ∇²c = a**b [ (b/a)∇²a + ln(a)∇²b + (b(b-1)/a²)∇a × ∇a + ln²(a)∇b × ∇b
                                   + ((b ln(a) + 1)/a)(∇a × ∇b + ∇b × ∇a) ]
                          = c[ (b/a)∇²a + ln(a)∇²b + (1/a)(∇a × ∇b + ∇b × ∇a) - (b/a²)∇a × ∇a ]
                            + (1/c)∇c × ∇c

    4b) c = α ** b    ∇c = α**b ln(α) ∇b = c ln(α)∇b
                      ∇²c = α**b [ ln(α)∇²b + ln²(α)∇b × ∇b ] = c ln(α)∇²b + (1/c)∇c × ∇c

    4c) c = a ** β    ∇c = β a**(β-1) ∇a = (βc/a)∇a
                      ∇²c = β a**(β-1) [ ∇²a + ((β-1)/a)∇a × ∇a ] = (βc/a)[ ∇²a + ((β-1)/a)∇a × ∇a ]

    5)  c = e ** a    ∇c = e**a ∇a = c∇a
                      ∇²c = e**a ( ∇²a + ∇a × ∇a ) = c∇²a + (1/c)∇c × ∇c

    6)  c = α ** a    ∇c = α**a ln(α) ∇a = c ln(α)∇a
                      ∇²c = α**a ln(α)( ∇²a + ln(α)∇a × ∇a ) = c ln(α)∇²a + (1/c)∇c × ∇c

    7)  c = ln(a)     ∇c = (1/a)∇a
                      ∇²c = (1/a)∇²a - (1/a²)∇a × ∇a = (1/a)∇²a - ∇c × ∇c

    8)  c = sin(a)    ∇c = cos(a)∇a
                      ∇²c = cos(a)∇²a - sin(a)∇a × ∇a

    9)  c = cos(a)    ∇c = -sin(a)∇a
                      ∇²c = -sin(a)∇²a - cos(a)∇a × ∇a

    10) c = tan(a)    ∇c = ( 1 + tan²(a) )∇a = ( 1 + c² )∇a
                      ∇²c = ( 1 + tan²(a) )[ ∇²a + 2 tan(a)∇a × ∇a ] = ( 1 + c² )∇²a + 2c( 1 + c² )∇a × ∇a

    11) c = |a|       ∇c  =  ∇a  if a > 0,   -∇a  if a < 0,   undefined if a = 0
                      ∇²c =  ∇²a if a > 0,   -∇²a if a < 0,   undefined if a = 0
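As a numerical check of these rules, the following sketch (ours, not part of the note; Python with NumPy is used only for illustration, and all names are invented) propagates value, gradient and Hessian through the intermediate equations of the Section 3 example f = ln(x**3 * y**2) at x = 2, y = 3, using rules 4c, 2a and 7 together with the cross product operator. The computed Hessian agrees with the Hessian of the equivalent expression 3 ln(x) + 2 ln(y), namely diag(-3/x², -2/y²).

    import numpy as np

    def cross(u, v):
        """Cross product of Appendix B:  (grad a)'(grad b),  an n-by-n matrix."""
        return np.outer(u, v)

    x, y = 2.0, 3.0

    # Independent variables as (value, gradient, Hessian) triples in (x, y).
    vx = (x, np.array([1.0, 0.0]), np.zeros((2, 2)))
    vy = (y, np.array([0.0, 1.0]), np.zeros((2, 2)))

    def power(a, beta):                   # Rule 4c:  c = a ** beta
        val, g, H = a
        c = val ** beta
        gc = beta * val ** (beta - 1) * g
        Hc = beta * val ** (beta - 1) * (H + (beta - 1) / val * cross(g, g))
        return c, gc, Hc

    def mul(a, b):                        # Rule 2a:  c = a * b
        (va, ga, Ha), (vb, gb, Hb) = a, b
        return (va * vb,
                vb * ga + va * gb,
                vb * Ha + va * Hb + cross(ga, gb) + cross(gb, ga))

    def log(a):                           # Rule 7:   c = ln(a)
        val, g, H = a
        gc = g / val
        return np.log(val), gc, H / val - cross(gc, gc)

    T1 = power(vx, 3.0)                   # T1 = x ** 3
    T2 = power(vy, 2.0)                   # T2 = y ** 2
    T3 = mul(T1, T2)                      # T3 = T1 * T2
    f  = log(T3)                          # f  = ln(T3)

    print(f[0])   # 4.2767...  = ln(72)
    print(f[1])   # [1.5, 0.6667]
    print(f[2])   # [[-0.75, 0], [0, -0.2222]]  =  diag(-3/x**2, -2/y**2)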