COMBINING AN OPTIMAL CONTROL PROGRAM VITH THE TIME SERIES PROCESSOR SYSTEM by Arne Drud Technical Note No. 15 Research Project 671-58 September 1979 Development Research Center World Bank 1818 H Street, N.W. Washington, D.C. 20433 The views expressed in this paper are those of the author and do not necessarily reflect those of the World Bank or its affiliated organizations. The research project described in this paper was performed in cooperation between Arne Facins and Henrik Bjerre-Nielsen, The Danish Ministry of Finance, and Arne Drud, IMSOR, The Technical University of Denmark, in the spring of L979. The codes described in the paper have been implemented and some runs have been made, but due to the authorb temporary assignment in the World Bank a thorough testing will not be finished until the spring of 1180. The paper is more detailed and technical version of a paper to be presented at the "Third IFAC/IFORS Conference on Dynamic Modelling and Control of National Economies," tiarsaw, June, 1980. COMBINING AN OPTIMAL CONTROL PROGRAM WITH THE TIME SERIES PROCESSOR SYSTEM Arne Drud Development Research Center World Bank 1818 H Street, N.W., Washington, D.C. 20433 0. Abstract The TSP (Time Series Processor) system developed at various uni- versities in the United States is used by many universities and government offices to perform data manipulation, estimate econometric models, ari solve or simulate the model, and many models and data banks are stired in TSP for- mat. The project described here was to combine an existing optimal control program with the TSP system in order to make the transition from simulation to optimization fast, easy, and reliable. The optimal control program requires very detailed information about the model such as analytic expressions for the derivative of the equa- tions, the nonzero pattern of the Jacobian of the model, bounds on the endo- genous and the control variables, etc. The paper describes the necessary input to the optimal control program and how TSP is used to generate this input. Five new commands are added to the TSP command set in order to make the definition of the econometric model, the data, the bounds, and the objec- tive function easy. The new commands are described and it is shown how to define an optimal control problem via TSP and how to generate related optimal control problus, e.g. with another objective function or with another set of lounds. Key words: Optimal Control, GRG-algorithm, Econometric Models, Automatic Formula Manipulation, Numerical Formula Representation, Automatic Differentiation. -2- 1. Introduction During many years it has been realized that optimal control methods can be a valuable tool when applied to econometric models. Optimal control methods can be used to find a good time path for the decisions a policy maker should make in order to reach certain targets. But, just as important, optimal conti:l methods are very good in testing the behavior of economic models since they tend to reveal the weak spots in the model. Many examples based on small models has already shown this. In the beginning of the seventies, it was shown that optimal control was also feasible for larger models (Fair, 1974 and Holbrooke, 1975). These first optimization approaches were based on repeated calls to existing simula- tion routines for the econometric models, usually Gauss-Seidel routines, and they were relatively easy to set up and interface with existing models and data banks. However, the solution times were large and the accuracy was not very good due to inaccurate simulation routines. Later attempts to increase the efficiency of optimal control pro- grams have been based on Generalized Reduced Gradients for finding search directions and in many cases on variations of Newton's method for solving, or simulating the model (Bird, 1976, Drud, 1976 and 1978a, Lasdon and Mantell 1978, and Nepomiastchy and Ravelli, 1978). Small examples indicates that these methods are indeed faster. However, until now they have not been applied to really large models, mainly because they require a description of the econometric model that is more detailed than usually available. A manual preparation of t-is detailed description is very slow and the prob- ability of errors is large. Therefore, research has in recent-years been directed towards methods for autumatic manipulation of econometric models, and the current paper should be seen as a part of this research. - 3 - The paper is organized as follows: In section 2 we describe one of the existing reduced gradient algorithms, CONOPT, with special emphasis on the input requirements. Section 3 describes an existing system for esti- mating and simulating econometric models, the TSP or Time Series Processor System. This system is used in various government offices and universities in Denmark, and most major models of the Danish economy are maintained within this system. The second half of section 3 describes the internal representa- tion of expressions in TSP. It is rather technical as is section 4, where the details of the implementation of the link between TSP and CONOPT are described. The reader without technical interest can jump to section 5 where the new TSP-commands are described, and section 6 where we consider the use of these commands in optimal control exercises. -4- 2. The CONOPT-code. The CONOPT-code is a FORTRAN-code for solving deterministic dynamic optimization problems. The code is based on the Generalized Reduced Gradient principle (Abadie and Carpentier, 1969, and Abadie, 1970) for computing search directions and on Newtons method for imulating the model. It is developed for fairly general dynamic optimization problems, but many of the implementa- tion choices have been guided by the properties of econometric models, so it should be efficient for this type of models. More details can be found in Drud (1976 and 1978a). Some of the efficiency of CONOPT is based on a very detailed des- cription of the optimization problem, that must be entered as input. The input must contain: (a) A FORTRAN subroutine that for given values of the optimization variables, i.e. the endogenous variables and the control variables, computes the residuals of all equations in the model. (b) A structural description of the model: For each equation describe which optimization variables enter into the definition of the equation, both unlagged and lagged. The efficiency can be increased, if the user also defines which variables enter linearly, for the linear variables which have constant and which have time dependent coefficients, and if the equations are grouped into linear and nonlinear equations. (c) A FORTRAN subroutine that computes the values of the deriva- tives of all equations with respect to all optimization - 5 - variables. Only nonzero derivatives corresponding to the structural description should be supplied. If linear relationships have been recognized, it is only necessary to compute the coefficients in the linear expressions once. An option exists for computing the derivatives numerically. However, this will increase the computing time considerably. (d) A FORTRAN subroutine that computes the value of the objective function. (e) A FORTRAN subroutine that computes the nonzero derivatives of the objective with respect to the optimization variables. There is also an option for computing numerical derivatives. (f) A file with numerial values for exogenous variables, bounds on the optimization variables, etc. In both b aizd c all information must be supplied columnwise, i.e. variable by variable and not equation by equation. This makes it rather difficult to make changes in one or in a few equations. Also, preferably the model should be w- scaled which means that all derivatives and all solution values are not too many orders of magniLude away from 1. For'small models it is easy to generate the required input. How- ever, for larger models it can be very time consuming to prepare .the input, especially part b and c, and the probability of errors is large. - Also, it can be rather time consuming to change the model because the information supplied in a, b, and c must be changed simultaneously and consistently. Therefore, during the last years more approaches have been made towards simplifying the input process such that only the equations and the - 6 - objective function are entered in a straight forward format instead of input a to e above. The partial derivatives and the structural information are derived through automatic methods, see Drud (1978 b and c). When this first step has been taken it is only a small step further to take the equations of the model directly from the simulation and/or estima- tion system that is used to handle the econometric mcdel. The following section describes the system that is used with most larger econometric models in Denmark. The main ideas of the system are given followed by a detailed description of those parts of the system that are necessary in a link to the CONOPT code. -7- 3. The Time Series Processor System, TSP. The TSP-system is a FORTRAN-based system for manipulating time series and for estimating and simulating econometric models. It has been developed at a number of universities in the United States over a long period cf time. Unfortunately, the development has not been very coordinated so today different versions have their force in different areas, and most ver- sions are more or less dependent on the operating system of a certain computer. The following description will cover the University of Wisconsin version as set up on the Univac 1108 computer at RECKU, the Computing Center at the University of Copenhagen, with some later changes, see 3ever (1977). Due to the different versions of TSP, the link described here can unfortunately not bc transferred to other TSP-systems without a large aount of work. In batch mode the input to TSP consists of two files, a command file and a data file plus possibly a reference to one or more data banks. On the coomand file the work tasks are described through a sequence of commands, each followed by a set of parameters. The input is almost free format with no fixed columns. TSP can also be used in demand mode (i.e. time sharing) in which case the two input files are merged into one file, the terminal keybord. The basic TSP commands are used to manipulate time series (the names of the commands are given in parenthesis after the description): read, both unformatted (LOAD) and formatted (READ), and write (PRINT and PLOT)-time series, generate new time series from old ones (GENERATE and CUMULATE), and compute certain statistics such as mean, the minimum and maximum values, standards diviation etc (MEAN, MIN, MAX, ...). A second set of commands are used to define functional forms (FORMULA), parameters (PARAMETER), or constants (CONSTIAST). All formulas are - 8 - defined in normalized form, i.e. with one unlagged variable on the left side of the equality sign. The format is almost standard FORTRAN. A set of estimation commands work on the functional forms and estimates the values of the unknowns, i.e. the quantities that were defined as parameters, using ordinary least squares (OLSQ), two-stage least squares and instrumental variables (INST), or nonlinear least squares on one or more equations (LSQ). All time series read in or created as well as all formulas, para- meters, and constants are defined by a unique name, and they are stored in an internal format in a data bank. The data bank can be saved from one run to another such that a whole library of time series, equations, and parameters can be build through a succession of r-ans. It is also possible to save different parts of the data in different data banks. Based on the equations and their data it is possible to define models. Through a LIST-command a set of equation names are grouped together with a name to form a model. The endogenous variables of the model are implicitly defined as the set of variables on the left hand side of the equality signs. The command ORDER analyzes the structure of the model and finds the recursive block ordering of the model, and the command SIMULATE simulates the model using the information from ORDER. The values for the exogenous variables, the parameters, and the constant are automatically re- trieved from the data bank before the S4Ul,tion is started. Currently most major Danish econometric models are maintained in the TSP-system and there re many development plans. At the roment each user has his own data bank with time series data. However, the plan is to make a large data bank with all major economic time series, malitained and updated by the official st--istical department. Users can then retrieve series from - 9 - this bank and generate their own series of differences, quotients etc. Each user can also have a bank of equations, often with alternative specifications toz some of the equations. A model is then simply defined as a list of equation names, where any subset of the set of equations in the bank can be selected. By changing certain names in a list of equation names or by choosing different subsets of names it is fairly easy to define many different models and test their behaviour in a simulation. Because of the plans of using TSP as the general modelling system in Denmark it was decided to make a link from TSP to CONOPT. Different ad-hoc optimizations had earlier been made, but each time the transformation of the model from the simulat, system into the format required by CONOPT caused many problems. Before describing hcw the ditferent parts of the input to CONOPT are generated from TSP it is necessary to describe the data bank in TSP and the way equations are represented in trnally. The following will be rather technical and users without technical interest can continue with section 5. All quantities, i.e. constants, parameters, time series, formulas, and lists are represented by a name with 1 to 6 alfanumerie characters,the first being alfabetic. In the data bank the quantities are stored in separate logical reco-ds on a random access file with the name of the quantity used as the key. The second word in the record is always the length of the record excluding the name and length words. Parameters and constants have length 1 while time series, for=ulas and lists can have variable length. One subroutine, PACKET, saves and retrieves everything from the data bank. The programmer that wants to add new routines to TSP has only to think about the names of the quantities and ?ACKET will take care of file handling and - 10 - hashing techniques. A large buffer is used for intermediate storage of some of the quantities and PACKET keeps track of the content of tne buffer such that search on the disc in many cases can be avoided. Formulas are stored in a passed form using a directed graph repre- sentation as shown in figures 1 to 3. The graph is acyclic and in many cases + 17 15 + * 12 * 4 * 11\ 13.log - 9 - 10 / 16 1.3 4 a 3 (-1) c 2.5 2 e 8 d(-2) 7 Figure 1: The representation of the expression r - e*(a-b(-1))*(c-.5)+1.5*log(a)-e/d'-2) Ln TSP it Is actually a tree, so the representation is usually referred to as a tree representation. Internally the representation is diiided into two parts, a description of the variables or numerical constants tha.. enters into the - 11 - Number Name or Value Type lag 1 r symbolic 0 2 2.5 numeric 0 3 a symbolic 0 4 1.5 numeric 0 5 b symbolic 1 6 c symbolic 0 7 d symbolic 2 8 e symbolic 0 Figure 2: The -able describing the inn.t nodes for thc fcrmula in figure 1. Operator operand 1 operand 2 result 3 5 9 6 2 10 * 9 10 11 * 11 8 12 log 3 - 13 * 13 14 + 12 14 15 / 8 7 16 + 15 16 17 Figure 3: The table describing the operations for the formula in figure 1. - 12 - expression, see figure 2, and a description of the operations that should be performed on these variables, figure 3. Each variable or numerical constant is described through a. value or a name, a type, i.e. numerical value or symbolic name, and a lag number. The first variable is assumed to be the variable on the left hand side. The variables are considered as the leaf- nodes in the tree or the acyclic graph. The operation3 are described through an operations code (2 for +, 3 for -, 13 for log, etc) followed by tdo numbers hat represent the nodes of the operators and a number that represent the node where the result of the operation is stored. In the following the operator nodes will be called parent nodes and the result node will be called the scn. The final result node is called the root. Notice the difference in notation from search trees where each node has at most one parent and where the leaf nodes are the ultimate sons. 'When a formula has to be evaluated the input nodes are scanned and the values of the symbolic variables are retrieved from the data bank. The operations part is then analyzed. For each operation the code number and the node numbers are found, and based on the code number a computed GO .TO brings us to a piece of ccde where the correct operation is perfo,rmed on the proper node values and the result is saved again in the proper node. - 13 - 4. Implementing the link of TSP and CONOPT The first major decision concerning the link of TSP and CONOPT was whether they should be aggregated into one large program or not. It was felt that it would be easier to implement the linkage through a few commands in TSP that generated one or more disc files. These files could then be read into CONOPT through a special input program. Actually, most of the implemen- tation decisions for this first version were guided by the expected time to implement the code since the time limit was very strict. In a later version hopefully it will be possible to put more emphasis on efficiency. If we consider the input requirements described in section 2, some of the parts are in principle straight forward to produce from the TSP-repre- sentation altaough a lot of data manipulation and book-keeping are necessary. Part b, the structural description of the model, is eaey. Running through the list of ecuations, the set of endogenous variables can be identified as the set of -ariables on the left hand sides of the equations, i.e. for each equation it is the first variable in the table that describes the input variables of the equation. The control variables are assumed to be defined through a list in TSP and the list of optimization variables is then simply the concatenation of the list of endoganous and the list of control variables. The information on which variables enters in which equations can -be created in a second pass of the list of equations. For each equation the names in the list of input variables are compared with the names in the list of opti- mization variables, and for each match a nonzero has been detected in the Jacobian. The names that does not match are identified as either exogenous variables or as parameters or constants, and the names are entered in a separate list. The search produces a row vise representation of the struczure - 14 - of the Jacobian, but in a single pass of the nonzero elements it is easy to transpose the representation. In the first version of the link the searches for matching names have all been implemented as straight forward linear searches, but in a later version a hashing technique as in Brent (1973) or a binary search technique should be used. The second major decision was on the representation of equations or expressions. CONOPT requires that the residuals of the equations can be computed. There were two possibilities: we could use an internal tree representation similar to the one used in TSP with a similar algorithm for interpreting the expressions, or we could transform the TSP-representation into a FORTRAN expression and enter it into the proper subroutine. A FORTRAN compiler could then be used to generate an optimized machine code. Some simple timing experiments were performed to estimate the loss in efficiency from using a tree representation. An ordinary model expression was evaluated 10,000 times through a subroutine call using both approaches and on an IBM 370/165 the ratios between the computing times were between 2.2 and 4 depending on the compiler. Further 'experiments have to be performed to reduce the influence of the set up times for the subroutine. But because of these small ratios and because of the easier implementation it was decided to use the tree 'representation in a first version of the link. In a new pass of the equations after the lists of optimization variables, exogenous variables, and constants are found the tree representa- tion is changed by translating the numbers of the nodes from a set of numbers local to the formula into a set of global node numbers or indices in a vector. Because there are two types of basicly different variables, time dependent and time independent, it is not easy to work directly on the vectors used by - 15 - the optimization code. Instead an extra vector with the content shown in figure 4 is created in which the computations are performed. The last part lag maxlag lag-1 lag - 0 /__/_/ optimization variables with varying number of lags exogenous variables i I constants defined by value I-j constants defined by name nodes containing intermediate and final expression values Figure 4: The content of the vector to which the tree representation of the formulas is applied of the vector with time independent constants can be initialized once while the first part with the time dependent variables must be set up each time the time index changes in the optimization algorithm. When the optimization variables changes during the iterations a part of the vector must be changed, but often it is only the unlagged variables. As mentioned in section 2 the model should be well scaled if pos- sible. With the scheme chosen above it is very simple to introduce scaling. Each time an optimization variable is moved from the vector of optimization - 16 - variables used by CONOPT to the vector used in the residual routine it is multiplied by a column- or variable-scale factor, and each time a residual is returned to CONOPT it is multiplied by a row- or equation-scale factor. The same scale factors are later on also used to scale the derivatives. The third major decision was on the computatior and representation of derivatives. For reasons of numerical accuracy it was decided not to use finite differences but to use an accurate derivative. Two alternatives were immediately available: to create a separate tree representation of the expression for the derivative, or to compute the numerical value of the derivative directly from the tree representation of the expression. Both alternatives are based on the simple fact, that the derivative of a tree node A with parent nodes B and C can be computed in a si=ple manner depending on the operator from the values of node A, B, and C and the deriva- tives of node B and C. In some cases such as with + and - operators the values are not even needed. A tree representation of a derivative is created in a three pass procedure. In a first pass through the tree from the leaves to the root we recognize which nodes have a nonzero derivative with respect to a certain variable. Initially only one leaf node has a derivative, and during the pass a node is assigned a nonzero derivative if a least one of its parent nodes has a nonzero derivative. In a second pass back again through the tree from the root to the leaves we mark the nodes whose values are needed for the deriva- tive of the root node. Depending on the operator of a node with a nonzero derivative we mark (a) both the value of the node itself and the values of the parent nodes as needed, e.g. with square root, exponential function and divide; - 17 - (b) one or both of the parent nodes as needed, e.g. with miltiplication and logarithm; (c) or we d,n't mark any values as needed, e.g. with addition and subtraction. If the value of a node is marked as needed due to operators closer to the root the values of the parent nodes are also marked as needed. The nodes whose values are needed will form one or more subtrees or subgraphs of the original tree or graph, and the code used to compute their values is simply a subset of the code used to compute the original expression, possibly with a renumbering of some intermediate nodes. In the third pass of the tree we start from the level above the leaves and work against the root creating code fo: che derivative of each node that has a nonzero derivative. Each operator in the original expression tree gives rise to a certain piece of code using the numbers of the rodes that represent the value of the node itself, the values of the parent nodes, and the derivative of the parent nodes as input. Special care can be taken to eliminate multiplications with 0 or I or other unnecessary operations. Figures 5 to 7 show the generation of the derivative of the expression from figures 1 to 3 with respect to one of the variables. - 18 - d + 17 d + 15 d 12 d d 14 11 d V 3 log - - 0 16 v d v 1.5 4 a (- 6 2.5 2 e 8 d(-2) 7 Figure 5: The nodes in the expression from figures 1-3 whose derivatives (d) or values (v) are needed to represent the derivative with respect to input variable a. - 19 - Operator Operand 1 Operand 2 Result Comments 6 2 9 The code that computes Val (10) is transferred ard intermediate node numbers are changed. Der(9) - Der(3)-Der(5) = 1-0 = 1 Der(11) - Der(9)*Val(10)+ Der(10)*Val(9) = 1*Val(10)+ G*Val(9) = Val(10) * 8 9 10 Der(12) - Der(11)*Val(8)+Der(8)* Val(11) Val(10)*Val(8) / "1" 3 11 Der(13) 1./Val(3). "1" indi- cates an input node that con- tains the value 1. * 4 11 12 Der(14) = Der(13)*Val(4)+Der(4) *Val(13) Der(13)*Val(4). + 10 12 13 Der(15) = Der(12)+Der(14) Der(17) Der(15)+Der(16) Der(15) Figure 6: The code for the derivative of the expression from figures 1-3 with respect to input variable a. Val(i) is used for the value of the original node number i and Der(i) is used for the derivative of the original node number i. - 20 - + 13 A 2 10 / 11 -9 1.5 4 1 a 3 c 6 2.5 2 e 8 Figure 7: The tree representation of the derivative of the expression from figures 1-3 with respect to input variable a. The numerical values of the derivatives can also be computed di- rectly from the tree representation of the original expression. For each node we store two numbers, the value of the node and the derivative .of the node. The derivative values of the leave nodes are initialized to 0 or I and during a pass from the leaves to the root the derivative values of each, nodc is computed from the value of the node itself, from the values of its. parents, and from the derivative values of the parent nodes. If more space is allocated such that the derivative values can be stored in vectors of length the number of input variables, the derivatives with respect to all leaf nodes can be computed in one pass. - 21 - Both approaches produce the same exact value for the derivatives. The advantage of the last approach is the small amounr of core storage needed. The disadvantage is the large number of redundant operations like multiplica- tions with zero or one, addition of zeros o. computations of node values that are not used later on. In the first implementation we have chosen to concentrate on speed, thus chcosing to represent each derivative through a separate tree. Our first experience with a large model gave the following numbers: Nunmber of equations and endogenous variables: 114 Number of control variables: 21 Number of exogenous variables: 143 Number of numerical constants: 313 Number of lags: 4 Number of integers to represent the expressions: 5128 or 45 integers per equation or 11.25 operations per expression. Number of derivatives with respect to unlagged optimization variables: 443 or 3.89 unlagged optimization variables per equation. Number of derivatives with respec.t to lagged optimization variables: 176 or 1.54 lagged optimization variables per equation. Number of integers to represent the derivatives: 12348 or,19.95 integer per derivative or 4.99 -operations per derivative. Based on this experience and some more detailed statistics the following changes will probably be incorporated into the next version: Deri- vatives that only depend on numerical constants or on exogenous variables are computed once in the set up phase and the code used for the computation is deleted. This will save both storage and time. The second change will be to integrate the code for the derivatives and the code for the original expression into one code, again saving both storage and time. The expected storage savings with the 114 equation model mentioned above are 6000 integers. - 22 - 5. The New TSP-commands Five new TSP-commands for defining the optimization problem have been implemented. rhe number of commands has been chosen so high in order to increase the flexibility. The first command is CONOPT with the following syntax (each para- meter is enclosed in < > to distinguish them): CONOPT < input models > (< control list >) < control model > < output file >; where: < input model > is the name of a model as it is created by the ORDER-command in connection with earlier simulations. The name implicitly defines all equations and all endogenous variables; < control list > is the rame of a list of names of control variables; < control model > is the name under which the information about the optimization problem will be saved in the data bank for use by later commands, and < output file > is the name of a file on which the representation of the equations and their derivatives will be stored. The objective function is defined after the control model through the CONOBJ command. The syntax is: CONOBJ < control model > < output file > < 'YEAR' > < from-period 1 > < to-period 1 > < expression 1 > < from-period 2 > ... where: < control model > is the name used in the CONOPT co=aLd to define the optimization model; < output file > is the name of a file on which the representation of the objective function, its nonzero derivatives. and the values of the constants and exogenous variables that are only used in the objective function will be stored. The file must be different from the one used in CONOPT. - 23 - < 'YEAR' > is an optional parameter. If present, the following period descriptions will be interpreted relative to a base point defined in an earlier YEAR command, and if uot present, the period descriptions will be in absolute period numbers. The optimization period itself is defined through a SAMPLE- command issued before CONOBJ; < from-period I > < to-period I > < axpression 1 > are two integers and the name of a formula. The expre-sion =n -.le right hand side of the formula is used as objective function during the period from < from-period I > to < to-period 1 > inclusive. The terms from the different perious are added; and < from-period 2 > ... More expressions can be defined e.g. if the objective has different functional forms in different time periods like terminal terms. More expressions can also be defined for the same time period if it is more convenient. All terms are added to form the total objective function. Note that only the exogenous variables and the constants that are only used by the objective are writLen to the file by CONOBJ. All other exogenous variables and constants as well as initial values for the optimiza- tion variables are written to a file by the CODATA command. The syntax for this command is: CODATA < control model > < output file >; where the two parameters are as defined above. Nute, that the outpuc file must be different from the files used by CONOPT and CONOBJ. CODATA collects the data from the data bank which means that -the numerical values that are written to the file correspond to the content of the data bank at the time the CCDATA command is issued. The last two commands are used to define lr-er and upper bounds on the optimization variables. The BOUND command has the syntax: BOUND < bound name > < time series or constant > < relational operator > < optimization variable > < relational operator > < time series or constant >; where: < bound name > is a name associated with these lower and upper bounds; - 24 - < time series or constant > is the name of a time series or a constant or a numerical value; < relational operator > is one of the relations .GE. (greater than or equal to) or .LE. (less than or equal to); and < optimization variable > is the name of a. optimization variable. Only one of the relations need to be prestent and if two relations are presenc they must both be of the same type defining a lower and an upper bound. The BOUND-command can be used to define many bounds and it is possible to use more BOUND commands for the samc optimizati .. variable. The bounds will be stored in the data bank under the bound namp so they can be saved for use tn later runs. The bounds actually used in the optimization are defined through the CBOUND command with the syntax: CBOUND < control model > (< bound list >) < output file >; where: < control model > and < output file > are as descr-:-d earlier; and < bound list > is the name of a list that contains the bou d names of the BOUND commands that should be selected for this particular run. - 25 - 6. Running and Changing an Optimization Problem Four of the TSP-commands described in the last section writes their rasults on a file. The actual optimization is performed by executing a program called START with the names of the four files as the only input. START will assign the files, read the size of the problem, assign the proper amount of storage, read the data, and call CONOPT. It is well known, that it is difficult to define the objective function and *he bounds on the variables for an optimal control exercise. In many cases an objective function and a set of bounds are tried, and based on the optimal solution the objective and/or the bounds are changed, the problem is reoptimized, new objective and bounds are created etc. The command structure chosen for the TSP-part makes it easy and cheap to make these changes in the problem formulation. When only the objec- tive is changed it is sufficient to go into TSP and issue the CONOBJ-command without calling the time consuming CONOPT-command since the information about the control model already is in the data bank. When the bounds are changed usually only a few bound values are changed. BOUND must be called a few times, a new bound list must be defined and CBOUND is called once. If the new bounds already were defined through earlier *BOUND commands that were not included into the actual set of bounds through the CBOUND command, it is sufficient to change the bound list or define a new one and call CBOUND. It is also possible to issue many CONOBJ and/or CBOUND commands in one TSP-run all using different output files. Afterwards, many different optimization problems can bc defined simply by choosing different sets of files as input to START. - 26 - A second type of change that is often wanted is a change in the planning period. With a new planning period CONOBJ, CODATA, and CBOUND must be called again after the period is changed in a SAMPLE-command in TSP. The most expensive command, CONOPT, and the BOUND-commands should not be called again. A third type of change, changes in the exogenous variables, is also cheap to perform. After the values of the exogenous variables in the data bank have been changed or a new data bank has been assigned, CODATA is called again. Based on the old files from CONOPT, CONOBJ, and CBOUND and the new file from CODATA a new optimization can be performed. Only when one or rare equations in the model are changed, added, or deleted is it necessary to call all commands, except BOUND, again. As can be seen from the examples of changes, the modularity of the definition of the control problem and of the corresponding TSP-commands makes it easy and cheap to perform most of the changes that frequently appear in optimal control exercises. - 27 - 7. Conclusions Optimization algorithms for large econometric models based on reduced gradients and sparse matrix techniques are assumed to be very efficient, but their use in practice has been limited due to large input requirements. The paper has shown that the input problem can be solved using automatic methods that transform the representation of the econometric model i: .ulation system into the proper input format, such that the manual wo:k ane. the probability of erroxs is minimized. The basic features of at implementation that links the TSP-system for estimating and simulating e:ouvmetric models to the CONOPT code for optimal control problems has been described, and it has been demonstrated how a modular command structure can make it very easy and cheap to change certain parts of the optimization problem, a feature that is often needed in practice. - 28 - 8. References Abadie, J. aad J. Carpentier (1969). Generalization of the Wolfe reduced gradient method to the case of nonlinear constraints. In R. Fletcher (ed.), Optimization. Academic Press, New York, pp. 37-47. Abadie, J. (1970). Application of the GRG algorithm to optimal control problems. In J. Abadie (ed.), Integer and non-linear programming. North Holland, Amsterdam, pp. 191-211. Bever, L. de (ed.) (1977). TSP at University of Wisconsin. Manual from the University of Wisconsin Computing Center. Bird, J.R. (1976). An efficient method for computing the reduced gradient of and econometric model. Working paper 76-13, School of Business, Queen's Uni:ersity, Kingston, Canada. Brent, R.P. (1973). Reducing the retrieval time of scatter storage tech- niques. Communic. ACM, 16, 105-109. Drud, A. (1976). Methods for control of complex dynamic systems - illustrated by econometric models. Ph.D.-t`esis, Inst. of Math. Stat. and Op. Res., Tech. Univ. of Denmark, Lyngby, Denmark. Drud, A. (1978a). An optimization code for nonlinear econometric models based an sparse matrix techniques and reduced gradients. Ann. Soc. & Eco. Meas., 6, 5631580. Drud, A. (1978b). Application of automatic symbolic formula manipulation in constrained nonlinear optimization. 'aper presented at NOAK 78, Elsinore, Denmark, Sept. 1978. Drud, A. (1978c). Implementation of the model in codes for control of large econometric models. In Proc. of the Int. Symp. on Systems Analysis and Optimization, Paris, Dec. 1978. To appear on Springer, Verlag.. Fair, R.C. (1974). Methods for computing optimal control solutions. On the solution of optimal control problems as maximization problems. Ann. Soc. & Eco. Meas., 3, 135-153. Holbrook, R.S. (1975). Optimal control choice under a nonlinear cpnstraiat, an iterative application of line3r techniques. J. Money, Credit & Banking, 6, 32-49. Lasdon, L.S. and J.B. Mantell (1978). A GRG algorithm for econometric control problems. Ann. Soc. & Eco. Meas., 6, 581-597. Nepomiastchy, P. and A. Ravelli (1978). Resolution and optimization of quasi- triangular econometric models. Ann. Soc. & Eco. Meas., 6, 555-562.