World Bank Reprint Series: Number Sixty-five R. P. Byron The Estimation of Large Social Account Matrices Reprinted frorm Journol of the Royal Statistical Society, series A, vol. 141 (1978) J. R. Statist. Soc. A (1978), 141, Part 3, pp. 359-367 The Estimation of Large Social Account Matrices By R. P. BYRON .Ausiralian National Uniiversity SUMMARY This paper formalizes and extends proposals by Sir Richard Stone for adjusting initial unbalanced estimates of the components of a matrix so that they optimally satisfy accounting requirements imposed by tabular form. Stone's proposal, based on linear combinations of initial unbiassed estimates, has many potential applications in national income accounting, input-output construction and demography, amongst other fields. Given that the Stone adjustment procedure simply represents the first- order conditions resulting from the minimization of a quadratic loss function it is possible, as is done here, to develop alternative procedures for minimizing the con- strained loss function. These procedures, based on the conjugate gradient algorithm, prove to be much more efficient than the traditional solution, both in terms of time taken and storage requirements, and the optimal adjustment of very large (say 1,000 x 1,000) social account matrices becomes quite feasible. Some other minor problems are handled which relate to multiple prior estimates of cell nimbers, and cell members for which no prior estimate exists at all. The techniques were applied to a social account matrix constructed for the Muda River district in West Malaysia and, though the results are too detailed to present here, figures are given which indicate the feasibility and usefulness of the methods. Keywords: socIAL ACCOUNTs; CONJUGATE GRADWNTS 1. INTRODUCTION THE construction of more detailed national accounts data based on a matrix framework is becoming an important issue as a result of increasing demands placed by policy makers and economists and the guidelines for the new United Nations SNA (1968) framework developed largely by Professor Stone. The collection and manipulation of data on such a scale strain the scarcest resource of statistical offices, namely, skilled national income statisticians while the increased detail delays publicatioi. This paper is primarily an extension and formalization of techniques suggested by Stone vthich enable the statisticiani, Using a computer, to balance rapidly a set oftables from data of variable realiability from incomplete sources. Furtlhermore, if the "balanced" results are unaccentable, the technique has the flexibility of regression analysis in that assumptions can be chainged and new results quickly produced. In a sense, the procedure computerizes existing subjective procedurcs and appor-tionis the adjustment according to the prior degree of reliability assigned to the initial flow estinmates. It is worth noting that three and a half diecades ago Stone et al. (1942) were exploring procedures for recognizing the different reliability of data from different sources when constructing national income accounts. Stone (1960) advanced the idea that the balancing of nat:3nal accounts identities be treated as a constrained estimation problem, where the con- strained estimates are as a weighted linear combination of initial estimates; the result is a mechanical procedure for producing a balanced set of national accounts. Stone (1961, 1968, 1970, 1975) has sug-ested possible applications of the idea in a number of contexts and recently (1976) has illustrated an application in a 6x6 social account matrix (SAM) framework. 360 BYRo-N - Estimation of Social Account Matrices [Part 3, 2. TEE STONE SOLUTION The estimates of a SAM satisfy certain accountinlg conditions by definition, xi = X-J (1) i xi = xi, (2) Xin =Xni, i1, n-i, (3) Xki 0, defined k,, (4 where xi1 is the ijth element of the n x n matrix X. Usup,lly, though not alwvays, there is a non- negativity condition of the form Xkl> 0. The conditions differ trivially for an input-output matrix and may differ for other tables. It is simplest to consider the balancing ("estimation") in terms of a SAM. Stone's suggestion is the essence of simplicity; let x be a row vectorization of the initial estimates of the non-zero elements in X and V is a prior estimate of the covariance matrix for these estimates, and let G be the constraint matrix based on a vectorization of (l)-(3), then the optimal restricted estimator of Xis a linear combination of the unrestricted estimnites x =~-i YG'[G VG']3'((x - ), (5) where h is a vector of constants such that GY = hi. As noted by Theil (1 961) this has the BLUE property. If V is interpreted as the covariance matrix for 2, the covariance of x is ;= V-VG'[GVG']-> GV. (6) However, there is little virtue in this approach if the restrictions are incorrect or if the covariance matrix is mis-specified. The former error leads to bias, while the latter results in an estimator which is not fully efficient. The restrictions (l)-(3) are definitiotial so that the only source of mis-speciied constraints is inappropdrite zero restrictions; the second hazard, more likely to occur in practice, but less disastrous in its consequences, could arise if the covariance matrix is inappropriately specified to be diagonal. If the zero restrictions are handled by exclusion of parameters (x-J's) from the vectorization, the Stone approach is to calculate (5)-(6) using the independent row, column and symmetry conditions. The SAM resembles a transportation array and redundant restrictions may be eiminated. For instance, the final column restriction is implied by the preceding row and column sum restrictions and the final symmetry condition is implied by the preceding restrictions. The maximum number of independent restrictions in the SAM case is 3n -3. In Stone's recent demonstration (1976) of the teclnique using a SAM, given in Table 1, the additional complication of cell members which are non-zero but are unknown a priori, is introduced. In the example, there are no prior estimates of ten cell members and the V matrix is assumed diagonal. Standard errors are provided for the "known" cells which assuming normality and the 2-sigma rule provide a likely range fer the estimates. The unknown cells, designated by an asterisk, are related to the known cells througlh the row, column and symmetry conditions, Providing there are less unknowns than the total number of restrictions, it is possible, by substitution, to rearrange the restrictiots so that each constraint accounts for one unknown cell, thus determining the unknown cells uniquely in terms of the known cells, The remaining cells are optimally adjusted to satisfy the unused restrictions in (5) and the "unknown" estimates adjust automatically, This procedure, as is apparent from Stone's example, is quite labour intensive and some skilled manipulation has to take place before it is possible to compute (5). Because of the size of the inverse required for [GVG'1 in (5) the procedurc is not applicable to large SAM's (say, n> 100). Furthermore, doubts remain about the interpretation of the balancing process; 1978] BYRON - Estimation of Social Account Matrices 361 TABLE 1 SAM example: initial estimates and standard errors Labour Houyeholds Production Total Labouir 0 15 3 130 80 220 (6) (1.2) (26) (16) (22) Households *0 0 0 0 * *0 0 0 0 * Production 0 15 130 0 20 190 (6) (52) (14) (38) 0 25 40 55 0 1.05 (2*5) (16) (11) (21) Total * * * * * * the priors are guesses, but the meaning of the restricted estimates and their new "reduced" variances is unclear. The argument for the importance of unbiasedness, while formally correct, is unconvincing because most statisticians will not have sufficient confidence in their initial estimates. An alternative, perhaps more satisfactory, approach is to set up a loss function, and to interpret the new estimates in relation to it. 3. AN ALTERNATV APPROACH Consider the constrained quadratic loss function where the only new term A is a vector of Lagrangians. In other words, what is required are new estimates of x which are as close as possible, in a quadratic loss senise, to the initial estimates -£, but which satisfy the restrictions Gx h. The covariance matrix then becomes a weighting matrix and much of the jargon of mathematical statistics, which is somewhat implausible in this context, can be eliminated. The first-order conditions on (7) are X-[GYG']-'(G -h), (8) x-i9- VGX, (9) which corresponds to (5). Taking this further and partitioning x into xl and x2, with XI unknown in advance, the Stone proposal is to rearrange the restrictions by row substitutiou until they have the form [ G22 ] [xa l h2 ](O Then, x1 -(G3)-1 G12 x2 + (G_)-l h,, (1 1) while the objecive function minimized is Z 2 j(2-1)'V2 -'(?2- --)+ A'(G22 x2 -h2), (12) yielding X2 = CG22G2 2[ VG 2 G -(G222-h. (13) 362 BYRON - Esti,nartionz of Social Accoi!nt Matrices [Part 3, Another treatment of this problem, which yields the same solution but provides some insight and avoids (10), is to use the Moore-Penrose generalized inverse to solve x1 in terms of x2. Partition the constraints by columns into G_x1±G2x2 = h, (14) X1- (GI (; 1 G' G2 x2+ (G ' G)-1 Gih. (15) Since (15) holds between estimates of x as well as the true values of x, 21-Xl--(G -G'G2(*2-x2). (16) The weights in the quadratic loss function based on E(-x) (2-x) are V= R' (17) where R = (G,'GL)-1 G'G2. There is a singularity in the weighting matrix implied by the partition of x into (xl:X2), but this can be circumvented. Given the singular weighting matrix, rewrite the constrained loss function using a generalized inverse Z = ( --)'V+(-2)+ A'(G9-h). (18) Since both the initial and the restricted estimates of x satisfy the same set of minimal conditions -R R) (X i}- I (X2) (19) it is a simple matter to show that an objective function (18) with the same linear dependence in the vectors and matrix of a quadratic form collapses to the minimal non-singular quadratic form. A similar problem occurs in demand analysis and the argument here follows Powell (1969). First decompose V, Y= QS, V= S'(SS)-1(Q'Q)-, Q' (20) V= [2j, [R' I-=QS, (21) S; - I+RR', (22) (Q'Q)'- {V2(I+RR') V2-' = YV I2+RR') Vi2, (23) { j I (I+R.R)- Vt-1(1+RRK)-[-R I] (24) The objective function (18) then becomes Z= (92-..2;)' -R I] j (I+RR)-1 V2-1(I+R)-:'[-R I] [ 2']) 2@ - A'(Gp 91+G2x2-h) (25) -i(;2-2) V?l<2@+ X(G:l 9L+ G2 X~2 - ) ' (26) (27) 1978] BYRON- Estinialioz of Social Account Matrices 363 where P 1- G,(G ) G G' 1. Then note that if the restrictions had been rearranged into (10) GI=G and Stone's solution is a special case of this more general approach. The first-order conditions on (26) are [ 2V' 0 G' - 2 C--e2 0 O GI 21 O (28) -2 G1 0 A h Such a system of equations is large and sparse but provides a clue as to how one might estimate a very large SAM. Clearly, the Stone approach represented by (10) is labour intensive and needs to be avoided, and although computer routines can sort and pivot to form (10), the alternatives below are superior. The solution of (27) involves the setting up of PG2 which is non-sparse and may even require a substantial inverse if the number of unknowns is large. in addition, a solution based on (28) is infeasible, requiring for a 1,000 x 1,000 SAM which is 10 per cent full a 103,000 inverse. The first step in the solution is to use an approximation which corresponds to the first-order conditions (28). If the unknown terms, xl, are treated as zero a priori and are assigned small weights in the objective function, which is again expressed, without partitioning, as (7), the first-order conditions then correspond to (28) as 0+Q and V'-1 L = 0. This result can be obtaine-d explicitly using partitioned inversion, but it is tedious and is borne out numerically below. If it is noted that (8) is a system of linear equations and the matrix is symmetric positive definite and itself the product of three sparse matrices, [GVG'] A=q or AA = q, (29) then a solution is obvious. The conjugate gradient algorithm, which has been used by the author (1977) to estimate the parameters of large econometric systems, can also be used to solve large systems of linear equations with this property. There seems no point in proving here that the algoritlh con- verges in k iterations, where k is the dimension of A, and in far less than k iterations if the problem is appropriately scaled, Proofs will be found in Ralston and Wilf (1960) amongst other sources. The iteration scheme, which is summarized below, involves the storage of a few gradient and direction vectors which will not tax even a very small computer system. The iteration scheme is po =ro =q-AA0, (30) ai = r,rilp/Api, (31) Ai-i =Ai+ Ci pi (32) ri+E = r-(i ,(33) Pi = ri+I r4:d/ri ri, (34) pi+1 = rse3+APi, (35) where p and r are gradient-based direction vectors, A0 are the initial values for A, and i, i+1 refer to the iteration count. The storage requirements are for A, p and r which is seen to be minimal given A can be stored compactly as it is a sparse product matrix. To scale the equations let wii = (11'gjj Vj gj)-, then using wN = 0 write W[GVG') WA* = Wq, (36) where A*= W1 A. The positive definite matrix now has unit elements on the diagonal while all the off-diagonal elements are less than one in an absolute value. This system, which is effected with minimum effort since W is diagonal, may be solved very rapidly for A* and A. 364 BYRON - Estimationz of Social Account MVatrices [Part 3, Given the form of GVG' the storage requirements are for GVG' stored compactly and 4(3n-3) vectors. The restricted estimates then emerge from the first order conditions -= + VG'X. (37) The storage savings and the treatment of unknown priors here makes the solution of the problem easy and nieclhanical. Also the scaling adlopted in (36) reduces the number of iterations so -rulch that substantial savings in time are achiieved. The appro&..ih above may be viewed as a gradient solution to the saddle-point problem which is generally regarded as troublesocme in the optimization literature, see, for instance, Bard and Greenstadt (1969) or Dixon (1972). The general solution is then a 2 {s2z t a -) a2z ) IOZ\ { ax / a a a (38) and (. z N-1 '92Z Xi = Xi--l+ ax aAA 9 with the first equation solved by the conjugate gradient algorithm. The two steps taken together correspond to the NeMton-Raphson algorithm which will converge even to a saddle point. A further problem likely to arise in practice is the question of nuiiltiple prior estimates; this refers to the situation in which the statistician has a number of separate cstimates of the same cell member. Let x be all the prior estimates of the cells and let y be the fundamental set of cells. Then x-Ky, (40) where K relates the various estimates to the fundamental set of cells. For example, [X3 [° 1 [(41) When xl and x2 are both prior estimates of tl. Then if V is the weighting matrix for x, the quadratic loss function is (KY -.)'V-'(KY - i) + )v'(Gy - ) (42) and the first-order conditions are - V*KV-l-. V*G'[GV*G'-l-(GV*K'-l.R-4z), (43) where V* = (K'V1 K)3-l. An examination of K' V-1 K and K' V-1' reveals both are weighted averages of the prior information on the weighting matrix for X. The latter term weights the multiple estimates to produce a single prior estimate for any cell, thusI K-? -R; the former, in similar fashion, combines the V elements to form a new weighting matrix for.9. Subsequent operations using G are done on these weighlted estimates, thus the steps can be separated and a single prior estimate can be rnade for each cell member yi and its weight V* before applying the estimation procedure based on (8) and (9) or y y-T*G[GV*G'J-I(G9 - h). (44) If V is diagonal the process is extremely simple as K has the structure (39) and V* remains diagonal. This is found to have further implications in the suggestions presented in the final section. 1978] B3YRON - Estinat oln of Social Account Mlatrices 365 4. API-UCATIONS The first illustration of the procedure is the restricted solution to the matrix in Table 1, The approximate direct solution based on (8) and, (9) with VI,, large so (T'1)7j->0, (VI)- 0 A for ijj and xl 0 yielded the same solution as the exact result based oni (28). In Table 2 the TABLE 2 Stone example.- social accounzts mnatrvix restricted estihuiates and "stantdlerdl errors" Labotur Hotuseholds Produtiction Total Labour 14'45 2'98 124 12 85'84 227'39 (5'89) (1I2) (16'02) (10 36) (15.94) Households 54'35 54.35 (8'71) (8'71) 173'04 17304 (16-99) (16-99) Produiction 15'13 139'65 20'60 175-38 (5'96) (17-51) (3'92) (16'60) 24'77 30'42 51 26 106'44 (2'48) (11-22) (9-45) (10'35) Total 227-39 54'35 173'04 175'38 106Ii44 736'60 (15 94) (8-71) (16-99) (16-60) (10-35) (47-88) results of the restricted minimization of the loss function are given togetlher with the square root of the diagonal members of the inverse matrix, in parentheses, While it is obvious the author does not favour the statistical interpretation, one school of thought would interprct these terms as standard errors. The restricted estimates in Table 2 correspond closely to the initial results in Table 1 which is a satisfactory outcotme in the sense thlat the loss resulting from the imposition of the restrictions is not great. The Lagrangians are presented, together with their "standard errors", in Table 3. The reason so niany are idlentical is found in the first-order condition G' A 0 and reflects the fact that only five lrestrictions apply independently to the objective function, the renminder determine the unknown variables xl in terms of x2. This is supported by the results in Table 3. TABLE 3 Lcagraggiails and ".sftaniciird errors" R1 R2 R3 R4 R5 R6 -'019 -034 -'034 -.038 '003 '000 ('047) (-05) ('05) ('05) ('041) (.000) *034 '034 '034 '028 '000 ('05) ('05) (05) ('051) ('000) -034 -034 -b34 '028 ('05) (05) ('05) ('051) A 46-sector example is provided in a SAM constructed by Bell et al. (1976) for the Muda Irrigation Project in West Malaysia using the tecliniques outlined here. The use of the programs was found to be extremely successful as it enabled the authors to examine the effect 366 BYRON - Estimation of Social Account Matrices [Part 3, of varying their assumptionis particularly relating to those flows on which their information was scanty. The results are too detailed to present here but some useful findings emerged, particularly the high capital outflow from the region and some initial sector dissaving figures. These were found to have been caused by a clear error in one initial sector income figure which was subsequently corrected. Only one negative estimate enmerged in the entire constrained SAM, the savings of landless farmer workers; while increased indebtedness is possible and even likely, the result was very close to zero and could be treated as such. The full results may be obtained by writing to the author. The emphasis of the present paper is on techniques and in particular on the s-uccess of the conjugate gradient method. The results of the conjugate gradient and direct solutions were idlentical in both cases to 6 or 7 significanti figures; for instance the (46, 46) element of the Muda SAM was 1,638,709 by the direct solution and 1,638,712 by the iterative solution. All the other cell members for the two solutions were equally close. To summarize the performance of the two approaches Table 4 is presented. In Table 4 the items are for complete CPU time and in fact, in execution, the conjugate gradient method is even better relatively, The times, cost and storage (dimensioned area) TABLE 4 Compairisoni of cojulgate gradienit and direct solutions CPU titme Storage Example (sec) (words) Iterations Stone Direct 8-6 - - C.G. 872 - 1S Muda Direct 59 67 35K C.G. 14*58 4K 22 all point to the superiority of the algorithm. Furthermore, as the iterations and storage relate to the Lagrangians it becomes entirely feasible to estimate a monster SAM, say 1,000 x 1,000, without difficulty. 5. CONCLUSIONS ANI3 FuRTHER PROSPECTS It slhould be clear that the method is quite general and that other constraints can be incorporated for instance, a "politic. '" constraint of the form x,,,, yo presents no difficulties. Furthcrnmore, if the SAM has a block diagonal structure then the objective function is separable and may be solved as a suIccession of smaller pr oblems. The approach might also be applied as a simple alternative to the RAS method of updating input-output matrices. Finally, it is worth drawing attention to the fact that not all the structural characteristics of the SAM have been e.xploited in this treatment. The synmnmetry restrictioni, for instance, could be handled as a problem of multiple cstimates of the same cell. In the SAM case this reduces the number of restrictions by one-third and so makes the problem identical to the classical transportation problem except that the objective function is quadratic. This leads one to questioni tle objective function and suggest the possibility that other forms of linear objective fuinctions might be worth investigation, especially as the transportation algorithm has the ability to enforce the non-niegatively conditions, if required. There is also the possibility of applying a transportation solution to the quadratic objective fuinction; howvever, it is doubtful if this would be superior to the conijugate gradient algorithm. The main contribution of the paper has been the demonstration of the usefulness of the conjugate gradient algorithm, and its ability to open completely new dimensions in social 1978] BYRON - Estimation of Social Account Matrices 367 account construction. The paper also attempted to formalize the methodology underlying Stone's adjustment procedure to make obvious what is involved in using the method, 'whetfler it is applied by direct or an iterative solution. Naturally, it is to be hoped that the methods presented will find widespread application because they should facilitate the collection and rapid dissemination of statistical data. ACKNOWLEDGOEMENTS I am indebted to Graham Pyatt of the World Bank for suggesting this problem to me. My thanks also to Richard Stone, Pasquale Scandizzo, Montek Ahluwalia and Jan Bisschop for comments on an early draft of the paper and to the authors of the Muda River Study (1976) for making their data available. The views expressed in this paper are those of the author and should not be ascribed to the World Bank or the above-mentioned individuals. REFERENCES BABD, Y. and GREENSTADT, J. (1960). A modified Newton method for optimization with equality constraints. In Optinmization (R. Fletcher, ed.), pp. 299-306. New York: Academic Press. BECKMAN, F. S. (1960). The solution of linear equations by the conjugate gradient method. In Mathemtatical Methodsfor Digital Computers (A. Ralston and H. S. Wilf, eds), pp. 62-72. New York: Wiley. BELL, C., DEVARAJAN, S., HAZELL, P and SLADE, R. (1976). A regional impact analysis of a major irrigation scheme. Working Paper No. 4, Development Research Center, World Bank, Washington, D.C. BYRON, R. P. (1977). Estimation and inference for large econometric systems. Econonzeirica, 45, 1499-1516. DixoN, L. C: W. (1972). Nonlinear Optimnization, pp. 89-93. London: Eniglish Universities Press, POWELL, A. A. (1969). Aitken estimators as a tool in allocating predetermined aggregates. J. Amer. Statist. Ass., 64, 913-922. STONE, R. (1961). Social accounts at the regional level: a survey. In Regiontal Economic Planning, Techniques of Analysis (W, Isard and J. H. Cumberland, eds), pp. 263-296. Paris: O.E.E.C. - (1961). Input-Output and NationalAccounts. Paris: O.E.E.C. - (1968). Input-output projections: consistent prices and quantity structures. L'Indusiria, No. 2, 212-224. - (1970). Mathematical Models of the Economy and Other Essays. London: Chapman and Hall. - (1975). Direct and indirect constraints in the adiLustnient of observations. M-fimcographed. D.A.E., Cambridge. - (1976). The development of economic data systems. In Social ,cecounring for Development Plauininr, with Special Reference to Sri Lanka (G. Pyatt et al., eds). Canmbridge: Cambridge University Press. STONE, R., MEADE, J. E. and CHAMPERNOWNE, 1D. G. (194"'). The precision of national income estimates. Rev, Econ. Stuidies, 9 (2), 111-125. THEm, H. (1961). Economic Forecasts and Policy. Amsterdamn: North-Holland. UNITED NATIONS (1968). A System of National Accounts, Series F, No. 2, Rev. 3.