90619

International Comparison Program

[04.01] Education Memo

Draft

Alan Heston

5th Technical Advisory Group Meeting
April 18-19, 2011
Washington DC

Table of Contents

Part I: The African Region
  Estimate 1-Dependent Variable: Log of Volume 1 Students Per Capita by Level: Africa
  Estimate 2-Dependent Variable: Log of Volume 2 All Students Per Capita: Africa
Part II: Comparing the EU and African Region
  Estimate 3-Dependent Variable: Log of Volume 1 Students Per Capita by Level: Africa-EU
  Estimate 4-Dependent Variable: Log of Volume 2 All Students Per Capita: Africa-EU
  Estimate 5-Dependent Variable: Log of Volume 1 Students Per Capita by Age: Africa-EU
Part III: The Data Requirements and Approach for 2011

Tables
  Table 1 Education Volume Comparison: Africa 2005
  Table 2 ICP Education Volume Comparisons: Africa-EU 2005

Education Memo¹
Alan Heston

Introduction

In previous TAG discussions it was agreed that the 2011 ICP should experiment with a direct student quantity approach and indirect PPP derivation for all education expenditures along the lines of the EU-OECD.
One problem with such an approach for other regions is that the quality measure used by the EU-OECD, PISA scores, is not widely available in other regions, particularly for important large countries in Asia and much of Africa. Outside the EU-OECD there is another problem, namely whether to use the number of registered students or the number of students attending school as obtained from household surveys. Other quality issues have also been raised in discussions with staff of the Education Policy and Data Center (EPDC), such as education of parents and repeating students. They are not considered here, in part because the data are so sparse.

In a number of important expenditure groupings besides education the ICP uses official data even when it can produce anomalies, as for dwelling services. In education this means the starting point for unadjusted quantity will be student registration at the primary, secondary and tertiary levels.

One problem in taking account of the differences between attendance and registration estimates, and of the factors determining the quality of education, is that there are holes in the data. The ICP has generally used the GEKS or CPD methods to deal with this problem and that is the recommendation of this paper. In this paper the CPD application uses data assembled by EPDC for 30 ICP 2005 African countries on enrollment and expenditures for illustration. This permits comparisons with the ICP volume estimates used in ICP 2005 within Africa. To round out the comparison with other regions, similar data were taken from a group of EU-OECD countries in their TAG presentation on methods of comparing education.

¹ Acknowledgements are due to Nada Hamadeh for her inputs and to EPDC for their documentation.

Part 1 describes the available data set for Africa, discusses the illustrative CPD application and provides some results for Africa in comparison with ICP 2005.
Part 2 expands the data set to include 29 EU-OECD countries and provides a similar comparison with ICP 2005. Part 3 makes some recommendations for the 2011 ICP benchmark.

Part I: The African Region

The EPDC assembled a data set for 30 countries in Africa for 2005 that includes registration for primary, secondary and tertiary as well as pre-school and adult education. In the empirical work only primary, secondary and tertiary registration has been used, all numbers on a per capita basis. Whatever attendance data are available apply only to primary education. The share of expenditures on primary, secondary and tertiary education per capita is also available. The ICP 2005 per capita education expenditure for each country is used as the control total since it differs from the data source used by EPDC. The EPDC had assembled limited quality data on what they termed Tims equivalent (TimsE) for some of the African countries. Students per capita is the dependent variable, which, other things equal, will be associated with the share of school age population. The variable Age has been used to control for this effect. These are all the variables used in the application and their notation is given below, where the index i is over countries:

R_i^P, R_i^S and R_i^T are student registration at the primary, secondary and tertiary levels, and R_i^PA is the attendance of primary students, all per capita;
ES_i^P, ES_i^S and ES_i^T are expenditure shares on the 3 levels of education;
Q_i is the quality measure of countries, PISA or Tims equivalent;
Age_i is the share of population under 15 years old.

We turn now to putting these data together to form a volume index for all of education. We have used a weighted country product dummy (CPDW) equation that is constructed as in (1) below:

The dependent variable is the log of the number of students per capita for each level of education from enrollment and for primary students also for attendance.
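Equation (1) can be sketched from the description in the surrounding text. The form below assumes the standard country product dummy specification; the symbol names $\alpha_i$, $\beta_j$ and $\varepsilon_{ij}$ are assumptions, as is writing $R_i^{j}$ for students per capita in country $i$ at level $j$:

```latex
\ln R_i^{j} = \alpha_i + \beta_j + \varepsilon_{ij},
\qquad j \in \{P,\ PA,\ S,\ T\}
\tag{1}
```

Under this reading, $\alpha_i$ would be the country coefficient, $\beta_j$ the coefficient for each level of education (with one level taken as the reference), and $\varepsilon_{ij}$ the error term.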
In the case of Africa all 30 countries had primary registration data, but in the EPDC data several did not have separate registration data for secondary and tertiary, or primary attendance. So from a potential of 120 observations only 115 were available to estimate equation (1). The country coefficient, indexed by i, provides an overall volume measure for each country; there is also a coefficient for each level of education, indexed by j, and an error term. Equation (1) is estimated as a weighted regression with the expenditure shares at each level of education as the weights. The weight for primary students registered and attending is the same. The estimated coefficients and regression statistics are:

Estimate 1-Dependent Variable: Log of Volume 1 Students Per Capita by Level: Africa

R-Square    Coeff Var    Root MSE    LVPC1 Mean
0.862       -14.47       0.387       -2.677

Parameter    Estimate    Error    t Value
Intercept      -5.103    0.386      13.20
Level PA        3.728    0.208      17.91
Level PR        3.577    0.207      17.25
Level S         2.150    0.224       9.62
Level T         0.000

The country coefficients will be discussed later. The level coefficients are plausible in sign and magnitude; when tertiary students per capita is the reference level the coefficients rise moving to secondary, primary registered, and finally to primary attending. The difference between the PA and PR coefficients (3.728 - 3.577 = 0.151 in logs) implies a large but not implausible effect, about 16%. The RMSE is also large. One reason for the relatively large RMSE is that the classification of students differs across countries, particularly for the former French dependencies; another is implausible expenditure shares. Further, (1) is mis-specified in the sense that the level coefficient is assumed the same across countries, something clearly contrary to observation since we believe the relative expenditure per pupil between primary, secondary and tertiary differs at different levels of per capita income. If this approach is adopted this is one of several areas where it might be improved. An un-weighted version of the equation has an even larger RMSE.
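The weighted estimation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used for the memo: the countries A, B and C, their coefficients and the expenditure shares are hypothetical, one observation is dropped to mimic a hole in the data, and the function names `weighted_cpd` and `pct_effect` are introduced here only for illustration. The last lines convert the PA and PR coefficients reported in Estimate 1 into a percentage effect.

```python
import numpy as np

def weighted_cpd(log_y, country, level, weights, ref_level):
    """Weighted least squares of log_y on country and level dummies.

    Country coefficients absorb the intercept; level coefficients are
    measured relative to ref_level.
    """
    countries = sorted(set(country))
    levels = [l for l in sorted(set(level)) if l != ref_level]
    X = np.zeros((len(log_y), len(countries) + len(levels)))
    for row, (c, l) in enumerate(zip(country, level)):
        X[row, countries.index(c)] = 1.0
        if l != ref_level:
            X[row, len(countries) + levels.index(l)] = 1.0
    # Weighted least squares: scale each row by the square root of its weight.
    w = np.sqrt(np.asarray(weights, dtype=float))
    y = np.asarray(log_y, dtype=float)
    coef, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    country_fx = dict(zip(countries, coef[:len(countries)]))
    level_fx = dict(zip(levels, coef[len(countries):]))
    level_fx[ref_level] = 0.0
    return country_fx, level_fx

# Hypothetical, noise-free data (not from the memo).
true_country = {"A": -5.0, "B": -5.5, "C": -4.8}
true_level = {"PA": 3.7, "PR": 3.6, "S": 2.1, "T": 0.0}
# Hypothetical expenditure shares; PA and PR share the primary weight.
share = {"PA": 0.5, "PR": 0.5, "S": 0.3, "T": 0.2}

# Drop one country-level cell to mimic missing data.
obs = [(c, l) for c in true_country for l in true_level if (c, l) != ("C", "T")]
log_y = [true_country[c] + true_level[l] for c, l in obs]

country_fx, level_fx = weighted_cpd(
    log_y,
    [c for c, _ in obs],
    [l for _, l in obs],
    [share[l] for _, l in obs],
    ref_level="T",
)

def pct_effect(b_a, b_b):
    """Percentage difference implied by two log-scale coefficients."""
    return (np.exp(b_a - b_b) - 1.0) * 100.0

# PA and PR coefficients as reported in Estimate 1.
attendance_effect = pct_effect(3.728, 3.577)
```

Because the country dummies absorb the intercept, the missing cell for country C does not prevent its coefficient from being estimated, which is the property that makes the CPD attractive when there are holes in the data.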
Age cannot be used in the same equation with country; taken separately it is not significant across the African countries.

An even simpler version of (1) takes the two sums of all students per capita, one with primary registration and one with primary attendance. This measure of education volume will be termed Vol2 in the later discussions.

Estimate 2-Dependent Variable: Log of Volume 2 All Students Per Capita: Africa

R-Square    Coeff Var    Root MSE    LVPC2 Mean
0.998       -1.365       0.023       -1.684

Parameter    Estimate    Error    t Value
Intercept      -1.326    0.021     -61.94
Level PA        0.125    0.008      14.87
Level PR        0.000

This equation, simple as it is, has a much smaller RMSE. The attendance impact is 12.5%, a somewhat smaller effect with much more precision than when all levels of education are considered. The quality measure is available for only 23 African observations. TimsE is not significant within these countries. This suggests that if TimsE is to be used, it would need to be introduced in a different specification.

Returning to the country coefficients from Estimates 1 and 2, how much difference does this all make for the estimates of education volume in Africa? This is explored in Table 1. Columns (1) and (2) are based on the 2005 ICP Report, providing the per capita GDP and PPP converted education expenditures as indexes with South Africa as the reference country. Columns (3) and (4) show the estimated values of per capita volumes from the estimating equations in the text. The association of per capita GDP with the education volume measures in columns (2)-(4) is positive but not large, r being .48, .30 and .19 respectively. The Bank productivity adjustments were in part based on capital estimates that were in turn based on income estimates, which should produce some correlation. The last rows of Table 1 provide the average values and measures of variation for the four variables.
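The measures of variation in the last rows of Table 1 are linked by the usual definition, coefficient of variation = standard deviation / average. The short check below takes the summary rows as reported in Table 1 and recomputes the coefficient of variation; the dictionary keys and the function name `coeff_of_variation` are labels chosen here for illustration, and small gaps are to be expected because the reported averages are rounded.

```python
# Summary rows as reported in Table 1:
# (average, standard deviation, coefficient of variation)
table1_summary = {
    "PC GDP index": (29.8, 33.700, 1.132),
    "ICP PC Ed-Volume": (46.7, 23.515, 0.504),
    "Direct Volume1": (70.7, 29.849, 0.422),
    "Direct Volume2": (80.1, 29.369, 0.366),
}

def coeff_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean

# Recompute the coefficient of variation from the reported mean and SD.
recomputed = {
    name: coeff_of_variation(mean, sd)
    for name, (mean, sd, _) in table1_summary.items()
}

# Largest gap between the recomputed and the reported values.
max_gap = max(
    abs(recomputed[name] - table1_summary[name][2]) for name in table1_summary
)

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.3f}, reported {table1_summary[name][2]:.3f}")
```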
South Africa has a GDP per capita three times the average of the 30 countries and its average quality of education can be taken as above the average for these countries. It is therefore plausible that the average values for the two direct volume measures are above that of the 2005 ICP values. Further, since Volume 2 does not distinguish levels of education, counting primary and college students equally, it is plausible that its average value is greater than for Volume 1.

One conclusion from Table 1 is that a very simple estimating equation provides a reasonable volume measure (Volume 2) with more precision (lower coefficient errors and RMSE) than does the version (Volume 1) distinguishing primary, secondary and tertiary levels. Since it is clearly desirable to distinguish education levels going forward, the Global Office, Regions and/or countries need to bring together enrolment and expenditure data as detailed as availability permits. Without more detailed evaluation, perhaps by the regions, it is not possible to say which of the 3 volume measures in Table 1 is the better estimate.

Part II: Comparing the EU and African Region

Comparisons with other regions should provide another window on the efficacy of the proposed approach. Data for 29 greater EU countries were quite casually assembled (no great attention was given to whether data were for 2005 or 2006, for example). The purpose is to estimate pooled CPDs in the form of Equations 1 and 2 above. The results were comparable as shown below.
Estimate 3-Dependent Variable: Log of Volume 1 Students Per Capita by Level: Africa-EU

R-Square    Coeff Var    Root MSE    Lvol1 Mean
0.974       -6.831       0.410       -6.009

Parameter    Estimate    Error    t Value
Intercept      -3.622    0.377      -9.61
Level PA        2.175    0.141      15.48
Level PR        2.101    0.140      14.99
Level S         0.963    0.156       6.17
Level T         0.000

Estimate 4-Dependent Variable: Log of Volume 2 All Students Per Capita: Africa-EU

R-Square    Coeff Var    Root MSE    Lvol2 Mean
0.9999      -0.711       0.035       -4.964

Parameter    Estimate    Error    t Value
Intercept      -1.295    0.033     -39.78
Level PA        0.065    0.009       6.97
Level PR        0.000

As for Africa, the RMSE is lower for Volume 2 than when all levels are considered. Because Registered and Attending primary students were assumed to be the same for the EU countries, the differential between PA and PR is essentially an average of 0.0 and the African value, so overall 6 or 7%.

Before turning to the country comparisons, the effect of age should be mentioned. Whereas age is not particularly associated with student volume per capita among the 30 African countries, the percentage under 15 is positively associated with students per capita between Africa and the EU.

Estimate 5-Dependent Variable: Log of Volume 1 Students Per Capita by Age: Africa-EU

R-Square    Coeff Var    Root MSE    Lvol2 Mean
0.902       -15.531      0.770       -4.964

Parameter    Estimate    Error    t Value
Intercept     -12.927    0.284     -45.51
Level PA        0.052    0.202       0.26
Level PR        0.000
AGE             0.247    0.008      32.40

There is a clear association between the AGE variable and the per capita volume when all 59 countries are included. When a regional dummy variable is added to Estimate 5, AGE is no longer significant. Even so, the association suggests that AGE would be a plausible check on the volume measure differences between the African and EU sample countries. This relationship will be examined in the discussion of Table 2.

The UK has been taken as the reference country in Table 2.
In addition to columns (1)-(4), which are the same as in Table 1, Table 2 provides the age measure in column (7). The countries are organized by region and the last two rows provide the regional averages. It can be seen that the African countries have a share of school age population in column (7) that is typically 2 or more times that of EU countries, the average being 2.35 times the EU. For this reason a number of the countries in Africa have direct volume estimates, unadjusted for quality, that are higher than those of a number of EU countries. This simply means that the higher proportion of school age population attending school in the EU countries is more than offset by the larger cohort of school age population in the African countries.

Columns (5) and (6) for Africa are columns (3) and (4) of Table 1 with the reference country being the UK instead of South Africa. Comparing these columns with columns (3) and (4) provides some notion of the effect of pooling on regional results. The correlations are fairly high, .953 between columns (3) and (5) and 1.0 to 3 decimal places between columns (4) and (6) for the African countries. Thus pooling Africa and the EU has not apparently disturbed country relationships within Africa but it has affected levels. That is, looking at the averages with the UK as reference, we see that ICP 2005 in column (2) now has a higher average than either of the direct volume estimates in columns (5) and (6), even though ICP incorporates productivity adjustments that proxy educational quality.

Column (5) for the EU is a direct volume comparison adjusted for quality, so it is not comparable to column (5) for Africa. The quality adjustment for the UK, the reference country, was about 102 with the EU 27 = 100. Thus using the UK as the reference country should have little effect on the averages. Now let us compare the average ICP 2005 education volumes in column (2) with quality adjusted direct volumes for the EU group in column (5) of Table 2.
This comparison suggests the quality adjusted volume approach will lower the relative position of the EU to Africa compared to ICP 2005. At least that is my inference; others may have different views.

How closely correlated are the quality adjusted EU volumes with the directly estimated Volume1 values? R-squared is .91 between columns (3) and (5) for the EU countries. This lends support to using direct quantity comparisons as a linking method between regions where comparable quality adjusted volume measures are not available.

The association between the ICP 2005 for the African countries and the unadjusted direct volume is quite low, R-squared = .095, and not much more for all 59 countries in Table 2, namely .107. And it is immediately apparent that the direct volume estimates are much higher in the African countries than the productivity adjusted indirect estimates of the 2005 ICP. The average of Direct Volume1 is 91.3 compared to 25.1 for ICP 2005.

How much of this difference in volume can we attribute to quality of education? Consider the following evidence. In the EU paper the maximum PISA adjustment is 1.113 for Finland, and the minimum is .786 for Macedonia, all relative to the EU 27 average. Min to Max is .707. Macedonia's per capita GDP is .242 of Finland's and the corresponding Age ratio is 1.28. In ICP 2005 South Africa has a per capita GDP 12% above Macedonia, so one might think that a 25-30% downward adjustment of direct volume for South Africa would be plausible. What about the much poorer African countries like Ethiopia or Niger? My judgment is that a maximum adjustment of 50%, which is non-linear with respect to per capita GDP, would be consistent with the EU-OECD paper for their countries. If this is the case the average adjusted Volume1 estimate might be 40-50% rather than 91% of the UK, but still substantially above ICP 2005. The quality adjustment needs much more thought and discussion than has gone into the above illustration, but it is a first step.
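The arithmetic behind this illustration can be sketched as follows. The PISA adjustments and the Table 2 averages are taken from the text; the flat discounts applied to the African average are a simplifying assumption made here, since the text envisages an adjustment that varies non-linearly with per capita GDP and gives no functional form. The name `apply_quality_discount` is hypothetical.

```python
def apply_quality_discount(direct_volume, discount):
    """Scale a direct volume index down by a fractional quality discount."""
    return direct_volume * (1.0 - discount)

# PISA adjustments cited from the EU paper, relative to the EU 27 average.
pisa_max = 1.113  # Finland
pisa_min = 0.786  # Macedonia
min_to_max = pisa_min / pisa_max  # roughly 0.71 from these rounded inputs

# Regional averages for Africa from Table 2 (UK = 100).
mean_direct_volume1_africa = 91.3
mean_icp2005_africa = 25.1

# Flat discounts are an assumption for illustration only: 25-30% is the
# range suggested for South Africa, 50% the suggested maximum.
adjusted = {
    d: apply_quality_discount(mean_direct_volume1_africa, d)
    for d in (0.25, 0.30, 0.50)
}

for d, value in adjusted.items():
    print(f"discount {d:.0%}: adjusted Volume1 average {value:.1f} "
          f"(ICP 2005 average {mean_icp2005_africa})")
```

Even at the maximum discount the adjusted average stays above the ICP 2005 average, which is the point made in the text.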
Part III: The Data Requirements and Approach for 2011

Data requirements are listed here that would be necessary to implement the approach of this paper, replicate the 2005 ICP, and provide supplementary information as additional checks on the 2011 estimates. The approach suggested here is that the Global Office (GO) assemble publicly available education series that are shared with the Regions to obtain consensus, if only by default. In addition a questionnaire would be sent to countries to obtain the salary data necessary to replicate the 2005 ICP.

The data to be assembled and actions to be taken by the Global Office include:

1) Student registration following the ISCED classification.
2) Expenditures according to available classification, usually primary, secondary and tertiary. It will be necessary to modify student data to conform across countries, particularly as between primary and secondary education.
3) Full time teachers by ISCED classification.
4) Attendance data for available countries, from tabulations of household (HH) surveys.
5) PISA scores from recent surveys, 2005 onwards?
6) Compare indirect and direct volume measures for 2011 before quality adjustment.
7) Provide direct and/or indirect volume estimates and PPPs for the regions after quality adjustment, based on regional data only and on data pooled across the regions.

Countries/Regions would be requested to:

1) Review their data assembled by the GO.
2) Provide salary and benefit information for teachers and other groups as per 2005 ICP.
3) Review the quality adjusted volume and PPP estimates provided by the GO, based on regional data only and on data pooled across the regions. The Regions would then decide whether they wish to use these estimates in their regional comparison or develop their own estimates.

This paper applied the CPD method to countries from the 2005 ICP in order to estimate direct volume measures for education as a possible approach to education for the 2011 ICP.
It is an approach that can be a first approximation to the EU-OECD method, which uses quality measures of education based upon student testing and other data to arrive at a final direct volume estimate. An illustration building on the EU-OECD results was presented that suggested how quality adjustments might be approximated for countries/regions where standardized testing results are not widely available. The empirical implementation of the approach built upon 30 African and 29 EU countries for 2005, where comparisons were also made with ICP 2005. An argument was made that the approach could be used to link regions even if quality adjustments within the regions differed.

Table 1 Education Volume Comparison: Africa 2005

Country               2005 ICP    2005 ICP      Direct      Direct
                        PC GDP          PC       Index       Index
                         Index   Ed-Volume     Volume1     Volume2
                        SA=100      SA=100      SA=100      SA=100
                           (1)         (2)         (3)         (4)
Angola                    41.9        73.8        73.0        95.4
Benin                     16.3        37.2        79.6        83.5
Botswana                 142.4        63.5       124.9       113.9
Burkina Faso              13.3        31.5        39.8        41.9
Burundi                   22.2        15.2        16.4        18.2
Cameroon                  23.6        39.2        72.8        73.1
Cape Verde                33.5        89.7       105.0       111.3
Central African R          7.9        36.2        39.2        45.5
Chad                      20.7         4.3        62.1        82.2
Congo, R                  42.9        28.9        14.6        79.5
Cote d'Ivoire             18.7        68.4        10.6        14.8
Ethiopia                   6.9        63.0        52.2        65.1
Gambia                     8.4        29.2        56.8        68.9
Ghana                     14.3        41.0        71.1        68.1
Kenya                     16.3        51.4        82.2        95.8
Lesotho                   16.7        44.9        96.6       115.3
Madagascar                11.8        11.1        72.4        81.3
Mauritania                20.2        26.1        65.3        71.7
Mauritius                118.7        54.9        56.6        50.4
Mozambique                 8.9        48.7        59.3        89.9
Namibia                   53.7        90.4        98.6       108.6
Niger                      7.4        42.6        34.0        37.9
Rwanda                     9.9        22.4        87.8       104.3
Senegal                   19.7        55.1        60.8        61.4
Sierra Leone               9.4        35.0       115.5       124.3
South Africa             100.0       100.0       100.0       100.0
Swaziland                 51.7        67.2       100.6       108.8
Togo                      10.3        25.9        69.2        69.8
Uganda                    11.8        40.1        99.5       114.6
Zambia                    13.8        62.8       103.7       108.7
Average                   29.8        46.7        70.7        80.1
Standard Deviation      33.700      23.515      29.849      29.369
Coeff of Variation       1.132       0.504       0.422       0.366

Table 2 ICP Education Volume Comparisons: Africa-EU 2005

Columns (indexes with UK=100, except column 7):
(1) 2005 ICP PC GDP Index
(2) 2005 ICP PC ED-Volume
(3) ED Volume1 Direct
(4) ED Volume2 Direct
(5) ED Volume1 Direct T1
(6) ED Volume2 Direct T1
(7) AGE: % Under 15 Years

For the EU countries only one value is reported between columns (4) and (7); following the text, it is shown in column (5).

Country by Region      (1)     (2)     (3)     (4)     (5)     (6)     (7)
Angola                11.2     6.4    88.2   121.6    19.5    25.5    43.2
Benin                  4.4     9.7    96.9   109.7    21.3    22.4    47.5
Botswana              38.1    78.9   131.8   149.7    33.5    30.5    40.6
Burkina Faso           3.6     7.5    58.3    55.1    10.7    11.2    47.6
Burundi                5.9     6.8    21.4    23.9     4.4     4.9    47.1
Cameroon               6.3    12.1    90.4    96.1    19.5    19.6    42.7
Cape Verde             9.0    41.5   140.6   146.2    28.1    29.8    43.6
Central Afr            2.1     5.2    48.4    59.8    10.5    12.2    43.4
Chad                   5.5    28.6    76.8   108.0    16.6    22.0    47.7
Congo, R              11.5    26.5    20.9   104.5     3.9    21.3    42.5
Cote d'Ivoire          5.0     5.9    13.5    19.5     2.8     4.0    46.4
Ethiopia               1.8     3.1    54.2    85.5    14.0    17.4    47.0
Gambia                 2.2    44.1    74.5    90.5    15.2    18.5    45.3
Ghana                  3.8    13.0    82.4    89.5    19.1    18.2    41.9
Kenya                  4.4    18.2   108.4   125.9    22.0    25.7    42.8
Lesotho                4.5    40.0   102.1   151.6    25.9    30.9    39.6
Madagascar             3.2    21.4   101.3   106.9    19.4    21.8    45.0
Mauritania             5.4    11.8    95.0    94.2    17.5    19.2    46.2
Mauritius             31.8    90.9    70.0    66.3    15.2    13.5    25.7
Mozambique             2.4     6.3    79.1   118.1    15.9    24.1    42.9
Namibia               14.4    52.9   135.7   142.7    26.4    29.1    42.9
Niger                  2.0     2.7    48.9    49.9     9.1    10.2    48.0
Rwanda                 2.6    10.8   103.9   137.0    23.5    27.9    43.0
Senegal                5.3    10.1    72.6    80.7    16.3    16.4    44.9
Sierra Leone           2.5    13.5   177.0   163.4    30.9    33.3    41.5
South Africa          26.8    62.8   146.1   131.4    26.8    26.8    32.5
Swaziland             13.9    32.1   116.2   143.0    26.9    29.1    45.6
Togo                   2.8    10.3    82.5    91.7    18.5    18.7    46.1
Uganda                 3.2    61.7   139.0   150.6    26.7    30.7    51.1
Zambia                 3.7    16.9   163.3   142.9    27.8    29.1    47.6
Austria              107.9   131.4    89.4    89.7    97.2            16.7
Belgium              101.6   141.2   121.0   102.7   113.7            17.5
Bulgaria              29.6    93.5    78.6    75.2    78.0            15.6
Croatia               42.0    89.0    85.5    81.2    83.7            18.0
Cyprus                77.4   123.8    95.5    94.8    97.6            23.5
Czech Rep             64.2   109.7    98.4    90.6    99.7            16.5
Denmark              106.5   148.0   108.5   102.3   114.8            18.5
Finland               96.4   126.5   116.9   111.8   122.1            18.2
France                93.8   127.2   100.5    97.3   108.4            18.8
Germany               96.6    73.5    91.6    87.8    96.7            15.7
Greece                80.7   111.2    97.0    91.9    87.0            15.2
Hungary               53.8   112.0    90.6    90.5    99.9            16.9
Iceland              112.8   211.2   131.8   131.2   137.1            23.3
Ireland              120.4   162.9   124.6   120.3   114.3            21.8
Italy                 87.9    95.4    79.1    80.9    86.5            14.2
Latvia                41.8   126.0   101.4    96.9    99.6            17.3
Lithuania             44.6   126.8   114.1   107.9   113.9            19.3
Malta                 64.6    75.4    97.7    92.9    95.7            20.3
Netherlands          109.9   128.6   105.8   100.6   105.1            18.4
Norway               150.5   144.9   118.2   112.2   114.6            20.0
Poland                43.0   101.5   104.3   105.2   111.3            19.0
Portugal              63.3    86.0    93.9    89.3    93.6            17.1
Romania               29.7    69.1    90.8    86.5    88.2            18.4
Slovakia              50.3    98.0   104.6   102.0    94.5            19.5
Slovenia              72.8   106.1    83.6    89.4   107.6            16.5
Sweden               101.3   170.7   107.2   102.4   111.3            18.4
Switzerland          112.4   122.4    91.8    88.7    94.0            17.1
Turkey                24.7    46.7   122.9   118.4   112.3            29.1
United Kingdom       100.0   100.0   100.0   100.0   100.0            19.0
Mean Africa            8.0    25.1    91.3   105.2    18.9    21.5    43.7
Mean EU               78.6   115.8   101.6    97.9   102.7            18.6
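The change of reference country described in Part II, from South Africa = 100 in Table 1 to UK = 100 in the T1 columns of Table 2, can be sketched as a simple rescaling. The helper `rebase` is a hypothetical name, only four countries are shown, and the scaling factor is South Africa's own value in the T1 columns (26.8). Because the published figures are rounded, the rebased values agree with Table 2 only to about one decimal place.

```python
def rebase(values, new_ref_value):
    """Rescale indexes so the old reference country (=100) takes new_ref_value."""
    return {name: v * new_ref_value / 100.0 for name, v in values.items()}

# Direct Volume1, Table 1, column (3), South Africa = 100.
table1_volume1 = {"Angola": 73.0, "Benin": 79.6, "Botswana": 124.9, "Ethiopia": 52.2}

# ED Volume1 Direct T1, Table 2, column (5), UK = 100.
table2_col5 = {"Angola": 19.5, "Benin": 21.3, "Botswana": 33.5, "Ethiopia": 14.0}

# South Africa's value in column (5) of Table 2.
south_africa_uk_base = 26.8

rebased = rebase(table1_volume1, south_africa_uk_base)
gaps = {name: abs(rebased[name] - table2_col5[name]) for name in rebased}

for name in rebased:
    print(f"{name}: rebased {rebased[name]:.2f}, Table 2 reports {table2_col5[name]}")
```

Rebasing multiplies every country by the same factor, so it changes levels but leaves the relative positions of countries within Africa untouched.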