What Makes Public Sector Data Valuable for Development?

Dean Jolliffe, Daniel Gerszon Mahler, Malarvizhi Veerappan, Talip Kilic, and Philip Wollburg

Downloaded from https://academic.oup.com/wbro/article/38/2/325/7118955 by Joint Bank-Fund library user on 04 September 2023

Data produced by the public sector can have transformational impacts on development outcomes through better targeting of resources, improved service delivery, cost savings, increased accountability, and more. Around the world, the amount of data produced by the public sector is increasing rapidly, but we argue the full potential of data to improve development outcomes has not been realized yet. We outline 12 features needed for data to generate greater value for development and present case studies substantiating these features. We argue that a key reason why the transformational value of data has not yet been realized is that suboptimal data (data not satisfying these 12 features) are being supplied. The features are that the data should be of adequate spatial and temporal coverage (complete, frequent, and timely), should be of high quality (accurate, comparable, and granular), should be easy to use (accessible, understandable, and interoperable), and should be safe to use (impartial, confidential, and appropriate).

JEL Codes: C81, C83, O10, O20
Keywords: data, development, statistics.

Around the world, the supply of public sector data has increased rapidly. Since 2005, the number of countries without a population and housing census conducted over the preceding 10 years has fallen by nearly 80 percent (from 36 to 8), the number of countries without a labor force survey conducted over the same period has fallen by 50 percent (from 98 to 49), and the number of countries without nationally representative administrative data on learning assessments within a five-year range has fallen by 83 percent since 2008 (from 36 to 6) (World Bank 2021a).
Public sector data can have a transformational role in development and efforts to reduce poverty. Amongst many fruitful uses, data can be used to increase access to government services, prepare for and respond to emergencies, target resources and foster the inclusion of marginalized groups, save money and resources in policy implementation and service delivery, monitor progress and track performance, increase accountability, and empower individuals (World Bank 2021b).

The World Bank Research Observer © 2023 International Bank for Reconstruction and Development / The World Bank. Published by Oxford University Press https://doi.org/10.1093/wbro/lkad004 38:325–346

While the increased availability of data has improved development outcomes, we argue that their full potential for development is far from being realized. We suggest that a major, and sometimes overlooked, reason for this shortfall is that the data produced do not have certain features that make them valuable. We first develop a conceptual framework specifying 12 features often needed for data to most effectively generate value: data should have adequate coverage (complete, frequent, and timely), be of high quality (accurate, comparable, and granular), be easy to use (accessible, understandable, and interoperable), and be safe to use (impartial, confidential, and appropriate). We argue that often the feature least present can be decisive for the value, or lack thereof, that can be derived from the data. Next, we use a collection of case studies to provide support for these features and showcase how they matter in practice. Too often, we find, the data produced by governments do not satisfy these features and thus are not conducive to transforming development outcomes.
The data may be of poor quality, siloed in various administrative systems, not shared with the public, not readable by computers, and so forth.

We aim to contribute to the literature in several ways. We develop a framework of features that increase the potential of data to be valuable. We discuss how the interaction of the 12 features is crucial to understanding the value of data and the policies necessary to increase the value of data. We further explore how our framework relates to the equilibrium of data supply and demand. We directly tie the theoretical framework of data features to case studies to support our inferences on how best to scale up public data systems to realize greater value.

We make no attempt to estimate the net social value of public sector data, in part because many of the benefits occur in dimensions for which prices do not exist (e.g., improved health from drinking clean water), but this also reflects our view that there is no satisfactory way to assign a monetary value to an inexhaustible, or nonrival, good. The limitless scope for data to be used and reused to address new and unexpected problems, as well as the potential for data to be misused for harm, suggests to us that any estimate of the expected net social value of public sector data is one that will be very imprecise. Rather, by reviewing a series of use cases of public sector data, we provide evidence of the theoretical framework. Following recommendations in the literature on case studies, this has involved an iterative and inductive approach of using case studies to develop the theory and testing the validity of the theory using case studies (Dubois and Gadde 2002, Eisenhardt and Graebner 2007).
The use of case studies is considered appropriate when current theoretical perspectives are inadequate due to little empirical substantiation (Eisenhardt 1989), which we believe is pertinent due to the difficulty of estimating the net social value of public sector data discussed above. We acknowledge the limitations of using a case study approach, which some may consider as lending itself to opinion-based and prescriptive analyses.

We restrict our analysis to data collected in the public sector at large, focusing on low- and middle-income countries. This means that we neglect private sector data, most citizen-generated data,¹ and data from high-income countries. We also shy away from discussing two closely related issues: (a) why these 12 features are often missing from public sector data, that is, issues concerning the political economy of data production, and (b) under what conditions the demand for public sector data can be maximized. Our focus on data production means that the features we identify increase the potential for data to be valuable for development. However, for that value to be realized, the data need to be demanded and used for legitimate purposes.

There are other frameworks that list features conducive for data to be valuable. Such frameworks have been created by national statistical offices (Statistics Canada 2017), international organizations (OECD 2011 and United Nations 2019), and academia (Biemer 2010 and Wilkinson et al. 2016). Other frameworks have been created in more narrow contexts, such as for the case of health data (Wang and Desalvo 2018), official statistics in the US (Groshen 2021), and for the purpose of maintaining confidence in public sector data (Gore et al. 1991).
There is a considerable overlap between the different frameworks, though the exact names and definitions of the features in them differ from source to source. All 12 features we identify can be found in some shape or form in one of the other frameworks. Though many of the frameworks originate from work with survey data, they can be adapted to other data types. The framework of Biemer (2010), for example, which is based on the concept of a total survey error, has been adapted to big data (Amaya et al. 2020) and agricultural data (Carletto et al. 2021).

Other collections of case studies of the impact of data on development exist as well, and we will draw examples from some of these other collections. Among these are the Data Impact Case Studies gathered by Open Data Watch (2020), the Value of Data Case Studies gathered by the Global Partnership for Sustainable Development Data (2021), Open Data Impact Map’s collection of use cases of open data (Center for Open Data Enterprise 2021), GovLab’s collection on the same topic (Verhulst and Young 2016), a World Bank collection on the same topic (World Bank 2015), data2x’s collection of examples related to gender data (data2x 2019), and the World Bank’s collection of examples related to digital identification in the health sector (World Bank 2018a), the public sector (World Bank 2018b), and the private sector (World Bank 2018c).

Conceptual Framework

Figure 1. Features of Data Conducive to Maximizing their Value for Development

We outline 12 features that we argue are needed for public sector data to maximize their potential value for development. The 12 features relate to coverage, quality, and usability. With respect to coverage, inferences from the data need to be valid for all relevant units and time periods.
We argue that this means that the data need to be complete, frequent, and timely (these features, and the ones to follow, will be defined more clearly in what follows). With respect to quality, the data need to accurately measure the concepts of interest to the users and be comparable and granular. To ensure usability of the data, they should be easy and safe to use. Easy to use means that the data can be accessed, that they are understandable, and that they are interoperable. To ensure that the data are safe to use, they need to be impartial, confidential, and appropriate (Figure 1).

Though many different types of public sector data exist, such as censuses, survey data, administrative data, and geospatial data, we argue that these features increase the potential value of all public sector data. The exact way in which each feature matters varies from data type to data type. One of the features we will outline, for example, is that data benefit from being frequent. However, frequent data mean a different thing for GDP data than for weather data. Moreover, public sector data are used for a variety of purposes, such as to inform policy making, program implementation, and monitoring. We likewise argue that satisfying these features increases the potential value of data used for any of these purposes despite their differences. Again, the exact ways the features matter, and the relative importance of different features, differ between use cases.

Adequate Coverage:

• Complete: Data are representative of the population of interest. Often, for data to maximize their value for development, they need to cover the entire population of interest, whether geographical areas, households, or something else. In the case of census and administrative data, this typically means that the entire population of interest is enumerated.
In the case of sample data, this means a complete sampling frame containing the entire population of interest, each member having some known, positive probability of being selected into the sample. In this case, completeness does not imply that the entire population of interest is directly captured in the data, but that the data are representative of the population of interest. Completeness is particularly relevant for redistributive policies where identifying the worst off requires that all are identified.

• Frequent: Data that are produced at regular intervals. The frequency of data relates to how often new data points are made available. If one is only able to get new data on a topic, say, every 10 years, then understanding developments and implementing policies to improve outcomes becomes difficult. High frequency of data is particularly relevant for evaluating projects, tracking goals in national development plans, monitoring macroeconomic conditions, and more. For some outcomes, such as weather forecasts, the ideal frequency is at least daily, if not more frequent, while for other outcomes, such as macroeconomic data, the desired frequency may be monthly or quarterly. In each case, the desired frequency of data depends on the temporal dynamics of the outcome of interest, and this frequency may change as other conditions change.

• Timely: Data that are released shortly after collection or after an event occurs. Timeliness is particularly relevant for emergencies where policy response needs to be immediate, such as environmental disasters, health crises, conflicts, and more. The timeliness of data depends on the mode of data collection.
Some machine-generated data can be made available nearly instantaneously, while the collection of other data, such as survey data, often takes place over a longer period of time, and the data need careful analysis and cleaning before they can be released, leaving a time lag of several months.

High Quality:

• Accurate: Data that measure concepts of interest with minimal error. For data to be valuable they need to measure their concept of interest with minimal error. Minimizing the error requires paying attention both to the variance and bias of the statistic of interest. The former implies having enough power to minimize sampling error while the latter means that the data in expectation measure the statistic of interest accurately. Jointly minimizing the error means that the number of observations is adequate for relevant policy questions and that data collection and processing use methods and tools designed to capture the signal of some phenomena while filtering out noise or measurement error that can bias conclusions and evidence-based policies.

• Comparable: Data that are comparable across space and time. For data to be used for cross-country analysis or for tracking patterns over time, they need to be comparable across space and time. This means, for example, that the data should conform to certain standards and data collection instruments should be relatively stable over time. Examples of conforming to standards include the near-universal acceptance of the System of National Accounts for measuring the size of economies, and similarly the widely accepted standards for measuring geolocation and units of time. There are fewer clear examples of how to maintain comparability over time, but one common view is to avoid making unnecessary changes to the data collection processes which may induce a break in comparability of the data series.

• Granular: Data that can be broken down by subgroups.
Often for data to maximize their value they need to be broken down by subgroups, such as geographical areas, time, sex, disability status, and more. This requires these subgroups to be captured appropriately in the data. The level of detail available with granular data means that de-identifying subjects becomes critical to protecting the privacy of the subjects.

Easy to Use:

• Accessible: Data that are made available to a wide audience of users. Failing to make data available to relevant users prevents them from being used effectively. Making data accessible entails that the data are machine-readable, meaning that content can be processed by computers, and, where relevant, that the data are made available free of charge. Not all data should be made available to everyone, and to ensure the safe use of data, accessibility may be conditioned on a terms of use agreement and on other restrictions, such as how and where data can be accessed.

• Understandable: Data that are easy to understand, process, and use. Making data accessible is not enough for them to be used. To facilitate use, the data need to be easy to understand as well. This entails that the data come with metadata describing how they were generated/collected and how to process them. For certain applications, understandability necessitates that the websites hosting the data are translated into English (Ekhator-Modayode and Hoogeveen 2021). For other applications, the value and use of data can also be maximized when they are summarized or visualized in figures, tables, and more.

• Interoperable: Data that can be linked to other data sources. For data to maximize their use, it is often necessary to link and combine different data sources through common identifiers for persons, facilities, firms, geospatial coordinates, time stamps, or common classification standards.
This ensures that information from multiple datasets can be combined, maximizing the potential uses. Interoperability increases when data conform to specific standards. Interoperability amplifies the risk of data breaches and misuse, implying that terms of use agreements ought to be in place for users wishing to merge different datasets.

Safe to Use:

• Impartial: Data that are immune to the influence of stakeholders. For data to be used safely, they need to be immune to harmful influences of any stakeholders in the data lifecycle, such as funders, producers, or users. If stakeholders can negatively influence data, such as by altering or censoring data values to promote some agenda, the data lose credibility, which prevents objective and accurate data analysis. Lack of impartiality can have externalities on other data products; if users know that one data source has been meddled with, they may lose trust in all data products from the same institution.

• Confidential: Data that protect personal and sensitive information and are only accessible in a safe and secure manner. Protecting personal and sensitive information requires de-identification of data, such that individuals or establishments cannot be identified in the data. Accessibility in a safe and secure manner allows access for legitimate purposes but seeks to limit the scope for misuse of the data, which could imply, for instance, that data can only be accessed from certain virtual or physical enclaves or through systems preventing users from storing the data locally.

• Appropriate: Data that measure concepts of interest with a clear development purpose. One way to restrict misuse of data is to ensure that only data appropriate for measuring the concepts of interest for a clear policy purpose are produced and used, without attempts to collect excessive information or conduct surveillance.
Appropriateness implies a proportionality or adequacy principle, by which the amount of data is proportional or adequate to the need.

Interactions between Data Features

To understand how value can be derived from these features it is constructive to create a data production function, which is a function of the 12 features: $\text{supply} = f(x_1, x_2, \ldots, x_{12})$.² Though such a function cannot be estimated in practice, the signs of its partial and cross derivatives are often known. As a general rule, the more present a feature is, the greater potential the data bring to development, meaning that $\partial f / \partial x_i > 0$. But there are important caveats and nuances to this broad statement, given that the effect of one feature on the value of data is not independent of the other data features. More specifically, there are three important ways in which value is determined from the interaction of features: (a) the feature least present is often the constraining factor on the value that can be derived (i.e., increasing other features may not increase value if the least present feature is unchanged), (b) at times there are positive spillovers between features, (c) at times there are negative spillovers between features.

1. The importance of the features least present. Often the derived value will be determined by the feature least present. When this is the case, the data production function can be represented through a Leontief specification: $\text{supply} = \min(x_1, x_2, \ldots, x_{12})$. Such a situation is comparable to Kremer’s O-Ring theory of economic development, where one malfunction in the value chain in production can dramatically reduce a product’s value (Kremer 1993). More colloquially, this is sometimes expressed as a chain being only as strong as its weakest link.
If, for instance, nearly all features are present, but the data are not produced with adequate coverage, they may not be able to serve their intended policy objective. When data are not of high quality, they could misguide policy decisions. When the data are not easy to use, they might not be used at all. When data are not safe to use, they may end up causing harm due to data breaches, surveillance, or exclusion.

2. Positive spillovers between features. The strength of certain features in a dataset may nurture progress in other features. For the data production function, this implies that the cross derivatives are positive in such cases: $\partial^2 f / \partial x_i \partial x_j > 0$. For example, when data are made accessible to a larger public, it tends to create a sense of greater scrutiny of the data which fosters impartiality (or, in other words, serves as a check on efforts to manipulate data). The argument also goes the other way; weakness in a feature might cause weakness in another feature. For example, when data quality is weak, and especially when methodological foundations are vague, it becomes easier to manipulate statistics in one direction or another, violating impartiality (Jerven 2019).

3. Negative spillovers between features. On other occasions the presence of one feature has a negative impact on other features. In such cases, the cross derivatives of the data production function are negative: $\partial^2 f / \partial x_i \partial x_j < 0$. For data producers, this can create trade-offs and conflicts between features. For example, when advancing one feature is costly (such as collecting more data, which can help with coverage and accuracy), this may limit resources that can be devoted to other features.
Data producers may also have to balance temporal comparability with the need to revise the data collection instruments if the behavior of the enumerated entity changes significantly (e.g., the list of consumption items may need to change if new items are introduced into society) or if international standards of data collection are updated over time, in part to facilitate the adoption of methods to increase accuracy. And perhaps most importantly, the potential harm from data not being safe to use increases when certain other features are in place (Hasanzadeh et al. 2020, Heijlen and Crompvoets 2021). For example, when data are granular, interoperable, and have wide coverage, they can be misused more easily and more effectively for illicit ends, magnifying the risk of confidentiality violations. In essence, because these features make the data very useful and usable, they multiply the ways in which the data could be used to cause harm.

The Equilibrium Data Supply and Demand

Even if the 12 features are present in some public sector data, the data need not generate value for development. For the value to be realized, the data also need to be demanded and used effectively for legitimate purposes.

Just like we have specified a data supply function, we could likewise specify the determinants of a data demand function, which would include data literacy, infrastructure, incentives, and more. Discussing the determinants of data demand is beyond the scope of this paper and is covered in greater detail in World Bank (2021b). Here we want to discuss whether the market for data (the intersection of data supply and data demand) clears with an efficient amount of data being produced. In practice, this does not tend to be the case for several reasons, three of which are:

1. Markets for public sector data are largely non-existent. In private markets, prices reveal to producers of private goods the efficient supply of goods to provide.
In the case of public sector data, there are no prices that inform governments of how people value data. Without prices, there are no economic forces that help ensure an equilibrium outcome where supply equals demand.

2. Government data supply is often a public good. Most public sector data, and really data in general, can be characterized as having attributes of a public good. In particular, data are nonrival: one person’s use does not diminish another person’s use. As with all public goods, this tends to lead to an undersupply.

3. Data use is often a positive externality. Most legitimate data use has positive impacts on individuals beyond the one using the data. This means that the use of public sector data generates a positive externality and, by consequence, that both the use and production of the data are below the optimal level.

These three issues have been widely recognized and led to market interventions, most notably large international and national efforts aimed at boosting the capacity of national statistical offices and at boosting data literacy (World Bank 2021b). Though these efforts certainly are important, our analysis sheds light on a further aspect of the sub-optimal supply of data. In particular, we argue that sub-optimal supply is not only related to a sub-optimal amount of data supplied but also to the wrong kind of data supplied, that is, data that do not have the 12 features listed above. Improving the feature least present may in many cases be quicker and cheaper than producing new datasets or improving data literacy. Making data more understandable, for example, does not require producing new data and increases the use of data at any given data literacy level.
Likewise, introducing safeguards that minimize the potential for data to lose their impartiality comes at minimal economic cost and is likely to increase the trust users have in data and hence have positive spillovers on other data products.

Case Studies

This section relies on case studies illustrating how each feature can bring about positive change or how the absence of a feature has resulted in worse development outcomes. Most examples involve several features either directly or indirectly, and as outlined above, having in place a single feature in isolation is seldom enough to bring about value.

Completeness

When data cover the entire population of interest it is possible to make inferences that one otherwise would not be able to. One of the most fundamental ways in which countries can make sure their data cover their entire population of interest is through population registration systems and by assuring that all individuals are covered in government databases. In Thailand, at the turn of the century, only 71 percent of the population was covered by a public health insurance scheme that was intended to be universal. Yet the country had a near-universal population register in which citizens were issued a personal identification number when they were born or when their household was registered for the first time. Leveraging this register and the personal identification information from the existing public insurance scheme, the government was able to identify the population not covered and increased health insurance coverage from 71 percent to 95 percent (World Bank 2020a and World Bank 2018a).

In the absence of nation-wide administrative data, representative survey data fulfill a similar role.
In Nigeria, for example, the government commissioned the 2015 National Water Supply and Sanitation Survey to understand access to water and sanitation of all households in Nigeria. As part of the survey, 201,842 households, 89,721 improved water points, 5,100 water schemes, and 51,551 public facilities including schools and health care facilities were enumerated. The data revealed that sanitation services of 130 million Nigerians did not meet the standard for sanitation as expressed in the United Nations’ Millennium Development Goals. More specifically, the data revealed that inadequate access to water is particularly an issue for poor households, and that public expenditure in water and sanitation is limited and of poor quality (Figure 2) (World Bank 2017). In response to these findings, President Muhammadu Buhari declared a state of emergency in the sector and launched the National Action Plan for the Revitalization of Nigeria’s Water, Sanitation and Hygiene (WASH) Sector (Nigeria Federal Ministry of Water Resources 2018). President Buhari pledged that his administration would prioritize WASH infrastructure development, long-term planning, and stakeholder coordination. In recognition of the sector-wide crisis, the government requested from the World Bank a 700 million USD lending operation to support the sector.

Frequency

When data are frequent, they can be better used for monitoring. Often, this requires data that are at least annual, so they can inform annual budget and policy decisions. This has been the case in Costa Rica where a Multidimensional Poverty Index using a household survey has been adopted as an official measure to inform and monitor poverty reduction strategies. The index reveals how the country is performing along key indicators related to education, health, housing, employment, and social protection.
In May 2016, a Presidential Directive was issued stating that the index should be used for budgetary planning and as an official measure for allocating resources and monitoring and evaluating social programs. Through this Directive, the index has been used to modify the allocation of resources, which helped accelerate poverty reduction during austerity without an increase in the budget (Multidimensional Poverty Peer Network 2017).

Figure 2. Complete Data Pinpointed Areas of Nigeria that Needed Better Sanitation
Source: World Bank (2017).

When data are infrequent, the consequences can be dire. A study of intergovernmental fiscal transfers in Bolivia, Ecuador, and El Salvador revealed that transfers can be misallocated in the absence of frequent data. Since the transfers rely on subnational population estimates and given that recent population estimates are not always available, the transfers at times rely on outdated data. By employing retrospective census estimates, Roseth et al. (2019) were able to simulate how transfers would have been allocated had population data been available at the time of their allocation. In El Salvador, inaccuracies in municipal population sizes led to 92 million US dollars being misallocated between 2000 and 2007. This is equivalent to more than 27 times the annual budget of the national statistical office and 8 times the budget of the latest census.

Timeliness

When data are timely, they can lead to better emergency response when disasters hit, whether environmental, financial, health, or conflict related. For example, weather data, especially weather forecasts, can help people anticipate and prepare for extreme events.
The value of such data can be illustrated by two intense cyclones in the Bay of Bengal that occurred 14 years apart. The 1999 cyclone caught the Indian state of Odisha by surprise, causing massive devastation, killing more than 10,000 people, and destroying housing and public infrastructure. Since then, the Odisha State Disaster Management Authority and the government of Odisha have invested in weather forecast data and disaster response measures. When another cyclone hit in 2013, nearly 1 million people were evacuated to cyclone shelters, safe houses, and inland locations, and only 38 people died during and immediately after the storm (Hallegatte et al. 2017). These impressive results would not have been possible without the weather data that gave sufficient advance warning of the cyclone.

Satellite data can likewise enable a timely response to emerging threats. In Sri Lanka, where more than two million hectares of natural forest are home to over 30 endangered mammals, satellite data have been used to safely and quickly respond to deforestation of protected areas (Jamilla and Ruiz 2020). The Department of Forest Conservation used to rely mostly on routine patrolling, which can strain resources and be dangerous because of the terrain and the animals that patrollers may encounter. Propelled by COVID-19, which restricted the use of patrolling, the department now relies on a mobile app provided by the Global Forest Watch. The app relays real-time satellite imagery to alert users of possible deforestation and provide evidence when deforestation has occurred. In one case, a popular local TV channel reported that encroachments had taken place without being addressed by the department, upon which the department used the satellite data and alerts to identify two locations of encroachment.
As a result of this, legal actions were taken to stop the encroachments from expanding.

Granularity

Data can be granular along a number of dimensions, for example granular in space, granular in time, or granular in demographic attributes, such as sex. Spatial granularity can help target resources and foster inclusion. In Croatia, for instance, data from the population census were combined with household survey data and administrative data to create detailed maps of poverty and deprivations (Figure 3) (Croatian Bureau of Statistics 2016 and Croatian Bureau of Statistics 2017). The maps revealed large differences in living standards across municipalities and within the regions used for allocating funds from the European Union (EU). More than one-third of the EU’s annual budget, equivalent to more than €50 billion, is dedicated to investments in infrastructure, such as hospitals and schools, in less economically developed regions. The allocation of the funds depends on regions’ gross domestic product per capita, which means that poor municipalities situated in non-poor regions may be prevented from receiving funding. Armed with the poverty map, Croatia responded with proposals for new regional divisions that concentrate EU funds in the poorest areas (Government of the Republic of Croatia 2019). This reordering, thanks to better data and analysis, has the potential to reduce inequality and pockets of poverty in Croatia.

Figure 3. Mapping Pockets of Poverty in Croatia Allowed Better Targeting of Antipoverty Funds
Source: Croatian Bureau of Statistics (2016).

In the absence of granular data, resources may be poorly allocated. This was the case in Sierra Leone, which lacked granular data when making 1,561 water investments across 12 districts in 2012.
The investments reached 28,556 individuals, a majority of whom were located in areas where the surrounding populations were already served by other functional water points. A retrospective analysis using highly granular water point data made available through the Water Point Data Exchange (WPdx), an online platform for rural water point data sharing, access, and analysis, showed that the results could have been very different. Had the WPdx data been available and used for the investments in 2012, it would have been possible to reach nearly 4 times as many people with only about a third of the water point investments. This is equivalent to a reduction in cost per person reached from 54.66 to 3.94 USD (WPdx 2022).

Accuracy

In the absence of a sound methodological base, indicators derived from data may be inaccurate and misleading, limiting their policy relevance. In the Middle East and North Africa, basic statistics such as unemployment rates are measured with imprecise definitions, blurring the line between unemployment and informality, which distorts the role of women and rural areas in assessments of national employment. Small changes in the definitions have large implications for the quantity of interest (Arezki et al. 2020).

Inaccuracy also affects GDP figures. In 2010, Ghana officially revised its GDP figures upwards by about 60 percent. This change suggested that Ghana’s economic performance had been significantly underestimated for years. The revision was the result of moving from the 1968 System of National Accounts guidelines to the 1993 vintage as well as accounting more accurately for emerging sectors of the Ghanaian economy (Jerven and Duncan 2012).
Similar upwards revisions occurred in other African nations, including Nigeria, Senegal, Kenya, and Zimbabwe (Koumane et al. 2019). While these revisions do not directly improve wellbeing, an accurate assessment of a country’s output better informs macroeconomic policies, affecting wellbeing indirectly. It also improves cross-country comparability, which allows countries to better monitor relative performance.

Comparability

When data lack comparability, they will be of limited use. A good example of this is COVID-19 case data, which, though they arrived daily in most countries of the world, could only imperfectly be used for within-country comparisons over time. The primary reason for this is that as countries increased their testing capabilities over time, more people were reported as having contracted the virus, whether or not actual infections had risen. Though increased testing is critical, this made data on confirmed cases less comparable over time within countries.

Conversely, when a country’s data are comparable to those of other countries, the country can benchmark its performance against peers and use the comparison to evaluate national policies and assess national priorities. Countries often respond with reforms in areas where they are lagging. As one example, the Democratic Republic of Congo (DRC) made gender equity reforms upon seeing data from the Women, Business and the Law Index, an index created by the World Bank to compare laws and regulations affecting women’s economic opportunity across economies. The reform effort led to changes to the DRC’s Family Code, which for decades contained legal provisions that prevented married women from carrying out economic activities.
The adoption of a new Family Code in July 2016 allowed married female entrepreneurs in DRC to start formal businesses, open bank accounts, register a company, and perform a host of other economic activities without interference from their husbands.

Accessibility

When data producers make data widely available, they empower individuals to make better choices through more information and knowledge. The digital revolution has increased the potential and ease with which information can be shared. In Ethiopia, small-scale farmers lacked access to reliable price information, and thus often received prices far below market value. In 2008 the Ethiopian Commodity Exchange opened, providing price information to farmers through text messages, hotlines, and online information. As a result, the farmers have been able to cut trader margins by half and increase their revenue (Vaitla et al. 2015).

Another example where data attained greater value when they were made accessible relates to public procurement. Too often, public projects are not implemented adequately due to poor procurement, such as inflated costs, corruption, or ghost contracts. Since 12 percent of global GDP is spent on public procurement, this matters tremendously for development outcomes (Bosio and Djankov 2020). In Uganda, in an attempt to improve procurement outcomes, local government entities made administrative procurement data from the bidding process, down to the execution of contracts, available to certain Civil Society Organizations (CSOs). The CSOs trained community members to understand the information in the contracts and conduct site checks to verify it. The findings revealed mismanagement of resources by contractors and government officials and high dependence on noncompetitive contracts.
Beyond the direct benefit of ensuring that contracts complied with national procurement standards, the initiative led the national public procurement agency to upgrade its procurement portal in line with international open contracting data standards, making Uganda the first African country to do so (Africa Freedom of Information Centre 2018 and Global Partnership for Social Accountability 2020).

Understandability

Data accessibility is not enough to ensure that data are used by governments or individuals when the data are difficult to understand or users lack the skills to use them. This was the case in Brazil where receiving summarized data on research findings was instrumental for mayors to make policy changes. Evidence from 2,150 municipalities found that informing municipal mayors of research findings on the effectiveness of a simple policy change increased the probability that their municipality implemented the policy by 10 percentage points (Hjort et al. 2021).

Another example, where making the data more understandable helped guide policy choices, comes from Pakistan. Prior to 2008, Pakistan’s Punjab province suffered from poor government service delivery due to inefficiencies, rent seeking and more. Lack of digitized service delivery processes made it impossible to track service delivery and monitor performance and satisfaction with services. In an attempt to take on these challenges, in 2008, officials in one district of Punjab put in place a pilot Citizen Feedback Monitoring Program (CFMP), in which data on service provision from citizens were crowd-sourced using simple text messaging and other information and communications technology (ICT) applications. Analytical reports were sent to government officials enabling them to identify patterns and take evidence-based corrective measures.
In 2012, the initiative was scaled up to 36 districts of the province and across 25 different public services. As of 2019, CFMP had contacted 29 million citizens to solicit their feedback, and the government of Punjab had taken 41,600 corrective measures in response to CFMP data, including warnings, penalties and suspension. One of many results of this is that the availability of medicine increased from 46 percent to 77 percent (Global Delivery Initiative 2019). In a 2015 evaluation of the program, 90 percent of respondents said it had helped build trust between citizens and authorities (Masud 2015).

Interoperability

When data sources are interoperable, the potential use and value of data increase. As one example, interoperability can help governments prioritize scarce resources, cutting costs by eliminating duplicate recipients of social transfers and ghost recipients, that is, beneficiaries (often of pension funds) who are no longer alive. This was the case in Argentina where the government identified noneligible beneficiaries across various social programs using the country’s system of unique taxpayer ID numbers to link datasets, generating an estimated savings of 143 million USD over eight years (World Bank 2020a and World Bank 2018b).

Interoperability can also induce cost savings for the private sector, particularly when data contain key national identifiers of companies, individuals, geographical units, and other entities, which allow for easy linking with a company’s own data. National ID systems, for example, can increase private sector efficiency through cost savings such as removing the need for companies to create their own ID systems, and increase revenues by expanding the potential customer base (World Bank 2020a and World Bank 2018c). In India, Aadhaar, the unique 12-digit identity number that can be obtained by residents and citizens of India, has reduced the need for identity verification for firms.
As a result, it is estimated that a firm’s typical onboarding costs for new employees could decrease from 1,500 rupees to 10 rupees, a reduction of 99.3 percent (World Bank 2020a and World Bank 2018c).

Impartiality

When data are not impartial, the potential for misuse increases and trust in data can be eroded. Assuring that data cannot be manipulated by producers can have positive consequences for government budgets. India saved 19 percent on a rural employment guarantee scheme, the world’s largest workfare program, after introducing e-governance to the payments system. Most of the savings arose because local officials were no longer tasked with distributing the funds and thus could not mismanage them and misreport the data. The change reduced officials’ personal wealth by 10 percent (Banerjee et al. 2020).

The World Bank recently faced a case in which the impartiality of one of its key data products was called into question. An independent external review established “data irregularities” (data choices not immune to the influence of stakeholders) with respect to four countries’ rankings in the annual Doing Business report. One of these concerned China’s 2018 ranking, where the independent review found that World Bank senior leadership had pushed for last-minute methodological changes in an attempt to boost China’s ranking. This came against the backdrop of high-ranking Chinese government officials expressing concerns that the country’s ranking did not reflect its economic reforms, at a time when the World Bank was heavily reliant on China in negotiations for its capital increase goals (Machen et al. 2021). As a result of these irregularities, the credibility of the Doing Business report was compromised and the World Bank decided to discontinue it.
Confidentiality

When data are not confidential, individuals may be identifiable. This violates core principles of data protection and increases the likelihood that harm can be done to the identified subjects. De-identifying individuals is not always enough to maintain confidentiality. In the 1990s, for example, the Governor of Massachusetts approved making de-identified medical records of state employees available for researchers. Though the data had key identifiers such as names and addresses removed, by triangulating the information available with other public information, a researcher was able to identify the medical records of the governor. Other individuals could likewise be identified (Heffetz and Ligett 2014).

Beyond re-identification, data breaches also pose a threat to confidentiality because they raise doubts as to whether personal information can be kept safe. Aadhaar, the unique 12-digit identity number of India mentioned before, has suffered from several data leaks. In one instance, more than 200 government websites accidentally made personal data, including demographic data, names, phone numbers, religion, bank account numbers and more, publicly available on the internet (TECH2 2018). Though the data were quickly removed, such leaks often have permanent implications for confidence in data systems.

Appropriateness

To avoid data misuse and surveillance, the amount of data collected should not be excessive but appropriate for the particular needs. Governments’ tracking and contact tracing of infected and at-risk individuals through smartphone apps or phone location data during the COVID-19 pandemic shed light on this.
One way to limit surveillance to what is needed to contain the spread of the virus, taken up by some countries, is to rely on a protocol developed by Apple and Google for contact-tracing apps. With this protocol, phones in close proximity to each other use low energy Bluetooth technology to exchange a temporary key code, which changes every 15 minutes. This information is stored on each phone, rather than in a centralized database. When a user reports a positive test result, the temporary key codes are used to notify other users of their potential exposure (Zastrow 2020). This solution serves the intended purpose of notifying users who may have been exposed to the virus, while being relatively privacy-preserving. It cannot be used to enforce quarantine or to track or identify individuals, limiting the scope for surveillance and other inappropriate uses (Privacy International 2020).

In contrast, some countries have taken a more expansive approach to contact tracing, in which more sensitive information was collected and used and, in some instances, shared with law enforcement agencies. In Israel, mobile phone location data were used by the domestic security agency to identify individuals exposed to the virus and by the police to enforce quarantine, a practice which was subsequently challenged as unconstitutional in the Supreme Court (Amit et al. 2020; Bradford et al. 2020). India’s contact-tracing app stores location data alongside a set of demographic information, including age, gender, phone number, and travel history, on a centralized server. It was initially mandatory for air and rail travel as well as for government employees to return to work (Arun 2020; Bradford et al. 2020; Privacy International 2020).

Conclusion

Around the world, more and more data are produced in the public sector, yet more value could be reaped from these data.
In this paper, we have presented a conceptual framework and empirical arguments suggesting why the returns to existing public sector data are not maximized. We show that for data to yield returns for governments they must have certain features: adequate spatial and temporal coverage (complete, frequent, and timely), high quality (accurate, comparable, and granular), ease of use (accessible, understandable, and interoperable), and safety of use (impartial, confidential, and appropriate).

We substantiate and validate these features through case studies covering a wide range of countries, topics, data types, and data uses, an approach we acknowledge may not satisfy all readers and may appear prescriptive and opinion-based. These case studies, we argue, illustrate how the 12 features matter in practice for generating value from data.

Too often governments or other public sector institutions are concerned with producing a particular dataset, with less of an eye to how those data will generate value. It is imperative for governments not only to invest in more data, but also to ensure that data possess the right features. Often the value derived from data is determined by the features least present, emphasizing the need for a comprehensive analysis of which of the 12 features outlined in this paper may be lacking. This can ensure that the whole chain from data production to data use is in place and the data can come closer to realizing their value for improved wellbeing of the population and economic development.

Notes

Development Data Group, The World Bank Group (@worldbank.org), 2121 Pennsylvania Avenue, NW, Washington, DC 20433, USA. Corresponding author: Daniel Gerszon Mahler (dmahler@worldbank.org).
The author ordering was constructed through American Economic Association’s randomization tool (confirmation code: qhPJIjX8dDgr). All authors are with the Development Data Group of the World Bank. The authors are thankful for comments and feedback received from Andrew Dabalen, Gero Carletto, Kathleen Beegle, Lucas Kitzmuller, Luis Alberto Andres, Paolo Verme, Robert Cull, Tim Kelly, Umar Serajuddin, and Vivien Foster. The authors are also thankful to the editor and three anonymous reviewers of the World Bank Research Observer for valuable comments. In addition, they are thankful for the many colleagues who provided and reviewed case studies including Ann-Sofie Jespersen, Aparajita Goyal, Audrey Ariss, Benjamin David Roseth, Brian Banks, Emilia Galiano, Elizabeth Goldman, Florence Kondylis, Frederic Meunier, Hana Brixi, Isis Gaddis, Joao Pedro Wagner De Azevedo, Katy Sill, Madalina Papahagi, Marc Tobias Schiffbauer, Marelize Gorgens, Maria Poli, Natalia Baal, Natasha Rovo, Megumi Kubota, Paul Andres Corral Rodas, Sabina Alkire, Sonali Vyas, Stephane Hallegatte, Stephanie Jamilla, Tea Trumbic, Theresa Beltramo, Utz Johann Pape, and Zelalem Yilma Debebe.

1. Arguably, our framework can still speak to the features under which these other data types can be repurposed and used for public good, but a sufficiently in-depth discussion of these data types is beyond the scope of this paper.
2. See Dillon et al. (2020) for another example of a data production function.

References

Africa Freedom of Information Centre. 2018. “Eyes on the Contract: Citizens Voice in Improving the Performance of Public Contracts in Uganda.” https://africafoicentre.org/download/eyes-on-the-contract-citizens-voice-in-improving-the-performance-of-public-contracts-in-uganda/.
Amaya, A., P. P. Biemer, and D. Kinyon. 2020. “Total Error in a Big Data World: Adapting the TSE Framework to Big Data.” Journal of Survey Statistics and Methodology 8 (1): 89–119.
Amit, M., H. Kimhi, T.
Bader, J. Chen, E. Glassberg, and A. Benov. 2020. “Mass-Surveillance Technologies To Fight Coronavirus Spread: The Case of Israel.” Nature Medicine 26 (8): 1167–9.
Arezki, R., D. Lederman, A. A. Harb, N. Y. L. W. Elmallakh, Y. Fan, A. M. Islam, H. A M. Nguyen, and M. Zouaidi. 2020. “Middle East and North Africa Economic Update, April 2020: How Transparency Can Help the Middle East and North Africa.” Country Economic Memorandum No. 147545, World Bank, Washington, DC. http://documents.worldbank.org/curated/en/343911586470772558/Middle-East-and-North-Africa-Economic-Update-April-2020-How-Transparency-Can-Help-the-Middle-East-and-North-Africa.
Arun, C. 2020. “India’s Contact Tracing App Is a Bridge Too Far.” Council on Foreign Relations. Digital and Cyberspace Policy Program, September 2. https://www.cfr.org/blog/indias-contact-tracing-app-bridge-too-far.
Banerjee, A., E. Duflo, C. Imbert, S. Mathew, and R. Pande. 2020. “E-Governance, Accountability, and Leakage in Public Programs: Experimental Evidence from a Financial Management Reform in India.” American Economic Journal: Applied Economics 12 (4): 39–72. https://www.aeaweb.org/articles?id=10.1257/app.20180302.
Biemer, P. P. 2010. “Total Survey Error: Design, Implementation, and Evaluation.” Public Opinion Quarterly 74 (5): 817–48.
Bosio, E., and S. Djankov. 2020. “How Large Is Public Procurement.” Let’s Talk Development, February 5. https://blogs.worldbank.org/developmenttalk/how-large-public-procurement.
Bradford, L., M. Aboy, and K. Liddell. 2020. “COVID-19 Contact Tracing Apps: A Stress Test for Privacy, the GDPR, and Data Protection Regimes.” Journal of Law and the Biosciences 7 (1): .
Carletto, C., A. Dillon, and A. Zezza. 2021. Agricultural Data Collection to Minimize Measurement Error and Maximize Coverage. Policy Research Working Paper No.
9745, World Bank, Washington, DC.
Center for Open Data Enterprise. 2021. “Open Data Impact Map.” Accessed August 10, 2021. https://www.opendataimpactmap.org/usecases.
Croatian Bureau of Statistics. 2016. “Croatia Small-Area Estimation of Consumption-Based Poverty.” https://razvoj.gov.hr/UserDocsImages//Istaknute%20teme/Kartom%20siroma%C5%A1tva//Croatia%20Small-Area%20Estimation%20of%20Consumption-Based%20Poverty%20(Poverty%20Maps).pdf.
Croatian Bureau of Statistics. 2017. “Index of Multiple Deprivation: Conceptual Framework for Identifying Lagging Municipalities and Towns in Croatia.” https://razvoj.gov.hr/UserDocsImages//Istaknute%20teme/Kartom%20siroma%C5%A1tva//Index%20of%20Multiple%20Deprivation%20-%20Conceptual%20framework_18.06.2019.pdf.
data2x. 2019. “Big Data, Big Impact? Toward Gender-Sensitive Data Systems.” https://data2x.org/wp-content/uploads/2019/11/BigDataBigImpact-Report-WR.pdf.
Dillon, A., D. Karlan, C. Udry, and J. Zinman. 2020. “Good Identification, Meet Good Data.” World Development 127: 104796. https://doi.org/10.1016/j.worlddev.2019.104796.
Dubois, A., and L.-E. Gadde. 2002. “Systematic Combining: An Abductive Approach to Case Research.” Journal of Business Research 55 (7): 553–60.
Eisenhardt, K. M. 1989. “Building Theories from Case Study Research.” Academy of Management Review 14 (4): 532–50.
Eisenhardt, K. M., and M. E. Graebner. 2007. “Theory Building from Cases: Opportunities and Challenges.” Academy of Management Journal 50 (1): 25–32.
Ekhator-Mobayode, U. E., and J. Hoogeveen. 2021. “Microdata Collection and Openness in the Middle East and North Africa: Introducing the MENA Microdata Access Indicator.” Policy Research Working Paper No. 9892. World Bank, Washington, DC.
Global Delivery Initiative. 2019.
“Improving Public Service Delivery in Pakistan through Citizen Feedback.” https://www.globaldeliveryinitiative.org/sites/default/files/case-studies/cs_pakistancitizen_v3a.pdf
Global Partnership for Social Accountability. 2020. “Making Public Contracts Work for People: Experiences from Uganda.” Accessed June 30, 2021. https://www.thegpsa.org/stories/making-public-contracts-work-people-experiences-uganda.
Global Partnership for Sustainable Development Data. 2021. “Value of Data Case Studies.” Accessed August 10, 2021. https://www.data4sdgs.org/resources/value-data-case-studies.
Gore, S. M., T. Holt, and I. P. Fellegi. 1991. “Maintaining Public Confidence in Official Statistics.” Journal of the Royal Statistical Society. Series A (Statistics in Society): 1–6.
Government of the Republic of Croatia. 2019. “Gov’t Launches Changes to Country’s Statistical Subdivision.” News release, January 23. https://vlada.gov.hr/news/gov-t-launches-changes-to-country-s-statistical-subdivision/25178.
Groshen, E. L. 2021. “The Future of Official Statistics.” Harvard Data Science Review 3 (4). https://doi.org/10.1162/99608f92.591917c6.
Hallegatte, S., A. Vogt-Schilb, M. Bangalore, and J. Rozenberg. 2017. Unbreakable: Building the Resilience of the Poor in the Face of Natural Disasters. World Bank, Washington, DC.
Hasanzadeh, K., A. Kajosaari, D. Häggman, and M. Kyttä. 2020. “A Context Sensitive Approach to Anonymizing Public Participation GIS Data: From Development to the Assessment of Anonymization Effects on Data Quality.” Computers, Environment and Urban Systems 83: 101513.
Heffetz, O., and K. Ligett. 2014. “Privacy and Data-Based Research.” Journal of Economic Perspectives 28 (2): 75–98.
Heijlen, R., and J. Crompvoets. 2021. “Open Health Data: Mapping the Ecosystem.” Digital Health 7: 20552076211050167.
Hjort, J., D. Moreira, G. Rao, and J. F. Santini.
2021. “How Research Affects Policy: Experimental Evidence from 2,150 Brazilian Municipalities.” American Economic Review 111 (5): 1442–80.
Jamilla, S., and S. Ruiz. 2020. “Satellite Data Helps Sri Lankan Forest Officers Patrol during Pandemic, at a Safe Distance.” Global Forest Watch, July 23. https://blog.globalforestwatch.org/people/sri-lanka-covid-19-forest-monitoring/.
Jerven, M. 2019. “The Problems of Economic Data in Africa.” In Oxford Research Encyclopedia of Politics. Oxford University Press. https://doi.org/10.1093/acrefore/9780190228637.013.748
Jerven, M., and M. E. Duncan. 2012. “Revising GDP Estimates in sub-Saharan Africa: Lessons from Ghana.” African Statistical Journal 15: 13–24.
Koumane, C. Y., B. B. N. Kalimi, and F. Pirlea. 2019. “Many African Economies Are Larger Than Previously Estimated.” World Development Indicators Stories, September 10. https://datatopics.worldbank.org/world-development-indicators/stories/many-economies-in-ssa-larger-than-previously-thought.html.
Kremer, M. 1993. “The O-Ring Theory of Economic Development.” Quarterly Journal of Economics 108 (3): 551–75.
Machen, R. C., M. T. Jones, G. P. Varghese, and E. L. Stark. 2021. “Investigation of Data Irregularities in Doing Business 2018 and Doing Business 2020: Investigation Findings and Report to the Board of Executive Directors.” WilmerHale.
Masud, M. O. 2015. “Calling Citizens, Improving the State: Pakistan’s Citizen Feedback Monitoring Program, 2008–2014.” Innovations for Successful Societies, Princeton University, Princeton, NJ. https://successfulsocieties.princeton.edu/publications/calling-citizens-improving-state-pakistan%E2%80%99s-citizen-feedback-monitoring-program-2008-E2%80%93
Multidimensional Poverty Peer Network. 2017. “Dimensions.” August, Number 4. https://www.mppn.org/wp-content/uploads/2017/08/Dim_4_ENGLISH_online.pdf
Nigeria Federal Ministry of Water Resources. 2018.
“National Action Plan for Revitalization of the WASH Sector.” June 26. https://waterresources.gov.ng/policy-documents/June 26, 2020.
OECD. 2011. “Measuring Trust in Official Statistics.” https://www.oecd.org/sdd/50027008.pdf
Open Data Watch. 2020. “Data Impact Case Studies.” Accessed November 1, 2020. https://dataimpacts.org/case-studies/.
Privacy International. 2020. “Covid Contact Tracing Apps Are a Complicated Mess: What You Need To Know.” https://privacyinternational.org/long-read/3792/covid-contact-tracing-apps-are-complicated-mess-what-you-need-know
Roseth, B., A. Reyes, and K. Y. Amézaga. 2019. “The Value of Official Statistics: Lessons from Intergovernmental Transfers.” Inter-American Development Bank. https://doi.org/10.18235/0001883.
Statistics Canada. 2017. Data quality toolkit, release date September 27. https://www.statcan.gc.ca/eng/data-quality-toolkit
TECH2. 2018. “Aadhaar Security Breaches: Here Are the Major Untoward Incidents That Have Happened with Aadhaar and What Was Actually Affected.” https://www.firstpost.com/tech/news-analysis/aadhaar-security-breaches-here-are-the-major-untoward-incidents-that-have-happened-with-aadhaar-and-what-was-actually-affected-4300349.html
United Nations. 2019. “United Nations National Quality Assurance Frameworks Manual for Official Statistics.” https://unstats.un.org/unsd/methodology/dataquality/un-nqaf-manual/#UN-NQAF-Manual https://dataimpacts.org/project/health-surveys/.
Vaitla, B., C. Wells, and C. Van Horn. 2015. “Market Data Raise Farmers’ Income.” Data Impacts Case Studies. https://dataimpacts.org/project/market-data-raise-farmer-income/.
Verhulst, S., and A. Young. 2016. “Open Data Impact: When Demand and Supply Meet.” Accessed August 10, 2021. https://thegovlab.org/static/files/publications/open-data-impact-key-findings.pdf.
Wang, Y. C., and K. DeSalvo. 2018.
“Timely, Granular, and Actionable: Informatics in the Public Health 3.0 Era.” American Journal of Public Health 108 (7): 930–4.
Wilkinson, M. D., M. Dumontier, I. J. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.
World Bank. 2015. Open Data for Sustainable Development. Transport and ICT, World Bank, Washington, DC.
World Bank. 2017a. A Wake Up Call: Nigeria Water Supply, Sanitation, and Hygiene Poverty Diagnostic. World Bank, Washington, DC.
World Bank. 2018a. The Role of Digital Identification for Healthcare: The Emerging Use Cases. World Bank, Washington, DC.
World Bank. 2018b. Public Sector Savings and Revenue from Identification Systems: Opportunities and Constraints. World Bank, Washington, DC.
World Bank. 2018c. Private Sector Economic Impacts from Identification Systems. World Bank, Washington, DC.
World Bank. 2020a. Benin, Burkina Faso, Togo and Niger - Second Phase of West Africa Unique Identification for Regional Integration and Inclusion (WURI) Project. World Bank, Washington, DC.
World Bank. 2021a. “Statistical Performance Indicators.” World Bank, Washington, DC. Accessed September 16, 2021. https://www.worldbank.org/en/programs/statistical-performance-indicators.
World Bank. 2021b. “World Development Report 2021: Data for Better Lives.” World Bank, Washington, DC. https://wdr2021.worldbank.org/
WPdx. 2022. “Data Use Impact Desktop Study.” https://www.waterpointdata.org/wp-content/uploads/2022/02/Data-Use-Impact-Desktop-Methodology-and-Results_revised.pdf
Zastrow, M. 2020. “Coronavirus Contact-tracing Apps: Can They Slow the Spread of COVID-19?” Nature. https://doi.org/10.1038/d41586-020-01514-2