Policy Research Working Paper 10598

MiDES: New Data and Facts from Local Procurement and Budget Execution in Brazil

Ricardo Dahis, Bernardo Ricca, Thiago Scot, Nathalia Sales, Lucas Nascimento

Development Economics, Development Impact Group, November 2023

A verified reproducibility package for this paper is available at http://reproducibility.worldbank.org.

Abstract

This paper introduces a new disaggregated and harmonized dataset on public procurement and budget execution by Brazilian subnational entities, which currently covers half of Brazilian municipalities and spans the years 2003–21. This dataset provides key information that was previously unavailable from aggregate data, such as the identities of suppliers, details on purchases of goods and services, and granular information on the life cycle of each expenditure action. It then uses these data to provide new stylized facts about local public finance. First, it shows that about one-quarter of government purchases are locally procured and discusses implications for efficiency. Second, it demonstrates that close to 15 percent of payments exceed the 30-day threshold and that payment timeliness is systematically correlated with the income level of the municipality.

This paper is a product of the Development Impact Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at tscot@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues.
An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

MiDES: New Data and Facts from Local Procurement and Budget Execution in Brazil∗

Ricardo Dahis1,5, Bernardo Ricca2, Thiago Scot3, Nathalia Sales4, and Lucas Nascimento3,5

1 Monash University
2 Insper
3 Development Impact Evaluation (DIME), World Bank
4 PUC-Rio
5 Data Basis

Keywords: Subnational Public Finance; Procurement; Budget Execution
JEL Codes: H50, H72

∗ Dahis: ricardo.dahis@monash.edu; Ricca: bernardoogr@insper.edu.br; Scot: tscot@worldbank.org; Sales: nathalia.msales@gmail.com; Nascimento: lnascimentomorei@worldbank.org. The authors acknowledge generous support from the World Bank DEC Research Support Budget (RSB).

1 Introduction

The last decades have seen a remarkable expansion in the number of subnational units across the world and their roles in providing public services (Gadenne and Singhal, 2014; Grossman and Lewis, 2014; Dahis and Szerman, 2023). As their role in service provision increased, so did their economic relevance – in Brazil, municipal procurement was equivalent to approximately 3% of GDP between 2002 and 2019, or about 25%-30% of total purchases by all levels of government – federal, state, and municipal (Thorstensen and Giesteira, 2021).

Data on local governments are often only available at an aggregate level, e.g. the total amount of purchases of goods and services, the total amount spent on health services, and the total amount of accrued liabilities.
Consequently, many simple yet important questions related to the public finances of these units remain unanswered. Who are the suppliers of local governments, and what are their characteristics (e.g., size, location)? What share of purchases happens through competitive tenders versus non-competitive methods? When competitive auctions take place, what is the degree of competition?

Another crucial dimension of local governments’ finances is how they execute their budgets. While aggregate commitment and spending amounts are easier to come by, we know little about the details of budget execution. For instance, how long does it take to pay suppliers after deliveries are recognized? Is there large variability in payment timeliness across purchases within the same government? Do governments treat suppliers equally regarding payment timeliness? Answers to these questions can shed new light on the effectiveness of local governments and on their interaction with other economic agents. Similarly, when facing liquidity constraints, do governments prioritize the payment of some government functions at the expense of others? Payment delays impose several costs on suppliers and might affect the provision of public goods (Flynn and Pessoa, 2014).
The importance of this issue is reflected in several recent regulations and initiatives that governments implemented in an attempt to shorten payment terms.1 Yet, the scarcity of granular data on the stages of the budget execution renders the answers to these questions elusive.2

1 Examples include the QuickPay initiative, launched in 2011 in the United States; Regulation 113 of the Public Contracts Regulations, passed in 2015 in the United Kingdom; and the introduction of the Centralized Payment Platform (PPC) in 2020 in Chile.
2 According to Potter et al. (1999): “For the fiscal economist seeking to monitor budget execution, choosing which stage(s) of the expenditure management procedure to monitor is often constrained by information availability. In principle, the data given at the verification stage may be particularly relevant because they measure the actual liability of the public entity and thus the accrued account liability. For example, if bills are verified promptly when they arrive, it allows a good measure of the potential arrears, where strict cash limits constrain the amounts available to make payments. But such information is rarely available.”
3 The English translation of Microdados de Despesas de Entes Subnacionais is Subnational Entities’ Expenditure Microdata.

In this paper, we introduce a new dataset on municipal public finance in Brazil (MiDES, Microdados de Despesas de Entes Subnacionais) that allows users to answer several of these questions.3 We collect, clean, and harmonize microdata on public procurement and budget execution that covers more than half of the municipalities of Brazil and represents over 40% of the country’s population. Our procurement dataset allows users to see information on specific tenders, such as the number, reserve price, and description of items being procured, the number of participants in competitive tenders, and the identity of participants and winning parties.
On the budget execution side, the data include information on each commitment, verification (an important step in budget execution, when buyers recognize that a good or service was delivered), and payment, again allowing users to see the identity of payees and the amount and date of each step of the budget execution. In particular, we can compute the time to pay a particular transaction as the time elapsed between the verification and payment stages.

The dataset we build is fully and publicly available on the Data Basis (Base dos Dados) platform (Dahis et al., 2022).4 The platform provides high-quality data at scale, with tools such as a curated search engine and an SQL-powered data lake where tables share a unified schema. The platform allows users to seamlessly query and merge hundreds of tables, across a variety of themes, directly on Google BigQuery.5 All code used to generate our dataset is publicly available on GitHub.6

We collect our data from State Audit Courts (Tribunais de Contas dos Estados, TCEs). These courts are independent institutions that supervise the public finances of the municipalities of their states. One important concern is the quality of the data: are municipalities providing accurate information on procurement and budget execution, or are they providing incomplete and selective information?
We test the quality of our data by generating aggregates from our microdata and comparing them with information from the Brazilian Public Sector Accounting and Tax Information System (Sistema de Informações Contábeis e Fiscais do Setor Público Brasileiro, SICONFI), a dataset maintained by the National Treasury to “facilitate the production and analysis of accounting and tax information, standardize consolidation mechanisms and increase the quality and reliability of accounting information, financial and fiscal statistics received from municipalities, states, the Federal District and the central government.” We show that our aggregates closely match those of the SICONFI, not only at the municipality-year level but often at the more disaggregated level of function, which is a classification that groups expenditures according to their purpose (e.g., health, education, security).

4 Available as the Microdados de Despesas de Entes Subnacionais (MiDES) dataset at https://basedosdados.org/dataset/d3874769-bcbd-4ece-a38a-157ba1021514.
5 In Section A.1, we illustrate how the dataset can be seamlessly queried to perform descriptive analyses using R.
6 Available at https://github.com/basedosdados.

We exemplify possible applications of our data by exploring two features of municipal public finances. First, we evaluate the extent to which local governments buy from suppliers located in the same municipalities. García-Santana and Santamaría (2023) show compelling evidence of home bias in the European Union and discuss its implications for the value-for-money of public procurement. We show how our microdata allow us to identify each government supplier and identify their location using publicly available data from the Ministry of the Economy. We then document that approximately 25% of purchases are from suppliers located in the same municipality, but with wide variation across entities.
Local purchases are less prevalent in smaller municipalities, which could be driven by the scarcity of local suppliers, but they are no more or less common when we compare competitive vs. non-competitive purchase modalities.

Second, we document the extent of delays in payments to suppliers across municipalities in the country and how they correlate with municipal characteristics. Payment delays are considered one of the key barriers to the participation of small and medium-sized enterprises (SMEs) in public procurement, since SMEs often have less access to working capital loans and are unable to wait long periods before being paid by their clients (Barrot and Nanda, 2020; Breza and Liberman, 2017; Barrot, 2016; Conti et al., 2021). As a rule, public entities in Brazil are required by law to pay their suppliers within 30 days.7 However, late payments are common: we document that 15% of all payments to suppliers of goods and services in recent years are made after more than 30 days, and 20% of municipality-year observations have an average payment delay over 30 days. Payment timeliness is also systematically correlated with local per capita GDP, with higher-income municipalities paying faster on average. The granularity of the data allows us to show that aggregate quantities, such as the average payment delay, might hide strong variability in payment timeliness across suppliers and types of purchases. This is an instance in which microdata are crucial for a proper assessment of the quality of the budget execution process of a municipality.

The remainder of this paper is organized as follows. In Section 2, we provide further institutional details on procurement and budget execution in Brazil and present some statistics from our datasets.
We then proceed to validate the quality of our budget execution data in Section 3, producing metrics of internal consistency as well as comparing aggregates constructed from microdata to information available in SICONFI. In Section 4, we illustrate two applications of this new dataset. Finally, we conclude in Section 5 by discussing other avenues of research using these new data.

7 The Public Procurement Law (Law 8,666) contains most of the public procurement regulation. Article 40 describes the rules that govern payment conditions. See also here for a summary of payment rules.

2 Institutional context and data

Our dataset on public procurement and budget execution of Brazilian municipalities is constructed from data that TCEs assemble. The mission of these audit courts is to oversee the fiscal policy of states and municipalities, which includes taxation, spending, and budget execution. TCEs are present in each of the twenty-six Brazilian states plus the Federal District.8 All data used to construct our dataset are publicly available, often in transparency portals provided by the TCEs.9 The structure, coverage, and quality of the raw data vary across states and often across municipalities in the same state.10 For those reasons, the data require extensive upfront work to be harmonized, cross-checked, and cleaned for usage.

8 In the states of Rio de Janeiro and São Paulo, the TCEs oversee all municipalities apart from the capitals, which have their own audit court. Three states – Bahia, Goiás, and Pará – have two audit courts: one that oversees the state government and one that oversees all municipal governments within the state.
The dataset we build is fully and publicly available on the Data Basis (Base dos Dados) platform.11 This is an ongoing project that currently covers 3,076 (out of 5,570) municipalities in 7 of the 27 federative units, and we expect to keep expanding the data to the extent possible.12 We provide all code used to download, clean, and produce the final datasets on GitHub.

Below we provide more explanation of the institutional context and key variables related to public procurement and budget execution.

2.1 Public procurement

Public procurement is the process through which governments acquire goods and services. As a rule, governments are expected to run competitive tenders when buying goods and services. The specific method used, such as reverse auctions, invitations to tender or framework agreements, will depend on the nature of the object to be acquired and its estimated value. In exceptional circumstances (such as when there is only one feasible supplier or during an emergency) or when purchase values are small, officials can waive tenders and directly contract with suppliers (see Table A.2 for more details).

Our municipal public procurement dataset is organized into three tables: tender, tender-item, and tender-participant.

In the tender table, one observation is a tendering procedure that some agency in a given municipality executes. Tenders often comprise several items, so they should be thought of as a batch of items that are separately awarded – one tender might contain, for example, ten different items that are purchased from four different suppliers. Each tender often includes a general description of the items being purchased (like “purchase of cleaning goods for city hall” or “hiring of mechanical services”), classifiers for groups of goods/services, the modality used for the purchase, and the status of the tender (if it was completed, canceled or deserted, for example).
It will also sometimes include the total estimated or budgeted value for that batch of goods. Each tender process is uniquely identified by the variable id_licitacao_bd.

In the tender-item table, each observation is an item, linked to a specific tender. Items will often include a textual description of the specific purchase (e.g., “cleaning detergent 0.5L” or “accounting services”), quantities, unit prices and total prices. In some cases, the item will separately describe the “quoted” prices, before the execution of the tender, and the price for the winning proposal. The item dataset also includes the name and document number of the winning entity, which can be a firm or individual. Each item is uniquely identified by the variable id_item_bd and can be connected to tenders by the variable id_licitacao_bd.

9 We provide links to all the raw data we use in Table A.1.
10 In Appendix B we discuss in detail some aspects of the quality of the final harmonized dataset, such as the existence of unique identifiers to link different stages of the budget execution process.
11 Available as the Microdados de Despesas de Entes Subnacionais (MiDES) dataset at https://basedosdados.org/dataset/d3874769-bcbd-4ece-a38a-157ba1021514.
12 The first seven states included in the dataset were chosen due to data availability, granularity and ease of access. For some states we could not find any public data on the website of TCEs; in others, data were available for download but required complex scraping; yet others provide public data for download but the quality is low, e.g. no detailed budget execution information.

Finally, in the tender-participant table each observation is a participant in a given tender. Some municipalities will only provide winners in each tender, so in that case these will be the same as those listed in the tender-item table. Others will include all participants in a tender, including those that did not win any contracts.
The data include names and document numbers of each participant, as well as an indicator of whether they are winners. Each participant is uniquely identified by the pair of variables documento and razao_social, and can be connected to tenders by the variable id_licitacao_bd.

2.2 Budget execution

Budget execution refers to the implementation of the annual budget that is approved by the local legislature. While much of the analytical work presented in this paper focuses on budget execution of public procurement, the tables on budget execution include all activities performed by municipalities, including payment of salaries, transfers, and others.13

The budget execution process in Brazil is similar to that in other countries.14 It consists of three distinct steps: commitment, verification, and payment. The commitment (empenho) phase is the moment when governments set aside part of the budget appropriated to them for a specific activity, such as buying goods from a supplier or paying salaries of health workers. From a budgetary perspective, this is often seen as the moment an expenditure is recognized, since committed amounts are deducted from the budget appropriation. The second step of the budget execution is the verification (liquidação). It occurs when the government acknowledges that a certain service or good has been provided. This is the equivalent of recognizing a debt with a provider and is considered an expenditure from an accrual accounting point of view. In fact, if an expenditure was verified but not paid, it is recorded as accounts payable (restos a pagar processados), and, as with firms, increases in this amount might reflect a deterioration in the ability of governments to meet short-term obligations.15 The final step of the budget execution is the payment (pagamento), when governments transfer the money to their suppliers.
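To make the accrual logic above concrete, the following is a minimal sketch of how verified-but-unpaid amounts could be tallied per commitment. It is only an illustration under stated assumptions: the field name value and all rows are hypothetical, only the identifier id_empenho_bd comes from the dataset description, and the sketch ignores annulments and fiscal-year cut-offs that the legal definition of restos a pagar processados would require.

```python
from collections import defaultdict

def outstanding_payables(verifications, payments):
    """Return {commitment id: verified minus paid} for positive balances only."""
    verified = defaultdict(float)
    paid = defaultdict(float)
    for v in verifications:
        verified[v["id_empenho_bd"]] += v["value"]
    for p in payments:
        paid[p["id_empenho_bd"]] += p["value"]
    # A commitment with no payment rows simply has paid == 0.0.
    return {
        key: verified[key] - paid[key]
        for key in verified
        if verified[key] - paid[key] > 0
    }

# Hypothetical rows; "value" is an assumed field name, not the real schema.
verifications = [
    {"id_empenho_bd": "E1", "value": 100.0},
    {"id_empenho_bd": "E2", "value": 50.0},
    {"id_empenho_bd": "E2", "value": 30.0},
]
payments = [
    {"id_empenho_bd": "E1", "value": 100.0},
    {"id_empenho_bd": "E2", "value": 50.0},
]

payables = outstanding_payables(verifications, payments)
```

In this toy input, commitment E1 is fully paid and drops out, while E2 retains a balance that has been verified but not yet paid.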
13 Brazil adopts a budget classification system (see here for more details) in which the economic classification of expenses (elemento de despesa) comprises 69 groups that are identified by a two-digit code. For the payment-delay analyses in this paper, we restrict the data to three groups related to the purchase of goods and materials: consumption material (code 30), material for free distribution (code 32), and equipment and permanent material (code 52).
14 For an overview of recommended budget execution practices and a cross-country comparison, see Potter et al. (1999), Chapter 4.
15 As of December 2022, the local and state governments registered as accounts payable to suppliers of goods and services (account Fornecedores e Contas a Pagar a Curto Prazo) a total of R$ 75.6 billion (or 0.76% of the GDP).

Our municipal budget execution dataset is organized into three tables: commitment, verification, and payment.

In the commitment table, one observation is a commitment by one agency in a given municipality. Commitments vary from very large expenses, such as the commitment for the entire wage bill of the mayor’s office in a month, to very specific commitments such as the acquisition of replacement parts for a car. Each commitment is often linked to four levels of “functional programming” that map the nature of a commitment (such as Transportation > Road Transportation > Road Recovery > Recovery of a specific road in a given street), as well as a text that explicitly describes the nature of the commitment.16 After an initial commitment, officials can increase or decrease the amounts committed as well as annul them. Across municipalities in our dataset, the quality of tracking these actions after initial commitment varies, but we include initial, increases, annulments, adjustments, and net amounts for each commitment when these are available.
Each commitment is also linked to a unique identifier (id_empenho_bd) that allows users to connect a commitment to the verifications and payments linked to it.

In the verification table, observations are verifications by some agency in a given municipality, which are always linked to a specific commitment. The key information available for each verification is the date when it happens, the initial value verified, any adjustments to the original value and the final value. One commitment might generate one or more verifications, so each verification is uniquely identified by the variable id_liquidacao_bd and can be connected to commitments by the variable id_empenho_bd.

Finally, for the payment table, each observation is a payment made to a specific entity. Payments can often (but not always) be linked back to a specific verification. The key information available in the payments table is the date of payment and values. The dataset also includes variables with the names and “document number” of payees - these are often unique national identifiers for individuals (CPF) and firms (CNPJ) but are not available for all payments or municipalities. Again, one verification event can lead to one or multiple payments, so payments are uniquely identified by the variable id_pagamento_bd and can be linked back to verification events by id_liquidacao_bd and to commitments by id_empenho_bd.

2.3 Coverage and descriptive statistics

Our dataset on municipal procurement and budget execution currently covers seven of Brazil’s twenty-seven federative units (twenty-six states and the Federal District), highlighted in Figure 1: Ceará (CE), Minas Gerais (MG), Paraíba (PB), Pernambuco (PE), Paraná (PR), Rio Grande do Sul (RS), and São Paulo (SP). These are large states that cover a substantial share of the total number of municipalities (55%), population (48%), and GDP (49%) of the country according to 2020 data.
Notably, our dataset only covers states in the South, Southeast and Northeast regions - data from the states in the North and Center-West regions are currently not available.17

16 For a detailed discussion of expenditure functions see Manual do Orçamento (in Portuguese).

We provide further details on the geographical and temporal coverage of the dataset in Table 1. Starting with geographical coverage, our budget execution tables (commitment, verification, and payment) are available for all seven states. The procurement data are less comprehensive: the dataset currently includes no procurement data for SP, and the data for PB and PE include information on tenders and participants, but not on the more disaggregated level of items. In terms of temporal coverage, most of our budget execution data starts in the early- to mid-2000s, with the exception of PE (2012), PR (2013), and MG (2014), and currently runs until 2021. Once again, data on procurement are less comprehensive and, with the exception of CE (2009-2021), start in the mid-2010s.

In Table 2 and Table 3, we provide simple descriptive statistics from our public procurement and budget execution datasets, respectively. For procurement, across the six states covered, we observe over 2.4 million unique tenders and almost 800,000 unique suppliers. Since we observe microdata on each tender and often items, we are able to compute statistics such as the share of tenders that are deserted and/or unsuccessful (e.g., no bidders in a competitive tender) and the average number of items listed in a tender. We can also document, for example, that across states approximately 30%-40% of tenders are not competitive auctions – meaning they are directly awarded to suppliers by means other than an auction – but in terms of total purchase value these non-competitive tenders always represent less than 20% of total amounts.
This is consistent with the fact that, similar to other countries, Brazilian law allows small purchases to be performed without competitive auctions (Fazio, 2022). We can also compute measures of competitiveness in tenders, such as the number of participants per tender - the average fluctuates between two and four across the municipalities in our sample. The dataset often contains unique identifiers for each supplier, which in Brazil differ depending on whether the supplier is an individual or a firm, allowing us to compute the share of suppliers identified as firms (which varies substantially across states, from less than 60% in CE and PB to over 80% in MG).

17 Municipalities that are in our sample are slightly richer and more educated than those that are not. In 2019, the average GDP per capita (literacy rate) in our sample was 26.8 thousand BRL (87.7%), while in the municipalities that are not in our sample it was 22 thousand BRL (82.5%). Population size, however, does not differ across the samples (37.7 thousand inhabitants in both).

Statistics for the budget execution dataset are presented in Table 3. Our budget execution dataset includes over 880 million observations - over 250 million commitments, around 300 million verifications and over 300 million payments. For all three stages of the budget execution, we present the total number of observations and the total number of distinct events – in some cases, we are unable to assess whether two commitment observations, for example, refer to the same commitment or not. In those cases, we set our respective identifier variable to missing to flag to users that we are unsure whether these are unique events that can be tracked across datasets. Using the budget classification discussed previously, we estimate that between 25% and 35% of total budget commitment events are related to the procurement of goods and materials - this is a sample we exploit in more detail in the following sections.
We also show that for an overwhelming proportion of commitments, values are non-zero and can be matched to some verification and payment – allowing us to track the entire cycle of budget execution. We also highlight that our dataset currently encompasses over 3.5 trillion BRL in payments (in 2021 prices), or the equivalent of 38% of GDP in 2021, made to over 9 million unique agents (identified by national tax IDs).

We provide an illustrative example of the nature of our data in Figure 2, highlighting a case in which we can track the entire process of public procurement and budget execution.18 In this figure we zoom in on one case involving the town hall of Abatiá – a municipality with fewer than 10,000 inhabitants but relatively high income in the state of Paraná – which initiated a tender to procure uniforms for basic health employees, identified by ID 1336360. The tender was officially published in November 2018, with an initial budget of 17,116 BRL. The auction took place on December 13 of the same year, during which bids were received for 23 items, comprising different types of t-shirts and bags. The tender resulted in two winners, one for supplying t-shirts and another for supplying bags. Subsequently, the tender was officially homologated on December 18.

One day after the tender’s homologation, four separate commitments were made – meaning that the local government set aside funds to pay for the purchases once goods were delivered. Verification of delivery happened in mid-May 2019, five months after the end of the tender. Less than 30 days later, three separate payments were made to different agents that won the contracts, with total values of 11,377 BRL and 3,890 BRL. For the purchase of this relatively simple good, a total of 235 days elapsed between the tender publication and the last payment.
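The kind of life-cycle tracking described above can be sketched in a few lines. The example below is an illustration rather than the procedure used in this paper: only the identifier names (id_empenho_bd, id_liquidacao_bd, id_pagamento_bd) are taken from the dataset description, while the date field, the rows, and the 30-day flag applied to them are hypothetical. It links each payment to its verification and measures the days elapsed between the two stages.

```python
from datetime import date

# Hypothetical rows; dates are invented and "date" is an assumed field name.
verifications = [
    {"id_liquidacao_bd": "L1", "id_empenho_bd": "E1", "date": date(2019, 5, 15)},
    {"id_liquidacao_bd": "L2", "id_empenho_bd": "E2", "date": date(2019, 5, 15)},
]
payments = [
    {"id_pagamento_bd": "P1", "id_liquidacao_bd": "L1", "date": date(2019, 6, 10)},
    {"id_pagamento_bd": "P2", "id_liquidacao_bd": "L2", "date": date(2019, 7, 1)},
    # Some payments cannot be linked back to a verification.
    {"id_pagamento_bd": "P3", "id_liquidacao_bd": None, "date": date(2019, 7, 2)},
]

def days_to_pay(verifications, payments):
    """Return {payment id: days between verification and payment}."""
    verified_on = {v["id_liquidacao_bd"]: v["date"] for v in verifications}
    delays = {}
    for p in payments:
        v_date = verified_on.get(p["id_liquidacao_bd"])
        if v_date is None:
            continue  # skip payments without a matching verification
        delays[p["id_pagamento_bd"]] = (p["date"] - v_date).days
    return delays

delays = days_to_pay(verifications, payments)
# Share of linked payments made more than 30 days after verification.
late_share = sum(d > 30 for d in delays.values()) / len(delays)
```

Dropping unlinked payments, as done here, is one possible choice; a user could instead report them separately to gauge how much of the sample the delay measure covers.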
The example above shows the potential of this newly constructed dataset in providing researchers, policy makers, and civil society with a granular view of how local governments in Brazil acquire goods and services and execute their budgets. It also illustrates the potential for similar datasets to be developed in other countries where scattered data might exist but require upfront investment to be collected, cleaned, and harmonized.

18 We note that this connection between public procurement and budget execution is currently only possible for the state of Paraná (PR), which provides a table connecting tender IDs to commitment IDs. Approximately 78% of the procurement IDs and 38% of total commitment IDs can be found within this correspondence table.

3 Validation

The flagship dataset for information on municipal public finances in Brazil is the Sistema de Informações Contábeis e Fiscais do Setor Público Brasileiro (SICONFI).19 SICONFI contains self-reported information on municipalities’ revenues, expenditures, and balance sheets starting in 1989, with details on amounts per category and budget execution phase (i.e. commitment, verification, payment). These data have been extensively used and validated in empirical research using data on public finances in Brazil (Gadenne, 2017; Corbi et al., 2019; Shamsuddin et al., 2021). Moreover, the federal government performs several checks to guarantee an adequate level of quality. Our data are much more granular than SICONFI, measuring individual commitments, verifications and payments, but once we aggregate at levels such as municipality-year we should expect to match totals from SICONFI. A natural validation of the quality of our dataset therefore is to compare our aggregates with those provided by SICONFI.20 We perform the following exercises.

19 The dataset was formerly called Finanças Brasileiras (Finbra).
First, we aggregate both amounts committed and paid at the municipality-year level in our new dataset and compare these values with information from SICONFI.21 Formally, we compute $D_{mt} = (T_{mt}^{BE} - T_{mt}^{SICONFI}) / T_{mt}^{SICONFI}$, where $T_{mt}^{BE}$ represents total expenditures for municipality $m$ and year $t$ as calculated from our budget execution data and $T_{mt}^{SICONFI}$ represents total expenditures as calculated from SICONFI data.22

In Figure 3, we present the histogram of the percentage deviation of committed amounts from SICONFI, across states. Our key takeaway is that for five states (CE, MG, PB, PE and SP), our aggregates are almost identical to those from SICONFI - for each state, over 75% of deviations are below 1%, and often precisely zero. For PR and RS our deviations are centered around zero but with larger mass slightly above or slightly below - in both states three-quarters of deviations are in the range [-0.5%, 5%], but with more mass for larger absolute deviations in some municipality-years. We also present the same deviations but considering total committed amounts at the municipality-year-function level, where one observation is, for example, the total amount committed by one city for the Health function in 2020. We present these results in Figure A.1, again documenting that our measures of aggregate commitments closely follow those available at SICONFI.

We also provide the distribution of deviations from SICONFI in verification and payment amounts in Figure 4 and Figure 5, respectively. Here our concordance with SICONFI is less precise – while PB and SP show very small deviations, in the remaining states we systematically overestimate total amounts of verification and payments. The distributions of deviations are somewhat similar across states, centered at around 5%-8% and with some mass up to 15%-20% in some municipality-years – meaning that the aggregate amount in our data exceeds that in SICONFI by that amount.
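The deviation measure $D_{mt}$ defined above can be sketched as follows. This is a minimal illustration, not the code in the reproducibility package: the field names municipality, year and value, as well as all numbers, are hypothetical. Microdata rows are summed to municipality-year totals and compared only for cells that also appear in the benchmark, mirroring the restriction to observations available in both sources.

```python
from collections import defaultdict

def deviations(micro_rows, siconfi_totals):
    """Compute D_mt = (T_BE - T_SICONFI) / T_SICONFI per (municipality, year)."""
    totals_be = defaultdict(float)
    for row in micro_rows:
        totals_be[(row["municipality"], row["year"])] += row["value"]
    result = {}
    for key, t_siconfi in siconfi_totals.items():
        # Keep only cells present in both sources with a positive benchmark.
        if key in totals_be and t_siconfi > 0:
            result[key] = (totals_be[key] - t_siconfi) / t_siconfi
    return result

# Hypothetical microdata rows (e.g., individual commitments).
micro_rows = [
    {"municipality": "A", "year": 2020, "value": 60.0},
    {"municipality": "A", "year": 2020, "value": 45.0},
    {"municipality": "B", "year": 2020, "value": 200.0},
]
# Hypothetical benchmark totals keyed by (municipality, year).
siconfi_totals = {("A", 2020): 100.0, ("B", 2020): 200.0, ("C", 2020): 50.0}

d = deviations(micro_rows, siconfi_totals)
```

In this toy input, municipality A exceeds the benchmark by 5%, B matches it exactly, and C is dropped because it has no microdata.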
As reported in Figure A.4 for the states of MG and PR, these deviations do not seem to be driven by specific municipalities or to be systematically different across years, suggesting that there are consistent differences between our measures of verification and payments and those of SICONFI. While we are unable to precisely explain those differences, we expect they are partly driven by amounts carried over across fiscal years, the so-called restos a pagar. In most states we are unable to determine whether payments in a given year were committed and/or verified in the previous year; furthermore, annulments of verifications and cancellations of payments can be common, and are not always precisely measured in our dataset. We believe both of these factors might drive the systematic overestimation of aggregate amounts in our dataset compared to SICONFI. Finally, we also highlight that results in both our dataset and SICONFI are self-reported by municipalities, meaning there could be systematic discrepancies not due to errors in one source, but due to inconsistent reporting by subnational agents.

20 We note that SICONFI contains self-reported information and as a result can also contain errors. Moreover, some differences may arise due to different aggregation methods and the inclusion of government entities besides the municipal executive branch, such as state-owned companies.
21 We use the SICONFI dataset available on the Data Basis (Base dos Dados) platform at https://basedosdados.org/dataset/5a3dec52-8740-460e-b31d-0e0347979da0.
22 We are only able to compute these indicators for municipality-year observations available both in our dataset and SICONFI.

While we document that in some states we systematically overestimate total amounts verified and paid, these deviations $D_{mt}$ are only weakly correlated with municipal characteristics.
We perform a predictive OLS exercise regressing $D_{mt}$ on a list of observables including each municipality’s log population and log GDP per capita. In Table 4 we report results from regressions with only year fixed effects in odd columns and with year and state fixed effects in even columns. In odd columns we find that deviations are negatively correlated with ln(Population) and positively correlated with ln(GDP). For deviations in both commitment and verification amounts, these correlations become much smaller in magnitude and mostly not statistically different from zero once we include state fixed effects. For deviations in payment amounts, we still observe marginally significant correlations (at the 10% significance level) for both population and GDP – in column (6), a one log point increase in GDP correlates with a 0.5 p.p. smaller deviation in payments, or about 12% of the mean. A one log point increase in GDP per capita roughly corresponds to going from p25 to p75 of the distribution, so this is a small magnitude.

In sum, these results suggest that our data’s aggregate deviations from SICONFI totals are partially correlated with municipalities’ characteristics, although only weakly. Researchers using our data may want to control for such characteristics depending on the research question at hand.

4 Applications

4.1 Local firm contracting in public procurement

Using our newly created dataset on local public procurement in Brazil, we investigate how much geographical variation exists in the location of government suppliers. According to García-Santana and Santamaría (2023), government purchases are highly locally concentrated worldwide. This phenomenon may occur due to supply factors such as regional economic specialization and lack of a diverse local production sector, or frictions such as transportation costs, geography or information asymmetries. It may also be explained by demand factors such as buy-local policies.
These policies are ubiquitous across countries, but there is limited evidence regarding whether governments unofficially favor local suppliers even if a formal buy-local policy does not exist. García-Santana and Santamaría (2023) provide compelling evidence that regional governments in Spain and France exhibit home bias, i.e. they favor local suppliers.

In this Section, we document the prevalence of local suppliers across Brazilian municipalities, measuring the share of local governments’ purchases flowing to firms located in the same municipality. No such information currently exists in Brazil due to the lack of consistent data on the identity of municipal government suppliers – an information gap we fill with our dataset. We merge the identity of around 575,000 suppliers in local contracts with information on which municipality they are registered in, given by the Cadastro Nacional de Pessoas Jurídicas (CNPJ), a dataset provided by the Ministry of Economy. We consider a firm local if it operates within the municipality, regardless of whether it is a headquarters or a branch.

In Figure 6, we present the distribution of the share of purchases provided by suppliers located in the same municipality in each state, pooling all purchases from 2014 to 2021. PR and MG are the states in which local contracting is highest on average: the mean municipality purchases almost one-quarter of its goods and services from suppliers located in the same municipality. RS and CE have slightly lower averages but similar distributions. In contrast, PE and PB exhibit considerably lower levels of local firm contracting, with the entire distribution shifted to the left. However, it is worth noting that, within states, we still observe large variation in the levels of local contracting. Take MG, for example, a state with more than 800 municipalities.
According to our data, the municipalities of Baldim and Ibirité procured less than 10% of total purchases from firms located in the same municipality, whereas municipalities like Almenara and Taiobeiras procured over 60% from local suppliers.

An important source of heterogeneity discussed in the public procurement literature, which might also affect the identity of suppliers, is the modality of purchase. In general, purchases can be divided into competitive and non-competitive tenders, as described in Table A.2. The literature on the effects of discretion on procurement outcomes documents that it may have negative impacts on efficiency and corruption (Baltrunaite et al., 2021; Decarolis et al., 2020; Palguta and Pertold, 2017). One explanation is related to favoritism, since a lack of competition may allow government bureaucrats to engage in opportunistic behavior for private benefit by awarding contracts to local and/or connected firms. These supplier firms may not be the most efficient, leading to overpricing and other inefficiencies. Using Brazilian federal procurement data, Fazio (2022) finds evidence that although public agencies use discretion to purchase higher-quality products, they also use it to favor firms that are politically connected, located in the same municipality as the government agency, larger, and older.23

23 In the Hungarian context, Szucs (2023) finds a similar result: winners of high discretion procedures are more likely to be domestically owned. On the other hand, contracts in high discretion procedures tend to be awarded to younger and smaller firms.

In order to investigate this heterogeneity, in Figure 7 we present the distribution of the share of same-municipality suppliers by competitive and non-competitive tenders separately.
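The local-supplier share used in this section, split by tender type, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the figures: the field names (`buyer`, `supplier`, `competitive`, `amount`), the `firm_locations` lookup, and the choice to weight by purchase value are all hypothetical. The only rule taken from the text is that a firm counts as local if it has any establishment, headquarters or branch, in the purchasing municipality.

```python
from collections import defaultdict


def local_share_by_tender_type(purchases, firm_locations):
    """Value-weighted share of purchases from same-municipality firms.

    purchases: iterable of dicts with keys "buyer", "supplier",
        "competitive" (bool) and "amount" (hypothetical layout).
    firm_locations: supplier id -> set of municipalities where the firm
        has any establishment (headquarters or branch).
    Returns {(buyer, competitive): local share between 0 and 1}.
    """
    local = defaultdict(float)
    total = defaultdict(float)
    for p in purchases:
        locations = firm_locations.get(p["supplier"])
        if locations is None:
            continue  # supplier not in the registry: location unknown
        key = (p["buyer"], p["competitive"])
        total[key] += p["amount"]
        if p["buyer"] in locations:
            local[key] += p["amount"]
    return {key: local[key] / total[key] for key in total if total[key] > 0}


# Hypothetical purchases by municipality "M1".
purchases = [
    {"buyer": "M1", "supplier": "S1", "competitive": True, "amount": 100.0},
    {"buyer": "M1", "supplier": "S2", "competitive": True, "amount": 300.0},
    {"buyer": "M1", "supplier": "X9", "competitive": True, "amount": 999.0},
    {"buyer": "M1", "supplier": "S1", "competitive": False, "amount": 50.0},
    {"buyer": "M1", "supplier": "S3", "competitive": False, "amount": 150.0},
]
# Hypothetical registry: S3 is headquartered elsewhere but has a branch in M1.
firm_locations = {"S1": {"M1"}, "S2": {"M2"}, "S3": {"M1", "M2"}}

shares = local_share_by_tender_type(purchases, firm_locations)
print(shares)
```

Suppliers that cannot be matched to the registry are dropped from both numerator and denominator here; treating them as non-local instead would lower the measured share, so the choice should be made explicit in any replication.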
Non-competitive tenders are those classified as a tender waiver (dispensa) or non-requirement tender (inexigibilidade).24 Under the assumption that buying from local suppliers reveals political favoritism and that this is more likely to be enacted in discretionary, non-competitive tenders, we would expect a much larger share of local suppliers in the latter modality. This is not what we document: the average share of purchases from local suppliers is very similar at approximately 20% for both modalities. We do observe more dispersion in non-competitive tenders: while the distribution for competitive ones shows concentration around the mean, in non-competitive tenders we observe both a larger share of municipalities for which almost no suppliers are local and also a larger share with large participation of local suppliers.25

To analyze the supply-side factor, we present the share of same-municipality suppliers for municipalities below and above the median population, as illustrated in Figure 8. The median 2018 population in our dataset is 11,590. Within smaller municipalities, the range of economic activities tends to be narrower and less diverse. This likely translates to a scarcity of local businesses able to adequately meet the requirements of the local government. In addition, even if there is a supplier for some purchase, smaller municipalities might lack local competition, which can result in higher prices being charged by local suppliers in comparison to suppliers located in more competitive markets. So it is possible that policy-makers in those small municipalities might actually encourage the participation of firms from other areas. We document a pattern of local purchases consistent with those conjectures: on average, municipalities with above-median population present a 7 p.p. higher share of local suppliers (25% vs.
18% for those below median), and the entire distribution of local purchases is shifted to the left for smaller municipalities.

4.2 Timeliness in government payments

Another important dimension of procurement practices is the timeliness of payments. Stretched payment terms increase the length of time between the payments for inputs and the receipt of cash from customers, increasing the working capital needs and financial expenses of suppliers. Previous research has documented the importance of trade credit terms for the performance of firms (Checherita-Westphal et al., 2016; Breza and Liberman, 2017). In extreme cases, late payments can lead to default and bankruptcy.

24 The first one consists of cases in which competition would be feasible, but the government chooses not to carry out the tender process, which is possible under a specific threshold established by procurement law. On the other hand, non-requirement occurs when competition is impossible (e.g. only one possible supplier, acquisition of unique goods or hiring of specialized professionals).
25 It is worth mentioning that we can only calculate this measure for suppliers whose national identifier consists of 14 digits, that is, firms with a CNPJ. In our dataset we also have individual suppliers, registered with an 11-digit identifier (CPF). For those cases, we cannot identify whether the purchases are more or less local, but we do know that non-competitive tenders represent a larger share of the total number of purchases for individual suppliers (63%) than for firms (24%). If individual suppliers, which tend to be smaller, are more local, we might expect the distribution in the second panel to be shifted to the right.

Governments across the world often take long to pay their suppliers: procuring entities take on average 100 days to pay firms, with vast variation across countries (Bosio et al., 2022). This has several important implications.
First, government purchases are a large share of the economy and a substantial revenue source for firms, affecting their future growth and employment trajectory (Ferraz et al., 2015). When governments take long to pay their suppliers, they impose an additional financial cost on firms deciding to supply, and potentially exclude small and medium-sized firms that are more likely to be liquidity constrained (Barrot and Nanda, 2020). Second, if firms understand these additional financial costs, they may avoid competing for government contracts or, upon deciding to compete, only accept higher prices that make up for the additional liquidity necessary to finance themselves. In either case, this may reduce the cost-effectiveness of public purchases and/or worsen the quality of goods and services procured. Some governments have explicitly introduced reforms to accelerate payment to suppliers: Barrot and Nanda (2020) discuss the impact of QuickPay, which decreased the payment delay from 30 to 15 days for small businesses in the U.S. in 2011; and Chile introduced the Centralized Payment Platform (PPC, Spanish acronym for Plataforma de Pagos Centralizados) in 2020, which centralized payments from purchasing units to the Treasury and started to enforce a 30-day limit on payments.

In this Section we document payment timeliness across Brazilian municipalities using our new dataset. We highlight that this exercise is only possible by using microdata on the entire budget execution process. In an ideal scenario, we would be able to connect each payment to a single verification, and then compute average payment delays at the verification level. In practice, the majority of payments are connected to one commitment, but not to a verification.
We instead compute payment delays at the commitment level – in cases where one commitment is linked to more than one verification and more than one payment, we compute amount-weighted dates for verifications and payments, and then determine payment delay as the difference between these two dates.26 We restrict our sample to procurement-related commitments, to focus on delays in paying suppliers, and further restrict it to the purchase of goods and materials, since verifications of services are often more complex and numerous.

Procurement law in Brazil determines that payments should be made no later than 30 days after the verification, with a shorter limit of 5 days for bid waiver processes (dispensa de licitação). In Figure 9, we present the distribution of the average payment delay at the municipality-by-year level, where average delays are calculated using the total amount of committed funds as weights (so it can be interpreted as the average delay to pay 1 BRL). In Figure 9a, we show the histogram of our delay measure – the distribution is centered around 15 days, showing that in the majority of municipality-year observations the average payment delay is well below the 30-day limit. We also document, nonetheless, a large right tail of observations with average delays well above 30 days: in Figure 9b, we show that approximately 20% of municipality-years have an average payment delay above 30 days, and many are above 45 or even 60 days. Overall, approximately 15% of the total amount paid in the procurement of goods and materials in recent years is paid after more than 30 days.

26 Details on the methodology to compute delays in complex situations, when one commitment is linked to several verifications and several payments, are discussed in Ricca (2019).

In Figure 10, we present a map of the regions in Brazil included in our paper and color municipalities according to their average payment delay in 2018.
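The commitment-level delay measure described in this section can be sketched as follows. This is a simplified illustration, not the full methodology of Ricca (2019): the function names and example dates are hypothetical, and the sketch assumes each commitment comes with a list of (date, amount) verifications and a list of (date, amount) payments.

```python
from datetime import date


def weighted_date(events):
    """Amount-weighted average date, as a (possibly fractional) ordinal day."""
    total = sum(amount for _, amount in events)
    return sum(day.toordinal() * amount for day, amount in events) / total


def commitment_delay(verifications, payments):
    """Days between the amount-weighted verification and payment dates."""
    return weighted_date(payments) - weighted_date(verifications)


def weighted_mean_delay(commitments):
    """Municipality-year average delay, weighted by committed amount.

    commitments: list of (committed_amount, delay_in_days) pairs.
    """
    total = sum(amount for amount, _ in commitments)
    return sum(amount * delay for amount, delay in commitments) / total


# Hypothetical commitment with two verifications and two payments.
verifications = [(date(2019, 5, 1), 100.0), (date(2019, 5, 11), 300.0)]
payments = [(date(2019, 5, 21), 200.0), (date(2019, 5, 31), 200.0)]
delay = commitment_delay(verifications, payments)

# Hypothetical municipality-year with this commitment and a slower, larger one.
average = weighted_mean_delay([(400.0, delay), (600.0, 40.0)])
print(delay, average, average > 30)
```

Weighting by committed amount means a single large, slow commitment can push a municipality-year above the 30-day limit even when most of its commitments are paid on time, which is why the text interprets the measure as the average delay to pay 1 BRL.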
Consistent with the distributions we plot above, we see that the majority of municipalities, represented in dark and light green, pay their suppliers on average in less than 30 days. Slow-paying municipalities are often spread across the geography, but some clusters are clearly seen: the northeastern part of MG, for example, in the upper part of the "Southeast and South" region, is home to several municipalities that pay on average in more than 30 days.27 This is also a region of lower-income municipalities, which suggests that perhaps payment delays are consistently correlated with local income – either because these are areas of lower state capacity or because local governments face budget constraints, for example.

We first document that, in the raw data, this relationship seems to be present: in Figure 11, we document a negative correlation between a municipality’s per capita GDP and its average payment delay – those with higher incomes pay their suppliers systematically faster. We then present in Table 5 a series of regressions documenting that this relationship is robust to other measures of payment timeliness. In all specifications, we control for the log of population and include state and year fixed effects. In column (1), we show that municipalities with 1 log-point higher income (which is roughly moving from the 1st to the 3rd quartile in the per capita GDP distribution) pay their suppliers on average 1.7 days faster; this is approximately an 8% improvement in payment timeliness relative to the sample mean payment delay of 21 days. While the average payment delay is important, it is possible that suppliers care less about averages and more about the probability of extreme events, such as being paid later than a certain number of days. In column (2), we document that a 1 log-point increase in GDP is correlated with a 3.9 p.p. decrease in the probability of being paid after the 30-day limit, compared to a 19% baseline mean.
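For concreteness, the regressions described for Table 5 can be written in the following form. The notation is illustrative and is not taken from the paper: $y_{mt}$ stands for the timeliness outcome in each column, $s(m)$ for the state of municipality $m$, and it is an assumption that the income and population covariates vary by municipality and year. The two relative magnitudes below simply divide the reported coefficients by the reported baseline means.

```latex
% Illustrative specification (symbols are assumptions, not the paper's notation)
y_{mt} = \beta \,\ln(\text{GDP per capita}_{mt})
       + \gamma \,\ln(\text{Population}_{mt})
       + \alpha_{s(m)} + \lambda_{t} + \varepsilon_{mt}

% Relative magnitudes implied by the reported estimates and baseline means
\frac{1.7 \text{ days}}{21 \text{ days}} \approx 0.08,
\qquad
\frac{3.9 \text{ p.p.}}{19 \text{ p.p.}} \approx 0.21
```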
Higher-income municipalities are also less likely to pay in more than 45 or 60 days, as we document in columns (3) and (4).

Overall, these findings show that, while the majority of municipalities pay their suppliers on average within the time frame determined by law, substantial variation still exists: 15% of payments are made after more than 30 days and payment timeliness seems to be systematically correlated with local per capita income.

27 We consider an alternative measure of payment delays in Figure A.5, the share of payments at the municipality level made after more than 30 days, and observe a similar geographical pattern.

5 Conclusion

This paper introduced MiDES – a new disaggregated and harmonized dataset on Brazilian local procurement and budget execution. We first described the dataset’s basic properties and coverage, and then validated it against the standard aggregated public finance data source in Brazil. We then illustrated the potential uses of this new data in two applications, uncovering new facts about local public finances in Brazil that are only measurable using the granular data we provide. First, we show that, on average, municipalities purchase 15%-25% of their goods and materials from suppliers located in the same municipality. Furthermore, we document wide variation behind that average: several municipalities purchase close to 50% from local suppliers. This average share is not meaningfully different when local officers use non-competitive, discretionary tender methods. But it is systematically correlated with the size of municipalities – those with larger populations tend to buy more locally, perhaps reflecting the existence of a larger pool of suppliers.

Second, we produce new descriptive evidence on delays in payments to government suppliers. We show that approximately 15% of payments are delayed, meaning they are paid after the maximum allowed limit of 30 days.
Furthermore, 20% of municipality-year observations have an average delay above 30 days – suggesting they are systematically late payers. Payment timeliness is also correlated with local per capita GDP, with higher-income municipalities paying their suppliers faster. These findings open several additional research questions. For example, if suppliers know that some municipalities tend to be late payers, they might include that "financial risk" of mismatched assets and liabilities in their decisions when selling to these governments and increase prices. Some firms, particularly smaller and liquidity-constrained ones, might also decide not to sell to these governments in order to avoid that financial risk. Other possible questions are whether these delays are systematically different depending on government functions (are health expenses paid faster than education expenses, for example?); whether they systematically vary with the business and/or political cycles;28 and whether some suppliers benefit from better payment terms than others (Ricca, 2019). All of these have important implications for competition and value-for-money in the public sector and deserve further investigation, which is possible using the granular data we provide.

Several other research questions related to local public finances can be explored using these novel data, particularly when matched with other administrative data available in Brazil. Ash et al. (2021), for example, combine the well-known audit courts data that reveal corruption at the municipal level (Ferraz and Finan, 2011) with aggregate data from SICONFI to predict out-of-sample corruption. The new dataset we provide could be used to compute additional measures of local budget and procurement decisions – such as payment delays or shares of purchases from local and/or politically connected firms – which could improve machine learning models used to predict mismanagement and corruption.
The granular data available can also shed new light on how subnational entities adjust their expenditures throughout the business cycle, the political cycle (Foremny et al., 2018) and in response to fiscal rules that might constrain their policy choices (Carreri and Martinez, 2022). Matched with personnel data, the detailed procurement data we provide could generate new evidence on how the personal traits of state bureaucrats, such as experience and educational attainment, correlate with measures of value-for-money (Best et al., 2023; Fenizia, 2022). More broadly, we expect MiDES to allow researchers to engage with these and many other questions on subnational public finances.

28 In Figure A.6, we document that the share of late payments was much higher in the 2014-2016 period, when Brazil faced a severe recession, and then improved substantially in more recent years.

References

Ash, E., Galletta, S., and Giommoni, T. (2021). A Machine Learning Approach to Analyze and Support Anti-Corruption Policy.
Baltrunaite, A., Giorgiantonio, C., Mocetti, S., and Orlando, T. (2021). Discretion and supplier selection in public procurement. The Journal of Law, Economics, and Organization, 37(1):134–166. Oxford University Press.
Barrot, J. and Nanda, R. (2020). The Employment Effects of Faster Payment: Evidence from the Federal Quickpay Reform. The Journal of Finance, 75(6):3139–3173.
Barrot, J.-N. (2016). Trade Credit and Industry Dynamics: Evidence from Trucking Firms. The Journal of Finance, 71(5):1975–2016.
Best, M. C., Hjort, J., and Szakonyi, D. (2023). Individuals and Organizations as Sources of State Effectiveness. American Economic Review, 113(8):2121–2167.
Bosio, E., Djankov, S., Glaeser, E., and Shleifer, A. (2022). Public Procurement in Law and Practice. American Economic Review, 112(4):1091–1117.
Breza, E. and Liberman, A. (2017). Financial Contracting and Organizational Form: Evidence from the Regulation of Trade Credit.
The Journal of Finance, 72(1):291–324.
Carreri, M. and Martinez, L. R. (2022). Fiscal Rules, Austerity in Public Administration, and Political Accountability: Evidence from a Natural Experiment in Colombia.
Cattaneo, M. D., Crump, R. K., Farrell, M. H., and Feng, Y. (2022). On binscatter.
Checherita-Westphal, C., Klemm, A., and Viefers, P. (2016). Governments’ payment discipline: The macroeconomic impact of public payment delays and arrears. Journal of Macroeconomics, 47:147–165.
Conti, M., Elia, L., Ferrara, A. R., and Ferraresi, M. (2021). Governments’ late payments and firms’ survival: Evidence from the European Union. The Journal of Law and Economics, 64(3):603–627.
Corbi, R., Papaioannou, E., and Surico, P. (2019). Regional Transfer Multipliers. Review of Economic Studies, 86:1901–1934.
Dahis, R., Carabetta, J., Scovino, F., Israel, F., and Oliveira, D. (2022). Data Basis: Universalizing Access to High-Quality Data. preprint, SocArXiv.
Dahis, R. and Szerman, C. (2023). Decentralizing Development: Evidence from Government Splits.
Decarolis, F., Fisman, R., Pinotti, P., and Vannutelli, S. (2020). Rules, discretion, and corruption in procurement: Evidence from Italian government contracting. Technical report, National Bureau of Economic Research.
Fazio, D. (2022). Rethinking Discretion in Public Procurement.
Fenizia, A. (2022). Managers and Productivity in the Public Sector. Econometrica, 90(3):1063–1084.
Ferraz, C. and Finan, F. (2011). Electoral Accountability and Corruption: Evidence from the Audits of Local Governments. American Economic Review, 101(4):1274–1311.
Ferraz, C., Finan, F., and Szerman, D. (2015). Procuring Firm Growth: The Effects of Government Purchases on Firm Dynamics.
Flynn, M. S. and Pessoa, M. (2014). Prevention and Management of Government Arrears. International Monetary Fund.
Foremny, D., Freier, R., Moessinger, M.-D., and Yeter, M. (2018). Overlapping political budget cycles.
Public Choice, 177(1):1–27.
Gadenne, L. (2017). Tax Me, but Spend Wisely? Sources of Public Finance and Government Accountability. American Economic Journal: Applied Economics, 9(1):274–314.
Gadenne, L. and Singhal, M. (2014). Decentralization in Developing Economies. Annual Review of Economics, 6(1):581–604. https://doi.org/10.1146/annurev-economics-080213-040833.
García-Santana, M. and Santamaría, M. (2023). Understanding Home Bias in Procurement. World Bank, Washington, DC.
Grossman, G. and Lewis, J. I. (2014). Administrative Unit Proliferation. American Political Science Review, 108(01):196–217.
Palguta, J. and Pertold, F. (2017). Manipulation of Procurement Contracts: Evidence from the Introduction of Discretionary Thresholds. American Economic Journal: Economic Policy, 9(2):293–315.
Potter, B. H., Diamond, J., and Währungsfonds, I., editors (1999). Guidelines for public expenditure management. International Monetary Fund, Washington, D.C.
Ricca, B. (2019). Procurement payment periods and political contributions: evidence from Brazilian municipalities. Working Paper.
Shamsuddin, M., Acosta, P. A., Battaglin Schwengber, R., Fix, J., and Pirani, N. (2021). Economic and Fiscal Impacts of Venezuelan Refugees and Migrants in Brazil. World Bank, Washington, DC.
Szucs, F. (2023). Discretion and Favoritism in Public Procurement. Journal of the European Economic Association, page jvad017.
Thorstensen, V. and Giesteira, L. F. (2021). Caderno Brasil na OCDE – Compras Públicas. Relatório Institucional, pages 1–49.

Figures and Tables

Figure 1: Coverage of procurement and budget execution data
Notes: This figure presents a map of Brazil with administrative boundaries of its 26 states plus the Federal District. Blue-shaded areas represent states for which full or partial procurement and/or budget execution municipal microdata is currently available in the dataset.
See Table 1 for more details on our data coverage. We do not have data for the capital of the state of SP.

Figure 2: Example of procurement and budget execution process
[Figure: flow diagram for a tender by the town hall of Abatiá (PR) for the purchase of uniforms for basic health and health surveillance unit employees. Elements shown: tender publication (Nov 11 2018, ID 1336360); auction with 23 items, R$ 17.116; budget classification Health, Primary care; tender homologation and awarding to two suppliers (ANSELMO GIL SELINGARDI - ME and SUPRA ACESSORIOS DE INFORMATICA - ME); four commitments (IDs 59967681 to 59967684); and verifications and payments dated between May and June 2019.]
Notes: This figure illustrates the nature of our datasets and how we can follow procurement processes from the tendering stage all the way to the payment of suppliers.

Figure 3: Validation with SICONFI data - commitment
[Figure: histograms by state (panels: CE, MG, PB, PE, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: frequency.]
Notes: This figure presents the percentage deviation in the total amount of budget commitments, at the municipality-year level, between our dataset and SICONFI, the public finance dataset of the Brazilian Treasury. Values are positive whenever the total amount in our dataset, aggregated from individual commitments, is larger than that of SICONFI. See Table 1 for more details on our data coverage. We truncate observations at -25% to the left and at 25% to the right.
Figure 4: Validation with SICONFI data - verification
[Figure: histograms by state (panels: CE, MG, PB, PE, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: frequency.]
Notes: This figure presents the percentage deviation in total amount of budget verifications, at the municipality-year level, between our dataset and SICONFI, the public finance dataset of the Brazilian Treasury. Values are positive whenever the total amount in our dataset, aggregated from individual verifications, is larger than that of SICONFI. See Table 1 for more details on our data coverage. We truncate observations at -25% to the left and at 25% to the right.

Figure 5: Validation with SICONFI data - payment
[Figure: histograms by state (panels: CE, MG, PB, PE, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: frequency.]
Notes: This figure presents the percentage deviation in total amount of budget payments, at the municipality-year level, between our dataset and SICONFI, the public finance dataset of the Brazilian Treasury. Values are positive whenever the total amount in our dataset, aggregated from individual payments, is larger than that of SICONFI. See Table 1 for more details on our data coverage.
We truncate observations at -25% to the left and at 25% to the right.

Figure 6: Distribution of share of local suppliers across different states
[Figure: histograms by state (panels: MG, PR, RS, PB, CE, PE). x-axis: share of same-municipality suppliers; y-axis: frequency.]
Notes: This figure presents the distribution of the percentage of suppliers located within the same municipality where the tender process occurs, regardless of whether it is a headquarters or a branch, for each state. Data are drawn from the tender-participant table, and encompass all types of purchases, including tender waivers, for both products and services. Here, we consider only the winners’ (suppliers) information. We match this dataset with the Cadastro Nacional de Pessoas Jurídicas (CNPJ), a dataset provided by the Ministry of the Economy that contains information on every firm registered in Brazil (we use this information as of 2019, the earliest available year). Red dotted line marks the average value of the distribution. The temporal coverage of the data ranges from 2014 to 2021. See Table 1 for more details on our data coverage.

Figure 7: Distribution of share of local suppliers, by type of purchase
[Figure: two histograms. x-axis: share of same-municipality suppliers, from 0.0 to 1.0; y-axis: frequency. Panels: competitive tender (value labeled 0.20) and non-competitive tender (value labeled 0.22).]
Notes: This figure presents the distribution of the percentage of firms located within the same municipality where the tender process occurs, regardless of whether it is a headquarters or a branch. The distribution is disaggregated by type of purchase - competitive and non-competitive. Data is drawn from the tender-participant table and includes only suppliers who won the tender process and whose identifier has 14 digits.
We match this data with the Cadastro Nacional de Pessoas Jurídicas (CNPJ), a dataset provided by the Ministry of the Economy that contains information on every firm registered in Brazil. The temporal coverage of the data ranges from 2014 to 2021. See Table 1 for more details.

Figure 8: Distribution of share of local suppliers, by population size

[Figure: two histograms. Panels: Population above median and Population below median. x-axis: Share of same-municipality suppliers; y-axis: Frequency.]

Notes: This figure presents the distribution of the percentage of firms located within the same municipality where the tender process occurs, regardless of whether it is a headquarters or a branch. The distribution is disaggregated by municipalities above and below the median population of 2018. Data is drawn from the tender-participant table and includes only suppliers who won the tender process and whose identifier has 14 digits. We match this data with the Cadastro Nacional de Pessoas Jurídicas (CNPJ), a dataset provided by the Ministry of the Economy that contains information on every firm registered in Brazil. Red dotted line marks the average value of the distribution. The temporal coverage of the data ranges from 2014 to 2021. See Table 1 for more details.

Figure 9: Distribution of payment delays at municipality-year level

[Figure: two panels. (a) Histogram; (b) Cumulative Distribution Function (CDF).]

Notes: Panel (a) presents the histogram of average payment delay at the municipality-year level, where average delay is weighted by total committed amount. The dotted line marks the 30-day threshold, the maximum allowed payment delay for procurement in Brazil. The underlying data cover the 2014-2018 period and six states (CE, MG, PB, PR, RS and SP). We are unable to calculate payment delays for PE due to our inability to match payments to their respective commitments.
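The weighting described in the notes to Figure 9 can be illustrated with a minimal base R sketch. It assumes a commitment-level table that already contains a delay in days and a committed amount; the column names `delay_days` and `committed`, the toy values, and the unweighted definition of the share over 30 days are all assumptions made for illustration, not the variables or definitions used in the paper.

```r
# Toy commitment-level data; column names and values are hypothetical.
payments <- data.frame(
  id_municipio = c("A", "A", "A", "B", "B"),
  ano          = c(2016, 2016, 2016, 2016, 2016),
  delay_days   = c(10, 40, 20, 20, 50),
  committed    = c(100, 200, 100, 100, 300)
)

# Summarise one municipality-year group.
summarise_group <- function(g) {
  data.frame(
    id_municipio = g$id_municipio[1],
    ano          = g$ano[1],
    # Average delay, weighted by the committed amount
    avg_delay    = weighted.mean(g$delay_days, w = g$committed),
    # Unweighted share of payments made after more than 30 days (an assumption)
    pct_over_30  = 100 * mean(g$delay_days > 30)
  )
}

# Split by municipality-year and stack the group summaries.
groups <- split(payments, list(payments$id_municipio, payments$ano), drop = TRUE)
delay_by_group <- do.call(rbind, lapply(groups, summarise_group))
print(delay_by_group)
```

In this toy example the large late commitment in municipality B pulls its weighted average above the 30-day threshold, whereas municipality A stays below it despite having one late payment.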
Figure 10: Weighted average payment delay (days)

Notes: This figure presents a map of regions of Brazil where municipalities are colored according to the average payment delay in the procurement of goods and materials. Municipality average delays are weighted by the total amount paid/committed. This indicator is available for the states of RS, PR, SP, MG, PB, CE. We only consider commitments that are fully executed within a fiscal year, that is, the amount committed is equal to the amount verified and the amount paid.

Figure 11: Scatter plot - Average payment delay vs. GDP per capita

Notes: This figure shows a binned scatter plot, with the logarithm of per capita GDP on the x-axis and average payment delays on the y-axis. Municipality average delays are weighted by the total amount paid/committed. This indicator is available for the states of RS, PR, SP, MG, PB, CE. We only consider commitments that are fully executed within a fiscal year, that is, the amount committed is equal to the amount verified and the amount paid. The scatter plot is constructed using the package developed in Cattaneo et al. (2022).

Table 1: Procurement and budget execution coverage

| State | # Munic. | Procurement: Tender | Procurement: Tender Item | Procurement: Tender Participants | Procurement: Temporal Coverage | Budget Execution: Commitment | Budget Execution: Verification | Budget Execution: Payments | Budget Execution: Temporal Coverage |
|---|---|---|---|---|---|---|---|---|---|
| CE | 184 | ✓ | ✓ | ✓ | 2009-2021 | ✓ | ✓ | ✓ | 2009-2021 |
| MG | 853 | ✓ | ✓ | ✓ | 2014-2021 | ✓ | ✓ | ✓ | 2014-2021 |
| PB | 223 | ✓ | | ✓ | 2014-2020 | ✓ | ✓ | ✓ | 2003-2021 |
| PE | 185 | ✓ | | ✓ | 2012-2021 | ✓ | ✓ | ✓ | 2012-2020 |
| PR | 399 | ✓ | ✓ | ✓ | 2013-2021 | ✓ | ✓ | ✓ | 2013-2021 |
| RS | 497 | ✓ | ✓ | ✓ | 2016-2021 | ✓ | ✓ | ✓ | 2010-2021 |
| SP | 644 | | | | | ✓ | ✓ | ✓ | 2008-2021 |
| Total | 3,076 | | | | 2009-2021 | | | | 2008-2021 |

Notes: This table reports temporal and geographical coverage of our dataset. For procurement data, the number of municipalities for PR is 392 due to problems with the conversion of xml files. In the budget execution data, the original data does not include the municipalities of Quixabá and Santa Teresinha in the state of PB.
We could not obtain the data for the São Paulo municipality (state capital), which is supervised by a separate audit court.

Table 2: Descriptive statistics - public procurement

| | CE | MG | PB | PE | PR | RS | Total |
|---|---|---|---|---|---|---|---|
| Number of distinct tenders | 271,345 | 643,442 | 133,201 | 220,025 | 735,632 | 417,348 | 2,420,993 |
| Deserted tenders (%) | - | - | - | 3.1 | 2.6 | 0.5 | 1.2 |
| Unsuccessful tenders (%) | - | - | 0.0 | 6.3 | 3.0 | 0.6 | 1.6 |
| Non-competitive tenders (%) | 27.6 | 31.8 | 38.3 | 38.0 | 45.0 | 45.5 | 38.6 |
| Non-competitive tenders value (%) | 10.0 | 18.0 | 12.0 | 15.0 | 19.0 | 12.0 | 14.9 |
| Has item information (%) | 93.3 | 100 | - | - | 92.0 | 100 | 82.2 |
| Avg. number of items per tender | 25.1 | 25.4 | - | - | 32.2 | 12.4 | 23.8 |
| Has participant information (%) | 94.4 | 96.5 | 100 | 100 | 92.7 | 92.2 | 94.9 |
| Number of distinct participants | 189,738 | 330,745 | 50,236 | 94,864 | 175,024 | 119,322 | 959,929 |
| Number of distinct suppliers | 137,997 | 301,447 | 43,548 | 66,466 | 154,301 | 91,609 | 795,368 |
| Firms among suppliers (%) | 55.4 | 83.2 | 47.8 | 66.4 | 71.6 | 79.0 | 72.3 |
| Competitive tenders | | | | | | | |
| Avg. number of participants per tender | 2.8 | 2.3 | 2.6 | 3.9 | 3.2 | 3.6 | 3.1 |
| Avg. number of suppliers per tender | 1.5 | 2.1 | 2.0 | 1.4 | 2.2 | 2.2 | 1.9 |
| Number of distinct municipalities | 184 | 853 | 223 | 184 | 392 | 497 | 2333 |

Notes: This table presents the descriptive statistics from the public procurement dataset. Deserted tenders refers to a situation where no proposals were submitted by potential bidders in response to a tender notice, while unsuccessful tenders occur when proposals were submitted but did not meet the requirements or the tender process was cancelled or revoked. The variables “Has item information” and “Has participant information” refer to the percentage of tender identifiers with any information related to items or participants, respectively. The definition of non-competitive tender encompasses both dispensa and inexigibilidade. To calculate the percentage of “Non-competitive tenders value” we use the variable valor_corrigido winsorized at percentiles 0.01 and 99.9.
Additionally, the percentage of firms among suppliers is calculated as the number of distinct firms divided by the number of distinct suppliers, where firms are those whose identifier has 14 digits.

Table 3: Descriptive statistics - budget execution

| | CE | MG | PB | PE | PR | RS | SP | Total |
|---|---|---|---|---|---|---|---|---|
| Commitments | | | | | | | | |
| Observations | 7,634,745 | 39,015,657 | 20,323,928 | 8,338,666 | 35,792,154 | 52,368,944 | 90,882,992 | 254,357,086 |
| Distinct commitments | 7,634,743 | 39,015,657 | 20,319,861 | - | 35,792,154 | 52,332,427 | 89,886,641 | 244,981,483 |
| Related to procurement (%) | 26.7 | 30.3 | 19.1 | 19.9 | 29.3 | 26.6 | 31.7 | 28.5 |
| Greater than zero (%) | 97.8 | 96.5 | 98.7 | 95.9 | 96.9 | 96.4 | 96.6 | 96.8 |
| Has verification information (%) | 96.0 | 95.0 | 63.0 | - | 98.0 | 97.0 | 97.0 | 94.0 |
| Has payment information (%) | 82.0 | 89.0 | 95.0 | - | 97.0 | 96.0 | 92.0 | 93.0 |
| Verifications | | | | | | | | |
| Observations | 15,189,831 | 63,757,882 | 13,787,845 | 15,229,631 | 40,457,210 | 65,789,353 | 87,894,462 | 302,106,214 |
| Distinct verifications | - | 63,753,322 | - | - | 40,457,210 | 34,941,675 | 81,410,584 | 220,562,791 |
| Payments | | | | | | | | |
| Observations | 13,930,424 | 64,127,137 | 22,113,067 | 21,449,922 | 52,545,469 | 74,023,098 | 83,436,799 | 331,625,916 |
| Distinct payments | 13,243,955 | 64,127,137 | 22,109,079 | - | 52,545,469 | 72,428,292 | 69,293,735 | 293,747,667 |
| Total amount of payments (billion BRL) | 132.2 | 514.9 | 155.4 | 196.5 | 407.4 | 501.7 | 1691.6 | 3,599.7 |
| Number of distinct sellers | 548,503 | 1,710,005 | 1,326,612 | - | 758,735 | 1,541,115 | 3,141,460 | 9,026,430 |
| Number of distinct municipalities | 182 | 853 | 223 | 184 | 399 | 497 | 644 | 2,798 |

Notes: This table presents the descriptive statistics from the budget execution dataset. The variables “Distinct commitments”, “Distinct verifications” and “Distinct payments” are simply the count of the identifiers that uniquely identify each table - empenho, liquidacao, pagamento. In the states of CE, PB and PE we cannot count the number of distinct verifications because we were unable to build a unique identifier for this table. Likewise for distinct commitments and distinct payments in the case of PE.
The variables “Has verification information” and “Has payment information” refer to the percentage of commitment identifiers with any verification or payment information, respectively, in the current year or later. This means that they can be followed throughout the execution process. The state of PB has a low percentage of commitments with verification information due to different temporal coverage of those two tables: while the former starts in 2003, the latter starts in 2008. The commitments related to procurement restrict the data to three categories: consumption material (code 30), material for free distribution (code 32), and equipment and permanent material (code 52). Also, the number of distinct sellers is missing for PE because we do not have information about suppliers in the payment table. The variable “Total amount of payments” is in 2021 prices.

Table 4: Correlates of deviations

| | Commitment (p.p) (1) | Commitment (p.p) (2) | Verification (p.p) (3) | Verification (p.p) (4) | Payment (p.p) (5) | Payment (p.p) (6) |
|---|---|---|---|---|---|---|
| Variables | | | | | | |
| Log(GDP) | 0.536*** | -0.136 | 1.28*** | 0.007 | 0.569*** | -0.464* |
| | (0.055) | (0.074) | (0.143) | (0.062) | (0.178) | (0.221) |
| Log Population | -0.650*** | 0.200* | -1.37*** | 0.126 | -0.737*** | 0.728* |
| | (0.075) | (0.089) | (0.136) | (0.093) | (0.183) | (0.338) |
| Year Fixed Effects | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| State Fixed Effects | ✗ | ✓ | ✗ | ✓ | ✗ | ✓ |
| Fit statistics | | | | | | |
| Observations | 26,792 | 26,792 | 26,168 | 26,168 | 26,129 | 26,129 |
| Dependent variable mean | 1.0 | 1.0 | 2.5 | 2.5 | 4.2 | 4.2 |
| RMSE | 2.7 | 2.6 | 4.0 | 3.6 | 5.2 | 4.3 |
| R2 | 0.04 | 0.12 | 0.07 | 0.24 | 0.05 | 0.36 |
| Adjusted R2 | 0.04 | 0.12 | 0.07 | 0.23 | 0.04 | 0.36 |

Notes: This table presents results from OLS regressions where the deviation outcomes are defined as $D_{mt} = (T^{BE}_{mt} - T^{SICONFI}_{mt}) / T^{SICONFI}_{mt}$ for each stage (commitment, verification, payment), as described in Section 3. “% Procurement” measures the percentage of all expenditures directed to procurement. Each observation is a municipality-year. The data cover 6 states (CE, MG, PB, PR, RS, SP) and their corresponding years described in Table 1.
Robust standard errors in parentheses. Significance levels: ***: 0.01, **: 0.05, *: 0.1.

Table 5: Correlates of payment delays

| | Average Payment Delay (1) | % Over 30 Days (2) | % Over 45 Days (3) | % Over 60 Days (4) |
|---|---|---|---|---|
| Variables | | | | |
| Log(GDP) | -1.74*** | -3.86*** | -2.83*** | -2.02*** |
| | (0.391) | (0.502) | (0.357) | (0.291) |
| Year Fixed Effects | ✓ | ✓ | ✓ | ✓ |
| State Fixed Effects | ✓ | ✓ | ✓ | ✓ |
| Fit statistics | | | | |
| Observations | 19,314 | 19,314 | 19,314 | 19,314 |
| Dependent variable mean | 20.5 | 19.4 | 10.5 | 6.7 |
| RMSE | 11.1 | 16.1 | 11.3 | 8.4 |
| R2 | 0.11 | 0.15 | 0.12 | 0.10 |
| Adjusted R2 | 0.11 | 0.15 | 0.12 | 0.10 |

Notes: This table presents regressions using different measures of payment delay as dependent variable and log(GDP) as main independent variable. All regressions control for the log of population and include state and year fixed effects. Observations are at the municipality-year level and encompass the period 2014-2020. Standard errors clustered at the state-level in parentheses. Significance levels: ***: 0.01, **: 0.05, *: 0.1.

A Figures and Tables

Figure A.1: Validation with SICONFI data: commitment phase, by function

[Figure: histograms, one panel per state (CE, MG, PB, PE, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: Frequency.]

Notes: This figure presents the percentage difference of commitment data from the budget execution dataset in relation to data on committed expenses from SICONFI, as described in Section 3.
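The percentage difference plotted in the validation figures can be sketched in a few lines of base R. The sketch assumes totals have already been aggregated to the same unit in both sources; the data frame, the column names `total_mides` and `total_siconfi`, and the values are hypothetical, and capping at ±25 mirrors the truncation mentioned in the notes to the main-text validation figures rather than any code from the paper.

```r
# Toy totals per municipality-year; names and values are hypothetical.
totals <- data.frame(
  id_municipio  = c("A", "B", "C", "D"),
  ano           = c(2018, 2018, 2018, 2018),
  total_mides   = c(102, 95, 150, 60),
  total_siconfi = c(100, 100, 100, 100)
)

# Percentage deviation: positive when our aggregated total exceeds SICONFI.
totals$pct_diff <- 100 * (totals$total_mides - totals$total_siconfi) /
  totals$total_siconfi

# For plotting only, cap values at -25% and +25% (the "<-25" and "25>" bins).
totals$pct_diff_plot <- pmax(pmin(totals$pct_diff, 25), -25)

print(totals)
```

Keeping the untruncated `pct_diff` alongside the capped version makes it possible to count how many observations fall in the two extreme bins.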
Figure A.2: Validation with SICONFI data: verification phase, by function

[Figure: histograms, one panel per state (CE, MG, PB, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: Frequency.]

Notes: This figure presents the percentage difference of verification data from the budget execution dataset in relation to data on verified expenses from SICONFI, as described in Section 3.

Figure A.3: Validation with SICONFI data: payment phase, by function

[Figure: histograms, one panel per state (CE, MG, PB, PR, RS, SP). x-axis: % difference from SICONFI data, from <-25 to 25>; y-axis: Frequency.]

Notes: This figure presents the percentage difference of payment data from the budget execution dataset in relation to data on paid expenses from SICONFI, as described in Section 3.
Figure A.4: Validation with SICONFI data across years - payment

[Figure: (a) Minas Gerais (MG): histograms, one panel per year from 2014 to 2021; x-axis: % difference from SICONFI data - MG, from <-25 to 25>; y-axis: Frequency. (b) Paraná (PR): histograms, one panel per year from 2013 to 2020; x-axis: % difference from SICONFI data - PR, from <-25 to 25>; y-axis: Frequency.]

Notes: This figure presents the percentage deviation in total amount of budget payments, at the municipality-year level, between our dataset and SICONFI, the public finance dataset of the Brazilian Treasury. Values are positive whenever the total amount in our dataset, aggregated from individual payments, is larger than that of SICONFI. See Table 1 for more details on our data coverage. We truncate observations at -25% to the left and at 25% to the right.

Figure A.5: Share of payments paid over 30 days (%)

Notes: This figure presents a map of regions of Brazil where municipalities are colored according to the share of payments performed over 30 days. This indicator is available for the states of RS, PR, SP, MG, PB, CE.
We only consider commitments that are fully executed within a fiscal year, that is, the amount committed is equal to the amount verified and the amount paid.

Figure A.6: Distribution of share of late payments (over 30 days)

Notes: This figure presents the cumulative distribution functions (CDFs) of the share of payments that are made over 30 days, at the municipality level, across different years in the period 2014-2020. The underlying data cover six states (CE, MG, PB, PR, RS and SP). We are unable to calculate payment delays for PE due to our inability to match payments to their respective commitments.

Table A.1: Procurement and budget execution sources

| State | Procurement | Budget Execution |
|---|---|---|
| CE | https://api.tce.ce.gov.br/ | https://api.tce.ce.gov.br/ |
| MG | https://dadosabertos.tce.mg.gov.br | https://dadosabertos.tce.mg.gov.br |
| PB | https://dados.tce.pb.gov.br | https://zeoserver.pb.gov.br/portaltcepb/tcepb/servicos/dados-abertos-do-sagres-tce-pb |
| PE | https://sistemas.tce.pe.gov.br/DadosAbertos | https://sistemas.tce.pe.gov.br/DadosAbertos |
| PR | https://servicos.tce.pr.gov.br/TCEPR/Tribunal/Relacon/Dados/DadosConsulta/Consolidado | https://servicos.tce.pr.gov.br/TCEPR/Tribunal/Relacon/Dados/DadosConsulta/Consolidado |
| RS | http://dados.tce.rs.gov.br | http://dados.tce.rs.gov.br |
| SP | | https://transparencia.tce.sp.gov.br/conjunto-de-dados |

Notes: This table presents URLs with data sources for each state used in this paper.

Table A.2: Procurement methods

| Purchasing method | Competitive | Characteristics | Contract size |
|---|---|---|---|
| Reverse auction (Pregão) | Yes | Reverse auction, open to any interested firm. Online or in-person. Off-the-shelf goods. Multiple bids per participant. | Any value |
| Waiver (direct contracting) | No | Small purchases. | Up to 17,600 BRL |
| Invitation to tender (Convite) | Yes | Participants are invited. Minimum of 3 bidders. Uninvited firms are allowed to participate. One bid per participant. | Up to 176,000 BRL |
| Competitive bidding (Concorrência) | Yes | Open to any interested bidder. One bid per participant. | Any value |
| Submission of prices (Tomada de preços) | Yes | Bidder must be previously registered. One bid per participant. | Up to 1,430,000 BRL |
| Direct contracting | No | There is only one supplier. | - |
| Contest | Yes | Artistic, scientific or technical work. | - |

Notes: Contract size refers to the purchases of products and services other than construction (see Federal Decree 9,412, of June 18, 2018, for more details). Thresholds for construction are different (33,000 BRL). The maximum contract size for direct contracting changed in 2018 from 8,000 BRL to 17,600 BRL.

A.1 Example of data extraction and analysis

In this section we illustrate how researchers can use our dataset to generate descriptive statistics about public procurement and budget execution across Brazilian municipalities. The full documentation on using datasets hosted on the Data Basis platform can be found at https://basedosdados.github.io/mais (in Portuguese), including video tutorials and details on how to create projects at Google BigQuery.

Here we provide a simple example using the 'basedosdados' R package. This package allows users to connect to their BigQuery projects (and billing ID) and directly make SQL queries on BigQuery datasets, which are then loaded in R. The SQL query presented below does the following: it connects to the licitacao (tender) table including tender-level information for all municipalities and years covered in our data (over 2.3 million unique tenders); it creates an indicator for when the modality is defined as categories 8 (waiver) or 10 (non-requirement); then it computes the average share of tenders that are non-competitive and the total number of tenders at the municipality-year level. The resulting table, loaded in the 'data' object, includes 19,033 municipality-year observations. When executed, this query took 3 seconds to process the 2.3 million observations and return these summary statistics.
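To make the aggregation logic concrete without a BigQuery connection, the following base R sketch reproduces the same two summary statistics on a small toy table. The column names `id_municipio`, `ano` and `modalidade` follow the query described above, but the data values are invented and the local table is only an assumed stand-in for the licitacao table.

```r
# Toy tender-level table standing in for the licitacao table (values invented).
tenders <- data.frame(
  id_municipio = c("1", "1", "1", "2", "2"),
  ano          = c(2019, 2019, 2019, 2019, 2019),
  modalidade   = c("8", "1", "10", "1", "2"),
  stringsAsFactors = FALSE
)

# Indicator for non-competitive modalities: 8 (waiver) or 10 (non-requirement).
tenders$non_competitive <- as.integer(tenders$modalidade %in% c("8", "10"))

# Share of non-competitive tenders and tender count per municipality-year.
share <- aggregate(non_competitive ~ id_municipio + ano, data = tenders, FUN = mean)
count <- aggregate(non_competitive ~ id_municipio + ano, data = tenders, FUN = length)

summary_tbl <- data.frame(
  id_municipio     = share$id_municipio,
  ano              = share$ano,
  share_discretion = share$non_competitive,
  count            = count$non_competitive
)
print(summary_tbl)
```

Averaging a 0/1 indicator within each group yields the share directly, which is the same device the SQL `AVG(CASE WHEN ... END)` expression relies on.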
We can then simply plot a histogram of the share of non-competitive tenders across municipality-years (in this case, we filter the dataset for observations with at least 50 tenders). We present the results in Figure A.7. We can see that the average we describe in Table 2 of 30%-40% masks substantial heterogeneity: while a substantial number of municipalities in a given year do not use non-competitive modalities, in some municipalities the share of these is above three-quarters.

Using this dataset and auxiliary tables also easily accessible in the 'basedosdados' package, we could also explore additional questions, such as whether the use of non-competitive tenders varies systematically over time and whether it correlates with other municipality traits, such as GDP per capita.

    # Load packages (install them first with install.packages() if needed)
    library(ggplot2)
    library(basedosdados)
    library(dplyr)

    # Set user's BigQuery billing ID
    set_billing_id("<BILLING-ID>")

    # Define SQL query to extract data
    query <- "
    SELECT
      id_municipio,
      ano,
      AVG(CASE WHEN modalidade = '8' OR modalidade = '10' THEN 1 ELSE 0 END) AS share_discretion,
      COUNT(id_licitacao_bd) AS count
    FROM `basedosdados.world_wb_mides.licitacao`
    GROUP BY id_municipio, ano
    "

    # Execute query and create data frame from output
    data <- read_sql(query)

    # Create histogram of share of non-competitive tenders
    data %>%
      filter(count > 50) %>%
      ggplot() +
      geom_histogram(aes(x = share_discretion), bins = 100,
                     color = "black", fill = "#02075d") +
      xlab("Share of non-competitive tenders") +
      theme_classic()

Figure A.7: Histogram of share of non-competitive tenders

Notes: This figure presents a histogram of share of tenders that are non-competitive at the municipality-year level. Non-competitive tenders are defined as those tagged with modality categories 8 (waiver) and 10 (non-requirement).

B Data Quality

This section briefly describes some aspects of the quality of our newly created dataset on local procurement and budget execution. Despite the large effort in harmonizing a variety of original sources as described in Section 2, there are still some issues that researchers using our data should be aware of.

First, we plot the percentage of observations we cannot uniquely identify for each table in Figure B.1, Figure B.2, Figure B.3, and Figure B.4. After harmonizing and stacking each state’s information we create unique identifiers for each observation. For example, for the tender table, to create the id_licitacao_bd variable, we concatenate the following original variables: tender number, agency identifier, and state identifier.¹ This identifier can be null in rows where (1) any of these components are missing or (2) this combination of variables still does not uniquely identify it. In Figure B.1 we report that CE was the only state in which we could not uniquely identify all observations - the percentage of null tender identifiers exceeds 3% in 2010 but later remains stable around 1%.
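The identifier rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the column names `tender_number`, `agency_id` and `state`, the underscore separator, and the choice to set every row of a duplicated combination to null are hypothetical, since the text does not specify how the concatenation is implemented.

```r
# Toy tender table; column names and values are hypothetical.
tenders <- data.frame(
  tender_number = c("001", "002", NA, "003", "003"),
  agency_id     = c("10", "10", "11", "12", "12"),
  state         = c("CE", "CE", "CE", "CE", "CE"),
  stringsAsFactors = FALSE
)

components <- tenders[, c("tender_number", "agency_id", "state")]

# Concatenate the components into a candidate key (separator is an assumption).
key <- do.call(paste, c(components, sep = "_"))

# Case (1): at least one component is missing.
has_missing <- !complete.cases(components)

# Case (2): the combination does not uniquely identify a row.
is_duplicated <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Null identifier in either case, otherwise the concatenated key.
tenders$id_licitacao_bd <- ifelse(has_missing | is_duplicated, NA_character_, key)

# Percentage of rows with a null identifier, as plotted in the quality figures.
null_share <- 100 * mean(is.na(tenders$id_licitacao_bd))
print(tenders)
print(null_share)
```

Computing the share of null identifiers by state and year on the real tables would give the quantity the data quality figures report.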
Similarly, for the commitment table, to create the id_empenho_bd variable, we concatenate the following original variables: number, agency identifier, unit identifier, municipality identifier, year, and month. In Figure B.2 we document that the vast majority of observations are uniquely identified, i.e. the percentage of null commitment identifiers is never above 2%. When adding the total value committed from observations with null identifiers this percentage is at most about 5% in SP between 2008 and 2012.

The situation is worse for verifications and payments. In Figure B.3 we report that it was impossible to uniquely identify observations in the states of CE and PB, while the state of RS has a percentage of up to 60% in the early years. For payments in Figure B.4 we show that the states of CE, RS, and SP have values up to 10%. In other states such as MG or PR we are able to fully identify observations.

Second, in Figure B.5 and Figure B.6 we plot the percentage of missing municipalities for each table and state. The relevant pattern observed is that for most states the number of missing municipalities is stable and close to zero over time. The exceptions are the state of CE, ranging around 5%-10% missing, the state of PE in 2013 in the payment table, the state of PR in 2022 in the payment table, and the state of PE in 2012 in the tender and tender-participant tables.

Finally, we compare the number of municipalities present in each of our budget execution tables to the number in SICONFI in Figure B.7, Figure B.8, and Figure B.9. We document that for most states and most years both datasets have similar coverage or at most a 10% difference. Some exceptions are the state of MG in 2014 or the state of RS in 2021, for which our dataset has very little coverage.

¹ The combination of these variables might vary depending on the state. In CE, for example, there is no agency identifier; instead we use the municipality identifier.
Also, in RS, we had to include the year and an additional identifier for the type of purchase.

Table B.1: Limitations in the budget execution data

Columns: CE, MG, PB, PE, PR, RS, SP, Obs

Incomplete temporal coverage
  Commitment:
  Verification: ✓ (Obs: From 2009)
  Payment:
Has some municipality missing
  Commitment: ✓ ✓ ✓ ✓ (Obs: See Figure B.6)
  Verification: ✓ ✓ ✓ ✓ (Obs: See Figure B.6)
  Payment: ✓ ✓ ✓ ✓ (Obs: See Figure B.6)
Does not have unique ID
  Commitment: ✓
  Verification: ✓ ✓ ✓
  Payment: ✓
Has missing IDs
  Commitment: ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.2)
  Verification: ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.3)
  Payment: ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.4)

Notes: This table presents the limitations in the budget execution data. ’Does not have unique ID’ refers to the cases where we were not able to build an identifier that uniquely identifies each row in the given table. Even if we manage to construct a unique identifier for most of the observations of a state-table pair, we may still have observations within that table for which this is not possible. In these cases, we mark it with a tick in ’Has missing IDs’. The percentage of cases in which that happens can be found in Figure B.2, Figure B.3 and Figure B.4.

Table B.2: Limitations in the procurement data

Columns: CE, MG, PB, PE, PR, RS, Obs

Incomplete temporal coverage
  Tender:
  Item:
  Participant:
Has some municipality missing
  Tender: ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.5)
  Item: ✓ ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.5)
  Participant: ✓ ✓ ✓ ✓ ✓ (Obs: See Figure B.5)
Does not have unique ID
  Tender:
  Item:
  Participant:
Has missing IDs
  Tender: ✓ (Obs: See Figure B.1)
  Item: ✓ ✓
  Participant:

Notes: This table presents the limitations in procurement data. ’Does not have unique ID’ refers to the cases where we were not able to build an identifier that uniquely identifies each row in the given table. Even if we manage to construct a unique identifier for most of the observations of a state-table pair, we may still have observations within that table for which this is not possible. In these cases, we mark it with a tick in ’Has missing IDs’.
The percentage of cases in which that happens can be found in Figure B.1.

Figure B.1: Missing tender identifiers

[Figure: one panel per state (CE, MG, PB, PR, RS, PE).]