Policy Research Working Paper                    10117




      Welfare Analysis of Changing Notches
                   Evidence from Bolsa Família

                             Katy Bergstrom
                             William Dodds
                               Juan Rios




Development Economics
Development Research Group
July 2022


  Abstract
 This paper develops a framework to bound the welfare impacts of reforms to notches using two sufficient statistics:
 (1) the number of households bunching at the old notch who move toward the new notch, and (2) the number of
 households who “jump” down to the new notch. The bounds hold in a wide class of models, highlighting a new way to
 use reduced-form bunching evidence for welfare analysis without strong assumptions on the economic environment.
 These two statistics are estimated using a difference-in-difference strategy for a reform to the anti-poverty program
 Bolsa Família, finding that the reform’s marginal value of public funds lies between 0.90 and 1.12.




 This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the
 World Bank to provide open access to its research and make a contribution to development policy discussions around the
 world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may
 be contacted at kbergstrom@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
       Welfare Analysis of Changing Notches: Evidence from Bolsa Família∗

                                 Katy Bergstrom†, William Dodds‡, and Juan Rios§




       Keywords: bunching, jumping, notches, MVPF, bounds, sufficient statistics
       JEL: H31, H53, I38, O12




   ∗
     Some of the ideas in this paper were part of a chapter of Juan’s dissertation at Stanford University, titled “Welfare Analysis of
Transfer Programs with Jumps in Reported Income: Evidence from the Brazilian Bolsa Família”. We are grateful to members of
Juan’s dissertation committee Douglas Bernheim, Raj Chetty, Luigi Pistaferri and Florian Scheuer for their advice and guidance.
We also want to thank Pierre Bachas, Jose Maria Barrero, Alexander Gelber, David McKenzie, Berk Özler, Peter Phillips, and
seminar participants at the NBER Public Economics Meeting, Stanford University, the World Bank, the University of Auckland,
and Brookings for their comments and suggestions. This research was supported in part using high performance computing (HPC)
resources and services provided by Technology Services at Tulane University, New Orleans, LA. The findings, interpretations,
and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the
International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive
Directors of the World Bank or the governments they represent.
   †
     World Bank. Email: kbergstrom@worldbank.org.
   ‡
     Tulane University. Email: wdodds@tulane.edu.
   §
     Uber. E-mail: juanriosriver@gmail.com
                                                 1     Introduction

    Throughout developed and developing countries, tax and transfer schedules often feature notches wherein
incremental changes in behavior lead to discrete changes in benefits or tax liabilities (Slemrod (2010); Kleven
(2016)). For example, Saving and Viard (2021) document numerous notches in the United States tax code
including the well-known Medicaid eligibility notch (Yelowitz, 1995), Kleven and Waseem (2013) document
notches of up to 4 percentage points in average personal income tax rates in Pakistan, and Bachas and
Soto (2021) document 10 percentage point jumps in the average corporate tax rate in Costa Rica. Given
this prevalence, it is important for economists and policymakers to understand the welfare impacts of policy
reforms that change the structure of notches in tax and transfer systems.
    Typically the behavioral and welfare impacts of notches are analyzed using the standard bunching ap-
proach whereby one non-parametrically estimates the mass of agents bunching around a notch and then
translates this bunching mass into a structural parameter(s) (Kleven, 2016). One can then use these struc-
tural parameters to gauge the welfare impacts of previous policy changes or to predict the impacts of future
policy reforms.1 However, the chief limitation of this approach is that translating a bunching mass into a
structural parameter typically relies heavily on the modeling assumptions of the agent problem. For example,
assumptions on preferences, the types of behavioral responses available to agents, and the degree of opti-
mization frictions can all greatly impact the mapping between a bunching mass and a structural parameter.2
Consequently, Kleven (2016) states “although bunching provides compelling non-parametric evidence of a
behavioral response, moving from observed bunching to a structural parameter that can be used to predict
the effects of policy changes is difficult.”
    This paper develops a technique that uses changes in bunching to assess the welfare impacts of changes
to notches in transfer schedules while imposing minimal structure on the agent problem. In particular, we
show how to bound the welfare impact of such a reform using two sufficient statistics: (1) the number of
agents bunching at the original threshold who move towards the new threshold as a result of the reform,
and (2) the number of agents who “jump” down to bunch at the new threshold as a result of the reform.
Importantly, our bounds require only minimal structure on the agent problem: agents can have arbitrary
preference heterogeneity, agents can have any number of choice variables, and agents may face optimization
frictions such as limited choice sets and/or adjustment costs. Thus, our approach overcomes a key limitation
of the standard bunching approach by using reduced-form bunching evidence to inform welfare analysis
without making strong assumptions about the agent’s optimization problem.
    We then use our method to bound the welfare impacts of a reform that expanded a notch in one of the
world’s largest cash transfer programs, Bolsa Família (BF). Our bounds suggest that the marginal value of
public funds for this reform lies between 0.90 and 1.12, which implies that the reform was welfare improving
if the government values giving R$0.90 to BF recipients (in a non-distortionary way) more than spending
R$1 on its next best alternative and welfare decreasing if the government values giving R$1.12 to BF
recipients (in a non-distortionary way) less than spending R$1 on its next best alternative. Because
recipients of Bolsa Família are typically very poor (Bastagli (2008), Lindert et al. (2007)), we argue that it is
    1. Note that while one could, in principle, evaluate the welfare impacts of particular notch reforms using the structural
parameters estimated from the standard bunching approach, it is more common for papers in this literature to use these structural
parameters to identify hypothetical schedules that would improve or maximize welfare (see, for example, Best et al. (2015) or
Bachas and Soto (2021)).
    2. This is not always the case: in some applications, structural parameters appear robust to model augmentations, e.g., Best
et al. (2020).


very likely to be the case that the government is willing to spend R$1 to get R$0.90 to BF recipients. While
the welfare impact of this reform is important in its own right, our application also highlights more generally
how our approach can be used to conduct meaningful welfare analyses.
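
The decision rule in this paragraph can be written compactly. As a sketch under assumed notation (η is introduced here only for illustration and denotes the social value of R$1 delivered in a non-distortionary way to BF recipients; λ denotes the value of R$1 spent on the government's next best alternative; η is taken to be positive), the welfare change per R$1 of budgetary cost is η · MVPF − λ, so the reported bounds imply:

```latex
\[
\eta \cdot \mathrm{MVPF} - \lambda \;\ge\; 0.90\,\eta - \lambda \;>\; 0
\quad \text{whenever} \quad \frac{\eta}{\lambda} > \frac{1}{0.90} \approx 1.11,
\]
\[
\eta \cdot \mathrm{MVPF} - \lambda \;\le\; 1.12\,\eta - \lambda \;<\; 0
\quad \text{whenever} \quad \frac{\eta}{\lambda} < \frac{1}{1.12} \approx 0.89.
\]
```

For values of η/λ between roughly 0.89 and 1.11, the bounds alone do not sign the welfare effect.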
   The paper begins by building intuition in a simple, static model in which a government provides a
constant transfer only to households who report an income below a certain threshold; hence, the transfer
schedule features a notch. Households are endowed with an income and choose how much income to report
to the government subject to misreporting costs. We derive bounds on the marginal value of public funds
(MVPF) of changing the notch, comprising both an increase in the benefit level and eligibility threshold.
The MVPF is the ratio of households’ willingness-to-pay (WTP) for the reform relative to the government’s
budgetary cost of the reform (Hendren and Sprung-Keyser, 2020). We show that aggregate WTP for the
reform (and, in turn, the MVPF) can be bounded using two empirical objects: (1) the number of “bunching
households” who bunch at the old notch and move towards the new notch as a result of the reform, and (2)
the number of “jumping households” who jump down to bunch at the new notch as a result of the reform.
To calculate WTP for the reform, we first note that for households who do not jump or bunch, their WTP
is simply equal to the amount of additional money they receive as a result of the reform. However, for the
jumping and bunching households, their WTP for the reform differs from their increase in benefits; this is
because we cannot simply appeal to the envelope condition in our setting as we consider arbitrary discrete
reforms. In particular, by jumping down to the new notch, jumping households receive the new benefit but
also incur a misreporting cost. By revealed preference, jumping households’ utility is improved by changing
their behavior; hence their WTP is weakly positive. Moreover, their WTP cannot exceed the size of the new
benefit given that misreporting costs are weakly positive. Thus, each jumping household has a WTP between
0 and the size of the new benefit. On the other hand, bunching households see an increase in the size of
their benefit as well as a reduction in their misreporting costs as they move toward the new, higher notch.
Thus, bunching households’ WTP for the reform is at least as large as their increase in benefits; moreover,
by revealed preference, their WTP cannot exceed the size of the new benefit level because if it did, it would
not have been optimal for these households to bunch at the original notch in the first place. We can use these
bounds on household WTP to bound aggregate WTP for the reform (and, in turn, the MVPF) as long as
we can observe the number of bunching and jumping households. Because the welfare impact of a reform is
equal to the MVPF multiplied by a normative welfare weight, our bounds on the MVPF enable us to make
welfare statements about the impacts of changing a notch using estimable, reduced-form objects.
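
The heuristic argument above can be sketched numerically. The code below is an illustration only, not the paper's formal result: the class `ReformCounts`, the function `mvpf_bounds`, the split of non-responding households into "always eligible" and "newly eligible" groups, and all household counts are assumptions made for the example. Each bunching household is assigned a WTP between the benefit increase and the new benefit, each jumping household a WTP between 0 and the new benefit, and every other household a WTP equal to the extra money received.

```python
from dataclasses import dataclass


@dataclass
class ReformCounts:
    """Household counts by response type (hypothetical illustrative inputs)."""
    always_eligible: float  # received b before and b' after; did not bunch or jump
    newly_eligible: float   # non-responders who gain the full new benefit b'
    bunchers: float         # bunched at the old notch and move toward the new notch
    jumpers: float          # jump down to bunch at the new notch


def mvpf_bounds(counts, b_old, b_new):
    """Return (lower, upper) bounds on aggregate WTP divided by budgetary cost."""
    delta_b = b_new - b_old
    # Non-responders: WTP equals the additional money they receive.
    fixed = counts.always_eligible * delta_b + counts.newly_eligible * b_new
    # Bunchers: WTP in [delta_b, b_new]; jumpers: WTP in [0, b_new].
    wtp_lower = fixed + counts.bunchers * delta_b
    wtp_upper = fixed + (counts.bunchers + counts.jumpers) * b_new
    # Budgetary cost: bunchers receive delta_b more; jumpers go from 0 to b_new.
    cost = fixed + counts.bunchers * delta_b + counts.jumpers * b_new
    return wtp_lower / cost, wtp_upper / cost


# Toy numbers chosen only to exercise the function; they are not estimates.
toy = ReformCounts(always_eligible=100, newly_eligible=10, bunchers=20, jumpers=10)
lower, upper = mvpf_bounds(toy, b_old=70.0, b_new=77.0)
```

Under these assumptions the lower bound can never exceed 1 and the upper bound can never fall below 1, because jumpers pull aggregate WTP below cost in the worst case and bunchers push it above cost in the best case.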
   While our results are initially stated in the context of a simple, static misreporting model to build
intuition, we show that our welfare bounds are highly robust to the agent problem. In particular, our
bounds hold in models with any sort of behavioral response margin (e.g., labor supply responses instead of
misreporting responses), arbitrary preference heterogeneity, adjustment costs, limited choice sets (e.g., labor
supply frictions), and (non-extreme) misperceptions of the benefit schedule. Moreover, we discuss how our
bounds can be augmented to allow for dynamic decision making with uncertainty and for more complex
policy environments such as in-kind transfers and the presence of other tax and transfer schedules. We view
this generality as a key contribution: the number of jumping and bunching households can be used to bound
the MVPF of a notch change in a wide class of models. We therefore refer to the number of jumping and
bunching households as the sufficient statistics to bound the MVPF, and in turn, the welfare impact of a
notch change.
   The second part of this paper is concerned with bounding the welfare impacts of a reform to a notch


in one of the world’s largest cash transfer programs, Bolsa Família (BF). BF is a Brazilian anti-poverty
program that started in 2003 and gave benefits to around 14 million households as of 2014 (Gazola Hellmann,
2015). Eligibility for BF benefits is based on self-reported household income, likely generating substantial
scope for misreporting given that approximately 50% of economically active individuals work in the informal
sector (Henley, Arabsheibani and Carneiro, 2009). Importantly, the BF schedule features a pronounced notch
that was expanded in June 2014. Prior to June 2014, households reporting a per-capita income below the
extreme-poverty threshold of R$70 per-month were eligible for an unconditional monthly benefit of R$70
per-month, whereas households reporting an income above this threshold were not.3 In June 2014, both the
eligibility threshold and benefit were raised by 10%: the eligibility threshold increased from R$70 per-capita,
per-month to R$77 per-capita, per-month, while the unconditional benefit increased from R$70 per-month
to R$77 per-month.
    We have access to administrative data spanning December 2011 to September 2016 from Cadastro Único,
which is the Brazilian government’s national registry used to determine eligibility for all federal social welfare
programs. Using this data, we seek to estimate the number of original bunching households who move towards
the new notch and the number of households jumping down to the new notch as a result of the reform. The
number of bunching households is simply equal to the reduction in the bunching mass at the old notch (as a
result of reform), while the number of jumping households is equal to the increase in the bunching mass at
the new notch less the reduction in the bunching mass at the old notch (as a result of the reform). However,
when looking at the raw data, there are clear time trends in the number of households locating at the old
and new notch in the pre-reform period. Thus, our identification challenge is to understand how the number
of households reporting incomes at the old and new notch would have evolved in the post-reform period had
the reform not occurred.
    At a high level, our identification strategy is based on the following insight: the incentives to report an
income at and above the original notch are affected by the reform, but the incentives to report an income
below the original notch are unaffected by the reform. Hence, we can use portions of the reported income
distribution below the original notch as control groups for underlying time trends in the reported income
distribution at and above the original notch. To do so, we use a generalized difference-in-difference strategy
as employed in, for example, Wolfers (2006) or Mora and Reggio (2013). A standard difference-in-difference
estimator requires that the control groups and treatment groups have “parallel trends” pre-reform that would
have persisted in the post-reform period had the reform not occurred. Similarly, the generalized difference-in-
difference estimator requires that there exist stable relationships, which are well approximated by relatively
low order polynomials, between treatment and control groups which existed pre-reform that would have
persisted in the post-reform period had the reform not occurred. In our setting, we discretize the reported
income distribution into bins, using bins below the original notch as control bins while using bins that include
the original and new notches as treatment bins. We then use the estimated relationships between treatment
and control bins in the pre-reform period to create counterfactuals for how the treatment bins would have
evolved absent the reform. Fundamentally, there are complex structural relationships between the number of
people reporting in each income bin; these relationships are determined by how the true income distribution
is evolving over time, how the true income distribution is mapped to the reported income distribution, as
   3. BF also includes a conditional variable benefit for households with children. To receive this variable benefit, households
must make specified health and education investments in their children and have a reported per-capita income below the “poverty
threshold” (which is higher than the extreme-poverty threshold). We restrict our attention to the unconditional component of
BF in this paper.



well as growth in applicants to the BF program over time. Our generalized difference-in-difference strategy
essentially assumes that the differences over time between the number of people reporting in each income
bin are well approximated by (low-order) polynomials which would have persisted in absence of the reform;
this assumption appears to hold in the pre-reform period, lending credence to our identification assumptions.
Moreover, we implement a number of placebo tests which further support the validity of our identification
assumptions.
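
The idea behind the generalized difference-in-difference strategy can be illustrated with a minimal sketch. It assumes a single treatment bin and a single control bin, fits a low-order polynomial in time to their pre-reform gap, and extrapolates that gap into the post-reform period. The function name `did_counterfactual` and the synthetic series are hypothetical; the specification actually estimated in the paper is described in Section 4 and may differ from this simplification.

```python
import numpy as np


def did_counterfactual(t, treat, control, reform_t, degree=1):
    """Counterfactual for a treatment bin from a control bin.

    Fits a polynomial of the given degree to the pre-reform gap
    (treat - control) and extrapolates it over all periods.
    Returns (counterfactual, effect), where effect = treat - counterfactual.
    """
    t = np.asarray(t, dtype=float)
    treat = np.asarray(treat, dtype=float)
    control = np.asarray(control, dtype=float)
    pre = t < reform_t
    gap = treat - control
    coeffs = np.polyfit(t[pre], gap[pre], degree)      # pre-reform relationship
    counterfactual = control + np.polyval(coeffs, t)   # assumed to persist
    return counterfactual, treat - counterfactual


# Synthetic data: the gap is linear in time and the reform adds 40 households.
t = np.arange(12)
control = 1000.0 + 20.0 * t
treat = control + 50.0 + 3.0 * t
treat[t >= 8] += 40.0
_, effect = did_counterfactual(t, treat, control, reform_t=8, degree=1)
```

In this synthetic example the estimated effect is approximately zero before period 8 and approximately 40 afterwards, because the pre-reform gap is exactly linear by construction.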
   Using this generalized difference-in-difference strategy, we find that the number of households reporting
incomes at the new notch increased by approximately 49,000 and the number of households reporting incomes
at the old notch decreased by approximately 27,000 as a result of the reform. Hence, the reform induced
27,000 bunching households to move toward the new threshold and an additional 22,000 households to jump
down to the new threshold. This translates into a lower bound for the MVPF of the reform of approximately
0.9 and an upper bound for the MVPF of approximately 1.12. In terms of welfare implications, as long as the
government values giving R$0.90 (in a non-distortionary manner) to the BF households more than spending
R$1 on their next best alternative, the reform was welfare improving. We argue that this is likely the case
given that BF has high coverage of households in extreme poverty and given that the households misreporting
to get the benefit typically fall in the poorest half of the population (Bastagli (2008) and Lindert et al. (2007)).
A back-of-the-envelope calculation suggests that the welfare effect of spending R$1 on the BF reform is at
least as high as the welfare effect of spending R$1.50 on a non-distortionary universal transfer. Hence, our
empirical findings contribute to the debate around targeted vs. universal transfers in developing settings
(Hanna and Olken, 2018): we find that even in a setting with a highly pronounced notch and substantial
scope for misreporting, the efficiency cost generated by behavioral responses is simply not large enough to
outweigh the equity gain associated with the increased generosity of benefits targeted to poor households
(relative to a universal transfer).
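
The counting rule that maps estimated changes in bunching into the two sufficient statistics is simple arithmetic, sketched here with the approximate estimates reported above. The function name `responders_from_bunching_changes` is hypothetical.

```python
def responders_from_bunching_changes(increase_at_new_notch, decrease_at_old_notch):
    """Split changes in bunching mass into bunching and jumping households."""
    # Bunching households: the reduction in mass at the old notch.
    bunchers = decrease_at_old_notch
    # Jumping households: the increase at the new notch not explained by bunchers.
    jumpers = increase_at_new_notch - decrease_at_old_notch
    return bunchers, jumpers


# Approximate estimates from the text: +49,000 at the new notch, -27,000 at the old notch.
bunchers, jumpers = responders_from_bunching_changes(49_000, 27_000)
```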


Relationship to the literature: We believe it is helpful to further compare and contrast our paper with
the growing literature that explores bunching at notches and kinks. Papers in this literature typically use
reduced-form bunching evidence to pin down parameters of interest by making structural assumptions on
agent optimization problems. For example, Kleven (2016) notes that inferring a structural parameter from
reduced-form bunching evidence requires structural assumptions and that “changing any of these model
features will in general change the mapping between bunching and structural parameters.” Conversely,
our approach uses reduced-form bunching evidence in a sufficient statistics approach (i.e., without making
substantive parametric assumptions on the agent optimization problem). In line with Chetty (2009), we
view our approach as a complement rather than a substitute to standard bunching approaches because
they have opposing strengths and weaknesses. The strength of our approach lies in obviating the need for
strong parametric assumptions while the weaknesses are that our bounds are only relevant for the reform
in question and that our approach is more data intensive because to estimate changes in bunching masses
without structural assumptions, one requires cross-sectional data in both the pre- and post-reform period.
   Our empirical strategy also differs from standard bunching analyses despite being conceptually similar
insofar as both methods use portions of the distribution which are unimpacted by the notch (or, in our
case, the notch change) to infer properties about portions of the distribution which are impacted by the
notch (or notch change). In particular, we focus on estimating changes in bunching that result from a
reform, whereas standard bunching analysis typically focuses on estimating bunching at a given notch or


kink. Using a difference-in-difference strategy, we use portions of the distribution unimpacted by the reform
to control for underlying time trends in portions of the distribution which are impacted by the reform.
Conversely, the standard bunching approach uses portions of the distribution unimpacted by the notch
combined with smoothness assumptions on the underlying counterfactual distribution to non-parametrically
identify bunching induced by the notch. One may wonder if we could have adapted the standard bunching
approach to estimate changes in bunching while controlling for underlying time trends as done in Carril
(2022);4 however, as discussed further in Section 4.3, our distribution of reported incomes is extremely non-
smooth (we have extreme bunching at numbers equal to 0 mod 50, substantial bunching at numbers which
were notches many years earlier, and less extreme bunching at numbers equal to 0 mod 10). While bunching
estimation methods have been augmented to deal with “round-number” bunching and bunching at “reference-
points” (e.g., Kleven and Waseem (2013) and Best and Kleven (2017)), we believe that the pervasiveness
and the variability of round-number and reference-point bunching in our setting will make it too difficult to
precisely identify the counterfactual distributions around our notches (in particular, around the original notch
of R$70 which is equal to 0 mod 10). Thus, our empirical strategy highlights an alternative way to estimate
changes in bunching when smoothness assumptions cannot be used to estimate counterfactual distributions.5
    Finally, on the theoretical side, our paper contributes to the sufficient statistics approach for welfare
analysis (Chetty, 2009). One broad contribution to this literature is showing how to apply a sufficient
statistics framework to settings with large, discrete policy reforms when the envelope condition cannot be
applied. Kleven (2021) discusses how to do approximate welfare analysis using an expanded set of sufficient
statistics when reforms are large; however, he argues that these additional statistics may not be easily
estimable. We overcome this need to estimate difficult statistics by focusing on welfare bounds. This paper is
also related to Lockwood (2020) who observes that in the presence of notches, the sufficient statistic approach
for the welfare analysis of tax systems needs to be augmented to include a correction term which captures
the change in bunching at a notch in response to a tax reform. Our key theoretical contribution relative
to Lockwood (2020) is the generality of our results. Whereas Lockwood (2020) characterizes the welfare
impacts of particular infinitesimal reforms to notches in a model with quasi-linear utility, one dimension of
agent heterogeneity, and one choice variable, our approach allows us to bound the welfare impacts of reforms
to programs with notches of any size while placing very little structure on the agent problem.6
    The remainder of the paper is organized as follows. Section 2 introduces the theoretical framework for
welfare analysis of changing notches, Section 3 discusses the structure of the Bolsa Família program, Section
4 discusses our strategy to empirically identify the number of jumping and bunching households for the June
2014 reform, Section 5 presents our results and robustness analysis, and Section 6 concludes.
    4. Carril (2022) uses standard bunching techniques to estimate the counterfactual distributions that would ensue in absence
of a notch in both the pre- and post-reform periods and then uses changes in these counterfactual distributions to control for
underlying dynamics. In contrast, we use portions of the observed distribution that are unimpacted by the reform to control for
underlying dynamics.
    5. While methodologically different, our approach is conceptually similar to Best et al. (2020) who also use the behavior of
non-bunching individuals to make inferences about the counterfactual behavior of bunching individuals.
    6. We also allow for changes in the location of the notch, which turns out to be more technically complicated.




       2    Theoretical Framework for Welfare Analysis of Changing
                                                           Notches

      This section derives bounds for the welfare impact of changing a notch in a transfer program. We begin
by deriving bounds in a simple, static misreporting model. This simple setup is useful not only for building
intuition but also because misreporting responses are likely common in our empirical application. We show
that we can bound the welfare impact of a notch change using two empirical objects: (1) the number of
households bunching at the old notch who move towards the new notch as a result of the reform, and (2)
the number of households who jump down to the new notch as a result of the reform. Finally, we show that
our results still hold in a much more general model which puts minimal structure on the household problem,
thereby arguing that our two empirical objects are “sufficient statistics” to bound the change in welfare from
a notch change.


2.1     Baseline Model Set-Up

      To begin, we consider a static world in which households choose how much income to report, ŷ, subject
to a policy p = {b, τ} where b denotes the level of the benefit and τ denotes the cut-off level of reported
income s.t. those reporting ŷ ≤ τ receive b and those reporting ŷ > τ receive nothing.7 Households have
two dimensions of heterogeneity: (1) endowed income y distributed according to CDF F (y ), and (2) aversion
to misreporting governed by µ ∈ {1, 2} with probability mass function π (µ). Type µ = 1 households are
“truth-telling” households who never misreport, whereas type µ = 2 households are willing to misreport their
income. For simplicity, we assume there is no fixed cost of reporting an income and that all households know
the policy p. We also assume households have quasi-linear utility in consumption. Type µ = 1 households
therefore have utility under policy p given by:

\[
U^*(y, \mu = 1; p) \equiv y + b \, \mathbf{1}(y \le \tau) \tag{1}
\]


      Utility for type µ = 2 households under policy p is equal to:

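\[
\begin{aligned}
U^*(y, \mu = 2; p) \equiv \max_{\hat{y}} \quad & c - v(y - \hat{y}) \, \mathbf{1}(y > \hat{y}) \\
\text{s.t.} \quad & c = y + b \, \mathbf{1}(\hat{y} \le \tau)
\end{aligned} \tag{2}
\]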

where v(y − ŷ) 1(y > ŷ) captures the disutility of reporting ŷ when true income is y. We assume v′ > 0
and v″ ≥ 0 so that the cost of misreporting is increasing and convex in the discrepancy between true and
reported income.8 If a household is indifferent between reporting truthfully and over-reporting, we break
their indifference by assuming they report truthfully.
      Optimal reported incomes, ŷ∗, for type µ = 2 households are characterized as follows (see Appendix A.1
   7. For the sake of parsimony, our theoretical framework ignores other aspects of the tax and transfer system. Thus, we
assume that behavioral responses to changes in the notch do not have budgetary impacts on other programs. We discuss how
our framework can account for other aspects of the tax and transfer schedule in Section 2.3 and how potential fiscal externalities
impact our findings in Section 5.2.
   8. Modeling evasion using a convex and deterministic cost function v (·) dates back to Mayshar (1991) and Slemrod (2001).




for a formal derivation):
\[
\hat{y}^*(y, \mu = 2; p) =
\begin{cases}
y & \text{if } y \le \tau \\
\tau & \text{if } y \in (\tau, y^c(p)] \\
y & \text{if } y > y^c(p)
\end{cases}
\]

where y^c(p) is the income level for which households are indifferent between misreporting at τ and reporting
truthfully, implicitly defined by y^c(p) + b − v(y^c(p) − τ) = y^c(p). In words, type µ = 2 individuals with
y ≤ τ report truthfully and get the benefit, those with y ∈ (τ, y^c(p)] get the benefit by misreporting and
bunching at the notch, and those with y > y^c(p) report their income truthfully and do not get the benefit.
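
To make the reporting rule concrete, the sketch below assumes a specific cost function, v(d) = κd² for d > 0, which is increasing and convex as required but is not taken from the paper. Under this assumption the indifference condition b = v(y^c − τ) has the closed form y^c = τ + √(b/κ). The function names and the value of κ are hypothetical; b = 70 and τ = 70 mirror the pre-reform BF values.

```python
import math


def indifference_income(b, tau, kappa):
    """y^c solving b = v(y^c - tau) under the assumed cost v(d) = kappa * d**2."""
    return tau + math.sqrt(b / kappa)


def reported_income(y, mu, b, tau, kappa):
    """Optimal reported income for a household with true income y and type mu."""
    if mu == 1:          # truth-telling type never misreports
        return y
    if y <= tau:         # already eligible: report truthfully
        return y
    if y <= indifference_income(b, tau, kappa):
        return tau       # misreport and bunch at the notch
    return y             # misreporting too costly: report truthfully


B, TAU, KAPPA = 70.0, 70.0, 4.375   # KAPPA is a hypothetical cost parameter
y_c = indifference_income(B, TAU, KAPPA)
reports = [reported_income(y, 2, B, TAU, KAPPA) for y in (60.0, 72.0, 74.0, 80.0)]
```

With these illustrative values, households with true incomes between 70 and 74 bunch at the notch, while those above 74 report truthfully and forgo the benefit.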
      Next, we define G(x; p) to capture the number of households reporting an income less than or equal to x
under policy p:
\[
G(x; p) = \sum_{\mu \in \{1,2\}} \int_Y \mathbf{1}\big(\hat{y}^*(y, \mu; p) \le x\big) \, dF(y \mid \mu) \, \pi(\mu)
\]

Hence, G(τ ; p) captures the number of households locating under the eligibility threshold τ . Finally, we
assume that total social welfare under policy p, W (p), is given by a weighted sum of household utilities
evaluated at optimal choices under policy p less the budgetary cost of the policy multiplied by the shadow
value of public funds λ, which captures the welfare gain of spending a dollar on the government’s next best
alternative. In other words, we assume that spending $1 on the program comes at the cost of spending $1 less
on some other program which decreases welfare by a constant value λ. Welfare under policy p is therefore
given by:
\[
W(p) = \sum_{\mu \in \{1,2\}} \int_{Y} \phi(y, \mu)\, U^*(y, \mu; p)\, dF(y \mid \mu)\, \pi(\mu) - \lambda\, b\, G(\tau; p) \tag{3}
\]

where φ(y, µ) denotes the government’s welfare weight on a household with income y and type µ.


2.2     Welfare Effect of the Reform in the Baseline Model

      Our goal is to evaluate the welfare impact of a reform from policy p = {b, τ} to p′ = {b′, τ′}, defining
{∆b, ∆τ} = p′ − p. For ease of exposition, let us assume ∆b > 0 and ∆τ > 0 (i.e., we increase the level
and location of the notch). Note that we are not restricting to infinitesimal reforms; the bounds we derive
allow for arbitrary, discrete reforms to b and τ. Our goal is to derive bounds for W(p′) − W(p) in terms
of empirically observable objects (we discuss the advantages of focusing on bounding the welfare impact as
opposed to characterizing the exact impact in Section 2.3). Hendren and Sprung-Keyser (2020) show that
the welfare impacts of any policy reform can be expressed in terms of a normative welfare weight along with a
positive sufficient statistic, denoted the marginal value of public funds (MVPF), which captures households’
willingness-to-pay (WTP) for the reform relative to the total budgetary cost of the reform. Our goal then is
to derive bounds for the MVPF.
      Because the utility function is quasi-linear, a household with income y and misreporting type µ has a
WTP for the reform implicitly defined by:

\[
U^*(y, \mu; p) = U^*(y, \mu; p') - \mathrm{WTP}
\]

Equivalently, WTP is simply equal to the compensating variation. We now heuristically derive bounds on
the WTP for all households impacted by the reform (we formally derive these bounds in the proof of Lemma
1). To do so, we will split the set of households impacted by the reform into four groups and discuss the
WTP for each group separately. We call these four groups the mechanical households, bunching households,
threshold households, and jumping households. Figure 1 depicts the mass of households contributing to each
group for a hypothetical reported income distribution along with the change in the hypothetical reported
income distribution as a result of the reform.




Note: This figure shows a hypothetical density of reported incomes under the initial policy p = {b, τ} (in grey) and how this
density changes as a result of a reform that changes the policy to p′ = {b′, τ′} (in black). Note, the vertical grey line at τ and
the vertical black line at τ′ represent the bunching households under the initial policy and new policy, respectively. This figure
also depicts which households are classified as mechanical households, bunching households, threshold households, and jumping
households.

               Figure 1: A Hypothetical Density of Reported Incomes under p and p′

      The mechanical households are the households who report at or below τ under policy p and who do
not change their behavior as a result of the reform. The number of mechanical households is given by
M = G(τ; p′). Notably, M is not equal to the mass reporting at or below τ under policy p as this mass
includes bunching households who do update their behavior as a result of the reform (see discussion below).
Conversely, those reporting at or below τ under p′ will also report at or below τ under p. Mechanical
households receive an increase in benefits equal to ∆b. Hence, their WTP for the reform is exactly equal to
∆b.
      The bunching households are the households who misreport and bunch at the original threshold τ under
policy p and move with the threshold as it is increased (i.e., they report in (τ, τ′] under policy p′).9
Thus, the number of bunching households is equal to the reduction in households locating at or below τ as a
result of the reform: B = G(τ; p) − G(τ; p′). The bunching households receive an increase in benefits equal
to ∆b. Moreover, they experience a reduction in their misreporting costs as they move from τ to τ′. Hence,
the bunching households have a WTP of at least ∆b.10 Moreover, by revealed preference arguments, the
    9. Not all bunching households will move all the way to τ′. In particular, bunchers with y ∈ (τ, τ′) will report ŷ = y under p′.
  10. Even for an infinitesimal reform, the utility gain bunching households experience from moving with the notch has a first-
order impact on welfare. This is because the envelope theorem cannot be applied for these individuals: in order to argue that the
derivative of indirect utility with respect to the policy is equal to the derivative of utility with respect to the policy evaluated at
optimal decisions, utility must be differentiable with respect to the policy given any fixed choices (see Theorem 2 of Milgrom and
Segal (2002)). However, utility is actually discontinuous as a function of the parameter τ holding decisions fixed: for example,
individuals reporting an income of τ see a discrete drop in consumption (and hence utility) if τ is reduced by any amount.



most the bunching households can value this reduction in misreporting costs is b dollars. If they value this
reduction by more than b dollars, it would not have been optimal for these households to bunch at τ under
policy p.11 Thus, the bunching households’ WTP for the reform is in [∆b, ∆b + b] = [∆b, b′].
    The threshold households are the households who report in (τ, τ′] under policy p. Thus, the number
of threshold households is given by T = G(τ′; p) − G(τ; p). Notably, these households will not update
their behavior in response to the reform. This is because all households reporting above τ given policy p
are reporting truthfully regardless of their type µ. Under p′ the optimal choice for these households is to
continue reporting truthfully as doing so allows them to receive b′ and not incur any misreporting costs.
These households go from receiving no benefits to receiving b′ in benefits. Hence their WTP is equal to b′.
    Finally, the jumping households are the households who report above τ′ given policy p but who report at
τ′ given policy p′. In particular, households who were previously close-to-indifferent between misreporting at
the threshold and truthfully reporting above the threshold but opted for the latter (i.e., type µ = 2 households
with y ∈ (y^c(p), y^c(p′)]) will now jump and misreport to the new threshold τ′.12 Notably, we describe the
behavioral response of these households as a “jump” because these households experience a discontinuous
change in their optimal reported income as we move from p to p′. The number of jumping households, J, is
given by J = G(τ′; p′) − G(τ′; p), which is equivalent to the increase in households reporting at or below the
new threshold as a result of the reform (note, the number of jumping households is not equal to the mass of
households locating at τ′ given policy p′ as some of the households locating at τ′ may be original bunching
households who moved with the notch). By revealed preference, jumping households’ utility is improved
by changing their behavior; hence, their WTP is weakly positive.13 Moreover, after the reform, jumping
households get b′ more dollars but incur misreporting costs. Therefore jumping households’ WTP cannot
exceed b′ as misreporting costs are weakly positive. Hence the WTP of jumping households is in [0, b′].
    Table 1 summarizes the bounds on the WTP for each of our four groups along with the number of
households falling into each group and the cost that each group imposes on the government.

                        Table 1: WTP and Cost to the Government of the Reform

      Group                        Number of households          WTP (per-household)        Cost to Govt. (per-household)
      Mechanical Households        M = G(τ; p′)                  ∆b                         ∆b
      Bunching Households          B = G(τ; p) − G(τ; p′)        ∈ [∆b, b′]                 ∆b
      Threshold Households         T = G(τ′; p) − G(τ; p)        b′                         b′
      Jumping Households           J = G(τ′; p′) − G(τ′; p)      ∈ [0, b′]                  b′

Note: This table shows the willingness-to-pay (WTP) and cost to the government for all the households impacted by the reform of
moving from policy p to p′. We split the households into four groups: mechanical, bunching, threshold, and jumping households.
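
The accounting in Table 1 can be sketched in a few lines of code. The sketch below takes the four cumulative counts G(τ; p), G(τ; p′), G(τ′; p), and G(τ′; p′), where p′ = {b′, τ′} denotes the post-reform policy, and returns the size of each group, then sums the per-household WTP bounds and costs from the table. All function and argument names are hypothetical, and the numbers are made up for illustration; they are not estimates from Bolsa Família.

```python
def group_counts(G_tau_p, G_tau_pnew, G_taunew_p, G_taunew_pnew):
    """Group sizes from Table 1; 'new' marks post-reform (primed) objects."""
    M = G_tau_pnew                     # mechanical: still at or below the old notch
    B = G_tau_p - G_tau_pnew           # bunching: left the region at or below the old notch
    T = G_taunew_p - G_tau_p           # threshold: already between the old and new notch
    J = G_taunew_pnew - G_taunew_p     # jumping: newly at or below the new notch
    return M, B, T, J


def total_wtp_bounds(M, B, T, J, b, b_new):
    """Sum the per-household WTP bounds in Table 1 across the four groups."""
    db = b_new - b
    lower = db * M + db * B + b_new * T + 0 * J
    upper = db * M + b_new * B + b_new * T + b_new * J
    return lower, upper


def total_cost(M, B, T, J, b, b_new):
    """Sum the per-household cost to the government in Table 1."""
    db = b_new - b
    return db * (M + B) + b_new * (T + J)


# Made-up inputs: b = 100, b' = 120, and four hypothetical cumulative counts.
M, B, T, J = group_counts(1000, 900, 1200, 1250)
print(M, B, T, J)                                 # 900 100 200 50
print(total_wtp_bounds(M, B, T, J, 100, 120))     # (44000, 60000)
print(total_cost(M, B, T, J, 100, 120))           # 50000
```

Only G evaluated at the old and new thresholds, before and after the reform, enters the calculation; no individual-level behavioral parameters are needed.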


    This brings us to Lemma 1 which bounds the total WTP for the reform:

Lemma 1. If individuals solve Problem (2), the total WTP of the reform from p to p′ with p′ − p =
   11. Suppose bunching households have a WTP for the relaxation in their misreporting costs greater than b: v(y − τ) − v(y − τ′) >
b. This implies v(y − τ) > b. However, if v(y − τ) > b, bunching households would have preferred to report truthfully above τ
over misreporting at τ under policy p.
   12. For large changes in τ s.t. y^c(p) < τ′, only µ = 2 households with y ∈ (τ′, y^c(p′)] jump and misreport to the new notch.
   13. For an infinitesimal reform, jumping households have a WTP of 0 as the only households who jump are those that are
indifferent between locating at the notch and reporting truthfully. Thus, our lower bound for the WTP for jumping households
is exact for an infinitesimal reform. Note, however, that the jumping households still have a first order impact on social welfare
through their effect on the government’s budget as each jumping household costs the government b′ dollars (despite the fact that
the mass of jumpers is measure 0 for an infinitesimal reform; see Bergstrom and Dodds (2021) for further discussion).


{∆b, ∆τ} > 0 can be bounded as follows:

\[
\Delta b\,(M + B) + b'\,T \;\le\; \text{Total WTP} \;\le\; \Delta b\,M + b'\,(B + T + J)
\]

where M, B, T and J denote the mass of mechanical, bunching, threshold and jumping households mathemat-
ically defined in Table 1.

Proof. See Appendix A.2.

Moreover, using the cost per-household to the government given in Table 1, we can express the total budgetary
cost of the reform as follows:

\[
\text{Total Cost} \equiv b'\,G(\tau'; p') - b\,G(\tau; p) = \Delta b\,(M + B) + b'\,(T + J) \tag{4}
\]
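
For completeness, the second equality in Equation (4) can be checked directly from the group definitions in Table 1. The short derivation below is our own sketch of this step, writing p′ = {b′, τ′} for the post-reform policy and using M + B = G(τ; p) and T + J = G(τ′; p′) − G(τ; p):

\[
\begin{aligned}
b'\,G(\tau'; p') - b\,G(\tau; p)
&= b'\,\big[G(\tau; p) + T + J\big] - b\,G(\tau; p) \\
&= (b' - b)\,G(\tau; p) + b'\,(T + J) \\
&= \Delta b\,(M + B) + b'\,(T + J).
\end{aligned}
\]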


Hence, we can construct bounds for the MVPF of the reform as follows:

Proposition 1. If individuals solve Problem (2), the marginal value of public funds of the reform from p to
p′ with p′ − p = {∆b, ∆τ} > 0 can be bounded as follows:

\[
MVPF \;\ge\; MVPF_L \equiv \frac{\Delta b\,(M + B) + b'\,T}{\Delta b\,(M + B) + b'\,(T + J)} = 1 - \frac{b'\,J}{\text{Total Cost}}
\]

\[
MVPF \;\le\; MVPF_U \equiv \frac{\Delta b\,M + b'\,(B + T + J)}{\Delta b\,(M + B) + b'\,(T + J)} = 1 + \frac{b\,B}{\text{Total Cost}}
\]

Proof. This follows directly from Lemma 1 and Equation (4).
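
The closed forms in Proposition 1 are simple enough to compute directly. The following sketch evaluates the two bounds from the four cumulative counts G(τ; p), G(τ; p′), G(τ′; p), G(τ′; p′) and the benefit levels b and b′ (with p′ = {b′, τ′} the post-reform policy). The function name, argument names, and inputs are hypothetical and chosen only for illustration.

```python
def mvpf_bounds(G_tau_p, G_tau_pnew, G_taunew_p, G_taunew_pnew, b, b_new):
    """Return (MVPF_L, MVPF_U) using the closed forms 1 - b'J/TC and 1 + bB/TC."""
    total_cost = b_new * G_taunew_pnew - b * G_tau_p   # Equation (4)
    if total_cost <= 0:
        raise ValueError("the closed forms here assume a positive total cost")
    J = G_taunew_pnew - G_taunew_p    # jumping households
    B = G_tau_p - G_tau_pnew          # bunching households
    mvpf_lower = 1 - b_new * J / total_cost
    mvpf_upper = 1 + b * B / total_cost
    return mvpf_lower, mvpf_upper


# Made-up inputs: b = 100, b' = 120, B = 100, J = 50, total cost = 50,000.
lower, upper = mvpf_bounds(1000, 900, 1200, 1250, b=100, b_new=120)
print(lower, upper)   # approximately 0.88 and 1.20
```

In this made-up example the lower bound falls below 1 only because of the jumpers (b′J = 6,000 of the 50,000 spent), and the upper bound exceeds 1 only because of the bunchers (bB = 10,000).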

The lower bound for the MVPF captures the fact that the lower bound for the WTP of the jumping households
is 0 while the cost they impose on the government is b′ each (whereas the lower bound for the WTP of bunching
households is exactly equal to the cost they impose on the government). Hence, if all jumping households
value the reform at their lower bound, b′J/(Total Cost) of each dollar of spending is “wasted”. Meanwhile,
the upper bound for the MVPF captures the fact that the upper bound for the WTP of the bunching
households is b′ while the cost they impose on the government is only b′ − b (whereas the upper bound for
the WTP of jumping households is exactly equal to the cost they impose on the government). Hence, if
bunching households all value the reform at their upper bound, the government “gains” bB/(Total Cost) for
each dollar of spending. We can then use these bounds on the MVPF to construct bounds on the money
metric welfare gain relative to the budgetary cost of the reform using Proposition 2:14

Proposition 2. If individuals solve Problem (2) and social welfare is given by Equation (3), then the money
metric welfare gain relative to the budgetary cost of the reform from p to p′ with p′ − p = {∆b, ∆τ} > 0 can
be bounded as follows:

\[
\omega_L\, MVPF_L - 1 \;\le\; \frac{\frac{1}{\lambda}\,[W(p') - W(p)]}{b'\,G(\tau'; p') - b\,G(\tau; p)} \;\le\; \omega_U\, MVPF_U - 1
\]
  14. Note, that the welfare change is expressed in dollar units (as opposed to welfare units) as we divide through by the shadow
value of public funds, λ.




where ωL (ωU ) captures the weighted average money-metric welfare gain from giving a dollar to mechanical,
bunching, threshold, and jumping households, where the weights are determined by the relative size of each
group’s lower bound (upper bound) for WTP.

Proof. We do not provide a separate proof for Proposition 2; we only provide a proof for Proposition 3, which
nests Proposition 2 (see Appendix A.3).

      Proposition 2 bounds the increase in total welfare from spending $1 on the reform. In particular, MVPF_L
captures our lower bound on the total WTP of the mechanical, bunching, threshold, and jumping households
when we spend $1 on the reform, while ω_L denotes the welfare gain, measured in dollars, of splitting $1
among the mechanical, bunching, threshold, and jumping households (where the split is determined by the
lower bounds on each group’s WTP for the reform). Subtracting the budgetary cost of $1 from ω_L MVPF_L
gives a lower bound for the total welfare gain of spending $1 on the reform. Symmetric logic explains why
ω_U MVPF_U − 1 is an upper bound for the increase in total welfare, measured in dollars, of spending $1 on
the reform.
      Finally, in light of Proposition 1, we can express MVPF_L and MVPF_U in terms of two positive objects:
(1) the number of bunching households, and (2) the number of jumping households. Thus, the number of
bunchers and jumpers are the empirical objects needed to construct bounds for the welfare impact of the
reform.15


2.3     Robustness to Model Specification

      To highlight the robustness of our bounds for the welfare impact, we now show that we actually require
very little structure on preferences or behavioral responses of agents. Suppose households have several
decision variables denoted by the vector x within a choice set X. Household decisions are made conditional
on primitives denoted by the vector θ ∈ Θ and the policy p. Households get the benefit b if their reported
income ŷ, which can be a decision variable or a function of decision variables and primitives, is at or below τ.
Household income, denoted y, is also potentially a function of decisions x.16 Households solve:

\[
U^*(\theta; p) = \max_{x \in X} \; u(c, x; \theta)
\quad \text{s.t.} \quad c = y(x, \theta) + b\,\mathbf{1}\big(\hat{y}(x, \theta) \le \tau\big) \tag{5}
\]

where c denotes consumption. We assume total welfare is given by a weighted sum of utilities, with welfare
weights given by φ(θ):
\[
W(p) = \int_{\Theta} \phi(\theta)\, U^*(\theta; p)\, dF(\theta) - \lambda\, b\, G(\tau; p) \tag{6}
\]

where λ represents the shadow value of public funds and G(τ; p) = ∫_{θ: ŷ(θ,p) ≤ τ} dF(θ) represents the number of
households receiving the benefit under policy p. More generally, we define G(z; p) = ∫_{θ: ŷ(θ,p) ≤ z} dF(θ). This
setup allows us to substantially generalize Proposition 1 and Proposition 2:

   15. Technically, we also need to know the total cost of the reform; however, Equation (4) shows that the total cost is a function
solely of G(τ′; p′) and G(τ; p), which are needed to construct J and B.
   16. For example, in a labor supply model, x could equal household labor supply l, θ could include productivity n, and ŷ = y = nl.




Proposition 3. Suppose households solve Problem (5), welfare is given by Equation (6), and τ′ > τ. Defining:

\[
MVPF_L \equiv 1 - \frac{b'\,[G(\tau'; p') - G(\tau'; p)]}{b'\,G(\tau'; p') - b\,G(\tau; p)} = 1 - \frac{b'\,J}{\text{Total Cost}} \tag{7}
\]

\[
MVPF_U \equiv 1 + \frac{b\,[G(\tau; p) - G(\tau; p')]}{b'\,G(\tau'; p') - b\,G(\tau; p)} = 1 + \frac{b\,B}{\text{Total Cost}} \tag{8}
\]

Then, as long as b′G(τ′; p′) − bG(τ; p) > 0, we have:

\[
\omega_L\, MVPF_L - 1 \;\le\; \frac{\frac{1}{\lambda}\,[W(p') - W(p)]}{b'\,G(\tau'; p') - b\,G(\tau; p)} \;\le\; \omega_U\, MVPF_U - 1 \tag{9}
\]

where ωL (ωU ) captures the weighted average money-metric welfare gain from giving a dollar to mechanical,
bunching, threshold, and jumping households, where the weights are determined by the relative size of each
group’s lower bound (upper bound) for WTP.

Proof. See Appendix A.3.

    Proposition 3 highlights that we can bound the MVPF and, consequently, bound the welfare impacts
of the reform using empirically observable objects while only putting limited structure on the household
problem.17,18 In particular, to bound the MVPF we simply need to estimate how the number of people
locating below the new notch changes as a result of the reform, J = G(τ′; p′) − G(τ′; p), as well as how the
number of people locating below the old notch changes as a result of the reform, B = G(τ; p) − G(τ; p′).19 In
the context of our simple baseline model, B equals the number of households bunching at the original notch
and J equals the number of households jumping down to bunch at the new notch. However, if households
solve the more general household problem (5), the interpretation of these terms may be changed. For instance,
consider a labor supply model in which households can only work full-time, half-time, or not at all. In this
case B = G(τ; p) − G(τ; p′) simply captures the change in the number of households reporting at or below
the original notch as a result of the reform, which does not correspond to a reduction in the bunching mass at
the original notch as almost all households cannot precisely bunch. Similarly, J = G(τ′; p′) − G(τ′; p) simply
captures the change in the number of households reporting below the new notch as a result of the reform,
which does not correspond to the number of households jumping down to bunch at the new notch as almost
all households cannot precisely bunch. Nonetheless, for ease of exposition, we will continue to refer to B as
the number of bunching households and J as the number of jumping households.
    The core intuition for Proposition 3 is that we can again use revealed preference arguments to bound WTP
for a reform from policy p to policy p′. These revealed preference arguments require very little structure
on household utility, household heterogeneity, the choice variables, or choice sets available to households.
  17. We are slightly abusing notation here. Our upper bound for the MVPF, MVPF_U, is actually the upper bound on the
MVPF from moving from policy p′ to p. This is because our upper bound on the WTP is actually an upper bound on the
negative WTP of moving from policy p′ to p, or equivalently, it is an upper bound on the willingness-to-accept (WTA) to
move from policy p to p′. We obviate this distinction in Proposition 1 by making the assumption that utility is quasi-linear in
consumption, so that WTP=WTA.
  18. Note that in Proposition 3 we have assumed τ′ > τ and that b′G(τ′; p′) − bG(τ; p) > 0. Assuming τ′ > τ is WLOG
because one can also use Equation (9) to get bounds on the welfare gain from moving from p′ to p. On the other hand, if
b′G(τ′; p′) − bG(τ; p) < 0, both inequalities in Equation (9) are simply reversed.
  19. Again, we also need to know the total cost of the reform; but the total cost can be constructed from G(τ′; p′) and G(τ; p),
which are also needed to construct J and B.


Problem (5) can encompass a variety of realistic features: (1) households may respond by changing labor
supply instead of misreporting their income (or respond on a variety of dimensions),20 (2) households may
face limited choice sets (e.g., restrictions on labor supply), (3) households may face reporting costs (e.g.,
hassle or time costs) and thereby face a decision of whether to report/update their income on the registry,21 and
(4) households may have a wide range of heterogeneity in their utility functions (e.g., households may have
varying preferences over the labor/leisure trade-off or varying preferences to locate at round numbers). We
view this robustness as perhaps the most important aspect of our theory: we can construct bounds for the
MVPF of changing a notch in a manner which is, in large part, model-free.
    But of course there are some implicit restrictions encoded in the assumed household problem (Problem
(5)) used to prove Proposition 3. Perhaps most importantly, Proposition 3 requires that households correctly
perceive the benefit schedule and the reform. The proof to Proposition 3 uses the fact that household re-
optimization improves utility; if misperceptions are extreme for many households this may not be the case.
However, it is straightforward to extend Proposition 3 when households misperceive the schedule if we are
willing to assume that, on average, behavioral responses to the reform from p to p′ improve welfare (i.e.,
perceptions are not so extreme that households, on average, harm themselves by responding to the reform;
see Appendix A.5). For example, if some proportion of households are entirely unaware of the reform while
the rest of the population is perfectly aware of the reform, our bounds hold as unaware households will not
respond to the reform whereas those who are aware of the reform improve their utility via their behavioral
response.22
    Moreover, Problem (5) is also a static problem, so that Proposition 3 does not allow for dynamic decision
making or uncertainty. We augment Proposition 3 in Appendix A.6 to show that we can bound the discounted
welfare impact of the policy over time relative to the discounted total budgetary cost in a general dynamic
model allowing for income dynamics, savings, and stochastic shocks. In this case, the relevant bounds for
the MVPF are constructed using the discounted sum of the expected number of jumping households over
time and the discounted sum of the expected number of bunching households over time. Proposition 3 also
requires that there are no externalities from household decisions so that decisions of one household do not
directly impact the utility of any other household. Our bounds can be generalized to allow for externalities
by augmenting the upper and lower bounds for the MVPF with an additional term measuring the WTP for
these externalities relative to the total cost; however, measuring WTP for externalities is likely difficult in
practice.
    Additionally, Proposition 3 can be augmented to allow for more complex policy environments. Proposition
3 assumes that there is no underlying tax and transfer system beyond the benefit b given to those with a
reported income ŷ ≤ τ. But we can easily extend Proposition 3 to account for more complex underlying tax
and transfer schedules; in this case, the total budgetary cost of the reform must include the impacts that
behavioral responses have on other programs that impact the government’s budget (i.e., we need to calculate
the fiscal externalities associated with the reform).23 Proposition 3 also assumes that the transfer is given in
   20. This is consistent with Feldstein (1999), who argues that the efficiency costs of taxation do not depend on whether behavioral
responses occur on the labor supply margin or the misreporting margin.
   21. To capture adjustment/updating costs, one could suppose x = ŷ_t and θ consists of current income y_t, aversion to
misreporting µ, and prior reported income ŷ_{t−1}; households incur an adjustment cost k if their reported income today differs from
prior reported income.
   22. In the extreme case where all households are unaware, both B and J are 0 so that our upper and lower bounds for the
MVPF coincide at 1, which is definitionally the MVPF of non-distortionary cash transfers.
   23. In particular, the total cost of the reform would equal b′G(τ′; p′) + R(p′) − bG(τ; p) − R(p) where R(p) denotes net



cash; if the transfer is paid in-kind, one would require an estimate of the average WTP for $1 worth of the
in-kind good to apply an analogue to Proposition 3.
    Lastly, we discuss the advantages of bounding the welfare impacts (as opposed to exactly characterizing the
welfare impacts) of a notch reform. First, because we cannot apply envelope conditions in our setting, exactly
characterizing welfare impacts requires one to take a stance on which margins households are responding, the
functional form of their utility function, the sorts of frictions they face, etc., whereas bounding welfare impacts
requires very little structure on the household optimization problem. In fact, we prove in Appendix A.4 that
the bounds in Proposition 3 are as tight as possible without making additional assumptions on primitives.
Second, our bounds on the welfare impact of a notch change are expressed in terms of estimable reduced-
form objects, whereas exactly characterizing welfare impacts would require one to estimate a potentially large
number of structural parameters.
    We now turn to our empirical application: estimating the number of bunching and jumping households
for the June 2014 Bolsa Família reform, and, in turn, calculating bounds for the welfare impact of this reform.
We begin by describing the structure of the Bolsa Família program, our data, and the June 2014 reform.


                                  3      The Bolsa Família Program

    The Brazilian anti-poverty program Bolsa Família (BF) provides cash transfers to poor households based
on their reported, monthly per-capita income. BF was implemented in October 2003 and is administered by
the social development ministry (Ministério do Desenvolvimento Social, or MDS). BF is one of the world’s
largest cash transfer programs with around 14 million households receiving benefits in 2014 (Gazola Hellmann,
2015). Applicants report their information, including information on household income, expenditures, assets,
socioeconomic characteristics, and demographic characteristics, to interviewers at Cadastro Único agencies,
which are program offices spread across Brazil’s 5,570 municipalities. Information is entered by interviewers
into the Cadastro Único registry. Beneficiaries of the BF program are required to update their information
once every 2 years to maintain their benefits.
    Eligibility for the BF program is based on reported household per-capita income. Household per-capita
income is calculated in five steps. First, the applicant is asked to report labor income for the last month as
well as average monthly labor income over the past year for each member in the household (the applicant must
present a government-issued ID for herself and for each family member thus making it difficult to register
fictitious family members). Second, the applicant is asked to report average monthly income received from
five additional sources for each household member (see Appendix B.1, which shows the questionnaire used
to calculate monthly income). Third, for each individual, the computer calculates the minimum between the
average monthly labor income over the past year and last month’s labor income. Fourth, the computer sums
this minimum monthly labor income along with income from the five additional sources to get a measure of
government spending under policy p (exclusive of spending on BF), and ∆R = R(p′) − R(p) captures the fiscal externalities of
the reform. In this case, the lower and upper bounds for the MVPF are given by:

\[
MVPF_L \equiv 1 - \frac{b'\,[G(\tau'; p') - G(\tau'; p)] + \Delta R}{b'\,G(\tau'; p') - b\,G(\tau; p) + \Delta R} = 1 - \frac{b'\,J}{\text{Total Cost}} - \frac{\Delta R}{\text{Total Cost}}
\]

\[
MVPF_U \equiv 1 + \frac{b\,[G(\tau; p) - G(\tau; p')] - \Delta R}{b'\,G(\tau'; p') - b\,G(\tau; p) + \Delta R} = 1 + \frac{b\,B}{\text{Total Cost}} - \frac{\Delta R}{\text{Total Cost}}
\]

Thus, in addition to B and J, we need to observe ∆R to assess the welfare impact of the reform.



total monthly income for each individual. Finally, the computer sums this individual total monthly income
across all household members and divides it by the number of household members.
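
The five-step calculation described above can be sketched as follows. This is an illustrative sketch only, not the registry's actual software; the class, field, and function names are hypothetical, and the example household is made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Member:
    """Income items reported for one household member (hypothetical field names)."""
    labor_last_month: float
    labor_avg_past_year: float
    other_income: List[float] = field(default_factory=list)  # the five additional sources


def per_capita_income(members: List[Member]) -> float:
    """Monthly per-capita income following the five steps described in the text."""
    if not members:
        raise ValueError("a household needs at least one member")
    household_total = 0.0
    for m in members:
        # Step 3: take the minimum of the two reported labor income measures.
        labor = min(m.labor_last_month, m.labor_avg_past_year)
        # Step 4: add income from the additional sources.
        household_total += labor + sum(m.other_income)
    # Step 5: sum across members and divide by household size.
    return household_total / len(members)


# Made-up household of three: one earner, one adult without income, one child.
household = [
    Member(labor_last_month=200.0, labor_avg_past_year=150.0, other_income=[0, 0, 0, 0, 30.0]),
    Member(labor_last_month=0.0, labor_avg_past_year=50.0),
    Member(labor_last_month=0.0, labor_avg_past_year=0.0),
]
print(per_capita_income(household))   # 60.0
```

Because step 3 takes the minimum of the two labor income measures for each member, a household's measured income can be lower than either its last-month or its past-year average total.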
      The government then transfers a monthly, unconditional benefit (referred to as the “basic benefit”) to all
households reporting a per-capita income below the “extreme-poverty threshold” and transfers an additional
benefit (referred to as the “variable benefit”) to households with children who have a reported per-capita
income below the higher “poverty threshold” conditional on these households making health and education
investments in their children.24 Between July 2009 and June 2014, the extreme-poverty threshold was R$70
per-capita, per-month and the poverty threshold was R$140 per-capita, per-month.
      Finally, the MDS has several enforcement mechanisms to prevent income misreporting. First, during the
interview, the income questions come at the end of the questionnaire so that questions on expenditures and
assets can help the interviewer assess the veracity of the reported income (Bastagli, 2008). Second, during the
interview, the applicant is reminded of her responsibility to provide true statements under penalty of losing
the right to be eligible for government programs (Gazola Hellmann, 2015). Third, the ministry conducts
audits, which can be triggered by citizens’ complaints and cross-checks of registry data with other datasets
such as administrative data on formal employment, deaths, or automobile purchases (Gazola Hellmann, 2015).
However, despite these attempts, the large informal sector in the Brazilian economy leaves substantial scope
for misreporting, which we conjecture is an important response margin to the reform.


3.1     Data Sources and Sample Description

      We have access to the Cadastro Único household registry, which is used to determine the eligibility of
households for BF as well as all other targeted federal social programs (Veras Soares, 2011). Many of these
other programs have eligibility criteria above the BF thresholds which explains the large number of ineligible
applicants in the registry.25 These other programs do not change concurrently with the BF reform that we
analyze and are discussed in more detail in Appendix B.2.
      The final data set is constructed by appending eight extractions of the registry: one in December of
each year from 2011 until 2015, one in April 2015, one in August 2015, and one in September 2016.26 Each
extraction contains the latest information for all households on the registry at the time of the extraction date.
For instance, if a household updated its information in August of 2011 and September 2013, its information
will appear as of August of 2011 in the 2011 and 2012 extractions and as of September 2013 in the 2013, 2014,
2015, and 2016 extractions. The reform to BF that we study occurred in June 2014. Summary statistics on
household per-capita income and number of family members as of June 2014 are displayed in Table 2. Note,
  24. Technically, eligible families do not automatically become BF beneficiaries. There is a quota (cap) on the number of
beneficiaries per municipality. Prior to 2009, these quotas were based on the predicted number of households below the poverty
threshold in each municipality. Post 2009, these quotas were based on the predicted number of households below the poverty
threshold scaled by 1.18 (Gerard, Naritomi and Silva, 2021). If the number of eligible families exceeds the quota in a municipality,
priority is assigned first to certain vulnerable populations (indigenous families, Quilombola families, families with children in
child labor, families who collect recyclable materials, families with members free from situations similar to slave labor), second
to families with lower per-capita income, and third to families with a larger number of children. Given our focus will be on
households reporting below the extreme-poverty threshold post 2009, we believe that it is reasonable to assume that the vast
majority of these households do in fact receive the basic benefit.
  25. Moreover, the government aims to register all families with per-capita incomes below half the minimum wage (or total
incomes below three times the minimum wage) (Veras Soares, 2011). The minimum wage was R$724 per-month in 2014; thus
half the minimum wage was R$362, which is substantially higher than the highest BF threshold.
  26. Every time the MDS analyzes the Cadastro Único data, it creates one of these extractions. Therefore, the frequency of the
extractions is a result of previous data analyses by the ministry. Appendix B.3 contains a figure depicting the timeline of the
data extractions.


Table 2 also presents separate statistics for single individual households (households with one adult and no
children) as our main analysis will focus solely on these households (we discuss why below).
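
The extraction logic described above (each extraction carries a household's most recent update as of the extraction date) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `snapshot_as_of`, the days of the month, and the month-end dating of the December extractions are assumptions made purely for illustration.

```python
from bisect import bisect_right
from datetime import date


def snapshot_as_of(update_dates, extraction_date):
    """Return the latest update on or before the extraction date (None if none exists)."""
    ordered = sorted(update_dates)
    idx = bisect_right(ordered, extraction_date)
    return ordered[idx - 1] if idx > 0 else None


# Household from the example in the text: updates in August 2011 and September 2013.
# Days of the month are hypothetical; the text reports months only.
updates = [date(2011, 8, 15), date(2013, 9, 15)]

# December extractions for 2011-2015, dated at month end as an assumption.
december_extractions = {year: date(year, 12, 31) for year in range(2011, 2016)}
visible = {year: snapshot_as_of(updates, d) for year, d in december_extractions.items()}

for year, record in visible.items():
    print(year, "extraction shows the update from", record)
```

Under these assumptions, the 2011 and 2012 extractions carry the August 2011 record and the later extractions carry the September 2013 record, matching the example in the text.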

                                 Table 2: Summary Statistics as of June, 2014

                        Variables                                                Mean       Median

                        Per Capita Income (single individual households)         330.51      200.00
                                                                                (324.30)
                        Per Capita Income (all households)                       151.17       76.66
                                                                                (184.97)
                        Number of Members Per Household                           3.04        3.00
                                                                                 (1.46)
                        Observations (single individual households)            4,232,528
                        Observations (all households)                          28,932,001

Note: This table shows summary statistics for households in the Cadastro Único database as of June, 2014. Per-capita income
denotes household, monthly, per-capita income and is measured in Brazilian reais. The PPP conversion from US dollars to
Brazilian reais was 1.813 in 2014 (OECD, https://data.oecd.org/conversion/purchasing-power-parities-ppp.htm).
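
To give a sense of magnitudes, the PPP factor in the table note can be applied to a few of the reais amounts that appear in the text and in Table 2. The sketch below is only an arithmetic illustration; the function name `reais_to_ppp_dollars` is hypothetical, and the dollar figures it prints are not reported in the paper.

```python
PPP_BRL_PER_USD_2014 = 1.813  # PPP conversion factor quoted in the note to Table 2


def reais_to_ppp_dollars(amount_brl, ppp=PPP_BRL_PER_USD_2014):
    """Convert an amount in Brazilian reais to PPP-adjusted US dollars."""
    return amount_brl / ppp


amounts = [
    ("extreme-poverty threshold (July 2009 to June 2014)", 70.00),
    ("median per-capita income, all households", 76.66),
    ("median per-capita income, single individual households", 200.00),
]
for label, brl in amounts:
    print(f"{label}: R${brl:.2f} is roughly US${reais_to_ppp_dollars(brl):.2f} at PPP")
```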



3.2     The Transfer Schedule and the June 2014 Reform

      At the beginning of our data (December 2011), the extreme-poverty threshold was equal to R$70 per-
capita, per-month and the poverty threshold was equal to R$140 per-capita, per-month.27 The basic benefit
was equal to R$70 per-month, while the variable benefits were based on the number and ages of the children in
the household (see Appendix B.4 for more information on the variable benefits). In June 2014, the government
increased both the benefits and thresholds by 10%. Thus, the extreme-poverty threshold was raised from
R$70 to R$77 per-capita, per-month, the poverty threshold was raised from R$140 to R$154 per-capita,
per-month, the basic benefit was raised from R$70 to R$77 per-month, and the variable benefits were also
increased by 10%. This reform was announced on national television by the president in April 2014.
      Our main analysis will focus on single individual households. These households are not eligible for the
variable benefits as they do not have children. Thus, prior to June 2014, these households received R$70 per-
month if their reported income was less than or equal to R$70 per-month and 0 otherwise. After June 2014,
these households received R$77 per-month if their reported income was less than or equal to R$77 per-month
and 0 otherwise. Thus, the benefit schedule for these households has a single notch which increased both in
level and location as a result of the June 2014 reform; see Figure 2.
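
The notch described in this paragraph can be written as a simple step function. The following Python sketch encodes the schedule for single individual households exactly as stated in the text; the function name is hypothetical, and the sketch ignores the municipal quotas mentioned in footnote 24.

```python
def basic_benefit_single(reported_income, post_reform):
    """Monthly BF transfer (in R$) for a single individual household.

    The schedule is a notch: the full benefit is paid at or below the
    threshold and nothing is paid above it.
    """
    benefit, threshold = (77.0, 77.0) if post_reform else (70.0, 70.0)
    return benefit if reported_income <= threshold else 0.0


# Before the reform, reporting one centavo above R$70 forfeits the entire benefit.
loss_at_old_notch = basic_benefit_single(70.00, False) - basic_benefit_single(70.01, False)

# A household reporting R$75 receives nothing before the reform and R$77 after it.
gain_at_75 = basic_benefit_single(75.0, True) - basic_benefit_single(75.0, False)

print(loss_at_old_notch, gain_at_75)
```

The discontinuity at the threshold is what creates the incentive to report an income at or just below the notch, and the reform moves both the size and the location of that discontinuity.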
  27. In 2003, the extreme-poverty threshold and the poverty threshold were set to equal one-fourth and one-half of the monthly
minimum wage of R$200, respectively. These thresholds have since been periodically adjusted for inflation; the time between
adjustments is ad hoc and not linked to the minimum wage. Prior to June 2014, the last readjustment was in July 2009
(Gazola Hellmann, 2015).




    Figure 2: June 2014 Reform to the Benefit Schedule for Single Individual Households


    We focus on single individual households because in February 2013, the government instituted a guar-
anteed minimum income of R$70 per-capita for all households which was subsequently raised to R$77 in
the June 2014 reform.28 However, because the basic benefit is equal to the guaranteed minimum income,
the benefit schedule for single individual households is not impacted by this minimum. In contrast, for all
other households, this minimum creates a kink in the benefit schedule below the extreme poverty threshold
(the location of this kink will vary based on household composition); moreover, the location of this kink
changes with the 2014 reform. For example, prior to the reform, households with two adults and no children
had a kink at the reported per-capita income level of R$35 which increased to R$38.5 post June 2014.29
This is problematic because, as will be discussed in Section 4, one of our identification assumptions is that
the reported income distribution below the extreme-poverty threshold is unaffected by the June 2014 reform;
clearly this is not necessarily true for households with more than one member because the kink created by the
guaranteed minimum income changes concurrently with the notch. Nonetheless, we will discuss the impacts
of the June 2014 reform on households with more than one individual in Section 5.5.
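
The kink created by the guaranteed minimum income can be made concrete with a short Python sketch. The text and footnote 29 give the top-up formula only for two member households; the generalization to n members used below, max(0, n × (guarantee − reported per-capita income) − basic benefit), is an assumption made for illustration, as are the function names. Variable benefits for households with children are ignored.

```python
def guarantee_top_up(reported_pc_income, n_members, post_reform=False):
    """Additional monthly benefit (R$) implied by the guaranteed minimum income.

    ASSUMPTION: generalizes the two-member formula 2(70 - y) - 70 in footnote 29
    to n members; households are assumed to have no children.
    """
    guarantee = basic = 77.0 if post_reform else 70.0
    return max(0.0, n_members * (guarantee - reported_pc_income) - basic)


def kink_location(n_members, post_reform=False):
    """Reported per-capita income below which the top-up becomes positive."""
    guarantee = basic = 77.0 if post_reform else 70.0
    return guarantee - basic / n_members


# Two adults, no children: the kink moves from R$35 to R$38.5 with the reform.
print(kink_location(2, post_reform=False), kink_location(2, post_reform=True))

# Footnote 29 example: a two member household reporting R$20 per capita.
print(guarantee_top_up(20.0, 2))

# Single individual households never receive a top-up, so their schedule has no kink.
print(guarantee_top_up(10.0, 1))
```

Under this assumed generalization, the top-up is zero for every single individual household because the basic benefit alone already reaches the guarantee, which is consistent with the statement that their schedule is not impacted by the minimum.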
    Finally, in June 2016, there was another reform to the BF program where both the benefit and the thresh-
old were further increased. This reform, like the June 2014 reform, affected households of all compositions.
Thus, in our empirical analysis, we will restrict our attention to the four-year window around the June 2014
reform (i.e., June 2012 - June 2016).30


4    Empirical Strategy to Bound the MVPF of the June 2014 Reform

    Next, we apply our theoretical framework from Section 2 to empirically bound the MVPF of the June
2014 Bolsa Família reform for single individual households. Using the notation from Section 2, the June 2014
   28. This guaranteed minimum income was instituted earlier for households with children. In particular, this guarantee was
instituted in June 2012 for households with children below the age of 6, in November 2012 for households with children below
the age of 15, and in February 2013 for all remaining households.
   29. In particular, prior to June 2014, a two member household with a reported per-capita income less than R$35, \(\hat{y} < 35\), will
receive an additional monthly benefit equal to \(2(70 - \hat{y}) - 70\). E.g., a two member household reporting a per-capita income of
R$20 will receive R$70 in the basic benefit and an additional benefit of R$30.
   30. We do not have enough data beyond June 2016 to analyze the effects of this later reform because our final data extraction
is in September 2016.


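reform changed the policy from \(p = \{b, \tau\} = \{70, 70\}\) to \(p' = \{b', \tau'\} = \{77, 77\}\). Thus, we seek to calculate:

\[
MVPF_{L,\bar{t}} \equiv 1 - \frac{77 \times [G_{\bar{t}}(77; p') - G_{\bar{t}}(77; p)]}{77 \times G_{\bar{t}}(77; p') - 70 \times G_{\bar{t}}(70; p)} = 1 - 77 \times \frac{J_{\bar{t}}}{\text{Total Cost}_{\bar{t}}} \tag{10}
\]

\[
MVPF_{U,\bar{t}} \equiv 1 + \frac{70 \times [G_{\bar{t}}(70; p) - G_{\bar{t}}(70; p')]}{77 \times G_{\bar{t}}(77; p') - 70 \times G_{\bar{t}}(70; p)} = 1 + 70 \times \frac{B_{\bar{t}}}{\text{Total Cost}_{\bar{t}}} \tag{11}
\]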

where \(\bar{t}\) denotes the end of our four-year-window, i.e., \(\bar{t} =\) June 2016. We will then use the bounds on
the MVPF to bound the welfare impact of the reform for June 2016.31 Thus, we need to estimate the
number of jumping and bunching households as of June 2016, which, in turn, requires us to estimate the
number of households locating below the old and new notch under both \(p\) and \(p'\) at \(\bar{t}\): \(G_{\bar{t}}(70; p)\), \(G_{\bar{t}}(70; p')\),
\(G_{\bar{t}}(77; p)\), and \(G_{\bar{t}}(77; p')\). However, because this was a national reform, only \(G_{\bar{t}}(77; p')\) and \(G_{\bar{t}}(70; p')\) are
observed directly because only policy \(p'\) was offered in period \(\bar{t}\). Thus, our goal is to estimate the number of
households reporting less than or equal to R$70 and R$77 in period \(\bar{t}\) had the reform not happened: \(G_{\bar{t}}(77; p)\)
and \(G_{\bar{t}}(70; p)\).
      Ideally, we would estimate \(G_{\bar{t}}(77; p)\) and \(G_{\bar{t}}(70; p)\) using control groups from random experimental
variation (e.g., from staggering the implementation of the reform randomly across geographies); however, as
mentioned above, the BF reform was a national reform implemented everywhere starting June 2014. Conse-
quently, our identification strategy will rely on using regions of the reported income distribution that were
not impacted by the reform to control for underlying time trends in the portions of the reported income
distribution that were impacted by the reform. This brings us to our two identification assumptions.


4.1     Identification Assumption 1

      Our identification strategy relies on first finding a region of the reported income distribution which was not
impacted by the reform. In the context of the baseline model in Section 2.1, the number of single individual
households reporting an income strictly below R$70 should be unchanged by the reform. The intuition for
this is that anyone who has a true income below R$70 always reports truthfully both pre- and post-reform
and anyone who has a true income above R$70 prefers to misreport at the threshold rather than misreport
to an income level below the threshold (because misreporting costs are increasing in the distance between
true and reported incomes). While in this baseline model bunching should occur precisely at R$70, in reality
bunching is typically more diffuse due to small optimization errors and/or frictions (Kleven, 2016). Thus, our
first identification assumption is that the number of people reporting incomes at or below R$63 is unaffected
by the reform (i.e., we assume that R$63 is sufficiently far below R$70 such that there are no “bunchers” at
or below R$63):

Identification Assumption 1. The distribution of reported incomes below R$63 is unaffected by the reform:
\(G_t(x; p) = G_t(x; p') \;\; \forall\, x \le 63,\, t\).

      To provide suggestive evidence that Assumption 1 is reasonable, Figure 3 plots the number of single
individual households reporting in income bins of size 7 from R$0 to R$63 (with the numbers in each bin
   31. Implicitly, we assume that agents are not making dynamic decisions (e.g., agents are myopic) so that we can apply Propo-
sition 3 and calculate the welfare impact of the reform in a given time period. Note, there is another reform to the schedule in
June 2016 so we cannot estimate the welfare impact of the reform beyond June 2016.



normalized to 1 in June 2012). Figure 3 gives no clear indication that any of the bins below R$63 were impacted
by the reform.32




Note: This figure shows the number of single individual households with reported incomes in the various bins. The number in
each bin is normalized to 1 in June, 2012. For example, this means that there are 3.5 times as many single individual households
with reported incomes in R$(7, 14] in June, 2016 as in June, 2012. The timing of the reform is indicated by the gray, shaded
region.

   Figure 3: Number of Single Individual Households Reporting Incomes in Various Bins,
                              Normalized to 1 in June, 2012


    Thus, Identification Assumption 1 implies that \(G_{\bar{t}}(63; p) = G_{\bar{t}}(63; p')\). This in turn implies that \(J_{\bar{t}}\) and
\(B_{\bar{t}}\) can be expressed solely in terms of how the number of households reporting incomes in bins R$(63, 70]
and R$(70, 77] changed as a result of the reform:33

\[
\begin{aligned}
B_{\bar{t}} &= -\left[G_{\bar{t}}(70; p') - G_{\bar{t}}(70; p)\right] \\
&= -\left\{\left[G_{\bar{t}}(70; p') - G_{\bar{t}}(63; p')\right] - \left[G_{\bar{t}}(70; p) - G_{\bar{t}}(63; p)\right]\right\}
\end{aligned}
\tag{12}
\]

\[
\begin{aligned}
J_{\bar{t}} &= G_{\bar{t}}(77; p') - G_{\bar{t}}(77; p) \\
&= \left[G_{\bar{t}}(77; p') - G_{\bar{t}}(70; p')\right] - \left[G_{\bar{t}}(77; p) - G_{\bar{t}}(70; p)\right] \\
&\quad + \left[G_{\bar{t}}(70; p') - G_{\bar{t}}(63; p')\right] - \left[G_{\bar{t}}(70; p) - G_{\bar{t}}(63; p)\right]
\end{aligned}
\tag{13}
\]

In particular, \(B_{\bar{t}}\) is equal to the reduction in households locating in R$(63,70] as a result of the reform (i.e.,
the reduction in households bunching at the old notch), while \(J_{\bar{t}}\) is equal to the increase in households locating
in R$(70,77] less the reduction in households locating in R$(63,70] as a result of the reform (i.e., the increase
   32. We discuss reasons why Identification Assumption 1 may fail to hold and explore robustness to Identification Assumption
1 in Section 5.3.
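   33. The total cost of the reform can be estimated from the components necessary to estimate \(B_{\bar{t}}\) and \(J_{\bar{t}}\) as Total Cost =
\(77\,G_{\bar{t}}(77; p') - 70\,G_{\bar{t}}(70; p) = 77\,[G_{\bar{t}}(77; p') - G_{\bar{t}}(70; p') + G_{\bar{t}}(70; p') - G_{\bar{t}}(63; p')] - 70\,[G_{\bar{t}}(70; p) - G_{\bar{t}}(63; p)] + 7\,G_{\bar{t}}(63; p')\),
where the last term uses \(G_{\bar{t}}(63; p) = G_{\bar{t}}(63; p')\) from Identification Assumption 1.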



in households bunching at the new notch less the reduction in households bunching at the old notch).
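
The mapping from bin counts to the MVPF bounds in Equations (10) to (13) can be sketched in a few lines of Python. All counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the paper. The function name and argument names are likewise assumptions, with "obs" denoting counts observed under the new policy and "cf" denoting counterfactual counts under the old policy.

```python
def mvpf_bounds(n_le_63, n_63_70_obs, n_70_77_obs, n_63_70_cf, n_70_77_cf,
                b_old=70, b_new=77):
    """Compute B, J, total cost, and the MVPF bounds from bin counts.

    n_le_63 is the number of households at or below R$63, taken to be the
    same under both policies (Identification Assumption 1).
    """
    # B: reduction in households in R$(63,70] caused by the reform, Equation (12).
    bunchers = n_63_70_cf - n_63_70_obs
    # J: increase in R$(70,77] less the reduction in R$(63,70], Equation (13).
    jumpers = (n_70_77_obs - n_70_77_cf) - bunchers
    g77_new = n_le_63 + n_63_70_obs + n_70_77_obs  # G(77; p')
    g70_old = n_le_63 + n_63_70_cf                 # G(70; p)
    total_cost = b_new * g77_new - b_old * g70_old
    return {
        "B": bunchers,
        "J": jumpers,
        "total_cost": total_cost,
        "mvpf_lower": 1 - b_new * jumpers / total_cost,  # Equation (10)
        "mvpf_upper": 1 + b_old * bunchers / total_cost,  # Equation (11)
    }


# HYPOTHETICAL counts, for illustration only.
result = mvpf_bounds(n_le_63=1000, n_63_70_obs=180, n_70_77_obs=150,
                     n_63_70_cf=200, n_70_77_cf=50)
print(result)
```

The sketch makes clear that, once the counterfactual counts in the two bins around the notches are estimated, the bounds follow mechanically: more jumping households lower the lower bound, while more households leaving the old notch raise the upper bound.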
    Figure 4 shows how the number of households reporting incomes in bins R$(63, 70] and R$(70, 77] evolved
in a four-year window around the reform from June 2012 to June 2016. In contrast to Figure 3, Figure 4a
depicts a clear trend departure coinciding with the reform. Thus, Figure 4a provides highly suggestive
evidence that the BF reform induced a substantial behavioral response that increased the number of households
bunching at the new threshold. Note that the number of individuals reporting incomes in R$(70, 77] is
increasing over time prior to the reform; this is likely due to a combination of factors including population
growth and (nominal) income dynamics.34 In particular, Brazil was experiencing slowing and then negative
per-capita GDP growth from 2012-2016 with a large drop in 2015 along with relatively high inflation (World
Bank Data). However, the slope of the line in Figure 4a clearly changes at the time of the reform, leading
to a sharp increase in the number of individuals locating in R$(70, 77]. The fact that this reform induced a
slope shift rather than a level shift may reflect households learning about the threshold shift over time.35
    In contrast to Figure 4a, there is less convincing prima facie evidence in Figure 4b that the reform induced
a substantial change in the number of households locating in R$(63,70]. While Figure 4b may indicate that
the number of households in R$(63,70] declined as a result of the reform, underlying time trends make it
difficult to assess the extent to which this is the case (Figure 24 in Appendix C.9 shows a much starker
downward trend break for two adult households). Hence, we need a way to control for underlying time trends
in the numbers reporting in R$(63, 70] and R$(70, 77] to estimate how these quantities would have evolved
over time in absence of the reform. This brings us to our second identification assumption.




         (a) Number Reporting in R$(70,77]                                 (b) Number Reporting in R$(63,70]
Note: This figure shows the number of single individual households that report incomes in the intervals R$(70, 77] and R$(63, 70]
for each month between June 2012 and June 2016. The timing of the reform (from the announcement in April 2014 to the
enactment in June 2014) is indicated by the gray, shaded region.

  Figure 4: Number of Single Individual Households Reporting an Income in R$(70,77] and
                                         R$(63,70]

  34. Similarly, the number of households on the Cadastro Único registry is growing over this time period; see Figure 12 in
Appendix C.1. This too is likely due to a variety of factors including population growth, a struggling Brazilian economy, and
increased awareness/understanding of the BF program over time.
  35. Alternatively, households may wait to update simply because they are only required to update every two years and/or to
avoid suspicion of misreporting that could result from updating immediately after the reform.




4.2     Identification Assumption 2

      At a high level, our second identification assumption is going to relate how the numbers reporting in bins
below R$63, R$(0,7], ..., R$(56,63], evolve over time to how the numbers reporting in our two bins of interest,
R$(63, 70] and R$(70, 77], would have evolved over time in absence of the reform. In this sense, bins below
R$63 can be viewed as our “control bins” while R$(70,77] and R$(63,70] can be viewed as our “treatment
bins”. Building towards our second identification assumption and our main empirical specification, consider
the following regression, where \(N_{(x-7,x],t}\) denotes the number of individuals reporting in income bin R$(x−7, x]
in month t:

\[
\underbrace{\log N_{(x-7,x],t}}_{\text{log \# in } (x-7,\,x]} = \underbrace{\sum_{k=0}^{K} \alpha_{k,x}\, t^k}_{\text{bin-specific polynomial}} + \underbrace{\beta_{1,x}\,\mathrm{post}_t + \beta_{2,x}\,\mathrm{post}_t \times t}_{\text{post-reform deviation from polynomial}} + \epsilon_{xt} \tag{14}
\]

where \(\mathrm{post}_t\) takes value 1 if month t is after the reform and 0 otherwise (i.e., \(\mathrm{post}_t = 1\) if t ≥ June 2014 and
0 otherwise). Regression (14) simply estimates a break from the (bin-specific) polynomial time trend in the
post-reform period for the (log) number of people reporting in R$(x−7, x]. Running Regression (14) separately
for each bin (using cubic bin-specific polynomials), Figure 5 plots \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x}\,\bar{t}\) for each x ∈ {7, 14, ..., 77}, where
\(\bar{t}\) represents the final month in our analysis period (June, 2016). In other words, Figure 5 plots how the log
number of people reporting in each seven increment bin below R$77 deviated from its bin-specific cubic time
trend in the post-reform period:




Note: This figure plots how the log number of people reporting in each seven increment bin deviated in the post-reform period
from a bin-specific, cubic time trend estimated via Regression (14) along with 95% confidence intervals using robust standard
errors. For each x ∈ {7, 14, ..., 77}, we estimate the deviation from the bin-specific time trend in June, 2016 as \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x}\,\bar{t}\), where
\(\bar{t}\) represents the final month in our analysis period (June, 2016). The green horizontal line plots the average deviation (-0.038)
from trend for bins with x ≤ R$63.

        Figure 5: Deviation from Cubic Time Trend in Post-Reform Period by Income Bin
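
A bin-by-bin fit of Regression (14) can be sketched with ordinary least squares as follows. This is an illustrative Python sketch on synthetic, noise-free data, not the authors' code: the function name, the month indexing (month 0 as June 2012, month 24 as June 2014, month 48 as June 2016), and the rescaling of time to the unit interval for numerical stability are all assumptions. Robust standard errors, which the figure note mentions, are not computed here.

```python
import numpy as np


def fit_trend_break(log_counts, months, reform_month, degree=3):
    """OLS fit of Regression (14) for a single bin.

    `months` must be sorted in ascending order, start at 0, and end at the
    final analysis month. Returns (beta1, beta2, deviation at the final month),
    with beta2 measured per unit of rescaled time.
    """
    months = np.asarray(months, dtype=float)
    t = months / months.max()                     # rescale time to [0, 1]
    post = (months >= reform_month).astype(float)
    # Columns: polynomial terms t^0..t^K, then post and post x t.
    X = np.column_stack([t**k for k in range(degree + 1)] + [post, post * t])
    coef, *_ = np.linalg.lstsq(X, np.asarray(log_counts, dtype=float), rcond=None)
    beta1, beta2 = coef[-2], coef[-1]
    return beta1, beta2, beta1 + beta2 * t[-1]


# SYNTHETIC series: cubic trend plus a post-reform break of -0.3 + 0.9 * t.
months = np.arange(49)
s = months / 48
post = (months >= 24).astype(float)
log_counts = 8.0 + 0.5 * s - 0.2 * s**2 + 0.1 * s**3 + post * (-0.3 + 0.9 * s)

beta1, beta2, deviation = fit_trend_break(log_counts, months, reform_month=24)
print(beta1, beta2, deviation)
```

Because the break is parameterized as a level shift plus a slope shift, the quantity plotted for each bin is the fitted deviation at the final month rather than either coefficient alone; that sum does not depend on how time is rescaled.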


      Figure 5 shows that the number of people reporting in income bins below R$63 saw very minor (and
mostly statistically insignificant) trend breaks in the post-reform period. The dashed green line in Figure 5


indicates an average trend break of -0.038 across these bins, indicating that the number of people reporting
incomes in bins {R$(0, 7], R$(7, 14], ..., R$(56, 63]} was, on average, approximately 3.8% lower in June, 2016
relative to polynomial trend. However, under Identification Assumption 1, any post-reform trend break for
these bins must be due to underlying time variation unrelated to the reform. Loosely speaking, our second
identification assumption is going to be that our two treatment bins, R$(63, 70] and R$(70, 77], would have
seen the same deviation post-reform as the control bins had the reform not happened, i.e., these two bins
also would have seen a 3.8% reduction in the number of people in June, 2016 (relative to their bin-specific
polynomial time trends) had the reform not happened. However, as can be seen in Figure 5, the number of
people reporting incomes in R$(63, 70] saw a reduction of around 13.6% (a 0.146 log-point decrease) while
the number of people reporting incomes in R$(70, 77] saw an increase of 326.3% (a 1.45 log-point increase)
as of June 2016. Formally, our second identification assumption is:

Identification Assumption 2. In absence of the reform, the (log) number of people reporting in each bin
evolves according to:

\[
\log N_{(x-7,x],t} = h(t) + \sum_{k=0}^{K} \alpha_{k,x}\, t^k + \nu_{x,t} \quad \text{for } x \in \{7, 14, \ldots, 77\}
\]
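
The percentage changes quoted before the assumption follow from the corresponding log-point deviations through the relation percent change = 100 × (exp(Δ log N) − 1). A minimal Python check of that arithmetic, with a hypothetical helper name, is:

```python
import math


def log_points_to_percent(delta_log):
    """Convert a change in log points to a percentage change."""
    return (math.exp(delta_log) - 1.0) * 100.0


# Log-point deviations quoted in the text for June 2016.
deviations = {
    "control bins (average)": -0.038,
    "R$(63, 70]": -0.146,
    "R$(70, 77]": 1.45,
}
for label, delta in deviations.items():
    print(f"{label}: {delta:+.3f} log points is {log_points_to_percent(delta):+.1f}%")
```

For small deviations the two measures nearly coincide, which is why the average control-bin deviation of -0.038 log points is described as approximately 3.8%; for the large deviation in R$(70, 77] the distinction matters a great deal.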


    Under Identification Assumption 2, the difference between the (log) number of households in any two
7-increment bins below R$77 is, in absence of the reform, governed by a stable polynomial plus a random
error term. Combining Assumptions 1 and 2, we can use our control bins (i.e., bins ≤R$63) to identify h(t)
in the post-reform period.36 In other words, we can use our control bins to identify the expected deviation
from bin-specific polynomials in the post-reform period if the reform did not occur. Any deviation observed
above and beyond h(t) in our two treatment bins is then attributed to the reform. This brings us to our
main empirical specification: a generalized difference-in-difference specification where we allow for flexible
pre-treatment dynamics between treatment and control bins:

\[
\log N_{(x-7,x],t} = \delta_t + \sum_{k=0}^{K} \alpha_{k,x}\, t^k + \left[\beta_{1,x}\,\mathrm{post}_t + \beta_{2,x}\,\mathrm{post}_t \times t\right] \times \mathrm{treat}_x + \epsilon_{xt} \quad \text{for } x \in \{7, 14, \ldots, 77\} \tag{15}
\]

where \(\delta_t\) represents a set of month fixed-effects; \(\mathrm{treat}_x\) takes value 1 if x ∈ {70, 77} and 0 otherwise; and
\(\sum_{k=0}^{K} \alpha_{k,x}\, t^k\) captures polynomial time trends for each bin (indexed by x) which predate the reform and are
assumed to persist into the post-reform period had the reform not occurred. Under our two identification
assumptions, the causal impact of the reform on the (log) number of households locating in R$(x − 7, x] in
month t, denoted \(\Delta \log N_{(x-7,x],t}\), is equal to \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x}\, t\) for x ∈ {70, 77}.
    Before we present the results from Equation (15) in Section 5 (using various polynomial degrees K), let
   36. We use the distribution below R$63 rather than the distribution above R$77 to control for underlying time trends because
the distribution above R$77 is presumably impacted by the reform given that the jumping households would have located above
R$77 in absence of the reform. However, reported incomes sufficiently far above R$77 should not be impacted by the reform.
The question is, how far is sufficiently far? Kleven and Waseem (2013) use an iterative procedure to determine how far above a
notch is sufficiently far so that the distribution beyond this point is unaffected by the notch. Their procedure relies on equating
the excess mass at and below the notch with the missing mass above the notch. However, such an exercise is not feasible in our
setting due to extensive margin responses. In particular, while some of the “excess” households locating below R$77 would have
located above R$77 in absence of the reform, some of them may have simply decided to not report an income and, thus, not be
on the registry. Thus, we cannot equate the excess mass below R$77 with the missing mass above R$77. Consistent with our
selection of control bins, Kleven (2016) notes that when extensive margin responses are strong, one should consider only using
data below the notch to estimate the bunching mass.


us discuss the statistical and economic interpretations of Identification Assumption 2. From a statistical
perspective, Identification Assumption 2 is simply a generalization of the standard difference-in-difference
assumption (this generalization is employed in, for example, Wolfers (2006) and is discussed more generally
in Mora and Reggio (2013)).37 Setting K = 0 results in a standard difference-in-difference estimator in which
the control groups and treatment groups are assumed to have “parallel trends” pre-reform that would have
persisted in the post-reform period had the reform not occurred. The standard difference-in-difference strategy
therefore assumes that the difference between the treatment and control groups follows a 0th order polynomial
in absence of the reform (i.e., that they differ by a constant). Larger values of K require progressively weaker
parallel assumptions. For instance, setting K = 1 requires the “parallel growths” assumption which asserts
that the growth rates in the treatment and control groups over time are the same (i.e., that second and
higher differences between treatment and control groups are constant over time).38 Setting K = 2 leads to
the “parallel accelerations” assumption which asserts that the acceleration rates in the treatment and control
groups over time are the same (i.e., third and higher differences between treatment and control groups are
constant over time). Setting K > 2 results in what Mora and Reggio (2013) refer to as the “parallel-K ”
assumption which asserts that the K + 1 and higher differences between the treatment and control groups
are constant over time.
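
A pooled version of this generalized difference-in-difference specification, with month fixed effects, bin-specific polynomials of degree K, and post-reform break terms for the treated bins, can be sketched as follows. This is an illustrative Python sketch on synthetic, noise-free data and not the authors' implementation. Several choices are assumptions: the function name, the rescaling of time to the unit interval, and the omission of polynomial terms for one reference control bin (the month fixed effects already absorb any trend common to all bins, so including polynomials for every bin would make the design matrix rank deficient).

```python
import numpy as np


def fit_generalized_did(log_counts, reform_index, treated, degree):
    """Fit the pooled specification; return {bin index: deviation at final month}.

    log_counts has shape (n_bins, n_months); bin 0 is the reference control bin
    and must not be listed in `treated`.
    """
    y = np.asarray(log_counts, dtype=float)
    n_bins, n_months = y.shape
    s = np.arange(n_months) / (n_months - 1)          # time rescaled to [0, 1]
    post = (np.arange(n_months) >= reform_index).astype(float)

    def column(bin_index, values):
        col = np.zeros((n_bins, n_months))
        col[bin_index, :] = values
        return col.ravel()

    columns = []
    for m in range(n_months):                         # month fixed effects
        col = np.zeros((n_bins, n_months))
        col[:, m] = 1.0
        columns.append(col.ravel())
    for b in range(1, n_bins):                        # bin-specific polynomials
        for k in range(degree + 1):
            columns.append(column(b, s**k))
    beta_start = len(columns)
    for b in treated:                                 # post-reform break terms
        columns.append(column(b, post))
        columns.append(column(b, post * s))

    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    betas = coef[beta_start:].reshape(len(treated), 2)
    return {b: b1 + b2 * s[-1] for b, (b1, b2) in zip(treated, betas)}


# SYNTHETIC data: 11 bins, 49 months, common shock h(t), cubic bin trends,
# and post-reform effects in the last two bins only.
rng = np.random.default_rng(0)
n_bins, n_months, reform_index = 11, 49, 24
s = np.arange(n_months) / (n_months - 1)
post = (np.arange(n_months) >= reform_index).astype(float)
alphas = rng.normal(size=(n_bins, 4))
data = 0.3 * np.sin(6 * s) + sum(alphas[:, [k]] * s**k for k in range(4))
data[9] += post * (-0.05 - 0.10 * s)
data[10] += post * (0.20 + 1.20 * s)

effects = fit_generalized_did(data, reform_index, treated=[9, 10], degree=3)
print(effects)
```

Setting `degree=0` in this sketch corresponds to the standard difference-in-difference case, while larger values allow the treated and control bins to follow different polynomial trends before the reform.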
    But what is the economic meaning of Identification Assumption 2? Fundamentally, there are complex
structural relationships between the number of people reporting in each income bin; these relationships are
determined by how the true income distribution is evolving over time, how the true income distribution is
mapped to the reported income distribution, as well as growth of the BF registry over time. By making
Identification Assumption 2, we are essentially assuming that the differences over time between the number
of people reporting in each income bin (which are determined by unknown structural relationships) are well
approximated by polynomials which would have persisted in absence of the reform. Identification Assumption
2 is not fully testable in the same way that the standard differences-in-differences assumption (that parallel
pre-trends persist post treatment) is not fully testable. However, as in the case of the standard “parallel
trends” assumption, we can gauge whether Identification Assumption 2 seems sensible by looking at pre-
reform data. Identification Assumption 2 implies that the difference between the log number of households in
any two 7-increment bins below R$77 is governed by a stable polynomial in absence of the reform. Appendix
C.4 shows how the differences log N(70,77] − log N(x−7,x] and log N(63,70] − log N(x−7,x] evolved in the
pre-reform period for x ∈ {7, 14, ..., 63}. These differences follow stable, low order polynomial relationships
in the pre-period. While this of course does not imply that these relationships would have persisted into the
post-reform period (just as parallel pre-trends do not imply that the trends would continue on a parallel path
post treatment), it is at least suggestive that stable relationships exist between the various income bins.
    However, because we have several control bins, we can partially test the validity of our two identification
assumptions. In particular, under Identification Assumption 1, our control bins should be unaffected by the
reform. Thus, in the post-reform period, we can test whether the number of households locating in each
of our control bins can be accurately predicted by our other control bins using Identification Assumption
2. Thus, at the end of Section 5, we run a number of placebo tests, suggesting that our two identification
assumptions are reasonable.
  37. Equation (2) in Wolfers (2006) uses an analogous generalized difference-in-difference accounting for differential quadratic
pre-trends. Our Equation (15) is a combination of Equation (17) and Equation (20) from Mora and Reggio (2013).
  38. A number of papers discuss the extended version of differences-in-differences under the “parallel growths” assumption (e.g.,
Mora and Reggio (2013), Rambachan and Roth (2020), and Bilinski and Hatfield (2019)).



4.3     Could We Use a Bunching Estimator to Bound the MVPF?

      Our empirical strategy amounts to estimating the reduction in the mass of households bunching at the
old notch and the increase in the mass of households bunching at the new notch as a result of the reform
using a difference-in-difference strategy. But one may wonder whether we could have used standard bunching
techniques (developed in Kleven and Waseem (2013) and Saez (2010)) to back out the changes in these
masses. Of course, even using standard bunching techniques, one must account for time variation, i.e., one
must estimate how many people would have bunched at the old notch and at the new notch in the post-reform
period had the reform not occurred. As an example of how one might use standard bunching techniques while
also controlling for time variation, one could first use these techniques to back out the counterfactual reported
income distributions that would result if there were no notches in both the pre- and post-reform period (e.g.,
in June 2014 and June 2016). To do so, one would use the observed reported income distributions in the
pre- and post-reform periods and make the following two assumptions (which are standard in the bunching
literature): (a) the reported income distribution would be smooth in absence of the notch, and (b) the
reported income distribution sufficiently far away from the notch is unaffected by the notch. One could
then assume that the relationship that exists between the counterfactual and the observed reported income
distributions in the pre-reform period would have stayed the same in the post-reform period in absence of
the reform. This assumption combined with the estimated counterfactual reported income distribution in
the post-reform period would then allow one to calculate the reported income distribution in the post-reform
period had the reform not happened. A similar strategy is employed in Carril (2022). However, this strategy
is unlikely to be feasible in our setting. As seen in Figure 17 in Appendix C.3, the reported income distribution
in Brazil is extremely non-smooth. For instance, in June 2014, in addition to substantial bunching at the
notch of R$70, there is extreme bunching at numbers equal to 0 mod 50, less severe bunching at numbers
equal to 0 mod 10, and substantial bunching at R$60. R$60 was a previous BF eligibility threshold (the notch
of R$60 was implemented in 2006 while the notch of R$70 was implemented in 2009; see Gazola Hellmann
(2015)). While bunching estimators have been augmented to deal with “round-number” bunching and other
“reference-point” bunching (e.g., Kleven and Waseem (2013) and Best and Kleven (2017)), we believe that
the pervasiveness and the variability of round-number and reference-point bunching in our setting will make
it too difficult to precisely identify the counterfactual distributions around our notches (in particular, around
the original notch of R$70 which is equal to 0 mod 10). Thus, our empirical strategy highlights an alternative
way to estimate changes in bunching when smoothness assumptions cannot be used to estimate counterfactual
distributions.


                                               5    Results

      In this Section, we present results from our generalized difference-in-difference specification, Equation
(15), and, in turn, calculate bounds on the MVPF. Based on these bounds, we then discuss the welfare
implications of the reform. Next, we show robustness of our results to a variety of factors: we show robustness
to Identification Assumption 1 by allowing the number of households between R$(56, 63] to also be impacted
by the reform; we show robustness to a more general version of Identification Assumption 2; and, we show
robustness of our results to household composition. Finally, we conduct a placebo exercise which partially
tests the validity of our two identification assumptions.



5.1     Difference-in-Difference Results

      First, we present results from estimating Equation (15) setting K = 3, which assumes the difference in the
number of households reporting in any two income bins below R$77 approximately follows a cubic polynomial
over time (in absence of the reform). Once we have estimated Equation (15), we can recover the causal impact
of the reform on the log number of households in R$(x − 7, x] for x ∈ {70, 77} in any given month t, denoted
\(\Delta \log N_{(x-7,x],t} = \hat{\beta}_{1,x} + \hat{\beta}_{2,x} t\). Figure 6 plots the evolution of log N(63,70],t and log N(70,77],t over time
along with the estimated counterfactual path if the reform did not occur: log N(63,70],t − ∆ log N(63,70],t
and log N(70,77],t − ∆ log N(70,77],t . The difference between the actual path and the counterfactual path
yields the causal impact of the reform under our two identification assumptions.
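
The estimation step can be sketched on simulated data. Equation (15) is not reproduced in this section, so the code below assumes it has the form of Equation (16) with the bin-specific factor fixed at one: common month dummies, bin-specific degree-K polynomials, and a post-reform level and slope shift for each treated bin. The function name `fit_generalized_did`, the simulated counts, and the "true" effects are all hypothetical, and the sketch uses plain OLS without the clustered standard errors used in the paper.

```python
import numpy as np

def fit_generalized_did(log_n, post, treated, K):
    """OLS sketch of a generalized difference-in-difference regression.

    log_n   : (n_bins, T) array of log household counts per bin and month
    post    : (T,) array, 1.0 in post-reform months and 0.0 otherwise
    treated : row indices of treated bins (row 0 must be a control bin)
    K       : degree of the bin-specific polynomial time trends
    Returns ({bin: (beta1_hat, beta2_hat)}, t) with t rescaled to [0, 1].
    """
    n_bins, T = log_n.shape
    t = np.arange(T) / (T - 1)                   # rescaled time keeps t**K well conditioned
    poly = np.vander(t, K + 1, increasing=True)  # columns 1, t, ..., t**K
    base = T + (n_bins - 1) * (K + 1)            # first treatment-effect column
    X = np.zeros((n_bins * T, base + 2 * len(treated)))
    for x in range(n_bins):
        rows = slice(x * T, (x + 1) * T)
        X[rows, :T] = np.eye(T)                  # common month dummies
        if x > 0:                                # bin 0 is the reference bin
            start = T + (x - 1) * (K + 1)
            X[rows, start:start + K + 1] = poly
    for j, x in enumerate(treated):
        rows = slice(x * T, (x + 1) * T)
        X[rows, base + 2 * j] = post             # level shift after the reform
        X[rows, base + 2 * j + 1] = post * t     # slope shift after the reform
    coef, *_ = np.linalg.lstsq(X, log_n.reshape(-1), rcond=None)
    effects = {x: (coef[base + 2 * j], coef[base + 2 * j + 1])
               for j, x in enumerate(treated)}
    return effects, t

# Simulated, noiseless data: 11 bins, 48 months, reform at month 24.
rng = np.random.default_rng(0)
n_bins, T, K = 11, 48, 3
post = (np.arange(T) >= 24).astype(float)
t_sim = np.arange(T) / (T - 1)
month_shocks = rng.normal(0.0, 0.1, T)
alpha = rng.normal(0.0, 0.5, (n_bins, K + 1))
log_n = alpha @ np.vander(t_sim, K + 1, increasing=True).T + month_shocks
true_effects = {9: (-0.10, 0.00), 10: (1.20, 0.30)}   # hypothetical effects
for x, (b1, b2) in true_effects.items():
    log_n[x] += post * (b1 + b2 * t_sim)

effects, t = fit_generalized_did(log_n, post, treated=[9, 10], K=K)
b1_hat, b2_hat = effects[10]
counterfactual_log = log_n[10] - post * (b1_hat + b2_hat * t)
level_change = np.exp(log_n[10]) - np.exp(counterfactual_log)  # effect in levels
```

The last two lines mirror the construction described above: the counterfactual path is the actual log count minus the estimated effect, and exponentiating both paths converts the log change into a change in the number of households.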




        (a) N(63,70] : Actual & Counterfactual                            (b) N(70,77] : Actual & Counterfactual
Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time
along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual log number
of people reporting in the given interval minus the causal impact of the reform, \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x} t\), estimated using Equation (15) where
we set treatx = 1 if x ∈ {70, 77} and K = 3. Confidence intervals are constructed from clustered standard errors at the bin level.
The timing of the reform is indicated by the gray, shaded region.

               Figure 6: Causal Impact of the Reform Estimated From Equation (15)


      Consistent with the theoretical model from Section 2.1, Figure 6 shows that the reform led to a large,
significant increase of about 1.5 log points in the number of households reporting in bin R$(70, 77] and
a significant decrease of about 0.1 log points in the number of households reporting in bin R$(63, 70].39
Translating these log changes into levels, the reform led to an increase of approximately 49,000 households
locating in R$(70, 77] and a decrease of approximately 27,000 households locating in R$(63, 70]. Plugging
these numbers into Equations (12) and (13), we get that \(B_{\bar{t}} \approx 27{,}000\) and \(J_{\bar{t}} \approx 22{,}000\). Thus, of the additional
49,000 households bunching at the new notch, 27,000 would have bunched at the old notch had the reform
   39. All standard errors for our analysis are clustered at the bin level (using STATA’s default small-number-of-clusters bias
adjustment) as this is the level of “treatment assignment”, see Abadie et al. (2017). Thus, we have 9 control clusters and 2
treatment clusters. However, Imbens and Kolesár (2016) show that these standard errors are modestly under-estimated when
the number of clusters is small (they find that for samples with 10 clusters, 95% confidence intervals only have a 91% coverage
rate). Hence, our standard errors may be modestly overstating the true statistical confidence we have in our estimates. An
alternative approach is to use non-clustered wild bootstrapped p-values (note, wild cluster bootstrapped p-values lead to severe
bias in difference-in-difference settings with a small number of treated clusters, see Roodman et al. (2019)). In our setting, these
p-values are smaller than those generated from our cluster-robust standard errors. Finally, Ferman and Pinto (2019) suggest
another alternative for difference-in-difference settings with a small number of treated clusters; unfortunately, their results rely
on asymptotic theory in the number of control clusters, which we believe is inappropriate for our setting given that we only have
9 control clusters.


not happened, while 22,000 would have located above the new notch had the reform not happened (i.e.,
22,000 households jump into the program as a result of the reform).40
      We repeat this exercise under the alternative assumptions that the differences in the (log) numbers
in any two bins follow quadratic, quartic, or quintic polynomials over time, i.e., we re-estimate Equation
(15) for K = 2, 4, 5.41 Figures of the counterfactual paths for our two treatment bins under these alternative
assumptions are shown in Appendix C.2. Table 3 presents the estimated impact of the reform on the numbers
locating in R$(63,70] and R$(70,77], ∆N(63,70],t and ∆N(70,77],t , along with the estimated number of bunching
households \(B_{\bar{t}}\), estimated number of jumping households \(J_{\bar{t}}\), and lower and upper bounds on the MVPF in
June 2016. All of these quantities are stable across different values of K . Under our preferred specification
(K = 3), we find a lower bound for the MVPF of the reform around 0.9 and an upper bound of around 1.12.

 Table 3: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation (15)

Polynomial Degree, K      (1)                               (2)                               (3)                 (4)                 (5)                       (6)
                          \(\Delta N_{(63,70],\bar{t}}\)    \(\Delta N_{(70,77],\bar{t}}\)    \(B_{\bar{t}}\)     \(J_{\bar{t}}\)     \(MVPF_{L,\bar{t}}\)      \(MVPF_{U,\bar{t}}\)

Quadratic, K = 2          -26,279                           51,759                            26,279              25,480              0.88                      1.11
                          (6,163)                           (2,160)                           (6,163)             (6,659)             (0.03)                    (0.03)
Cubic, K = 3              -27,452                           49,247                            27,452              21,794              0.90                      1.12
                          (4,357)                           (234)                             (4,357)             (4,592)             (0.02)                    (0.02)
Quartic, K = 4            -29,338                           50,873                            29,338              21,535              0.90                      1.13
                          (6,257)                           (1,345)                           (6,257)             (6,503)             (0.03)                    (0.03)
Quintic, K = 5            -29,240                           50,559                            29,240              21,318              0.90                      1.13
                          (5,912)                           (1,184)                           (5,912)             (6,141)             (0.03)                    (0.03)

Note: Columns (1) and (2) show the estimated impacts of the reform on the number of single individual households reporting
incomes in bins R$(63,70] and R$(70,77] for June 2016: \(\Delta N_{(63,70],\bar{t}}\) and \(\Delta N_{(70,77],\bar{t}}\). Estimates are calculated using Equation
(15) with various polynomial degrees K ∈ {2, 3, 4, 5}. Columns (3) and (4) show the estimated number of bunching and jumping
households for June 2016, \(B_{\bar{t}}\) and \(J_{\bar{t}}\), calculated using Equations (12) and (13). Columns (5) and (6) show the estimated lower
and upper bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in
parentheses and are computed via the delta method from the clustered standard errors estimated in Equation (15).
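
The relationship between the columns of Table 3 can be verified with a few lines of arithmetic. Equations (12) and (13) are not reproduced in this section, so the mapping below is an assumption inferred from the table and from the verbal description in Section 5.1: the number of bunching households equals the decline in bin R$(63,70], and the number of jumping households equals the increase in bin R$(70,77] net of those bunchers. The function name `bunchers_and_jumpers` is hypothetical.

```python
# Columns (1) and (2) of Table 3, keyed by polynomial degree K.
estimated_changes = {
    2: (-26279, 51759),
    3: (-27452, 49247),
    4: (-29338, 50873),
    5: (-29240, 50559),
}
# Columns (3) and (4) of Table 3, used only as a cross-check.
reported = {
    2: (26279, 25480),
    3: (27452, 21794),
    4: (29338, 21535),
    5: (29240, 21318),
}

def bunchers_and_jumpers(delta_old_bin, delta_new_bin):
    """Assumed mapping: bunchers leave the old-notch bin, and the remaining
    inflow into the new-notch bin is attributed to jumpers."""
    bunchers = -delta_old_bin
    jumpers = delta_new_bin - bunchers
    return bunchers, jumpers

for k, (d_old, d_new) in estimated_changes.items():
    b, j = bunchers_and_jumpers(d_old, d_new)
    print(f"K = {k}: bunchers = {b:,}, jumpers = {j:,}, reported = {reported[k]}")
```

The computed jumper counts match column (4) to within one household; the small discrepancies for K = 3 and K = 5 are consistent with rounding of the underlying estimates.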



5.2     Discussion

      Our lower bound on the MVPF of the June 2014 BF reform is approximately 0.90 whereas our upper
bound on the MVPF is approximately 1.12. This implies that the WTP for the reform relative to the
budgetary cost is between 0.90 and 1.12. Notably, these bounds are somewhat tightly centered around 1.
This is because the number of mechanical households (households locating below 70 both with and without
the reform) is much larger than the number of bunching or jumping households. In particular, for June 2016,
we estimate there to be around 1.88 million mechanical households compared to 22,000 jumping households
and 27,000 bunching households.
      What can we say about the welfare implications of an MVPF between 0.90 and 1.12? We can be sure
   40. One may then wonder whether we estimate a reduction of 22,000 households locating above R$77 as a result of the reform.
To investigate this, we can include income bins above R$77 in Equation (15) as additional treatment bins. Cumulatively, we
estimate that income bins between R$77 and R$147 saw a decline of around 14,300 households when setting K = 3 (with a
standard error of around 6800). However, as mentioned in footnote 36, the set of jumping households consists of both households
who would have located above R$77 in absence of the reform and households who would not have reported an income to the
registry at all in absence of the reform (i.e., new entrants). Because of these new entrants, it is not surprising that we estimate
the number of jumping households as exceeding the reduction in households reporting above R$77.
   41. Our results are also robust to estimating Equation (15) with more granular income bins, see Appendix C.5.



that the reform was welfare improving as long as the government values giving R$0.90 to BF households in
a non-distortionary manner more than spending R$1 on their next best alternative. On the other hand, the
reform was welfare decreasing if the government values spending R$1 on their next best alternative more
than giving R$1.12 to BF households in a non-distortionary manner.42
    But what is the government’s best alternative use of funds? In general, this is a difficult question to
answer. Lindert et al. (2007) show that BF performs better (in terms of targeting performance) than other
anti-poverty programs in both Brazil and other countries in Latin America; this suggests that there is not a
better alternative anti-poverty program that the government could finance. Instead, for expositional simplicity
we assume that the next best use of funds is a non-distortionary UBI. Under this assumption, we can perform
a conservative back-of-the-envelope calculation to argue that the 2014 BF reform was almost certainly welfare
improving. Suppose the government is utilitarian and that the true income distribution is log-normal with
parameters taken from the World Bank’s PovcalNet database for Brazil in 2016.43 Conservatively, we assume
that the BF household per-capita income distribution matches the distribution for the bottom 50% of income
earners in Brazil.44 Moreover, we assume utility of consumption is CRRA: \(u(c) = c^{1-\gamma}/(1-\gamma)\). Based on
these assumptions, the government prefers giving R$0.90 (in a non-distortionary manner) to BF households
more than spending R$1 on a non-distortionary UBI as long as γ > 0.14. While estimates for γ vary widely,
typically estimates fall between 1 and 10 (Outreville, 2014). Alternatively, if γ = 1 (i.e., u(c) = log(c)), giving
R$0.90 (in a non-distortionary manner) to BF households yields equivalent welfare gains to spending R$1.50
on a non-distortionary UBI. Thus, if the next best alternative policy is a non-distortionary UBI and the
government is utilitarian, the BF reform was almost certainly welfare improving. This finding contributes to
the debate around targeted vs. universal transfers in developing settings (Hanna and Olken, 2018): we find
that even in a setting with a highly pronounced notch and substantial scope for misreporting, the efficiency
cost generated by behavioral responses is simply not large enough to outweigh the equity gain associated
with increased benefit generosity targeted to poor households (relative to a universal transfer).
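
One way to reproduce the back-of-the-envelope numbers is sketched below. This is a reconstruction under stated assumptions, not necessarily the authors' exact procedure: it assumes a utilitarian planner who weights each reais by marginal utility \(c^{-\gamma}\), a UBI paid equally to everyone, BF transfers paid equally to the bottom half of a log-normal income distribution, and the log-normal identity \(G = 2\Phi(\sigma/\sqrt{2}) - 1\) linking the Gini coefficient to the shape parameter. Under these assumptions the ratio of average marginal utility in the bottom half to the population average is \(2\Phi(\gamma\sigma)\), and the mean income drops out. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def lognormal_sigma_from_gini(gini):
    """Log-normal shape parameter implied by G = 2*Phi(sigma/sqrt(2)) - 1."""
    return sqrt(2.0) * STD_NORMAL.inv_cdf((gini + 1.0) / 2.0)

def relative_marginal_utility_bottom_half(gamma, sigma):
    """E[c**-gamma | c below median] / E[c**-gamma] for log-normal c.
    The ratio simplifies to 2*Phi(gamma*sigma)."""
    return 2.0 * STD_NORMAL.cdf(gamma * sigma)

def break_even_gamma(transfer, sigma):
    """Risk aversion at which `transfer` reais to the bottom half is worth
    exactly R$1 of UBI, i.e. transfer * 2*Phi(gamma*sigma) = 1."""
    return STD_NORMAL.inv_cdf(1.0 / (2.0 * transfer)) / sigma

sigma = lognormal_sigma_from_gini(0.533)     # Gini for Brazil in 2016 (footnote 43)
gamma_star = break_even_gamma(0.90, sigma)   # threshold for the lower bound of 0.90
ubi_equivalent = 0.90 * relative_marginal_utility_bottom_half(1.0, sigma)  # log utility

print(f"sigma = {sigma:.3f}")
print(f"break-even gamma = {gamma_star:.3f}")
print(f"UBI equivalent of R$0.90 at gamma = 1: R${ubi_equivalent:.2f}")
```

Under these assumptions the break-even risk aversion comes out close to the 0.14 quoted above, and the UBI equivalent at γ = 1 comes out near R$1.5. Small differences from the reported figures may reflect details of the original calculation that are not described in the text.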
    Finally, when calculating our bounds for the MVPF, we have ignored any potential fiscal externalities
when determining the total cost of the reform (i.e., we have ignored the impact that behavioral responses
have on other components of the government budget). There is evidence to suggest that some beneficiaries
partially substitute formal sector employment for informal sector employment to become eligible for the
BF benefit (as informal income is less easily verified making misreporting less costly; see De Brauw et al.
(2015)). While it is easier to evade taxes in the informal sector, it is unlikely that this behavioral response will
affect tax revenues as BF beneficiaries are almost certainly sufficiently poor so as to be exempt from income
   42. Technically, the reform is welfare improving as long as the government values spending R$1 on their next best alternative
less than splitting R$0.90 (in a non-distortionary manner) among the mechanical, threshold, bunching, and jumping households,
where the split is determined by the lower bounds on each group’s WTP for the reform. Similarly, the reform is welfare decreasing
if the government values spending R$1 on their next best alternative more than splitting R$1.12 (in a non-distortionary manner)
among the mechanical, threshold, bunching, and jumping households, where the split is determined by the upper bounds on each
group’s WTP for the reform.
   43. In 2016, the mean per-capita income in Brazil was R$615.32 and the Gini coefficient was 0.533; these parameters fully pin
down the income distribution if we assume log-normality.
   44. Bastagli (2008) finds that of all transfers paid in 2004, 91% went to households in the bottom 50% of the income distribution.
Alternatively, Lindert et al. (2007) suggests even better targeting performance, finding that 94% of all benefits go to the bottom
40%. Notably, transfers are not evenly distributed across the bottom 50% of households: the poorest households within this
group receive more (e.g., Bastagli (2008) finds that the bottom 20% of the income distribution receive 50% of BF transfers, while
Lindert et al. (2007) report that the bottom 20% receive 73% of BF transfers). Thus, we believe the assumption that the income
distribution of BF households is the same as the income distribution for the bottom 50% of income earners in Brazil is highly
conservative.




taxation.45 However, there is evidence showing that increased BF transfers lead to an increase in formal
sector employment of non-beneficiaries, which leads to higher GDP and tax revenue (Gerard, Naritomi and
Silva, 2021). If this positive fiscal externality exists for the June 2014 reform, this would lower our estimate
for the total cost of the reform, thus raising our estimates for both the lower and upper bound of the MVPF
(see footnote 23). Thus, incorporating this positive fiscal externality would only reinforce our finding that
the 2014 BF reform was welfare improving relative to a non-distortionary UBI.


5.3     Robustness to Identification Assumption 1

      Identification Assumption 1 may not hold if optimization frictions are large. For example, consider
households who, in response to the reform, jump below the threshold by changing their labor supply but
face large labor market frictions. These households may not be able to perfectly jump and bunch at the
new notch, but may instead jump to an income level below R$63. Alternatively, under the original policy,
some households may wish to bunch at R$70, but are only able to locate at an income below R$63 due to
labor market frictions; if these households are able to update their income to an income between R$(70,77],
they may move to the new notch as a result of the reform. Hence, large labor market frictions may imply
that the distribution below R$63 is impacted by the reform. To test for this possibility (or equivalently,
to allow for larger frictions), we relax Identification Assumption 1 by instead assuming that the number of
people reporting in bins ≤R$56 is unaffected by the reform, allowing for the possibility that the number
reporting in bin R$(56, 63] is affected by the reform. We augment Equation (15) by setting treatx = 1 if
x ∈ {63, 70, 77} and 0 otherwise, i.e., we include R$(56, 63] as a treatment bin. Figure 7 plots the actual
and counterfactual paths for log N(56,63],t , log N(63,70],t , and log N(70,77],t . It does not appear that the
reform had a significant impact on the number reporting in R$(56, 63] and the impact of the reform on
our two original treatment bins is robust to including R$(56,63] as a treatment bin. Using this relaxation
of Identification Assumption 1, we find that the MVPF is bounded between 0.82 and 1.07 (see Table 7 in
Appendix C.6).


5.4     Robustness to Identification Assumption 2

      Identification Assumption 2 essentially boils down to assuming that the differences over time between the
(log) number of people reporting in each income bin (which are determined by unknown, complex structural
relationships) are well approximated by polynomials which would have persisted in absence of the reform.
Identification Assumption 2 therefore implies that if, for example, bin R$(35,42] experiences a 1% deviation
from its bin-specific polynomial trend, bin R$(63,70] would also experience a 1% deviation from its bin-specific
polynomial trend in absence of the reform. However, one may argue that, for example, a structural change
in the economy which leads to a 1% deviation from the bin-specific polynomial time trend for bin R$(35,42]
would lead to a 2% deviation for bin R$(63,70]. As such, we relax Identification Assumption 2 to instead
assume that the log number reporting in each bin evolves according to:

\[ \log N_{(x-7,x],t} = \gamma_x h(t) + \sum_{k=0}^{K} \alpha_{k,x} t^k + \nu_{x,t} \quad \text{for } x \in \{7, 14, \ldots, 77\} \]

  45. For example, in 2015, individuals earning less than R$1903.98 per-month were exempt from income taxation.




Note: This figure shows the log number of single individual households reporting incomes in R$(56, 63], R$(63, 70], and R$(70, 77]
over time along with the counterfactual paths had the reform not happened. The counterfactual paths are estimated using
Equation (15) (where we set treatx = 1 if x ∈ {63, 70, 77}, i.e., we have three treatment bins: R$(56, 63], R$(63, 70], and
R$(70, 77]). We assume the difference in the (log) number reporting in any two bins follows a cubic time trend in absence of the
reform (i.e., we set K = 3 in Equation (15)). Confidence intervals are constructed from clustered standard errors at the bin level.
The timing of the reform is indicated by the gray, shaded region.

         Figure 7: Counterfactual Paths for Three Treatment Bins Using Equation (15)


Thus, we now assume that deviations from underlying bin-specific polynomials (in logs or, equivalently,
percentage terms) are not necessarily the same across all bins. This relaxation of Identification Assumption
2 allows for a structural change in the economy which creates a 1% deviation from bin-specific polynomial
trend for R$(35,42] to create a (γ70 /γ42 )% deviation from bin-specific polynomial trend for bin R$(63,70].
Under this relaxed version of Identification Assumption 2, we estimate the following non-linear least squares
regression:46

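\[ \log N_{(x-7,x],t} = \gamma_x \delta_t + \sum_{k=0}^{K} \alpha_{k,x} t^k + \left[\beta_{1,x}\, \text{post}_t + \beta_{2,x}\, \text{post}_t \times t\right] \times \text{treat}_x + \varepsilon_{x,t} \quad \text{for } x \in \{7, \ldots, 77\} \qquad (16) \]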

Essentially, running Regression (16) will estimate a common set of month dummies, δt , as well as bin-specific
factors, γx , which multiply these common month dummies for each bin x. We show in Appendix C.7 that
our results are very robust to this relaxation of Identification Assumption 2.


5.5     Robustness to Household Composition

      We also consider the possibility that some of the estimated behavioral impact of the reform for single
individual households may be coming from households misreporting their family composition. For example,
a two adult household can receive greater benefits if they report to be two separate one adult households
  46. Clearly, Identification Assumption 2 is nested within this more general functional form.



as benefits are paid out per-household as opposed to per-capita.47 Hence, the reform may have increased
incentives to misreport family composition as well as income.48 From a theory perspective, Proposition 3
holds even if misreporting responses occur on the family composition margin. However, such a behavioral
response may affect the validity of our identification strategy (for example, it may no longer be reasonable
to assume that the distribution below R$63 is unaffected by the reform if these new “single individual”
households enter at income levels well below the threshold). Thus, we re-do our main analysis restricting to
single individual households whose composition does not vary over the sample period. Our point estimates
for the lower bound for the MVPF still lie between 0.88 and 0.91, while our point estimates for the upper
bound are now smaller lying between 1.01 and 1.05; see Appendix C.8.
      Finally, our main analysis is restricted to single individual households due to the fact that the schedule for
households with more than one individual features a notch at R$70 and a kink below R$70 (e.g., this kink is at
R$35 for two adult households with no children; see Section 3). Consequently, the 2014 BF reform led to both
a change in the notch and a change in the kink for households with more than one individual. Thus, for these
households, it is not necessarily reasonable to make Identification Assumption 1 as we might expect the reform
to affect the number of people reporting in bins around the kink. We nonetheless proceed with estimating the
bounds on the MVPF for two adult households without children, noting that our identification assumptions
are imperfect in this setting. For these households we see the same pattern of behavioral responses to the
reform and estimate lower bounds for the MVPF between 0.88 and 0.96 and upper bounds for the MVPF
between 1.08 and 1.12; see Appendix C.9.49


5.6     Placebo Tests

      We can partially test the validity of our two identification assumptions via placebo tests. Together,
Identification Assumptions 1 and 2 imply that all of the income bins below R$63 should evolve approximately
according to a common time trend plus a bin-specific polynomial throughout the entire analysis period. To
test this, we re-estimate Equation (15) but only include bins below R$63 (i.e., x ∈ {7, 14, ..., 63}) and randomly
assign some of these bins to be “treatment” bins (i.e., treatx = 1). If both of our identification assumptions
are correct, the “treatment effects” from these placebo regressions should be close to zero, i.e., the post-reform
deviation of the “treated” bins relative to the post-reform deviation of the “control” bins should be close to
0. There are a total of 510 different ways to assign our nine bins below R$63 “treatment” status.50
   Figure 8 plots the results from all 510 such regressions, showing the estimated “treatment effects” in June
2016: \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x} \bar{t}\) for x ∈ {7, ..., 63}, where \(\bar{t}\) represents June 2016. Each bin gets assigned treatx = 1 exactly 255

   47. E.g., a household with two adults and no children that has a combined per-capita income of R$60 is eligible for R$70 in
transfers pre-reform and R$77 in transfers post-reform. However, if this household were to report that they were actually two
single individual households with incomes of R$60, they would each be eligible for R$70 in transfers pre-reform and R$77 in
transfers post-reform.
   48. However, the ability of households to misreport the number of members is limited as individuals must provide government
issued IDs for all family members to be on the registry. Moreover, household composition is arguably more verifiable than income
for many households given the large informal sector in Brazil. Thus, we suspect that households are more likely to misreport
income than family composition.
   49. In Appendix C.10, we show strong suggestive evidence of a behavioral response to the change in the basic benefit notch
for households with children, but we do not attempt to bound the MVPF for these households. This is because households with
children may also receive the “variable benefit”. Both the level and location of the notch associated with the variable benefit also
changed with the June 2014 reform. Thus, the WTP of households with children needs to account for changes in the variable
benefit schedule in addition to changes in the basic benefit schedule. This exercise is beyond the scope of the current paper.
   50. There are \(\binom{9}{1}\) ways to pick one “treatment” bin, \(\binom{9}{2}\) ways to pick two “treatment” bins; thus, there are a total of
\(\sum_{i=1}^{8} \binom{9}{i} = 510\) possible regressions.


times, meaning we have 255 “treatment effect” estimates for each bin. Figure 8 shows a fairly tight clustering
of these treatment effects around zero: the mean “treatment effect” (in absolute value) across all bins is
around 4% (0.04 log points). Hence, the “treatment effects” are relatively small for bins below R$63.51 This
suggests that Identification Assumptions 1 and 2 are fairly reasonable.
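The counting behind the 510 placebo regressions and the 255 estimates per bin can be checked directly. The sketch below is illustrative only: it enumerates every non-empty, proper subset of the nine bins as a placebo “treated” group and does not run any regression. The variable names are ours, not the paper's.

```python
from itertools import combinations

# Upper edges of the nine R$7-wide bins below R$63: 7, 14, ..., 63.
bins = list(range(7, 64, 7))

# Every non-empty, proper subset of the bins is one placebo "treated" group
# (subset sizes 1 through 8; the empty set and the full set are excluded).
assignments = [
    set(subset)
    for size in range(1, len(bins))
    for subset in combinations(bins, size)
]

# Number of assignments in which each bin is labelled "treated".
times_treated = {x: sum(x in a for a in assignments) for x in bins}

print(len(assignments))   # number of placebo regressions
print(times_treated)      # placebo estimates available per bin
```

Each bin appears in 2^8 = 256 subsets, one of which is the excluded full set, which is where 255 comes from.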




Note: This figure shows the placebo “treatment effects” (in logs) for each income bin R$(x − 7, x] obtained from estimating
Regression (15) with only bins x ∈ {7, 14, ..., 63} and every possible assignment of treatx ∈ {0, 1} for each x ∈ {7, 14, ..., 63}.
There are \(510 = \sum_{i=1}^{8} \binom{9}{i}\) such regressions and we plot the values of \(\hat{\beta}_{1,x} + \hat{\beta}_{2,x} \times \bar{t}\) for each bin R$(x − 7, x] whenever bin R$(x − 7, x]
is assigned to be a treatment bin (\(\bar{t}\) represents June 2016). We use bin-specific cubic polynomials in each regression, i.e., K = 3.
The solid red line shows the estimated treatment effect for R$(70,77] and the dashed red line shows the estimated treatment
effect for R$(63,70], both from our preferred specification (K = 3).

    Figure 8: Placebo Analysis: Calculating “Treatment Effects” for each Control Bin


    Finally, we can also use these placebo tests as an alternative gauge of the statistical significance of our
estimated treatment effects for income bins R$(70,77] and R$(63,70] from Regression (15). Under Identifica-
tion Assumptions 1 and 2, the post-reform deviations from bin-specific trend plotted in Figure 8 are due to
randomness. Hence, the distribution of “treatment effects” in Figure 8 can be used to construct p-values for
the estimated treatment effects for income bins R$(70,77] and R$(63,70] from Regression (15) in a manner
akin to randomization inference.52 In Figure 8, the solid red line shows the treatment effect of about 1.6
log points for R$(70,77] while the dashed red line shows the treatment effect of about -0.12 log points for
R$(63,70], both from our preferred specification (K = 3). The treatment effect for R$(70,77] is an order of
magnitude larger than any estimated “treatment effect” from placebo regressions, giving us high certainty
that the number of individuals reporting incomes in R$(70,77] was impacted by the reform. Similarly, the
treatment effect for R$(63,70] is larger (in absolute magnitude) than 97.8% of the placebo “treatment effects”,
implying a p-value of 0.022 against the null hypothesis that R$(63,70] was not impacted by the reform.
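The p-value logic described here can be sketched as follows: the p-value is the share of placebo “treatment effects” that are at least as large in absolute value as the actual estimate (so an estimate larger than 97.8% of placebos gives 1 − 0.978 = 0.022). The function name and the ten placebo values below are hypothetical and are not the paper's data.

```python
def placebo_p_value(placebo_effects, estimate):
    """Share of placebo effects at least as large (in absolute value) as the estimate."""
    if not placebo_effects:
        raise ValueError("need at least one placebo effect")
    at_least_as_large = sum(abs(e) >= abs(estimate) for e in placebo_effects)
    return at_least_as_large / len(placebo_effects)


# Hypothetical placebo estimates in log points (illustration only).
placebo = [0.01, -0.03, 0.05, -0.02, 0.13, 0.04, -0.06, 0.02, -0.01, 0.03]

# Compare against an estimate of -0.12 log points, as reported for R$(63,70].
p = placebo_p_value(placebo, -0.12)
print(p)
```

With these made-up values only one placebo effect (0.13) exceeds 0.12 in absolute value, so the share is one in ten.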
   51. We find a negative “average treatment effect” of -0.105 log points for R$(14, 21]. However, R$(14, 21] seems implausibly
far away from the notch to be impacted by the reform, especially given that other bins around R$(14,21] do not appear to be
impacted by the reform. Hence, we interpret this negative “average treatment effect” simply as random variation.
   52. See Imbens and Rubin (2021) or Young (2018) for discussions of randomization inference.




                                            6    Conclusion

   This paper proposes a theoretical framework and empirical strategy to gauge the welfare impacts of
reforms that change notches in transfer schedules. We show that we can use two empirically observable
objects to bound the MVPF of the reform: the number of households who jump down to the new notch and
the number of households bunching at the old notch who move toward the new notch. These bounds on the
MVPF can then be used to construct bounds for the welfare impact of changing the notch. We estimate
bounds for the MVPF of a reform to the Bolsa Família program that changed the structure of the notch,
finding a lower bound of 0.9 and an upper bound of 1.12. Back-of-the-envelope calculations suggest that this
reform was likely welfare improving.
   More broadly, we believe that this paper highlights a new manner in which reduced-form evidence on
jumping and bunching can be used to inform policy. Given the ubiquity of notches, we hope that the methods
developed in this paper will be useful for analyzing reforms in a variety of other contexts such as Medicaid,
income-dependent tax credits, or firm tax schedules. Finally, the bounding techniques developed in this paper may
be helpful for bounding welfare impacts of large reforms which do not necessarily feature notches.




                                            References
Abadie, Alberto, Susan Athey, Guido Imbens, and Jeffrey Wooldridge. 2017. “When Should You
  Adjust Standard Errors for Clustering?”

Bachas, Pierre, and Mauricio Soto. 2021. “Corporate Taxation under Weak Enforcement.” American
  Economic Journal: Economic Policy, 13(4): 36–71.

Bastagli, Francesca. 2008. “The design, implementation and impact of conditional cash transfers targeted
  on the poor: An evaluation of Brazil’s Bolsa Família.” Thesis submitted to the London School of Economics
  for the degree of Doctor of Philosophy.

Bergstrom, Katy, and William Dodds. 2021. “Optimal taxation with multiple dimensions of hetero-
  geneity.” Journal of Public Economics, 200: 104442.

Best, Michael Carlos, and Henrik Jacobsen Kleven. 2017. “Housing Market Responses to Transaction
  Taxes: Evidence From Notches and Stimulus in the U.K.” The Review of Economic Studies, 85(1): 157–193.

Best, Michael Carlos, Anne Brockmeyer, Henrik Jacobsen Kleven, Johannes Spinnewijn, and
  Mazhar Waseem. 2015. “Production versus Revenue Efficiency with Limited Tax Capacity: Theory and
  Evidence from Pakistan.” Journal of Political Economy, 123(6): 1311–1355.

Best, Michael Carlos, James S Cloyne, Ethan Ilzetzki, and Henrik J Kleven. 2020. “Estimating
  the Elasticity of Intertemporal Substitution Using Mortgage Notches.” The Review of Economic Studies,
  87(2): 656–690.

Bilinski, Alyssa, and Laura Hatfield. 2019. “Nothing to see here? Non-inferiority approaches to parallel
  trends and other model assumptions.” Working Paper.

Carril, Rodrigo. 2022. “Rules Versus Discretion in Public Procurement.” Working Paper.

Chetty, Raj. 2009. “Sufficient Statistics for Welfare Analysis: A Bridge Between Structural and Reduced-
  Form Methods.” Annual Review of Economics, 1: 451–488.

De Brauw, Alan, Daniel O. Gilligan, John Hoddinott, and Shalini Roy. 2015. “Bolsa Família and
  Household Labor Supply.” Economic Development and Cultural Change, 63(3): 423–457.

Feldstein, Martin. 1999. “Tax Avoidance and the Deadweight Loss of the Income Tax.” Review of Eco-
  nomics and Statistics, 81(4): 674–680.

Ferman, Bruno, and Cristine Pinto. 2019. “Inference in Differences-in-Differences with Few Treated
  Groups and Heteroskedasticity.” The Review of Economics and Statistics, 101(3): 452–467.

Gazola Hellmann, Aline. 2015. “How does Bolsa Familia Work? Best Practices in the Implementation of
  Conditional Cash Transfer Programs in Latin America and the Caribbean.” Inter-American Development
  Bank Technical Note, 856.

Gerard, François, Joana Naritomi, and Joana Silva. 2021. “Cash Transfers and Formal Labor Markets:
  Evidence from Brazil.” Policy Research Working Papers.



Hanna, Rema, and Benjamin Olken. 2018. “Universal Basic Incomes versus Targeted Transfers: Anti-
  Poverty Programs in Developing Countries.” Journal of Economic Perspectives, 32(4): 201–226.

Hendren, Nathaniel, and Ben Sprung-Keyser. 2020. “A Unified Welfare Analysis of Government Poli-
  cies*.” The Quarterly Journal of Economics, 135(3): 1209–1318.

Henley, Andrew, G. Reza Arabsheibani, and Francisco G. Carneiro. 2009. “On Defining and Mea-
  suring the Informal Sector: Evidence from Brazil.” World Development, 37(5): 992–1003.

Imbens, Guido, and Donald B. Rubin. 2021. Causal inference for statistics, social, and biomedical
  sciences: an introduction. Cambridge University Press.

Imbens, Guido W., and Michal Kolesár. 2016. “Robust Standard Errors in Small Samples: Some
  Practical Advice.” Review of Economics and Statistics, 98(4): 701–712.

Kleven, Henrik J. 2021. “Sufficient Statistics Revisited.” Annual Review of Economics, 13(1): 515–538.

Kleven, Henrik Jacobsen. 2016. “Bunching.” Annual Review of Economics, 8(1): 435–464.

Kleven, Henrik J., and Mazhar Waseem. 2013. “Using Notches to Uncover Optimization Frictions
  and Structural Elasticities: Theory and Evidence from Pakistan.” The Quarterly Journal of Economics,
  128(2): 669–723.

Lindert, Kathy, Anja Linder, Jason Hobbs, and Bénédicte de la Brière. 2007. “The Nuts and Bolts
  of Brazil’s Bolsa Família Program: Implementing Conditional Cash Transfers in a Decentralized Context.”
  World Bank Social Protection Discussion Paper, 0709.

Lockwood, Ben. 2020. “Malas notches.” International Tax and Public Finance, 27(4): 779–804.

Mayshar, Joram. 1991. “Taxation with Costly Administration.” The Scandinavian Journal of Economics,
  93(1): 75.

Milgrom, Paul, and Ilya Segal. 2002. “Envelope theorems for arbitrary choice sets.” Econometrica,
  70(2): 583–601.

Mora, Ricardo, and Iliana Reggio. 2013. “Treatment Effect Identification Using Alternative Parallel
  Assumptions.” Working Paper.

Outreville, J. Francois. 2014. “Risk Aversion, Risk Behavior and Demand for Insurance: A Survey.”
  Journal of Insurance Issues, 37(2): 158–186.

Rambachan, Ashesh, and Jonathan Roth. 2020. “An Honest Approach to Parallel Trends.” Working
  Paper.

Roodman, David, Morten Ørregaard Nielsen, James G. Mackinnon, and Matthew D. Webb.
  2019. “Fast and wild: Bootstrap inference in Stata using boottest.” The Stata Journal: Promoting com-
  munications on statistics and Stata, 19(1): 4–60.

Saez, Emmanuel. 2010. “Do Taxpayers Bunch at Kink Points?” American Economic Journal: Economic
  Policy, 2(3): 180–212.


Saving, Jason, and Alan Viard. 2021. “Notches in the Tax System: The Good, the Bad, and the Ugly.”
  Tax Notes Federal, 171(8): 1253–1263.

Slemrod, Joel. 2001. “A General Model of the Behavioral Response to Taxation.” International Tax and
  Public Finance, 8: 119–128.

Slemrod, Joel. 2010. “Buenas notches: lines and notches in tax system design.” Working Paper.

Veras Soares, Fabio. 2011. “Brazil’s Bolsa Família: A Review.” Economic and Political Weekly, 46(21): 55–60.

Wolfers, Justin. 2006. “Did Unilateral Divorce Laws Raise Divorce Rates? A Reconciliation and New
  Results.” American Economic Review, 96(5): 1802–1820.

Yelowitz, A. S. 1995. “The Medicaid Notch, Labor Supply, and Welfare Participation: Evidence from
  Eligibility Expansions.” The Quarterly Journal of Economics, 110(4): 909–939.

Young, Alwyn. 2018. “Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seem-
  ingly Significant Experimental Results.” The Quarterly Journal of Economics, 134(2): 557–598.



                      A      For Online Publication: Proofs Appendix

A.1     Optimal Reported Incomes in Baseline Model

   Type \(\mu = 2\) households with \(y < \tau\) clearly prefer to report \(y \le \hat{y} \le \tau\) (and are indifferent between
reporting income levels in this range as only under-reporting is costly). We assume they break this indifference
by reporting at \(y\). Anyone with \(y > \tau\) prefers misreporting to \(\tau\) over misreporting to an income less than \(\tau\)
because \(v' > 0\). By definition, those with \(y = y^c(p)\) are indifferent between misreporting at the threshold \(\tau\)
and truthfully reporting (we break their indifference by saying they will misreport at \(\tau\)), where \(y^c(p)\) solves
\(y^c(p) + b - v(y^c(p) - \tau) = y^c(p)\).53 Those with \(\tau < y \le y^c(p)\) prefer misreporting to \(\tau\) over truthfully
reporting and those with \(y > y^c(p)\) prefer the reverse by the fact that \(v' > 0\).


A.2     Proof of Lemma 1

   Let us begin by calculating the WTP for the mechanical households, which are the households who report
an income \(\le \tau\) under both \(p\) and \(p'\). Because households with \(y > \tau\) will never report \(\le \tau\) under policy \(p'\)
as \(v' > 0\), mechanical households are those with \(y \le \tau\); they always report truthfully and receive the benefit.
Hence, their utility is equal to \(y + b\) under policy \(p\) and equal to \(y + b'\) under policy \(p'\). Clearly, their WTP for
the reform (or compensating variation) is equal to \(b' - b = \Delta b\). Because only mechanical households report
under \(\tau\) given policy \(p'\), the number of such households is given by \(M = G(\tau; p')\).
   Next, let us discuss the WTP for bunching households. Bunching households are those with type \(\mu = 2\)
and \(y \in (\tau, y^c(p)]\). These individuals have utility \(y + b - v(y - \tau)\) under policy \(p\). Bunching households with
\(y \in (\tau, \tau']\) will simply report truthfully after the program, getting utility \(y + b'\). Those with \(y \in (\tau', y^c(p)]\)
  53. The existence of a unique \(y^c(p)\) follows from \(v' > 0\).



will bunch at the new threshold, getting utility \(y + b' - v(y - \tau')\).54 Hence, we know that the WTP (or
compensating variation) satisfies:

\[ y + b - v(y - \tau) = y + b' - WTP - v\big(y - \hat{y}^*(y, 0; p')\big) \]

where \(\hat{y}^*(y, 0; p')\) equals \(y\) if \(y \in (\tau, \tau']\) and equals \(\tau'\) if \(y \in (\tau', y^c(p)]\). Equivalently:

\[ WTP = b' - b + v(y - \tau) - v\big(y - \hat{y}^*(y, 0; p')\big) \]

Given that \(v(y - \tau) - v(y - \hat{y}^*(y, 0; p')) > 0\) (as \(v' > 0\) and \(\hat{y}^*(y, 0; p') > \tau\) for bunching households), we know
\(WTP \ge \Delta b\). Moreover, because \(v(y - \hat{y}^*(y, 0; p')) \ge 0\) and \(y + b - v(y - \tau) \ge y\) for bunching households (or,
equivalently, \(v(y - \tau) \le b\)), we know that \(WTP \le b' - b + b = b'\) for bunching households. Only bunching
and mechanical households report incomes \(\le \tau\) under policy \(p\); under policy \(p'\) bunching households strictly
increase their reported income. Hence, the number of bunching individuals is given by \(B = G(\tau; p) - G(\tau; p')\).
      Next, let us discuss threshold households with income \(y \in (\tau, \tau']\) and \(\mu = 1\) (note, if \(y^c(p) \le \tau'\), then type
\(\mu = 2\) households with \(y \in (y^c(p), \tau']\) are also threshold households). These households all receive utility \(y\)
under policy \(p\) and receive utility \(y + b'\) under policy \(p'\). Hence, their WTP is given by \(b'\). Definitionally,
threshold households are those reporting incomes in \((\tau, \tau']\) under policy \(p\) so that the total number of such
households is given by \(T = G(\tau'; p) - G(\tau; p)\).
      Finally, there are jumping households with type \(\mu = 2\) and income \(y \in (y^c(p), y^c(p')]\) (or \(y \in (\tau', y^c(p')]\)
if \(y^c(p) \le \tau'\)). These households receive utility \(y\) under policy \(p\) and receive utility \(y + b' - v(y - \hat{y}^*(y, 0; p'))\)
where \(\hat{y}^*(y, 0; p') \le \tau'\). For these individuals we have:

\[ y = y + b' - WTP - v\big(y - \hat{y}^*(y, 0; p')\big) \]

Because \(y + b' - v(y - \hat{y}^*(y, 0; p')) \ge y\) by revealed preference, \(WTP \ge 0\). And because \(v(y - \hat{y}^*(y, 0; p')) \ge 0\),
\(WTP \le b'\) for jumping individuals. Jumping households are those who report incomes above \(\tau'\) under policy
\(p\) and (weakly) below \(\tau'\) under policy \(p'\). Because all households who report incomes under \(\tau'\) under policy
\(p\) also report incomes under \(\tau'\) under policy \(p'\), the number of jumping households is given by the increase
in households reporting at or below \(\tau'\) as a result of the reform: \(J = G(\tau'; p') - G(\tau'; p)\).
      Putting this all together we get:

\[ \Delta b\,(M + B) + b'\,T \;\le\; \text{Total WTP} \;\le\; \Delta b\,M + b'\,(B + T + J) \]
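As a rough illustration of the final inequality, the sketch below evaluates both sides for given group sizes, writing `b_old` for the pre-reform benefit and `b_new` for the post-reform benefit. The benefit levels R$70 and R$77 are the ones quoted in footnote 47; the household counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def total_wtp_bounds(M, B, T, J, b_old, b_new):
    """Lower and upper bounds on total WTP from Lemma 1.

    M: mechanical, B: bunching, T: threshold, J: jumping households.
    """
    delta_b = b_new - b_old
    # Lower bound: bunchers valued at delta_b, jumpers valued at 0.
    lower = delta_b * (M + B) + b_new * T
    # Upper bound: bunchers and jumpers both valued at the full new benefit.
    upper = delta_b * M + b_new * (B + T + J)
    return lower, upper


# Hypothetical household counts (not estimates from the paper).
lower, upper = total_wtp_bounds(M=1000, B=200, T=150, J=50, b_old=70, b_new=77)
print(lower, upper)
```

The gap between the two bounds comes entirely from the bunching and jumping households, whose WTP is only pinned down to an interval.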


A.3     Proof of Proposition 3
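Proof. We start with proving the lower bound for \(\frac{\frac{1}{\lambda}[W(p') - W(p)]}{b'G(\tau'; p') - bG(\tau; p)}\). First, welfare under policy \(p\) is given
by:

\[ W(p) = \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau),\; x^*(\theta, p); \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p) \]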

  54. Note, if \(y^c(p) \le \tau'\), i.e., the change in \(\tau\) is large, all bunching households will report truthfully under \(p'\).




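Next, note that by revealed preference, we have the following for any \(x \in X\):

\[ u\big(y(x^*(\theta, p), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau),\; x^*(\theta, p); \theta\big) \;\ge\; u\big(y(x, \theta) + b\,\mathbf{1}(\hat{y}(x, \theta) \le \tau),\; x; \theta\big) \]

Put simply, optimal decisions conditional on any given \(\theta\) under \(p\), \(x^*(\theta, p)\), yield weakly higher utility than
any other set of decisions \(x\) that one could make. This yields the following bound on welfare under policy
\(p' = \{b', \tau'\}\), which ensues by evaluating utility under policy \(p'\), but holding household decisions constant at
their values under policy \(p\) (i.e., by revealed preference):

\[ \begin{aligned}
W(p') &= \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p'), \theta) + b'\,\mathbf{1}(\hat{y}(x^*(\theta, p'), \theta) \le \tau'),\; x^*(\theta, p'); \theta\big)\, dF(\theta) - \lambda b'\, G(\tau'; p') \\
&\ge \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b'\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau'),\; x^*(\theta, p); \theta\big)\, dF(\theta) - \lambda b'\, G(\tau'; p')
\end{aligned} \]

So as to slightly reduce some cumbersome notation, let us define:

\[ y^*(\theta, p) \equiv y(x^*(\theta, p), \theta) \]

\[ \hat{y}^*(\theta, p) \equiv \hat{y}(x^*(\theta, p), \theta) \]

Thus, for the reform from \(p = \{b, \tau\}\) to \(p' = \{b', \tau'\}\) with \(p' - p = \{\Delta b, \Delta\tau\}\) and \(\Delta\tau > 0\):

\[ \begin{aligned}
W(p') - W(p) \ge{} & \int_{\theta:\, \hat{y}^*(\theta, p) \le \tau} \phi(\theta) \big[ u(y^*(\theta, p) + b', x^*(\theta, p); \theta) - u(y^*(\theta, p) + b, x^*(\theta, p); \theta) \big]\, dF(\theta) \\
& + \int_{\theta:\, \hat{y}^*(\theta, p) \in (\tau, \tau']} \phi(\theta) \big[ u(y^*(\theta, p) + b', x^*(\theta, p); \theta) - u(y^*(\theta, p), x^*(\theta, p); \theta) \big]\, dF(\theta) \\
& - \lambda \big[ b' G(\tau'; p') - b G(\tau; p) \big]
\end{aligned} \tag{17} \]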

Note, the change in utility for those with \(\hat{y}^*(\theta, p) > \tau'\) is zero as we move from \(p\) to \(p'\), holding decisions
fixed, as they do not receive a transfer. Next, define \(\eta_{\{\hat{y}^*(\theta,p) \le \tau\}}\) as the government’s average welfare weight
on the households who optimally report incomes \(\hat{y}^* \le \tau\) under policy \(p\). \(\eta_{\{\hat{y}^*(\theta,p) \le \tau\}}\) captures the average
welfare gain from giving these households an extra $1:

\[ \eta_{\{\hat{y}^*(\theta,p) \le \tau\}} = \frac{\int_{\theta:\, \hat{y}^*(\theta, p) \le \tau} \phi(\theta)\, \frac{1}{b' - b} \big[ u(y^*(\theta, p) + b', x^*(\theta, p); \theta) - u(y^*(\theta, p) + b, x^*(\theta, p); \theta) \big]\, dF(\theta)}{G(\tau; p)} \]

And define \(\eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}}\) as the government’s average welfare weight of giving a dollar to the households
who optimally report incomes \(\hat{y}^* \in (\tau, \tau']\) under policy \(p\). \(\eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}}\) captures the average welfare gain




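from giving these households an extra $1:55

\[ \eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}} = \frac{\int_{\theta:\, \hat{y}^*(\theta, p) \in (\tau, \tau']} \phi(\theta)\, \frac{1}{b'} \big[ u(y^*(\theta, p) + b', x^*(\theta, p); \theta) - u(y^*(\theta, p), x^*(\theta, p); \theta) \big]\, dF(\theta)}{G(\tau'; p) - G(\tau; p)} \]

   Note, by the mean value theorem, \(\eta_{\{\hat{y}^*(\theta,p) \le \tau\}}\) and \(\eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}}\) are equal to some average social marginal
utilities of consumption for their respective groups of households. Next, let us define the aggregate welfare
weight, \(\eta_L\), which equals the weighted average welfare weight of giving a dollar to all households, where the
weights are determined by the lower bound of WTP for the reform:

\[ \eta_L = \frac{\eta_{\{\hat{y}^*(\theta,p) \le \tau\}}\, \Delta b\, G(\tau; p) + \eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}}\, b' \big[ G(\tau'; p) - G(\tau; p) \big]}{\Delta b\, G(\tau; p) + b' \big[ G(\tau'; p) - G(\tau; p) \big]} \]

Then, dividing Equation (17) through by the budgetary effect multiplied by \(\lambda\), we have (recall we assume the
budgetary effect is > 0):

\[ \begin{aligned}
\frac{\frac{1}{\lambda} [W(p') - W(p)]}{b' G(\tau'; p') - b G(\tau; p)} &\ge \frac{\eta_{\{\hat{y}^*(\theta,p) \le \tau\}}\, \Delta b\, G(\tau; p) + \eta_{\{\hat{y}^*(\theta,p) \in (\tau,\tau']\}}\, b' \big[ G(\tau'; p) - G(\tau; p) \big]}{\lambda \big[ b' G(\tau'; p') - b G(\tau; p) \big]} - 1 \\
&= \frac{\eta_L}{\lambda} \cdot \frac{\Delta b\, G(\tau; p) + b' \big[ G(\tau'; p) - G(\tau; p) \big]}{b' G(\tau'; p') - b G(\tau; p)} - 1 \\
&= \omega_L\, MVPF_L - 1
\end{aligned} \]

where \(\omega_L = \eta_L / \lambda\) and \(MVPF_L\) is given by:

\[ MVPF_L = \frac{\Delta b\, G(\tau; p) + b' \big[ G(\tau'; p) - G(\tau; p) \big]}{b' G(\tau'; p') - b G(\tau; p)} = 1 - \frac{b' \big[ G(\tau'; p') - G(\tau'; p) \big]}{b' G(\tau'; p') - b G(\tau; p)} \]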
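The second equality in the expression for the lower-bound MVPF is not spelled out in the text. A short derivation, writing \(b'\), \(\tau'\) and \(p'\) for the post-reform benefit, threshold and policy and using \(\Delta b = b' - b\), might run as follows (a sketch of the algebra only, not taken from the paper):

```latex
\begin{align*}
\Delta b\, G(\tau; p) + b'\big[G(\tau'; p) - G(\tau; p)\big]
  &= b' G(\tau'; p) - b\, G(\tau; p) \\
  &= \big[b' G(\tau'; p') - b\, G(\tau; p)\big] - b'\big[G(\tau'; p') - G(\tau'; p)\big].
\end{align*}
```

Dividing both sides by the budgetary effect \(b' G(\tau'; p') - b G(\tau; p)\) gives one minus the new benefit times the number of jumping households, \(G(\tau'; p') - G(\tau'; p)\), relative to the budgetary effect.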

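   Next, we prove the upper bound for \(\frac{\frac{1}{\lambda}[W(p') - W(p)]}{b'G(\tau'; p') - bG(\tau; p)}\). We use identical revealed preference logic to
bound welfare under policy \(p = \{b, \tau\}\) by evaluating utility under policy \(p\), but holding household decisions
constant at their values under policy \(p'\):

\[ \begin{aligned}
W(p) &= \int_{\Theta} \phi(\theta)\, u\big(y^*(\theta, p) + b\,\mathbf{1}(\hat{y}^*(\theta, p) \le \tau),\; x^*(\theta, p); \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p) \\
&\ge \int_{\Theta} \phi(\theta)\, u\big(y^*(\theta, p') + b\,\mathbf{1}(\hat{y}^*(\theta, p') \le \tau),\; x^*(\theta, p'); \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p)
\end{aligned} \]

Hence, for the reform from \(p = \{b, \tau\}\) to \(p' = \{b', \tau'\}\) with \(p' - p = \{\Delta b, \Delta\tau\}\) and \(\Delta\tau > 0\):

\[ \begin{aligned}
W(p') - W(p) \le{} & \int_{\theta:\, \hat{y}^*(\theta, p') \le \tau} \phi(\theta) \big[ u(y^*(\theta, p') + b', x^*(\theta, p'); \theta) - u(y^*(\theta, p') + b, x^*(\theta, p'); \theta) \big]\, dF(\theta) \\
& + \int_{\theta:\, \hat{y}^*(\theta, p') \in (\tau, \tau']} \phi(\theta) \big[ u(y^*(\theta, p') + b', x^*(\theta, p'); \theta) - u(y^*(\theta, p'), x^*(\theta, p'); \theta) \big]\, dF(\theta) \\
& - \lambda \big[ b' G(\tau'; p') - b G(\tau; p) \big]
\end{aligned} \tag{18} \]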

  55. We have used \(\int_{\theta:\, \hat{y}^*(\theta, p) \in (\tau, \tau']} dF(\theta) = G(\tau'; p) - G(\tau; p)\).




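Next, define \(\eta_{\{\hat{y}^*(\theta,p') \le \tau\}}\) as the government’s average welfare weight on the households who optimally report
incomes \(\hat{y}^* \le \tau\) under policy \(p'\). \(\eta_{\{\hat{y}^*(\theta,p') \le \tau\}}\) captures the average welfare gain from giving these households
an extra $1:

\[ \eta_{\{\hat{y}^*(\theta,p') \le \tau\}} = \frac{\int_{\theta:\, \hat{y}^*(\theta, p') \le \tau} \phi(\theta)\, \frac{1}{b' - b} \big[ u(y^*(\theta, p') + b', x^*(\theta, p'); \theta) - u(y^*(\theta, p') + b, x^*(\theta, p'); \theta) \big]\, dF(\theta)}{G(\tau; p')} \]

And define \(\eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}}\) as the government’s average welfare weight of giving a dollar to the households
who optimally report incomes \(\hat{y}^* \in (\tau, \tau']\) under policy \(p'\). \(\eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}}\) captures the average welfare gain
from giving these households an extra $1:56

\[ \eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}} = \frac{\int_{\theta:\, \hat{y}^*(\theta, p') \in (\tau, \tau']} \phi(\theta)\, \frac{1}{b'} \big[ u(y^*(\theta, p') + b', x^*(\theta, p'); \theta) - u(y^*(\theta, p'), x^*(\theta, p'); \theta) \big]\, dF(\theta)}{G(\tau'; p') - G(\tau; p')} \]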

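   Again, by the mean value theorem, \(\eta_{\{\hat{y}^*(\theta,p') \le \tau\}}\) and \(\eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}}\) are equal to average social marginal
utilities of consumption for their respective groups of households. Next, let us define the aggregate welfare
weight, \(\eta_U\), which equals the weighted average welfare weight of giving a dollar to all households, where the
weights are determined by the upper bound of WTP for the reform:

\[ \eta_U = \frac{\eta_{\{\hat{y}^*(\theta,p') \le \tau\}}\, \Delta b\, G(\tau; p') + \eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}}\, b' \big[ G(\tau'; p') - G(\tau; p') \big]}{\Delta b\, G(\tau; p') + b' \big[ G(\tau'; p') - G(\tau; p') \big]} \]

Then, dividing Equation (18) through by the budgetary effect multiplied by \(\lambda\), we have (recall we assume the
budgetary effect is > 0):

\[ \begin{aligned}
\frac{\frac{1}{\lambda} [W(p') - W(p)]}{b' G(\tau'; p') - b G(\tau; p)} &\le \frac{\eta_{\{\hat{y}^*(\theta,p') \le \tau\}}\, \Delta b\, G(\tau; p') + \eta_{\{\hat{y}^*(\theta,p') \in (\tau,\tau']\}}\, b' \big[ G(\tau'; p') - G(\tau; p') \big]}{\lambda \big[ b' G(\tau'; p') - b G(\tau; p) \big]} - 1 \\
&= \frac{\eta_U}{\lambda} \cdot \frac{\Delta b\, G(\tau; p') + b' \big[ G(\tau'; p') - G(\tau; p') \big]}{b' G(\tau'; p') - b G(\tau; p)} - 1 \\
&= \omega_U\, MVPF_U - 1
\end{aligned} \]

where \(\omega_U = \eta_U / \lambda\) and \(MVPF_U\) is given by:

\[ MVPF_U = \frac{\Delta b\, G(\tau; p') + b' \big[ G(\tau'; p') - G(\tau; p') \big]}{b' G(\tau'; p') - b G(\tau; p)} = 1 + \frac{b \big[ G(\tau; p) - G(\tau; p') \big]}{b' G(\tau'; p') - b G(\tau; p)} \]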
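To make the two MVPF bounds concrete, the sketch below computes them from the four reported-income counts that enter the formulas: households reporting at or below the old and new thresholds, under the old and new policies. The argument names are ours, and the counts are hypothetical (only the benefit levels R$70 and R$77 come from footnote 47), so the resulting numbers are not the paper's estimates.

```python
def mvpf_bounds(b_old, b_new, G_tau_p, G_tau_pnew, G_taunew_p, G_taunew_pnew):
    """Lower and upper MVPF bounds from counts of households at or below each threshold.

    G_tau_p:       reporting <= old threshold under the old policy
    G_tau_pnew:    reporting <= old threshold under the new policy
    G_taunew_p:    reporting <= new threshold under the old policy
    G_taunew_pnew: reporting <= new threshold under the new policy
    """
    delta_b = b_new - b_old
    # Budgetary effect of the reform, assumed strictly positive as in the proof.
    cost = b_new * G_taunew_pnew - b_old * G_tau_p
    if cost <= 0:
        raise ValueError("the bounds assume a positive budgetary effect")
    lower = (delta_b * G_tau_p + b_new * (G_taunew_p - G_tau_p)) / cost
    upper = (delta_b * G_tau_pnew + b_new * (G_taunew_pnew - G_tau_pnew)) / cost
    return lower, upper


# Hypothetical counts: 200 bunchers leave the old notch, 50 households jump down.
lower, upper = mvpf_bounds(
    b_old=70, b_new=77,
    G_tau_p=1200, G_tau_pnew=1000,
    G_taunew_p=1350, G_taunew_pnew=1400,
)
print(round(lower, 3), round(upper, 3))
```

In this example the lower bound falls below one by the new benefit times the 50 jumping households, relative to the budgetary effect, and the upper bound exceeds one by the old benefit times the 200 bunching households who move, matching the second form of each expression.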




A.4     Proposition 3 Cannot Be Improved

Proposition 4. Without further assumptions on primitives, the bounds in Proposition 3 cannot be improved.

Proof. To show that one cannot construct tighter bounds than Proposition 3 without additional structure
on the agent’s problem, we provide examples for which these bounds are attained. In particular, we create
  56. We have used \(\int_{\theta:\, \hat{y}^*(\theta, p') \in (\tau, \tau']} dF(\theta) = G(\tau'; p') - G(\tau; p')\).




examples for which (1) bunching households have a WTP of \(\Delta b\) and jumping households have a WTP of 0
as well as (2) bunching and jumping households both have a WTP of \(b'\).57


  Example 1: Consider our baseline misreporting model in Section 2.1 with \(v(0) = 0\), \(v' > 0\) and \(v'' > 0\).
Suppose that \(\Delta b, \Delta\tau > 0\). Consider a distribution for the misreporting types (\(\mu = 2\)) with a mass point at
\(y = \tau\), no density on \((\tau, y^c)\), and another mass point at \(y = y^c\), where \(y^c\) solves \(b' = v(y^c - \tau')\). All bunching
households therefore have \(y = \tau\). Thus, the utility change for bunching households is equal to:

\[ \tau + b' - [\tau + b - v(\tau - \tau)] = \Delta b \]

Hence, the WTP for bunching households is \(\Delta b\). Because there are no individuals with incomes in \((\tau, y^c)\), all
jumping households have \(y = y^c\). Hence, all the jumping households report truthfully (and do not get the
benefit) under the original policy \(p\). We break their indifference by assuming that they jump to the threshold
\(\tau'\) under the new policy \(p'\). The change in utility for jumpers is therefore given by:

\[ y^c + b' - v(y^c - \tau') - y^c = b' - b' = 0 \]

Thus, each jumping household’s WTP is equal to 0. Hence, the total WTP for the reform equals
\((M + B)\Delta b + T b' = MVPF_{LB}\).


  Example 2: Consider our baseline misreporting model in Section 2.1 with \(v(0) = 0\), \(v' > 0\) and \(v'' > 0\).
Suppose that \(\Delta b, \Delta\tau > 0\). Let \(y^c\) solve \(b = v(y^c - \tau)\) and \(y^{c\prime}\) solve \(b' = v(y^{c\prime} - \tau')\). Finally, suppose that
\(\tau' - \tau\) is large enough so that \(\tau' > y^c\). Consider a distribution for the misreporting types (\(\mu = 2\)) with no
density on \([\tau, y^c)\), a mass point at \(y = y^c\), and no density on \((y^c, y^{c\prime}]\). For the \(\mu = 2\) individuals with \(y = y^c\),
they are indifferent between bunching at \(\tau\) and reporting truthfully under policy \(p\). In the former case, they
are bunching individuals (and there are no jumping individuals) and in the latter case they are jumping
individuals (and there are no bunching individuals). Regardless, the utility gain for these households is equal
to:

\[ y^c + b' - [y^c + b - v(y^c - \tau)] = b' \]

Thus the total WTP for the reform will equal \(M \Delta b + (B + T + J) b' = MVPF_{UB}\).
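Example 2 can be checked numerically under an assumed cost function. The sketch below uses the hypothetical choice v(d) = d + d², which satisfies v(0) = 0 and is increasing and convex for d ≥ 0; the benefit and threshold values are likewise made up for illustration, with `b_new` and `tau_new` standing for the post-reform benefit and threshold.

```python
import math


def v(d):
    """Hypothetical misreporting cost: v(0) = 0, increasing and convex for d >= 0."""
    return d + d ** 2


# Hypothetical pre-reform and post-reform policies (not the paper's values).
b, tau = 6.0, 10.0
b_new, tau_new = 7.0, 13.0

# Indifference income under the old policy: solve b = v(y_c - tau),
# i.e. d + d^2 = b with d = y_c - tau, taking the positive root.
d = (-1.0 + math.sqrt(1.0 + 4.0 * b)) / 2.0
y_c = tau + d

# Example 2 requires the new threshold to lie above y_c, so this household
# reports truthfully and collects the new benefit after the reform.
assert tau_new > y_c

utility_old = y_c + b - v(y_c - tau)   # equals y_c by indifference
utility_new = y_c + b_new
gain = utility_new - utility_old
print(y_c, gain)
```

Because the household was exactly indifferent before the reform, its entire new benefit shows up as a utility gain, which is the upper end of the WTP interval.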




A.5     Incorporating Misperceptions of the Schedule

      We now assume that households do not necessarily understand how the policy impacts their consumption.
Households solve the following problem:

$$
\max_{x \in X} \; u(c, x; \theta) \quad \text{s.t.} \quad c = f\big(y(x, \theta), \hat{y}(x, \theta), p, \theta\big) \tag{19}
$$

  57. Note that our examples require mass points of the income distribution. But one can approximate our example scenarios
arbitrarily well with smooth income distributions; hence, we can get arbitrarily close to the cases when either (1) all bunching
households have a WTP of $\Delta b$ and all jumping households have a WTP of 0 or (2) all bunching and jumping households both
have a WTP of $b'$.



   In words, households make decisions under the assumption that their consumption is some function of
their true income $y$, their reported income $\hat{y}$, the policy $p$, and state variables $\theta$. For instance, this framework
allows for households to misperceive the threshold $\tau$ or the benefit level $b$ (e.g., $f(y(x, \theta), \hat{y}(x, \theta), p, \theta) =
y(x, \theta) + (b + \theta_1)\mathbf{1}(\hat{y}(x, \theta) \le \tau + \theta_2)$, where $\theta_1, \theta_2$ are state variables for the household). Total welfare is still
given by:

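$$
W(p) = \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau),\; x^*(\theta, p);\, \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p) \tag{20}
$$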


   In order for Proposition 3 to hold under the more general model with misperceptions (Problem (19)), we
need to make two additional assumptions. We need to assume that when the policy changes from $p$ to $p'$,
household behavioral re-optimization improves welfare, on average. In other words, misperceptions of the
policy reform cannot be so severe that households make themselves worse off (on average) by responding to
the reform. Mathematically, we require that:

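$$
\begin{aligned}
W(p') &= \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p'), \theta) + b'\,\mathbf{1}(\hat{y}(x^*(\theta, p'), \theta) \le \tau'),\; x^*(\theta, p');\, \theta\big)\, dF(\theta) - \lambda b' G(\tau'; p') \\
&\ge \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b'\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau'),\; x^*(\theta, p);\, \theta\big)\, dF(\theta) - \lambda b' G(\tau'; p')
\end{aligned}
$$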


Note that the previous inequality holds by revealed preference if households correctly perceive the schedule
(i.e., behavioral re-optimization can only improve utility). If agents misperceive the schedule, we simply need
to assume that behavioral responses improve welfare on average. Correspondingly, our second assumption is
that if, hypothetically, the policy were to change from $p'$ to $p$, household behavioral re-optimization would
also improve welfare, on average. Mathematically, this amounts to assuming:

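$$
\begin{aligned}
W(p) &= \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau),\; x^*(\theta, p);\, \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p) \\
&\ge \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p'), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p'), \theta) \le \tau),\; x^*(\theta, p');\, \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p)
\end{aligned}
$$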


If we are willing to make these two assumptions, the rest of the proof to Proposition 3 goes through, so that
we can bound the welfare impact of changing notches if individuals misperceive the schedule. Hence, we can
state:

Proposition 5. Suppose households solve Problem (19), welfare is given by Equation (20) and $\tau' > \tau$. If we
assume:

$$
W(p') \ge \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p), \theta) + b'\,\mathbf{1}(\hat{y}(x^*(\theta, p), \theta) \le \tau'),\; x^*(\theta, p);\, \theta\big)\, dF(\theta) - \lambda b' G(\tau'; p')
$$

and

$$
W(p) \ge \int_{\Theta} \phi(\theta)\, u\big(y(x^*(\theta, p'), \theta) + b\,\mathbf{1}(\hat{y}(x^*(\theta, p'), \theta) \le \tau),\; x^*(\theta, p');\, \theta\big)\, dF(\theta) - \lambda b\, G(\tau; p)
$$

then, as long as $b' G(\tau'; p') - b G(\tau; p) > 0$, we have:

$$
\omega_L\, MVPF_L - 1 \;\le\; \frac{\frac{1}{\lambda}\,[W(p') - W(p)]}{b' G(\tau'; p') - b G(\tau; p)} \;\le\; \omega_U\, MVPF_U - 1
$$

where $MVPF_L$ is given by Equation (7), $MVPF_U$ is given by Equation (8), and $\omega_L$ ($\omega_U$) captures the
weighted average money-metric welfare gain from giving a dollar to mechanical, bunching, threshold, and
jumping households, where the weights are determined by the lower bound (upper bound) of each group's
WTP.


A.6    Discounted Welfare Impact of Reform

    Suppose households have several decision variables at time $t$ denoted by the vector $x_t$ (within a potentially
limited choice set $X_t$). Household decisions are made conditional on state variables denoted by the vector
$\theta_t \in \Theta_t$ and the policy $p$. Households get the benefit $b$ if their reported income $\hat{y}_t$, which can be a function
of decision variables $x_t$ or included as a decision variable, is below $\tau$. Household income, denoted $y_t$, is also
potentially a function of decisions $x_t$.58 Households in period $t$ solve:

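$$
\begin{aligned}
V(\theta_t) &= \max_{x_t \in X_t} \; u(c_t, x_t; \theta_t) + \beta\, \mathbb{E}_{\theta_{t+1} | \theta_t, x_t}[V(\theta_{t+1})] \\
\text{s.t.} \quad c_t &= y_t(x_t, \theta_t) + b\,\mathbf{1}(\hat{y}_t(x_t, \theta_t) \le \tau)
\end{aligned}
\tag{21}
$$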

where $c_t$ denotes consumption in period $t$, $\beta$ is a discount factor, and $\mathbb{E}_{\theta_{t+1} | \theta_t, x_t}[V(\theta_{t+1})]$ represents the
expected value of starting period $t + 1$ with state variables $\theta_{t+1}$, noting that the expectation over $\theta_{t+1}$ may
be impacted by current state variables $\theta_t$ and current decisions $x_t$. Equivalently, we can write out individual
utility from the perspective of time period 0 as:

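$$
\sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_1 | \theta_0, x_0}\big[\dots \mathbb{E}_{\theta_t | \theta_{t-1}, x_{t-1}}[u(c_t, x_t; \theta_t)] \dots\big]
= \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t\}}[u(c_t, x_t; \theta_t)]
$$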

where $\mathbb{E}_{\theta_t | \theta_0, \{x_t\}}$ represents the expectation over $\theta_t$ from the perspective of time period 0 (taking into account
the impact of all conditional decisions $\{x_t\}$ between time 0 and time $t$ on the underlying expectations). The
equality above follows from the law of iterated expectations.
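
The equality of the nested and the time-0 expectations can be checked numerically in a stripped-down setting. The sketch below is only illustrative and rests on assumptions not in the text: a finite state space, a fixed decision rule (so that the state follows a Markov chain with a hypothetical transition matrix P), and a hypothetical per-period utility u for each state.

```python
def mat_vec(P, v):
    """Conditional expectation E[v(next state) | current state]."""
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]


def vec_mat(pi, P):
    """Push a distribution over states forward by one period."""
    return [sum(pi[i] * P[i][j] for i in range(len(pi))) for j in range(len(P[0]))]


def nested_value(P, u, beta, T):
    """Nested form: V_t = u + beta * E[V_{t+1} | state], solved backward."""
    V = list(u)  # value in the final period T
    for _ in range(T):
        cont = mat_vec(P, V)
        V = [u[i] + beta * cont[i] for i in range(len(u))]
    return V  # value at time 0, by initial state


def forward_value(P, u, beta, T, start):
    """Time-0 form: sum_t beta^t E[u_t | initial state], pushed forward."""
    pi = [1.0 if i == start else 0.0 for i in range(len(u))]
    total = 0.0
    for t in range(T + 1):
        total += beta ** t * sum(pi[i] * u[i] for i in range(len(u)))
        pi = vec_mat(pi, P)
    return total


# Hypothetical two-state chain and utilities.
P = [[0.9, 0.1], [0.4, 0.6]]
u = [1.0, 2.5]
nested = nested_value(P, u, beta=0.95, T=5)
forward = [forward_value(P, u, 0.95, 5, s) for s in range(2)]
print(nested, forward)
```

The two lists agree up to floating-point error, which is the law of iterated expectations applied period by period.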
To slightly reduce some cumbersome notation, let us define:

$$
y_t^*(\theta_t, p) \equiv y_t(x_t^*(\theta_t, p), \theta_t)
$$

$$
\hat{y}_t^*(\theta_t, p) \equiv \hat{y}_t(x_t^*(\theta_t, p), \theta_t)
$$

  58. For example, in a dynamic misreporting model, $x_t = \hat{y}_t$ and $\theta_t$ could include current income $y_t$, aversion to misreporting
$\mu$, prior reported income $\hat{y}_{t-1}$, and a parameter governing expected future income growth. Households may also make savings
decisions if assets are a state variable in $\theta_t$, current savings is included in $x_t$, and $c_t$ represents post-transfer income.

Using this notation, we again assume total discounted welfare is given by a weighted discounted sum of
utilities, with welfare weights given by $\phi(\theta_0)$, less the total discounted budgetary cost of the policy multiplied
by a shadow value of public funds $\lambda$:

$$
\sum_{t=0}^{T} \beta^t W_t(p) = \int_{\Theta_0} \phi(\theta_0) \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[u\big(y_t^*(\theta_t, p) + b\,\mathbf{1}(\hat{y}_t^*(\theta_t, p) \le \tau),\; x_t^*(\theta_t, p);\, \theta_t\big)\big]\, dF(\theta_0) - \lambda \sum_{t=0}^{T} \beta^t b\, G_t(\tau; p) \tag{22}
$$


where $\beta$ represents the government's discount factor, $\lambda$ represents the shadow value of public funds in time
$t = 0$ (so that the shadow value of public funds in future periods equals $\beta^t \lambda$), and:

$$
G_t(\tau; p) = \int_{\Theta_0} \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[\mathbf{1}(\hat{y}_t^*(\theta_t, p) \le \tau)\big]\, dF(\theta_0)
$$


represents the expected number of households receiving the benefit under policy p in period t. More generally,
we define:
$$
G_t(z; p) = \int_{\Theta_0} \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[\mathbf{1}(\hat{y}_t^*(\theta_t, p) \le z)\big]\, dF(\theta_0)
$$

as the expected number of households with a reported income below z under policy p in time period t.
This setup allows us to bound the welfare impacts of a policy reform over many time periods:

Proposition 6. Suppose households solve Problem (21), total welfare is given by Equation (22), and $\tau' > \tau$.
Defining:

$$
MVPF_{L,T} \equiv 1 - \frac{\sum_{t=0}^{T} \beta^t b' [G_t(\tau'; p') - G_t(\tau'; p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}
$$

$$
MVPF_{U,T} \equiv 1 + \frac{\sum_{t=0}^{T} \beta^t b\, [G_t(\tau; p) - G_t(\tau; p')]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}
$$

Then, as long as $\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)] > 0$, we have:

$$
\omega_{L,T}\, MVPF_{L,T} - 1 \;\le\; \frac{\frac{1}{\lambda} \sum_{t=0}^{T} \beta^t [W_t(p') - W_t(p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]} \;\le\; \omega_{U,T}\, MVPF_{U,T} - 1
$$

where $\omega_{L,T}$ ($\omega_{U,T}$) captures the discounted weighted average money-metric welfare gain from giving a dollar
to mechanical, bunching, threshold, and jumping households, where the weights are determined by the lower
bound (upper bound) of each group's discounted WTP.
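
The two discounted bounds can be computed directly from per-period recipient shares. The following is a minimal sketch under stated assumptions: the function and argument names are hypothetical, all numbers are invented for illustration, and the four input lists hold $G_t(\tau; p)$, $G_t(\tau'; p)$, $G_t(\tau; p')$ and $G_t(\tau'; p')$ for $t = 0, \dots, T$.

```python
def discounted_mvpf_bounds(beta, b, b_new, G_p_t, G_p_tn, G_pn_t, G_pn_tn):
    """Return (MVPF_{L,T}, MVPF_{U,T}) as defined in Proposition 6.

    G_p_t[t]   = G_t(tau;  p)    G_p_tn[t]  = G_t(tau'; p)
    G_pn_t[t]  = G_t(tau;  p')   G_pn_tn[t] = G_t(tau'; p')
    """
    periods = range(len(G_p_t))
    # Discounted budgetary effect of the reform (the common denominator).
    cost = sum(beta ** t * (b_new * G_pn_tn[t] - b * G_p_t[t]) for t in periods)
    if cost <= 0:
        raise ValueError("the bounds require a positive budgetary effect")
    lower = 1 - sum(beta ** t * b_new * (G_pn_tn[t] - G_p_tn[t]) for t in periods) / cost
    upper = 1 + sum(beta ** t * b * (G_p_t[t] - G_pn_t[t]) for t in periods) / cost
    return lower, upper


# Hypothetical two-period inputs.
beta, b, b_new = 0.95, 100.0, 120.0
G_p_t, G_p_tn = [0.30, 0.30], [0.40, 0.40]
G_pn_t, G_pn_tn = [0.27, 0.26], [0.45, 0.46]
lower, upper = discounted_mvpf_bounds(beta, b, b_new, G_p_t, G_p_tn, G_pn_t, G_pn_tn)
print(lower, upper)
```

With these inputs more households report below $\tau'$ after the reform and fewer remain below $\tau$, so the lower bound falls below one and the upper bound rises above one.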

Proof. We start with proving the lower bound for
$$
\frac{\frac{1}{\lambda} \sum_{t=0}^{T} \beta^t [W_t(p') - W_t(p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}.
$$
First, note that by revealed preference, we have the following:

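$$
\begin{aligned}
&\sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[u\big(y_t^*(\theta_t, p) + b\,\mathbf{1}(\hat{y}_t^*(\theta_t, p) \le \tau),\; x_t^*(\theta_t, p);\, \theta_t\big)\big] \\
&\equiv \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[u\big(y_t(x_t^*(\theta_t, p), \theta_t) + b\,\mathbf{1}(\hat{y}_t(x_t^*(\theta_t, p), \theta_t) \le \tau),\; x_t^*(\theta_t, p);\, \theta_t\big)\big] \\
&\ge \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t\}}\big[u\big(y_t(x_t, \theta_t) + b\,\mathbf{1}(\hat{y}_t(x_t, \theta_t) \le \tau),\; x_t;\, \theta_t\big)\big]
\end{aligned}
$$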


Put simply, the optimal decisions conditional on any given $\theta_t$ under $p$, $x_t^*(\theta_t, p)$, yield weakly higher utility than
any other set of decisions $x_t$ that one could make. This yields the following bound on welfare under policy
$p' = \{b', \tau'\}$, which follows from evaluating utility under policy $p'$ while holding household decisions constant at
their values under policy $p$ (i.e., by revealed preference):
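$$
\begin{aligned}
\sum_{t=0}^{T} \beta^t W_t(p')
&= \int_{\Theta_0} \phi(\theta_0) \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p')\}}\big[u\big(y_t^*(\theta_t, p') + b'\,\mathbf{1}(\hat{y}_t^*(\theta_t, p') \le \tau'),\; x_t^*(\theta_t, p');\, \theta_t\big)\big]\, dF(\theta_0) - \lambda \sum_{t=0}^{T} \beta^t b' G_t(\tau'; p') \\
&\ge \int_{\Theta_0} \phi(\theta_0) \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[u\big(y_t^*(\theta_t, p) + b'\,\mathbf{1}(\hat{y}_t(x_t^*(\theta_t, p), \theta_t) \le \tau'),\; x_t^*(\theta_t, p);\, \theta_t\big)\big]\, dF(\theta_0) - \lambda \sum_{t=0}^{T} \beta^t b' G_t(\tau'; p')
\end{aligned}
$$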



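Hence, for the reform from $p = \{b, \tau\}$ to $p' = \{b', \tau'\}$ with $p' - p = \{\Delta b, \Delta\tau\}$ and $\Delta\tau > 0$:

$$
\begin{aligned}
\sum_{t=0}^{T} \beta^t [W_t(p') - W_t(p)] \ge\;
& \sum_{t=0}^{T} \beta^t \int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p) \le \tau} \phi(\theta_0) \big[u(y_t^*(\theta_t, p) + b', x_t^*(\theta_t, p); \theta_t) - u(y_t^*(\theta_t, p) + b, x_t^*(\theta_t, p); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p)\})\, dF(\theta_0) \\
& + \sum_{t=0}^{T} \beta^t \int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p) \in (\tau, \tau']} \phi(\theta_0) \big[u(y_t^*(\theta_t, p) + b', x_t^*(\theta_t, p); \theta_t) - u(y_t^*(\theta_t, p), x_t^*(\theta_t, p); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p)\})\, dF(\theta_0) \\
& - \lambda \sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]
\end{aligned}
\tag{23}
$$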

Next, define $\eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}}$ as the government's average expected welfare weight on the households who
optimally report incomes $\le \tau$ under policy $p$ at time $t$. $\eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}}$ captures the average expected welfare gain
from giving these households an extra dollar:

$$
\eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}} = \frac{\int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p) \le \tau} \phi(\theta_0)\, \frac{1}{b' - b} \big[u(y_t^*(\theta_t, p) + b', x_t^*(\theta_t, p); \theta_t) - u(y_t^*(\theta_t, p) + b, x_t^*(\theta_t, p); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p)\})\, dF(\theta_0)}{G_t(\tau; p)}
$$

And define $\eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}}$ as the government's average expected welfare weight of giving a dollar to the
households who optimally report incomes $\in (\tau, \tau']$ under policy $p$ at time $t$. $\eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}}$ captures the
average expected welfare gain from giving these households an extra dollar:59

$$
\eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}} = \frac{\int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p) \in (\tau, \tau']} \phi(\theta_0)\, \frac{1}{b'} \big[u(y_t^*(\theta_t, p) + b', x_t^*(\theta_t, p); \theta_t) - u(y_t^*(\theta_t, p), x_t^*(\theta_t, p); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p)\})\, dF(\theta_0)}{G_t(\tau'; p) - G_t(\tau; p)}
$$

Note that, by the mean value theorem, $\eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}}$ and $\eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}}$ are equal to average expected social
marginal utilities of consumption for their respective groups of households. Next, let us define an aggregate
discounted welfare weight, $\eta_{L,T}$, which equals the weighted average discounted expected welfare weight of
giving a dollar to all households, where the weights are determined by the lower (discounted) bound of
expected WTP for the reform:

$$
\eta_{L,T} = \frac{\sum_{t=0}^{T} \beta^t\, \eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}}\, \Delta b\, G_t(\tau; p) + \sum_{t=0}^{T} \beta^t\, \eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}}\, b' [G_t(\tau'; p) - G_t(\tau; p)]}{\sum_{t=0}^{T} \beta^t \big[\Delta b\, G_t(\tau; p) + b' [G_t(\tau'; p) - G_t(\tau; p)]\big]}
$$

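Then, dividing Equation (23) through by the budgetary effect multiplied by $\lambda$, we have (recall we assume the
budgetary effect is $> 0$):

$$
\begin{aligned}
\frac{\sum_{t=0}^{T} \beta^t \frac{1}{\lambda} [W_t(p') - W_t(p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}
&\ge \frac{\sum_{t=0}^{T} \beta^t\, \eta_{\{\hat{y}_t^*(\theta_t, p) \le \tau\}}\, \Delta b\, G_t(\tau; p) + \sum_{t=0}^{T} \beta^t\, \eta_{\{\hat{y}_t^*(\theta_t, p) \in (\tau, \tau']\}}\, b' [G_t(\tau'; p) - G_t(\tau; p)]}{\sum_{t=0}^{T} \beta^t \lambda\, [b' G_t(\tau'; p') - b G_t(\tau; p)]} - 1 \\
&= \frac{\eta_{L,T}}{\lambda} \cdot \frac{\sum_{t=0}^{T} \beta^t \big[\Delta b\, G_t(\tau; p) + b' [G_t(\tau'; p) - G_t(\tau; p)]\big]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]} - 1 \\
&= \omega_{L,T}\, MVPF_{L,T} - 1
\end{aligned}
$$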


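where $\omega_{L,T} = \eta_{L,T} / \lambda$ and $MVPF_{L,T}$ is given by:

$$
MVPF_{L,T} = \frac{\sum_{t=0}^{T} \beta^t \big[\Delta b\, G_t(\tau; p) + b' [G_t(\tau'; p) - G_t(\tau; p)]\big]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}
= 1 - \frac{\sum_{t=0}^{T} \beta^t b' [G_t(\tau'; p') - G_t(\tau'; p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}
$$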


   59. We have used $\int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p) \in (\tau, \tau']} dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p)\})\, dF(\theta_0) = G_t(\tau'; p) - G_t(\tau; p)$.

      Next, we prove the upper bound for
$$
\frac{\frac{1}{\lambda} \sum_{t=0}^{T} \beta^t [W_t(p') - W_t(p)]}{\sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]}.
$$
We use identical revealed preference logic
to bound welfare under policy $p = \{b, \tau\}$ by evaluating utility under policy $p$, but holding household decisions
constant at their values under policy $p'$:
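$$
\begin{aligned}
\sum_{t=0}^{T} \beta^t W_t(p)
&= \int_{\Theta_0} \phi(\theta_0) \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p)\}}\big[u\big(y_t^*(\theta_t, p) + b\,\mathbf{1}(\hat{y}_t^*(\theta_t, p) \le \tau),\; x_t^*(\theta_t, p);\, \theta_t\big)\big]\, dF(\theta_0) - \lambda \sum_{t=0}^{T} \beta^t b\, G_t(\tau; p) \\
&\ge \int_{\Theta_0} \phi(\theta_0) \sum_{t=0}^{T} \beta^t\, \mathbb{E}_{\theta_t | \theta_0, \{x_t^*(\theta_t, p')\}}\big[u\big(y_t^*(\theta_t, p') + b\,\mathbf{1}(\hat{y}_t^*(\theta_t, p') \le \tau),\; x_t^*(\theta_t, p');\, \theta_t\big)\big]\, dF(\theta_0) - \lambda \sum_{t=0}^{T} \beta^t b\, G_t(\tau; p)
\end{aligned}
$$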



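Hence, for the reform from $p = \{b, \tau\}$ to $p' = \{b', \tau'\}$ with $p' - p = \{\Delta b, \Delta\tau\}$ and $\Delta\tau > 0$:

$$
\begin{aligned}
\sum_{t=0}^{T} \beta^t [W_t(p') - W_t(p)] \le\;
& \sum_{t=0}^{T} \beta^t \int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p') \le \tau} \phi(\theta_0) \big[u(y_t^*(\theta_t, p') + b', x_t^*(\theta_t, p'); \theta_t) - u(y_t^*(\theta_t, p') + b, x_t^*(\theta_t, p'); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p')\})\, dF(\theta_0) \\
& + \sum_{t=0}^{T} \beta^t \int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p') \in (\tau, \tau']} \phi(\theta_0) \big[u(y_t^*(\theta_t, p') + b', x_t^*(\theta_t, p'); \theta_t) - u(y_t^*(\theta_t, p'), x_t^*(\theta_t, p'); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p')\})\, dF(\theta_0) \\
& - \lambda \sum_{t=0}^{T} \beta^t [b' G_t(\tau'; p') - b G_t(\tau; p)]
\end{aligned}
\tag{24}
$$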

Next, define $\eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}}$ as the government's average expected welfare weight on the households who
optimally report incomes $\le \tau$ under policy $p'$ at time $t$. $\eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}}$ captures the average expected welfare gain
from giving these households an extra dollar:

$$
\eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}} = \frac{\int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p') \le \tau} \phi(\theta_0)\, \frac{1}{b' - b} \big[u(y_t^*(\theta_t, p') + b', x_t^*(\theta_t, p'); \theta_t) - u(y_t^*(\theta_t, p') + b, x_t^*(\theta_t, p'); \theta_t)\big]\, dF(\theta_t | \theta_0, \{x_t^*(\theta_t, p')\})\, dF(\theta_0)}{G_t(\tau; p')}
$$

And define $\eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}}$ as the government's average expected welfare weight of giving a dollar to the households who optimally report incomes $\in (\tau, \tau']$ under policy $p'$ at time $t$. $\eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}}$ captures the average expected welfare gain from giving these households an extra dollar:60


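$$
\eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}} = \frac{\displaystyle \int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p') \in (\tau, \tau']} \phi(\theta_0)\, \frac{1}{b'} \left[ u\big(y_t^*(\theta_t, p') + b', x_t^*(\theta_t, p'); \theta_t\big) - u\big(y_t^*(\theta_t, p'), x_t^*(\theta_t, p'); \theta_t\big) \right] dF(\theta_t \mid \theta_0, \{x_t^*(\theta_t, p')\})\, dF(\theta_0)}{G_t(\tau'; p') - G_t(\tau; p')}
$$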

     60. We have again used $\int_{\Theta_0} \int_{\theta_t : \hat{y}_t^*(\theta_t, p') \in (\tau, \tau']} dF(\theta_t \mid \theta_0, \{x_t^*(\theta_t, p')\})\, dF(\theta_0) = G_t(\tau'; p') - G_t(\tau; p')$.

      Again, by the mean value theorem, $\eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}}$ and $\eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}}$ are equal to average expected social marginal utilities of consumption for their respective groups of households. Next, let us define an aggregate discounted welfare weight, $\eta_{U,T}$, which equals the weighted average discounted expected welfare weight of giving a dollar to all households, where the weights are determined by the upper (discounted) bound of expected WTP for the reform:

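$$
\eta_{U,T} = \frac{\displaystyle \sum_{t=0}^{T} \beta^t \eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}}\, \Delta b\, G_t(\tau; p') + \sum_{t=0}^{T} \beta^t \eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}}\, b' \left[ G_t(\tau'; p') - G_t(\tau; p') \right]}{\displaystyle \sum_{t=0}^{T} \beta^t \left[ \Delta b\, G_t(\tau; p') + b' \left[ G_t(\tau'; p') - G_t(\tau; p') \right] \right]}
$$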

Then, dividing Equation (24) through by the budgetary effect multiplied by λ, we have (recall we assume the
budgetary effect is > 0):
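$$
\begin{aligned}
\frac{\frac{1}{\lambda} \displaystyle \sum_{t=0}^{T} \beta^t \left[ W_t(p') - W_t(p) \right]}{\displaystyle \sum_{t=0}^{T} \beta^t \left[ b' G_t(\tau'; p') - b\, G_t(\tau; p) \right]}
&\le \frac{\displaystyle \sum_{t=0}^{T} \beta^t \eta_{\{\hat{y}_t^*(\theta_t, p') \le \tau\}}\, \Delta b\, G_t(\tau; p') + \sum_{t=0}^{T} \beta^t \eta_{\{\hat{y}_t^*(\theta_t, p') \in (\tau, \tau']\}}\, b' \left[ G_t(\tau'; p') - G_t(\tau; p') \right]}{\lambda \displaystyle \sum_{t=0}^{T} \beta^t \left[ b' G_t(\tau'; p') - b\, G_t(\tau; p) \right]} - 1 \\
&= \frac{\eta_{U,T}}{\lambda} \cdot \frac{\displaystyle \sum_{t=0}^{T} \beta^t \left[ \Delta b\, G_t(\tau; p') + b' \left[ G_t(\tau'; p') - G_t(\tau; p') \right] \right]}{\displaystyle \sum_{t=0}^{T} \beta^t \left[ b' G_t(\tau'; p') - b\, G_t(\tau; p) \right]} - 1 \\
&= \omega_{U,T}\, MVPF_{U,T} - 1
\end{aligned}
$$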


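where $\omega_{U,T} = \eta_{U,T} / \lambda$ and $MVPF_{U,T}$ is given by:

$$
MVPF_{U,T} = \frac{\displaystyle \sum_{t=0}^{T} \beta^t \left[ \Delta b\, G_t(\tau; p') + b' \left[ G_t(\tau'; p') - G_t(\tau; p') \right] \right]}{\displaystyle \sum_{t=0}^{T} \beta^t \left[ b' G_t(\tau'; p') - b\, G_t(\tau; p) \right]} = 1 + \frac{\displaystyle \sum_{t=0}^{T} \beta^t\, b \left[ G_t(\tau; p) - G_t(\tau; p') \right]}{\displaystyle \sum_{t=0}^{T} \beta^t \left[ b' G_t(\tau'; p') - b\, G_t(\tau; p) \right]}
$$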
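The equality between the two expressions for the upper-bound MVPF can be checked numerically. The sketch below is illustrative only: the function names, the discount factor, and the shares of households below each threshold are hypothetical inputs, not estimates from the paper. It assumes $\Delta b = b' - b$.

```python
import math
from typing import Sequence


def discounted_sum(beta: float, values: Sequence[float]) -> float:
    """Return sum over t of beta**t * values[t], for t = 0, ..., T."""
    return sum(beta**t * v for t, v in enumerate(values))


def discounted_cost(beta, b_old, b_new, g_tau_p, g_taunew_pnew):
    """Discounted budgetary effect: sum_t beta^t [b' G_t(tau'; p') - b G_t(tau; p)]."""
    cost = discounted_sum(
        beta, [b_new * g2 - b_old * g0 for g0, g2 in zip(g_tau_p, g_taunew_pnew)]
    )
    if cost <= 0:
        raise ValueError("the derivation assumes a positive budgetary effect")
    return cost


def mvpf_upper(beta, b_old, b_new, g_tau_p, g_tau_pnew, g_taunew_pnew):
    """First expression: discounted upper bound on WTP over discounted cost."""
    delta_b = b_new - b_old  # assumed definition of Delta b
    wtp = discounted_sum(
        beta,
        [delta_b * g1 + b_new * (g2 - g1) for g1, g2 in zip(g_tau_pnew, g_taunew_pnew)],
    )
    return wtp / discounted_cost(beta, b_old, b_new, g_tau_p, g_taunew_pnew)


def mvpf_upper_alt(beta, b_old, b_new, g_tau_p, g_tau_pnew, g_taunew_pnew):
    """Second expression: 1 + sum_t beta^t b [G_t(tau; p) - G_t(tau; p')] / cost."""
    extra = discounted_sum(
        beta, [b_old * (g0 - g1) for g0, g1 in zip(g_tau_p, g_tau_pnew)]
    )
    return 1.0 + extra / discounted_cost(beta, b_old, b_new, g_tau_p, g_taunew_pnew)


# Hypothetical two-period inputs (not from the paper).
args = dict(
    beta=0.95, b_old=70.0, b_new=77.0,
    g_tau_p=[0.30, 0.30],        # G_t(tau; p)
    g_tau_pnew=[0.28, 0.27],     # G_t(tau; p')
    g_taunew_pnew=[0.33, 0.34],  # G_t(tau'; p')
)
first, second = mvpf_upper(**args), mvpf_upper_alt(**args)
print(math.isclose(first, second))
```

The second form makes the role of the first sufficient statistic visible: the bound exceeds 1 only to the extent that households leave the region at or below the old notch, so that $G_t(\tau; p) > G_t(\tau; p')$.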




B    For Online Publication: Bolsa Família Program Appendix

B.1    Bolsa Família Questionnaire

      Figure 9 shows the entries on the questionnaire used to calculate each individual's total monthly income (household per-capita income is calculated by summing total individual monthly incomes across all members of a household and dividing by the number of members in the household). Questions 8.05-8.08 relate to determining last month's labor income and the labor income over the last 12 months. The computer then calculates an individual's minimum monthly labor income by taking the minimum of the individual's labor income last month and the individual's average monthly labor income over the last 12 months. Question 8.09 relates to determining the average monthly income from five other income sources: charity, pensions, unemployment insurance, alimony, and other. An individual's total monthly income is then equal to their monthly income from these five sources plus their minimum monthly labor income.
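The income calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the registry's actual software: the class and field names are hypothetical, and the average monthly labor income over the last 12 months is taken as a given input because the text does not specify how it is derived from questions 8.07 and 8.08.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class IndividualIncome:
    """Monthly income entries for one household member (hypothetical field names)."""
    labor_last_month: float          # question 8.05
    labor_avg_last_12_months: float  # derived from 8.06-8.08; taken as given here
    charity: float = 0.0             # question 8.09, item 1
    pensions: float = 0.0            # item 2
    unemployment_insurance: float = 0.0  # item 3
    alimony: float = 0.0             # item 4
    other: float = 0.0               # item 5

    def total_monthly(self) -> float:
        # Minimum monthly labor income plus the five non-labor sources.
        min_labor = min(self.labor_last_month, self.labor_avg_last_12_months)
        non_labor = (self.charity + self.pensions + self.unemployment_insurance
                     + self.alimony + self.other)
        return min_labor + non_labor


def household_per_capita_income(members: Sequence[IndividualIncome]) -> float:
    """Sum of members' total monthly incomes divided by the number of members."""
    if not members:
        raise ValueError("a household needs at least one member")
    return sum(m.total_monthly() for m in members) / len(members)


# Hypothetical two-person household.
household = [
    IndividualIncome(labor_last_month=200.0, labor_avg_last_12_months=150.0, pensions=50.0),
    IndividualIncome(labor_last_month=0.0, labor_avg_last_12_months=0.0),
]
print(household_per_capita_income(household))
```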




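[Figure 9: printout of the income section of the questionnaire screen. The entries are reproduced below in Portuguese with English translations in parentheses; bracketed labels are the English annotations overlaid on the figure.]

Employment categories visible at the top of the screen:
   4 - Empregado com carteira de trabalho assinada (employee with a signed work card)
   5 - Trabalhador doméstico sem carteira de trabalho assinada (domestic worker without a signed work card)
   6 - Trabalhador doméstico com carteira de trabalho assinada (domestic worker with a signed work card)
   10 - Estagiário (intern)
   11 - Aprendiz (apprentice)

8.05 - No mês passado (nome) recebeu remuneração de trabalho? (Se sim, registre o valor bruto da remuneração efetivamente recebida em todos os trabalhos)
   (Last month, did (name) receive pay from work? If yes, record the gross pay actually received from all jobs.)
   ____,00 [Last Month Income]     0 - Não recebeu (did not receive)

8.06 - (Nome) teve trabalho remunerado nos últimos 12 meses? (Did (name) have paid work in the last 12 months?)
   1 - Sim (yes)     2 - Não - Passe ao 8.09 (no - skip to 8.09)

8.07 - Quantos meses trabalhou nesse período? (How many months did (name) work in that period?)

8.08 - Qual foi a remuneração bruta de todos os trabalhos recebidos por (nome) nesse período?
   (What was the gross pay from all jobs received by (name) in that period?)
   ____,00 [Last 12 Months Income]

8.09 - Quanto (nome) recebe, normalmente, por mês de: (How much does (name) normally receive per month from:)
   1 - Ajuda/doação regular de não morador (regular help or donation from a non-resident)
       ____,00 [Charity Income]     0 - Não recebe (does not receive)
   2 - Aposentadoria, aposentadoria rural, pensão ou BPC/LOAS (retirement, rural retirement, pension, or BPC/LOAS)
       ____,00 [Pensions]     0 - Não recebe
   3 - Seguro-desemprego (unemployment insurance)
       ____,00 [Unemployment Insurance]     0 - Não recebe
   4 - Pensão alimentícia (alimony)
       ____,00 [Alimony]     0 - Não recebe
   5 - Outras fontes de remuneração exceto bolsa família ou outras transferências similares (other sources of income except Bolsa Família or other similar transfers)
       ____,00 [Other Income]     0 - Não recebe
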
Note: The figure depicts the income categories reported by applicants for each member of the household. Each category has
been translated into English in the figure. This is a print out of the screen seen by interviewers on their computers when filling
in applicants’ information.

                                                     Figure 9: Income Questionnaire


B.2    Other Social Security Programs Based on the Cadastro Único

    This appendix describes other programs that set their eligibility based on information from the Cadastro Único database.
Benefício de Prestação Continuada (BPC): This benefit targets the elderly (above 65 years of age) and disabled. It gives a minimum wage to all households with per-capita income up to a quarter of the minimum wage. Table 4 reports the minimum wage and BPC threshold across all years of the analysis. The Brazilian Social Security System administers its own exam to define eligibility for this program.

                               Table 4: Minimum Wage and BPC Eligibility Thresholds

                                                 Year         Minimum Wage                     BPC Threshold
                                                 2011                         545.00                            136.25
                                                 2012                         622.00                            155.50
                                                 2013                         678.00                            169.50
                                                 2014                         724.00                            181.00
                                                 2015                         788.00                            197.00
                                                 2016                         880.00                            220.00
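
The BPC thresholds in Table 4 follow directly from the rule that eligibility ends at a quarter of the minimum wage. A short sketch of that arithmetic, using the table's values (the dictionary and function names are hypothetical):

```python
# Minimum wage by year, copied from Table 4 (R$ per month).
MINIMUM_WAGE = {
    2011: 545.00,
    2012: 622.00,
    2013: 678.00,
    2014: 724.00,
    2015: 788.00,
    2016: 880.00,
}


def bpc_threshold(year: int) -> float:
    """Per-capita income cutoff for BPC: one quarter of that year's minimum wage."""
    return MINIMUM_WAGE[year] / 4


for year in sorted(MINIMUM_WAGE):
    print(year, f"{bpc_threshold(year):.2f}")
```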


Carteira do Idoso: This “Elderly Card” guarantees to all individuals 60 years of age or older and with
income up to two times the minimum wage at least a 50% discount on any interstate trip by road, rail, or
waterway.
Créditos Instalação do Programa Nacional de Reforma Agrária: Households with per-capita income up to three times the minimum wage and that are living in camping grounds get points in a system that selects beneficiaries to be settled through the Brazilian land reform.
Facultativo de Baixa Renda: This is an option to contribute to social security at a lower rate (5% of the
minimum wage). The individual cannot have any income and household income must be below two times
the minimum wage.
Identidade Jovem (ID Jovem): Discounts for cultural events and trips by road, rail, or waterway for individuals between 15 and 29 years of age living in a household with income up to two times the minimum wage.
Isenção de taxas de inscrição em concursos públicos: Since 2008, households with per-capita income up to half of the minimum wage or total income of up to three times the minimum wage have been exempt from public tender registration fees.
Política Nacional Assistência Técnica Rural - PNATER Brasil Sem Miséria: Technical assistance for households working on activities for their own consumption in rural areas.
Programa Água para Todos - Programa Nacional de Universalização do Acesso e Uso da Água: Since July 2011, the government has installed cisterns to ensure access to clean water for all Brazilians, with priority going to those who satisfy the criteria for the BF program.
Bolsa Estiagem: This is a benefit of at least R$80 per month to households with total income up to two
times the minimum wage that live in areas hit by natural disasters.
Programa Bolsa Verde - Programa de Apoio à Conservação Ambiental: Since October 2011, this program has transferred R$300 every 3 months to households in extreme poverty (first threshold of BF) that follow the requirements for using natural resources.
Programa Cisternas: This program aims to provide cisterns to low-income families registered in the Cadastro Único.
Programa de Erradicação do Trabalho Infantil: This program transfers benefits similar to the BF (R$25 and R$40 per child per month in municipalities with fewer and more than 250,000 inhabitants, respectively) to households whose incomes are above the BF threshold with working children (up to 16 years of age) conditional on these children attending school 85% of the time instead of working.
Programa de Fomento às Atividades Produtivas Rurais: Since 2012, the government has made a one-time transfer of around R$2,400 to families that are eligible for the BF program and work on agricultural activities or belong to native or traditional communities.
Programa Minha Casa Minha Vida: Households with total monthly income up to R$1,416.67 have access to a subsidized credit line to purchase a house.
Programa Nacional de Crédito Fundiário: Households with total monthly income up to R$2,500 have access to a subsidized credit line to purchase land for production.
Serviços Socioassistenciais: MDS offers social services to poor individuals who have suffered any type of violence or neglect.
Sistema de Seleção Unificada - Sisu/Lei de Cotas: Since 2016, all federal universities in the country have reserved some seats for students coming from families with per-capita income up to 1.5 times the minimum wage.
Tarifa Social de Energia Elétrica: Households with monthly per-capita income of up to half the minimum wage have access to a discounted electricity price.
Telefone Popular - Acesso Individual Classe Especial: The government offers a landline with lower prices for individuals registered in the Cadastro Único database.
Distribuição de Conversores de TV Digital: Since September 2015, MDS has offered digital converters to beneficiaries of the program, which helps them transition from open TV to the new system.


B.3    Bolsa Família Program and Data Extraction Timeline

    Figure 10 presents the data extraction timeline relative to when the BF program started and the June 2014 reform. As can be seen, the program started in 2003, the extractions we have span the months from December 2011 to September 2016, and the reform we study occurred in June 2014. Note that there was also another reform in June 2016.

[Figure 10 timeline, in chronological order:]

   2003       Start of the Program
   12/2011    First Extraction
   12/2012    Second Extraction
   12/2013    Third Extraction
   6/2014     Reform
   12/2014    Fourth Extraction
   4/2015     Fifth Extraction
   8/2015     Sixth Extraction
   12/2015    Seventh Extraction
   6/2016     Reform
   9/2016     Eighth Extraction

Note: The figure describes the timeline of the program and the data extractions. BF started in 2003, and the reform we study occurred in June 2014. The final dataset is constructed from 8 extractions from December 2011 until September 2016. Each extraction contains the most recent information on each household as of the extraction date. Note that there was another reform in June 2016.

                                                   Figure 10: Timeline



B.4    Bolsa Família Schedule for Households with Children

    At the beginning of our dataset (December 2011), the BF program had two eligibility thresholds in the per-capita monthly income distribution for households with children: the extreme-poverty line (R$70) and the poverty line (R$140). Households with per-capita income below the extreme-poverty line were eligible for the constant basic benefit (R$70 per-month), a variable benefit proportional to the number of family members between 0 and 15 years old (R$32 per-child, per-month), and a teenager benefit proportional to the number of members aged 16 or 17 years old (R$38 per-teenager, per-month). Households with per-capita income between the extreme-poverty and poverty thresholds were only eligible for the variable and teenager benefits. Households with per-capita income above the second threshold were not eligible for any BF cash transfers. Moreover, the total variable benefit was capped at R$160 (5 children per household) and the total teenager benefit was capped at R$76 (two teenagers per household).61
   61. Households with children need to fulfill three additional conditions to receive the variable benefit and/or teenager benefit: (1) children must maintain a minimum of 85% school attendance between ages 6 and 15 and 75% school attendance between 16 and 17; (2) households must keep track of their children's vaccines; and (3) parents must maintain at least 85% attendance in a social-education program if the household has violated child labor laws in the past. All conditionalities were held constant during the analysis period.

    The reform this paper studies, which occurred in June 2014, increased the extreme poverty threshold from R$70 to R$77 and the poverty threshold from R$140 to R$154. The basic benefit was raised from R$70 to R$77, the benefit per child from R$32 to R$35, and the benefit per teenager from R$38 to R$42. Note that the thresholds are based on per-capita income but the benefits are denominated in raw amounts. This reform was announced on national television by the president in April 2014.
    Table 5 summarizes these aspects of the schedule before (first column) and after (second column) the
reform for households with children.
Table 5: Summary of Bolsa Família Schedule for Families with Children Before and After June 2014 Reform

                                                                                   Before    After
           Extreme-Poverty Threshold                                                  70       77
           Poverty Threshold                                                         140      154
           Basic Benefit (for those in extreme-poverty)                               70       77
           Variable Benefit Per Child 15 or Younger (max 5) (for those in poverty)    32       35
           Teenager Benefit Per Teen 16-17 (max 2) (for those in poverty)             38       42

Note: The first two rows correspond to the extreme-poverty and poverty thresholds, respectively. These are measured in monthly, per-capita income (in R$). I.e., before the reform a household is below the extreme-poverty threshold if their monthly, per-capita income is below R$70. The third, fourth, and fifth rows display the benefits given to households; these are denominated in monthly amounts (in R$). I.e., before the reform a household below the extreme-poverty threshold receives R$70 per-month in the basic benefit.
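
The schedule in Table 5 can be written as a small function. The sketch below is a simplified illustration: the names are hypothetical, thresholds are treated as inclusive (an assumption; the text says "below"), and it ignores both the conditionalities in footnote 61 and the guaranteed-minimum-income top-up introduced by the 2012-2013 reforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """BF parameters for households with children (R$ per month)."""
    extreme_poverty_line: float  # per-capita income threshold
    poverty_line: float          # per-capita income threshold
    basic_benefit: float         # paid per household
    child_benefit: float         # per child aged 0-15
    teen_benefit: float          # per teenager aged 16-17
    max_children: int = 5
    max_teens: int = 2


BEFORE = Schedule(70, 140, 70, 32, 38)  # schedule prior to June 2014
AFTER = Schedule(77, 154, 77, 35, 42)   # schedule after the June 2014 reform


def monthly_benefit(per_capita_income: float, n_children: int, n_teens: int,
                    s: Schedule) -> float:
    """Total household benefit, ignoring the guaranteed-minimum-income top-up."""
    if per_capita_income > s.poverty_line:
        return 0.0  # above the poverty line: no BF cash transfers
    benefit = (s.child_benefit * min(n_children, s.max_children)
               + s.teen_benefit * min(n_teens, s.max_teens))
    if per_capita_income <= s.extreme_poverty_line:
        benefit += s.basic_benefit  # basic benefit only in extreme poverty
    return float(benefit)


# A household with two children reporting R$75 per capita gains the basic
# benefit once the extreme-poverty line moves from R$70 to R$77.
print(monthly_benefit(75, 2, 0, BEFORE), monthly_benefit(75, 2, 0, AFTER))
```

The jump in benefits at each threshold is what makes the schedule a notch: a household just above R$70 before the reform loses the entire basic benefit rather than a marginal amount.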


    Between December 2011 and February 2013 there were three other reforms to the BF program, which successively instituted a guaranteed minimum income of R$70 per-capita (along with an associated negative income tax) for several groups of households. This guaranteed minimum income was instituted in June 2012 for households with children below 6 years of age, in November 2012 for households with children below 15 years of age, and in February 2013 for all remaining households. These reforms thus created a kink (which varies with household composition) in the benefit schedule as a function of reported, per-capita household income for households with children as well as for two-adult households with no children. For example, if a two-adult household without children prior to June 2014 has a reported per-capita income of R$20, they get R$70 from the basic benefit and then get an additional R$30 to bring them up to the guaranteed minimum per-capita income of R$70. Mathematically, prior to June 2014 two-adult households without children face a benefit schedule as a function of reported per-capita income, ŷ, equal to:

$$
B(\hat{y}) = 70 \cdot \mathbf{1}(\hat{y} \le 70) + \max\{0,\; 70 - 2 \times \hat{y}\}
$$

This benefit schedule therefore has a kink at R$35. The kink then changed slightly with the June 2014 reform as the guaranteed minimum income was raised to R$77 per-capita. For example, after June 2014 two-adult households without children face a benefit schedule with a kink at R$38.5:

$$
B(\hat{y}) = 77 \cdot \mathbf{1}(\hat{y} \le 77) + \max\{0,\; 77 - 2 \times \hat{y}\}
$$

Thus, the June 2014 reform potentially impacted the reported income distribution around the kink. Because the kink is located below the first BF notch of R$70, Identification Assumption 1 is less likely to hold for these households as households with reported incomes under R$63 may respond to this changing kink. Finally, a reform in June 2016 further increased the extreme-poverty threshold to R$85 (per-capita, per-month), the poverty threshold to R$170 (per-capita, per-month), the basic benefit to R$85 (per-month), the variable benefit to R$39 (per-child, per-month), and the teenager benefit to R$46 (per-teenager, per-month).
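
The two-adult benefit schedules given earlier in this subsection, with kinks at R$35 and R$38.5, can be evaluated with a short function. This is a sketch of the stated formula only; the function name and the `line` parameter (standing for the common value of the basic benefit and the guaranteed minimum per-capita income, 70 before June 2014 and 77 after) are hypothetical.

```python
def two_adult_benefit(y_hat: float, line: float) -> float:
    """B(y) = line * 1(y <= line) + max(0, line - 2 * y) for a two-adult household.

    y_hat is reported per-capita monthly income; the result is the total
    household benefit in R$ per month.
    """
    basic = line if y_hat <= line else 0.0
    top_up = max(0.0, line - 2 * y_hat)  # brings per-capita income up to `line`
    return basic + top_up


def kink_location(line: float) -> float:
    """The top-up reaches zero where line - 2 * y = 0."""
    return line / 2


# Worked example from the text: reported per-capita income of R$20 before June 2014.
benefit = two_adult_benefit(20, 70)
per_capita_after_transfer = (2 * 20 + benefit) / 2
print(benefit, per_capita_after_transfer, kink_location(70), kink_location(77))
```

Below the kink the top-up falls by R$2 for every extra R$1 of reported per-capita income, so income after transfers is flat at the guaranteed minimum; above the kink only the notch at the extreme-poverty line remains.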
    For purposes of illustration, Figure 11 plots how the benefit schedules for two particular household com-
positions varied over time.

[Figure 11: two panels, "2 Adults + 1 Teen" and "1 Adult + 2 Children Under 15". Horizontal axis: ŷ, with ticks at 70, 77, 85, 140, 154, and 170. Vertical axis: ŷ + B(ŷ), with ticks at 36 (left panel) or 44.67 (right panel), 70, 77, and 85. Each panel plots the schedules as of December 2011, February 2013 (left panel) or November 2012 (right panel), June 2014, and June 2016.]

Note: ŷ denotes the reported, per-capita, monthly household income. B(ŷ) denotes the monthly, per-capita benefits a household receives if they report ŷ. These benefits will also depend on household composition. For example, a household with 2 adults and 1 teenager reporting ŷ = 0 in December 2011 will receive R$70 in the basic benefit and R$38 in the teenager benefit. Thus ŷ + B(ŷ) = (70 + 38)/3 = 36.

   Figure 11: Bolsa Família Schedule Reforms For Two Example Household Compositions

C    For Online Publication: Data Appendix

C.1    Number of Individuals on the Cadastro Único Registry Over Time

Note: This figure shows the raw number of individuals on the Cadastro Único Registry over time. The timing of the reform is indicated by the gray, shaded region.

                Figure 12: Number of Individuals on the Cadastro Único Registry

C.2    Actual and Counterfactual Paths Estimated from Equation (15) with Different Polynomial Degrees

Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual number of people reporting in the given interval minus the causal impact of the reform, $\hat{\beta}_{1,x} + \hat{\beta}_{2,x} t$, estimated using Equation (15) where we set $treat_x = 1$ if $x \in \{70, 77\}$ and $K = 2$. Confidence intervals are constructed from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

             Figure 13: Actual and Counterfactual Paths for Treatment Bins, K = 2




Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual number of people reporting in the given interval minus the causal impact of the reform, $\hat{\beta}_{1,x} + \hat{\beta}_{2,x} t$, estimated using Equation (15) where we set $treat_x = 1$ if $x \in \{70, 77\}$ and $K = 4$. Confidence intervals are constructed from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

             Figure 14: Actual and Counterfactual Paths for Treatment Bins, K = 4



Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual number of people reporting in the given interval minus the causal impact of the reform, $\hat{\beta}_{1,x} + \hat{\beta}_{2,x} t$, estimated using Equation (15) where we set $treat_x = 1$ if $x \in \{70, 77\}$ and $K = 5$. Confidence intervals are constructed from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

             Figure 15: Actual and Counterfactual Paths for Treatment Bins, K = 5


C.3    Histograms of Reported Income Distribution

Figure 16 plots the distribution of reported incomes in June 2014 and June 2016 for single individual households, grouped into seven-increment (R$7-wide) bins. While these histograms show the pre- and post-reform distributions, they should not be used to make inferences about the causal impact of the reform because of significant underlying time trends in the reported income distribution.
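
A minimal sketch of the binning behind these histograms, assuming bins of the form R$(x - 7, x] labeled by their upper edge x, with zero reported income kept as its own bin. The function names and the sample incomes are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable

BIN_WIDTH = 7  # R$ per bin, matching intervals such as (63, 70] and (70, 77]


def bin_upper_edge(income: float, width: int = BIN_WIDTH) -> int:
    """Upper edge x of the bin (x - width, x] containing income; 0 maps to 0."""
    if income < 0:
        raise ValueError("reported income cannot be negative")
    return math.ceil(income / width) * width


def histogram(incomes: Iterable[float], width: int = BIN_WIDTH) -> Counter:
    """Count of households per bin, keyed by the bin's upper edge."""
    return Counter(bin_upper_edge(y, width) for y in incomes)


# Hypothetical reported incomes (R$ per capita, per month).
sample = [0, 5, 7, 64, 70, 71, 77]
print(sorted(histogram(sample).items()))
```

Labeling bins by their upper edge keeps the old and new extreme-poverty thresholds (R$70 and R$77) as the right endpoints of adjacent bins.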




Note: This figure shows the number of single individual households that report incomes in seven increment bins in the period
before the reform (June 2014) and two years after the reform (June 2016). The extreme poverty threshold before June 2014 is
shown with a dashed blue line and the extreme poverty threshold after June 2014 is shown with a dashed red line.

      Figure 16: Histogram of Single Individual Household Reported Incomes Pre- and
                                       Post-Reform


    Figure 17 shows a more granular view of the same reported income distributions for single individual households, grouped into one-increment (R$1-wide) bins. There is substantial bunching at multiples of 50 and a lesser degree of bunching at multiples of 10:




Note: This figure shows the number of single individual households that report incomes in one increment bins in the period
before the reform (June 2014) and two years after the reform (June 2016). The extreme poverty threshold before June 2014 is
shown with a dashed blue line and the extreme poverty threshold after June 2014 is shown with a dashed red line.

Figure 17: Granular Histogram of Single Individual Household Reported Incomes Pre- and
                                     Post-Reform
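
Round-number bunching of the kind visible in Figure 17 can be summarized by the share of reports that fall on multiples of 50 or 10. The sketch below assumes incomes are recorded in whole reais and excludes zero reports; both choices, the function name, and the sample data are illustrative assumptions rather than the paper's procedure.

```python
from typing import Iterable, Tuple


def round_number_shares(incomes: Iterable[int]) -> Tuple[float, float]:
    """Shares of strictly positive reports at multiples of 50, and at
    multiples of 10 that are not also multiples of 50."""
    positive = [y for y in incomes if y > 0]
    if not positive:
        raise ValueError("need at least one strictly positive report")
    at_50 = sum(1 for y in positive if y % 50 == 0)
    at_10_only = sum(1 for y in positive if y % 10 == 0 and y % 50 != 0)
    return at_50 / len(positive), at_10_only / len(positive)


# Hypothetical reported incomes in whole reais.
sample_reports = [50, 100, 60, 70, 63, 0, 0, 77]
share_50, share_10_only = round_number_shares(sample_reports)
print(share_50, share_10_only)
```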


    Finally, to better visualize the reported income distribution at strictly positive per-capita incomes, we repeat Figures 16 and 17 restricting to households with strictly positive per-capita incomes:




Note: This figure shows the number of single individual households that report incomes in seven increment bins in the period
before the reform (June 2014) and two years after the reform (June 2016). The extreme poverty threshold before June 2014 is
shown with a dashed blue line and the extreme poverty threshold after June 2014 is shown with a dashed red line.

      Figure 18: Histogram of Single Individual Household Reported Incomes Pre- and
                        Post-Reform (strictly positive incomes only)




Note: This figure shows the number of single individual households that report incomes in one increment bins in the period
before the reform (June 2014) and two years after the reform (June 2016). The extreme poverty threshold before June 2014 is
shown with a dashed blue line and the extreme poverty threshold after June 2014 is shown with a dashed red line.

Figure 19: Granular Histogram of Single Individual Household Reported Incomes Pre- and
                      Post-Reform (strictly positive incomes only)




C.4    Pre-Reform Differences




Note: This figure shows how the difference between the log number of people reporting incomes in R$(70,77], denoted
log N_(70,77], and the log number of people reporting in R$(x−7,x], denoted log N_(x−7,x], varied prior to the reform in
June 2014. Each plot also includes the cubic trend estimated from Equation (15).

      Figure 20: Pre-Reform Differences Between N_(70,77] and N_(x−7,x] for x ∈ {7, 14, ..., 63}




Note: This figure shows how the difference between the log number of people reporting incomes in R$(63,70], denoted
log N_(63,70], and the log number of people reporting in R$(x−7,x], denoted log N_(x−7,x], varied prior to the reform in
June 2014. Each plot also includes the cubic trend estimated from Equation (15).

      Figure 21: Pre-Reform Differences Between N_(63,70] and N_(x−7,x] for x ∈ {7, 14, ..., 63}
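
The quantity plotted in Figures 20 and 21 is a difference of log counts between a treatment bin and a control bin, period by period. A minimal sketch of that calculation is given below; the function name and the monthly counts are hypothetical and chosen only for illustration. When the two bins grow at the same proportional rate, the log difference is flat over time.

```python
import math

def log_difference_series(treat_counts, control_counts):
    """log N_treat,t - log N_control,t for each period t."""
    return [math.log(a) - math.log(b)
            for a, b in zip(treat_counts, control_counts)]

# Hypothetical monthly counts: both bins grow by 10% per period,
# so the log difference stays constant at log(1/2).
n_treat = [1000, 1100, 1210]
n_control = [2000, 2200, 2420]
diffs = log_difference_series(n_treat, n_control)
```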


C.5    Results with Smaller Bin Sizes

    This appendix presents results using a bin size of 3.5 rather than the bin size of 7 used in the main text.
We estimate Equation (15) for all x ∈ {3.5, 7, 10.5, ..., 77}. Estimated counterfactuals for bins R$(63,66.5],
R$(66.5,70], R$(70,73.5], and R$(73.5,77] are shown in Figure 22. The estimated treatment effects are large for
R$(66.5,70] and R$(73.5,77] but small for R$(63,66.5] and R$(70,73.5], suggesting that bunching is relatively
precise. The estimated effects and corresponding lower and upper bounds for the MVPF are shown in Table
6. The estimated MVPF bounds are relatively similar to our main results in Table 3.
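
As a small illustration of the bin grid, the sketch below builds the upper edges x for a given bin width and flags the treatment bins. The function names are hypothetical; the treated set {66.5, 70, 73.5, 77} is the one stated in the note to Figure 22, and the remaining bins serve as controls.

```python
def bin_edges(width, top):
    """Upper edges x of the bins (x - width, x] from width up to top."""
    n = int(round(top / width))
    return [width * k for k in range(1, n + 1)]

def treatment_flags(edges, treated):
    """Map each upper edge x to treat_x, equal to 1 for treated bins, else 0."""
    return {x: int(x in treated) for x in edges}

edges = bin_edges(3.5, 77)                        # 3.5, 7.0, 10.5, ..., 77.0
flags = treatment_flags(edges, {66.5, 70.0, 73.5, 77.0})
controls = [x for x in edges if flags[x] == 0]    # bins at or below R$63

# The main-text grid with a bin size of 7 follows from the same helper.
main_edges = bin_edges(7, 77)                     # 7, 14, ..., 77
```

Multiples of 3.5 are exactly representable as floating-point numbers, so the set-membership test on the edges is safe here; a different bin width might call for integer or decimal arithmetic instead.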




Note: This figure shows the log number of single individual households reporting incomes in R$(63,66.5], R$(66.5,70], R$(70,73.5],
and R$(73.5,77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal
to the actual number of people reporting in the given interval minus the causal impact of the reform, β̂_1,x + β̂_2,x t, estimated using
Equation (15), where we set treat_x = 1 if x ∈ {66.5, 70, 73.5, 77} and K = 3. Confidence intervals are constructed from clustered
standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

Figure 22: Actual and Counterfactual Paths for Treatment Bins Using Equation (15) with
                                  Smaller Bin Sizes


 Table 6: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation (15)
                                 Using Smaller Bin Sizes

                                      (1)                (2)                (3)                (4)              (5)          (6)
        Polynomial Degree, K     ΔN_(63,66.5],t̄    ΔN_(66.5,70],t̄    ΔN_(70,73.5],t̄    ΔN_(73.5,77],t̄    MVPF_L,t̄    MVPF_U,t̄

        Quadratic, K = 2             5,362            -20,258              563              49,646          0.84         1.06
                                    (1,831)           (4,331)             (517)            (1,884)         (0.02)       (0.02)
        Cubic, K = 3                 4,270            -23,606              514              47,450          0.87         1.08
                                    (1,501)           (8,659)             (187)             (427)          (0.04)       (0.05)
        Quartic, K = 4              -8,170            -10,852             3,421             48,206          0.85         1.08
                                    (3,253)           (6,265)             (194)            (1,177)         (0.03)       (0.03)
        Quintic, K = 5              -5,393            -12,304             3,338             47,911          0.85         1.07
                                    (1,139)           (6,133)             (141)            (1,884)         (0.03)       (0.03)

Note: Columns (1), (2), (3), and (4) show the estimated impacts of the reform on the number of single individual households
reporting incomes in bins R$(63,66.5], R$(66.5,70], R$(70,73.5], and R$(73.5,77] for June 2016. Estimates are calculated using
Equation (15) with various polynomial degrees K ∈ {2, 3, 4, 5}. Columns (5) and (6) show the estimated lower and upper bounds
for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are
computed by the delta method from the clustered standard errors estimated in Equation (15).




C.6    Results Using R$(56,63] as a Treatment Bin

 Table 7: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation (15)
                             Using R$(56,63] as Treatment Bin

                                    (1)              (2)              (3)           (4)         (5)         (6)          (7)
      Polynomial Degree, K     ΔN_(56,63],t̄    ΔN_(63,70],t̄    ΔN_(70,77],t̄      B_t̄        J_t̄     MVPF_L,t̄    MVPF_U,t̄

      Quadratic, K = 2            13,173          -23,591           51,879        10,418      41,461       0.82         1.04
                                  (5,309)         (6,397)          (2,144)       (9,607)     (9,995)      (0.04)       (0.04)
      Cubic, K = 3                 8,579          -25,719           49,340        17,140      32,199       0.85         1.07
                                  (5,381)         (3,580)           (682)        (7,006)     (7,119)      (0.03)       (0.03)
      Quartic, K = 4               9,825          -27,331           50,968        17,507      33,462       0.85         1.07
                                  (8,922)         (6,275)          (1,338)      (11,592)    (11,759)      (0.05)       (0.05)
      Quintic, K = 5              12,489          -26,669           50,683        14,180      36,503       0.83         1.06
                                 (10,645)         (6,397)          (1,175)      (12,709)    (12,841)      (0.05)       (0.06)

Note: Columns (1), (2), and (3) show the estimated impacts of the reform on the number of single individual households
reporting incomes in bins R$(56,63], R$(63,70], and R$(70,77] for June 2016: ΔN_(56,63],t̄, ΔN_(63,70],t̄, and ΔN_(70,77],t̄. Estimates
are calculated using Equation (15) with various polynomial degrees K ∈ {2, 3, 4, 5}. Columns (4) and (5) show the estimated
number of bunching and jumping households for June 2016, B_t̄ and J_t̄, calculated using Equations (12) and (13). Columns
(6) and (7) show the estimated lower and upper bounds for the MVPF for June 2016, calculated using Equations (10) and
(11). Standard errors are presented in parentheses and are computed by the delta method from the clustered standard errors
estimated in Equation (15).
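
Equations (12) and (13) are not reproduced in this appendix. The tabulated values are, however, consistent with a simple accounting, sketched below as an inferred reading of the table rather than a restatement of those equations: the number of bunchers equals the net decline in the treatment bins at or below the old notch, and the number of jumpers equals the increase in R$(70,77] net of bunchers. The function name is hypothetical.

```python
def bunchers_and_jumpers(delta_below_old_notch, delta_new_notch):
    """Inferred accounting (an assumption, not the paper's stated formulas).

    delta_below_old_notch: estimated changes in the treatment bins at or
        below the old notch, e.g. [dN_(56,63], dN_(63,70]].
    delta_new_notch: estimated change in the bin at the new notch, dN_(70,77].
    """
    bunchers = -sum(delta_below_old_notch)
    jumpers = delta_new_notch - bunchers
    return bunchers, jumpers

# Quadratic row of Table 7: columns (1), (2), and (3).
b, j = bunchers_and_jumpers([13173, -23591], 51879)
# b == 10418 and j == 41461, matching columns (4) and (5) of that row.
```

Applied to the other rows, the same accounting matches the reported values up to rounding of one household.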


C.7    Results Estimated from Nonlinear Least Squares Equation (16)




Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time
along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual number of
people reporting in the given interval minus the causal impact of the reform, β̂_1,x + β̂_2,x t, estimated using Equation (16), where
we set treat_x = 1 if x ∈ {70, 77} and K = 3. Confidence intervals are constructed from clustered standard errors at the bin level.
The timing of the reform is indicated by the gray, shaded region.

            Figure 23: Counterfactual Paths for Treatment Bins from Equation (16)


 Table 8: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation (16)

                                          (1)              (2)           (3)         (4)         (5)          (6)
            Polynomial Degree, K     ΔN_(63,70],t̄    ΔN_(70,77],t̄      B_t̄        J_t̄     MVPF_L,t̄    MVPF_U,t̄

            Quadratic, K = 2           -29,038           50,724        29,038      21,686       0.90         1.13
                                       (13,711)           (635)       (13,711)    (13,723)     (0.06)       (0.07)
            Cubic, K = 3               -28,834           49,505        28,834      20,671       0.90         1.13
                                        (5,485)           (309)        (5,485)     (5,532)     (0.02)       (0.03)
            Quartic, K = 4             -23,825           50,824        23,825      26,999       0.87         1.10
                                        (9,358)           (485)        (9,358)     (9,429)     (0.04)       (0.04)
            Quintic, K = 5             -24,571           51,228        24,571      26,658       0.87         1.11
                                        (8,311)           (426)        (8,311)     (8,374)     (0.04)       (0.04)

Note: Columns (1) and (2) show the estimated impacts of the reform on the number of single individual households reporting
incomes in bins R$(63,70] and R$(70,77] for June 2016: ΔN_(63,70],t̄ and ΔN_(70,77],t̄. Estimates are calculated from the nonlinear
least squares regression given by Equation (16) with various polynomial degrees K ∈ {2, 3, 4, 5}. Columns (3) and (4) show the
estimated number of bunching and jumping households for June 2016, B_t̄ and J_t̄, calculated using Equations (12) and (13).
Columns (5) and (6) show the estimated lower and upper bounds for the MVPF for June 2016, calculated using Equations (10)
and (11). Standard errors are presented in parentheses and are computed by the delta method from the clustered standard
errors estimated in Equation (16).


C.8    Results for Households with Constant Composition

    In this appendix, we present results for our sample of single individual households restricted to those that
do not change their reported family composition throughout the analysis period (June 2012 to June 2016). To
do so, we drop any household that reports a composition change over the analysis period. Moreover, we drop
any household that enters the registry after June 2014, as we cannot tell whether these new entrants are truly
new households or are pre-existing households (with multiple adults) pretending to be separate households so
as to increase the amount of benefits they receive. In other words, we restrict our sample to single individual
households that (a) were on the registry prior to June 2014, and (b) do not report a change in composition
over the analysis period. This reduces our sample from 1,938,653 to 1,039,573 single individual households
with incomes below R$77 in June 2016.
Table 9 presents results for this exercise. We find very similar estimates for the MVPF lower bound and
slightly smaller estimates for the MVPF upper bound. Note that the estimated numbers of jumpers and
bunchers are smaller than in Table 3 due to the smaller sample size.
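
The two restrictions can be sketched as a simple filter. The record layout below (an entry date and a list of reported compositions, one per observation) is hypothetical and stands in for whatever structure the registry data actually has; only the two conditions (a) and (b) come from the text.

```python
from datetime import date

REFORM_DATE = date(2014, 6, 1)  # June 2014

def constant_composition_sample(households):
    """Keep households that (a) were on the registry before June 2014 and
    (b) report the same composition in every observation.

    Each household is a dict with hypothetical keys:
      'entry_date'   - date the household entered the registry
      'compositions' - list of (adults, children) tuples, one per observation
    """
    kept = []
    for h in households:
        on_registry_before_reform = h["entry_date"] < REFORM_DATE
        never_changes = len(set(h["compositions"])) == 1
        if on_registry_before_reform and never_changes:
            kept.append(h)
    return kept

# Illustrative records only.
households = [
    {"id": 1, "entry_date": date(2012, 3, 1), "compositions": [(1, 0), (1, 0)]},
    {"id": 2, "entry_date": date(2015, 1, 1), "compositions": [(1, 0), (1, 0)]},
    {"id": 3, "entry_date": date(2013, 7, 1), "compositions": [(1, 0), (2, 0)]},
]
sample = constant_composition_sample(households)
```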




 Table 9: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation (15),
                            Constant Composition Households

                                            (1)              (2)           (3)        (4)         (5)          (6)
             Polynomial Degree, K      ΔN_(63,70],t̄    ΔN_(70,77],t̄      B_t̄       J_t̄     MVPF_L,t̄    MVPF_U,t̄

             Quadratic, K = 2             -4,714           19,019        4,714      14,305      0.88         1.04
                                          (3,071)          (1,106)      (3,071)     (3,404)    (0.03)       (0.02)
             Cubic, K = 3                 -6,847           17,713        6,847      10,866      0.91         1.05
                                          (2,345)           (619)       (2,345)     (2,524)    (0.02)       (0.02)
             Quartic, K = 4                 -862           13,722          862      12,860      0.89         1.01
                                          (3,483)          (1,696)      (3,483)     (3,984)    (0.03)       (0.03)
             Quintic, K = 5               -1,586           14,679        1,586      13,094      0.89         1.01
                                          (3,302)          (1,188)      (3,302)     (3,661)    (0.03)       (0.03)

Note: Columns (1) and (2) show the estimated impacts of the reform on the number of single individual households reporting
incomes in bins R$(63,70] and R$(70,77] for June 2016: ΔN_(63,70],t̄ and ΔN_(70,77],t̄. Estimates are calculated from Equation (15)
with various polynomial degrees K ∈ {2, 3, 4, 5}, restricting the sample to single individual households who do not change their
reported household composition throughout the analysis period. Columns (3) and (4) show the estimated number of bunching
and jumping households for June 2016, B_t̄ and J_t̄, calculated using Equations (12) and (13). Columns (5) and (6) show the
estimated lower and upper bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are
presented in parentheses and are computed by the delta method from the clustered standard errors estimated in Equation (15).



C.9    Results for Two Adult Households with No Children

    In this appendix, we present results for households with two adults and no children. As discussed in
Section 3.2 and Appendix B.4, the guaranteed minimum income, which was instituted in February 2013
for two adult households without kids, generates a kink in the benefit schedule at a per-capita
household income of R$35. After February 2013 and prior to June 2014, two adult families in extreme
poverty (i.e., with per-capita household incomes below R$70) get the basic benefit of R$70 plus additional
funds to bring their per-capita income up to R$70. For example, a two adult family with a combined household
income of R$40 gets R$70 plus an additional R$30 (giving them a total income of R$140) to reach the guaranteed
minimum per-capita income of R$70. The additional funds fall to zero once combined household income reaches
R$70, which is why the kink sits at a per-capita household income of R$35 during this period.
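
The schedule described above can be written as a short function. This is a sketch under stated assumptions, not the program's official formula: the function and parameter names are hypothetical, households exactly at the threshold are treated as eligible, and two adult households above the threshold are assumed to receive nothing. The defaults are the pre-reform values quoted in the text, and they are exposed as arguments so that other parameter values can be substituted.

```python
def two_adult_benefit(household_income, threshold=70.0, basic=70.0,
                      guaranteed=70.0, members=2):
    """Benefit for a two adult household with no children (sketch).

    threshold  - extreme poverty threshold, per capita
    basic      - basic benefit paid to households in extreme poverty
    guaranteed - guaranteed minimum income, per capita
    """
    per_capita = household_income / members
    # Assumption: a household exactly at the threshold is still eligible.
    if per_capita > threshold:
        return 0.0  # notch: the basic benefit is lost above the threshold
    # Top-up that brings income plus basic benefit to the guaranteed minimum.
    top_up = max(0.0, guaranteed * members - (household_income + basic))
    return basic + top_up

# Worked example from the text: combined income of R$40.
benefit = two_adult_benefit(40.0)   # 70 basic + 30 top-up = 100.0
total = 40.0 + benefit              # 140.0, i.e. R$70 per capita
```

Below a combined income of R$70 the benefit falls one-for-one with income, and above it the benefit is flat at the basic benefit, which reproduces the kink at a per-capita income of R$35.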
    The June 2014 reform not only changed the extreme poverty threshold and the basic benefit both from
R$70 to R$77 but it also changed the guaranteed minimum income from R$70 per-capita to R$77 per-
capita. Hence, the June 2014 reform changed the location and level of both the notch and the kink in the
benefit schedule (for example, the location of the notch moved from R$70 to R$77 per-capita while the
location of the kink moved from R$35 to R$38.5 per-capita). Identification Assumption 1 is now harder
to justify given that the June 2014 reform changed incentives for households to locate around R$35 by
changing the kink in the benefit schedule from R$35 to R$38.5. However, many studies have shown that
behavioral responses to kinks are typically very small; kinks generally induce substantially less bunching than
do notches (Kleven, 2016). Hence, we simply ignore the presence of the kink and use all seven-increment
income bins below R$63 as control groups just as in our main analysis. Figure 24 shows the raw data for how
the number of two adult households reporting incomes in bins R$(63, 70] and R$(70, 77] evolved in a four
year window around the reform from June 2012 to June 2016. As in the corresponding figure for one adult
households (Figure 4a), Figure 24a depicts a clear trend departure commensurate with the reform, providing
highly suggestive evidence that the BF reform induced a substantial behavioral response that increased the
number of households bunching at the new threshold. In contrast to the corresponding figure for single adult
households (Figure 4b), Figure 24b depicts a clear decrease in the number of two adult households reporting
incomes in R$(63,70], also providing highly suggestive evidence that the number of households reporting just
below the old threshold decreased as a result of the reform.




         (a) Number Reporting in R$(70,77]                              (b) Number Reporting in R$(63,70]
Note: This figure shows the number of two adult households that report incomes in the intervals R$(70, 77] and R$(63, 70] for
each month between June 2012 to June 2016. The timing of the reform (from the announcement in April 2014 to the enactment
in June 2014) is indicated by the gray, shaded region.

     Figure 24: Number of Two Adult Households Reporting an Income in R$(70,77] and
                                       R$(63,70]

    Figure 25 shows the analogue of Figure 6 for two adult households with no kids. There is a clear increase
in the number of households locating in R$(70, 77] and a clear decrease in the number locating in
R$(63, 70], just as for single individual households.




Note: This figure shows the log number of two adult households without children reporting incomes in R$(63, 70] and
R$(70, 77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal
to the actual number of people reporting in the given interval minus the causal impact of the reform, β̂_1,x + β̂_2,x t,
estimated using Equation (15), where we set treat_x = 1 if x ∈ {70, 77} and K = 3. Confidence intervals are constructed
from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

   Figure 25: Counterfactual Paths for Treatment Bins, Two Adult Households with No
                                        Children


    Results from estimating Equation (15) for various polynomial degrees K can be found in Table 10. The
MVPF bounds are roughly similar in magnitude to those for the single individual households discussed in Section
5 but are somewhat more sensitive to the degree of polynomial used.62
  62. Note these bounds are for the MVPF associated with changing the location and level of the notch only, i.e., we ignore any
welfare impacts of changing the level and location of the kink.




Table 10: Impacts of Reform and MVPF Bounds in June 2016 Estimated from Equation
                      (15), Two Adult Households with No Children

                                            (1)              (2)           (3)        (4)         (5)          (6)
             Polynomial Degree, K      ΔN_(63,70],t̄    ΔN_(70,77],t̄      B_t̄       J_t̄     MVPF_L,t̄    MVPF_U,t̄

             Quadratic, K = 2            -19,897           37,055       19,897      17,157      0.88         1.12
                                          (1,795)          (3,592)      (1,795)     (4,292)    (0.03)       (0.01)
             Cubic, K = 3                -19,862           33,645       19,862      13,783      0.91         1.12
                                          (1,221)          (2,048)      (1,221)     (2,536)    (0.02)       (0.01)
             Quartic, K = 4              -14,347           21,043       14,347       6,696      0.96         1.08
                                          (2,100)          (5,011)      (2,100)     (5,610)    (0.04)       (0.01)
             Quintic, K = 5              -14,157           25,714       14,157      11,557      0.93         1.08
                                          (1,883)          (2,625)      (1,883)     (3,467)    (0.02)       (0.01)

Note: Columns (1) and (2) show the estimated impacts of the reform on the number of two adult households reporting
incomes in bins R$(63,70] and R$(70,77] for June 2016: ΔN_(63,70],t̄ and ΔN_(70,77],t̄. Estimates are calculated from Equation (15)
with various polynomial degrees K ∈ {2, 3, 4, 5}, restricting the sample to two adult households without children. Columns (3)
and (4) show the estimated number of bunching and jumping households for June 2016, B_t̄ and J_t̄, calculated using Equations
(12) and (13). Columns (5) and (6) show the estimated lower and upper bounds for the MVPF for June 2016, calculated
using Equations (10) and (11). Standard errors are presented in parentheses and are computed by the delta method from the
clustered standard errors estimated in Equation (15).



C.10     Results for Households with Kids

    The benefit schedule for families with children is substantially more complex than the benefit schedule
for households without children, as discussed in detail in Appendix B.4. Prior to June 2014, in addition to the
guaranteed minimum income and the basic benefit, there was also a variable per-child benefit for households
below the poverty threshold of R$140 per-capita. The June 2014 reform led to changes in the levels and
locations of the guaranteed minimum income kink, the basic benefit notch, and the variable benefit notch.
For example, the poverty threshold was raised from R$140 per-capita to R$154 per-capita and the levels of
the variable benefits were also increased by around 10% (see Appendix B.4 for more details).
    Estimating the MVPF of the reform for households with children would require estimating the WTP for
all of the different changes to the BF schedule. Estimating the WTP for the change to the variable benefit
schedule would be particularly difficult given that this benefit is conditional on investments in children.
For example, suppose parents are under-investing in their children’s education and the reform increases school
attendance. Then we would need to estimate the children’s WTP for the increased education they receive as
a result of the reform to the variable benefit schedule. Thus, we leave calculating the MVPF of the reform
for households with children to future work. We do, however, show strong evidence that households with
kids respond to the reform. In particular, Figure 26 shows prima facie evidence that the reform increased
the number of households reporting incomes in R$(70, 77].




Note: This figure shows the number of households with kids that report incomes in the various bins. The number in each bin is
normalized to 1 in June 2012. The timing of the reform is indicated by the gray, shaded region.

Figure 26: Number of Households with Kids Reporting Incomes in Various Bins, Normalized
                                   to 1 in June 2012



