Policy Research Working Paper 9167

Paying More for Less
Why Don’t Households in Tanzania Take Advantage of Bulk Discounts?

Brian Dillon
Joachim De Weerdt
Ted O’Donoghue

Development Economics
Knowledge and Strategy Team
February 2020

Abstract

Do poor households shop in a way that leaves money on the table? A simple way to maximize consumption, conditional on available cash, is to avoid regularly purchasing small amounts of nonperishable goods when bulk discounts are available at modestly larger quantities. Using two-week transaction diaries covering 48,501 purchases by 1,493 households in Tanzania, this paper finds that through bulk purchasing the average household could spend 8.7 percent less without reducing purchasing quantities. Several explanations for this pattern are investigated, and the most likely mechanisms are found to be worries about over-consumption of stocks and avoidance of social taxation. Contrary to prior work, there is little indication that liquidity constraints prevent poorer households in the sample from buying in bulk, possibly because the bulk quantities under examination are not very large.

This paper is a product of the Knowledge and Strategy Team, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at bmd28@cornell.edu.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Paying More for Less: Why Don’t Households in Tanzania Take Advantage of Bulk Discounts?∗

Brian Dillon, Joachim De Weerdt, and Ted O’Donoghue

∗ Brian Dillon (corresponding author) is an assistant professor at Cornell University, Ithaca, NY, USA; his email address is bmd28@cornell.edu. Joachim De Weerdt is Senior Lecturer at the University of Antwerp and Senior Research Fellow at KU Leuven, Belgium; his email address is Joachim.DeWeerdt@uantwerpen.be. Ted O’Donoghue is a professor at Cornell University, Ithaca, NY, USA; his email address is edo1@cornell.edu. For helpful discussions and comments the authors thank Jenny Aker, Abhijit Banerjee, Chris Barrett, Jim Berry, Ann Bostrom, Peter Brummund, Arun Chandrasekhar, Paul Christian, Kelly Husted, Joe Kaboski, Kelsey Jack, Supreet Kaur, Steve Kosack, Michael Kremer, Mujobu Moyo, Sendhil Mullainathan, Rohini Pande, Robert Plotnick, Imran Rasul, Jonathan Robinson, Mark Rosenzweig, Thaddeus Rweyemamu, Hilary Wething, Dean Yang, and seminar audiences at the ASSA conference, the CSAE conference in Oxford, University of Washington, University of Alabama, Cornell University, Michigan State University, and KU Leuven.

1 Introduction

The marginal value of consumption is very high for poor households. For this reason, one expects households in low-income countries to be especially mindful of ways to arrange purchases so as to maximize consumption per dollar. This paper studies the surprising, contrary finding that many households in Tanzania purchase non-perishable goods in small increments, multiple times, over a two-week period.
If price schedules were linear and transaction costs minimal, frequent purchasing of small quantities would have no impact on the budget set. If bulk discounts were available, but only for large quantities representing several weeks or months of consumption, there could be many reasons not to buy in bulk. However, we find that many items exhibit bulk discounts over modest quantity ranges. Households appear to be systematically over-paying for consumption goods in order to maintain a pattern of frequent, small-quantity purchases.

The goal of this paper is to better understand this phenomenon. We aim to answer two questions. First, how much consumption is foregone because of small-quantity purchasing? Second, why do households make purchases in this way?

To answer the first question we analyze transaction diaries maintained by 1,493 Tanzanian households over a two-week period. Diary respondents recorded the date, quantity, price, and detailed description of every purchase. From these data we calculate the total quantity purchased at the household-item level over two weeks, the total amount paid, and the counterfactual cost if the household had purchased the total quantity in one transaction. This last step makes use of expenditure schedules that we estimate from the data (and validate with market price surveys). We limit our analysis to 19 standardized items that are non-perishable over the relevant quantity and time ranges. The item list includes numerous staples: maize, rice, cooking oil, kerosene, onions, dried sardines.

We find that across items purchased multiple times over two weeks, the value of forgone consumption is equal to 8.7% of the value of expenditure.
By simply bundling purchases made over a short period of time, the average household could spend almost 9% less on a range of important goods without reducing consumption.1 There is substantial heterogeneity: 9% of households have zero forgone consumption, while nearly a quarter could reduce expenditure by 10% or more without reducing quantity purchased. If we take the alternative approach of holding expenditure constant and calculating the counterfactual quantities that could be purchased by buying in bulk, we find even larger average values. Households could purchase 33% more kerosene, 50% more cooking bananas, 24% more cooking oil, 46% more onions, 18% more dried sardines – surprisingly large amounts for goods that are part of daily life in Tanzanian villages.

1 This should not be thought of as a 9% return on a two-week investment, which could be re-invested for annualized return of over 900%. The implicit return to the household is limited by the bundle it purchases, which we hold constant in calculating the value of foregone consumption. Hence, the expected financial savings over the course of an entire year is still in the neighborhood of 9%. Of course, the household might have other ways to reinvest that 9% savings and further improve its financial position.

It appears that there could be substantial welfare gains from rearranging purchases to avoid buying small quantities at high mark-ups. Whether that is actually true depends on the reasons that shoppers arrange purchases this way. The second half of the paper is dedicated to understanding why households forego bulk discounts. We consider a range of hypotheses that come from the literature, from discussions with Tanzanians, or from discussions with other researchers. These include: binding liquidity constraints prevent households from buying in bulk; people enjoy going to the market and shopping; it is too costly to transport bulk quantities; it is too costly to store bulk quantities; consumers are not aware of bulk discounts; coordination failures among household members lead to financially inefficient purchasing; frequent purchases are a way to avoid over-consuming stocks; frequent purchases are a way to avoid stocks that attract friends and neighbors requesting hand-outs.

The paper presents evidence against all of these hypotheses except the last two, for which there is support. The finding that households make frequent purchases as a way to ration consumption, perhaps due to worries that they will consume large stocks more quickly than they would like, is based on the analysis of shopping patterns for “temptation goods,” which we identified through a separate survey effort in Tanzania. The suggestive evidence that households purchase in small quantities in order to avoid social taxation (non-household members consuming a portion of their purchases) aligns with recent work on redistributive pressures in similar settings (Anderson and Baland, 2002; Platteau, 2006; Goldberg, 2016; Baland, Guirkinger and Mali, 2011; Alby, Auriol and Nguimkeu, 2013; Jakiela and Ozier, 2016; Squires, 2016).

The final section of the paper shows that the rationing and social tax channels operate independently, rather than as manifestations of a single underlying set of choices. Nonetheless, these mechanisms do have an important feature in common. Both involve limiting the near-term consumption of a future agent (one’s self, other household members, or non-household members) by avoiding stocks. This raises the question of whether the observed purchasing patterns are sub-optimal. While it is likely that some households could increase utility by buying in bulk, the calculation that the average household could increase consumption by 8.7% does not take into account the possibility of leakage through higher social taxes or over-consumption. In light of this, we are cautious about policy recommendations, discussing instead some directions for future research.
This paper makes four main contributions. The first is to provide a plausible estimate of the value of consumption foregone from paying high mark-ups on small-quantity purchases. There is a large literature from both developing and developed countries on consumer choice when there are bulk discounts (Frank, Douglas and Polli, 1967; Kunreuther, 1973; Wansink, 1996; Chung and Myers, 1999; Bray, Loomis and Engelen, 2009; Griffith et al., 2009; Beatty, 2010; Orhun and Palazzolo, 2016; Rao, 2000; Attanasio and Frayne, 2006; Mussa, 2015; Attanasio and Pastorino, 2015; Gibson and Kim, 2018). Yet, to our knowledge, this is the first paper to estimate the value of forgone consumption from not buying in bulk.

The second contribution is to provide evidence that liquidity constraints are not the key driver of small-quantity purchasing. This finding contrasts with much prior work on this issue in developing countries (Rao, 2000; Attanasio and Frayne, 2006; Mussa, 2015; Attanasio and Pastorino, 2015), the exception being a recent paper from urban Papua New Guinea showing that households have sufficient liquidity to buy rice in bulk (Gibson and Kim, 2018). One of the benefits of the data is that we do not need to speculate about whether households have sufficient cash to buy in bulk. The diary of purchases reveals a lower bound on each household’s available liquid resources. We use the observed time path of expenditures at the household-item level to ask the following: how many days would a household have to delay buying the item, and delay buying non-essentials such as sugar and tea, before it had accumulated sufficient savings to buy the item in bulk? This is similar to an exercise in Mullainathan and Shafir (2013). After doing this once, the household could buy in bulk in perpetuity, if a liquidity constraint were the problem. The average required delay in our sample is only 1.2 days.
Among the poorest members of the subset of households that are the least financially efficient purchasers, the average is only 2.9 days. For various reasons described in Section 5, these figures are upper bounds. These durations are too short for liquidity constraints to drive the majority of financial losses in our data.2 In a broad sense, this finding matches the spirit of the evidence in the Portfolios of the Poor (Collins et al., 2010). In the Portfolios, fine-grained details about household finances show that even the poorest in poor countries have complex financial lives. In our setting, a detailed picture of households’ liquid resources reveals greater cash flow than we may have otherwise assumed, and motivates us to look beyond liquidity constraints for other factors that deter bulk purchasing. The third contribution is to document a clear connection between the temptingness of a good and consumers’ propensity to purchase it in bulk. We asked a set of experts on Tanzanian village life to identify which goods might be subject to over-consumption relative to one’s ex ante plan (Banerjee and Mullainathan, 2010). Because not all goods exhibit bulk discounts in every location, the effect of temptingness on purchase frequency is identified even while controlling for household fixed effects. We find robust evidence that households purchase tempting items more frequently than non-tempting items, causing them to forego some bulk discounts. The final contribution of the paper is to provide novel insights into the way that social taxation distorts the allocation of resources. Because we observe flows of both incoming and outgoing resources, we are able to construct a proxy measure of each household’s social tax rate, and examine the relationship between social taxes and bulk purchasing. We find a strong link: households that buy in bulk tend to face higher social tax rates. 
Additional analysis supports the interpretation that causality runs from buying larger quantities to higher social tax rates, rather than the other way around. It seems highly likely that it would be optimal for some households to reduce transaction quantities as a way to deter requests from friends and family, although this final piece is not something we can test directly.

2 If we were to examine other questions about bulk purchasing, such as why households do not buy large wholesale quantities of consumption items and store them for months at a time, we might arrive at a different conclusion about liquidity.

2 Conceptual framework

In this section we develop a conceptual framework for the empirical analysis. We begin with a stylized example, and then develop the approach formally.

Motivating example

In one of the study districts, maize is sold in the following three quantities: 1 kg for 650 TZS; 2 kg for 1200 TZS; and 3 kg for 1350 TZS.3 This price schedule exhibits bulk discounts. The unit (per-kg) price is 650 TZS for the 1 kg purchase, 600 TZS for 2 kg, and 450 TZS for 3 kg or more. A household that wishes to consume 4 kg of maize over two weeks has (at least) three options: purchase 4 kilograms all at once and consume it over the two weeks; purchase 2 kilograms, consume it over the first week, then purchase another 2 kilograms at the start of the second week; or purchase 1 kilogram four times over the course of the two weeks. From a purely financial perspective, purchasing the 4 kg in a single transaction (buying in bulk) is most efficient.

Suppose that instead, a household purchases a 1 kg bag on four occasions over the two weeks. This raises two questions. First, how large are the losses incurred from not buying in bulk? Second, why would the household do this? The answer to the second question is the subject of Section 5. There are two ways to answer the first question.
We can calculate a financial loss by taking the household’s actual expenditure on the 4 kg of maize (2600 TZS) and subtracting the cost of purchasing the entire 4 kg in bulk (1800 TZS).4 In this case the financial loss from small-quantity maize purchasing is 800 TZS. Alternatively, we could calculate the quantity that could have been purchased by spending the 2600 TZS all at once at the lowest per-unit price. This is 5.78 kg (2600 TZS at 450 TZS per kg), which represents a potential quantity increase of 1.78 kg.

3 These values represent three “focal quantities” (commonly observed quantities) and the median expenditures for those focal quantities. See Sections 2 and 4.

4 Consistent with our formal approach, this example assumes that a household can purchase any quantity larger than 3 kg at the per-unit price that applies for a 3 kg transaction (see discussion in the next subsection).

Framework for empirical analysis

To formalize the above approach, we begin with a set of focal quantities. A focal quantity should be interpreted as roughly a package size or a common unit of trade, analogous to the three quantities in the maize example. In some cases these focal quantities correspond to actual package sizes from mass produced items, such as 1-liter bottles of cooking oil. In other cases, local units have emerged over time as vendors have adopted widely available canisters as standard units of trade.

Suppose item $i$ is available in $R$ focal quantities $\{q_r\}_{r=1}^{R}$, ordered so that $q_1 < q_2 < \dots < q_R$. Let $e_r$ denote the expenditure required to purchase quantity $q_r$, and let $p_r$ denote the associated unit price, so $p_r = e_r/q_r$. If the focal quantities (weakly) exhibit bulk discounts, we have $p_1 \ge p_2 \ge \dots \ge p_R$.

Our approach will be to identify focal quantities empirically, using commonly observed transaction quantities. In the following section we provide details. For now, we take it as given that focal points $\{q_r\}_{r=1}^{R}$ and $\{p_r\}_{r=1}^{R}$ are known for each item.
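To make the notation concrete, the following minimal Python sketch encodes the three maize focal quantities from the motivating example, checks the weak bulk-discount condition on unit prices, and reproduces the two loss calculations. The variable names are illustrative and do not come from the paper. The sketch assumes, as in footnote 4, that any quantity above 3 kg can be bought at the 3 kg unit price.

```python
# Focal quantities (kg) and expenditures (TZS) from the maize example.
focal_q = [1, 2, 3]
focal_e = [650, 1200, 1350]

# Unit prices p_r = e_r / q_r, giving 650, 600 and 450 TZS per kg.
unit_prices = [e / q for q, e in zip(focal_q, focal_e)]

# Weak bulk discounts: unit prices are non-increasing in quantity.
has_bulk_discounts = all(a >= b for a, b in zip(unit_prices, unit_prices[1:]))

# The household buys 1 kg on four separate occasions.
total_quantity = 4
actual_expenditure = total_quantity * focal_e[0]      # 4 x 650 = 2600 TZS

# Assumption (footnote 4): quantities above 3 kg trade at the 3 kg unit price.
bulk_unit_price = unit_prices[-1]                     # 450 TZS per kg
bulk_expenditure = total_quantity * bulk_unit_price   # 1800 TZS

financial_loss = actual_expenditure - bulk_expenditure  # 800 TZS
bulk_quantity = actual_expenditure / bulk_unit_price    # about 5.78 kg
extra_quantity = bulk_quantity - total_quantity         # about 1.78 kg

print(has_bulk_discounts, financial_loss, round(bulk_quantity, 2), round(extra_quantity, 2))
```

The two loss figures answer different questions: the first holds quantity fixed and asks how much money is saved, while the second holds spending fixed and asks how much more maize could be bought.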
Over the study period, household $h$ buys item $i$ in $K$ separate transactions. Let $k = 1, \dots, K$ index the household’s purchases, with the associated quantities and expenditures denoted $q_{hik}$ and $e_{hik}$. Observed total expenditure is $e_{hi} = \sum_{k=1}^{K} e_{hik}$, and observed total quantity is $q_{hi} = \sum_{k=1}^{K} q_{hik}$. Our goal is to calculate (i) the financial savings if $h$ had instead purchased $q_{hi}$ in a single transaction, and (ii) the extra quantity if $h$ had instead spent $e_{hi}$ in a single transaction.

These calculations require knowing the expenditure associated with any transaction quantity. To reflect the reality of shopping in these markets, we base such estimates on the expenditure required to make purchases with the focal quantities. Specifically, we define the expenditure schedule, $e_i^*(q)$, as the weighted average of the expenditures for the nearest focal quantities on either side of $q$: for any $q \in [q_r, q_{r+1}]$,

$$e_i^*(q) \equiv \frac{q_{r+1} - q}{q_{r+1} - q_r}\, e_r + \frac{q - q_r}{q_{r+1} - q_r}\, e_{r+1}.$$

Similarly, the lowest unit price ($p_R$) is assigned to any quantity greater than the largest focal quantity (i.e., $e_i^*(q) = q p_R$ for any $q > q_R$), and the highest unit price ($p_1$) is assigned to any quantity less than the smallest focal quantity (i.e., $e_i^*(q) = q p_1$ for any $q < q_1$). The expenditure schedule can be converted into a unit price schedule using $p_i^*(q) = e_i^*(q)/q$.

There are two interpretations of these weighted averages. The first relates to behavior in the market. Consider a shopper in the above example trying to buy 2.5 kg of maize in a single transaction. She may argue that she should pay at most the 2 kg unit price, and perhaps the lower 3 kg unit price. If the probability of receiving a particular unit price is proportional to the distance to the nearest focal quantities, our measure assigns the expected value. A second interpretation relates to our choice of an aggregation period of two weeks, a necessary but somewhat arbitrary decision.
In most cases, $q_{hi}$, the aggregate quantity purchased over two weeks, will not correspond to an exact focal quantity. We could just as easily aggregate purchases over a longer or shorter time period to ensure that $q_{hi}$ is equal to a focal quantity. Our approach effectively calculates the expenditure associated with aggregating to the next lower or next higher focal quantity, and then takes a weighted average.

In the next section it will be apparent that bulk discounts are clearly identifiable within-village and even within-household. For power reasons, we will construct $e_i^*(q)$ at the district level. This is not as restrictive as it may sound. The identifying assumption required to still permit household-specific price schedules is that any within-district differences in expenditure schedules take the form of linear shifts over the relevant ranges. That is, if $e_i^*(q)$ is the district-level price schedule for item $i$, but household $h$ faces price schedule $e_i^*(q) + \gamma_{hi} q$ for some scalar $\gamma_{hi}$, then our loss estimates are unbiased. What matters for the analysis are the relative unit price differences across quantities, not overall differences in price levels. With this approach, households in a district can still face different price schedules for any number of reasons (variation in bargaining power, village effects, or others).

The majority of observed transactions take place at focal quantities. For those that do not, and for focal quantity transactions that are not at focal prices, we project all observed transactions onto the expenditure schedule prior to aggregation. That is, if observed expenditure is represented as $e_{hik} = e_i^*(q_{hik}) + \nu_{hik}$, where the $\nu_{hik}$ is an idiosyncratic component, then the adjusted expenditure is $\hat{e}_{hik} \equiv e_i^*(q_{hik})$. The household’s adjusted total expenditure on item $i$ is $\hat{e}_{hi} = \sum_{k=1}^{K} \hat{e}_{hik}$. Using adjusted total expenditures in our calculations of losses ensures that our results are not distorted when a household’s actual expenditure in a particular transaction happens to be above or below the expenditure schedule.

We define the financial loss (or “bulking loss”) on item $i$ as the forgone financial savings from purchasing $q_{hi}$ in a single transaction. This is calculated as

$$L_{hi} \equiv \hat{e}_{hi} - e_i^*(q_{hi}) = \sum_{k=1}^{K} e_i^*(q_{hik}) - e_i^*(q_{hi}).$$

The related measure, expressed as a percentage of expenditure, is the percentage loss $\tilde{L}_{hi} \equiv L_{hi}/\hat{e}_{hi}$. Alternatively, to find the extra quantity that household $h$ could purchase if it held expenditure constant but aggregated its spending on $i$ into a single transaction, we invert the expenditure schedule and calculate the quantity loss as $Q_{hi} \equiv e_i^{*-1}(\hat{e}_{hi}) - q_{hi}$. Likewise, percentage quantity loss is $\tilde{Q}_{hi} \equiv Q_{hi}/q_{hi}$. By construction, all four measures are zero if a household buys an item only once over two weeks.5 For most of the analysis we focus on the financial-loss measures, $L_{hi}$ and $\tilde{L}_{hi}$, because these can easily be aggregated across items (we will typically refer to these as “loss” and “percentage loss”). The quantity-loss measures provide an additional way to understand the magnitude of the purchasing inefficiencies in the data.

Figure 1 gives a visual example. Imagine a household that buys maize in the market described in the previous subsection. The household reports three maize purchases over the observation period: 1 kg for 650 TZS, 1 kg for 750 TZS, and 1.5 kg for 975 TZS. The × in Figure 1 mark the actual transactions, with the unit price schedule in the left panel and the expenditure schedule in the right panel. Observed expenditure is 2375 TZS (point A). Adjusted expenditure is $e^*(1) + e^*(1) + e^*(1.5) = 650 + 650 + 925 = 2225$ (point B). Counterfactual expenditure from bulk purchasing is $e^*(3.5) = 1575$ (point C).
These three expenditure values are associated with the total observed quantity of 3.5 kg. The counterfactual quantity that could be purchased using the total adjusted expenditure of 2225 all at once is $e_i^{*-1}(2225) = 2225/450 = 4.9$ kg (point D). For this example, the financial measures of loss are $L_{hi} = 2225 - 1575 = 650$ (the vertical distance between points B and C), and $\tilde{L}_{hi} = 650/2225 = 29.2\%$. The quantity measures of loss are $Q_{hi} = 4.9 - 3.5 = 1.4$ kg (the horizontal distance between points B and D), and $\tilde{Q}_{hi} = 1.4/3.5 = 40\%$. This household could reduce expenditure on maize by 29.2% without reducing quantity consumed, or increase the quantity of maize consumed by 40% without increasing expenditure.

5 In this respect the approach is conservative. The items we study are popular consumer goods in Tanzania, and in many cases they can be stored for months. Households that purchase item $i$ only once over the study period could in all likelihood reduce expenditure by bulk purchasing for a longer time period.

This approach to constructing a counterfactual never requires that households have access to additional cash in order to buy in bulk (although they might need the cash a little sooner, once in the lifetime of the household; see the extensive analysis in the first subsection of Section 5). Because we use observed total two-week quantities in constructing the counterfactual (not large, hypothetical purchases off of the observed support), buying in bulk as defined here can only increase, not decrease, the household’s cash reserves over the total two-week period.

Because we are estimating price schedules, not demand curves, the approach is not threatened by censoring concerns related to the two-week observation window. The most likely form of data censoring is that very large purchases, e.g., of wholesale bags of grain, are too infrequent to appear as focal quantities, even though they may be widely available at markets.
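The Figure 1 example can be reproduced with a short Python sketch of the expenditure schedule, its inverse, and the four loss measures. The function names and the hard-coded maize focal points are illustrative assumptions, not the authors’ code. The inverse additionally assumes that focal expenditures are strictly increasing in quantity.

```python
from bisect import bisect_right

# Maize focal quantities (kg) and focal expenditures (TZS) from the example.
FOCAL_Q = [1.0, 2.0, 3.0]
FOCAL_E = [650.0, 1200.0, 1350.0]

def expenditure(q, qs=FOCAL_Q, es=FOCAL_E):
    """Expenditure schedule e*(q): linear interpolation between focal points."""
    if q <= qs[0]:                      # below smallest focal quantity: price p_1
        return q * es[0] / qs[0]
    if q >= qs[-1]:                     # above largest focal quantity: price p_R
        return q * es[-1] / qs[-1]
    r = bisect_right(qs, q) - 1         # segment with qs[r] <= q < qs[r + 1]
    w = (q - qs[r]) / (qs[r + 1] - qs[r])
    return (1 - w) * es[r] + w * es[r + 1]

def quantity_for(e, qs=FOCAL_Q, es=FOCAL_E):
    """Inverse schedule: quantity purchasable with spending e in one transaction.

    Assumes es is strictly increasing, so the schedule is invertible.
    """
    if e <= es[0]:
        return e * qs[0] / es[0]
    if e >= es[-1]:
        return e * qs[-1] / es[-1]
    r = bisect_right(es, e) - 1
    w = (e - es[r]) / (es[r + 1] - es[r])
    return qs[r] + w * (qs[r + 1] - qs[r])

def bulking_losses(purchase_quantities):
    """Return L, L-tilde, Q and Q-tilde for one household-item pair."""
    q_total = sum(purchase_quantities)
    # Adjusted expenditure: each transaction projected onto the schedule.
    e_adj = sum(expenditure(q) for q in purchase_quantities)
    loss = e_adj - expenditure(q_total)
    qty_loss = quantity_for(e_adj) - q_total
    return {
        "loss": loss,
        "pct_loss": loss / e_adj,
        "qty_loss": qty_loss,
        "pct_qty_loss": qty_loss / q_total,
    }

print(bulking_losses([1.0, 1.0, 1.5]))
```

For purchases of 1, 1, and 1.5 kg the sketch gives a loss of 650 TZS (29.2% of adjusted expenditure) and a quantity loss of about 1.44 kg. Without intermediate rounding the percentage quantity loss is about 41%, slightly above the 40% obtained from the rounded 1.4 kg figure.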
The absence of these large-quantity purchasing opportunities means that our calculations are likely to be lower bounds on actual losses.

3 Data and descriptive patterns

The data for this paper are from the Survey of Household Welfare and Labor in Tanzania (SHWALITA). The survey was part of an experiment to test the impact of questionnaire design on consumption measures (see Beegle et al. (2012) for details). In one arm of the study, 9 households per village were randomly assigned to complete a consumption diary. Three of these households completed a single, household-level diary, with no monitoring by project staff. Three completed a single household diary but received multiple follow-up visits from field staff. For the last three households each adult member kept their own diary, with children placed on the diary of the most knowledgeable adult. Households in the third group received multiple follow-up visits, similar to those in the second group. The differences between module arms are small but not zero, and they have no impact on the findings in this paper. We control for diary type whenever relevant.

The SHWALITA survey was conducted in 24 villages per district, in 7 districts. The resulting data set includes responses for 1,512 diary households. After dropping households that did not purchase any of the items that we study or that did not complete the end-line survey, we are left with a sample size of 1,493. Data were collected from September 2007 to August 2008. All households in a village completed their diaries over the same 14 days. Survey work in each district was completed in less than two months.6

Each study household maintained a transaction diary for 14 days. The diary took the form of a paper log book, with a separate row for each transaction. On each day during the observation window, diary keepers noted the quantity, unit, value, and description of every item that entered or exited the household.
If multiple items were purchased during a single shopping trip, each item received its own entry (row) in the diary. Purchases, gifts, own production, and stock adjustments were recorded separately. Because this paper deals with purchasing behaviors, we use only those rows of the transaction diaries that indicate purchases, unless otherwise specified.

After collecting the diaries from respondents, project staff used the item descriptions to assign each purchased item to one of 73 categories, covering 58 food items and 15 non-food items. These categories are similar to those that would be available in a typical consumption survey. However, because we have access to the raw diary data, we use the detailed descriptions of each item to further narrow the definitions of the study items. We do this by dropping entries for which the description does not match that of the modal entry within the item category. For example, we drop “unrefined sugar” from “Sugar,” retain only “dried beans” (excluding soy beans) from the original category of “Peas, beans, lentils and other pulses,” keep only “immature coconuts” in the “Coconut” category, restrict the “Dried fish” group to only “dried sardines,” excluding larger fish, and so on. This removes much of the bias from grouping goods of different variety under a single heading. The end result is a data set in which items are far more uniform than those in a typical consumption survey.

We also drop items with too few observations, and drop perishable items that cannot be stored for two weeks by most households (such as beef, milk, and fresh fish). We do not drop or retain items based on whether the price schedule exhibits bulk discounts. After selecting the items, we remove outliers by dropping any observations for which the quantity or unit price is more than 4 standard deviations from the item-level mean, and dropping observations from item-district cells that have insufficient representation to construct a price schedule (fewer than 15 observations). This reduces the sample size by roughly 1%.

6 More details are available at the project page, accessible here: http://edi-global.com/publications/. Data are available by contacting the lead SHWALITA researchers listed on the project page.

A further cleaning step was required to standardize units. Respondents reported many quantities in kilograms and liters, but others in bunches, heaps, tins, ladles, buckets or bundles. For some items these units were measured during the market price survey. For other commonly reported units, the team went back specifically to measure the kg or liter conversion. We use the median, district-level conversion rates to convert local units into kilograms or liters.7

The final data set contains details for 48,501 purchase transactions. Descriptive details for the final set of 19 study items are reported in Table 1. Maize and cooking bananas (staple carbohydrates) are the items purchased in the largest kilogram quantities. Comparing transaction quantity to total 2-week consumption in Table 1, it is clear that households tend to buy items multiple times over two weeks.

Table 2 provides more details on this and other patterns in the diary data. The total number of observed transactions ranges from 688 (maize) to 5319 (cooking oil). The average item was purchased by just under half of the sample (733 households), and was purchased multiple times by just over a third of the sample (505 households). Some items, such as cooking oil, kerosene, sugar, dried sardines, and onions, were purchased more than once by a majority of households. Among the households that purchase each item, the highest average expenditure is on maize at 7,354 TZS/household, and the lowest is on matches at 187 TZS/household. The average number of purchases per item is 3.6.8

The SHWALITA team also conducted a market price survey in each village, in conjunction with the household survey.
Markets are relatively dense in Tanzania: 97% of study households live within 10 kilometers of the nearest market, and the median distance is 1.15 km. For 42 food items (10 of which meet the criteria for inclusion in this study), enumerators visited the village market and recorded the most common units in which each item was sold. They precisely measured the unit in kilograms or liters, and noted the price. Unit prices were collected for up to three different units at the item-vendor level, with the units determined by the vendor based on the most common units of trade. This was done for three vendors per market, with 1-3 visits per vendor. The team repeated the exercise at multiple markets if there was more than one in a village.

7 Helpfully, 98.7% of purchases recorded in units other than kilograms or liters were recorded as integer values. Most of the decimal entries in kilograms or liters stem from unit conversions, not from respondents being forced to convert non-standard units into kilograms or liters on the spot.

8 Additional descriptive statistics for the survey households are provided in Appendix S1.

While writing this paper we collected two types of additional data. The first came from informal interviews and focus groups in Tanzania, during the years 2012-2015. We conducted 10 interviews with consumers or shopkeepers, and held three informal focus groups, each with 5-6 people. These discussions helped us identify hypotheses for why households might forego bulk discounts, and provided relevant anecdotes about consumer behavior. Our second data collection effort was an on-line survey conducted in June-July 2016. This short survey was sent to a group of Tanzanians with extensive knowledge of household decision-making around economic issues. We describe this survey in the relevant subsection of Section 5.

4 Quantifying the value of forgone consumption

In this section we estimate the value of consumption that households forego by not buying in bulk.
The first subsection describes the bulk discounts and the estimated expenditure schedules. The next subsection uses the estimated expenditure schedules to provide estimates of the financial losses and quantity losses, and examines heterogeneity.

Bulk discounts in the data

While our main analysis uses focal points in the diary data to estimate price schedules (as described in Section 2), it is easy to see that bulk discounts are present in simple linear models. We first estimate such regressions using the market price survey data, which were collected in tandem with the diaries. Columns 1–3 of Table 3 show slope coefficients from item-specific, transaction-level regressions, with log unit price as the dependent variable and log quantity as the independent variable. These regressions include district (column 1), village (column 2), or vendor (column 3) fixed effects, with standard errors clustered at the village level.9 The majority of coefficients in columns 1–3 are negative and statistically significant. There is only one positive and significant slope coefficient, on cooking bananas (column 2), but it is statistically indistinguishable from zero when we look within vendor (column 3).

For comparison purposes we estimate similar regressions using the diary data. Columns 4 and 5 of Table 3 show the slope coefficients. These estimates are not directly comparable with those in columns 1–3, because for some items the quantity support is different across the two data sets. Our interest is primarily in the signs of the estimated coefficients. In column 4 of Table 3, all but two estimated slope coefficients are negative, and none are positive and statistically significant. Looking within household (column 5), all coefficients are negative and 16 of 19 are statistically significant.10

Taken together, the clear pattern in the market survey and diary data sets is that unit price is decreasing in quantity for many items.
This finding holds within village-day, within-vendor, and within-household.11

Having established that bulk discounts exist, we turn to the non-parametric approach of Section 2 to estimate expenditure schedules. The focal-point approach is less susceptible to measurement error than one based on parametric approximations of price schedules, and it reflects the reality of shopping in these markets. We designate a quantity as focal if it accounts for at least 5% of all observations at the item-district level.12 By this definition there are 1-8 focal quantities per item-district, with 3.4 on average. Overall, 69% of purchases are at focal quantities. We use the median unit price at the focal quantity to estimate the focal price.

To provide a visual example of the estimated schedules, Figure 2 depicts expenditures and unit prices by quantity for 702 purchases of kerosene in one of the study districts. The size of the circles corresponds to the number of transactions at the circle center. The triangles represent the estimated focal points, and the solid lines mark the unit price (left panel) and expenditure (right panel) schedules. At left, the downward orientation of the unit prices is clear. On the right, the changing slope of the expenditure line represents the drop in unit prices as quantity increases. The clustering of purchases at focal quantities is also clear.

9 For these descriptive regressions there would be no benefit to pooling and using a full set of interactions. The units and scale vary across items, so we allow the levels of the fixed effects to vary as well.
10 To verify that the negative relationship between quantity and unit price is not due to division bias (from constructing unit price as the quotient of two variables measured with error), we also estimate expenditure schedules by regressing transaction-level expenditure on quantity and its square, suppressing the constant and fixed effects to enforce regression through the origin. The coefficient on $q^2$ is negative and significant for 13 of 19 items, and never positive and significant. This indicates that expenditure schedules are generally concave, which is consistent with bulk discounts. Results are in Appendix S2.
11 In Appendix S3 we show that the extensive margin probability that households purchase an item is not related to the degree of bulk discounts, as measured by the slope coefficients from column 5 of Table 3.
12 We drop the roughly 1 in 5 candidate focal quantities that either require greater total expenditure than a larger-quantity focal point, or that have a higher unit price than a smaller-quantity focal point, because these points can never be part of an optimal counterfactual purchase. This is a conservative step. Because this also impacts adjusted expenditure, dropping these points slightly attenuates losses.

In each district there are some items that do not exhibit bulk discounts. Estimated unit price schedules are flat for 51 of the 126 item-district groups in the data (40%). Some items exhibit bulk discounts in every district, such as maize, cooking oil, kerosene, and tea leaves. Other items, including sweet bananas, cooking bananas, onions, salt, sugar, and dried sardines, have downward-sloping price schedules in three or more districts. Cigarettes, beans, rice, cassava, and matches only exhibit discounts in 1 or 2 districts.13

A natural concern with price schedules estimated from observational data is that unobserved quality variation could be misinterpreted as bulk discounts, if higher quality versions of each item are disproportionately purchased in smaller quantities, and unit prices factor in quality. There are three reasons to believe that this is not the case here. First, as described in Section 3, we are able to use the item descriptions in the diaries to create highly standardized item groups. This mitigates unobserved variation from different product types being grouped together.
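As an aside on the focal-point rule and the screening step in footnote 12, the following is a minimal sketch of how such a rule might be coded for a single item-district cell. The function names, the `(quantity, expenditure)` input layout, and the example purchases are all hypothetical and are not taken from the paper; comparing each candidate against all other candidates (rather than iteratively) is also an assumption.

```python
from collections import Counter
from statistics import median


def focal_points(transactions, min_share=0.05):
    """Return (quantity, median unit price) pairs for one item-district cell.

    transactions: list of (quantity, expenditure) tuples. This input layout
    is a hypothetical choice for illustration, not the authors' data format.
    """
    n = len(transactions)
    counts = Counter(q for q, _ in transactions)
    points = []
    for q in sorted(counts):
        # A quantity is focal if it holds at least min_share of the cell's observations.
        if counts[q] / n >= min_share:
            unit_prices = [e / qq for qq, e in transactions if qq == q]
            points.append((q, median(unit_prices)))
    return points


def drop_dominated(points):
    """Drop focal points that cost more in total than a larger-quantity point,
    or that carry a higher unit price than a smaller-quantity point."""
    kept = []
    for q, p in points:
        costlier_than_larger = any(qj > q and qj * pj < q * p for qj, pj in points)
        pricier_than_smaller = any(qj < q and pj < p for qj, pj in points)
        if not (costlier_than_larger or pricier_than_smaller):
            kept.append((q, p))
    return kept


# Made-up kerosene-style purchases: (liters, TZS spent).
purchases = [(0.25, 500)] * 10 + [(0.5, 900)] * 6 + [(1.0, 1600)] * 4 + [(0.3, 650)]
schedule = drop_dominated(focal_points(purchases))
```

In this made-up cell the single 0.3-liter purchase falls below the 5% threshold, so the resulting schedule has three focal quantities with unit prices that decline in quantity.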
Second, there is no quality variation across quantities in the market survey price data, because enumerators asked vendors to provide prices for different quantities of the exact same item. Yet, we see clear evidence of bulk discounts in those data (Table 3, columns 1-3), and there is no clear pattern as to whether the market price data or the survey data exhibit larger discounts (4 out of 10 coefficients in column 3 of Table 3 are larger in magnitude than their counterparts in column 5). Finally, while nearly half of observed purchases take place exactly at a focal point, the other half are priced above or below the estimated price schedule. One might believe that some of this variation reflects unobserved quality, and, if that is true, that wealthier households may systematically pay prices above the price schedule while poorer households pay prices below it (because demand for quality is increasing in wealth). In Appendix S3 we show that deviations from the price schedule are not correlated with wealth. This is indirect evidence that the steps we took to homogenize the items were successful.14

13 Why discounts emerge for only some items, and why they persist despite apparently robust competition in retail markets, are open questions not addressed in this paper. See Attanasio and Pastorino (2015).
14 In Appendix S3 we also argue that bulk discounts cannot be the manifestation of social capital or buyer-seller relationships, because the prices collected by enumerators in the market price surveys also reflect bulk discounts.

The value of forgone consumption

Recall from Section 2 that the quantity of consumption forgone is given by $Q_{hi} = q^*_{hi} - q_{hi}$, where the first term is the inverse of the expenditure function evaluated at total adjusted expenditure $\hat{e}_{hi}$, and the second term is total observed quantity. Likewise, the financial loss, or value of forgone consumption, is defined as $L_{hi} = \hat{e}_{hi} - e^*_{hi}$, where $\hat{e}_{hi}$ is total adjusted expenditure and $e^*_{hi}$ is the cost of buying $q_{hi}$ in a single transaction. Summing across items at the household level gives $L_h = \sum_i L_{hi}$.

Columns 1–3 of Table 4 report the item-level means of total observed quantity, $q_{hi}$, counterfactual quantity, $q^*_{hi}$, and the counterfactual percent increase in quantity, $\tilde{Q}_{hi}$. Calculations in this table are based on all households that purchase an item more than once. The results are striking: without changing total expenditure, households could increase quantity purchased by over 15% on average. Potential quantity increases are over 25% for kerosene, onions, cooking bananas, cooking oil, and tea leaves, and are almost 18% for dried sardines. These are staple goods: kerosene is the primary lighting fuel in much of Tanzania, cooking bananas are a staple carbohydrate in the two districts where they are commonly purchased, dried sardines are a key source of protein, and cooking oil is the main source of cooking fat. Most households purchase one or more of these goods: 85% purchase kerosene, 78% purchase cooking oil, 70% purchase dried sardines, and 22% purchase cooking bananas (Table 2). The consumption losses implied by columns 1–3 of Table 4 are substantial at face value.

The money-metric measures of loss tell a similar story. In columns 5–7 of Table 4 we report summary statistics for $\hat{e}_{hi}$, $L_{hi}$, and $\tilde{L}_{hi}$. Items are listed by decreasing values of column 7, so that high loss items are at the top (from now on we will usually display items in that order). On average, losses represent 8.7% of total expenditure at the household-item level. For a number of frequently purchased items (kerosene, onions, cooking bananas, cooking oil and tea leaves) losses represent more than 15% of expenditure.

In columns 5–7 of the lower panel of Table 4 we report summary statistics for all households represented in the upper part of the table, divided into those above/below median $L_h$.
We calculate household-level means by first summing adjusted expenditure ($\hat{e}_h = \sum_i \hat{e}_{hi}$) at the household level, then averaging. We define the household-level percentage loss measure, $\tilde{L}_h$, as $\tilde{L}_h = L_h / \hat{e}_h$. Not surprisingly, losses vary substantially across households. The overall household-level average is 751 TZS, or 6.7% of expenditure. Financial losses among the above median group represent 8.9% of total expenditure, on average.

Figure 3 shows histograms and kernel density estimates for the distributions of $L_h$ (left panel) and $\tilde{L}_h$ (right panel) among multi-purchase household-item pairs. Items with flat price schedules are not dropped, so as not to bias the estimates toward large losses. There is substantial between-household variation in losses. Approximately 9% of households incur zero losses (with our conservative approach to estimation). Yet, nearly a quarter (24%) incur losses above 10% of expenditure.

To characterize the substantial between-household variation in financial losses, we estimate household-level descriptive regressions of $L_h$ and $\tilde{L}_h$ on a vector of household characteristics. Results are shown in Table 5. These regressions make use of a wealth index based on the first principal component of a vector of household assets (Filmer and Pritchett, 2001; Sahn and Stifel, 2003).15 Columns 1 and 2 report the estimates with $L_h$ as the dependent variable. In both columns we see that the poorest quartile of households (the excluded category) have lower losses than the other three quarters of households. The age, gender, and education level of the household head do not meaningfully co-move with losses. Larger households exhibit slightly greater losses, a result we discuss further in the subsection on coordination costs below.

Columns 3 and 4 of Table 5 report the results of the same specifications, with $\tilde{L}_h$ as the dependent variable.
Households in the first three wealth quartiles have similar mean percentage losses, but percentage losses are substantially lower for those in the wealthiest quartile. The estimated coefficients on head of household characteristics, in column 4, are too small in magnitude to be of importance. Distance from the market is statistically significantly associated with lower normalized losses, but the magnitude is inconsequential.

15 We use assets to characterize wealth because, unlike expenditure, it is not endogenous to consumer prices, an important distinction for a study of choice in the face of nonlinear prices.

Perhaps the most interesting results in Table 5 are those related to wealth. When using levels (columns 1-2), losses appear to be positively related to wealth. When using percentages (columns 3-4), losses are negatively related to wealth. This pattern suggests that there may be different types of loss-prone households: wealthy households that suffer large losses in levels but small losses as a percentage of total spending, and poor households that suffer small losses in levels but large losses as a percentage of total spending. In some of the analysis to follow we distinguish between households that are loss-prone in levels only, percentages only, both, and neither, where a household is “loss-prone” if it is in the highest quartile of losses. Section S4 in the Appendix contains more details on this categorization.

5 What explains the observed purchasing patterns?

We now turn to hypotheses about what might lead households to engage in financially inefficient purchasing by foregoing bulk discounts. We consider a number of possibilities: binding liquidity constraints, utility from frequent shopping, transport costs, storage costs, ignorance of bulk discounts, purchasing in small quantities as a form of self- or other-control, avoidance of social taxation, and coordination costs within the household.
These hypotheses arose from three sources: discussions with individuals and focus groups in Tanzania (described in Section 3), our own hypothesizing based on the literature or our knowledge of the context, and suggestions from colleagues or seminar participants.

Our analysis uses observational data, so it is not possible to say that a mechanism is relevant or irrelevant for all households. Rather, we assess which mechanisms do or do not seem to have a substantial impact on the losses that we observe. Our findings provide an important check on certain hypotheses, and should be especially useful as a guide to steer future research.

Because it is important for assessing some mechanisms, it is worth reiterating what it means to buy in bulk in our data. Bulk purchasing in developed countries typically connotes buying several weeks’ or months’ worth of an item at once. In our setting, bulk purchases represent only a few days to a couple weeks’ worth of consumption. One way to see this is to compare average total 2-week purchase quantities with $q^*_{\min}$, the minimum quantity required to pay the lowest unit price. Because price schedules are estimated at the district level, there are up to seven possible values of $q^*_{\min}$ (and the associated expenditure level, $e^*_{\min}$) for each item. Table 6 shows the mean, minimum, and maximum values of these two statistics, across the seven districts. In columns 3, 4, 6, and 7 we see substantial spatial variation in the minimum and maximum of $q^*_{\min}$ and $e^*_{\min}$; what it means to “buy in bulk” varies across districts. Column 1 shows the mean quantity purchased over 2 weeks. Among households that purchase each item, the mean quantity purchased substantially exceeds the average value of $q^*_{\min}$.

To estimate the number of days’ worth of consumption represented by $q^*_{\min}$, we divide $q^*_{\min}$ by the average total quantity purchased (at the district level), and multiply by 14.
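The conversion just described is a one-line calculation. A minimal sketch follows; the function name and the example quantities are hypothetical, and the 14-day window simply reflects the two-week diaries.

```python
def days_of_consumption(q_min_star, mean_two_week_quantity, window_days=14):
    """Days' worth of consumption represented by q*_min.

    q_min_star: minimum quantity needed to reach the lowest unit price.
    mean_two_week_quantity: district-level average total quantity purchased
    over the diary window (must be positive).
    """
    if mean_two_week_quantity <= 0:
        raise ValueError("mean_two_week_quantity must be positive")
    return q_min_star / mean_two_week_quantity * window_days


# Made-up example: q*_min of 1 liter against 2 liters bought over two weeks.
example_days = days_of_consumption(1.0, 2.0)
```

With these made-up inputs, buying at the lowest unit price means holding one week's worth of the item.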
Figure 4 shows the histogram of results for the item-district pairs that exhibit bulk discounts (if we include the other items the distribution shifts significantly left). The median is 8.51 days. In half of cases, “bulk purchasing” involves buying roughly a week’s worth of consumption or less. The 75th and 90th percentiles are at roughly 2 and 3 weeks’ worth of consumption. Hence, even though the data are from 2-week transaction diaries, for most items the relevant time frame for storing and consuming a bulk purchase is substantially shorter. Bulk purchasing in this setting involves avoidance of very small quantities, rather than stocking up on several weeks’ or months’ worth of an item.

Liquidity constraints

A natural hypothesis for the financial losses in our data is that people would like to take advantage of bulk discounts, but they lack the liquidity to do so. Prior work has argued that poor households may pay higher unit prices than wealthy households because binding liquidity constraints prevent the poor from buying in bulk. This is an intuitive idea. Credit constraints are well understood to be an important barrier to investment in low-income countries, so it is reasonable to wonder whether they are a binding constraint here.

For a liquidity constraint to drive small-quantity purchasing, a household must be unable to gather the cash needed to buy $q^*_{\min}$, the minimum quantity required to buy at the lowest unit price. Moreover, this constraint must apply both backwards and forwards in time. That is because if liquidity is the issue, the extra effort to save for $q^*_{\min}$ need only be made once. After that, the household could always consume more and/or spend less by eating down its stock, saving, and buying in bulk again, rather than purchasing smaller quantities at higher unit prices.

In our setting, the magnitudes in question are small enough that the liquidity constraint mechanism seems unlikely.
To show this, we ask the following: based on household h’s observed time path of expenditures on item i, for how many days would h have to delay purchasing i in order to save enough to buy it at the lowest available unit price? In other words, for how long would h need to forego consumption in order to overcome a liquidity constraint? Let $a_{hi} = \hat{e}_{hi}/14$ be the average daily expenditure on item i by household h, and recall that $e^*_{\min}$ is the minimum expenditure required to buy item i at the lowest focal unit price. The answer to this question can be calculated as $d_{hi} = e^*_{\min}/a_{hi}$. If we allow the household to cross-subsidize its savings by also delaying its purchase of a few non-necessities (tea leaves, salt, sugar, and cigarettes), we arrive at the measure $\tilde{d}_{hi} = e^*_{\min}/\tilde{a}_{hi}$, where $\tilde{a}_{hi} = \sum_{j \in D} \hat{e}_{hj}/14$, and D is a set including item i and the non-necessities. We calculate $\tilde{d}_{hi}$, the self-financed purchasing delay, for all household-item pairs consisting of at least one transaction.16 The magnitudes of the self-financed purchasing delays give us insight into the possible role of binding liquidity constraints.

Table 7 shows the item-level median values of $\tilde{d}_{hi}$ for all households (column 1), by wealth quartile (columns 2–5), and for the three groups of loss-prone households (columns 6–8). The median delays are remarkably short, even for the high-loss items at the top of the table. The overall median is 1.2 days. There is only slight variation across wealth subgroups: the median delay for the poorest quartile is 1.5 days, and for the wealthiest quartile it is 1.0 days. The “Percentage-only” loss-prone households have the longest delays, at a median of 2.9 days (column 6). This group must wait roughly a week to purchase some of the higher-loss items in bulk: kerosene, cooking bananas, and cooking oil. Yet, losses by these households on these items represent only 5.4% of total financial losses in the data. In other words, to find a household-item subgroup that would have to forego purchasing for a week or more in order to buy an item in bulk, we need to ignore 95% of the losses.

16 This follows the spirit of an exercise in Mullainathan and Shafir (2013). They investigate a related question for roadside vendors in Chennai, India (pp. 123-124). Those vendors lose roughly half of their daily earnings to interest payments on short-term loans, and yet still buy a daily cup of tea. The authors calculate that by foregoing tea for 50 days, the average vendor could save enough to permanently avoid short term borrowing, resulting in a doubling of take-home pay (and more tea consumption, in perpetuity).

This approach is conservative in at least four ways. First, we have assumed no cross-subsidization between goods, other than for the least essential items. A liquidity-constrained household seeking to buy in bulk could surely rearrange other purchases so as to buy one item in bulk on day 1, another item in bulk on day 2, and so on. This would substantially reduce $\tilde{d}_{hi}$. Second, we have assumed no access to additional sources of finance, treating the observed expenditure path as the only source of funds to be saved toward a bulk purchase. Access to any additional borrowing, windfall income, or other positive liquidity shock would reduce $\tilde{d}_{hi}$. Third, we have described a scenario in which households forego consumption of an item for consecutive days. However, a household could also reach the financially efficient purchasing path by foregoing consumption for a single meal or a single day at a time, smoothing the consumption sacrifice over weeks or months by purchasing quantities smaller than $q^*_{\min}$ but greater than those at the highest unit prices. Fourth, we have implicitly treated $\tilde{d}_{hi}$ calculated over the observation window as the best case scenario for the household. If at any point in its lifetime the household commands more financial resources than during the study window, $\tilde{d}_{hi}$ would fall. Hence, the results in Table 7 are upper bounds.

Transport costs

In wealthy countries, transport costs may constrain bulk purchasing (Griffith et al., 2009). A household cannot buy a carton of paper towels at a big-box store if it has no way to transport such a large item. The analogous concern in Tanzania is that shoppers who carry their purchases home from the market on foot, or by bicycle, may have difficulty transporting bulk purchases.

If transport costs impede bulk purchasing, then other things equal we expect households that live further from markets to incur larger losses. Table 5 reports the estimates from regressing bulking losses on distance from the community center and distance to market, as well as controls for wealth, household size, and various measures of human capital. There we see that bulking losses are slightly decreasing in both distance measures (columns 2 and 4). This is not evidence that transport never matters for any household. But it suggests that transport costs are not a key driver of small-quantity purchasing.

One reason that transport may not play an important role is that over the support of the data, the minimum quantities required to reach the lowest unit price are relatively small. Consider the staple items that are purchased by most households and that are responsible for the largest share of losses: kerosene, cooking oil, onions, and dried sardines. In Table 6 we see that the maximum value (across the seven districts) of $q^*_{\min}$ is 1 liter for kerosene, 1 liter for cooking oil, 0.4 kg for sardines, and 1.4 kg for onions (column 4). For households in the other 6 districts, the minimum quantities are even smaller. The large majority of bulk purchases are of a size that can be carried easily in a single trip.
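Returning briefly to the self-financed purchasing delay from the liquidity-constraints subsection, the two measures $d_{hi}$ and $\tilde{d}_{hi}$ can be sketched as below. This is an illustration under stated assumptions: the function names, the dictionary layout of two-week spending by item, and all TZS amounts are hypothetical and do not come from the paper's data.

```python
NON_NECESSITIES = ("tea leaves", "salt", "sugar", "cigarettes")


def purchasing_delay(e_min_star, spend_on_item, window_days=14):
    """d_hi: days of forgone spending on one item needed to afford e*_min."""
    daily_spend = spend_on_item / window_days  # a_hi
    return e_min_star / daily_spend


def cross_subsidized_delay(e_min_star, spend_by_item, item, window_days=14):
    """d~_hi: as above, but also diverting spending on the non-necessities.

    spend_by_item: dict mapping item name to total two-week adjusted
    expenditure (hypothetical layout chosen for this sketch).
    """
    pool = {item} | set(NON_NECESSITIES)  # the set D
    daily_pool = sum(spend_by_item.get(j, 0) for j in pool) / window_days  # a~_hi
    return e_min_star / daily_pool


# Made-up two-week spending (TZS) for one household.
spend = {"kerosene": 1400, "sugar": 700, "salt": 700, "maize": 5000}
d = purchasing_delay(400, spend["kerosene"])
d_tilde = cross_subsidized_delay(400, spend, "kerosene")
```

In this made-up case the household would need to delay kerosene purchases for four days on its own, or two days if it also postpones sugar and salt; spending on maize is left untouched because it is not in the set of non-necessities.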
To show this even more clearly, we calculate at the household level the total kilogram and liter quantities of $q^*_{\min}$ on all items for which the household incurs a loss. This is the minimum aggregate shopping bundle associated with buying all loss-inducing items at their lowest unit price. The average size of this bundle, totaled across all items, is 4.3 kg and 0.8 liters.17 These quantities are not large relative to total purchases. The average sample household purchases a total of 31.4 kg and 1.6 liters over the two-week study period. Furthermore, households shop frequently (the mean household reports at least one purchase on 11 out of 14 days; median = 12), so the purchase of this bundle could be spread across multiple days if desired.

17 We exclude cigarettes and matches for this calculation, because they were measured in alternative units. These items are unlikely to impede transport.

If there is an exception to the arguments presented here, it could be for cooking bananas, in the districts where $q^*_{\min}$ is over 20 kilograms. For these purchases, which represent only a small fraction of losses in the data, transport costs may be important. If we exclude cooking bananas from the calculation in the previous paragraph, the average size of the aggregate $q^*_{\min}$ bundle falls by roughly 30%, to 3.0 kg.

Storage costs and concerns about depreciation

Perhaps households do not buy in bulk because it is too costly to store bulk quantities. In this subsection, we examine two components of storage costs: space constraints, and concerns about depreciation. Other storage-related concerns, due to self-control problems or social taxation, are examined below.

As we argued with transport costs, the small quantities represented by $q^*_{\min}$ make it unlikely that space constraints are the main driver of bulking losses. Table 6 shows that “bulk” purchases in our data are relatively small.
The technologies to store small quantities of goods securely for a short period (plastic bins with sealed lids, used jerry cans with screw tops, used plastic water bottles with screw tops) are widely available and inexpensive. If bulk purchasing required a household to store its purchases in a 20 liter drum, or a 100 kg bag, then space constraints could be a more plausible concern.

The above argument also makes it unlikely that concerns about depreciation during storage prevent households from bulk purchasing. Figure 4 plots the distribution of the average number of days’ worth of consumption represented by $q^*_{\min}$, which is also an indication of the average required storage duration for a bulk purchase. Half of the storage periods are a week or less; 75% are less than two weeks; 95% are less than a month. If depreciation is the concern, depreciation rates for stored items would need to be exceedingly high.

Yet, there is little reason to believe that they are. Some of the study items (kerosene, cooking oil) are essentially free from depreciation related to moisture or pest exposure, which are the primary concerns. For the other items, we look to the existing evidence on post-consumer storage losses. A 2011 FAO report indicates that post-consumer losses in sub-Saharan Africa for a wide range of commodities (grains, roots, tubers, pulses, fruit, vegetables, meat, and seafood) are the lowest in the world, ranging from 0–2% (Gustavsson et al., 2011). Other work looking at the depreciation rate of crops stored by farmers post-harvest finds depreciation rates of 1-5% over periods of 6 months or longer in Ghana, Malawi, Tanzania, and Uganda (Kaminski and Christiaensen, 2014; University of Ghana, 2008, as cited in Zorya et al., 2011).

If there is an exception to this argument it is likely to be for bananas. For a small number of banana-purchasing households, $q^*_{\min}$ is over 20 kg for cooking bananas, and over 8 kg for sweet bananas.
We cannot rule out that these households avoid bulk purchasing because of concerns about depreciation. Yet, this would explain only a tiny fraction of financial losses from not buying in bulk.

Utility from shopping

Perhaps people make frequent, small-quantity purchases because there is a utility value from shopping, e.g., from the socializing and community engagement that one enjoys in the market. An obvious problem with this hypothesis is that people can visit shops and markets without making purchases, and do so frequently. That said, if one believes in a very specific social contract that relies on frequent purchasing in small quantities, that could generate the observed patterns in the SHWALITA data.

It is straightforward to show that the data contradict this hypothesis. Many households make purchases both above and below $q^*_{\min}$. By rearranging purchases to always purchase exactly $q^*_{\min}$, households could increase the total number of purchases while always buying in bulk.

To illustrate, define $K^*_{hi} = \hat{e}_{hi}/e^*_{\min}$. This is the maximum number of transactions that household h could make on item i at the lowest available unit price, without changing total expenditure on the item. The actual number of transactions that household h makes on item i, $K_{hi}$, could be smaller or larger than $K^*_{hi}$. The value $K^*_h = \sum_i K^*_{hi}$ tells us how many purchases the household could make if its goal were to maximize total purchases (to shop as much as possible) while only buying at the lowest available unit prices.

Table 8 shows the mean values of $K_{hi}$, $K^*_{hi}$, and the difference $K^*_{hi} - K_{hi}$, as well as the same measures aggregated across items. The results in column 3 indicate that the average household in all four groups could shop at least as much as it does at present, while only buying in bulk. The average household could make 20.8 more purchases than at present. Households that are loss-prone in levels (only) could make 30.9 more transactions on average.
The high percentage-loss group, who are relatively poor and tend to shop less, could make 0.1 more purchases. Finally, households that are in the 25% worst group in both levels and percentages could make 5.2 more purchases on average.

These counterfactual shopping patterns are unlikely to be optimal for a variety of reasons. However, this analysis demonstrates that the desire to shop frequently cannot explain the failure to take advantage of bulk discounts, because households could already do more of both.

Ignorance of bulk discounts

Is it possible that many people in Tanzania simply do not know of the available bulk discounts? We are doubtful. When we conducted informal interviews with individuals in the study area, everyone was well aware of bulk discounts for a wide range of consumption items. Furthermore, many households personally experience non-linear prices. Column 5 of Table 3 shows the results of item-level regressions of unit price on quantity, with household fixed effects. Even within household, bulk discounts are present. The members of households that purchase items multiple times, exactly those that are foregoing potential consumption by not buying in bulk, are surely aware of the available discounts.

Rationing consumption

Could it be that people avoid buying in bulk as a way to limit their consumption? A sophisticated but present-biased agent would forego bulk purchasing in order to prevent her future self from over-consuming (Laibson, 1997; O’Donoghue and Rabin, 1999). Relatedly, shoppers may not trust other household members to control their consumption, and so may limit stocks as a form of rationing. In focus-group discussions we heard numerous variations on this theme.18

If rationing is present, it is most likely to occur on items that are “temptation goods”, that is, goods which, if held in stock, are likely to be consumed too quickly relative to one’s ex ante plan.
Because temptation goods are culturally specific, we conducted a survey in Tanzania to measure the temptingness of the study items. We invited Tanzanian staff members from recent survey projects to rate the study items on a five-point scale (1 = not at all tempting; 5 = tempting for essentially everyone who consumes the item). These respondents implement consumption surveys and speak regularly with households about their economic choices; in essence, they constitute a panel of experts. We asked respondents to answer for a typical household in a typical village, not to self-assess their own temptations. The survey was conducted online in June-July 2016. Across the 43 responses that we received, we assign each item its average score on the 5-point scale, and classify the top third (6 items) as the temptation goods in the study. These are: sugar, rice, cooking oil, soap, cigarettes, and sweet bananas.19

18 The story of one respondent in Bukoba can be paraphrased as follows: “We know that we need 1 kilogram of maize flour each evening. But if we buy a 50 kilogram bag, we may find that it is gone at the end of one month, because we use too much each day. So it is better to buy smaller amounts.” Despite this anecdote, in the forthcoming analysis maize was not deemed to be a “temptation good” in our survey (see next paragraph). This underscores the subjectivity of temptation and the importance of grounding any analysis of temptation in data sourced from more than one person.

For this analysis, household fixed effects are identified, so we can examine a specific behavior associated with not buying in bulk, rather than base tests on variation in losses. The primary outcome of interest is $K_{hi}$, the number of purchases of item i by household h. We first examine whether households make fewer purchases when bulk discounts are present, a response that indicates at least some attending to bulk discounts.
We then test the hypothesis that households ration their stocks of temptation goods by purchasing them more frequently than non-temptation goods. Finally, we estimate the joint effects of temptingness and bulk discounts on $K_{hi}$, and test for differential responses between high- and low-loss households.

Identification in these regressions is from three sources: household fixed effects control for average purchasing behavior; the temptation survey provides an objective, exogenous measure of the temptingness of a good; and the fact that households face some flat and some non-flat unit price schedules in their local markets provides exogenous within-household variation in the presence of bulk discounts.20 We include item fixed effects, when identified.

19 The full ranking and average scores from the temptation survey are shown in Appendix S5. There is little correlation between a good having bulk discounts and being a temptation good. The correlation between the temptation dummy and the indicator for bulk discounts is −0.07, and it is not statistically different from zero. The relationship is unchanged if we control for district fixed effects.

20 For the average study household, 56% of observed purchases are of items with bulk discounts, and 44% are of items with flat price schedules.

Regression results are in Table 9. Column 1 shows that after controlling for both household and item fixed effects, the average effect of a bulk discount is to induce fewer purchases. The average household does react to bulk discounts, to some degree. Column 2 shows that this result does not hold without the item fixed effects. Households also react to the temptingness of the good, but in the opposite direction (column 3). Households make 0.65 more purchases of temptation goods than of non-temptation goods. This is indicative of consumption rationing for tempting items. When variables for bulk discounts and temptation
are both included (column 4), the desire to ration a temptation good is actually stronger for goods that exhibit bulk discounts. This is not what we would expect to find if shopping choices were purely reflective of the tension between buying in bulk and rationing. Nonetheless, the key takeaway is that on average, the desire to ration temptation goods outweighs the financial incentive to buy in bulk.

In column 5 we repeat the analysis of column 4, allowing for heterogeneity by whether or not the household is loss-prone. We consider a household loss-prone if it is in the highest quartile of losses measured in levels and/or percentages. The first three coefficients in column 5 of Table 9 sum to 0.16, with a p-value = 0.16. This indicates that for non-loss-prone households, the net effect of bulk discounts and temptation is effectively zero. In contrast, for loss-prone households the net effect is a significant increase in the number of purchases. The marginal effect is 1.42 more purchases (p-value = 0.000). This heterogeneity analysis is a characterization, not a causal model. We cannot be certain that the difference between loss-prone and non-loss-prone households reflects differences in temptation or the desire to ration. But the finding is consistent with such an interpretation.

On balance, we take the finding that all types of households tend to purchase tempting goods more frequently than other goods, and that this response overwhelms any tendency to make fewer purchases when an item has bulk discounts, as evidence that consumption rationing is an important factor in inducing households to forego bulk discounts and purchase in small quantities. The heterogeneity analysis by loss-prone type is further evidence that rationing is partly responsible for the financial losses from small quantity purchasing.
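The classification step behind these regressions (average each item's score across survey respondents, then label the top third of items as temptation goods) can be illustrated with a minimal sketch. The function name, the six items, and the scores below are made-up assumptions for illustration, not the survey data; the alphabetical tie-break is also an assumption, since the text does not say how ties were handled.

```python
from statistics import mean


def classify_temptation_goods(scores_by_item):
    """Return (average scores, items in the top third by average score).

    scores_by_item maps an item name to the list of 1-5 scores it received.
    The top third is computed with integer division, so 19 items give 6.
    """
    avg = {item: mean(scores) for item, scores in scores_by_item.items()}
    n_top = len(avg) // 3
    # Sort by descending average score; break ties alphabetically (assumption).
    ranked = sorted(avg, key=lambda item: (-avg[item], item))
    return avg, ranked[:n_top]


# Hypothetical responses from three respondents for six items.
example_scores = {
    "sugar": [5, 4, 5],
    "rice": [4, 4, 3],
    "maize": [2, 3, 2],
    "kerosene": [2, 2, 2],
    "salt": [1, 2, 1],
    "matches": [1, 1, 2],
}

averages, temptation_goods = classify_temptation_goods(example_scores)
print(temptation_goods)
```

With the 19 study items, `len(avg) // 3` evaluates to 6, which matches the six temptation goods listed in the text.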
Social taxation

In much of sub-Saharan Africa, requests by non-household members for gifts, shared meals, or loans (which we refer to as “social taxes”) are commonplace (Platteau, 2014). Recent experimental work has shown that participants’ willingness to share windfall gains with others is related in part to the visibility of those gains, suggesting social pressure in favor of redistribution (Goldberg, 2016; Jakiela and Ozier, 2016). Baland, Guirkinger and Mali (2011) find that 1 in 5 savings-group participants in Cameroon are willing to pay to signal poverty to their social network, to deter requests. De Weerdt, Genicot and Mesnard (2015) show that a transfer recipient’s perception of a donor’s wealth affects the value of the transfer between them, conditional on the donor’s actual wealth. Social pressure is clearly a factor in determining patterns of redistribution, and buying small quantities may be a useful way to deter requests from one’s social network.

A particular feature of the data allows us to estimate a proxy for household-specific social tax rates and test whether buying in bulk is associated with higher social taxes. In addition to recording purchases, diary keepers also recorded the item description, quantity, unit, and estimated value of items sold or given away.21 These disposal events are recorded on the household’s daily transaction log, in the same manner as the purchases. To estimate a proxy for the household-level social tax rate, we sum the total value of items outgoing over the two-week study period, and divide by the total value of items incoming. Because we are calculating a proxy for a household-level variable, we use every row of the transaction diaries for this calculation, not just the rows associated with the 19 items that are the focus of the rest of this paper. To our knowledge, this is the first accounting measure of social taxes (one based on diary records from a large sample of households) in the literature.
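The proxy described above is a simple ratio, and a minimal sketch may make the bookkeeping concrete. The row layout `(household_id, direction, value)`, the function name, and the example values are hypothetical and do not reflect the structure of the actual diary files. The optional `cap` argument mirrors the winsorizing robustness check reported in the paper; skipping households with no recorded incoming value is an assumption made only so the ratio is defined.

```python
from collections import defaultdict


def social_tax_rates(rows, cap=None):
    """Compute outgoing value / incoming value for each household.

    rows: iterable of (household_id, direction, value), where direction is
    "in" for purchases and other incoming items and "out" for items sold or
    given away. If cap is given, rates above it are replaced by cap.
    """
    incoming = defaultdict(float)
    outgoing = defaultdict(float)
    for household, direction, value in rows:
        if direction == "in":
            incoming[household] += value
        elif direction == "out":
            outgoing[household] += value
        else:
            raise ValueError(f"unknown direction: {direction!r}")

    rates = {}
    for household, total_in in incoming.items():
        if total_in <= 0:
            continue  # assumption: the ratio is undefined without incoming value
        rate = outgoing.get(household, 0.0) / total_in
        rates[household] = min(rate, cap) if cap is not None else rate
    return rates


# Hypothetical two-week diary rows, values in TZS.
diary_rows = [
    ("A", "in", 6000.0), ("A", "in", 4000.0), ("A", "out", 1500.0),
    ("B", "in", 8000.0), ("B", "out", 6400.0),
    ("C", "out", 500.0),  # no incoming rows, so C gets no rate
]

rates = social_tax_rates(diary_rows)
capped = social_tax_rates(diary_rows, cap=0.6)
print(rates, capped)
```

Here household A has an implied rate of 15%, and household B's uncapped rate of 80% is replaced by 60% when the cap is applied.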
Table 10 shows descriptive statistics for the social tax estimates. The uppermost panel shows the value of resources outgoing in the form of sales or gifts, divided into sub-categories. The most important group is “meals and snacks,” which accounts for 40% of outgoing resource flows. The largest share of entries under “meals and snacks” are those for “full meals,” meaning guests at the household table (see Appendix section S6 for details about the item categorization). The lower part of the table shows the value of incoming resources. The mean tax rate is 14.4% (s.d. = 26.5%).22 It is clear both that redistribution is widespread, and that there is significant heterogeneity between households in the implied social tax rate.

21 It is not possible to determine whether an item was sold or given away. This could be problematic if many households re-sell consumer items. That does not appear to be the case. Other households in the study villages (not the households that filled out consumption diaries) were randomly selected to participate in a survey experiment dealing with measurement of labor supply. These households were able to indicate their sectors of work, with varying degrees of detail. The share of respondents that indicated “buying and selling” as their primary sector of work ranged from 5–8% across survey modules. Hence, even if every purchase made by such households was for the purpose of re-selling, which is surely not the case, for the vast majority of disposal events it is still reasonable to assume that items “sold or given away” are in fact “given away.”

22 A small number of estimated tax rates are over 100%. All results are robust to winsorizing by replacing rates above 60% with 60%.

How does the existence of social taxes affect a household’s incentives to purchase in bulk? Let $\tau(q) \in [0, 1)$ indicate the tax rate levied on transaction quantity $q$, and define $s(q) = 1 - \tau(q)$.
A household purchasing quantity $q$ is able to retain quantity $s(q)q$, while $\tau(q)q$ flows out of the household. The intuition from the prior literature suggests that $\tau(q)$ is increasing in $q$, because larger quantities are more visible to non-household members, and thus lead to more requests to share. If social taxes take this form, then they would indeed push households to take less advantage of bulk discounts.

Identifying the shape of $\tau(q)$ is complicated by the fact that the transaction quantities we observe are in part driven by reactions to the shape of $\tau(q)$. However, those transaction quantities are also driven in part by the existence of bulk discounts, which creates an incentive to purchase larger quantities. The effective unit price in the face of social taxes (the price paid per unit consumed by household members) depends on both the price schedule and the social tax schedule. How a household adjusts transaction quantities in the face of these countervailing forces depends on the relative magnitudes of these forces, budget constraints, and preferences.23 As long as there is sufficient variation so that it is optimal for some households to make purchases that expose them to variable social taxes, $\tau'(q) > 0$ will generate a positive correlation between transaction quantities and social tax rates in our data.

Suppose instead that the social tax rate is independent of transaction quantity, so $\tau(q) = \bar{\tau}$ for some $\bar{\tau} \in [0, 1)$. This tax has the effect of shifting up the price schedule by factor $\frac{1}{1-\bar{\tau}}$. There is no clear reason to expect a direct effect of these taxes on the incentive to buy in bulk.24 Whether one buys quantity $q$ every day, or quantity $5q$ every five days, the total social taxes paid is the same. However, between-household variation in $\bar{\tau}$ may indirectly affect the tendency to purchase in bulk, if $\bar{\tau}$ is correlated with other factors that determine transaction quantities. If this is what drives an empirical relationship between social tax rates and transaction quantities, we expect that relationship to be attenuated once
If this is what drives an empirical relationship between social tax rates and transaction quantities, we expect that relationship to be attenuated once 23 Formally: at transaction quantity q , the effective unit price is decreasing if the elasticity ηps = p (q ) s (q ) p(q ) / s(q ) is greater than 1, and increasing otherwise. To see this, note that the effective unit price on retained quantity ˆ, p q q ), is calculated as expenditure divided by quantity retained, i.e., p ˆ(ˆ ˆ(ˆq) = pˆ(s(q )q ) = p (q )q p(q ) s(q )q = s(q ) . On the margin, to increase qˆ the household must increase q . Hence p q) = p ˆ (ˆ p s − s2 s , which is zero when ηps = 1. 24 Because the tax acts like a price increase, it is tempting to assume that demand for taxed items is ¯. This would predict a negative correlation between tax rates and transaction quantities, the decreasing in τ opposite of the variable tax case. However, the conditions under which the law of demand applies to items with bulk discounts are not understood, in part because it is not possible to derive demand functions in the face of nonlinear prices (Beatty, 2010). 29 we control for observable factors that are likely to account for some of the heterogeneity in tax rates, such as wealth, age, and household size. A practical challenge to identifying the shape of τ (q ) is that we cannot match most flows of outgoing resources to specific transactions. Hence, instead of estimating τ (q ), we regress the estimated household-level social tax rates over two weeks on a measure of house- ˜h , hold tendencies to buy in bulk. Specifically, we regress the estimated social tax rate on q the average of the within-district percentile of transaction quantities on the items purchased by household h. Households that tend to make larger purchases than others in their district ˜h . If social tax rates are responsive to bulk purchasing, we should will have higher values of q ˜h and the household’s social tax rate. 
For robustness we find a positive correlation between q w ˜h also regress social tax rates on (i) the alternative q , which is the weighted average of the percentiles, where the weights are the shares of household expenditure on each item; and ˜ h , which proxies for not buying in bulk.25 (ii) the two loss measures, Lh and L ˜h is statistically Column 1 of Table 11 shows the baseline estimate. The coefficient on q significant and in the hypothesized direction: households that make larger purchases are also those that face higher rates of social taxes. The magnitude is economically significant: a ˜h (0.15 for the estimation sample) is associated with one standard deviation increase in q a tax rate change of 1.4 percentage points, or 9.7% of the mean. In column 2 we add controls for covariates that are likely to vary with exposure to social taxes, if there are indeed fixed differences in tax rates. Rather than being attenuated, the relationship between transaction quantities and social taxes becomes stronger. Surprisingly, social tax rates do not meaningfully co-move with wealth or human capital of the household. The only other statistically significant coefficient is the distance from market. There is of course still the possibility of correlated heterogeneity that is completely unobserved in our data. While we cannot fully rule out such unobserved heterogeneity, we do address one natural possibility. By chance, some households may have hosted a wedding, 25 To construct q˜h we first sort the observed transaction quantities at the item-district level, then assign the percentile value n/N to the nth largest quantity (1 is the smallest), where N is the number of item-district observations. When there are ties at the item-district-quantity level, we assign the average value of n/N to the tied observations. We then calculate the average percentile at the household-item level, for the items w purchased by the household. 
Finally, q ˜h is the average across items within household. For q ˜h , the final average is weighted by the share of household expenditure on each item. The mean (s.d.) of q ˜h is 0.52 (0.15); w the mean (s.d.) of q˜h is 0.57 (0.17). They are highly correlated (ρ = 0.89). 30 funeral, dinner party, or similar event during the study period. If these households bought items in bulk, in preparation, this could generate a correlation between outgoing resources and purchase quantities that is purely idiosyncratic.26 We address this possibility in two ways. First, we look for evidence of large events. Patterns in the data suggest that such events are rare. We find that 94% of households never host seven or more guests on a single day, and a further 4% host seven or more people on just one day.27 Second, we repeat the analysis in columns 1–2 of Table 11 after first recalculating social tax rates without including “meals and snacks” in the value of outgoing resources.28 Columns 3–4 of Table 11 show the results using this alternative measure. The findings are effectively unchanged. If anything, the relationship between social taxes and transaction quantities is stronger. In Appendix S6 we subject these results to a variety of robustness checks. These w ˜h include: using the alternative average percentile measure q as the key independent variable; ˜ h as the key independent variable; and using Lh as the key independent variable; using L winsorizing tax rates at 60%. In all cases we repeat the analysis in all 4 columns of Table 11, and the findings are qualitatively and quantitatively similar to those reported here. From these findings, we conclude that there is clear evidence that households face social taxes that are increasing in transaction quantities. The remaining question is whether households react to these social taxes by taking less advantage of bulk discounts. In theory, they ought to. 
The prior research cited above suggests that people react this way in other contexts. Moreover, in our focus groups, we regularly heard, without prompting, anecdotes suggesting a connection between small-quantity purchasing and social taxation. As one interviewee put it: “If I buy 5 kilograms of sugar, everyone will take their tea at my house.” Unfortunately, our cross-sectional data do not permit a direct test of how households react to variable social tax rates in this setting.

26 Note that one might naturally expect the propensity to host such events would depend on household characteristics such as wealth, age, and household size. Hence, the fact that controlling for these variables does not alter the basic relationship between social tax rates and measures of buying in bulk casts some doubt on this hypothesis.

27 To make these estimates, we calculate that the average value of a full meal for an outside guest is 630 TZS, based on the value of outside meals provided on days with only one guest. We then divide the total value of meals and snacks given away to outside guests by 630, to estimate the number of guests at the household-day level.

28 The implied social tax rate without “meals and snacks” is 9.2% (Table 10). The correlation between the two tax rates is just below 0.8.

Coordination costs within the household

Finally, we examine the possibility that the purchase of financially inefficient small quantities could be driven by the costs associated with coordinating purchases between household members. In Table 5 we see that losses are increasing in household size, hinting at the possibility of coordination difficulties. Coordination failures would manifest as multiple household members buying the same item in small quantities, rather than aggregating into a single purchase so as to pay a lower unit price.
Studying this phenomenon requires data on individual- rather than household-level purchases, so we restrict this analysis to the 500 households that were randomly assigned to complete personal rather than household diaries. The modal household in this group has 2 personal diary keepers (54% of observations); the mean number of diary keepers is 2.1, and the maximum is 7.

To proxy for uncoordinated shopping, we calculate $d_{mult}$, the number of days on which multiple household members purchase the same item. We then regress $L_h$ or $\tilde{L}_h$ on $d_{mult}$ and an extensive set of household control variables. We also control for the total number of item-days on which any purchase is made. This focuses on the behavior of interest: controlling for the household’s tendency to make purchases at the item-day level, do we see that losses are increasing in the number of times that multiple household members buy the same item on the same day? The hypothesized link between coordination costs and bulking losses would show up as a positive and statistically significant coefficient on $d_{mult}$.

Results with $L_h$ as the dependent variable are in columns 1 and 2 of Table 12. For both samples, the estimated coefficients on $d_{mult}$ (“Number of item-days with 2+ purchasers”) are negative and statistically significant. This runs directly contrary to the coordination-cost hypothesis. Not surprisingly, the coefficient on the count of item-days with a purchase is positive and statistically significant. In columns 3 and 4 we repeat the analysis using percentage losses ($\tilde{L}_h$) as the dependent variable. The signs and pattern of statistical significance are the same as for $L_h$. These findings do not support the hypothesis that coordination costs are responsible for the forgone bulk discounts in the SHWALITA data.

6 Interactions between rationing and social taxation

In the previous sections we separately investigate potential explanations for why households would forego bulk discounts.
We find the strongest evidence for two hypotheses: rationing and social taxation. These mechanisms may be naturally confounded if social taxes are more likely to be levied on temptation goods. Also, the observational nature of the data introduces the concern that we could be detecting a single phenomenon through multiple channels. In this subsection we implement joint tests of these two mechanisms to shed light on whether they are operating independently.

The regressions in the Section 5 subsections on rationing/temptation and social taxes use different dependent variables and are conducted at different units of analysis. Some compromises must be made in order to design joint tests. We approach the problem from two directions. First, we test for heterogeneity by social tax rate in the effect of temptation on the number of purchases at the household-item level (as in Table 9). These regressions include household fixed effects, so the direct effect of the social tax rate on $K_{hi}$ is not identified. Second, we calculate the sum of $K_{hi}$, the dependent variable in the rationing analysis, across the temptation goods, and include it along with the social tax rate and other controls in household-level regressions with $L_h$ and $\tilde{L}_h$ as the dependent variables. For this test we are regressing losses on the tax rate, instead of the other way around (as we do for the robustness checks in Appendix S6, where we use $L_h$ and $\tilde{L}_h$ as proxies for not buying in bulk). Hence, inference is based on the implied correlation rather than a causal link. The goal of this full set of regressions is to examine whether evidence consistent with each of the two mechanisms is robust to the inclusion of variables that account for the other.

Table 13 shows the results. In the upper panel we replicate the key specifications from Table 9, adding interactions between bulk discounts, temptation, and the household social tax rate.
There is no evidence of heterogeneous effects by social tax rate: the interaction coefficients are far from statistically significant and trivial in magnitude. In contrast, the main effects of bulk discounts and temptation are largely unchanged from Table 9.

In the bottom panel of Table 13 we see that each mechanism has an independent association with losses, in the expected direction (negative for social taxes, positive for temptation). Yet, the interaction between the social tax rate and the number of temptation goods purchased is small in magnitude and far from statistically significant.

The analysis in this subsection gives us little reason to suspect that household responses to temptation and to social taxes are manifestations of a single force. It appears that the tendency to forego bulk discounts is driven by multiple channels.

7 Discussion

In this paper we have shown that some households in Tanzania incur substantial financial losses by purchasing staple items in very small quantities, frequently, rather than buying modestly larger quantities at lower unit prices. We test numerous hypotheses for why households would engage in this behavior, including the possibility of binding liquidity constraints. The evidence is consistent with households limiting stocks in order to ration their consumption of temptation goods, and to avoid social taxes. Other tested mechanisms seem not to have a major impact on the losses we observe.

The nature of the data prevents us from testing additional hypotheses that emerged in the literature or in our qualitative work. Two of these bear mention. The first is that, at the time of purchase, shoppers may simply not be attentive to the high cost of buying in small quantities. If so, then making the aggregate implications of bulk discounts more salient at the time of purchase could be a rewarding intervention for many households.
The second proposed hypothesis is that husbands in Tanzania ration the spending of their wives by giving them daily allowances to purchase necessities (such as the components of the family meal), which prevents these women from buying in bulk. This mechanism alone would not be sufficient to generate losses, because the wife could potentially save some cash and delay purchasing certain items in order to buy goods in bulk. However, if such behavior would be perceived as a violation of the social contract between spouses, the personal cost to the woman could be too high to justify saving up in order to bulk purchase. Tests of these potential mechanisms, in Tanzania or elsewhere, are left to future work.

The two mechanisms highlighted by our analysis, social taxes and rationing, share a common feature. Both involve limiting the consumption of a future agent (one’s self, other household members, or non-household members) by avoiding stocks. This raises the question: if stocking exposes households to leakage through social taxes or over-consumption, but some goods exhibit bulk discounts, what is the optimal purchasing pattern? There is a tension between the financial savings from buying in bulk and the degree of over-consumption. The optimal shopping pattern depends on household-specific preferences, shadow prices, and the specific manifestation of the pressures to over-consume within a given household.

Given this, we cannot be certain that the observed purchasing patterns are sub-optimal for all households. It seems highly likely that some households could increase utility by buying in bulk. However, our calculation that the average household could increase consumption by almost 9% through bulk purchasing does not take into account the likelihood of higher social taxes or over-consumption. Revealed preference evidence, outside of our scope, is needed to conclude that utility would be higher overall with more buying in bulk.
That point notwithstanding, there may be some relatively benign but effective interventions suggested by our findings. For example, programs to coordinate bulk purchasing across households could be beneficial. Groups of households should be able to buy in bulk, pay lower unit prices, and still avoid stocking. Before leaping to advocate for such interventions, we would argue for additional research to understand why households do not already coordinate purchases in this manner, what effect changes in shopper behavior would have on sellers (many of whom are poor themselves), and whether the findings from Tanzania are broadly representative of those elsewhere.

One of the puzzles raised by our analysis is why bulk discounts persist in equilibrium despite apparently robust competition in retail markets. Do retailers impose high mark-ups on the smallest quantities, in order to take advantage of customer demand for rationing? The link between the demand-side factors examined in this paper and the related set of supply-side factors is a key area for future work.

References

Alby, Philippe, Emmanuelle Auriol, and Pierre Nguimkeu. 2013. “Social barriers to entrepreneurship in Africa: The forced mutual help hypothesis.” Working paper.

Anderson, Siwan, and Jean-Marie Baland. 2002. “The Economics of Roscas and Intrahousehold Resource Allocation.” The Quarterly Journal of Economics, 117(3): 963–995.

Attanasio, Orazio, and Christine Frayne. 2006. “Do the poor pay more?” Working paper.

Attanasio, Orazio, and Elena Pastorino. 2015. “Nonlinear Pricing in Village Economies.” NBER working paper 21718.

Baland, Jean-Marie, Catherine Guirkinger, and Charlotte Mali. 2011. “Pretending to be poor: Borrowing to escape forced solidarity in Cameroon.” Economic Development and Cultural Change, 60(1): 1–16.

Banerjee, Abhijit, and Sendhil Mullainathan. 2010. “The shape of temptation: Implications for the economic lives of the poor.” National Bureau of Economic Research.
Beatty, Timothy KM. 2010. “Do the poor pay more for food? Evidence from the United Kingdom.” American Journal of Agricultural Economics, 92(3).

Beegle, Kathleen, Joachim De Weerdt, Jed Friedman, and John Gibson. 2012. “Methods of household consumption measurement through surveys: Experimental results from Tanzania.” Journal of Development Economics, 98(1): 3–18.

Bray, Jeremy W, Brett R Loomis, and Mark Engelen. 2009. “You save money when you buy in bulk: Does volume-based pricing cause people to buy more beer?” Health Economics, 18(5): 607–618.

Chung, Chanjin, and Samuel L Myers. 1999. “Do the poor pay more for food? An analysis of grocery store availability and food price disparities.” Journal of Consumer Affairs, 33(2): 276–296.

Collins, Daryl, Jonathan Morduch, Stuart Rutherford, and Orlanda Ruthven. 2010. Portfolios of the poor: How the world’s poor live on $2 a day. Princeton University Press.

De Weerdt, Joachim, Garance Genicot, and Alice Mesnard. 2015. “Asymmetry of information within family networks.” National Bureau of Economic Research working paper 21685.

Filmer, Deon, and Lant H Pritchett. 2001. “Estimating wealth effects without expenditure Data—Or tears: An application to educational enrollments in states of India.” Demography, 38(1): 115–132.

Frank, Ronald E, Susan P Douglas, and Rolando E Polli. 1967. “Household correlates of package-size proneness for grocery products.” Journal of Marketing Research, 381–384.

Gibson, John, and Bonggeun Kim. 2018. “Economies of scale, bulk discounts, and liquidity constraints: comparing unit value and transaction level evidence in a poor country.” Review of Economics of the Household, 16(1): 21–39.

Goldberg, Jessica. 2016. “The effect of social pressure on expenditures in Malawi.” Working paper.

Griffith, Rachel, Ephraim Leibtag, Andrew Leicester, and Aviv Nevo. 2009. “Consumer shopping behavior: How much do consumers save?” The Journal of Economic Perspectives, 23(2): 99–120.
Gustavsson, Jenny, Christel Cederberg, Ulf Sonesson, Robert Van Otterdijk, and Alexandre Meybeck. 2011. “Global food losses and food waste.” Food and Agriculture Organization of the United Nations, Rome.

Jakiela, Pamela, and Owen Ozier. 2016. “Does Africa need a rotten kin theorem? Experimental evidence from village economies.” Review of Economic Studies, 83(1): 231–268.

Kaminski, Jonathan, and Luc Christiaensen. 2014. Global Food Security, 3(3): 149–158.

Kunreuther, Howard. 1973. “Why the poor may pay more for food: theoretical and empirical evidence.” The Journal of Business, 46(3): 368–383.

Laibson, David. 1997. “Golden eggs and hyperbolic discounting.” The Quarterly Journal of Economics, 112(2): 443–477.

Mullainathan, Sendhil, and Eldar Shafir. 2013. Scarcity: Why Having Too Little Means so Much. Macmillan.

Mussa, Richard. 2015. “Do the Poor Pay More for Maize in Malawi?” Journal of International Development, 27(4): 546–563.

O’Donoghue, Ted, and Matthew Rabin. 1999. “Doing it now or later.” American Economic Review, 89(1): 103–124.

Orhun, A Yesim, and Mike Palazzolo. 2016. “Frugality is hard to afford.” Available at SSRN.

Platteau, Jean-Philippe. 2006. “Solidarity norms and institutions in village societies: Static and dynamic considerations.” Handbook of the Economics of Giving, Altruism and Reciprocity, 1: 819–886.

Platteau, Jean-Philippe. 2014. “Redistributive pressures in Sub-Saharan Africa: Causes, consequences, and coping strategies.” Africa’s Development in Historical Perspective, 153.

Rao, Vijayendra. 2000. “Price heterogeneity and “Real” inequality: A case study of prices and poverty in rural south India.” Review of Income and Wealth, 46(2): 201–211.

Sahn, David E., and David Stifel. 2003. “Exploring Alternative Measures of Welfare in the Absence of Expenditure Data.” Review of Income and Wealth, 49(4): 463–489.

Squires, Munir. 2016. “Kinship taxation as a constraint to microenterprise growth: experimental evidence from Kenya.” Unpublished.
University of Ghana. 2008. “Harvest and Post Harvest Baseline Study.”

Wansink, Brian. 1996. “Can package size accelerate usage volume?” The Journal of Marketing, 1–14.

Zorya, Sergiy, Nancy Morgan, Luz Diaz Rios, Rick Hodges, Ben Bennett, Tanya Stathers, Paul Mwebaze, John Lamb, et al. 2011. “Missing food: the case of postharvest grain losses in sub-Saharan Africa.” The World Bank.

Figure 1: Maize expenditure and unit price schedules for district with three focal points

Notes: Authors’ calculations from example data in text. Circles indicate the focal points that constitute the maize price schedule (left panel) and expenditure schedule (right panel) in one of the districts. ×s indicate the quantity and unit price (left panel) or expenditure (right panel) associated with three hypothetical maize purchases by a single household: 1 kg for 650 TZS, 1 kg for 750 TZS, and 1.5 kg for 975 TZS.

Figure 2: Expenditure and unit price for kerosene purchases in one district

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Nine outliers were suppressed to improve readability of these figures. Figure depicts expenditures and unit prices by quantity for 702 purchases of kerosene in one of the study districts. The size of the circles corresponds to the number of transactions at the circle center. The triangles represent the estimated focal points, and the solid lines mark the unit price (left panel) and expenditure (right panel) schedules. At left, the downward orientation of the unit prices is clear. At right, the changing slope of the expenditure line represents the drop in unit prices as quantity increases.

Figure 3: Distribution of financial losses

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. The average exchange rate during the study period was 1,150 TZS per US dollar. Bars represent histograms. Curves represent kernel densities using an Epanechnikov kernel.
The left panel shows the distribution of two-week household-level financial losses in Tanzania shillings, calculated as the difference between adjusted expenditure from multiple purchasing and the counterfactual adjusted expenditure associated with purchasing the total quantity in a single transaction. The right panel shows the same financial losses as a percentage of adjusted expenditure, again at the household level.

Figure 4: Average days of consumption represented by $q^{*}_{\min}$
Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Two outliers were suppressed to improve readability of these figures, but those outliers are represented in the labeled statistics. The bars represent a histogram of the average number of days of consumption represented by $q^{*}_{\min}$, which is the minimum purchase quantity that achieves the lowest available unit price (at the district level). To estimate the number of days’ worth of consumption represented by $q^{*}_{\min}$, we divide $q^{*}_{\min}$ by the average total quantity purchased (at the district level), and multiply by 14.

Table 1: Descriptions of item standardization and units

| Description of item | (1) Avg. transaction quantity | (2) Avg. total 2-week quantity | (3) Avg. unit price | (4) Unit |
|---|---|---|---|---|
| Maize: loose, dried maize kernels. Excludes maize flour, maize cobs, popcorn, or processed maize grains. | 10.36 | 20.90 | 393 | Kg |
| Cooking Bananas: excludes any other type of banana such as roasting bananas, beer bananas or sweet bananas. | 7.57 | 17.16 | 477 | Kg |
| Cassava: fresh, raw cassava. Excludes cassava flour and dried, boiled, fried, or roasted cassava. | 3.28 | 8.44 | 137 | Kg |
| Soap: solid bar soap. Excludes powdered soap, beauty soap, dishwashing liquid. | 2.22 | 6.57 | 137 | Kg |
| Charcoal: excludes wood, kerosene, other fuels for cooking. | 2.05 | 13.06 | 417 | Kg |
| Rice: husked white rice. Excludes unhusked, brown, broken rice. | 1.63 | 5.86 | 886 | Kg |
| Flour: white maize flour. Excludes brown flour, flours from wheat, millet, sorghum. | 1.28 | 7.17 | 613 | Kg |
| Beans: dried kidney beans. Excludes fresh kidney beans, green beans, other beans, green gram, lentils, chick peas, cow peas, pigeon peas, bambarra nuts, garden peas. | 0.85 | 2.18 | 1022 | Kg |
| Coconut: whole matured coconuts. Excludes immature coconuts. | 0.76 | 3.88 | 444 | Kg |
| Salt: excludes coarse salt or any other spices. | 0.55 | 1.13 | 635 | Kg |
| Sugar: refined sugar. Excludes unrefined sugar, honey, syrup, other sweeteners. | 0.54 | 2.23 | 1222 | Kg |
| Sweet Bananas: excludes cooking, roasting or beer bananas. | 0.49 | 1.12 | 1263 | Kg |
| Dried sardines: dried dagaa. Excludes fresh dagaa and other fish. | 0.35 | 1.16 | 1371 | Kg |
| Onions: fresh, whole onions. | 0.30 | 1.15 | 693 | Kg |
| Tea Leaves: black tea leaves. Excludes other types of tea, ground coffee, instant coffee and other raw ingredients for hot beverages. | 0.02 | 0.08 | 8170 | Kg |
| Kerosene: very homogeneous so no need to exclude anything in this category. Typically used for lighting and/or cooking. | 0.26 | 0.95 | 2313 | Liter |
| Cooking Oil: liquid vegetable oil. Excludes butter, ghee, other types of fat. | 0.18 | 0.84 | 2711 | Liter |
| Cigarettes: Portsman cigarettes. Excludes other brands, locally made cigarettes, chewing tobacco, and raw tobacco. | 5.27 | 31.67 | 49 | Piece |
| Matches: excludes lighters or wicks. | 1.98 | 4.19 | 47 | Box |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. The “Description of item” column indicates which sub-items were retained or excluded for each study item, in order to improve standardization. Column 1 reports the average reported transaction quantity across all observed purchases of each item. Column 2 reports average total quantity purchased over two weeks by households that purchase each item at least once. Column 3 reports the average unit price paid across all observed purchases of each item.
Table 2: Purchase and expenditure patterns, by item, N=1493 households

Columns 4-6 are calculated among all households; columns 7-9 are calculated among households purchasing the item.

| Item | (1) Total purchases | (2) HHs purchasing | (3) HHs multiple purchasing | (4) % of HHs multiple purchasing | (5) Avg no. of purchases | (6) Avg total expenditure | (7) % of HHs multiple purchasing | (8) Avg no. of purchases | (9) Avg total expenditure |
|---|---|---|---|---|---|---|---|---|---|
| Cooking Oil | 5319 | 1166 | 927 | 62.1 | 3.6 | 1505 | 79.5 | 4.6 | 1928 |
| Kerosene | 4581 | 1273 | 980 | 65.6 | 3.1 | 1312 | 77.0 | 3.6 | 1539 |
| Sugar | 4298 | 1048 | 789 | 52.8 | 2.9 | 1859 | 75.3 | 4.1 | 2648 |
| Onions | 4123 | 1089 | 797 | 53.4 | 2.8 | 369 | 73.2 | 3.8 | 506 |
| Flour | 4021 | 715 | 555 | 37.2 | 2.7 | 2052 | 77.6 | 5.6 | 4286 |
| Dried sardines | 3494 | 1042 | 787 | 52.7 | 2.3 | 787 | 75.5 | 3.4 | 1128 |
| Rice | 3298 | 918 | 631 | 42.3 | 2.2 | 3074 | 68.7 | 3.6 | 5000 |
| Soap | 3165 | 1070 | 738 | 49.4 | 2.1 | 632 | 69.0 | 3.0 | 882 |
| Salt | 2284 | 1122 | 680 | 45.5 | 1.5 | 439 | 60.6 | 2.0 | 584 |
| Tea Leaves | 2228 | 612 | 391 | 26.2 | 1.5 | 158 | 63.9 | 3.6 | 387 |
| Beans | 2036 | 794 | 497 | 33.3 | 1.4 | 1149 | 62.6 | 2.6 | 2161 |
| Matches | 1942 | 917 | 515 | 34.5 | 1.3 | 115 | 56.2 | 2.1 | 187 |
| Coconut | 1905 | 373 | 312 | 20.9 | 1.3 | 417 | 83.6 | 5.1 | 1672 |
| Charcoal | 1712 | 269 | 230 | 15.4 | 1.1 | 874 | 85.5 | 6.4 | 4851 |
| Cassava | 969 | 377 | 197 | 13.2 | 0.6 | 267 | 52.3 | 2.6 | 1060 |
| Cigarettes | 956 | 159 | 115 | 7.7 | 0.6 | 162 | 72.3 | 6.0 | 1524 |
| Sweet Bananas | 750 | 324 | 146 | 9.8 | 0.5 | 162 | 45.1 | 2.3 | 750 |
| Cooking Bananas | 732 | 323 | 176 | 11.8 | 0.5 | 514 | 54.5 | 2.3 | 2379 |
| Maize | 688 | 341 | 143 | 9.6 | 0.5 | 1679 | 41.9 | 2.0 | 7354 |
| AVERAGE | 2552 | 733 | 505 | 33.9 | 1.7 | 923 | 67.1 | 3.6 | 2149 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data.
Table 3: Regressions of unit price on quantity, various specifications

Dependent variable: transaction-level unit price. Columns 1-3 use market survey data; columns 4-5 use transaction diary data. Standard errors are shown in parentheses after each coefficient.

| Item | (1) | (2) | (3) | N | (4) | (5) | N |
|---|---|---|---|---|---|---|---|
| Rice | -0.036** (0.017) | -0.013 (0.023) | -0.037 (0.061) | 786 | -0.018 (0.012) | -0.032*** (0.010) | 3298 |
| Maize | -0.045*** (0.009) | -0.050*** (0.010) | -0.043*** (0.015) | 774 | -0.084** (0.033) | -0.125*** (0.026) | 688 |
| Flour | -0.204*** (0.056) | -0.381*** (0.055) | -0.310 (3.687) | 532 | -0.010* (0.006) | -0.015** (0.007) | 4021 |
| Cassava | -0.491*** (0.062) | -0.676*** (0.104) | -0.786*** (0.111) | 452 | -0.056 (0.078) | -0.048 (0.055) | 969 |
| Cooking Bananas | -0.016 (0.019) | 0.056** (0.026) | 0.071 (0.044) | 522 | -0.654*** (0.105) | -0.567*** (0.069) | 732 |
| Sugar | -0.227** (0.095) | -0.244** (0.106) | -0.253 (0.251) | 877 | -0.123*** (0.031) | -0.115*** (0.021) | 4298 |
| Beans | -0.061** (0.030) | -0.069* (0.039) | 0.002 (0.069) | 740 | -0.032 (0.022) | -0.046** (0.019) | 2036 |
| Sweet Bananas | -0.254*** (0.029) | -0.170*** (0.035) | -0.201** (0.096) | 459 | -0.144 (0.096) | -0.350*** (0.113) | 750 |
| Dried sardines | 0.146 (0.105) | 0.133 (0.135) | 0.161 (0.255) | 724 | -0.171*** (0.028) | -0.183*** (0.024) | 3494 |
| Cooking Oil | -0.068*** (0.007) | -0.070*** (0.008) | -0.071*** (0.011) | 1444 | -0.139*** (0.019) | -0.159*** (0.015) | 5319 |
| Coconut | | | | | -0.052 (0.033) | -0.108*** (0.028) | 1905 |
| Onions | | | | | -0.365*** (0.032) | -0.377*** (0.026) | 4123 |
| Salt | | | | | -0.343*** (0.057) | -0.322*** (0.051) | 2284 |
| Tea Leaves | | | | | -0.329*** (0.034) | -0.417*** (0.051) | 2228 |
| Charcoal | | | | | -0.293*** (0.055) | -0.314*** (0.052) | 1712 |
| Kerosene | | | | | -0.262*** (0.019) | -0.261*** (0.021) | 4581 |
| Matches | | | | | -0.035 (0.022) | -0.054*** (0.018) | 1942 |
| Soap | | | | | 0.018 (0.020) | -0.024 (0.018) | 3165 |
| Cigarettes | | | | | 0.006 (0.121) | -0.022 (0.016) | 956 |
| Fixed effects | District | Village | Vendor | | Vill-day | Household | |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. Controls for diary type included in regressions underlying columns 3 and 4. Each coefficient is from a separate regression of unit price on quantity, for only the item indicated.

Table 4: Transaction quantities and expenditures: observed and counterfactual

ITEM-LEVEL MEANS

| Item | (1) Quantity $q_{hi}$ | (2) Potential quantity $q^{*}_{hi}$ | (3) % change $\tilde{Q}_{hi}$ | (4) Num. HHs multiple purchasing | (5) Adjusted expenditure $\hat{e}_{hi}$ | (6) Loss $L_{hi}$ | (7) % Loss $\tilde{L}_{hi}$ |
|---|---|---|---|---|---|---|---|
| Kerosene | 1.11 | 1.33 | 33.2 | 950 | 1724 | 288 | 19.7 |
| Onions | 1.42 | 1.81 | 46.2 | 789 | 522 | 95 | 19.7 |
| Cooking Bananas | 26.49 | 30.85 | 50.2 | 172 | 3010 | 490 | 18.4 |
| Tea Leaves | 0.09 | 0.11 | 31.5 | 387 | 482 | 57 | 16.1 |
| Cooking Oil | 0.99 | 1.17 | 24.3 | 917 | 2191 | 291 | 15.4 |
| Dried sardines | 1.46 | 1.58 | 17.7 | 772 | 889 | 77 | 9.5 |
| Salt | 1.45 | 1.53 | 7.4 | 661 | 693 | 37 | 6.8 |
| Coconut | 4.53 | 4.80 | 7.7 | 305 | 1889 | 115 | 6.8 |
| Maize | 34.12 | 36.12 | 8.3 | 141 | 11846 | 711 | 6.6 |
| Sweet Bananas | 1.82 | 1.91 | 6.3 | 144 | 1192 | 32 | 4.8 |
| Cigarettes | 41.39 | 42.64 | 5.4 | 113 | 1845 | 42 | 4.6 |
| Soap | 8.59 | 8.90 | 4.5 | 719 | 925 | 36 | 4.5 |
| Cassava | 13.39 | 13.79 | 5.4 | 195 | 1482 | 64 | 4.5 |
| Charcoal | 14.88 | 15.37 | 7.2 | 224 | 5465 | 56 | 4.2 |
| Matches | 5.42 | 5.54 | 2.4 | 506 | 249 | 5 | 2.2 |
| Sugar | 2.70 | 2.71 | 0.8 | 781 | 3211 | 19 | 1.1 |
| Flour | 8.75 | 8.81 | 0.6 | 544 | 4936 | 38 | 1.0 |
| Rice | 7.54 | 7.58 | 0.6 | 614 | 6377 | 43 | 0.9 |
| Beans | 2.81 | 2.81 | 0.5 | 491 | 2812 | 9 | 0.5 |
| AVERAGE | | | 15.2 | | 2264 | 114 | 8.7 |

HOUSEHOLD-LEVEL MEANS

| Households | $\hat{e}_{h}$ | $L_{h}$ | $\tilde{L}_{h}$ |
|---|---|---|---|
| All | 14969 | 751 | 6.7 |
| Below median | 8757 | 174 | 4.5 |
| Above median | 21155 | 1326 | 8.9 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Calculations based on multi-purchasing households; $L_{hi}$ and percent change both set to zero for single-purchasing households; for Item panel, columns 3 and 6 calculated at household-item level before averaging across items, and column 3 calculated after throwing out upper 1% tail; for Household panel, “median” refers to median of $L_{h}$.
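The loss measure reported in Table 4 compares what a household spends across several small purchases with what the same total quantity would cost in one transaction. The following is a minimal sketch of that comparison, not the authors’ code. It assumes the expenditure schedule can be represented by a short list of (quantity, expenditure) focal points, that expenditure is interpolated linearly between focal points, and that outside the observed range the unit price of the nearest focal point applies. The function names and the kerosene numbers are hypothetical.

```python
from bisect import bisect_left


def schedule_expenditure(q, focal_points):
    """Adjusted expenditure at quantity q, given (quantity, expenditure) focal points.

    Assumption (not from the paper): linear interpolation between focal points;
    outside the observed range, the nearest focal point's unit price applies.
    """
    pts = sorted(focal_points)
    qs = [p[0] for p in pts]
    if q <= qs[0]:
        return q * pts[0][1] / pts[0][0]
    if q >= qs[-1]:
        return q * pts[-1][1] / pts[-1][0]
    j = bisect_left(qs, q)  # first focal point with quantity >= q
    (q0, e0), (q1, e1) = pts[j - 1], pts[j]
    return e0 + (e1 - e0) * (q - q0) / (q1 - q0)


def household_item_loss(quantities, focal_points):
    """Return (loss, loss as a share of adjusted expenditure) for one household-item."""
    # Adjusted expenditure from purchasing in several transactions.
    adjusted = sum(schedule_expenditure(q, focal_points) for q in quantities)
    # Counterfactual: the same total quantity bought in a single transaction.
    single = schedule_expenditure(sum(quantities), focal_points)
    loss = adjusted - single
    return loss, loss / adjusted


# Hypothetical kerosene schedule (liters, TZS) and four 0.25-liter purchases.
focal = [(0.25, 300.0), (0.5, 550.0), (1.0, 1000.0)]
loss, pct_loss = household_item_loss([0.25] * 4, focal)
print(loss, round(pct_loss, 3))  # 200.0 0.167
```

With a single purchase the two expenditures coincide, so the sketch returns a loss of zero, in line with the table notes for single-purchasing households.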
Table 5: Loss regressed on household characteristics

Standard errors are shown in parentheses after each coefficient.

| Dependent variable: | (1) Loss | (2) Loss | (3) % Loss | (4) % Loss |
|---|---|---|---|---|
| Wealth index quartile 2 (=1) | 140.552*** (48.435) | 131.823*** (47.189) | 0.001 (0.005) | -0.001 (0.005) |
| Wealth index quartile 3 (=1) | 130.488** (61.172) | 118.329* (63.123) | -0.007 (0.005) | -0.009* (0.005) |
| Wealth index quartile 4 (=1) | 212.799*** (77.138) | 169.892** (83.454) | -0.030*** (0.008) | -0.032*** (0.008) |
| Age of head (years) | | -1.883 (1.771) | | -0.001*** (0.000) |
| Head is female (=1) | | 1.783 (55.083) | | 0.000 (0.005) |
| Head years of education | | 3.199 (9.116) | | -0.000 (0.001) |
| Household size | | 16.904* (9.726) | | -0.002*** (0.001) |
| Distance to community center (km) | | -75.793 (49.858) | | -0.004 (0.003) |
| Distance to market (km) | | -0.159 (0.130) | | -0.000*** (0.000) |
| Observations | 1452 | 1446 | 1452 | 1446 |
| R-squared | 0.18 | 0.18 | 0.15 | 0.18 |
| Mean of dependent variable | 751.531 | 748.929 | 0.067 | 0.067 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects, controls for demographic composition of the household, and controls for questionnaire module. The wealth index is defined with quartile 1, the excluded group, as the poorest.

Table 6: Summary statistics across districts for $q^{*}_{\min}$ and $e^{*}_{\min}$, by item

Columns 2-7 report statistics across the 7 study districts.

| Item | (1) Average quantity purchased | (2) $q^{*}_{\min}$ Mean | (3) $q^{*}_{\min}$ Min | (4) $q^{*}_{\min}$ Max | (5) $e^{*}_{\min}$ Mean | (6) $e^{*}_{\min}$ Min | (7) $e^{*}_{\min}$ Max |
|---|---|---|---|---|---|---|---|
| Maize | 20.9 | 9.38 | 3 | 20 | 2414 | 750 | 6498 |
| Kerosene | 0.95 | 0.86 | 0.5 | 1 | 1071 | 600 | 1300 |
| Cooking Bananas | 17.16 | 10.8 | 1.72 | 28 | 1067 | 201 | 1708 |
| Cooking Oil | 0.84 | 0.58 | 0.05 | 1 | 1050 | 100 | 2500 |
| Rice | 5.86 | 1.14 | 0.5 | 4 | 771 | 300 | 2400 |
| Sugar | 2.23 | 0.5 | 0.25 | 2 | 629 | 250 | 2600 |
| Flour | 7.17 | 1.39 | 0.25 | 4 | 617 | 100 | 1600 |
| Charcoal | 13.06 | 2.61 | 1.45 | 7.25 | 400 | 200 | 700 |
| Beans | 2.18 | 0.43 | 0.25 | 1 | 400 | 200 | 900 |
| Coconut | 3.88 | 0.88 | 0.57 | 1.1 | 383 | 200 | 550 |
| Cassava | 8.44 | 2.68 | 0.58 | 8.67 | 325 | 50 | 997 |
| Sweet Bananas | 1.12 | 1.57 | 0.05 | 8.61 | 276 | 50 | 550 |
| Salt | 1.13 | 0.57 | 0.25 | 1 | 264 | 100 | 500 |
| Soap | 6.57 | 2.29 | 1 | 8 | 243 | 100 | 704 |
| Tea Leaves | 0.08 | 0.08 | 0.01 | 0.25 | 243 | 50 | 500 |
| Dried sardines | 1.16 | 0.22 | 0.1 | 0.4 | 157 | 100 | 200 |
| Matches | 4.19 | 3.57 | 1 | 10 | 133 | 30 | 400 |
| Onions | 1.15 | 0.51 | 0.05 | 1.4 | 114 | 50 | 200 |
| Cigarettes | 31.67 | 2.57 | 1 | 6 | 106 | 50 | 240 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Column 1 refers to average total purchase over 2 weeks at the household-item level, for households that purchased positive amounts of the item. Table is sorted by column 5. Units are kilograms except for cooking oil and kerosene (liter); cigarettes (piece); and matches (box).

Table 7: Median days required to save enough to purchase at lowest unit price

If household temporarily forgoes purchasing tea leaves, salt, sugar, and cigarettes... Column 1 covers all households; columns 2-5 are by wealth quartile (1 = poorest); columns 6-8 are loss-prone households as measured by percentage losses only, level losses only, or both.

| Item | (1) All | (2) Quartile 1 | (3) Quartile 2 | (4) Quartile 3 | (5) Quartile 4 | (6) %age only | (7) Level only | (8) Both |
|---|---|---|---|---|---|---|---|---|
| Kerosene | 4.3 | 5.8 | 4.5 | 4.2 | 3.2 | 8.2 | 2.6 | 4.5 |
| Onions | 0.5 | 0.7 | 0.5 | 0.5 | 0.5 | 1.4 | 0.4 | 0.6 |
| Cooking Bananas | 3.2 | 3.3 | 3.7 | 4.4 | 2.5 | 8.1 | 2.1 | 4.0 |
| Cooking Oil | 2.9 | 4.0 | 3.7 | 3.5 | 1.9 | 8.0 | 1.5 | 3.3 |
| Dried sardines | 0.7 | 0.9 | 0.7 | 0.7 | 0.7 | 1.6 | 0.5 | 0.7 |
| Coconut | 1.0 | 0.7 | 0.9 | 0.6 | 1.2 | 4.2 | 0.9 | 1.6 |
| Maize | 4.7 | 4.9 | 5.6 | 4.9 | 2.7 | 13.6 | 3.8 | 7.9 |
| Sweet Bananas | 1.0 | 1.0 | 1.0 | 0.8 | 1.0 | 1.8 | 1.1 | 1.7 |
| Soap | 0.9 | 1.3 | 0.7 | 0.8 | 0.9 | 1.6 | 0.5 | 0.7 |
| Cassava | 0.9 | 1.0 | 0.6 | 0.5 | 1.8 | 0.9 | 1.3 | 0.7 |
| Charcoal | 1.0 | 2.1 | 1.5 | 1.4 | 1.0 | 3.3 | 0.9 | 1.5 |
| Matches | 0.4 | 0.5 | 0.4 | 0.6 | 0.2 | 1.3 | 0.2 | 0.5 |
| Flour | 0.6 | 0.5 | 0.5 | 0.5 | 1.1 | 1.2 | 0.9 | 1.1 |
| Rice | 1.1 | 1.7 | 1.5 | 1.4 | 0.6 | 3.2 | 0.6 | 1.7 |
| Beans | 1.0 | 1.5 | 1.1 | 1.1 | 0.7 | 2.1 | 0.7 | 1.4 |
| OVERALL | 1.2 | 1.5 | 1.3 | 1.1 | 1.0 | 2.9 | 0.8 | 1.6 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Items ordered by decreasing values of mean $\tilde{L}_{h}$, from Table 4. Column 6 includes households in highest quartile by $\tilde{L}_{h}$ but not $L_{h}$. Column 7 includes households in highest quartile by $L_{h}$ but not $\tilde{L}_{h}$. Column 8 includes households in highest quartile by both.

Table 8: Counterfactual change in number of transactions, if purchasing at lowest unit price

| Subgroup | Statistic | (1) Actual ($K_{h}$ and $K_{hi}$) | (2) Counterfactual ($K^{*}_{h}$ and $K^{*}_{hi}$) | (3) Difference |
|---|---|---|---|---|
| All households | Mean total transactions | 32.5 | 53.3 | 20.8 |
| | Mean transactions per item | 3.5 | 5.7 | 2.2 |
| Loss-prone households, level only | Mean total transactions | 67.5 | 98.4 | 30.9 |
| | Mean transactions per item | 5.3 | 7.8 | 2.4 |
| Loss-prone households, %age only | Mean total transactions | 17.8 | 18.0 | 0.1 |
| | Mean transactions per item | 2.4 | 2.4 | 0.0 |
| Loss-prone households, both | Mean total transactions | 37.5 | 42.8 | 5.2 |
| | Mean transactions per item | 3.6 | 4.1 | 0.5 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. “Level only” are in highest quartile by $L_{h}$, but not $\tilde{L}_{h}$; “%age only” are in highest quartile by $\tilde{L}_{h}$, but not $L_{h}$; “both” are in highest quartile by $L_{h}$ and $\tilde{L}_{h}$.
Figures represent the number of transactions over two weeks on the 19 study items.

Table 9: Number of transactions, bulk discounts, and temptation

Dependent variable: Number of transactions at household-item level. Standard errors are shown in parentheses after each coefficient.

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Bulk discount (=1) | -0.131* (0.077) | 0.031 (0.063) | | -0.063 (0.079) | -0.637*** (0.087) |
| Temptation good (=1) | | | 0.653*** (0.060) | 0.451*** (0.098) | 0.108 (0.113) |
| Temptation × Bulk | | | | 0.362** (0.141) | 0.687*** (0.159) |
| Loss-prone × Bulk | | | | | 1.316*** (0.153) |
| Temptation × Loss-prone | | | | | 0.859*** (0.198) |
| Temptation × Bulk × Loss-prone | | | | | -0.755*** (0.287) |
| Observations | 9606 | 9606 | 9606 | 9606 | 9606 |
| R2 | 0.44 | 0.33 | 0.34 | 0.34 | 0.35 |
| Household FE | Yes | Yes | Yes | Yes | Yes |
| Item FE | Yes | No | No | No | No |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at household level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1.

Table 10: Social taxation, descriptive statistics

| Category | Mean | s.d. |
|---|---|---|
| Total value outgoing (TZS) | 9511 | 23204 |
| outgoing: meals and snacks | 3299 | 9709 |
| outgoing: grains | 1605 | 6404 |
| outgoing: pulses | 1543 | 7543 |
| outgoing: starches | 670 | 2765 |
| outgoing: meat and dairy | 579 | 2933 |
| outgoing: fruits and vegetables | 238 | 1161 |
| outgoing: other | 1577 | 10288 |
| Total value incoming (TZS) | 73021 | 101209 |
| incoming: purchases | 51281 | 95503 |
| incoming: own production | 15288 | 23813 |
| incoming: other | 6453 | 11074 |
| Implied social tax rate (%) | 14.4 | 26.5 |
| Implied social tax rate, excluding meals and snacks (%) | 9.2 | 20.5 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Estimates based on all activity by the 1,493 diary households, on all items. Figures are the total TZS values of each outgoing or incoming transaction reported in transaction diaries for the categories listed, aggregated to the household-category level by the authors.
Table 11: Social tax rate regressed on transaction quantity percentiles and characteristics

Standard errors are shown in parentheses after each coefficient.

| Dependent variable: | (1) Social tax rate | (2) Social tax rate | (3) Social tax rate, adjusted | (4) Social tax rate, adjusted |
|---|---|---|---|---|
| Average transaction percentile | 9.683** (4.661) | 11.903** (4.923) | 11.228*** (3.902) | 11.872*** (3.890) |
| Wealth index quartile 2 (=1) | | 1.481 (2.127) | | 1.297 (1.396) |
| Wealth index quartile 3 (=1) | | 2.011 (2.126) | | 1.977 (1.423) |
| Wealth index quartile 4 (=1) | | 0.936 (3.128) | | 1.061 (2.909) |
| Household size | | -0.291 (0.273) | | -0.100 (0.222) |
| Age of head (years) | | -0.016 (0.051) | | -0.031 (0.038) |
| Head is female (=1) | | 1.417 (1.718) | | -0.822 (1.255) |
| Distance to community center (km) | | -0.270 (0.764) | | -0.482 (0.628) |
| Distance to market (km) | | 0.022*** (0.004) | | 0.026*** (0.003) |
| Observations | 1453 | 1446 | 1453 | 1446 |
| R2 | 0.09 | 0.10 | 0.09 | 0.09 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects and controls for questionnaire module. The “adjusted” social tax rate is the re-calculated rate without including outgoing meals and snacks (as in the final row of Table 10). To calculate average transaction percentiles we (i) sort the observed transaction quantities at the item-district level; (ii) assign the percentile value n/N to the quantity with rank n in ascending order (rank 1 is the smallest), where N is the number of item-district observations (when there are ties at the item-district-quantity level, we assign the average); (iii) calculate the average at the household-item level, for the items purchased by the household; and (iv) calculate the average across items within household.
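Steps (i) to (iv) in the notes to Table 11 can be sketched in a few lines. This is an illustrative reading of the procedure rather than the authors’ code: the function name, the tuple layout of a transaction, and the demo data are all hypothetical, and ties are handled by averaging the ranks of equal quantities before dividing by N.

```python
from collections import defaultdict


def average_transaction_percentiles(transactions):
    """transactions: iterable of (household, item, district, quantity) tuples."""
    transactions = list(transactions)

    # (i) Collect the observed quantities within each item-district cell.
    cells = defaultdict(list)
    for _hh, item, district, q in transactions:
        cells[(item, district)].append(q)

    # (ii) Percentile n/N by ascending rank; tied quantities get the average rank.
    pct = {}
    for (item, district), qs in cells.items():
        ordered = sorted(qs)
        n_obs = len(ordered)
        first, last = {}, {}
        for rank, q in enumerate(ordered, start=1):
            first.setdefault(q, rank)
            last[q] = rank
        for q in first:
            pct[(item, district, q)] = (first[q] + last[q]) / 2 / n_obs

    # (iii) Average percentile at the household-item level.
    by_hh_item = defaultdict(list)
    for hh, item, district, q in transactions:
        by_hh_item[(hh, item)].append(pct[(item, district, q)])

    # (iv) Average across items within each household.
    by_hh = defaultdict(list)
    for (hh, _item), values in by_hh_item.items():
        by_hh[hh].append(sum(values) / len(values))
    return {hh: sum(v) / len(v) for hh, v in by_hh.items()}


# Hypothetical diary entries for three households in one district.
demo = [
    ("A", "maize", "d1", 1.0), ("A", "maize", "d1", 1.0),
    ("B", "maize", "d1", 2.0), ("C", "maize", "d1", 4.0),
    ("A", "salt", "d1", 0.5), ("B", "salt", "d1", 1.0),
]
print(average_transaction_percentiles(demo))  # {'A': 0.4375, 'B': 0.875, 'C': 1.0}
```

In the demo, household A always buys the smallest quantities and so receives the lowest average percentile, which is the sense in which the measure captures a low propensity to buy in bulk.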
Table 12: Testing the link between losses and shopping coordination failures

Standard errors are shown in parentheses after each coefficient.

| Dependent variable: | (1) Loss | (2) Loss | (3) % Loss | (4) % Loss |
|---|---|---|---|---|
| Households: | All | Multi-diary | All | Multi-diary |
| Number of item-days with 2+ purchasers | -39.551** (15.430) | -59.527*** (17.493) | -0.002*** (0.001) | -0.002*** (0.001) |
| Count of item-days with a purchase | 27.356*** (1.854) | 31.767*** (2.234) | 0.001*** (0.000) | 0.001*** (0.000) |
| Household size | 22.283 (31.148) | 33.577 (34.230) | 0.000 (0.002) | 0.001 (0.002) |
| Observations | 495 | 375 | 495 | 375 |
| R2 | 0.48 | 0.48 | 0.19 | 0.23 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include controls for wealth, household demographics, age, gender, and education of household head, distance to community center, distance to market, and district fixed effects. Data are from individual-level diaries.

Table 13: Joint tests of rationing and social taxation

Upper panel. Dependent variable: Number of transactions at the household-item level.

| | (1) | (2) | (3) |
|---|---|---|---|
| Bulk discount (=1) | -0.086 (0.086) | | -0.029 (0.091) |
| Bulk × Social tax rate | -0.003 (0.002) | | -0.003 (0.003) |
| Temptation good (=1) | | 0.645*** (0.069) | 0.435*** (0.114) |
| Temptation × Social tax rate | | 0.001 (0.002) | 0.001 (0.004) |
| Temptation × Bulk | | | 0.377** (0.168) |
| Temptation × Bulk × Social tax rate | | | -0.001 (0.005) |
| Observations | 9606 | 9606 | 9606 |
| R2 | 0.44 | 0.34 | 0.34 |
| Household FE | Yes | Yes | Yes |
| Item FE | Yes | No | No |

Lower panel.

| Dependent variable: | (1) $L_{h}$ | (2) $L_{h}$ | (3) $\tilde{L}_{h}$ | (4) $\tilde{L}_{h}$ |
|---|---|---|---|---|
| Social tax rate | -2.075*** (0.755) | -2.810** (1.389) | -0.000** (0.000) | -0.000 (0.000) |
| Number of purchases, temptation goods | 49.460*** (2.803) | 48.623*** (3.002) | -0.000 (0.000) | -0.000 (0.000) |
| Social tax rate × No. of tempt purchases | | 0.083 (0.114) | | -0.000 (0.000) |
| Observations | 1446 | 1446 | 1446 | 1446 |
| R2 | 0.40 | 0.40 | 0.18 | 0.18 |

Notes: Authors’ calculations from Survey of Household Welfare and Labor in Tanzania (SHWALITA) data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. Regressions in the lower panel include controls for wealth, household demographics, age, gender, and education of household head, distance to community center, distance to market, and district fixed effects.

S1 Household summary statistics

Table S1.1 provides household summary statistics. Mean consumption per capita is almost 400 USD per year, but the distribution is heavily skewed; the median is only 263 USD per year.[29] The median household has 5 people. The “Wealth index value” is the value of the first principal component from a vector of household assets (Filmer and Pritchett, 2001; Sahn and Stifel, 2003). This index serves as our primary measure of household wealth, because unlike expenditure, it is not endogenous to consumer prices.[30]

Table S1.1: Summary statistics at the household level

| | Mean | s.d. | Median |
|---|---|---|---|
| Age of head (years) | 46.65 | 16.03 | 44.00 |
| Head years of education | 4.72 | 3.75 | 7.00 |
| Head is female (=1) | 0.20 | 0.40 | 0.00 |
| Household size | 5.34 | 2.96 | 5.00 |
| Share under 15 yrs old | 0.40 | 0.24 | 0.45 |
| Share over 65 yrs old | 0.10 | 0.22 | 0.00 |
| Urban cluster (=1) | 0.34 | 0.47 | 0.00 |
| Acres owned | 3.84 | 5.56 | 2.00 |
| Wealth index value | -0.02 | 1.00 | -0.43 |
| Nominal consumption (TZS/yr) | 2002182 | 1976947 | 1449216 |
| Nominal consumption (USD/yr) | 1741 | 1719 | 1260 |
| Nominal consumption per capita (TZS/yr) | 447584 | 467340 | 302578 |
| Nominal consumption per capita (USD/yr) | 389 | 406 | 263 |

Notes: Authors’ calculations from SHWALITA data. Sample size is 1,497, because two households with incomplete demographic data are not included. 1,150 TZS = 1 USD.

[29] We convert Tanzania shillings to US dollars at a rate of 1,150 TZS/$1.
[30] One might worry that a stock measure of assets does not adequately capture the dimension of heterogeneity that is most relevant for purchasing behavior, i.e., heterogeneity in income or liquid wealth. We are not overly worried.
The wealth index is strongly correlated with observed household expenditure (r = 0.55). Furthermore, our main conclusions are not based on analyzing heterogeneity by wealth.

S2 Estimates of nonlinear expenditure schedules

Table S2.2: Regressions of expenditure on quantity and its square

Dependent variable: transaction-level expenditure

| Item | Coefficient on $q^{2}$: point estimate | Standard error |
|---|---|---|
| Rice | -1.16 | (4.66) |
| Maize | -1.22* | (0.72) |
| Flour | -20.63*** | (5.19) |
| Cassava | 0.45 | (1.95) |
| Cooking Bananas | -2.84*** | (0.85) |
| Sugar | -12.97** | (5.53) |
| Beans | -0.08 | (8.47) |
| Coconut | -14.58 | (11.60) |
| Onions | -58.16 | (45.86) |
| Sweet Bananas | -28.31*** | (5.75) |
| Dried sardines | -259.10*** | (34.52) |
| Cooking Oil | -415.85*** | (68.88) |
| Salt | -9.61*** | (1.52) |
| Tea Leaves | -3078.70*** | (1163.53) |
| Charcoal | -9.91*** | (1.46) |
| Kerosene | -240.60*** | (31.41) |
| Matches | -0.77*** | (0.14) |
| Soap | -4.54*** | (0.70) |
| Cigarettes | -0.06 | (0.13) |

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. Each coefficient is from a separate regression of transaction-level expenditure on quantity and quantity squared, for only the item indicated. We report the coefficient on quantity squared.

S3 Robustness of price schedules

The estimated schedules represent the counterfactual cost of purchasing at quantities greater than or equal to the observed quantities. In this subsection we examine the validity of the schedules as a set of counterfactuals. We first consider whether bulk discounts are a function of buyer-seller relationships, and available only to locals. We then examine variation around the expenditure schedules – variation in unit price, conditional on quantity – to determine whether this is a dimension along which poor and rich households face different prices.

Are bulk discounts dependent on buyer-seller relationships?

Could it be that bulk discounts are only available to consumers who have a relationship with a vendor?
In this case, a buyer might pay higher unit prices today as an investment in a relationship that will allow future access to better prices. Our data could reflect a point-in-time snapshot of an ongoing process in which consumers gradually cultivate, maintain, and sometimes lose these vendor relationships. Or, it may be that vendors are only willing to sell some items as “loss leaders” – large quantity purchases provided at a heavy discount – when they are combined with smaller quantity purchases at higher unit prices.

The data collected by project staff members from local markets allow us to reject both hypotheses. In the market price surveys, bulk discounts are clearly present (see columns 1-3 of Table 3). Yet, these staff members had no prior relationship with vendors, and asked only about purchasing one item at a time. Clearly, consumers do not need to invest in long-term relationships with sellers, nor must they combine large and small quantity purchases, in order to receive bulk discounts.

Heterogeneity around the expenditure schedules

Although our focus is on bulk discounts and why households might not take advantage of them, it is worth taking a moment to explore the nature of residual variation around the expenditure schedule. We observe numerous instances in which the price for the same quantity of the same item varies between transactions. Because of how we construct $L_{h}$ and $\tilde{L}_{h}$, this price variation does not impact our loss analysis directly. But it does represent a second dimension of between-household variation in prices that could underlie the “poor pay more” hypothesis.

In Table S3.3 we show the proportion of transactions for each item that are below, on, and above the expenditure schedule. There is less variation than one might expect. On average, 46% of transactions are exactly on the schedule, with 19% below and 35% above. At the top of the table, with 74–95% of prices falling on the schedule, we find matches, tea and cigarettes.
These are highly standardized goods that are sold in clearly identifiable and uniform units. At the bottom of the list are cooking bananas and cassava, with less than 20% of transactions on the schedule. These goods are typically sold in imprecise units (heaps, bunches). This suggests that some of the variation in unit price conditional on quantity may be due to measurement error, either at the time of purchase or during data collection.

Table S3.3: Position of transaction expenditure relative to expenditure schedule

| Item | (1) Below | (2) On | (3) Above |
|---|---|---|---|
| Cigarettes | 0.03 | 0.82 | 0.15 |
| Matches | 0.14 | 0.75 | 0.11 |
| Sugar | 0.17 | 0.70 | 0.13 |
| Onions | 0.17 | 0.61 | 0.21 |
| Soap | 0.11 | 0.58 | 0.31 |
| Rice | 0.20 | 0.49 | 0.31 |
| Tea Leaves | 0.11 | 0.48 | 0.41 |
| Beans | 0.27 | 0.44 | 0.28 |
| Salt | 0.19 | 0.42 | 0.39 |
| Kerosene | 0.25 | 0.39 | 0.36 |
| Charcoal | 0.26 | 0.38 | 0.36 |
| Sardines | 0.09 | 0.38 | 0.53 |
| Cooking Oil | 0.21 | 0.35 | 0.44 |
| Coconut | 0.33 | 0.32 | 0.34 |
| Sweet Bananas | 0.17 | 0.30 | 0.53 |
| Maize | 0.27 | 0.25 | 0.48 |
| Flour | 0.21 | 0.22 | 0.57 |
| Cooking Bananas | 0.32 | 0.17 | 0.51 |
| Cassava | 0.51 | 0.11 | 0.38 |
| AVERAGE | 0.20 | 0.45 | 0.35 |
| Wealth index quartile 1 | 0.17 | 0.46 | 0.36 |
| Wealth index quartile 2 | 0.18 | 0.45 | 0.36 |
| Wealth index quartile 3 | 0.20 | 0.46 | 0.34 |
| Wealth index quartile 4 | 0.22 | 0.44 | 0.35 |

Notes: Authors’ calculations from SHWALITA data. The wealth index is defined with quartile 1 as the poorest. Table sorted by decreasing values of column 2.

In Section 2 of the paper we labeled the idiosyncratic component of price, conditional on quantity, as $\nu_{hik}$. This residual variation could reflect unobserved item quality, bargaining skill, shopping effort, or other factors. We can calculate the empirical analog of this term as the difference between observed and adjusted expenditure, i.e., $\hat{\nu}_{hik} = e_{hik} - \hat{e}_{hik}$. By definition, the 46% of transactions that take place on the expenditure schedule have $\hat{\nu}_{hik} = 0$.

To examine the correlates of $\hat{\nu}_{hik}$, we first normalize it to its percentage difference from the expenditure schedule: $\hat{\nu}^{n}_{hik} = \hat{\nu}_{hik}/\hat{e}_{hik} = (e_{hik} - \hat{e}_{hik})/\hat{e}_{hik}$, where the “n” superscript indicates “normalized.” The mean of $\hat{\nu}^{n}_{hik}$ is 0.16, indicating that the average transaction is 16% above the expenditure schedule.[31]

Table S3.4: Regressions with idiosyncratic price component as dep. var., transaction level

Standard errors are shown in parentheses after each coefficient.

| Dependent variable: | (1) $\hat{\nu}^{n}_{hik}$ | (2) $\lvert \hat{\nu}^{n}_{hik} \rvert$ |
|---|---|---|
| Quantity z-score | -0.038 (0.03) | -0.028 (0.03) |
| Precise unit (=1) | 0.094 (0.12) | 0.078 (0.11) |
| Market day purchase (=1) | 0.012 (0.01) | 0.007 (0.01) |
| Wealth index quartile 2 (=1) | -0.033 (0.02) | -0.028 (0.02) |
| Wealth index quartile 3 (=1) | -0.011 (0.01) | -0.007 (0.01) |
| Wealth index quartile 4 (=1) | 0.001 (0.02) | -0.007 (0.02) |
| Observations | 46927 | 46927 |
| R-squared | 0.25 | 0.25 |
| Mean of dep. variable | 0.16 | 0.23 |

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at district level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects, item fixed effects, and controls for questionnaire module. Sample includes 1,496 households in 168 villages in 7 districts. We dropped observations in the 1% upper and lower tails of the $\hat{\nu}^{n}_{hik}$ distribution.

To examine the variation in $\hat{\nu}^{n}_{hik}$, we estimate regressions of the level and absolute value of $\hat{\nu}^{n}_{hik}$ on transaction and household characteristics. We use both the level and absolute value as dependent variables so as to explore factors associated with higher prices and greater spread. Table S3.4 shows results.
The variables of main interest are the wealth quartile dummies and the variable “Precise unit”, which takes a value of 1 if the unit involved in the transaction is standardized and precisely defined (at the local level), and zero otherwise.[32] Regressions also include district effects, item effects, questionnaire effects, controls for quantity (via item-level z-scores), and controls for purchases on village market days.

Results are broadly similar across the two columns of Table S3.4. There are no coefficients that are statistically different from zero at conventional levels. The main takeaway is that the residual component of prices does not vary meaningfully with wealth. The estimated coefficients on the wealth quartile dummy variables are neither economically nor statistically significant. This establishes the main result for this subsection: on average, there do not appear to be unobserved transaction-level characteristics that lead to poor households paying different prices from wealthy households for the same quantity of the same item.

[31] Recall that the focal expenditures that underlie the expenditure schedule are medians. The average transaction lies above the schedule because there is positive skewness in expenditure conditional on quantity.
[32] Based on the market survey efforts of the research team, we designated the following units as precise: kilogram, liter, 25kg bag, 50kg bag, debe, kisadolini, and packet of tea leaves. These units are associated with standardized quantities that were measured by the research team at markets in every village. Imprecise units include bowls, cups, pieces, heaps, and others. These were also surveyed and measured by the research team, but they are prone to greater measurement error. Approximately 63% of transactions were recorded in precise units.
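The normalized residual used in this subsection is the gap between observed and adjusted expenditure, expressed as a share of adjusted expenditure. The sketch below computes it for a handful of transactions and reports the two summary quantities used as dependent variables (the level and the absolute value), plus the share of transactions exactly on the schedule. The function names and the three transactions are hypothetical, and the 1% tail trimming applied in Table S3.4 is left out.

```python
def normalized_residuals(pairs):
    """pairs: list of (observed expenditure, adjusted expenditure on the schedule)."""
    return [(e - e_hat) / e_hat for e, e_hat in pairs]


def summarize(pairs):
    """Mean level, mean absolute value, and share of transactions on the schedule."""
    res = normalized_residuals(pairs)
    n = len(res)
    return {
        "mean": sum(res) / n,
        "mean_abs": sum(abs(r) for r in res) / n,
        "share_on_schedule": sum(r == 0 for r in res) / n,
    }


# Hypothetical transactions (TZS): one on the schedule, one above, one below.
demo = [(650.0, 650.0), (750.0, 650.0), (900.0, 975.0)]
print(summarize(demo))
```

A positive mean with a larger mean absolute value, as in Table S3.4, indicates that transactions sit above the schedule on average while deviations occur in both directions.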
Bulk discounts and purchase frequency

One might be concerned that bulk discounts change the extensive margin probability of purchase at the household level, so that items with and without bulk discounts appear in our data at different rates. That turns out not to be the case. Figure S3.1 plots the coefficients from column 5 of Table 3 in the paper against the share of households purchasing each item. There is essentially no relationship. The slope coefficient from the linear fit is far from significant (p = 0.70), and the correlation between the two variables is 0.09.

[Figure: scatter plot with linear fit. Vertical axis: slope coefficient. Horizontal axis: share purchasing.]

Figure S3.1: Coefficients from Table 3, column 5, plotted against share of households purchasing
Notes: Authors’ calculations from SHWALITA data.

S4 Categorizing households based on realized losses

Table S4.5 presents summary statistics for four groups of households (moving from column 2 to column 5): (i) households in the highest quartile of $L_{h}$ but not the highest quartile of $\tilde{L}_{h}$, (ii) households in the highest quartile of $\tilde{L}_{h}$ but not the highest quartile of $L_{h}$, (iii) households in the highest quartile for both losses and percentage losses, and (iv) households that are in neither worst quartile.

Table S4.5: Summary statistics by loss categories, household level

Columns 2-4 cover the 25% highest loss households by the measure indicated.

| | (1) Overall | (2) $L_{h}$ only | (3) $\tilde{L}_{h}$ only | (4) Both $L_{h}$ and $\tilde{L}_{h}$ | (5) Neither |
|---|---|---|---|---|---|
| Proportion in group | 1.00 | 0.12 | 0.12 | 0.13 | 0.63 |
| Number of transactions | 32.49 | 68.43 | 20.10 | 42.84 | 25.77 |
| Number of items purchased | 9.33 | 12.72 | 7.40 | 10.83 | 8.73 |
| Adjusted expenditure | 16539 | 37895 | 5301 | 18947 | 14018 |
| Adjusted expenditure per capita | 3792 | 8021 | 1637 | 4167 | 3308 |
| Loss (level) | 730.93 | 1677.09 | 564.08 | 2126.97 | 292.16 |
| Loss (%) | 0.05 | 0.05 | 0.12 | 0.12 | 0.02 |
| Wealth index value | -0.02 | 0.75 | -0.40 | -0.05 | -0.09 |
| Distance to community center (km) | 0.61 | 0.38 | 0.55 | 0.55 | 0.67 |
| Distance to market (km) | 5.23 | 4.18 | 4.98 | 4.30 | 5.46 |
| Age of head (years) | 46.65 | 45.89 | 45.86 | 41.59 | 48.04 |
| Head years of education | 4.72 | 5.69 | 4.03 | 5.06 | 4.58 |
| Head is female (=1) | 0.20 | 0.19 | 0.23 | 0.18 | 0.20 |
| Household size | 5.34 | 5.97 | 4.55 | 5.34 | 5.37 |
| No. of children 9-14 | 0.91 | 1.10 | 0.70 | 0.90 | 0.92 |
| No. of adults 15-59 | 2.51 | 3.02 | 2.07 | 2.47 | 2.49 |

Notes: Authors’ calculations from SHWALITA data. Sample includes 1,497 households with complete data.

Groups (i) and (ii) look like the rich and poor households discussed in the final paragraph of Section 4 of the paper. The 12% of households that have high losses but not high percentage losses (column 2) appear to be upper-class households. They make substantially more purchases, spend more than twice as much, and buy many more items than the average household. Their average level of the wealth index is almost a full standard deviation above the mean, and they are larger, more educated, and live nearer to the community center. In contrast, the 12% of households that have high percentage losses but not high level losses (column 3) appear to be poor and disadvantaged households. These households are smaller, less educated, and poorer in both expenditure and wealth terms.

Group (iii), the 13% of households that are in both high-loss categories (column 4), are interesting for a different reason: they exhibit very large losses despite having close to average expenditures. They also have near average wealth, household size, and education.
Their most notable characteristic is that the household heads are younger and more likely to be male, raising the interesting possibility that they lack the foresight or maturity to organize household finances. Otherwise, there is little besides their inefficient shopping patterns that distinguishes these households from the average.

S5 Temptation survey results

We invited Tanzanian staff members from recent survey projects to rate the study items on a five-point scale (1 = not at all tempting; 5 = tempting for essentially everyone who consumes the item). These respondents implement consumption surveys and speak regularly with households about their economic choices; in essence, they constitute a panel of experts. We asked respondents to answer for a typical household in a typical village, not to self-assess their own temptations. The survey was conducted online in June-July 2016. Across the 43 responses that we received, we assign each item its average score on the 5-point scale, and classify the top third (6 items) as the temptation goods in the study. Table S5.6 shows the results of the temptation survey.

Table S5.6: Temptation survey results (1 = Not tempting; 5 = Highly tempting)

Item                Mean score
Tea leaves          2.31
Maize               2.33
Cassava             2.43
Kerosene            2.54
Sardines            2.74
Salt                2.79
Matches             2.81
Coconut             2.88
Cooking bananas     3.00
Onions              3.00
Flour               3.02
Beans               3.02
Charcoal            3.19
Sweet bananas       3.38
Cigarettes          3.81
Soap                3.83
Cooking oil         3.85
Rice                4.29
Sugar               4.31

Notes: Authors’ calculations from survey conducted with 43 Tanzanians.

S6 Social tax analysis: extensions

Table S6.7 shows the items that we assigned ex ante to each category of outgoing resources in Table 10.
Table S6.7: Category descriptions for social tax calculations

Category                 Items included
Meals and snacks         Bottled/canned soft drinks (soda, juice, water), barbecued meat, chips, roast bananas, and other snacks, tea (leaves or prepared), snacks, sodas and other non-alcoholic drinks, sweets and ice-cream, full meals (breakfast, lunch, dinner), local brews, wine, commercial beers, and spirits, kibuku and other local brews, bottled beer, wine and spirits
Pulses                   Peas, beans, lentils and other pulses, groundnuts in shell or shelled, seeds and products from nuts/seeds (excl. cooking oil), cashews, almonds and other nuts, coconuts (mature/immature)
Grains                   Maize flour, maize grain, millet and sorghum grain, wheat, barley grain and other cereals, millet and sorghum flour, maize (green, cob)
Starches                 Macaroni, spaghetti, buns, cakes and biscuits, Irish potatoes, cassava dry/flour, sweet potatoes, bread, cooking bananas, plantains, cassava fresh, yams and cocoyams, other starches
Fruits and vegetables    Any vegetable other than cabbage and pumpkin, mangoes, avocadoes and other fruits, citrus fruits (oranges, lemon, tangerines, etc.), ripe bananas
Meat and dairy           Beef including minced sausage, dried/salted/canned fish and seafood, bacon, goat meat, canned milk/milk powder, chicken and other poultry, fresh fish and seafood, other domestic/wild meat products, fresh milk, eggs, cheese, yogurt, wild birds and edible insects

Tables S6.8-S6.11 show the results of re-estimating the social tax regressions under various conditions to test robustness. In Table S6.8, we see that the coefficient magnitudes from using $q_h^w$ as the measure of propensity to buy in bulk (where $q_h^w$ is the average transaction percentile weighted by the share of household expenditure on each purchased item) are effectively unchanged from the main results reported in Table 11 of the main paper. Table S6.9 uses level losses, $L_h$, as a proxy for not buying in bulk.
The coefficients are no longer directly comparable to the main results, but the quantitative implications are of a similar magnitude. A one standard deviation increase in $L_h$ (934 TZS for the estimation sample) is associated with a tax rate change of -2.18 percentage points, or 15.1% of the mean. Similarly, Table S6.10 uses percentage losses, $\tilde{L}_h$, as the proxy for not buying in bulk. We find that a one standard deviation increase in $\tilde{L}_h$ (about 0.066) is associated with a tax rate change of -1.7 percentage points, or 11.8% of the mean. Finally, in Table S6.11 we winsorize the estimated social tax rates at 60% (replacing all observations above 60% with 60%, which affects roughly 4% of households). Note that our preferred specification is not winsorized, because even though some tax rates are surely overestimated, some are also sure to be underestimated. There is no reason to believe that upward bias is systematically more likely than downward bias. The estimated magnitudes of the key coefficient in Table S6.11, on “Average transaction percentile,” are uniformly larger than in Table 11 of the main paper. These additional regressions provide robust support for the conclusions reported in the main body of the paper.
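The standardized effects quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text (the standard deviations, the percentage-point effects, and the shares of the mean); the function names are illustrative and not part of the authors' code. It backs out the mean social tax rate implied by each pair of numbers and the coefficient implied by effect = coefficient × one standard deviation.

```python
# Cross-check of the standardized effects quoted in the text.
# Inputs are copied from the text; function names are hypothetical.

def implied_mean_tax_rate(effect_pp: float, share_of_mean: float) -> float:
    """Mean tax rate (percentage points) implied by an effect that is
    reported both in percentage points and as a share of the mean."""
    return abs(effect_pp) / share_of_mean


def implied_coefficient(effect_pp: float, one_sd: float) -> float:
    """Coefficient implied by: effect = coefficient * one standard deviation."""
    return effect_pp / one_sd


# Level losses: 1 SD = 934 TZS, effect = -2.18 pp, i.e. 15.1% of the mean.
mean_from_levels = implied_mean_tax_rate(-2.18, 0.151)
coef_levels = implied_coefficient(-2.18, 934)

# Percentage losses: 1 SD is about 0.066, effect = -1.7 pp, i.e. 11.8% of the mean.
mean_from_pct = implied_mean_tax_rate(-1.7, 0.118)

print(f"implied mean tax rate: {mean_from_levels:.1f} pp and {mean_from_pct:.1f} pp")
print(f"implied coefficient on level losses: {coef_levels:.4f}")
```

Both calculations imply a mean social tax rate of about 14.4 percentage points, so the two sets of quoted figures are mutually consistent. The implied coefficient on level losses is about -0.0023 per TZS.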
Table S6.8: Social tax rate regressed on weighted average transaction quantity percentiles and household characteristics

Dependent variable: social tax rate in columns (1) and (2); social tax rate, adjusted, in columns (3) and (4).

                                    (1)          (2)          (3)          (4)
Average transaction percentile,     9.304**      11.793***    9.291***     10.042***
  weighted                          (3.962)      (4.095)      (3.216)      (3.108)
Wealth index quartile 2 (=1)                     1.601                     1.418
                                                 (2.122)                   (1.403)
Wealth index quartile 3 (=1)                     1.984                     1.978
                                                 (2.119)                   (1.415)
Wealth index quartile 4 (=1)                     0.905                     1.178
                                                 (3.084)                   (2.869)
Household size                                   -0.331                    -0.107
                                                 (0.269)                   (0.223)
Age of head (years)                              -0.021                    -0.033
                                                 (0.051)                   (0.038)
Head is female (=1)                              1.511                     -0.765
                                                 (1.733)                   (1.263)
Distance to community center (km)                -0.256                    -0.433
                                                 (0.752)                   (0.619)
Distance to market (km)                          0.022***                  0.026***
                                                 (0.004)                   (0.003)
Observations                        1453         1446         1453         1446
$R^2$                               0.09         0.10         0.09         0.09

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects and controls for questionnaire module. The “adjusted” social tax rate is the re-calculated rate without including outgoing meals and snacks (as in the final row of Table 10 in the main paper).
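The weighted measure $q_h^w$ used in Table S6.8 can be illustrated with a short sketch. It assumes that each purchase already carries a transaction quantity percentile (computed as in the main paper, which is not reproduced here), that percentiles are on a 0-1 scale, and that a household's item-level percentile is the simple mean over its purchases of that item. These are assumptions made for illustration; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def weighted_avg_percentile(purchases):
    """Expenditure-share-weighted average transaction percentile for one household.

    purchases: iterable of (item, expenditure, percentile) tuples.
    Assumption: the item-level percentile is the simple mean across that
    item's purchases, and the weight is the item's share of total spending.
    """
    spend = defaultdict(float)
    pct_sum = defaultdict(float)
    count = defaultdict(int)
    for item, expenditure, percentile in purchases:
        spend[item] += expenditure
        pct_sum[item] += percentile
        count[item] += 1

    total = sum(spend.values())
    if total <= 0:
        raise ValueError("household has no positive expenditure")

    return sum((spend[i] / total) * (pct_sum[i] / count[i]) for i in spend)


# Hypothetical household: two small sugar purchases and one large rice purchase.
example = [("sugar", 300, 0.2), ("sugar", 300, 0.4), ("rice", 1400, 0.8)]
q_w = weighted_avg_percentile(example)
print(round(q_w, 2))
```

In this made-up example the unweighted mean of the two item-level percentiles is 0.55, while the weighted measure is 0.65 because rice accounts for 70% of spending. Weighting therefore gives more influence to the items on which a household spends the most.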
Table S6.9: Social tax rate regressed on loss levels and household characteristics

Dependent variable: social tax rate in columns (1) and (2); social tax rate, adjusted, in columns (3) and (4).

                                    (1)          (2)          (3)          (4)
$L_h$                               -0.002***    -0.003***    -0.002**     -0.002***
                                    (0.001)      (0.001)      (0.001)      (0.001)
Wealth index quartile 2 (=1)                     1.969                     1.675
                                                 (2.186)                   (1.439)
Wealth index quartile 3 (=1)                     2.496                     2.363*
                                                 (2.136)                   (1.424)
Wealth index quartile 4 (=1)                     2.359                     2.346
                                                 (3.028)                   (2.846)
Household size                                   -0.010                    0.151
                                                 (0.277)                   (0.231)
Age of head (years)                              -0.022                    -0.033
                                                 (0.052)                   (0.039)
Head is female (=1)                              1.300                     -0.951
                                                 (1.717)                   (1.266)
Distance to community center (km)                -0.204                    -0.360
                                                 (0.738)                   (0.617)
Distance to market (km)                          0.023***                  0.027***
                                                 (0.004)                   (0.003)
Observations                        1453         1446         1453         1446
$R^2$                               0.10         0.10         0.09         0.09

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects and controls for questionnaire module. The “adjusted” social tax rate is the re-calculated rate without including outgoing meals and snacks (as in the final row of Table 10 in the main paper).
Table S6.10: Social tax rate regressed on loss percentages and household characteristics

Dependent variable: social tax rate in columns (1) and (2); social tax rate, adjusted, in columns (3) and (4).

                                    (1)          (2)          (3)          (4)
$\tilde{L}_h$                       -22.956**    -24.120**    -16.087***   -15.842***
                                    (9.064)      (9.412)      (5.847)      (6.043)
Wealth index quartile 2 (=1)                     1.587                     1.411
                                                 (2.149)                   (1.425)
Wealth index quartile 3 (=1)                     1.954                     1.996
                                                 (2.122)                   (1.427)
Wealth index quartile 4 (=1)                     1.081                     1.491
                                                 (2.967)                   (2.781)
Household size                                   -0.165                    0.046
                                                 (0.281)                   (0.236)
Age of head (years)                              -0.022                    -0.032
                                                 (0.052)                   (0.039)
Head is female (=1)                              1.291                     -0.958
                                                 (1.729)                   (1.275)
Distance to community center (km)                -0.131                    -0.306
                                                 (0.739)                   (0.631)
Distance to market (km)                          0.022***                  0.026***
                                                 (0.004)                   (0.003)
Observations                        1453         1446         1453         1446
$R^2$                               0.09         0.10         0.09         0.09

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects and controls for questionnaire module. The “adjusted” social tax rate is the re-calculated rate without including outgoing meals and snacks (as in the final row of Table 10 in the main paper).
Table S6.11: Winsorized social tax rate regressed on transaction quantity percentiles and household characteristics

Dependent variable: social tax rate in columns (1) and (2); social tax rate, adjusted, in columns (3) and (4).

                                    (1)          (2)          (3)          (4)
Average transaction percentile      11.304***    12.898***    12.830***    13.926***
                                    (3.120)      (3.316)      (2.554)      (2.596)
Wealth index quartile 2 (=1)                     0.781                     0.671
                                                 (1.237)                   (1.032)
Wealth index quartile 3 (=1)                     0.795                     0.961
                                                 (1.325)                   (1.103)
Wealth index quartile 4 (=1)                     -0.815                    -1.213
                                                 (1.479)                   (1.233)
Household size                                   -0.205                    -0.126
                                                 (0.174)                   (0.151)
Age of head (years)                              0.001                     -0.015
                                                 (0.031)                   (0.025)
Head is female (=1)                              1.199                     -0.747
                                                 (1.010)                   (0.767)
Distance to community center (km)                0.462                     -0.133
                                                 (0.539)                   (0.500)
Distance to market (km)                          0.021***                  0.025***
                                                 (0.003)                   (0.003)
Observations                        1453         1446         1453         1446
$R^2$                               0.15         0.16         0.16         0.16

Notes: Authors’ calculations from SHWALITA data. Standard errors in parentheses; standard errors clustered at village level; *** sig. at 0.01, ** sig. at 0.05, * sig. at 0.1. All regressions include district fixed effects and controls for questionnaire module. The “adjusted” social tax rate is the re-calculated rate without including outgoing meals and snacks (as in the final row of Table 10 in the main paper).
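The one-sided winsorization behind Table S6.11, in which every estimated social tax rate above 60% is replaced with 60%, can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the example rates are hypothetical and are not drawn from the SHWALITA data.

```python
def winsorize_upper(rates, cap=60.0):
    """Replace every rate above `cap` with `cap`; other values are unchanged."""
    return [min(rate, cap) for rate in rates]


def share_capped(rates, cap=60.0):
    """Fraction of observations strictly above the cap (those that get replaced)."""
    if not rates:
        raise ValueError("rates must be non-empty")
    return sum(rate > cap for rate in rates) / len(rates)


# Hypothetical social tax rates in percent.
example_rates = [0.0, 5.2, 14.4, 61.0, 95.0]
capped = winsorize_upper(example_rates)
print(capped)
print(share_capped(example_rates))
```

Because only the upper tail is capped, the procedure can only lower the sample mean of the tax rate. This asymmetry is the concern raised in the text when explaining why the preferred specification is not winsorized.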