WPS7339
Policy Research Working Paper 7339
Small Cash Rewards for Big Losers: Experimental Insights into the Fight against the Obesity Epidemic
Boris Augurzky, Thomas K. Bauer, Arndt R. Reichert, Christoph M. Schmidt, Harald Tauchmann
Development Research Group, Impact Evaluation Team
June 2015
Abstract
This paper examines the sustainability of weight loss achieved through cash rewards and, for the first time, the potential of monetary incentives to prevent weight cycling. In a three‐period randomized controlled trial, about 700 obese persons were assigned to two treatment groups, which were promised different cash rewards contingent on the achievement of an individually assigned target weight, and to a control group. Successful participants were subsequently allocated to two treatment groups offered different monetary incentives for maintaining the previously achieved target weight and to a control group. This is the first experiment of this kind that finds sustainable effects of weight loss rewards on the body weight of the obese even 18 months after the rewards were removed. Additional incentives to maintain an achieved body weight improve the sustainability of weight loss only while they are in place.
This paper is a product of the Impact Evaluation Team, Development Research Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at areichert@worldbank.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. 
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Small Cash Rewards for Big Losers: Experimental Insights into the Fight against the Obesity Epidemic
Boris Augurzky, Thomas K. Bauer*, Arndt R. Reichert, Christoph M. Schmidt, Harald Tauchmann
JEL Codes: I12, I18, D03, C93
Keywords: field experiment, weight cycling, sustainability, incentives
* Boris Augurzky, RWI; Thomas K. Bauer, RWI, Ruhr‐Universität Bochum, IZA Bonn; Arndt R. Reichert, The World Bank, RWI; Christoph M. Schmidt, RWI, Ruhr‐Universität Bochum, IZA Bonn, CEPR London; Harald Tauchmann, RWI, CINCH. – The authors are grateful to the “Pakt für Forschung und Innovation” for funding, and to Rüdiger Budde, Viktoria Frei, Karl‐Heinz Herlitschke, Klaus Höhner, Julia Jochem, Mark Kerßenfischer, Lionita Krepstakies, Claudia Lohkamp, Thomas Michael, Carina Mostert, Stephanie Nobis, Margarita Pivovarova, Gisela Schubert, and Marlies Tepaß for research assistance. We also gratefully acknowledge the comments and suggestions of Alec Brandon, Alfredo Paloyo, one anonymous World Bank Research Group staff member, and participants of the research workshop “Empirical Economics” at the Ludwig Maximilian Universität München. We explicitly thank the medical rehabilitation clinics of the German Pension Insurance of the federal state of Baden‐Württemberg and the association of pharmacists of Baden‐Württemberg for their support of and dedication to this experiment. 
In particular, we acknowledge the support of Susan Eube, Michael Falentin, Ina Hofferberth, Marina Humburg, Silke Kohlenberg, Thomas Krohm, Max Lux, Tatjass Meier, Monika Reuss‐Borst, Constanze Schaal, and Wolfgang Stiels. – All correspondence to Thomas Bauer, RWI, Hohenzollernstr. 1‐3, 45128 Essen, Germany, E‐Mail: Thomas.Bauer@rwi‐ essen.
1. Introduction
People often make individual choices which differ from those that would maximize social welfare and even their own long‐run utility. Monetary incentives that seek to change this kind of behavior have become increasingly popular. In fact, behavioral interventions across a wide range of areas – from contributions to public goods to education and health – nowadays include financial incentive schemes. The emerging literature predominantly concludes that financial incentives for healthy behavior appear to be effective in the short run. Yet, it is unclear whether induced changes disappear or even reverse when incentives are removed. There are two conflicting theories concerning the long‐run effects of cash rewards: The “motivation crowding theory” claims that monetary incentives may reduce helpful behavioral motives, such as intrinsic motivation. Therefore, monetary incentives may even increase unhealthy behavior through, for instance, the signal that changing current behavior is difficult or not attractive. The alternative hypothesis (“habit‐formation”) explains a sustained change in habitual behavior by a positive correlation between past and current consumption, i.e., the development of behavioral automaticity.1 In this paper, we test whether financially induced healthy behavior is sustainable, which would provide strong evidence in favor of the habit formation theory (cf. Charness and Gneezy, 2009). Using data from a randomized experiment, we investigate the long‐run effects of financial incentives for weight reduction of obese people. 
Targeting body weight is particularly interesting in the context of habit formation because many obese individuals fail in their weight‐loss attempts and the majority among those who succeed soon regain weight (Crawford et al., 2000). Furthermore, we test whether monetary incentives can make a change in habitual behavior sustainable. Stretching short‐run success may be important for the sustainability of the effects because habit formation may take time. This is especially relevant for eating habits (and perhaps exercise habits) given the widespread phenomenon of weight cycling, i.e., repeated loss and regain of body weight. To examine the impact of financial incentives to sustain behavior change, we estimate the effect of monetary incentives for maintaining a previously achieved target weight on the change of body weight over time. The importance of finding effective means to fight the obesity pandemic is difficult to overestimate. The obesity pandemic is one of the major health problems of developed economies and is associated with considerable economic costs. Obesity increases morbidity, reduces life expectancy, and lowers life satisfaction (for a comprehensive overview, see Sassi, 2010). Through negative effects on the probability of being employed (Morris, 2007; Reichert, 2015) and wages (Han et al., 2009) as well as positive effects on the risk of early retirement (Houston et al., 2008), costs of absenteeism (Cawley et al., 2007) and lifetime health care expenditures (Bhattacharya and Sood, 2011), obesity represents a significant burden for welfare systems.
1 See Gneezy et al. (2011) for a review of the literature.
The experiment conducted to address these questions was administered between spring 2010 and summer 2013 and involved 700 participants of four medical rehabilitation clinics. 
Two randomly assigned treatment groups were first offered EUR 150 (USD 188 in PPP) or EUR 300 (USD 376 in PPP) for achieving an individually assigned contractual target weight loss between 6 and 8 percent within 4 months.2 After completion of the weight‐loss phase, participants who had achieved at least 50 percent of the contractual weight loss were randomly assigned to three experimental groups, again two treatment groups and a control group. Individuals assigned to the treatment groups were promised EUR 250 (USD 313 in PPP) or EUR 500 (USD 627 in PPP) for maintaining a body weight below the target weight 10 months after enrollment. Body weight was measured once more at the end of a 12‐month period following the end of the second intervention. As documented elsewhere (Augurzky et al., 2012), we found strong effects of monetary incentives for weight loss 4 months after the start of the experiment. In this paper, we document that, even though the treatment groups partially regain weight after removal of the incentives, the weight‐loss effects are persistent. This is the first experiment involving monetary rewards that reports lasting effects on body weight. Promising successful “losers” an additional cash reward to keep the healthier body weight also appears to be an effective strategy, at least in the short‐run. While the control group of the second intervention significantly regained weight, both treatment groups were similarly successful in preventing large weight regains, i.e., the “yo‐yo effect”. In the long‐run, i.e., after 12 months, however, these differences were no longer observable. The results are robust against a series of sensitivity tests. For instance, the estimated effects of monetary incentives on weight loss are robust with respect to non‐random sample attrition. 
Moreover, results remain unchanged when controlling for variables that describe the condition at the control weigh‐in, capturing possible ways of how participants may influence their measured body weight other than through weight loss. Hence, potential strategic behavior of treated participants to achieve their targets is not able to explain the effects. The results of our analysis suggest that motivation crowding does not play a dominant role for the effects of monetary incentives on weight loss because we do not observe that they worsen the degree of obesity in the longer term. This would be expected in the presence of sizable negative effects on intrinsic motivation once the opposing relative price effect of monetary rewards stops due to incentive removal. The finding of sustainable effects instead provides evidence in favor of the habit formation theory, or at least suggests that habit formation dominates motivational effects. However, the results concerning the effects of financial incentives to maintain a previously achieved target weight are not unambiguous. Several explanations are discussed.
2 We use the purchasing power parities exchange rate of 2011 provided by OECD (2012).
This paper adds to a small but growing literature on the long‐run effects of monetary incentives to encourage health preventive behavior. Prominently, Charness and Gneezy (2009) report that financial incentives to exercise are successful in creating a positive habit in people who formerly did not regularly exercise. Acland and Levy (2015) find increased exercise levels after the removal of a monetary incentive but show that people eventually give up the acquired habit a few months later. These studies focus on inputs to weight loss as outcome variable as opposed to weight loss itself. 
The rationale for directly tying monetary rewards to weight loss is to provide individuals the option to choose the means of losing weight, which arguably leads to better weight‐loss results because individuals can combine inputs and use private information to optimize the mix of inputs. Experimental studies on smoking cessation (e.g., Volpp et al., 2006) and weight loss (e.g., Volpp et al., 2008) do not find that monetarily induced lifestyle changes are sustainable in the sense that people exhibit improved behaviors even after incentive removal. We further contribute to the recent but small literature on financial incentives to sustain healthy behavior. In a randomized experiment, Volpp et al. (2009) examine monetary rewards for completion of a smoking‐cessation program, smoking cessation, and, importantly, continued abstinence from smoking, finding that treated participants were significantly more likely to quit smoking and less likely to relapse. A limitation of the study is that the causal effect of monetary rewards for continued smoking abstinence cannot be separated from the long‐term effects of participation in a smoking‐cessation program and financial incentives for smoking cessation.3 Royer et al. (2015) analyze the effectiveness of a self‐funded commitment contract to improve the lasting effect of monetary incentives for exercise. The participants of the experiment were encouraged to deposit money, which was refundable contingent on the continuation of regular exercise. This approach has been previously examined in a randomized experiment on weight maintenance after substantial weight loss by Kramer et al. (1986). Participants in the treatment group paid a deposit of USD 120, which they were refunded conditional on not regaining weight within one year, and attended several discussion meetings about weight‐maintenance progress and problems. While Royer et al. (2015) find that deposit contracts produced sustainable effects, Kramer et al. 
(1986) do not observe significant differences in weight development between the treatment and a control group. Since the participants in Kramer et al. (1986) not only received financial incentives but also interacted with each other during discussion meetings, the effect of the deposit alone cannot be identified.4
3 It is evident that smoking cessation programs still have an effect several months after their completion (Zhu et al., 2000; Quist‐Paulsen and Gallefoss, 2003). Similarly, financial incentives for smoking cessation may have lasting effects. Volpp et al. (2006), for instance, find a positive – yet insignificant – long‐term effect of monetary rewards for attendance of a smoking cessation program and for smoking cessation. They cannot reject these effects due to lack of statistical power.
Against this background, our paper contributes to the existing knowledge on monetary rewards for sustained health‐related behavioral change by singling out their causal effect from other factors that may confound existing results, as in Volpp et al. (2009) and Kramer et al. (1986). In doing so, we concentrate on “carrots” rather than “sticks” as compared to Royer et al. (2015), because carrots seem to be more relevant in the face of a remarkable tendency in modern legal systems to increasingly use carrots (De Geest and Dari‐Mattiacci, 2013). Moreover, we focus on obese individuals, who – compared to already healthy‐weight people – may respond differently to behavioral interventions due to self‐control problems. The remainder of this paper is organized as follows. The subsequent section describes the experimental design and provides some descriptive statistics of the participants. Section 3 discusses the estimation strategy, while Section 4 presents the estimation results. Section 5 concludes.
2. The Experiment
2.1. 
Experimental Design and Implementation
In order to analyze whether monetary incentives are an effective means for obese people to lose weight and sustain their new weight, we conducted an experiment in cooperation with the association of pharmacists of Baden‐Württemberg and four medical rehabilitation clinics operated by the German Pension Insurance of the federal state of Baden‐Württemberg. The project has been funded by the joint initiative for research and innovation (Pakt für Forschung und Innovation), which is part of the excellence initiative of the German government. Obese patients of the four rehabilitation clinics were invited to participate in the experiment in the final week of their rehabilitation stay, which included a weight‐loss program that varied from clinic to clinic. Only patients who had a BMI above 30 at admission, were between 18 and 75 years old, and were registered residents of the German federal state of Baden‐Württemberg were eligible to participate in the experiment. Patients with considerable language barriers, psychological and eating disorders, a tumor disease within the last five years, a history of alcohol and drug abuse, and serious general diseases as well as patients who were pregnant were excluded from the experiment. All participants were informed about the procedures of the experiment by handouts, and clinic personnel gave personal instructions. The study protocol was approved by the ethics commission of the Chamber of Medical Doctors of Baden‐Württemberg.
4 Further limitations of the study are discussed in Paloyo et al. (2013).
Conditional on the agreement to participate in the experiment, the staff of the rehabilitation clinics conducted baseline measurements of several medical variables of the patients, such as the body‐mass index (BMI), blood glucose, and cholesterol level. 
Participants further answered a detailed questionnaire related to their socioeconomic background, additional health outcomes and preventive behavior. Moreover, the physician in charge assigned an individual weight‐loss target the participants were supposed to realize within four months after leaving the clinic. The target was chosen to lie between six and eight percent of the current body weight, which is about the critical threshold associated with beneficial health effects (Vidal, 2002). The experiment consisted of four phases (see Figure 1).5 After the discharge from the clinic (Rehabilitation Stay), participants entered the weight‐loss phase of four months, which was followed by a weight‐maintenance phase of 6 months and a 12‐month follow‐up phase. Two randomizations took place: one at the start of the weight‐loss phase and a second one at the start of the weight‐maintenance phase. The latter was not announced at the beginning of the experiment. Random assignment to the treatment and control groups in the weight‐loss phase took place after the discharge from the clinic. Stratified randomization by the clinics was carried out without replacement within blocks of 51 participants. Based on this randomization procedure, the participants were assigned to one of three groups with equal probability: either the control group or one of the two treatment groups. While members of the control group were not promised to receive any reward for achieving their weight‐loss target, members of the treatment groups were promised to receive up to EUR 150 (henceforth called Group 150) or EUR 300 (henceforth called Group 300).6 All successful participants (irrespective of group assignment in the weight‐loss phase) were randomized again at the start of the weight‐maintenance phase. Participants were considered as successful if their achieved weight loss exceeded 50 percent of the targeted weight loss. 
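The first randomization described above can be illustrated with a minimal sketch. It assumes that "without replacement within blocks of 51" means that each clinic draws from shuffled blocks holding 17 lots per group; the function names, group labels, and seed handling are hypothetical and not taken from the study protocol.

```python
import random

# Hypothetical labels for the three groups of the weight-loss phase.
GROUPS = ("control", "group_150", "group_300")


def assign_block(n_participants, rng, block_size=51):
    """Assign the participants of one clinic using shuffled blocks.

    Assumption: each block contains an equal number of lots per group
    (17 each for a block of 51), drawn without replacement.
    """
    if block_size % len(GROUPS) != 0:
        raise ValueError("block size must be divisible by the number of groups")
    assignments = []
    while len(assignments) < n_participants:
        block = list(GROUPS) * (block_size // len(GROUPS))
        rng.shuffle(block)  # drawing lots without replacement
        assignments.extend(block)
    # The last block of a clinic may be only partially used.
    return assignments[:n_participants]


def stratified_assignment(clinic_sizes, seed=0):
    """Run the block randomization separately for every clinic (stratum)."""
    rng = random.Random(seed)
    return {clinic: assign_block(n, rng) for clinic, n in clinic_sizes.items()}
```

Under these assumptions, every completed block of 51 contains exactly 17 members of each group, so group sizes can differ only within the last, partially filled block of a clinic.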
Randomization (without replacement and stratification by the clinics) was used to produce three weight‐maintenance experimental groups with equal shares of participants. In this phase of the experiment, two weight‐maintenance premium groups were promised to receive up to EUR 250 (henceforth called Group 250) or EUR 500 (henceforth called Group 500). Participants assigned to the control group were not informed about weight‐maintenance randomization. All participants were told to ensure that their weight does not exceed the individually assigned target weight during the weight‐maintenance and follow‐up phases. Members of the premium groups were paid the full bonus if they had reached or even exceeded their weight‐loss target at the end of the respective phase. If they fell short of the target but their achieved weight loss exceeded 50 percent of it, they were rewarded in proportion to the share of the target achieved.
5 We present a flow chart in Figure A1 in the Appendix.
6 The premium levels and the length of the treatment period are in the range of previous studies. Jeffery (1983), for instance, used premiums of US$ 30, US$ 150, and US$ 300, which correspond in terms of PPP to EUR 54, EUR 272, and EUR 544 in prices of 2011 (converted into present values of EUR based on the US consumer price index and the PPP exchange rate of 2011 provided by OECD (2012)).
As an example, consider a participant with an initial body weight of 120 kg (264.5 lbs.) and a target weight loss of 8.4 kg (18.5 lbs.) who loses 6 kg (13.2 lbs.) within four months and is able to maintain the reduced body weight during the weight‐maintenance phase. As a member of the control group in both phases, she receives no premium. As a member of the treatment group in the weight‐loss phase, she obtains EUR 107 (USD 134 in PPP) in Group 150 and EUR 214 (USD 268 in PPP) in Group 300. 
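The premium rule behind this example can be written down as a short sketch. It assumes that a weight loss of exactly 50 percent of the target earns nothing (the text says the loss must "exceed" 50 percent) and that amounts are rounded to whole euros only for reporting; the function name and signature are hypothetical.

```python
def premium(weight_loss_kg, target_loss_kg, max_premium_eur):
    """Premium in EUR for a given weight loss, target, and maximum premium.

    Assumed rule: full premium at or above the target, a proportional
    premium above 50 percent of the target, and nothing otherwise.
    """
    if target_loss_kg <= 0:
        raise ValueError("target weight loss must be positive")
    share = weight_loss_kg / target_loss_kg
    if share >= 1.0:
        return float(max_premium_eur)
    if share > 0.5:
        return share * max_premium_eur
    return 0.0


# Worked example: a loss of 6 kg against a target of 8.4 kg is about 71 percent.
print(round(premium(6.0, 8.4, 150)))  # Group 150
print(round(premium(6.0, 8.4, 300)))  # Group 300
```

With a share of 6/8.4 ≈ 0.714, the sketch returns about EUR 107 for Group 150 and EUR 214 for Group 300, matching the amounts in the example.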
At the end of the weight‐maintenance phase, she receives another EUR 179 (USD 224 in PPP) or EUR 357 (USD 447 in PPP), depending on being a member of Group 250 or Group 500. In contrast, if she loses only 4.1 kg (9 lbs.), i.e., less than 50 percent of her target, she receives no reward regardless of her group assignment and phase. For a weight loss of 8.4 kg (and maintaining the target weight later on), she receives the full group‐specific premiums. Participants were informed by regular mail about their maximum possible premium (does not apply to members of the control groups) and of the week they were supposed to attend the weigh‐in at a pharmacy.7 Since the participants spent the intervention periods outside the medical rehabilitation clinic, interactions between participants are very unlikely. Thus, we do not expect a perception of unfairness that may be associated with randomization. Most importantly, control group participants should not be affected by the treatment status of other participants.8 We asked participants with any health complaints throughout the experiment to consult their general practitioner or the rehabilitation clinic. Two weeks prior to the end of each experimental phase, a reminder for the control measurement of the body weight was sent to the participants. The letter contained a questionnaire with the same set of questions on time‐varying variables as the one collected at the initiation of the experiment. In order not to rely on self‐reported weight, the reminder indicated to the participants a nearby pharmacy for the control measurement. Pharmacies had been called by the project staff beforehand in order to ask for participation. By assigning participants to specified pharmacies, we ruled out that members of the treatment group went from one pharmacy to the other in order to take advantage of probable measurement errors of the scales, i.e., strategic behavior to achieve their targets. Attrition from the experiment occurred in two ways. 
First, some participants left the experiment by actively canceling their participation. Second, a larger number did not send the required documents at the end of an experimental phase. To reduce sample attrition, all participants whose documents were still pending three working days after the specified week were contacted by phone. We encouraged them to make up the missed weigh‐in and to send in the documents. All participants received EUR 25 (USD 31 in PPP) if they sent in the documents, regardless of weight‐loss success and group assignment. The premiums were still paid if the date of measurement indicated by the pharmacist was within 14 days after the end of the scheduled weigh‐in week.
7 Participants could postpone the date of measurement or move it forward by means of an early phone call.
8 See Angrist and Lavy (2009) for a similar argument in the context of a within‐school randomized trial.
2.2. The Participants
The recruitment of a total number of 700 participants took place between March 2010 and August 2011. Five individuals had to be excluded from the trial because of a missing consent form, becoming pregnant, developing cancer, and internal documentation problems.9 The last participant finished the follow‐up phase by the end of July 2013. Table 1 provides some descriptive statistics of the study population. The average body weight at the start of the experiment (at the end of the rehabilitation stay) was 113.0 kg (249.4 lbs.) or 37.6 in terms of BMI (Figure A1 shows the distribution of the BMI over time). About 68 percent of the participants are men and 21 percent have a migration background.10 For most patients of the cooperating clinics, medical rehabilitation is paid by the German pension fund, whose predominant goal is to avoid work disability and early retirement. 
Since there are many obese people in the overall population who are already retired, our study population oversamples persons who are available for the labor market. As already noted, four rehabilitation clinics, located in different towns, have been involved in the trial. About 42 percent of the participants were recruited by the clinic in Bad Mergentheim, 33 percent in Bad Kissingen, 18 percent in Isny, and roughly 7 percent in Glottertal. The clinics in Bad Mergentheim and Isny primarily focus on orthopedic interventions, while the clinics in Bad Kissingen and Glottertal specialize in gastroenterology as well as endocrinology and patients with psychosomatic disorders, respectively. Many participants came to the clinics because of diagnoses other than adiposity although their symptoms are related to their body weight. All participants are medically indicated to lose weight.
3. Hypotheses and Methods
In our empirical analysis, we aim to estimate the causal effects of financial incentives on body weight. While short‐term effects are analyzed in detail by Augurzky et al. (2012), the present paper concentrates on the medium‐ and long‐term effects. Two main hypotheses are analyzed: (i) financial incentives for weight loss have an effect on body weight after their removal, i.e., post‐intervention effects, and (ii) monetary rewards for maintaining a previously achieved target weight prevent weight regain during the intervention and after the intervention has ended. In addition to the percentage change of the body weight of the participants as the main outcome variable, we consider a dummy variable which indicates whether the individually assigned target weight is met as the second outcome variable.
9 Results are robust with respect to treating these individuals as dropouts in sensitivity checks described in Section 3.
10 These shares are substantially lower than the corresponding averages for obese individuals in Germany, which we obtained using the German Socioeconomic Panel (SOEP) – a representative panel of German households. The mean age of the study population (48 years) lies about ten years below the average age of obese in Germany, while the share of employed participants (82 percent) is almost twice as large. Only the share of married participants does not deviate considerably from the respective share of obese in Germany.
Concerning the first hypothesis, we examine whether individuals who were exposed to financial incentives during the weight‐loss phase (Group 150 and Group 300) have lost more weight compared to the control group 10 and 22 months after the start of the experiment. If we find significant differences across the experimental groups in weight change between the start of the intervention period and 6 as well as 18 months after the intervention, previous estimates for the short‐run effects of monetary rewards for weight loss appear to be persistent. In order to examine the second hypothesis, we compare the mean weight loss over the weight‐maintenance phase between individuals who were promised rewards for maintaining a previously achieved target weight (Group 250 and Group 500) and the control group. Only individuals who were eligible for randomization prior to the weight‐maintenance phase, i.e., those who successfully reduced their body weight during the weight‐loss phase, are considered. This step of the empirical analysis also aims to address the question of whether the estimated short‐run effects of Intervention II are sustainable. Therefore, we investigate weight changes between months 5–10 and between months 5–22. Note that the longer period allows us to investigate the effects of the two monetary rewards after they have been removed for about one year. 
We further analyze heterogeneity of the effects of monetary rewards for maintaining a previously achieved target weight across (i) the degree of target weight achievement in the weight‐loss phase and (ii) treatment status in the first intervention. To investigate whether the estimated effects of the monetary incentives vary with the degree of target weight achievement, we conduct the analysis separately for participants who partially and who fully achieved their weight‐loss target in the weight‐loss phase. This analytical step addresses the question of whether it makes a difference for the effectiveness of the second intervention that individuals, who only partially achieved their target weight in the weight‐loss phase, have to continue to lose weight in order to obtain the full premium. Note that these individuals actually receive a hybrid reward: While they receive some money for maintaining the previously achieved body weight, they may obtain some additional money if they achieve their target weight in full. In relation to effect heterogeneity across the previous treatment status, we analyze the impacts of the second intervention separately for the members of the treatment group and the members of the control group of the first intervention. Effect heterogeneity across treatment status in the first intervention enables us to indirectly test the theory of motivation crowding out. Due to the eligibility criterion for the second intervention, all participants who were included in the second randomization are successful losers. That is, members of the control group of the first intervention that are eligible for the second intervention were successful in the absence of extrinsic rewards, i.e., they reduced their body weight based on intrinsic motivation alone. 
In contrast, successful members of the reward groups of the first intervention (Group 150 and Group 300) reduced their body weight based on either intrinsic motivation, extrinsic motivation or a combination of the two. According to the motivation crowding theory, monetary rewards for maintaining the previously achieved body weight should be less effective for members of the control group than for members of the treatment groups of the first intervention. In the latter group, there is simply less intrinsic motivation that can be destroyed by additional extrinsic rewards. Since the present analysis rests on data generated in the course of a randomized trial, simply comparing means across treatment and control groups in principle yields an unbiased estimate of the causal effect because randomization ensures that the different experimental groups differ only in terms of receiving the treatment. To address the sensitivity of our results with respect to random imbalance of individual characteristics and potential strategic behavior of participants to achieve their target, we rely on multivariate ordinary least squares regressions when considering the percentage change of body weight as the main outcome variable, and linear probability models for the binary indicator of whether the target weight has been achieved as alternative outcome variable. As covariates we include age, gender, month of recruitment for the experiment, and variables that relate to the weigh‐in at the pharmacies. Concerning the latter, we asked the pharmacists to indicate whether the participants’ last food intake was more than half an hour or more than two hours ago, whether they were wearing shoes (and if so whether these were heavy), a pullover, long trousers, and whether they attended the control weigh‐in within the specified time. 
An additional set of dummy variables captures whether participants attended the control weigh‐in prior to the specified date of measurement, within the right week (reference category), two weeks, three weeks, or more than three weeks after this date. Variables that describe the conditions at the control weigh‐in allow us to capture ways in which participants may influence their measured body weight other than through weight loss. This may be particularly relevant for the analysis of monetary rewards for maintaining a previously achieved target weight since, at the end of the second intervention, members of the treatment groups may behave strategically to achieve their target in order to increase their bonus.11 Except for the variables related to the weigh‐in, all variables enter the analysis as pre‐treatment values. Following a standard approach (e.g., Morris, 2006; Spenkuch, 2012), we deal with missing values in covariates by replacing them with zero and including additional dummy variables indicating missing values. Only the gender of the participants is imputed using predictions from a probit regression of the variable on relevant individual characteristics. Imputation is preferred to excluding observations with missing information because the latter would reduce the sample size substantially, despite the fact that the share of missing values is rather low for most covariates.

11 The same argument applies to the first intervention. Yet, here we focus on the effects after the weight‐loss rewards have been removed, i.e., members of the treatment groups do not have incentives to behave strategically.

When analyzing the first hypothesis, i.e., whether monetary incentives to lose weight have sustainable effects, we have to consider two potential selection problems that may bias our estimation results.
The first selection problem may arise because the design of the experiment involves financial incentives at different stages of the experiment. This challenges the isolation of lasting effects of the first intervention because successful participants in the weight‐loss phase were eligible to become members of the treatment group in the weight‐maintenance phase, whereas those who failed to reach the target weight were excluded from the second intervention. Provided that members of the premium groups (i.e., Group 150 and Group 300) were relatively more successful in the weight‐loss phase than the control group, and assuming effectiveness of the second intervention, a simple comparison of weight development across the experimental groups of the first intervention may yield biased estimates. Specifically, these comparisons will most likely exaggerate the effects because they partially capture the effects of both weight‐loss and weight‐maintenance rewards. For this reason, we exclude individuals who are promised monetary rewards for maintaining a previously achieved target weight. This, however, invokes another identification problem. Since eligibility for these rewards is endogenous, a comparison of weight development across the experimental groups of the first intervention may give a disproportionately high weight to individuals who failed to reach the target weight. A simple inverse‐probability weighting estimator, such as the one suggested by Wooldridge (2002), is able to solve this problem. The estimator weights observations in such a way that the original distribution of observations across the experimental groups of the first intervention is restored. Importantly, due to the experimental design, the probability that an individual is excluded from the sample is exogenous, i.e., the ignorability assumption is fulfilled. Moreover, we know the exact conditional selection probability.
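To illustrate the reweighting logic, the following minimal Python sketch computes an inverse‐probability‐weighted mean for one hypothetical first‐intervention group. The retention probability of one third for successful participants, the function name `ipw_mean`, and all data values are assumptions made purely for illustration; they are not taken from the experiment, and the estimator actually used is the one described in Section A.1 of the Appendix.

```python
def ipw_mean(outcomes, retention_probs):
    """Weighted mean with weights 1 / P(observation is retained in the sample)."""
    weights = [1.0 / p for p in retention_probs]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Assumed design probability that a successful participant stays in the
# estimation sample (i.e., is not assigned a weight-maintenance reward).
# Hypothetical value, chosen only for this illustration.
P_KEEP_IF_SUCCESSFUL = 1.0 / 3.0

# Hypothetical retained observations of one first-intervention group:
# (percentage change in body weight, target reached in weight-loss phase?)
retained = [(-6.0, True), (-1.0, False), (0.5, False), (-2.0, False)]

outcomes = [change for change, _ in retained]
# Unsuccessful participants are never excluded, so their probability is 1.
probs = [P_KEEP_IF_SUCCESSFUL if success else 1.0 for _, success in retained]

naive = sum(outcomes) / len(outcomes)   # over-represents unsuccessful participants
weighted = ipw_mean(outcomes, probs)    # restores the original group composition
```

In this made‐up example, the single retained successful participant stands in for three, so the weighted mean shows more weight loss than the unweighted mean.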
Knowing this probability means that we have the correct weight for each observation or, in other words, as opposed to standard observational studies, we do not need to estimate it. We refer to Section A.1 in the Appendix for a more technical description of the problem and the estimation method.

The second selection problem may arise because of non‐random attrition from the experiment. As mentioned in Section 2.1, several participants dropped out of the experiment despite substantial efforts to keep attrition rates low. In detail, from the initial 695 participants 177 dropped out during the weight‐loss phase, another 106 did not attend the weigh‐in at the end of the weight‐maintenance phase, and an additional 96 participants dropped out during the follow‐up phase. If sample attrition was random, our estimates for the effects of the financial incentives would be unbiased. While this seems to be the case when analyzing the second hypothesis because members of the weight‐maintenance premium groups do not have significantly lower attrition rates than members of the control group (see Table 5),12 we are concerned that the termination of participation is endogenous in the first intervention. The reason is that we observe lower cumulative attrition rates 10 and 22 months after the start of the experiment for members of the weight‐loss premium groups relative to members of the control group (see Table 2). This difference seems puzzling as participants have no financial motives for continuing the experiment at this stage, the weight‐loss rewards having already been removed. Before that, however, after 4 months, weight‐loss premium groups had financial motives to stay in the experiment because they would receive a premium if they could prove weight‐loss success. Indeed, Group 300 continued with the experiment significantly more often than the control group.
The lower attrition rate apparently carried forward, which can be explained by the experimental protocol: participants who had once dropped out of the trial were no longer followed up. Our experimental results may be biased in the presence of non‐random sample attrition. To address this problem, we use several estimation methods in the analysis of the first hypothesis. The first approach aims to address the selection problem by using self‐reported information on body weight. Individuals with pending documents were called by phone and asked to make up for the weigh‐in at the pharmacy. In the course of the phone call, these participants were also asked about their current body weight. We assume that participants had no financial incentive to misreport body weight, as monetary rewards for weight loss were no longer promised (and participants who belong to the premium groups in the weight‐maintenance phase are excluded from the analysis). Using self‐reported weight for individuals without ordinarily measured weight information substantially increases the number of observations and, by implication, reduces the attrition rate in the estimation of the incentive effects.

In addition to using self‐reported body weight, we apply two methods which estimate the treatment effects under extreme assumptions about the distorting effect induced by non‐random sample attrition. The first method, the intention‐to‐treat approach,13 represents best practice in the medical literature. This approach aims to consider all participants in the analysis irrespective of whether they actually dropped out from the trial. More specifically, the method imputes missing information, though no consensus on the imputation algorithm exists (Hollis and Campbell, 1999). Like most medical studies, we make the assumption that the body weight of dropouts remained at the baseline level, i.e., we assume zero reduction in body weight.
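The imputation rule just described can be sketched in a few lines of Python. The function name `itt_mean_change` and the data are hypothetical and serve only to show how setting the weight change of dropouts to zero pulls the group mean toward zero relative to an analysis of completers only.

```python
def itt_mean_change(pct_changes):
    """Mean percentage weight change; dropouts (None) are imputed as 0.0,
    i.e., their body weight is assumed to have stayed at the baseline level."""
    imputed = [0.0 if change is None else change for change in pct_changes]
    return sum(imputed) / len(imputed)


# Hypothetical group of four participants, two of whom dropped out (None).
group = [-4.0, None, -2.0, None]

observed = [change for change in group if change is not None]
completers_mean = sum(observed) / len(observed)  # ignores dropouts
itt_mean = itt_mean_change(group)                # keeps all randomized participants
```

With these made‐up values, the completers‐only mean is a 3.0 percent loss, whereas the intention‐to‐treat mean is a 1.5 percent loss.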
12 In order to strengthen our argument that the type of attrition does not differ between control and treatment groups, we display pre‐treatment covariates for all three experimental groups in the selected sample after attrition in Table A1 in the Appendix. Moreover, point estimates for the second intervention are robust with respect to sample attrition, corroborating that sample selection is no major issue here (results are available upon request).

13 In the present context, rather than selection into treatment, selection into the estimation sample is the relevant problem. Yet, we stick to this terminology as it is standard in the medical literature (see, e.g., Hollis and Campbell, 1999).

The assumption of zero weight change among dropouts is not perfectly consistent with our data, as in any phase of the experiment, there are individuals who reduce weight and individuals who gain weight.14 Finally, we rely on the trimming procedure proposed by Lee (2009) to obtain bounds for the estimated treatment effects under extreme assumptions for the described selection process. This procedure trims the distribution of the outcome variable for the experimental group (treatment or control) that suffers less from sample attrition, i.e., the group that has relatively more participants with information on the outcome variable (“excess observations”), at the quantile that corresponds to the share of excess observations in this group. Then, the difference in means between the trimmed sample of one group and the untrimmed sample of the other group yields the estimated treatment effect bound. When applying this method, we assume that the “excess observations” in one group are those with either the most favorable or the least favorable weight development. This yields a lower and an upper bound of the treatment effect, respectively, depending on whether trimming is from below or above.
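The trimming step can be sketched as follows. This is a simplified illustration in Python, not the estimator used for Table 4: it trims a whole number of observations rather than at an exact quantile, it computes no standard errors, and the function name `lee_bounds` and all numbers are hypothetical. Outcomes are expressed as percentage weight loss, so larger values are more favorable.

```python
def lee_bounds(treated, n_treated, control, n_control):
    """Return (lower, upper) bounds on mean(treated) - mean(control).

    treated, control: observed outcomes of participants who did not drop out.
    n_treated, n_control: number of participants originally assigned.
    """
    def mean(values):
        return sum(values) / len(values)

    rate_t = len(treated) / n_treated   # share with observed outcome
    rate_c = len(control) / n_control

    if rate_t >= rate_c:
        # Treated group has excess observations: trim it.
        share = (rate_t - rate_c) / rate_t
        k = round(share * len(treated))
        s = sorted(treated)
        lower = mean(s[:len(s) - k]) - mean(control)  # drop k most favorable
        upper = mean(s[k:]) - mean(control)           # drop k least favorable
    else:
        # Control group has excess observations: trim it instead.
        share = (rate_c - rate_t) / rate_c
        k = round(share * len(control))
        s = sorted(control)
        lower = mean(treated) - mean(s[k:])           # drop k least favorable
        upper = mean(treated) - mean(s[:len(s) - k])  # drop k most favorable
    return lower, upper


# Hypothetical data: 10 participants assigned per group,
# 8 treated and 6 control participants observed at the weigh-in.
treated_obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
control_obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
low, high = lee_bounds(treated_obs, 10, control_obs, 10)
```

In this made‐up example, the untrimmed difference in means is 1.0, while trimming the two excess observations in the treated group from above or from below yields bounds of 0.0 and 2.0.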
Following Lee (2009), the trimming procedure provides bounds for the average treatment effect among always compliers.

14 A similar problem applies to the bounds proposed by Horowitz & Manski (2000).

4. Estimation Results

This section first presents results on the medium‐ and long‐run effects of monetary incentives to lose weight before discussing the impact of monetary rewards for maintaining a previously achieved target weight.

4.1. Effect of monetary incentives for weight loss

Validity of the randomization procedure

Before investigating the effects of the weight‐loss premiums, we give reassurance that the randomization procedure to allocate the participants to the control and the two treatment groups worked properly and that the inverse‐probability weighting estimator restores the original distribution of observations across weight‐loss experimental groups. The upper panel of Table 2 provides an overview of relevant individual characteristics for the population used in the analysis of the first intervention (Column 1) and each experimental group separately (Columns 2‐4). Most variables appear to be balanced between the experimental groups, including body weight at the start of the medical rehabilitation stay and at the baseline measurement. Average target weight loss within the first four months of the trial amounts to 6.5 percent, which is well above the critical threshold for health improvements in the obese of 5 percent (Vidal, 2002). The lower panel of Table 2 describes the percentage weight loss of the different experimental groups during the first four months of the experiment. As previously reported in Augurzky et al. (2012), all experimental groups were able to reduce their body weight. Weight loss of the control group may be attributable to lasting effects of the clinic weight‐loss program or the effect of receiving a specified weight‐loss target by a physician.
Those individuals who were offered a premium lost on average significantly more weight than the control group. Likewise, they were more likely to achieve the individually assigned target weight. Group 300 was significantly less likely to drop out of the experiment after four months. The same attrition pattern is observable at the end of the weight‐maintenance and follow‐up phases, motivating the use of methods that deal with non‐random sample selection in sensitivity checks.

Body Weight After 10 Months (6 Months After Intervention I Ended)

At the end of the weight‐maintenance phase, all experimental groups on average weighed less than at the start of the experiment. As shown in the upper panel of Table 3 (Columns 1‐3), the control group, Group 150, and Group 300 lost about 1.3, 2.5, and 4.2 percent, respectively. Weight loss is only statistically significant in the two treatment groups. The differences between each treatment group and the control group amount to 1.1 and 2.8 percentage points (Columns 4‐5). Weight loss of Group 300 (but not of Group 150) was significantly higher than that of the control group. We do not observe any statistically significant difference in weight loss across the two treatment groups. Pooling both treatment groups together yields a difference to the control group of about 2.1 percentage points, which is significant at the 7 percent level (not displayed in the table). Figure 2 shows the distribution of weight loss by experimental group, indicating that the effects of the monetary incentives are not primarily due to a small number of participants with very large changes in body weight. Considering the binary indicator for target weight achievement, the point estimates indicate that both treatment groups were more likely to be successful than the control group.
On average, Group 150 and Group 300 had a 5.5 and 3.2 percentage point higher probability of achieving their individual target weight than the control group. Remarkably, the share of successful participants is lower in Group 300 than in Group 150. Note, however, that these differences appear to be statistically insignificant.

Body Weight After 22 Months (18 Months After Intervention I Ended)

A similar pattern can be observed at the end of the follow‐up phase, i.e., 18 months after the first intervention (lower panel of Table 3). While weight loss diminishes in all experimental groups (Columns 1‐3), weight loss appears to be still larger in the two treatment groups. The difference between Group 150 and the control group remains roughly the same as after 10 months. The difference between Group 300 and the control group increases from 2.8 to 3.3 percentage points and is statistically significant (Column 5). The inter‐incentive‐group differential remains statistically insignificant. The difference between both treatment groups pooled together and the control group amounts to 2.3 percentage points (p‐value of 0.11, not displayed in the table). Figure 3 illustrates again the distribution of weight loss by experimental group. The intergroup differences in the share of participants who achieve their individual target weight increase from month 10 to month 22. Group 150 and Group 300 now have an 8.4 and 6.5 percentage point higher probability of being successful than the control group. Again, the share of successful participants is lower in Group 300 than in Group 150, but these differences are statistically insignificant.

Sensitivity Analysis

The estimation results appear to be robust when individual characteristics and variables related to the weigh‐in are taken into account in a multivariate regression.
The coefficients for the binary indicator for Group 300 are statistically significant at the 2 and 6 percent level after 10 and 22 months, respectively (Table A2 in the Appendix). Only the coefficient of the indicator for Group 150 turns positive after 22 months. The coefficient of a pooled indicator for both treatment groups is statistically significant after 10 months and insignificant after 22 months (p‐value of 27 percent, results not displayed in the table). The estimation results are also robust concerning the alternative outcome variable “target weight realized”. Results for the three different approaches that deal with non‐random sample attrition are presented in Table 4. Including observations with self‐reported information on body weight into the analysis does not alter the results remarkably (Columns 1‐2 of Table 4). The differences between Group 300 and the control group are significant at the 6 and 2 percent levels after 10 and 22 months, respectively. At 22 months, there is a weight‐loss differential among the two premium groups that is significant at the 8 percent level. Results from simple comparisons of group means are also confirmed if the two treatment groups are pooled together (not displayed in the table). Considering the secondary outcome variable, we find qualitatively the same results at the end of the follow‐up period as before. Yet, at 10 months, both treatment groups are only about as successful as the control group. In general, the intention‐to‐treat analysis also confirms the basic results (Columns 3‐4 of Table 4). The differences in weight loss between Group 300 and the control group after 10 and 22 months are statistically significant. Also Group 150 has a higher average weight loss than the control group. However, the differences after 10 and 22 months are not statistically significant. 
The differences in weight loss between Group 300 and Group 150 are statistically significant at the 6 percent level after 10 months and at the 11 percent level after 22 months. The intention‐to‐treat analysis yields significant differences for the pooled treatment group after 10 and 22 months (p‐values of 2 and 7 percent). In contrast, the trimming procedure proposed by Lee (2009) does not unambiguously confirm our previous results (Columns 5‐6 of Table 4) because the conservative lower effect bounds are statistically insignificant.15 Results for the secondary outcome variable are qualitatively the same.

Discussion

Overall, our results suggest that monetary incentives for weight loss have sustainable effects on the body weight of obese individuals. Even 6 and 18 months after the monetary incentives have been removed, our results show a statistically significant effect of about 2.8 and 3.3 percentage points, respectively, for the EUR 300 reward. The lower reward is also positively related to weight loss. This effect, however, appears to be insignificant at the end of both post‐treatment periods. Furthermore, there is no statistically significant difference between the effects of the higher and the lower reward. If we pool both treatment groups together, we observe a significant treatment effect at significance levels of around 10 percent. These results are robust against a series of robustness checks. Only the estimated Lee‐bounds do not unambiguously confirm the general finding of lasting effects of financial incentives due to the statistical insignificance of the lower absolute estimated effect bound. Yet, we argue that the lower bound estimate is not an appropriate benchmark for assessing what the true value of the effect might be.
It just indicates that an extremely unfavorable scenario about the pattern of experiment dropout (excess observations are those with the most favorable weight development after the intervention has ended) is still consistent with the observed data. Such an unfavorable scenario, however, is difficult to justify because successful members of the reward groups do not have stronger financial incentives to continue the experiment provided that weight loss no longer implies any additional payments. Though the insignificance of the lower bound estimate does not allow for confirming a lasting incentive effect if taking the most conservative perspective possible, it provides little ground for challenging the less conservative approaches discussed above.16 Our finding that the higher reward does not cause significantly more weight loss after 10 and 22 months relative to the lower reward is sensitive with respect to both including covariates and accounting for non‐random sample attrition. Hence, we cannot rule out that lasting effects increase with reward size. It is important to mention that the estimated treatment effects after pooling the two treatment groups together also survive various sensitivity checks.

15 In the case of the lower estimated Lee‐bound, the sign of the coefficient suggests lower weight loss of both treatment groups relative to the control group at the time of the post‐treatment weigh‐ins. In contrast, the statistically significant optimistic effect bounds (not displayed in the table) point to the possibility of a highly increased weight loss in both treatment groups relative to the control group. For instance, the estimated upper absolute Lee‐bound indicates an increased weight loss by 2.1 and 5.1 percentage points due to the EUR 150 and EUR 300 reward, respectively.
16 Corroborative findings from Lee‐bounds would have made a very strong case for the robustness of our results against sample attrition due to the extreme underlying scenarios. We therefore included the analysis in our pre‐protocol, consequently showing the results in the paper.

The results for the first intervention tend to support the habit formation theory. Even though treated participants slightly regain weight after incentives are removed, we observe a lasting positive effect on weight loss if compared to the baseline weight. The development of a behavioral automaticity that operates against the general tendency of weight regain is best able to explain the results. In turn, our results argue against monetary rewards crowding out intrinsic motivation in the present study population. It may well be that motivational effects are also present, but they seem not to be large enough to notably oppose the beneficial effect of developed habits. Our findings are in line with those obtained by Charness and Gneezy (2009). They show that financial incentives to exercise have lasting positive effects, arguing that there is scope for monetary intervention in health‐related habit formation. While previous experimental studies on financial incentives for weight loss, such as Volpp et al. (2008), do not find evidence of a backfire effect either, we are the first to report a positive effect of monetary rewards for weight loss after the intervention has ended.

4.2. Effect of monetary incentives to maintain a previously achieved target weight

Validity of the randomization procedure

The upper panel of Table 5 presents descriptive statistics for the study population used in the analysis of the second intervention (Column 1) and each experimental group (Columns 2‐4). Except for the indicator for the clinic in Bad Mergentheim, all variables appear to be balanced between the different experimental groups.
Most importantly, body weight at the start of the medical rehabilitation stay, at the start of the first intervention, and after 4 months is uncorrelated with treatment (see middle panel of Table 5). The average participant in the second intervention lost about 7.3 percent of the original weight during the weight‐loss phase. About 59 percent of the participants achieved their individually assigned target weight after 4 months. In the lower panel of Table 5, we show the average attrition rates in our sample. Attrition among participants of the second intervention is lower than among participants of the first intervention (see Table 2). Yet, we do not observe any structural attrition pattern for the second intervention, i.e., the treatment groups are not significantly more likely to comply than the control group. For this reason, we abstain from an extensive discussion of the sensitivity of the results for the second intervention with respect to sample attrition (see Section 3).

Body Weight After 10 Months (Period of Intervention II)

While members of the control group had reduced their body weight by 7.7 percent during the weight‐loss phase (Table 5), they significantly regained about 2.8 percent during the weight‐maintenance phase (see Table 6, Column 1). The treatment groups, which similarly reduced their body weight in the first four months, in contrast, did not significantly regain weight during the intervention period (Table 6, Columns 2‐3). Group 250 slightly lost further weight and Group 500 regained only roughly 0.8 percent. These weight changes translate into significant differences between both treatment groups and the control group (Table 6, Columns 4‐5). Weight change in Group 250 and Group 500 was 2.9 and 2.0 percentage points more favorable than in the control group. There is no statistically significant difference between the two treatment groups.
As a matter of course, the difference in weight change between both treatment groups pooled together and the control group is significant. Figure 4 displays the distribution of weight change during the intervention period by experimental group. Concerning the secondary outcome, we find that Group 250 and Group 500 were about 16 and 18 percentage points more likely to maintain their weight‐loss target than the control group, respectively. Hence, Group 500 was, on average, about 2 percentage points more likely to be successful than Group 250. While the differences to the control group are statistically significant, we do not find a significant difference between the two reward groups.

Body Weight After 22 Months (12 Months after Intervention II Ended)

At the end of the follow‐up period, pronounced weight regain appears in all three experimental groups (lower panel of Table 6). Within 18 months after successful weight loss, the average participant of the second intervention regained about 4 percent of previously lost body weight. Nevertheless, average weight loss throughout the entire experiment still amounts to more than 3 percent (compare Table 5 with Table 6). After 22 months, we no longer observe statistically significant differences across the different experimental groups. Figure 5 shows the distribution of weight change at the end of the follow‐up phase by experimental group. Regarding the share of participants that confirmed their target weight after 22 months, we do not find any statistically significant differences between experimental groups either.

Sensitivity Analysis

Taking into account individual characteristics and variables related to the weigh‐in, a multivariate regression confirms the previous results (Appendix Table A3). A single exception is that the control group and Group 250 no longer exhibit a significant difference in the share of participants who maintain their target weight at the end of the intervention period.
This points toward strategic behavior adopted by treated participants to achieve their targets.

Discussion

Overall, our empirical results suggest that monetary incentives for maintaining a previously achieved target weight have a temporary effect on the body weight of obese individuals. During the intervention period, members of the two treatment groups regained about 2 percentage points less body weight than the control group. The effects of the monetary incentives on the likelihood to realize the target weight exceed 15 percentage points. In sum, monetary incentives appear to be effective in preventing weight regain in the short run. The higher reward did not prove to be more effective than the lower reward. One interpretation is that participants reach their upper bound on effort at EUR 250. Due to its clearly temporary nature, we attribute the effectiveness of the rewards first and foremost to the standard price effect, which makes weight regain less attractive. We do not rule out that the second intervention causes participants to improve healthy behavior acquired during the preceding weight‐loss phase through, for instance, learning effects. Improved behavioral automaticity arguably affects body weight in the same direction as the relative price effect. Evidently, the relative price effect, perhaps backed by the beneficial effects of continuous habit formation, dominates motivational effects. Given our previous result of monetary rewards for weight loss inducing lasting effects, the finding of significant effects of monetary incentives for maintaining a previously achieved target might seem a puzzle. However, despite these lasting effects, even treated participants regained some body weight once weight‐loss rewards had been removed. Moreover, the populations of the two interventions do not perfectly overlap. In fact, we do not condition on merely being exposed to financial incentives in the weight‐loss phase but on weight‐loss success.
Within 12 months after the intervention period, the effects of monetary incentives for maintaining a previously achieved target weight vanish. Importantly, however, incentivized participants are not worse off at the end of the follow‐up phase than participants of the control group. Hence, results confirm our findings from the first intervention that there is no complete motivation crowding out due to monetary incentives in the present study population, i.e., extrinsic rewards do not backfire. Nevertheless, the absence of lasting effects of rewards for maintaining a previously achieved target weight seems to be at odds with the habit formation theory which was confirmed in the analysis for the first intervention. The finding that only monetary incentives for weight loss formed healthy habits may be explained by the sequential nature of the two interventions. Successful participants make their experience with physical exercise and healthy diets during the weight‐loss phase. Some of them already have developed some behavioral automaticity at the time that the second intervention begins, while others may have not. In the subsequent phase, there may well be a margin for behavioral changes for both those who had previously adopted behavioral patterns and those who have not. In either case, such behavioral changes are likely induced by hook or by crook and, therefore, are not sustainable. In other words, after the weight reduction phase is completed, the chance for permanent behavioral change is either seized or missed.

An alternative explanation is based on the interplay between the theories of habit formation and motivation crowding out. One may argue that a developed behavioral automaticity may countervail negative motivational effects of extrinsic rewards, which only – or more strongly – arise in the second intervention. Monetary rewards for maintaining a previously achieved target weight, as opposed to the weight‐loss rewards, were not announced in advance.
Participants who were promised the rewards in the weight‐maintenance phase had all been successful in the previous four months and were most likely proud of their achievement, irrespective of prior group membership. Against this background, monetary rewards during the weight‐maintenance phase may signal that weight maintenance is even more difficult than weight loss and, therefore, reduce initial motivation to stay thinner.17 While the relative price effect overcompensates the negative effect of the rewards on intrinsic motivation during the weight‐maintenance phase, weight regain occurs once the incentives are removed. The developed behavioral automaticity may prevent participants from being worse off due to the second intervention. The main argument for the deceptive contradiction that there are only negative motivational effects of the monetary rewards for maintaining a previously achieved target weight is that weight‐loss rewards may be perceived as supportive while rewards in the weight‐maintenance phase are perceived as rather controlling, impairing self‐determination and self‐esteem. Explaining the results by the interplay between both theories reflects that our experiment neither provides ultimate evidence in favor of the one nor against the other.

4.3. Analysis of Effect Heterogeneity of Intervention II

The Role of Target Weight Achievement at the End of the Weight‐loss Phase

The design of the experiment allows us to investigate whether monetary incentives for weight maintenance for previously fully successful participants are as effective as monetary incentives to further reduce the body weight for previously partially successful participants. In order to answer this question, Table 7 presents the effects of the second intervention separately for participants who fully achieved their target weight (Columns 1‐2) and participants who partially achieved their target weight at the end of the weight‐loss phase (Columns 3‐4).
The effects at the end of the weight‐maintenance phase and of the follow‐up phase do not differ significantly across the two subgroups for either the primary or the secondary outcome variable. This indicates that there is no effect heterogeneity with respect to the degree of target weight achievement after the weight‐loss phase. Interestingly, during the intervention period, the monetary incentives did not significantly increase the likelihood of achieving the target weight among participants who had only partially achieved it in the previous period. Note that these results are not sensitive to controlling for individual characteristics and variables that relate to the weigh‐in (see Appendix Table A3). Moreover, they are also robust to excluding the observations with the largest and smallest 2.5 percent of weight changes in each subsample (results available upon request), which indicates that the results are not driven by outliers even though the estimation samples are small.18

17 See Gneezy et al. (2011) for a similar argument.
18 The positive point estimates for the effect of the two rewards on percentage change in body weight at the end of the follow‐up period for participants who had only partially achieved the target weight in the previous period (lower panel of the table, Columns 3‐4) are sensitive with respect to both covariate adjustment and trimming.

The Role of Group Membership in the Weight‐loss Phase

Table 8 reports our estimation results on the effects of the second intervention separately for those who were members of the treatment groups (Columns 1‐2) and those who were members of the control group in the first intervention (Columns 3‐4). The two subpopulations differ with respect to the degree of intrinsic motivation for weight loss that they had achieved in the first four months of the experiment. The estimation results suggest that monetary incentives for maintaining the target weight strongly affect the weight development of previously incentivized participants during the intervention period (upper panel of Table 8). Effects on both outcome variables are statistically significant. Participants who were not previously incentivized, in contrast, do not exhibit a significant response to financial incentives for maintaining a target weight. Group 500 even gains more weight than the control group during the intervention period, i.e., the EUR 500 reward worsens weight development in this particular subgroup. Even though this difference is insignificant, which may be explained by the small number of observations, it may be interpreted as reflecting motivation crowding out. Moreover, we observe statistically significant heterogeneity in the effect of the EUR 500 reward across weight‐loss experimental groups, which is indirect evidence of the importance of intrinsic motivation for weight loss. The results discussed in this subsection are robust to controlling for individual characteristics and variables that relate to the weigh‐in. Moreover, the results are not sensitive to excluding the observations with the largest and smallest 2.5 percent of weight changes in each subsample (results available upon request).

The effects for the post‐intervention period are all statistically insignificant, confirming the absence of lasting effects of the second intervention. We no longer find relevant effect heterogeneity across weight‐loss experimental groups at the end of the follow‐up period (lower panel of Table 8). This finding is sensitive to covariate adjustment.

5. Conclusion

This paper presents unique evidence from a large randomized experiment to answer two main research questions. First, do effects of monetary rewards for weight loss persist after incentives are removed? Second, are financial incentives promised for the maintenance of a reduced body weight able to prevent weight cycling? Our work adds to a growing literature on the longer‐run effects of monetary incentives to encourage preventive health behavior and generates novel knowledge about coupling financial rewards with sustained health‐related behavioral change. The study is motivated by the increased popularity of monetary incentives in the design of policy interventions across a wide range of areas such as education and health. While there is generally too little knowledge on effective interventions to improve public health, this is particularly evident for obesity, the public health challenge of our time. Finding effective means to fight the obesity pandemic represents an urgent need for many obese individuals who fail in their weight‐loss attempts, and for welfare systems around the globe that are overloaded with costs attributable to obesity.

The experiment involved 700 obese participants of four medical rehabilitation clinics who were randomly assigned to three experimental groups. The control group and two treatment groups were offered EUR 0, EUR 150, or EUR 300 for achieving an individually assigned contractual target weight loss. Successful participants were eligible for a subsequent intervention that was not announced in advance. In this second intervention, participants were offered incentives amounting to EUR 0, EUR 250, and EUR 500 for maintaining a body weight below the target weight. The body weight of the participants was measured 4, 10, and 22 months after the start of the experiment.

Our results suggest that monetary rewards for weight loss have sustainable effects on the body weight of obese individuals. In a previous study, we showed that participants achieved weight loss by not snacking between meals and by using the stairs instead of the elevator (Augurzky et al., 2012).
The effect of the EUR 300 reward amounts to roughly 2.8 and 3.3 percentage points 6 and 18 months after the intervention period, respectively. This corresponds to a weight reduction of about 1.1 and 1.3 BMI points compared to a baseline of 38.2 points. Based on lower‐bound estimates of the direct as well as indirect costs of morbidity and mortality attributable to obesity, the effect of the EUR 300 reward would need to persist for about another 12 months, i.e., 30 months in total, to yield social net savings.19 There is no statistically significant difference between the effects of the higher and the lower reward.

Previous studies did not find significant post‐intervention effects. One possible explanation for the success of our intervention is the absence of frequent feedback by the experimenter. In the experiments of Volpp et al. (2008) and John et al. (2010), for instance, members of the treatment groups had to weigh themselves every day and call in their weight to the project staff. Moreover, they received daily feedback about their weight‐loss progress. At the end of the intervention period, this elevated attention stopped, and treated participants may have felt alone with their weight problem.

19 Konnopka et al. (2011) estimate that obesity and overweight caused costs of EUR 9,873 million in Germany in 2002, which is in line with results for other countries (Dee et al. 2014) but likely to be an underestimate (Cawley and Meyerhoefer 2012). Under the lower‐bound assumption that three fourths of these costs are attributable to obesity (cf. Tsai et al. 2011) and using 10.6 million obese people as the basis of calculation, the costs per obese individual amount to about EUR 928. In our study, the average participant needs to lose roughly 7.6 BMI points to fall below the obesity threshold (Table 2), which implies a yearly cost reduction of EUR 122 per BMI unit if we assume a linear relationship between BMI and costs in our sample.
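The unit conversions quoted in the text and in footnote 19 can be retraced with a few lines of arithmetic. The sketch below is only an illustration: it takes the reported figures (baseline BMI of 38.2, effects of 2.8 and 3.3 percentage points, EUR 928 per obese individual, 7.6 BMI points to the obesity threshold) as given, assumes that height is fixed so that BMI changes proportionally with body weight, and adopts the linear cost assumption of footnote 19. The variable and function names are hypothetical.

```python
# Figures quoted in the text and in footnote 19 (taken as given, not re-estimated).
BASELINE_BMI = 38.2              # baseline BMI used in the text
EFFECT_PCT = {6: 2.8, 18: 3.3}   # effect in percentage points, by months after the intervention
COST_PER_OBESE_EUR = 928.0       # yearly costs per obese individual (footnote 19)
BMI_POINTS_TO_THRESHOLD = 7.6    # average distance to the obesity threshold in BMI points


def bmi_points(effect_pct, baseline_bmi=BASELINE_BMI):
    """Convert a percentage change in body weight into BMI points.

    With height fixed, BMI is proportional to body weight, so the same
    percentage applies to the BMI.
    """
    return baseline_bmi * effect_pct / 100.0


# Linear cost assumption from footnote 19: yearly costs per BMI point.
cost_per_bmi_point = COST_PER_OBESE_EUR / BMI_POINTS_TO_THRESHOLD

for months, pct in EFFECT_PCT.items():
    points = bmi_points(pct)
    saving = points * cost_per_bmi_point
    print(f"{months:>2} months after the intervention: "
          f"{points:.2f} BMI points, about EUR {saving:.0f} per year")
```

The sketch covers only the conversion from percentage points to BMI points and to yearly costs; it does not attempt to reproduce the break‐even horizon stated in the text.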
Our empirical analysis further shows that monetary incentives for maintaining a previously achieved target weight have a temporary effect of more than 2 percentage points on the body weight of obese individuals. Again, the higher incentive appears not to be more effective than the lower reward. However, the effects vanish within 12 months after the incentives to prevent weight cycling are removed. This is the first large‐scale experimental study that presents results for weight‐maintenance rewards.

Overall, the results of the experiment suggest that habit formation dominates potential negative motivational effects of monetary incentives, i.e., negative effects of such incentives on intrinsic motivation may exist but are not visible. One explanation for the absence of lasting effects of the weight‐maintenance rewards is that the negative motivational effects, which are arguably stronger in the second intervention because rewards were not announced in advance and were therefore perceived as more controlling, offset the beneficial effects of a developed behavioral automaticity.

The results of our experiment further suggest that the effects of monetary rewards are heterogeneous with respect to initial motivation for weight loss. Separate estimations for weight‐loss treatment and control group members show that, during the period of the second intervention, the reward effect is significantly larger among previously incentivized participants. To the best of our knowledge, this is the first, though indirect, evidence for the importance of intrinsic motivation for weight loss. The mere observation of a backfire effect of extrinsic rewards may be attributable to factors other than intrinsic motivation, such as the use of weight‐loss measures that focus solely on short‐term success.
Our experiment shows that financial incentives can have positive and sustainable effects on the weight reduction of obese individuals and, hence, may be an effective measure to fight the obesity pandemic. According to our results, however, it appears to be important to announce monetary rewards well in advance. In contrast to the effect of monetary incentives to lose weight, we are not able to provide a clear answer concerning the effects of financial incentives to sustain a certain body weight. At the end of the experiment, individuals who received both incentives for weight loss and incentives to maintain a lower body weight are neither better nor worse off than individuals who received only a weight‐loss reward.

A considerable limitation of our study, as of most field experiments, is related to the external validity of the results. Our participants may not be representative because participation in the experiment was voluntary. Indeed, we do not know to what extent our results are transferable to the general obese population. The effects on intrinsic motivation may, for example, be very different in the general population. However, most policy interventions consist of programs that involve voluntary participation, too. Hence, we provide estimates for a subpopulation that is arguably more interesting from the perspective of a policy maker. We are also concerned about general equilibrium effects, which threaten the success of monetary interventions. It may well be that a large‐scale program would result in some people gaining weight in order to qualify for payments. We advocate a larger‐scale implementation of a similar program in combination with a scientific evaluation.

References

Acland, D. and Levy, M. (2015): Naiveté, Projection Bias, and Habit Formation in Gym Attendance, Management Science 61(1): 146–160.
Augurzky, B., Bauer, T. K., Reichert, A. R., Schmidt, C. M. and Tauchmann, H. (2012): Does Money Burn Fat? ‐ Evidence from a Randomized Experiment, Ruhr Economic Papers #368.
Bhattacharya, J., Bundorf, M. K., Pace, N. and Sood, N. (2011): Does health insurance make you fat?, in M. Grossman and N. H. Mocan (eds), Economic Aspects of Obesity, 1 edn, University of Chicago Press, chapter 2, pp. 35–64.
Cawley, J. and Meyerhoefer, C. (2012): The medical care costs of obesity: An instrumental variables approach, Journal of Health Economics 31(1): 219–230.
Cawley, J., Rizzo, J. A. and Haas, K. (2007): Occupation‐Specific Absenteeism Costs Associated With Obesity and Morbid Obesity, Journal of Occupational and Environmental Medicine 49(12): 1317–1324.
Charness, G. and Gneezy, U. (2009): Incentives to Exercise, Econometrica 77(3): 909–931.
Crawford, D., Jeffery, R. W. and French, S. A. (2000): Can anyone successfully control their weight? Findings of a three year community‐based study of men and women, International Journal of Obesity 24(9): 1107–1110.
Dee, A., Kearns, K., O'Neill, C., Sharp, L., Staines, A., O'Dwyer, V., Fitzgerald, S. and Perry, I. J. (2014): The direct and indirect costs of both overweight and obesity: a systematic review, BMC Research Notes 7(242): 1–9.
De Geest, G. and Dari‐Mattiacci, G. (2013): The Rise of Carrots and the Decline of Sticks, University of Chicago Law Review 80(1): 341–392.
Gneezy, U., Meier, S. and Rey‐Biel, P. (2011): When and Why Incentives (Don't) Work to Modify Behavior, Journal of Economic Perspectives 25(4): 191–210.
Han, E., Norton, E. C. and Stearns, S. C. (2009): Weight and wages: fat versus lean paychecks, Health Economics 18(5): 535–548.
Heckman, J. J. (1976): The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models, Annals of Economics and Social Measurement 5(4): 475–492.
Heckman, J. J. (1979): Sample selection bias as a specification error, Econometrica 47(1): 153–161.
Hollis, S. and Campbell, F. (1999): What is meant by intention to treat analysis? Survey of published randomised controlled trials, British Medical Journal 319(7211): 670–674.
Horowitz, J. L. and Manski, C. F. (2000): Nonparametric analysis of randomized experiments with missing covariate and outcome data, Journal of the American Statistical Association 95: 77–84.
Houston, D. K., Cai, J. and Stevens, J. (2008): Overweight and Obesity in Young and Middle Age and Early Retirement: The ARIC Study, Obesity 17(1): 143–149.
Konnopka, A., Bödemann, M. and König, H.‐H. (2011): Health burden and costs of obesity and overweight in Germany, The European Journal of Health Economics 12(4): 345–352.
Kramer, F. M., Jeffery, R. W., Snell, M. K. and Forster, J. L. (1986): Maintenance of successful weight loss over 1 year: effects of financial contracts for weight maintenance or participation in skills training, Behavior Therapy 17(3): 295–301.
Lee, D. S. (2009): Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects, Review of Economic Studies 76(3): 1071–1102.
Morris, S. (2007): The impact of obesity on employment, Labour Economics 14(3): 413–433.
OECD (2012): Statistics, Organisation for Economic Co‐operation and Development. Accessed on June 8, 2012. URL: http://stats.oecd.org
Paloyo, A. R., Reichert, A. R., Reinermann, H. and Tauchmann, H. (2013): The Causal Link Between Financial Incentives and Weight Loss: An Evidence‐Based Survey of the Literature, Journal of Economic Surveys. doi: 10.1111/joes.12010.
Quist‐Paulsen, P. and Gallefoss, F. (2003): Randomised controlled trial of smoking cessation intervention after admission for coronary heart disease, British Medical Journal 327(7426): 1254.
Reichert, A. (2015): Obesity, Weight Loss, and Employment Prospects – Evidence from a Randomized Trial, Journal of Human Resources, forthcoming.
Royer, H., Stehr, M. and Sydnor, J. (2015): Incentives, commitments and habit formation in exercise: evidence from a field experiment with workers at a Fortune‐500 Company, American Economic Journal: Applied Economics, forthcoming.
Sassi, F. (2010): Obesity and the Economics of Prevention: Fit not Fat, OECD Publishing, Paris.
Spenkuch, J. (2012): Moral hazard and selection among the poor: Evidence from a randomized experiment, Journal of Health Economics 31(1): 72–85.
Tsai, A. G., Williamson, D. F. and Glick, H. A. (2011): Direct medical cost of overweight and obesity in the USA: a quantitative systematic review, Obesity Reviews 12(1): 50–61.
Volpp, K. G., Gurmankin, L. A., Asch, D. A., Berlin, J. A., Murphy, J. J., Gomez, A., Sox, H. and Zhu, J. (2006): A randomized controlled trial of financial incentives for smoking cessation, Cancer Epidemiology, Biomarkers & Prevention 15(1): 12–18.
Volpp, K., John, L., Troxel, A., Norton, L., Fassbender, J. and Loewenstein, G. (2008): Financial incentive‐based approaches for weight loss, Journal of the American Medical Association 300(22): 2631–2637.
Volpp, K. G., Troxel, A. B., Pauly, M. V., Glick, H. A., Puig, A., Asch, D. A., Galvin, R., Zhu, J., Wan, F., DeGuzman, J., Corbett, E., Weiner, J. and Audrain‐McGovern, J. (2009): A Randomized Controlled Trial of Financial Incentives for Smoking Cessation, New England Journal of Medicine 360(7): 699–709.
Wooldridge, J. M. (2002): Inverse probability weighted M‐estimators for sample selection, attrition and stratification, Portuguese Economic Journal 1(2): 117–139.
Zhu, S.‐H., Melcer, T., Sun, J., Rosbrook, B. and Pierce, J. P. (2000): Smoking cessation with and without assistance: A population‐based analysis, American Journal of Preventive Medicine 18(4): 305–311.
Figure 1: Experimental Design
[Diagram not reproduced.]

Figure 2: Distribution of Percentage Change in Body Weight by Experimental Groups of Intervention I (Months 0‐10)
[Kernel density plot. Horizontal axis: percentage change in BMI (mean dashed); vertical axis: estimated kernel density; series: Control, EUR 150, EUR 300.]
Notes: Inverse‐probability weights used for the estimation of kernel densities.

Figure 3: Distribution of Percentage Change in Body Weight by Experimental Groups of Intervention I (Months 0‐22)
[Kernel density plot. Horizontal axis: percentage change in BMI (mean dashed); vertical axis: estimated kernel density; series: Control, EUR 150, EUR 300.]
Notes: Inverse‐probability weights used for the estimation of kernel densities.

Figure 4: Distribution of Percentage Change in Body Weight by Experimental Groups of Intervention II (Months 5‐10)
[Kernel density plot. Horizontal axis: percentage change in BMI (mean dashed); vertical axis: estimated kernel density; series: Control, EUR 250, EUR 500.]

Figure 5: Distribution of Percentage Change in Body Weight by Experimental Groups of Intervention II (Months 5‐22)
[Kernel density plot. Horizontal axis: percentage change in BMI (mean dashed); vertical axis: estimated kernel density; series: Control, EUR 250, EUR 500.]

Table 1: Socioeconomic Background of the Study Population and the Obese in Germany

| | Study Population | Patients of the four rehabilitation clinics | Representative Obese in Germany (BMI ≥30) |
|---|---|---|---|
| Female (%) | 32.23 | 34.17 | 39.98 |
| Age (years) | 48.11 | 49.69 | 57.11 |
| Married (%) | 61.03 | 71.37 | 62.23 |
| Resident of Baden‐Württemberg (%) | 100 | 94.99 | 11.84 |
| Natives (%) | 78.89 | 82.67 | 86.30 |
| Full‐time employed◊,+ (%) | 69.44 | 76.12 | 34.85 |
| Part‐time employed◊,+ (%) | 9.04 | 11.01 | 14.27 |
| Unemployed○,+ (%) | 13.20 | 8.23 | 6.90 |

Notes: Statistics relating to the patients of the four rehabilitation clinics are weighted averages. As the clinics’ weights, we use the shares of participants recruited by the clinics.
◊The remaining observations among those who report to be employed are marginally employed (2.15 percent) or have not provided information on the type of employment (1.72 percent). +Here, we distinguish between the unemployed and the not‐employed (4.45 percent). ○The categories full‐time employed, part‐time employed, marginally employed, no information on type of occupation, unemployed, and not‐employed add up to one. Source: Own data collection, German Federal Pension Fund (year 2011), German Socio‐economic Panel (SOEP, year 2011).

Table 2: Descriptive Statistics by Weight‐loss‐premium Groups (Mean Values, Inverse‐probability Weighting)

| | (1) All | (2) Control | (3) EUR 150 | (4) EUR 300 |
|---|---|---|---|---|
| Pre‐treatment Values | | | | |
| BMI Before Rehab | 38.948 | 38.574 | 38.749 | 39.547 |
| Baseline BMI | 37.583 | 37.222 | 37.330 | 38.231 |
| Target Weight Loss (Percent) | 6.501 | 6.436 | 6.601 | 6.473 |
| Bad Kissingen | 0.336 | 0.290 | 0.367 | 0.356 |
| Bad Mergentheim | 0.421 | 0.453 | 0.394 | 0.413 |
| Isny | 0.053 | 0.057 | 0.062 | 0.040 |
| Glottertal | 0.190 | 0.200 | 0.177 | 0.191 |
| Female | 0.336 | 0.249 | 0.332 | 0.436** |
| Age (years) | 47.080 | 47.686 | 46.084 | 47.422 |
| Native | 0.796 | 0.810 | 0.767 | 0.810 |
| Married | 0.638 | 0.679 | 0.619 | 0.614 |
| Post‐treatment Values | | | | |
| Percentage Change Body Weight (Months 0‐4) | -4.489 | -2.998 | -5.135** | -5.260** |
| Target Weight Realized After 4 Months | 0.337 | 0.214 | 0.370** | 0.418** |
| Total Dropout Rate After Month 4 | 0.254 | 0.314 | 0.283 | 0.160**°° |
| Total Dropout Rate After Month 10 | 0.420 | 0.498 | 0.451 | 0.302**°° |
| Total Dropout Rate After Month 22 | 0.566 | 0.624 | 0.588 | 0.480** |
| # of Observations (Unweighted) | 489 | 192 | 158 | 139 |

Notes: ** deviation from control group significant at 5%, * significant at 10%; °° deviation from EUR 150‐group significant at 5%, ° significant at 10%; standard deviations omitted because of most variables being binary. ‘Bad Mergentheim’, ‘Bad Kissingen’, ‘Isny’, and ‘Glottertal’ refer to the locations of the four rehabilitation clinics.
Table 3: Mean Comparison Across Weight‐loss‐premium Groups (Inverse‐probability Weighting)

| | Control (1) | EUR 150 (2) | EUR 300 (3) | ∆ to Control: EUR 150 (4) | ∆ to Control: EUR 300 (5) |
|---|---|---|---|---|---|
| Months 0‐10 | | | | | |
| Percentage Change in Body Weight | -1.343 | -2.468** | -4.155** | -1.125 | -2.812** |
| (S.E.) | (0.926) | (0.895) | (0.879) | (1.288) | (1.277) |
| Target Weight Realized | 0.203** | 0.258** | 0.236** | 0.055 | 0.032 |
| (S.E.) | (0.060) | (0.065) | (0.057) | (0.088) | (0.083) |
| # of Observations (Unweighted) | 85 | 70 | 79 | ‐ | ‐ |
| Months 0‐22 | | | | | |
| Percentage Change in Body Weight | -0.137 | -1.202 | -3.472** | -1.065 | -3.335** |
| (S.E.) | (1.194) | (1.271) | (1.135) | (1.744) | (1.648) |
| Target Weight Realized | 0.174** | 0.258** | 0.239** | 0.084 | 0.065 |
| (S.E.) | (0.065) | (0.074) | (0.066) | (0.098) | (0.093) |
| # of Observations (Unweighted) | 64 | 53 | 59 | ‐ | ‐ |

Notes: ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; S.E.s for estimated means and for coefficients in parentheses. All coefficients are obtained by inverse‐probability weighting OLS, regressing the respective outcome variable on the dummy variables indicating the two premium groups.
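The note to Table 3 states that the coefficients come from weighted OLS on the two premium‐group dummies. The following sketch illustrates, on made‐up data, why such a regression reproduces the weighted mean of the control group (intercept) and the differences in weighted group means (dummy coefficients). It is a minimal illustration and not the estimation code behind the table: the data, the weights, and all function names are hypothetical, and standard errors are not computed.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(x_rows, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i'b)^2 via the normal equations."""
    k = len(x_rows[0])
    xtwx = [[sum(wi * xi[p] * xi[q] for xi, wi in zip(x_rows, w)) for q in range(k)]
            for p in range(k)]
    xtwy = [sum(wi * xi[p] * yi for xi, yi, wi in zip(x_rows, y, w)) for p in range(k)]
    return solve(xtwx, xtwy)


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


random.seed(0)
# Made-up data: group labels, outcomes, and weights are purely illustrative.
assumed_mean = {"control": -1.3, "eur150": -2.5, "eur300": -4.2}
groups = [random.choice(list(assumed_mean)) for _ in range(300)]
y = [random.gauss(assumed_mean[g], 6.0) for g in groups]
w = [random.choice([1.0, 3.0]) for _ in groups]

# Regressors: constant, EUR 150 dummy, EUR 300 dummy (control is the reference group).
x_rows = [[1.0, float(g == "eur150"), float(g == "eur300")] for g in groups]
intercept, b150, b300 = wls(x_rows, y, w)

means = {
    g: weighted_mean([yi for yi, gi in zip(y, groups) if gi == g],
                     [wi for wi, gi in zip(w, groups) if gi == g])
    for g in assumed_mean
}
print("intercept vs. control mean:", intercept, means["control"])
print("EUR 150 coefficient vs. mean difference:", b150, means["eur150"] - means["control"])
print("EUR 300 coefficient vs. mean difference:", b300, means["eur300"] - means["control"])
```

Because the regression is saturated in the group indicators, the equality holds exactly (up to floating‐point error) for any set of positive weights.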
Table 4: Attrition‐robust Effects of Weight‐loss Premiums (Inverse‐probability Weighting)

| | Self‐reported Weight: EUR 150 (1) | Self‐reported Weight: EUR 300 (2) | Intention‐to‐Treat: EUR 150 (3) | Intention‐to‐Treat: EUR 300 (4) | Lee‐Bounds#: EUR 150 (5) | Lee‐Bounds#: EUR 300 (6) |
|---|---|---|---|---|---|---|
| Months 0‐10 | | | | | | |
| Percentage Change in Body Weight | -0.760 | -2.137* | -0.680 | -2.225**° | 0.334 | 0.391 |
| (S.E.) | (1.100) | (1.145) | (0.702) | (0.807) | (1.155) | (0.869) |
| Target Weight Realized | 0.007 | -0.005 | 0.040 | 0.062 | -0.014 | -0.203 |
| (S.E.) | (0.076) | (0.074) | (0.050) | (0.053) | (0.091) | (0.036) |
| # of Observations (Unweighted) | 305 | | 488 | | 488 | |
| Months 0‐22 | | | | | | |
| Percentage Change in Body Weight | -0.757 | -3.394**° | -0.443 | -1.754** | 0.464 | 0.462 |
| (S.E.) | (1.506) | (1.461) | (0.693) | (0.770) | (1.969) | (1.175) |
| Target Weight Realized | 0.052 | 0.055 | 0.041 | 0.059* | 0.013 | -0.174 |
| (S.E.) | (0.084) | (0.083) | (0.043) | (0.046) | (0.113) | (0.040) |
| # of Observations (Unweighted) | 215 | | 488 | | 488 | |

Notes: # Lower Absolute Lee‐Bound; ** significant at 5%, * significant at 10% (one‐sided test); °° difference between premium groups significant at 5%, ° significant at 10%; S.E.s for estimated means in parentheses.
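For readers unfamiliar with the trimming bounds of Lee (2009) reported in Columns 5‐6 of Table 4, the following sketch shows the basic idea for one treatment group and one control group: the group with the higher retention rate is trimmed from above or from below until both groups have the same effective retention rate, which yields a lower and an upper bound on the effect. This is a simplified illustration under stated assumptions (no weights, no covariates, no standard errors, trimming by a rounded number of observations); it is not the procedure used to produce the table, and the function names and toy numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def lee_bounds(treated_obs, n_treated, control_obs, n_control):
    """Return (lower, upper) trimming bounds on the difference in mean outcomes.

    treated_obs, control_obs: outcomes of participants whose outcome is observed.
    n_treated, n_control: number of participants originally assigned to each group.
    """
    q_t = len(treated_obs) / n_treated   # retention rate of the treated group
    q_c = len(control_obs) / n_control   # retention rate of the control group
    if q_t >= q_c:
        # The treated group is over-represented among the observed: trim it.
        k = round(len(treated_obs) * (q_t - q_c) / q_t)
        ordered = sorted(treated_obs)
        c = mean(control_obs)
        lower = mean(ordered[:len(ordered) - k]) - c   # drop the k largest outcomes
        upper = mean(ordered[k:]) - c                  # drop the k smallest outcomes
        return lower, upper
    # Otherwise the control group is trimmed instead.
    k = round(len(control_obs) * (q_c - q_t) / q_c)
    ordered = sorted(control_obs)
    t = mean(treated_obs)
    lower = t - mean(ordered[k:])                      # control mean pushed upward
    upper = t - mean(ordered[:len(ordered) - k])       # control mean pushed downward
    return lower, upper


# Toy example: all 4 treated participants are observed, but only 2 of 4 controls.
lower, upper = lee_bounds([1, 2, 3, 4], 4, [0, 2], 4)
closest_to_zero = min((lower, upper), key=abs)
print(lower, upper, closest_to_zero)
```

The last line picks the bound with the smaller absolute value, which is presumably what the table note means by the lower absolute bound; this reading is an assumption.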
Table 5: Descriptive Statistics by Weight‐maintenance‐premium Groups (Mean Values)

| | (1) All | (2) Control | (3) EUR 250 | (4) EUR 500 |
|---|---|---|---|---|
| Pre‐treatment Values | | | | |
| BMI Before Rehab | 38.661 | 38.768 | 38.940 | 38.278 |
| Baseline BMI | 37.324 | 37.308 | 37.725 | 36.944 |
| Target Weight Loss (Percent) | 6.424 | 6.403 | 6.446 | 6.425 |
| Bad Kissingen | 0.305 | 0.308 | 0.282 | 0.327 |
| Bad Mergentheim | 0.389 | 0.404 | 0.388 | 0.375* |
| Isny | 0.058 | 0.029 | 0.058 | 0.087 |
| Glottertal | 0.248 | 0.260 | 0.272 | 0.212 |
| Female | 0.305 | 0.337 | 0.291 | 0.288 |
| Age (years) | 47.981 | 47.010 | 48.806 | 48.135 |
| Native | 0.800 | 0.832 | 0.788 | 0.782 |
| Married | 0.667 | 0.700 | 0.670 | 0.630 |
| Values After Intervention I | | | | |
| BMI after 4 Months | 34.585 | 34.454 | 34.968 | 34.338 |
| Percentage Change Body Weight (Months 0‐4) | -7.316 | -7.709 | -7.180 | -7.059 |
| Target Weight Realized after 4 Months | 0.585 | 0.558 | 0.583 | 0.615 |
| Post‐treatment Values | | | | |
| Total Dropout Rate after Month 10 | 0.154 | 0.183 | 0.126 | 0.154 |
| Total Dropout Rate after Month 22 | 0.347 | 0.394 | 0.301 | 0.346 |
| # of Observations | 311 | 104 | 103 | 104 |

Notes: ** deviation from control group significant at 5%, * significant at 10%; °° deviation from EUR 150‐group significant at 5%, ° significant at 10%; standard deviations omitted because of most variables being binary. ‘Bad Mergentheim’, ‘Bad Kissingen’, ‘Isny’, and ‘Glottertal’ refer to the locations of the four rehabilitation clinics.
Table 6: Mean Comparison Across Weight‐maintenance‐premium Groups

| | Control (1) | EUR 250 (2) | EUR 500 (3) | ∆ to Control: EUR 250 (4) | ∆ to Control: EUR 500 (5) |
|---|---|---|---|---|---|
| Months 5‐10 | | | | | |
| Percentage Change in Body Weight | 2.800** | -0.130 | 0.792 | -2.930** | -2.008** |
| (S.E.) | (0.730) | (0.571) | (0.946) | (0.940) | (0.946) |
| Target Weight Realized | 0.353** | 0.511** | 0.534** | 0.158** | 0.181** |
| (S.E.) | (0.052) | (0.053) | (0.053) | (0.075) | (0.075) |
| # of Observations | 85 | 90 | 88 | ‐ | ‐ |
| Months 5‐22 | | | | | |
| Percentage Change in Body Weight | 4.157** | 3.650** | 4.231** | -0.507 | -0.074 |
| (S.E.) | (0.935) | (0.958) | (1.037) | (1.391) | (1.410) |
| Target Weight Realized | 0.333** | 0.333** | 0.324** | -9×10^-17 | -0.010 |
| (S.E.) | (0.060) | (0.056) | (0.057) | (0.082) | (0.083) |
| # of Observations | 63 | 72 | 68 | ‐ | ‐ |

Notes: ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; S.E.s for estimated means in parentheses; a deviation from control group. All coefficients are obtained by OLS, regressing the respective outcome variable on the dummy variables indicating the two premium groups.

Table 7: Effects of Weight‐Maintenance Premiums by Success in Weight‐loss Phase

| | Target Weight Achieved in Weight‐Loss Phase: EUR 250 (1) | Target Weight Achieved in Weight‐Loss Phase: EUR 500 (2) | Target Weight Not Achieved in Weight‐Loss Phase: EUR 250 (3) | Target Weight Not Achieved in Weight‐Loss Phase: EUR 500 (4) |
|---|---|---|---|---|
| Months 5‐10 | | | | |
| Percentage Change in Body Weight | -3.528** | -1.687 | -2.110** | -2.441** |
| (S.E.) | (1.425) | (1.418) | (1.059) | (1.081) |
| Target Weight Realized | 0.240** | 0.188** | 0.045 | 0.147 |
| (S.E.) | (0.093) | (0.093) | (0.094) | (0.096) |
| # of Obs. (Unweighted) | 154 | | 109 | |
| Months 5‐22 | | | | |
| Percentage Change in Body Weight | -1.874 | -0.483 | 1.377 | 0.732 |
| (S.E.) | (1.943) | (1.992) | (1.894) | (1.894) |
| Target Weight Realized | 0.040 | 0.011 | -0.033 | 2×10^-16 |
| (S.E.) | (0.112) | (0.115) | (0.101) | (0.093) |
| # of Obs. (Unweighted) | 119 | | 84 | |

Notes: # Lower Absolute Lee‐Bound; ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; ++ difference in effects across target weight achievement in the weight‐loss phase significant at 5%, + significant at 10%; S.E.s in parentheses.
Table 8: Effects of Weight‐Maintenance Premiums by Weight‐loss Experimental Group

| | Premium Group in Weight‐Loss Phase: EUR 250 (1) | Premium Group in Weight‐Loss Phase: EUR 500 (2) | Control Group in Weight‐Loss Phase: EUR 250 (3) | Control Group in Weight‐Loss Phase: EUR 500 (4) |
|---|---|---|---|---|
| Months 5‐10 | | | | |
| Percentage Change in Body Weight | -3.435** | -2.990** | -0.987 | 1.855++ |
| (S.E.) | (1.065) | (1.076) | (1.954) | (1.923) |
| Target Weight Realized (+) | 0.221** | 0.244** | -0.109 | -0.068 |
| (S.E.) | (0.083) | (0.084) | (0.168) | (0.165) |
| # of Obs. (Unweighted) | 211 | | 52 | |
| Months 5‐22 | | | | |
| Percentage Change in Body Weight | -0.443 | -0.578 | -0.756 | 2.820 |
| (S.E.) | (1.534) | (3.304) | (3.304) | (3.425) |
| Target Weight Realized | 0.007 | 0.019 | -0.024 | -0.126 |
| (S.E.) | (0.093) | (0.094) | (0.177) | (0.184) |
| # of Obs. (Unweighted) | 119 | | 42 | |

Notes: # Lower Absolute Lee‐Bound; ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; ++ difference in effects across weight‐loss experimental groups significant at 5%, + significant at 10%; S.E.s in parentheses.

Appendix

1. Technical Description of Selection Problem in Analysis of the First Hypothesis

Technically, the objective of the analysis is the estimation of $E(\Delta w \mid g^{1}) - E(\Delta w \mid g^{0})$, where $\Delta w$ denotes weight change over the entire period (10 and 22 months, respectively) and $g^{1}$ and $g^{0}$ denote group memberships in the weight‐loss phase. To simplify notation, we introduce a vector of group membership indicators $g$. Following Wooldridge’s (2002) notation and indexing observations with $i$, the original (biased) estimator can be written as:

\[ \min_{\beta} \sum_{i} \left( \Delta w_i - g_i'\beta \right)^2 . \]

The estimator calculates a vector of group means $\beta$, which can be interpreted as running a linear regression on three experimental group indicators. By conditioning the analysis on participants who were not promised any reward in the weight‐maintenance phase, the estimator takes the following form:

\[ \min_{\beta} \sum_{i} s_i \left( \Delta w_i - g_i'\beta \right)^2 , \]

where $s_i$ is an indicator for the absence of rewards in the weight‐maintenance phase. This estimator yields inconsistent estimates if $s_i$ is correlated with the error term of the regression.
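A small simulation may help to see why conditioning on the absence of weight‐maintenance rewards distorts group means, and how weighting each retained observation by the inverse of its selection probability can undo the distortion. The sketch below is hypothetical throughout: the data‐generating process, the success threshold, and all names are assumptions made for illustration. Only the selection rule follows the experimental design, under which unsuccessful participants are never offered a weight‐maintenance reward and successful participants are assigned to the no‐reward group with probability 1/3.

```python
import random

random.seed(1)

# Hypothetical mean percentage weight change in months 0-4, by weight-loss group.
PHASE1_MEAN = {"control": -3.0, "eur150": -5.0, "eur300": -5.5}
SUCCESS_THRESHOLD = -3.25   # hypothetical cut-off for success in the weight-loss phase
# Probability of NOT being promised a maintenance reward, given success z.
P_NO_REWARD = {False: 1.0, True: 1.0 / 3.0}


def simulate(n=30_000):
    rows = []
    for _ in range(n):
        group = random.choice(list(PHASE1_MEAN))
        phase1 = random.gauss(PHASE1_MEAN[group], 5.0)
        z = phase1 <= SUCCESS_THRESHOLD                   # success in the weight-loss phase
        # Weight change over the whole period in the absence of maintenance rewards.
        delta_w = 0.6 * phase1 + random.gauss(0.0, 3.0)
        s = random.random() < P_NO_REWARD[z]              # True: no maintenance reward promised
        rows.append((group, z, s, delta_w))
    return rows


def weighted_mean(pairs):
    """Weighted mean of (value, weight) pairs."""
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


rows = simulate()
results = {}
for g in PHASE1_MEAN:
    # Benchmark: mean over everyone, which is not available in the real experiment.
    full = [(dw, 1.0) for grp, z, s, dw in rows if grp == g]
    # Unweighted mean among participants without a maintenance reward.
    naive = [(dw, 1.0) for grp, z, s, dw in rows if grp == g and s]
    # Same sample, each observation weighted by the inverse selection probability.
    ipw = [(dw, 1.0 / P_NO_REWARD[z]) for grp, z, s, dw in rows if grp == g and s]
    results[g] = (weighted_mean(full), weighted_mean(naive), weighted_mean(ipw))
    print(g, [round(v, 2) for v in results[g]])
```

In this simulation the unweighted means overstate weight gain in every group because unsuccessful participants are over‐represented in the selected sample, while the weighted means stay close to the benchmark. The weights are 1 for unsuccessful and 3 for successful participants, i.e., the inverses of the design probabilities.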
The estimator which, under certain conditions, proves to be consistent in the presence of endogenous sample selection is:

\[ \min_{\beta} \sum_{i} \frac{s_i}{p(\Delta w_i, x_i)} \left( \Delta w_i - g_i'\beta \right)^2 . \]

Here, $p(\Delta w_i, x_i)$ denotes the probability of entering the estimation sample conditional on $\Delta w_i$ and a vector of further variables $x_i$, i.e., $p(\Delta w_i, x_i) = P(s_i \mid \Delta w_i, x_i)$. It is apparent that $p$ is a function of the endogenous variable $\Delta w_i$. This implies that the above estimator for $\beta$ is consistent only if $s_i$ is uncorrelated with $\Delta w_i$ conditional on $x_i$, rendering the estimator an inappropriate approach most of the time. In the present case, however, the inverse‐probability weighting estimator satisfies this requirement for the following reason: Including a binary variable $z_i$, indicating success in the weight‐loss phase, along with $g_i$ in $x_i$ removes the dependence of $s_i$ and $\Delta w_i$:

\[ P(s_i \mid \Delta w_i, g_i, z_i) = P(s_i \mid z_i) = \begin{cases} 1 & \text{if } z_i = 0 \\ 1/3 & \text{if } z_i = 1 . \end{cases} \]

The equation states that unsuccessful individuals are never eligible for weight‐maintenance rewards and successful participants are promised weight‐maintenance rewards with probability 2/3, irrespective of observable and non‐observable factors. Hence, selection is exclusively based on observable factors.20

This selection problem does not arise in the examination of Hypothesis 2. Since weight‐maintenance rewards only concern those individuals who have already lost sufficient weight during the weight‐loss phase, conditioning on $z_i = 1$ is logical. As randomization of group membership is conditional on $z_i = 1$, too, group membership is purely random. Hence, comparing group means across weight‐maintenance groups yields unbiased estimates.

20 Moreover, the design of the experiment guarantees another essential condition for inverse‐probability weighting: $P(s_i \mid z_i = 1) > 0$. In many studies, selection into the estimation sample is a deterministic function of variables such as success. Consider, for instance, a weight maintenance incentive scheme in a non‐experimental context. There, success in weight reduction deterministically makes an individual eligible for incentives.
Hence, $P(s_i \mid z_i = 1) = 0$ holds and inverse probability weighting becomes impossible.

2. Tables and Figures

Appendix – Figure A1: Flow Chart
1. Admission to Medical Rehabilitation Clinic
2. Invitation to Participate in Study: Deny Participation, or Agree to Participate
3. First Measurement of Body Weight + Assignment of Weight Loss Target
4. Discharge from Medical Rehabilitation Clinic
5. Random Assignment to Weight Loss Rewards (EUR 0, EUR 150, EUR 300)
6. Second Measurement of Body Weight + Payment of Premiums
7. Unsuccessful Participation (Weight Loss < 50% of Targeted Weight Loss): Third Measurement of Body Weight
8. Successful Participation (Weight Loss ≥ 50% of Targeted Weight Loss): Random Assignment to Weight Maintenance Rewards (EUR 0, EUR 250, EUR 500), followed by Third Measurement of Body Weight + Payment of Premiums
9. Fourth Measurement of Body Weight

Appendix – Figure A2: Distribution of BMI at the End of Each Phase
[Kernel density plot. Horizontal axis: BMI; vertical axis: estimated kernel density; series: Rehabilitation Start, Experiment Start, Weight‐loss Phase, Weight‐maintenance Phase, Follow‐up Phase.]
Notes: Missing values were imputed by BMI at start of weight‐loss phase. The inclusion criterion of a BMI ≥ 30 refers to the day of clinic admission. Persons with a BMI ≥ 60 are often considered as “super‐super obese” (e.g., Stephens et al., 2008).
Appendix – Table A1: Descriptive Statistics of Selected Samples by Weight‐maintenance‐premium Groups (Mean Values)

| | (1) All | (2) Control | (3) EUR 250 | (4) EUR 500 |
|---|---|---|---|---|
| Pre‐treatment Values of Selected Sample After 10 Months | | | | |
| BMI Before Rehab | 38.383 | 38.622 | 38.774 | 37.750 |
| Baseline BMI | 37.112 | 37.268 | 37.570 | 36.491 |
| Target Weight Loss (Percent) | 6.410 | 6.359 | 6.432 | 6.436 |
| Bad Kissingen | -7.344 | -7.841 | -7.150 | -7.062 |
| Bad Mergentheim | 0.582 | 0.576 | 0.578 | 0.591 |
| Isny | 0.331 | 0.341 | 0.289 | 0.364 |
| Glottertal | 0.369 | 0.388 | 0.367 | 0.352 |
| Female | 0.068 | 0.035 | 0.067 | 0.102* |
| Age (years) | 0.232 | 0.235 | 0.278 | 0.182 |
| Native | 0.308 | 0.365 | 0.311 | 0.250 |
| Married | 48.441 | 47.600 | 49.000 | 48.682 |
| # of Observations | 263 | 85 | 90 | 88 |
| Pre‐treatment Values of Selected Sample After 22 Months | | | | |
| BMI Before Rehab | 38.014 | 38.230 | 38.585 | 37.203 |
| Baseline BMI | 36.783 | 36.842 | 37.478 | 35.993 |
| Target Weight Loss (Percent) | 6.455 | 6.448 | 6.384 | 6.537 |
| Bad Kissingen | -7.437 | -8.119 | -7.226 | -7.029* |
| Bad Mergentheim | 0.581 | 0.619 | 0.583 | 0.544 |
| Isny | 0.350 | 0.381 | 0.278 | 0.397 |
| Glottertal | 0.379 | 0.365 | 0.375 | 0.397 |
| Female | 0.069 | 0.048 | 0.069 | 0.088 |
| Age (years) | 0.202 | 0.206 | 0.278++ | 0.118 |
| Native | 0.310 | 0.365 | 0.319 | 0.250 |
| Married | 48.576 | 47.159 | 49.194 | 49.235 |
| # of Observations | 203 | 63 | 72 | 68 |

Notes: ** deviation from control group significant at 5%, * significant at 10%; °° deviation from EUR 150‐group significant at 5%, ° significant at 10%; standard deviations omitted because of most variables being binary. ‘Bad Mergentheim’, ‘Bad Kissingen’, ‘Isny’, and ‘Glottertal’ refer to the locations of the four rehabilitation clinics.
Appendix – Table A2: Covariate Adjusted Effects of Weight-loss Premiums (Inverse-probability Weighting)

                                  EUR 150   EUR 300

Start of Weight-loss Phase to End of Weight-maintenance Phase (10 Months)
Percentage Change in Body Weight  -1.618    -2.984**
                                  (1.143)   (1.310)
Target Weight Realized            0.106     0.062
                                  (0.079)   (0.078)
# of Observations (Unweighted)    234

Start of Weight-loss Phase to End of Follow-up Phase (22 Months)
Percentage Change in Body Weight  0.185     -3.171*
                                  (1.636)   (1.671)
Target Weight Realized            0.085     0.121
                                  (0.082)   (0.093)
# of Observations (Unweighted)    176

Notes: ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; S.E.s for estimated means in parentheses.

Appendix – Table A3: Covariate Adjusted Effects of Weight-Maintenance Premiums by Characteristics at End of Weight-loss Phase

End of Weight-loss Phase:         All                 Target Weight       Target Weight       Premium Group       Control Group
                                                      Achieved            Not Achieved
                                  EUR 250   EUR 500   EUR 250   EUR 500   EUR 250   EUR 500   EUR 250   EUR 500   EUR 250   EUR 500

Start of Weight-loss Phase to End of Weight-maintenance Phase (10 Months)
Percentage Change in Body Weight  -2.681**  -2.107**  -3.673**  -2.863    -2.190*   -2.259*   -2.814*   -2.675**  -2.154    1.850++
                                  (0.991)   (0.983)   (1.682)   (1.757)   (1.285)   (1.289)   (1.220)   (1.188)   (2.221)   (1.976)
Target Weight Realized            0.127     0.195**   0.207*    0.192*    -0.011+   0.100     0.164*    0.238**   -0.102++  0.085
                                  (0.080)   (0.080)   (0.109)   (0.114)   (0.114)   (0.115)   (0.097)   (0.094)   (0.224)   (0.199)
# of Observations (Unweighted)    263                 154                 109                 211                 52

Start of Weight-loss Phase to End of Follow-up Phase (22 Months)
Percentage Change in Body Weight  0.151     -0.070    -1.047    -0.801    -0.943    -1.206    0.434     -0.681    -2.706    -7.086+
                                  (1.544)   (1.613)   (2.521)   (2.606)   (2.369)   (2.426)   (1.884)   (1.822)   (3.395)   (4.061)
Target Weight Realized            -0.007    0.002     0.019     0.007     -0.062    -0.019    -0.004    0.020     0.164     0.425++
                                  (0.090)   (0.094)   (0.145)   (0.150)   (0.131)   (0.134)   (0.110)   (0.106)   (0.261)   (0.313)
# of Observations (Unweighted)    203                 119                 84                  161                 42

Notes: # Lower Absolute Lee-Bound; ** significant at 5%, * significant at 10%; °° difference between premium groups significant at 5%, ° significant at 10%; ++ difference across subsamples (full vs. partial target weight achievement in weight-loss phase and weight-loss premium vs. control group, respectively) significant at 5%, + significant at 10%; S.E.s for estimated means in parentheses.
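As a rough plausibility check of the significance markers in Appendix Table A2, the following sketch recomputes two-sided p-values from the reported estimates and the standard errors in parentheses. It assumes a standard normal approximation for the ratio of estimate to standard error; this is a simplification made here for illustration and not necessarily the inference procedure behind the table. The names TABLE_A2, two_sided_p, and stars are hypothetical and do not appear in the paper.

```python
from statistics import NormalDist

# (estimate, standard error) pairs copied from Appendix Table A2,
# row "Percentage Change in Body Weight".
TABLE_A2 = {
    ("10 months", "EUR 150"): (-1.618, 1.143),
    ("10 months", "EUR 300"): (-2.984, 1.310),
    ("22 months", "EUR 150"): (0.185, 1.636),
    ("22 months", "EUR 300"): (-3.171, 1.671),
}


def two_sided_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value, assuming estimate / std_error is standard normal."""
    z = estimate / std_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def stars(p_value: float) -> str:
    """Map a p-value to the table's markers: ** for 5%, * for 10%."""
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


if __name__ == "__main__":
    for (horizon, group), (est, se) in TABLE_A2.items():
        p = two_sided_p(est, se)
        print(f"{horizon:>9}  {group}: z = {est / se:6.2f}, p = {p:.3f} {stars(p)}")
```

Under this approximation, only the two EUR 300 entries receive a marker (5% level at 10 months, 10% level at 22 months), which is in line with the markers printed in the table.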