Policy Research Working Paper 11070

Effects of a Community-Driven Water, Sanitation, and Hygiene Intervention on Diarrhea, Child Growth, and Local Institutions

A Cluster-Randomized Controlled Trial in Rural Democratic Republic of Congo

John P Quattrochi
Kevin Croke
Caleb Dohou
Luca Stanus Ghib
Yannick Lokaya
Aidan Coville
Eric Mvukiyehe

Development Economics
Development Impact Group
February 2025

A verified reproducibility package for this paper is available at http://reproducibility.worldbank.org.

Abstract

Diarrhea and growth faltering in early childhood reduce survival and impair neurodevelopment. This paper assesses whether a national program in the Democratic Republic of Congo reduced diarrhea and stunting and strengthened local water and sanitation institutions. The program combined (i) funds for latrine and water upgrades, (ii) institutional strengthening activities, and (iii) behavior change campaigns. In 2018, the program was randomly assigned, after stratifying by province and cluster size, with 50 intervention and 71 control clusters. In 2022–23, 3,283 households were interviewed, at a median of 3.6 years post-intervention. The intervention had no effect on diarrhea and no effect on length-for-age Z-scores in children. Villages in the intervention group had a 0.40 higher score on the water, sanitation, and hygiene institutions index. The percentage of villages in the intervention group with an active water, sanitation, and hygiene (or just water) committee was 21 percentage points higher than the control group. Households in the intervention group were 24 percentage points more likely to report using an improved water source, 18 percentage points more likely to report using an improved sanitation facility, and reported more positive perceptions of water governance. The Democratic Republic of Congo’s national rural water, sanitation, and hygiene program increased access to improved water and sanitation infrastructure, and created new water, sanitation, and hygiene institutions, all of which persisted for more than three years. However, these effects were not sufficient to reduce diarrhea or growth faltering.

This paper is a product of the Development Impact Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at acoville@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team

Effects of a Community-Driven Water, Sanitation, and Hygiene Intervention on Diarrhea, Child Growth, and Local Institutions: A Cluster-Randomized Controlled Trial in Rural Democratic Republic of Congo

John P Quattrochi 1, Kevin Croke 2, Caleb Dohou 3, Luca Stanus Ghib 3, Yannick Lokaya 3, Aidan Coville 3, Eric Mvukiyehe 4

1 Graduate School of Arts & Sciences, Georgetown University
2 Department of Global Health & Population, Harvard TH Chan School of Public Health
3 Development Impact (DIME) Department, World Bank
4 Department of Political Science, Duke University

Corresponding author is Aidan Coville (acoville@worldbank.org). This paper was made possible through collaboration between the Foreign, Commonwealth, and Development Office (FCDO-DRC), FCDO’s Evaluation Unit, the United Nations Children's Fund (UNICEF-DRC), the VEA Coordination Team from DRC’s ministries of Health and Education, and the World Bank's Development Impact Department (DIME). We thank several members of these organizations for their continued leadership and persistent efforts. This study was funded by FCDO-DRC under the World Bank-administered i2i Trust Fund. DIME Analytics conducted a reproducibility review of the data and results in this paper. The study received ethical clearance from IRB Solutions (Protocol #2019/10/20) and from the Research Center for Health Promotion (CRPS) Institutional Ethical Committee (CEI) of the Institut Supérieur des Techniques Médicales de Bukavu (ISTM-Bukavu) in the DRC, and a Health Research Approval from the DRC Ministry of Health. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Introduction

The most recent estimates of the global burden of morbidity and mortality attributable to unsafe water, sanitation, and hygiene (WASH) are that 1.4 million deaths and 74 million disability-adjusted life years lost could have been prevented in 2019 [1]. People living with unsafe WASH have higher exposure to fecal-oral pathogens, resulting in enteric dysfunction, diarrheal illnesses, and, in children, growth faltering. Growth faltering, in turn, has long-term negative impacts on health, cognition, and human capital [2,3]. In 2020, 2.0 billion people did not have access to safely managed drinking water services, 3.6 billion did not have access to safely managed sanitation services, and 2.3 billion did not have access to handwashing facilities with soap and water at home [4,5]. While access has been increasing, progress will need to accelerate by three-to-six-fold to meet the Sustainable Development Goals for 2030 [6]. The challenge of increasing access is particularly acute for people living in or near armed conflict – one in six people worldwide – both through the direct effects of conflict and because violence and insecurity impede collective action to provide public goods like WASH [7,8]. In the Democratic Republic of Congo (DRC), as of 2020, 48 million people still lacked basic drinking water services, 11 million people still practiced open defecation, and 72 million people still lacked basic hygiene services [9].

To increase access to safe WASH, governments and donors have increasingly turned to community-led approaches. While WASH experts called for greater community participation for over 30 years [10], the sector did not fully embrace this approach until the late 1990s. Beginning with community-led total sanitation in Bangladesh in 1999, community-led WASH programs have been implemented in at least 60 countries, and 15 countries have incorporated them into national policy [11,12].

Despite this broad adoption, the health effects of community-led WASH interventions – and of many WASH interventions in general – remain poorly understood. The accumulation of evidence has accelerated in recent years, but the length of the causal chain from the intervention to the outcome, and the vast design space for WASH interventions (which can incorporate behavior change campaigns, infrastructure, institutions, and/or new technologies), means that many fundamental questions remain unanswered. A meta-analysis of 13 randomized WASH trials found no effect on child length-for-age, but a meta-analysis of 124 WASH studies (randomized and observational) found a protective effect against diarrhea [11,13–24]. There is a great deal of heterogeneity in effect size across studies, likely due to variation in intervention components and intensity, and to the influence of contextual factors such as baseline exposure to fecal matter.

To our knowledge, this is the first trial of an intervention that combines the creation of new institutions, funding for new or improved infrastructure, and a behavior change campaign, all within a locally-led process of targeting and implementation. We study this complex intervention as it is implemented at scale, providing a realistic estimate of effectiveness for policy makers in similar contexts. Our follow-up period is unusually long (3.6 years), enabling us to address the question of sustainability. We also provide evidence in a conflict-affected setting, where WASH interventions have rarely been evaluated using experimental designs.

The intervention is the ‘healthy villages’ component of the DRC’s Healthy Villages & Schools program, co-led by the Ministries of Public Health, and of Primary, Secondary, and Professional Education, with support from the United Nations Children’s Fund (UNICEF). Since 2008, nearly 9 million people in almost 11,000 villages have been reached with WASH services through the program.
Healthy Villages & Schools was the largest WASH program implemented by UNICEF globally and comprised 90% of total external funding committed to rural WASH in the DRC from 2005 to 2020 [25]. Our goal was to estimate the effect of Healthy Villages & Schools on diarrhea prevalence, child length-for-age, and WASH institutions.

Methods

Study design and participants

The intervention of interest is a DRC government-run program that began several years before our study. For our study, the government agreed to randomly assign the next phase of the program (i.e., the next group of villages to receive the intervention). In 2018, we randomly assigned groups of villages to intervention or control (details below). Since we did not collect any data prior to randomization, we did not register the study at that time. We collected the first round of data in late 2019, about 5 months after the intervention was implemented (implementation took about one year in each group of villages) [26]. We pre-registered the analysis of that first round of data collection (5-month follow-up), before any of the authors saw the data, in the American Economic Association (AEA) Registry (AEARCTR-0004648) (https://www.socialscienceregistry.org/trials/4648). In February 2021, we registered plans for additional data collection in the Pan-African Clinical Trials Registry (PACTR202102616421588; https://pactr.samrc.ac.za/TrialDisplay.aspx?TrialID=14670). In April 2023, before the PIs had seen any data for the 3.6-year follow-up (i.e., the data for the current manuscript), we updated the registration with our primary and secondary outcomes. In April 2023, we also posted our pre-analysis plan for the 3.6-year follow-up on the AEA Registry (see "Three Year Follow up Pre-Analysis Plan" at https://www.socialscienceregistry.org/trials/4648). All planned studies from this project are now registered and any future work will be registered prospectively. This study is reported in accordance with the CONSORT guideline (S1 Text).

We worked with intervention implementers to design a cluster-randomized trial in rural villages in five DRC provinces: Kongo Central, Kasai, Kasai Central, North Kivu, and South Kivu. The implementers identified 403 candidate villages in which the intervention could be launched during the study period, based on the established criteria for the intervention: that the village was located in a secure and accessible Health Area that was not already served by the WASH Consortium, that the Health Area staff were dynamic and interested in participating, and that there was a problem of diarrhea, cholera, and/or malnutrition. Among these villages, 34 already had program activities in process before research activities began, leaving 369 eligible villages.

To avoid spillover effects from treatment villages to control villages, we grouped those villages into clusters. We considered any villages within 2.5 km of each other (using Euclidean distance between village centroids) to be part of the same cluster. Therefore, all clusters have at least 2.5 km between them. We relaxed this rule in South Kivu, where density is greater, and used a minimum distance of 1 km. In total, this resulted in 124 clusters. North Kivu had only three clusters (covering 30 villages); as a result, it was not logistically feasible to include these villages in the trial. That left 121 clusters (339 villages) in four provinces.

Each village in the intervention clusters received the intervention, as described above. Villages in control clusters did not receive any intervention. Data collection procedures were identical in the two groups.

The study protocol was approved by the Institutional Ethics Committee of the Institut Supérieur des Techniques Médicales de Bukavu (DRC) (#001/2019 & #008/2022) and by Solutions IRB (USA) (#2019/10/20).

With direction from the study investigators, Innovative Hub for Research in Africa (IHfRA) was responsible for data collection.
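
As an illustration of the clustering rule described in this section, the following minimal Python sketch groups villages so that any two villages closer than a distance threshold end up in the same cluster. It is not the study's code: the function name `cluster_villages`, the toy coordinates, and the assumption that centroids are already projected to planar kilometres are all hypothetical. The sketch also assumes that the rule was applied transitively (if A is near B and B is near C, all three share a cluster), which is one reading of the text and the one that guarantees at least the threshold distance between clusters.

```python
import math
from itertools import combinations


def cluster_villages(coords, threshold_km):
    """Group villages so that any pair closer than threshold_km shares a cluster.

    coords maps a village name to (x, y) in kilometres (assumed planar projection).
    Links are applied transitively using a small union-find structure.
    """
    parent = {name: name for name in coords}

    def find(name):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(coords, 2):
        (xa, ya), (xb, yb) = coords[a], coords[b]
        if math.hypot(xa - xb, ya - yb) < threshold_km:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in coords:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy example: A-B and B-C are 2 km apart, so A, B, C chain into one cluster.
villages = {"A": (0.0, 0.0), "B": (2.0, 0.0), "C": (4.0, 0.0), "D": (10.0, 0.0)}
clusters = cluster_villages(villages, threshold_km=2.5)
print(clusters)
```

Calling the same function with `threshold_km=1.0` mirrors the relaxed rule used in South Kivu; in the toy data it leaves every village in its own cluster.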
No study data was collected by intervention implementers.

All residents of the targeted villages who had lived in the village for at least four years (i.e., moved to the village prior to the intervention) were eligible to participate in the study. Two groups of households were interviewed: households that were interviewed at 5 months post-intervention (4 per village) and households that had not been previously interviewed (6 per village). Both groups were randomly selected (at their respective entries into the study) as follows: from the center of the village, interviewers went in opposite directions to the nth household, where n was a randomly selected number between 1 and 20. We interviewed the head woman of the household. We also asked to measure the height and weight of the youngest child aged 2-5 years, or, if none, the oldest child aged 0-2 years. Adult participants provided informed consent verbally, which was recorded electronically.

Randomization and masking

A total of 339 villages in 121 clusters were eligible for randomization. In all provinces except Kasai Central, each cluster was given equal probability of being selected for treatment or control. In Kasai Central, due to budget constraints, we increased the probability of being selected into the control group to 75%, to reflect the fact that only 16 out of 81 villages could receive the intervention. Thus the allocation probability for the intervention group was 25%. We block randomized by province and number-of-villages-per-cluster (12 blocks total; see S1 Table).

Since randomization was based on clusters but the implementing organization’s operational targets were based on villages, it was not possible to force the randomization to select the exact number of villages targeted without introducing potential bias. Instead, we compared the number of target villages per province to the number of treatment villages selected after randomization.
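
The block randomization described above can be sketched as follows. This is an illustrative Python sketch only; the actual randomization was done in Stata and its exact mechanics are not given in the text. The field names (`id`, `province`, `size_bin`), the function name `block_randomize`, and the rule of assigning a fixed share of shuffled clusters within each block are assumptions made for the example.

```python
import random
from collections import defaultdict


def block_randomize(clusters, p_treat_by_province, seed=0):
    """Assign clusters to arms within blocks of (province, size_bin).

    clusters: list of dicts with hypothetical keys 'id', 'province', 'size_bin'.
    p_treat_by_province: share of clusters in each block assigned to intervention.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    blocks = defaultdict(list)
    for c in clusters:
        blocks[(c["province"], c["size_bin"])].append(c["id"])

    assignment = {}
    for (province, _size_bin), ids in sorted(blocks.items()):
        ids = sorted(ids)  # stable order before shuffling
        rng.shuffle(ids)
        # Round half up (Python's round() would round halves to even).
        n_treat = int(len(ids) * p_treat_by_province[province] + 0.5)
        for position, cluster_id in enumerate(ids):
            arm = "intervention" if position < n_treat else "control"
            assignment[cluster_id] = arm
    return assignment


# Toy data: equal allocation in one province, 25% intervention in another.
clusters = (
    [{"id": f"KC{i}", "province": "Kongo Central", "size_bin": "small"} for i in range(8)]
    + [{"id": f"KS{i}", "province": "Kasai Central", "size_bin": "small"} for i in range(8)]
)
probs = {"Kongo Central": 0.5, "Kasai Central": 0.25}
assignment = block_randomize(clusters, probs, seed=2018)
n_treated = sum(arm == "intervention" for arm in assignment.values())
print(n_treated, "of", len(assignment), "clusters assigned to intervention")
```

Randomizing within blocks keeps the arms balanced on province and cluster size by construction, which is why the analysis later includes an indicator for each block.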
In cases where the number of intervention villages was larger than the operational target, we randomly dropped an equal number of intervention villages from the largest control and intervention clusters until operational targets were met. We dropped 2 villages in Kongo Central and 4 villages in Kasai. We also dropped one control village in Kasai due to a coding error. This left 146 villages in 50 clusters in the intervention group, and 186 villages in 71 clusters in the control group. S1 Table shows how intervention and control villages and clusters are distributed across provinces. Randomization was done by the research team in Stata.

Due to the participatory and visible nature of the intervention, neither participants nor data collectors were masked to treatment status. However, data collectors did not participate in intervention implementation and were employed by a separate, independent organization.

Procedures

The intervention, “Healthy Villages & Schools”, was developed by the DRC government and UNICEF. We focused on the village rather than the school component. This program mobilizes communities to become a “Healthy Village” with 3-6 months of support from government health officials and local NGOs, including approximately $2,000 of financing for new or improved water infrastructure, $2,000 for new or improved sanitation infrastructure, and $3,000 for personnel costs, per village. The mean village size in the intervention group was 456 people (median 400; IQR 502).

The seven norms to become a Healthy Village are:

1. There is a dynamic village WASH committee.
2. At least 80% of the population has access to safe drinking water.
3. At least 80% of households use a hygienic latrine.
4. At least 80% of households dispose of their household waste hygienically.
5. At least 60% of the population washes their hands before eating and after going to the latrine.
6. At least 70% of the population is aware of fecal-oral disease transmission and how to prevent this.
7. The village is cleaned at least once a month.

The program is implemented in nine steps (Table 1) [27].

Table 1. The Healthy Village and Schools Program’s Nine Steps

| Step | Description |
|---|---|
| 0 | The community learns about the program and collectively decides to adopt it before submitting a formal request to the relevant Health Zone. (A Health Zone is a geographic unit of the Congolese health system that contains roughly ten Health Areas and 100,000 residents, run by a Chief Medical Officer (CMO).) Program protocols state that the entire community should be involved in the decision to participate. |
| 1 | A statement of agreement between the community and the Health Zone is signed. |
| 2 | Health Zone officials survey 19 households on knowledge, attitudes, and practices (KAP). The community self-evaluates on eight practices, including handwashing, water use, and sanitation. |
| 3 | The community spends about 11 hours over five days creating calendars and maps, visiting water points, classifying hygienic practices as healthy or unhealthy, discussing fecal-oral disease transmission, calculating medical costs, and assessing which individuals and organizations influence sanitation and hygiene in the community. This includes 1.5-2 hours in a facilitated activity around the question, “What are the hygiene practices that we want to change in our village?” |
| 4 | The Health Zone provides training for 20 volunteers on maintenance of latrines, water supply systems, and sanitation, conflict management, and petty cash management. The community elects a village WASH committee. |
| 5 | The community spends ten hours over three days describing a community vision, analyzing the barriers to reducing diarrheal diseases, choosing improvements to drinking water, sanitation, and hygiene, and formulating an action plan. The community is asked to identify practical, low-cost solutions with a minimum of outside assistance. New infrastructure is evaluated in terms of accessibility, technical feasibility, and technical capacity. |
| 6 | The community builds new infrastructure over 90-180 days, supported by project funds. Key messages about sanitation and hygiene are discussed during sensitization meetings or during visits to families by the WASH committee, community health workers, or other volunteers. Health Zone staff are expected to visit the community monthly during this time; Health Area staff weekly. |
| 7 | The community self-evaluates again, to measure progress since Step 2. The Health Zone conducts additional KAP surveys and hosts three hours of meetings to assess the findings and make a plan to maintain progress. |
| 8 | The CMO spends one day in the community to assess whether or not the community has completed its action plan and achieved the seven norms. If they have, a certification ceremony is held. The CMO and the village WASH committee develop a Community Action Plan for Maintenance so that the changes achieved through the program can be sustained over time. |

The IHfRA data collection team used electronic tablets and transmitted data to a cloud-based server, allowing the research team to conduct quality control measures in real-time, checking for consistency and errors. Additionally, IHfRA randomly selected 15% of villages for a second round of interviews, by different interviewers, with a shorter questionnaire, to check consistency across key variables. Separately, children from two households per village had their height and weight re-measured by an IHfRA supervisor, as a quality check.

Outcomes

Primary outcomes were caregiver-reported diarrhea in the last seven days among all children who were under 5 years old at the time of the survey, length-for-age Z score for a randomly selected child in each household, and a WASH governance index. If the household had one child between age 2 and 5, we measured the length and weight of that child. If the household had more than one child between age 2 and 5, we randomly selected one child.
If the household had no children between ages 2 and 5, but one or more children between ages 0 and 2, we measured length and weight of one of those children (randomly selected). Salter scale (Model 235 6S) and wall-mounted measuring rods (portable baby/child length/height measuring system) were used. The WASH governance index combined questions about the presence of a water committee, frequency of committee meetings, WASH expenses (excluding maintenance), presence of a maintenance plan, whether the committee tracks health conditions in the community, and whether it tracks hygiene and sanitation.

Secondary outcomes were access to improved water and sanitation facilities, water quality at water points and in homes, hygiene knowledge and behaviors, observed handwashing, perceptions of WASH governance, children’s school absenteeism, child weight-for-age Z score, and child weight-for-height Z score.

Structured observation of handwashing was done in four households per village. We first attempted to observe the four households that were interviewed at the five-month follow-up; if any were unavailable or unwilling, we randomly selected from the six new households to replace them. A research assistant spent two hours in each of these households, recording if handwashing occurred at critical junctures such as before preparing food (see S8 Table for full list of junctures). This took place before the interview, to minimize Hawthorne effects.

Access to improved water and sanitation was self-reported, i.e., respondents reported whether their main water source is improved or not according to the Joint Monitoring Program standard definitions (e.g., boreholes are considered improved, while unprotected springs or surface water sources are not). Cost paid by households for water and time spent collecting water were also self-reported.

To measure water quality, we tested samples collected (i) at each of the water points used by members of each village, and (ii) at household water storage containers in six randomly selected households per village, on average. Testing was done concurrently with the household interviews. We used the Aquagenx Compartment Bag Test E. coli + Total Coliform Most Probable Number (MPN) Kit. This measured the MPN of fecal indicator bacteria [28].

To measure subjective performance of local WASH institutions, we used survey questions about: fairness of selection of water governance entity, perception of fair treatment, confidence in the entity’s management of money, confidence in the entity’s response to infrastructure breakdowns, confidence in management, and overall satisfaction.

The data collection team also directly observed water point functionality. Length of breakdowns was reported via a water committee and village leader survey.

Statistical analyses

The number of village clusters in the study was determined by the program budget and proximity of villages to one another (details above). Sample size calculations were used to determine how many households in each village should be interviewed. Based on the primary outcome of diarrhea prevalence, with a minimum detectable effect of 8 percentage points (pp), 32% prevalence in the control group (based on the 5-month follow-up results), intracluster correlation of 0.09 (based on 5-month follow-up), and 1.3 children under 5 years per household (based on 5-month follow-up), we required 10 households per village.

We estimated intervention effects according to random assignment (intention to treat), irrespective of adherence to the intervention. For both primary and secondary outcomes, whether binary or continuous, we fitted linear models with a binary variable indicating whether the participant was in a treatment or control cluster [29].
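
To make the estimation approach in this section concrete, the following is a minimal, standard-library Python sketch of an intention-to-treat estimate from a linear model with an indicator for each randomization stratum and standard errors clustered by village cluster. It is an illustration under assumptions, not the study's Stata code: the function name `itt_estimate` and the toy data are hypothetical, the child age and gender covariates are omitted, and the variance is the basic cluster-robust sandwich without the small-sample corrections that statistical packages usually apply, so its standard errors would differ slightly from packaged output.

```python
import math
from collections import defaultdict


def itt_estimate(rows):
    """OLS of outcome on treatment with stratum fixed effects.

    rows: list of (outcome, treated, stratum, cluster) tuples.
    Fixed effects are absorbed by demeaning outcome and treatment within
    each stratum (Frisch-Waugh), which yields the same treatment
    coefficient as including one indicator per stratum.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # stratum -> [sum y, sum d, n]
    for y, d, stratum, _cluster in rows:
        acc = totals[stratum]
        acc[0] += y
        acc[1] += d
        acc[2] += 1

    y_res, d_res = [], []
    for y, d, stratum, _cluster in rows:
        sum_y, sum_d, n = totals[stratum]
        y_res.append(y - sum_y / n)
        d_res.append(d - sum_d / n)

    sdd = sum(d * d for d in d_res)
    beta = sum(d * y for d, y in zip(d_res, y_res)) / sdd

    # Cluster-robust variance: sum score contributions within each cluster.
    score = defaultdict(float)
    for (_y, _d, _s, cluster), d, y in zip(rows, d_res, y_res):
        score[cluster] += d * (y - beta * d)
    variance = sum(s * s for s in score.values()) / sdd ** 2
    return beta, math.sqrt(variance)


# Toy data: two strata, each with one treated and one control cluster.
rows = [
    (1.0, 1, "s1", "c1"), (2.0, 1, "s1", "c1"),
    (0.0, 0, "s1", "c2"), (0.0, 0, "s1", "c2"),
    (3.0, 1, "s2", "c3"), (3.0, 1, "s2", "c3"),
    (2.0, 0, "s2", "c4"), (2.0, 0, "s2", "c4"),
]
beta, se = itt_estimate(rows)
print(f"ITT estimate: {beta:.3f}, clustered SE: {se:.3f}")
```

For a binary outcome such as diarrhea, the coefficient from this linear model is read as a difference in prevalence between arms, in proportion units.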
Because randomization was stratified by province and number-of-villages-per-cluster, we included binary variables (fixed effects) for each stratum (n=12) in the model. We also included gender and age (in months) indicator variables for all child health outcomes. We clustered standard errors at the cluster (i.e., group of villages) level.

For the primary outcome of WASH institutions, and secondary outcomes consisting of multiple measures, we calculated a summary index to avoid over-rejection of the null hypothesis due to multiple inferences. We rescaled each outcome so that higher values implied better outcomes, and averaged standardized values relative to the control group. Treatment effects were estimated as the difference in the summary index between treatment and control groups, such that treatment effects are expressed in standard deviation units relative to the control group.

We pre-registered two analyses restricted by subgroup: by province (for all three primary outcomes), and by gender (for diarrhea and length-for-age). We tested for interaction on the additive scale, using interaction terms in linear models.

Statistical analyses were conducted in Stata version 16.0.

Role of the funding source

This study was funded by the UK Foreign, Commonwealth, and Development Office (FCDO) (https://www.gov.uk/government/organisations/foreign-commonwealth-development-office), via Amendment No. 3 to the Supplemental Arrangement with the World Bank regarding Multi-Donor Trust Fund for Impact Evaluation to Development Impact (TF072617, parallel to TF072161). AC, KC, and EM were staff at Development Impact at that time. The Healthy Villages & Schools program was a DRC government national program funded by UK’s FCDO and implemented with UNICEF’s support. The funder and implementing partners provided inputs at the design stage to ensure the study addressed policy and program priorities of importance to them.
The funders had no role in data collection and analysis, decision to publish, or preparation of the manuscript.

Results

Of the 1,312 respondents in 328 villages interviewed for the 5-month follow-up, 1,133 (86%) in 328 villages were re-interviewed at the 3.6-year follow-up, between November 24, 2022 and February 10, 2023 (Figure 1). We also reached two villages that were not accessible during 5-month follow-up and lost one village that was surveyed at 3.6-year follow-up due to conflicts. In 39 households with a 5-month follow-up interview, a new respondent was interviewed at 3.6 years, and 140 households (11%) were replaced between 5-month and 3.6-year follow-ups. Additionally, in each village at the 3.6-year follow-up, six never-previously-interviewed households were randomly selected, conditional on having lived in the village for at least four years, yielding 1,970 interviews (in four villages, only five households were reached). Thus, at 3.6 years, we interviewed a total of 3,283 households. Of those households, 75% (2,466 out of 3,283) had at least one child eligible for a caregiver’s reports of diarrhea, and 72% (2,374 out of 3,283) had at least one child eligible for length and weight measurement. The primary outcome of WASH institutions was measured in 329 villages.

In the intervention group, the median time since the completion of Healthy Villages Step 6 (construction of infrastructure) was 3.6 years (IQR = 3.4 to 3.7).

At 3.6 years, respondents in intervention and control groups were similar with regard to characteristics unlikely to be affected by the intervention, such as marital status, educational attainment, age, religion, household size, and home construction materials (Table 2).

Table 2. Household and respondent characteristics by intervention group, at 3.6-year follow-up

| Outcomes | Control n | Control Mean | Control SD | Intervention n | Intervention Mean | Intervention SD | Adj. Diff. | p-value |
|---|---|---|---|---|---|---|---|---|
| Household has improved roof | 1843 | 0.42 | 0.49 | 1436 | 0.47 | 0.50 | -0.01 | 0.81 |
| Household has improved wall | 1845 | 0.01 | 0.09 | 1438 | 0.01 | 0.12 | 0.01 | 0.18 |
| Household has improved floor | 1816 | 0.04 | 0.19 | 1430 | 0.08 | 0.27 | 0.03 | 0.11 |
| Household size | 1845 | 7.20 | 2.86 | 1438 | 7.18 | 2.93 | 0.03 | 0.80 |
| Respondent identifies as Catholic | 1845 | 0.18 | 0.38 | 1438 | 0.19 | 0.40 | 0.02 | 0.54 |
| Respondent identifies as Protestant | 1845 | 0.31 | 0.46 | 1438 | 0.32 | 0.47 | -0.06 | 0.05 |
| Respondent identifies with other religion | 1845 | 0.02 | 0.15 | 1438 | 0.03 | 0.18 | 0.01 | 0.26 |
| Respondent age | 1845 | 40 | 13.39 | 1438 | 40 | 13.07 | 0.72 | 0.25 |
| Respondent has completed primary school | 1845 | 0.31 | 0.46 | 1438 | 0.34 | 0.48 | -0.02 | 0.50 |
| Respondent has completed secondary school | 1845 | 0.06 | 0.23 | 1438 | 0.07 | 0.25 | 0.00 | 0.78 |
| Respondent is married or cohabitating | 1845 | 0.83 | 0.37 | 1438 | 0.82 | 0.38 | -0.01 | 0.49 |

Adj. diff. = adjusted difference between intervention group and control group, estimated with models that include controls for randomization blocks based on province and number of villages per cluster, and standard errors clustered by cluster. There were 121 clusters in total. Improved roof=1 if roof is finished roofing (i.e., metal, wood, calamine/cement fiber, ceramic tiles, cement or roofing shingles); improved walls=1 if walls are ‘finished walls’; improved floor=1 if floor is ‘finished floor’. All variables are binary except ‘household size’ and ‘respondent age’; for the binary variables, the mean represents the proportion of respondents who are in the listed category.

In the intervention group, 96% of villages reported that they created a community action plan and prioritized actions to improve WASH (as instructed by the program); 86% reported that they had implemented that plan.

The intervention had no effect on diarrhea (adjusted mean difference -0.01 [95% CI -0.05–0.03]) (Table 3). Diarrhea prevalence was high overall, at 38% in the treatment group and 42% in the control group.
The intracluster correlation coefficient (ICC) for diarrhea in the control group was 0.05; in the intervention group, 0.07.

Table 3. Intervention effects on primary outcomes: diarrhea, length-for-age, and WASH institutions

| Outcomes | Control n | Control Prevalence/Mean | Control SD | Control ICC | Intervention n | Intervention Prevalence/Mean | Intervention SD | Intervention ICC | ITT | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| Diarrhea prevalence | 2310 | 42% | | 0.05 | 1762 | 38% | | 0.07 | -0.01 | -0.05, 0.03 |
| Length-for-age Z-score | 1223 | -2.18 | 1.60 | 0.04 | 919 | -2.20 | 1.59 | 0.06 | -0.01 | -0.15, 0.12 |
| WASH institutions index | 185 | 0.00 | 1.00 | 0.47 | 144 | 0.46 | 0.75 | 0.11 | 0.40 | 0.16, 0.65 |

ITT = intention-to-treat effect estimate; ICC = intracluster correlation; HH = household. Effects are estimated with models that include controls for randomization blocks based on province and number of villages per cluster. There were 121 clusters in total. The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units.

The intervention had no effect on length-for-age Z-scores in children (adjusted mean difference -0.01 [95% CI -0.15–0.12]). In the control group, the mean length-for-age Z-score was -2.18 (1.60 SD) (Figure 2). The ICC for length-for-age Z-score in the control group was 0.04; in the intervention group, 0.06.

Villages in the intervention group had a 0.40 higher score on the WASH institutions index (95% CI 0.16–0.65). The percentage of villages in the intervention group with an active water, sanitation, and hygiene (or just water) committee was 21 pp higher than the control group. The ICC for the WASH institutions index in the control group was 0.47; in the intervention group, 0.11.
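
The index construction summarized in the note to Table 3 can be illustrated with the following minimal Python sketch. It is a sketch under assumptions rather than the study's code: the function name `summary_index`, the component names, and the toy data are hypothetical, and missing values are not handled. The final re-standardization step, which gives the control group a mean of 0 and a standard deviation of 1, is also an assumption; it is consistent with the control-group values reported in Table 3 but is not stated explicitly in the text.

```python
from statistics import mean, stdev


def summary_index(units, components, flip=()):
    """Average of component z-scores, standardized against the control group.

    units: list of dicts with a boolean 'treated' key plus one key per component.
    flip: components where lower raw values are better; their sign is reversed
    so that higher always means a better outcome.
    """
    control_rows = [i for i, u in enumerate(units) if not u["treated"]]

    z_scores = []
    for comp in components:
        sign = -1.0 if comp in flip else 1.0
        values = [sign * u[comp] for u in units]
        control_values = [values[i] for i in control_rows]
        mu, sd = mean(control_values), stdev(control_values)
        z_scores.append([(v - mu) / sd for v in values])

    # Average the standardized components for each unit.
    raw = [mean(z[i] for z in z_scores) for i in range(len(units))]

    # Assumed final step: rescale so the control group has mean 0 and SD 1.
    control_raw = [raw[i] for i in control_rows]
    mu, sd = mean(control_raw), stdev(control_raw)
    return [(v - mu) / sd for v in raw]


# Toy data with two hypothetical index components.
villages = [
    {"treated": False, "has_committee": 0, "meetings_per_year": 0},
    {"treated": False, "has_committee": 1, "meetings_per_year": 4},
    {"treated": False, "has_committee": 0, "meetings_per_year": 2},
    {"treated": True, "has_committee": 1, "meetings_per_year": 6},
    {"treated": True, "has_committee": 1, "meetings_per_year": 12},
]
index = summary_index(villages, ["has_committee", "meetings_per_year"])
control_index = [x for x, v in zip(index, villages) if not v["treated"]]
treated_index = [x for x, v in zip(index, villages) if v["treated"]]
print(round(mean(treated_index) - mean(control_index), 2))
```

Because every component is expressed in control-group standard deviations before averaging, a difference in the index between arms can be read in standard deviation units, as in the reported 0.40 effect.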
Households in the intervention group were 24 pp (95% CI 12–36) more likely to report using an improved water source, 18 pp (95% CI 10–25) more likely to report using an improved sanitation facility, and reported more positive perceptions of water governance (adjusted difference 0.19 SD [95% CI 0.04–0.34]) (S3 Table). The more positive perceptions of water governance were driven by higher reported satisfaction with water access (0.56 points higher [95% CI 0.27–0.85] on a 1–5 scale). Intervention group households were also 9 pp more likely to report paying for water (95% CI 0–19). Conditional on paying for water, there was no difference in the amount paid between the intervention and control groups. The intervention had no effect on time spent collecting water.

Intervention group water points were 11 pp less likely to be currently functional (adjusted difference -11 pp; 95% CI -18 to -5); 97% of water points in control villages and 85% of water points in intervention villages were functioning.

Intervention group respondents scored 0.23 SD higher (95% CI 0.12–0.34) on the self-reported hygiene and behavior index. This was driven by several measures. The intervention group was 3 pp more likely to report treating their water (95% CI 1–5), and 9 pp more likely to report handwashing with soap or ash at least once in the previous day (95% CI 6–13). The intervention group also scored 0.43 points higher (95% CI 0.19–0.67) on the handwashing knowledge scale (range 0–10). However, in structured observations of handwashing behavior, there was no difference between intervention and control households in measures of any observed handwashing or handwashing with soap or ash.

Water samples from intervention village water points showed a small but statistically significant reduction in thermotolerant coliforms per 100mL compared to control village samples (-0.17 adjusted mean difference in log10(MPN); 95% CI -0.32– -0.02). Overall water quality was low, even from improved sources.
Among unimproved sources, 86% of samples had coliform levels over 100 per 100mL, ‘very high risk’ according to WHO standards [30]; among improved sources, 60% of samples had coliform levels over 100 per 100mL. Water samples from intervention household water containers showed no difference in thermotolerant coliforms per 100mL compared to control household samples.

The intervention had no effect on the psychological well-being index, the life satisfaction & self-esteem index, or school attendance.

In the prespecified subgroup analysis of primary outcomes by province, we find that the intervention reduced diarrhea in one of four provinces (Kongo Central), reduced length-for-age in one province (Kasai Central), and increased the WASH institutions index in two provinces (Kasai and Kasai Central) (see S4 Table). We also used interaction terms in linear models to test for effect modification on the additive scale (S5 Table). Of the nine coefficients (three provinces, leaving out a reference, and three primary outcomes), two were statistically significant: in Kasai Central province, the results suggest that the intervention increased diarrhea prevalence relative to the treatment effect in the reference province (Kongo Central); in Kasai province, the results suggest that the intervention increased length-for-age relative to the reference province. For both diarrhea and length-for-age, the Wald test rejects the null hypothesis that all province-by-intervention coefficients are zero at the 0.05 level. For the WASH institutions index, the Wald test fails to reject the null.

In the prespecified subgroup analysis of diarrhea and length-for-age by child sex, we find no difference in intervention effects by sex (see S6 Table). In linear models with interaction terms for intervention-by-sex, the coefficients are not statistically significant (S7 Table).

Discussion

We tested the effects of the national community-led rural WASH program in the DRC on child length-for-age, diarrhea, and WASH institutions.
The program improved community WASH institutions, with intervention villages more likely to have a WASH committee, and for this institution to actively monitor community health conditions. Intervention villages also had greater access to improved water and sanitation infrastructure as a result of the program. However, we cannot reject the null hypothesis that the intervention had no effect on child length-for-age or diarrhea.

The finding of no effect on length-for-age is consistent with a recent meta-analysis of 11 WASH trials with length-for-age as a primary outcome that found an adjusted mean difference in Z-score between intervention and control of 0.00 (95% CI -0.03–0.04) [13]. These trial results stand in contrast to many observational studies finding that WASH protects against growth faltering [31]. This suggests that the observational results may be confounded by other household or community characteristics.

We measured fecal indicator bacteria in water sources and household water containers. We found no difference between the intervention and control group household water quality, even though intervention households were 24 pp more likely to use an improved water source. This is likely because improved water sources in our study had low-quality water, consistent with evidence from DRC [25] and elsewhere [33]. It may also be linked to recontamination of water between collection and its ultimate use in the household. Overall, intervention water points had only slightly lower levels of fecal indicator bacteria than control water points; log10 MPN/100mL was 0.17 lower in intervention water points (95% CI -0.32– -0.02), with a mean of 1.71 in the control group. This is consistent with a meta-analysis of five WASH trials that found only a 6% reduction in prevalence of enteropathogens in environmental samples [32].
In addition, two-thirds of our respondents reported spending at least one day per week working on an agricultural plot (modal response = 4 days). Of those who did, 95% reported open defecation while in the field, and 91% reported drinking from surface water or unprotected springs. This highlights the challenge of delivering safe and comprehensive WASH services in some agricultural settings.

Our finding of no effect on diarrhea contrasts with a meta-regression (conducted as part of a systematic review) which found that use of an off-site improved water source reduces the relative risk of diarrhea by 19% compared to unimproved water [14]. The same analysis found that basic sanitation without sewer connection lowered the relative risk of diarrhea by 21% relative to unimproved or limited sanitation, which also contrasts with our results, while hygiene interventions reduced the relative risk of diarrhea by 30%. However, another review found that effective handwashing promotion typically requires daily to fortnightly contact between the promoter and participant [31]. It is possible that our intervention achieved that frequency of contact during the most intensive 90–180 days of implementation, but also likely that effects would have faded out by the time of our measurements over three years later.

We found no effect of the intervention on handwashing with soap or ash during structured observations of study participants by our research team. However, participants in the intervention group were 9 pp more likely to report washing with soap or ash the previous day. This underscores the limitations of self-reported data, particularly when the socially desirable outcome is likely to be known and salient.

Sustainability is a widely recognized challenge for WASH interventions. At five months post-intervention, Healthy Villages and Schools increased access to improved water sources and improved sanitation facilities [26].
Notably, these improvements largely persisted to 3.6 years post-intervention, as did the improvements in WASH institutions. Yet these improvements did not result in any measurable effect on diarrhea or growth faltering. This highlights that it is crucial to measure health outcomes directly and not assume that better inputs are sufficient to yield improvements.

This study has several limitations. First, the trial had incomplete adherence: 86% of villages in the intervention group reported that they had implemented the community action plan to address WASH challenges (e.g., by building new infrastructure) by the 3.6-year follow-up. However, this is a realistic level of adherence for a government-implemented program. Indeed, given that many study villages were conflict-affected, the take-up rate was substantial. Second, we have no baseline measures. Although the randomized design means that such measures are not required for unbiased estimates of treatment effects, there are theoretical challenges; for example, if permanent migration out of study villages was affected by the intervention, then our estimates may be biased. However, at 3.6 years post-intervention we were able to re-interview 88% of the households interviewed at 5 months post-intervention, suggesting that migration was relatively rare in our study population. We also restricted households that were newly recruited at 3.6 years to those who had lived in their current residence for at least four years. Third, the outcomes that are self-reported may suffer from reactivity or social-desirability bias.
Our results reinforce calls for more ambitious attempts to improve WASH services to reduce stunting and diarrhea, such as “transformative WASH” [34]. Proponents of transformative WASH argue that some or all of the following may be necessary to produce significant health gains: high community coverage of improved sanitation facilities; living environments free from animal feces; continuous, convenient access to clean water; new approaches to behavior change; or new technologies to deliver WASH services [31]. Others go further and argue for transformative housing, with connections to water and sanitation networks [12]. These critiques are relevant in our setting, given the multiple potential sources of contamination, the low quality of even “improved sources”, and the high burden of disease. Business as usual is not enough.

Data sharing statement

Individual-level, de-identified data from this study and code to reproduce all results are publicly available in the World Bank micro-data catalogue here: https://reproducibility.worldbank.org/index.php/catalog/239.

Figure titles and legends

Fig 1. Trial Profile. HH = household
Fig 2. Distribution of length-for-age Z-scores, intervention and control groups. Length-for-age Z-scores for children aged 0–5 years, in the intervention group (n=919) and the control group (n=1223).

Supporting information

S1 Table. Randomization strata
S2 Table. Intervention effects on WASH institutions index and index sub-components
S3 Table. Intervention effects on all secondary outcomes, including index sub-components
S4 Table. Intervention effects on all primary outcomes, separately by province (pre-specified)
S5 Table. Intervention effects on all primary outcomes, province-by-intervention interaction models
S6 Table. Intervention effects on diarrhea and length-for-age z score, separately by sex (pre-specified)
S7 Table. Intervention effects on diarrhea and length-for-age z score, sex-by-intervention interaction models
S8 Table.
Variable definitions
S1 Text. CONSORT checklist
S2 Text. Protocol and pre-analysis plan, 5-month follow-up
S3 Text. Protocol and pre-analysis plan, 3.6-year follow-up

References

1. Wolf J, Johnston RB, Ambelu A, Arnold BF, Bain R, Brauer M, et al. Burden of disease attributable to unsafe drinking water, sanitation, and hygiene in domestic settings: a global analysis for selected adverse health outcomes. The Lancet. 2023;401: 2060–2071.
2. Grantham-McGregor S, Cheung YB, Cueto S, Glewwe P, Richter L, Strupp B, et al. Developmental potential in the first 5 years for children in developing countries. The Lancet. 2007;369: 60–70.
3. Black MM, Walker SP, Fernald LC, Andersen CT, DiGirolamo AM, Lu C, et al. Early childhood development coming of age: science through the life course. The Lancet. 2017;389: 77–90.
4. Prüss-Ustün A, Wolf J, Bartram J, Clasen T, Cumming O, Freeman MC, et al. Burden of disease from inadequate water, sanitation and hygiene for selected adverse health outcomes: An updated analysis with a focus on low- and middle-income countries. Int J Hyg Environ Health. 2019;222: 765–777. doi:10.1016/j.ijheh.2019.05.004
5. WHO. Progress on household drinking water, sanitation and hygiene 2000-2020: five years into the SDGs. 2021.
6. WHO/UNICEF JMP. Progress on household drinking-water, sanitation and hygiene 2000-2022: Special focus on gender. WHO/UNICEF JMP; 2023. Available: https://www.who.int/publications/m/item/progress-on-household-drinking-water--sanitation-and-hygiene-2000-2022---special-focus-on-gender
7. Bendavid E, Boerma T, Akseer N, Langer A, Malembaka EB, Okiro EA, et al. The effects of armed conflict on the health of women and children. The Lancet. 2021;397: 522–532.
8. Armed Conflict Location and Event Data Conflict Index Mid-Year Update. 2023. Available: https://acleddata.com/acled-conflict-index-mid-year-update-2023
9. UNICEF. Progress on drinking water, sanitation and hygiene in Africa 2000–2020: five years into the SDGs. 2022.
10. Okun DA. The value of water supply and sanitation in development: an assessment. Am J Public Health. 1988;78: 1463–1467. doi:10.2105/AJPH.78.11.1463
11. Pickering AJ, Djebbari H, Lopez C, Coulibaly M, Alzua ML. Effect of a community-led sanitation intervention on child diarrhoea and child growth in rural Mali: a cluster-randomised controlled trial. The Lancet Global Health. 2015;3: e701–e711. doi:10.1016/S2214-109X(15)00144-8
12. Whittington D, Radin M, Jeuland M. Evidence-based policy analysis? The strange case of the randomized controlled trials of community-led total sanitation. Oxford Review of Economic Policy. 2020;36: 191–221.
13. Bekele T, Rawstorne P, Rahman B. Effect of water, sanitation and hygiene interventions alone and combined with nutrition on child growth in low and middle income countries: a systematic review and meta-analysis. BMJ Open. 2020;10: e034812. doi:10.1136/bmjopen-2019-034812
14. Wolf J, Hubbard S, Brauer M, Ambelu A, Arnold BF, Bain R, et al. Effectiveness of interventions to improve drinking water, sanitation, and handwashing with soap on risk of diarrhoeal disease in children in low-income and middle-income settings: a systematic review and meta-analysis. The Lancet. 2022;400: 48–59. doi:10.1016/S0140-6736(22)00937-0
15. Null C, Stewart CP, Pickering AJ, Dentz HN, Arnold BF, Arnold CD, et al. Effects of water quality, sanitation, handwashing, and nutritional interventions on diarrhoea and child growth in rural Kenya: a cluster-randomised controlled trial. The Lancet Global Health. 2018;6: e316–e329. doi:10.1016/S2214-109X(18)30005-6
16. Luby SP, Rahman M, Arnold BF, Unicomb L, Ashraf S, Winch PJ, et al. Effects of water quality, sanitation, handwashing, and nutritional interventions on diarrhoea and child growth in rural Bangladesh: a cluster randomised controlled trial. The Lancet Global Health. 2018;6: e302–e315. doi:10.1016/S2214-109X(17)30490-4
17. Humphrey JH, Mbuya MN, Ntozini R, Moulton LH, Stoltzfus RJ, Tavengwa NV, et al. Independent and combined effects of improved water, sanitation, and hygiene, and improved complementary feeding, on child stunting and anaemia in rural Zimbabwe: a cluster-randomised trial. The Lancet Global Health. 2019;7: 132–147.
18. Patil SR, Arnold BF, Salvatore AL, Briceno B, Ganguly S, Jr JMC, et al. The Effect of India’s Total Sanitation Campaign on Defecation Behaviors and Child Health in Rural Madhya Pradesh: A Cluster Randomized Controlled Trial. PLOS Medicine. 2014;11: e1001709. doi:10.1371/journal.pmed.1001709
19. Bowen A, Agboatwalla M, Luby S, Tobery T, Ayers T, Hoekstra RM. Association between intensive handwashing promotion and child development in Karachi, Pakistan: a cluster randomized controlled trial. Archives of pediatrics & adolescent medicine. 2012;166: 1037–1044.
20. Briceño B, Coville A, Gertler P, Martinez S. Are there synergies from combining hygiene and sanitation promotion campaigns: Evidence from a large-scale cluster-randomized trial in rural Tanzania. PLOS ONE. 2017;12: e0186228. doi:10.1371/journal.pone.0186228
21. Cameron LA, Shah M, Olivia S. Impact evaluation of a large-scale rural sanitation project in Indonesia. World Bank policy research working paper. 2013 [cited 30 May 2024]. Available: https://www.jstor.org/stable/pdf/resrep26254.pdf
22. Clasen T, Boisson S, Routray P, Torondel B, Bell M, Cumming O, et al. Effectiveness of a rural sanitation programme on diarrhoea, soil-transmitted helminth infection, and child malnutrition in Odisha, India: a cluster-randomised trial. The Lancet Global Health. 2014;2: e645–e653. doi:10.1016/S2214-109X(14)70307-9
23. Du Preez M, Conroy RM, Ligondo S, Hennessy J, Elmore-Meegan M, Soita A, et al. Randomized Intervention Study of Solar Disinfection of Drinking Water in the Prevention of Dysentery in Kenyan Children Aged under 5 Years. Environ Sci Technol. 2011;45: 9315–9323. doi:10.1021/es2018835
24. Galiani SG, Orsola-Vidal P, Alexandra. Promoting Handwashing Behavior in Peru: The Effect of Large-Scale Mass-Media and Community Level Interventions. Policy Research Working Papers The World Bank. 2012. doi:10.1596/1813-9450-6257.
25. World Bank. WASH Poor in a Water-Rich Country: A Diagnostic of Water, Sanitation, Hygiene, and Poverty in the Democratic Republic of Congo. Washington, DC: World Bank; 2017.
26. Quattrochi JP, Coville A, Mvukiyehe E, Dohou CJ, Esu F, Cohen B, et al. Effects of a community-driven water, sanitation and hygiene intervention on water and sanitation infrastructure, access, behaviour, and governance: a cluster-randomised controlled trial in rural Democratic Republic of Congo. BMJ global health. 2021;6: e005030.
27. Programme National Ecole et Village Assaini (PNEVA). Atlas 2018 : Acces a L’eu potable, a l’hygeine et a l’assainaissement pour les commautes rurales et periurbaines de la republique democratique du Congo. 2018. Available: https://www.unicef.org/drcongo/media/2806/file/COD-Atlas2018.pdf
28. Gronewold AD, Sobsey MD, McMahan L. The compartment bag test (CBT) for enumerating fecal indicator bacteria: basis for design and interpretation of results. Science of the Total Environment. 2017;587: 102–107.
29. Gomila R. Logistic or linear? Estimating causal effects of experimental treatments on binary outcomes using regression analysis. J Exp Psychol Gen. 2021;150: 700–709. doi:10.1037/xge0000920
30. WHO. Guidelines for Drinking-water Quality. World Health Organization; 2004.
31. Pickering AJ, Null C, Winch PJ, Mangwadu G, Arnold BF, Prendergast AJ, et al. The WASH Benefits and SHINE trials: interpretation of WASH intervention effects on linear growth and diarrhoea. Lancet Glob Health. 2019;7: 1139–46.
32. Mertens A, Arnold BF, Benjamin-Chung J, Boehm AB, Brown J, Capone D, et al.
Effects of water, sanitation, and hygiene interventions on detection of enteropathogens and host-specific faecal markers in the environment: a systematic review and individual participant data meta-analysis. Lancet Planet Health. 2023;7: e197–e208. doi:10.1016/S2542-5196(23)00028-1
33. Bain R, Johnston R, Khan S, Hancioglu A, Slaymaker T. Monitoring Drinking Water Quality in Nationally Representative Household Surveys in Low- and Middle-Income Countries: Cross-Sectional Analysis of 27 Multiple Indicator Cluster Surveys 2014–2020. Environ Health Perspect. 2021;129: 097010. doi:10.1289/EHP8459
34. Cumming O, Arnold BF, Ban R, Clasen T, Esteves Mills J, Freeman MC, et al. The implications of three major new trials for the effect of water, sanitation and hygiene on childhood diarrhea and stunting: a consensus statement. BMC Medicine. 2019;17: 173. doi:10.1186/s12916-019-1410-x

Figures

Figure 1. Trial profile
403 villages identified by intervention implementer
Excluded: 34 villages in Kasai province, because they received the intervention prior to …
369 villages eligible, grouped into 124 clusters
Dropped: 30 villages (3 clusters) in North Kivu, due to …
339 villages in 121 clusters randomly assigned to intervention or control
Intervention group: 50 clusters randomly assigned
  2 villages randomly dropped from largest cluster in Kasai, due to intervention budget constraint
  1 village randomly dropped from largest cluster in Kongo Central, due to intervention budget constraint
  50 clusters, 144 villages
  Four midline panel HHs per village = 489
  Replacement HHs for midline HHs = 87
  Six new randomly selected HHs per village = 862 HHs*
  *In three villages, only five new HHs were interviewed
  Total = 1,438 HHs interviewed
Control group: 71 clusters randomly assigned
  3 villages randomly dropped from largest cluster in Kasai, due to intervention budget constraint
  1 village randomly dropped from largest cluster in Kongo Central, due to intervention budget constraint
  71 clusters, 185 villages
  Four midline panel HHs per village = 644
  Replacement HHs for midline HHs = 92
  Six new randomly selected HHs per village = 1,109 HHs*
  *In one village, only five new HHs were interviewed
  Total = 1,845 HHs interviewed
Additional villages not surveyed: 1 village in Kasai was not able to be reached; 1 village in South Kivu did not exist; 1 village in Kasai did not exist

Figure 2. Distribution of length-for-age Z-scores, intervention and control groups

Supporting Information

S1 Table. Randomization strata
Province | Stratum | Villages per cluster | Total villages in stratum | Intervention villages in stratum | Control villages in stratum | Total clusters in stratum | Intervention clusters in stratum | Control clusters in stratum
Kongo Central | 1 | 1-2 | 15 | 8 | 7 | 12 | 6 | 6
Kongo Central | 2 | 3-4 | 13 | 6 | 7 | 4 | 2 | 2
Kongo Central | 3 | 5,7 | 12 | 7 | 5 | 2 | 1 | 1
Kasai | 1 | 1-2 | 37 | 19 | 18 | 26 | 13 | 13
Kasai | 2 | 3-5 | 39 | 21 | 18 | 11 | 6 | 5
Kasai | 3 | 10,12 | 22 | 12 | 10 | 2 | 1 | 1
Kasai Central | 1 | 1-2 | 34 | 8 | 26 | 25 | 6 | 19
Kasai Central | 2 | 3-4 | 34 | 8 | 26 | 10 | 2 | 8
Kasai Central | 3 | 6-7 | 13 | 0 | 13 | 2 | 0 | 2
South Kivu | 1 | 1-2 | 20 | 9 | 11 | 13 | 6 | 7
South Kivu | 2 | 4-8,10 | 74 | 39 | 35 | 12 | 6 | 6
South Kivu | 3 | 12,14 | 26 | 12 | 14 | 2 | 1 | 1
Total | | | 339 | 149 | 190 | 121 | 50 | 71

S2 Table. Intervention effects on WASH institutions index and index sub-components
Outcomes | Control n | Control mean | Control SD | Intervention n | Intervention mean | Intervention SD | ITT | 95% CI lower bound | 95% CI upper bound
WASH institutions index | 185 | 0.00 | 1.00 | 144 | 0.46 | 0.75 | 0.40 | 0.16 | 0.65
Committee (y/n) | 185 | 0.70 | 0.46 | 144 | 0.97 | 0.16 | 0.21 | 0.10 | 0.32
Committee mtg freq* | 88 | 2.91 | 1.49 | 104 | 2.62 | 1.61 | -0.29 | -0.79 | 0.20
WASH expenditures (CDF per month, IHS) | 130 | 2.12 | 4.07 | 140 | 3.28 | 4.58 | 1.14 | 0.07 | 2.20
Track health (y/n) | 185 | 0.69 | 0.46 | 144 | 0.84 | 0.37 | 0.16 | 0.07 | 0.25
Track sanitation (y/n) | 185 | 0.74 | 0.44 | 144 | 0.78 | 0.41 | 0.05 | -0.06 | 0.15
ITT = intention-to-treat effect estimate. CDF = Congolese francs. IHS = inverse hyperbolic sine. Effects are estimated with models that include controls for randomization blocks based on province and number of villages per cluster. There were 121 clusters in total.
The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units. The index values range from -2.35 to 2.12.
*Committee meeting frequency is coded 1-6, where 1=Weekly, 2=Fortnightly, 3=Monthly, 4=Every 3 months, 5=Every 6 or more months, 6=No regular schedule, based on needs

S3 Table. Intervention effects on all secondary outcomes, including index sub-components
Outcomes | Control n | Control prevalence/mean | Control SD | Intervention n | Intervention prevalence/mean | Intervention SD | ITT | 95% CI lower | 95% CI upper
Improved water source | 1845 | 43% | | 1437 | 74% | | 0.24 | 0.12 | 0.36
Improved sanitation | 1843 | 14% | | 1437 | 34% | | 0.18 | 0.10 | 0.25
Water governance perception index | 1845 | 0.00 | 1.00 | 1438 | 0.22 | 0.88 | 0.19 | 0.04 | 0.34
Committee selected fairly | 600 | 1.76 | 0.81 | 1119 | 1.82 | 0.76 | 0.06 | -0.04 | 0.17
Committee treats community fairly | 621 | 4.15 | 1.31 | 1158 | 4.07 | 1.31 | -0.09 | -0.27 | 0.09
Committee manages money well | 635 | 3.25 | 1.12 | 721 | 3.31 | 1.14 | 0.05 | -0.13 | 0.23
Comm responds breakdown well | 972 | 3.69 | 1.18 | 1227 | 3.67 | 1.23 | 0.01 | -0.13 | 0.15
Confidence committee will solve reported issue | 1009 | 3.10 | 1.05 | 1227 | 3.12 | 1.03 | 0.02 | -0.13 | 0.17
Satisfaction with water access (1-5) | 1845 | 2.76 | 1.49 | 1438 | 3.38 | 1.43 | 0.56 | 0.27 | 0.85
Time spent collecting water (IHS) | 1845 | 4.17 | 2.07 | 1438 | 4.04 | 2.05 | 0.03 | -0.20 | 0.27
HHs pay for water (Dummy) | 1845 | 10% | | 1438 | 23% | | 0.09 | 0.00 | 0.19
Water use expenditure (weekly) | 137 | 153 | 217 | 303 | 254 | 351 | 114 | -16 | 243
Hygiene and behaviour index | 1845 | 0.00 | 1.00 | 1438 | 0.30 | 1.07 | 0.23 | 0.12 | 0.34
Handwashing score (0-10) | 1845 | 2.79 | 2.22 | 1438 | 3.41 | 2.16 | 0.43 | 0.19 | 0.67
Open defecation (%) | 1843 | 32% | | 1437 | 27% | | -0.03 | -0.08 | 0.01
Self-reported handwashing with soap/ash (%) | 1843 | 67% | | 1437 | 82% | | 0.09 | 0.06 | 0.13
Frequency of latrine cleaning over past 2 weeks | 761 | 4.89 | 4.26 | 648 | 5.00 | 4.06 | 0.25 | -0.32 | 0.82
Water in pot is clean and covered (%) | 1669 | 57% | | 1273 | 57% | | 0.02 | -0.03 | 0.07
Water treated for consumption, any method (%) | 1669 | 3% | | 1273 | 7% | | 0.03 | 0.01 | 0.05
Handwashing action | 616 | 15% | | 489 | 17% | | 0.00 | -0.04 | 0.03
Handwashing with soap/ash | 616 | 2% | | 489 | 4% | | 0.01 | -0.01 | 0.03
Water point has water | 380 | 97% | | 266 | 85% | | -0.11 | -0.18 | -0.05
WP FIB MPN/100mL | 366 | 79.31 | 37.92 | 221 | 72.41 | 42.56 | -9.46 | -18.85 | -0.06
WP FIB log10 MPN/100mL | 366 | 1.71 | 0.60 | 221 | 1.58 | 0.70 | -0.17 | -0.32 | -0.02
HH FIB MPN/100mL | 1098 | 82.27 | 35.05 | 851 | 79.44 | 37.71 | -3.87 | -9.59 | 1.86
HH FIB log10 MPN/100mL | 1098 | 1.76 | 0.53 | 851 | 1.72 | 0.58 | -0.06 | -0.15 | 0.04
Weight for age | 1223 | -1.17 | 1.16 | 919 | -1.14 | 1.16 | 0.06 | -0.04 | 0.15
Weight for length | 1224 | 0.14 | 1.07 | 918 | 0.22 | 1.03 | 0.10 | -0.02 | 0.22
Life satisfaction & self-esteem index | 1843 | 0.00 | 1.00 | 1435 | 0.03 | 0.98 | 0.01 | -0.12 | 0.14
Life satisfaction (WVS) | 1843 | 4.87 | 3.20 | 1435 | 5.04 | 2.89 | 0.20 | -0.10 | 0.49
Feel I am person of worth (Rosenberg) | 1843 | 2.25 | 0.90 | 1435 | 2.19 | 0.90 | -0.03 | -0.13 | 0.07
Feel that I have good qualities (Rosenberg) | 1843 | 2.42 | 0.74 | 1435 | 2.36 | 0.75 | -0.02 | -0.10 | 0.05
Inclined to feel I am a failure (Rosenberg) | 1843 | 1.43 | 1.11 | 1435 | 1.49 | 1.05 | -0.04 | -0.16 | 0.08
Able to do things as well as other people (Rosenberg) | 1843 | 2.44 | 0.75 | 1435 | 2.40 | 0.76 | 0.01 | -0.06 | 0.08
Feel have not much to be proud of (Rosenberg) | 1843 | 0.98 | 0.98 | 1435 | 0.99 | 0.92 | -0.05 | -0.14 | 0.04
Take a positive attitude towards self (Rosenberg) | 1843 | 2.35 | 0.78 | 1435 | 2.37 | 0.74 | 0.08 | -0.01 | 0.17
I am satisfied with myself (Rosenberg) | 1843 | 2.03 | 1.02 | 1435 | 2.12 | 0.96 | 0.16 | 0.07 | 0.26
Wish could have more respect for myself (Rosenberg) | 1843 | 0.42 | 0.60 | 1435 | 0.40 | 0.57 | -0.07 | -0.13 | 0.00
Certainly feel useless at times (Rosenberg) | 1843 | 1.33 | 1.10 | 1435 | 1.40 | 1.07 | -0.03 | -0.16 | 0.10
At times think I am no good at all (Rosenberg) | 1843 | 1.32 | 1.09 | 1435 | 1.43 | 1.07 | 0.00 | -0.12 | 0.12
Psychological well-being index | 1843 | 0.00 | 1.00 | 1435 | 0.05 | 0.99 | 0.06 | -0.06 | 0.17
Felt cheerful last 2 weeks (WHO) | 1843 | 2.59 | 1.72 | 1435 | 2.64 | 1.64 | 0.05 | -0.12 | 0.22
Felt calm & relaxed last 2 weeks (WHO) | 1843 | 2.62 | 1.68 | 1435 | 2.68 | 1.59 | 0.06 | -0.11 | 0.22
Felt active & vigorous last 2 weeks (WHO) | 1843 | 2.57 | 1.64 | 1435 | 2.61 | 1.54 | 0.04 | -0.13 | 0.21
Woke up fresh & rested last 2 weeks (WHO) | 1843 | 2.58 | 1.62 | 1435 | 2.56 | 1.53 | 0.02 | -0.14 | 0.18
Daily life filled with things that interest last 2 weeks (WHO) | 1843 | 2.25 | 1.76 | 1435 | 2.30 | 1.70 | 0.12 | -0.06 | 0.29
Felt unable to control important things last month (Cohen) | 1843 | 2.95 | 1.26 | 1435 | 2.92 | 1.12 | -0.06 | -0.17 | 0.05
Felt confident about ability to handle personal problems last month (Cohen) | 1843 | 3.09 | 1.27 | 1435 | 3.05 | 1.17 | 0.02 | -0.08 | 0.13
Felt confident things were going your way last month (Cohen) | 1843 | 2.81 | 1.33 | 1435 | 2.89 | 1.25 | 0.10 | -0.02 | 0.23
Felt difficulties were piling up could not overcome them last month (Cohen) | 1843 | 2.39 | 1.28 | 1435 | 2.56 | 1.20 | 0.10 | -0.03 | 0.23
School attendance (days past week) | 299 | 3.21 | 2.72 | 226 | 2.74 | 2.72 | -0.38 | -0.82 | 0.07
ITT = intention-to-treat effect estimate; HH = household; WP = water point; FIB = fecal indicator bacteria; MPN = most probable number; IHS = inverse hyperbolic sine; WHO = World Health Organization; WVS = World Values Survey. Effects are estimated with models that include controls for randomization blocks based on province and number of villages per cluster. There were 121 clusters in total. The water governance perception index and the hygiene and behaviour index were calculated by rescaling each variable in the index (e.g., satisfaction with water access) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units.

S4 Table.
Intervention effects on all primary outcomes, by province (pre-specified)
Outcomes | Province | Control n | Control prevalence/mean | Control SD | Intervention n | Intervention prevalence/mean | Intervention SD | ITT | 95% CI lower bound | 95% CI upper bound
Diarrhea prevalence | Kongo Central | 146 | 17% | | 163 | 10% | | -0.07 | -0.15 | 0.00
Diarrhea prevalence | Kasai | 599 | 40% | | 679 | 40% | | 0.00 | -0.09 | 0.09
Diarrhea prevalence | Kasai Central | 781 | 49% | | 196 | 59% | | 0.10 | 0.00 | 0.19
Diarrhea prevalence | South Kivu | 784 | 41% | | 724 | 37% | | -0.05 | -0.10 | 0.01
Length-for-age Z-score | Kongo Central | 96 | -1.59 | 1.32 | 93 | -1.75 | 1.67 | -0.18 | -0.47 | 0.11
Length-for-age Z-score | Kasai | 319 | -2.05 | 1.60 | 328 | -1.86 | 1.60 | 0.20 | -0.05 | 0.44
Length-for-age Z-score | Kasai Central | 409 | -2.12 | 1.71 | 103 | -2.58 | 1.49 | -0.42 | -0.80 | -0.04
Length-for-age Z-score | South Kivu | 399 | -2.49 | 1.48 | 395 | -2.50 | 1.50 | 0.02 | -0.16 | 0.19
WASH institutions index | Kongo Central | 18 | 0.27 | 0.66 | 20 | 0.59 | 0.57 | 0.31 | -0.06 | 0.68
WASH institutions index | Kasai | 43 | -0.48 | 1.10 | 48 | 0.21 | 0.75 | 0.70 | 0.21 | 1.19
WASH institutions index | Kasai Central | 65 | -0.18 | 0.85 | 16 | 0.28 | 0.61 | 0.35 | 0.03 | 0.68
WASH institutions index | South Kivu | 59 | 0.46 | 0.96 | 60 | 0.67 | 0.78 | 0.23 | -0.25 | 0.71
ITT = intention-to-treat effect estimate. Effects are estimated with models that include controls for randomization blocks based on province and number of villages per cluster. There were 121 clusters in total. The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units.

S5 Table.
Intervention effects on all primary outcomes, province-by-intervention interaction models
Variable | (1) WASH Institutions Index - Simple | (2) WASH Institutions Index - Interaction | (3) Diarrhea - Simple | (4) Diarrhea - Interaction | (5) Height for Age - Simple | (6) Height for Age - Interaction
Intervention | 0.404*** (0.125) | 0.308* (0.167) | -0.010 (0.022) | -0.070* (0.036) | -0.014 (0.070) | -0.172 (0.134)
Kasai Province # cluster_strata=2 | -0.644** (0.251) | -0.024 (0.256) | 0.267*** (0.050) | 0.003 (0.044) | -0.085 (0.200) | 0.090 (0.149)
Kasai Province # cluster_strata=3 | -0.432* (0.221) | 0.196 (0.315) | 0.232*** (0.078) | -0.031 (0.077) | 0.222 (0.172) | 0.396*** (0.091)
Kasai Central Province # cluster_strata=2 | -0.263 (0.201) | 0.185 (0.199) | 0.372*** (0.059) | -0.024 (0.055) | -0.347 (0.261) | -0.005 (0.222)
Kasai Central Province # cluster_strata=3 | -0.866** (0.410) | -0.430 (0.411) | 0.335*** (0.063) | -0.036 (0.059) | -0.143 (0.168) | 0.106 (0.140)
Sud-Kivu Province # cluster_strata=2 | -0.064 (0.231) | 0.042 (0.237) | 0.273*** (0.041) | 0.035 (0.032) | -0.729*** (0.161) | 0.096 (0.141)
Sud-Kivu Province # cluster_strata=3 | 0.670** (0.262) | 0.763*** (0.223) | 0.268*** (0.042) | 0.030 (0.028) | -0.240 (0.164) | 0.586*** (0.142)
Kasai Province | | -0.835*** (0.314) | | 0.226*** (0.054) | | -0.359* (0.212)
Kasai Central Province | | -0.488* (0.272) | | 0.339*** (0.050) | | -0.323 (0.223)
Sud-Kivu Province | | -0.065 (0.293) | | 0.224*** (0.050) | | -0.912*** (0.227)
Intervention # Kasai Province | | 0.391 (0.293) | | 0.069 (0.058) | | 0.365** (0.182)
Intervention # Kasai Central Province | | 0.044 (0.230) | | 0.168*** (0.059) | | -0.242 (0.226)
Intervention # Sud-Kivu Province | | -0.076 (0.286) | | 0.022 (0.044) | | 0.184 (0.158)
Child's sex | | | 0.010 (0.014) | 0.010 (0.014) | 0.134** (0.062) | 0.135** (0.062)
Child's age (years) | | | -0.043*** (0.004) | -0.043*** (0.004) | -0.233*** (0.033) | -0.234*** (0.033)
Constant | 0.272 (0.168) | 0.324 (0.214) | 0.235*** (0.039) | 0.268*** (0.044) | -1.323*** (0.182) | -1.248*** (0.204)
Observations | 329 | 329 | 4072 | 4072 | 2142 | 2142
Wald_F | | 0.790 | | 3.169 | | 3.022
Wald_p | | 0.502 | | 0.027 | | 0.032
Coefficients and standard errors (in parentheses) from linear models of the primary outcomes on intervention group, fixed effects for randomization strata, fixed effects for province, and intervention-by-province interaction terms. Standard errors are clustered by cluster (group of villages). There were 121 clusters in total. The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units. The Wald F is a test statistic for the null hypothesis that all of the interaction term coefficients are zero. * p<.1, ** p<.05, *** p<.01

S6 Table. Intervention effects on diarrhea and length-for-age z score, separately by sex (pre-specified)
Outcomes | Sex | Control n | Control prevalence/mean | Control SD | Intervention n | Intervention prevalence/mean | Intervention SD | ITT | 95% CI lower bound | 95% CI upper bound
Diarrhea | Female | 1151 | 43% | | 880 | 38% | | -0.01 | -0.06 | 0.04
Diarrhea | Male | 1159 | 41% | | 882 | 38% | | 0.00 | -0.06 | 0.05
Length for age | Female | 605 | -2.08 | 1.59 | 475 | -2.19 | 1.64 | -0.07 | -0.24 | 0.10
Length for age | Male | 618 | -2.28 | 1.61 | 444 | -2.22 | 1.53 | 0.06 | -0.14 | 0.26
ITT = intention-to-treat effect estimate. Effects are estimated with models that include controls for randomization blocks based on province and number of villages per cluster. There were 121 clusters in total. The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units.

S7 Table.
Intervention effects on diarrhea and length-for-age z score, sex-by-intervention interaction models

| | (1) Diarrhea - Simple | (2) Diarrhea - Interaction | (3) Height for Age - Simple | (4) Height for Age - Interaction |
|---|---|---|---|---|
| Treatment | -0.010 (0.022) | -0.004 (0.027) | -0.014 (0.070) | 0.071 (0.098) |
| Child's sex | 0.010 (0.014) | 0.014 (0.018) | 0.134** (0.062) | 0.206** (0.084) |
| Child's age (years) | -0.043*** (0.004) | -0.043*** (0.004) | -0.233*** (0.033) | -0.233*** (0.033) |
| Kasai Province # cluster_strata=2 | 0.267*** (0.050) | 0.267*** (0.050) | -0.085 (0.200) | -0.089 (0.199) |
| Kasai Province # cluster_strata=3 | 0.232*** (0.078) | 0.232*** (0.078) | 0.222 (0.172) | 0.213 (0.174) |
| Kasai Central Province # cluster_strata=2 | 0.372*** (0.059) | 0.372*** (0.058) | -0.347 (0.261) | -0.351 (0.260) |
| Kasai Central Province # cluster_strata=3 | 0.335*** (0.063) | 0.335*** (0.063) | -0.143 (0.168) | -0.150 (0.167) |
| Sud-Kivu Province # cluster_strata=2 | 0.273*** (0.041) | 0.273*** (0.041) | -0.729*** (0.161) | -0.737*** (0.159) |
| Sud-Kivu Province # cluster_strata=3 | 0.268*** (0.042) | 0.268*** (0.042) | -0.240 (0.164) | -0.242 (0.163) |
| Treatment # Female | | -0.011 (0.028) | | -0.168 (0.122) |
| Constant | 0.235*** (0.039) | 0.233*** (0.040) | -1.323*** (0.182) | -1.354*** (0.185) |
| Observations | 4072 | 4072 | 2142 | 2142 |
| Wald_F | | 0.159 | | 1.883 |
| Wald_p | | 0.691 | | 0.173 |

Coefficients and standard errors (in parentheses) from linear models of the primary outcomes on intervention group, fixed effects for randomization strata, fixed effects for child sex, and intervention-by-sex interaction terms. Standard errors are clustered by cluster (group of villages). There were 121 clusters in total. The WASH institutions index was calculated by rescaling each variable in the index (e.g., presence of WASH committee) so that higher values imply better outcomes, then standardizing relative to the control group, following Kling et al. Effects are in standard deviation units. The Wald F is a test statistic for the null hypothesis that all of the interaction term coefficients are zero. * p<.1, ** p<.05, *** p<.01

S8 Table.
Variable definitions

Descriptions of primary and secondary outcomes, including index subcomponents. Each entry gives the category and outcome, followed by the definition.

Primary Outcome: Diarrhea
Prevalence of diarrhea in the last 7 days among children under 5 (caregiver reported): 1. Yes to 'diarrhea', or 2. Yes to 'three or more stools' AND 'watery or soft stool', or 3. Yes to blood in the stool.

Primary Outcome: Length for age
Length-for-age z score for children under 5. The sample does not include observations with implausible z-scores.

Primary Outcome: WASH institutions index
The index is built from: (1) Presence of water committee; (2) Frequency of meeting; (3) Average amount spent per month for water activities excluding maintenance (inverse hyperbolic sine); (4) Tracks health conditions; (5) Tracks hygiene and sanitation.

Subcomponent: Presence of water committee
1 if the village has a water committee; 0 otherwise.

Subcomponent: Frequency of meeting
Frequency of committee meeting. The variable ranges from 1 to 6: 1 Weekly; 2 Fortnightly; 3 Monthly; 4 Every 3 months; 5 Every 6 or more months; 6 No regular schedule, based on needs.

Subcomponent: Average amount spent per month for water activities excluding maintenance
Average amount spent per month for water activities excluding maintenance. An inverse hyperbolic sine transformation has been applied to the variable.
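As a rough illustration of how an index like the WASH institutions index could be assembled from subcomponents of this kind, the sketch below applies an inverse hyperbolic sine transformation, reverses the meeting-frequency code so that higher values are better, standardizes each component against the control group, and averages the results. The village data, the function names, the `7 - code` reversal, and the equally weighted average are all assumptions made for illustration; the source states only that variables were rescaled so higher is better and standardized relative to the control group, following Kling et al.

```python
import math
from statistics import mean, stdev


def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined at zero."""
    return math.asinh(x)


def zscore_vs_control(values, is_control):
    """Standardize all values using the control group's mean and SD."""
    control = [v for v, c in zip(values, is_control) if c]
    mu, sd = mean(control), stdev(control)
    return [(v - mu) / sd for v in values]


def summary_index(components, is_control):
    """Equally weighted mean of control-standardized components (assumed)."""
    z_cols = [zscore_vs_control(col, is_control) for col in components]
    return [mean(row) for row in zip(*z_cols)]


# Hypothetical villages: three control, two intervention.
is_control = [True, True, True, False, False]
has_committee = [0, 1, 0, 1, 1]        # 1 if a water committee exists
meeting_code = [6, 3, 6, 1, 3]         # 1 = weekly ... 6 = no regular schedule
monthly_spend = [0, 500, 0, 2000, 1000]

components = [
    has_committee,
    [7 - m for m in meeting_code],     # assumed reversal: higher = more frequent
    [ihs(s) for s in monthly_spend],   # IHS keeps zero-spend villages
]
index = summary_index(components, is_control)
print([round(v, 2) for v in index])
```

By construction the control villages average zero on the index, so a value for an intervention village reads as a distance from the control mean in control-group standard deviation units.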

Subcomponent: Tracks health conditions
1 if village keeps track of community health conditions; 0 otherwise.

Subcomponent: Tracks hygiene and sanitation
1 if village keeps track of WASH practices; 0 otherwise.

Secondary Outcome: Water governance perception index
The water governance perception index is constructed from the following variables: fairness of selection of water governance entity; perception of fair treatment; confidence in the committee's management of money; confidence in committee response to breakdowns; confidence in committee management; satisfaction with water access.

Subcomponent: Committee selected fairly
The extent to which the process of choosing committee members was fair. The answer choices are: Not fair at all; Somewhat fair; Fair; Fully fair.

Subcomponent: Committee treats community fairly
The extent to which the water committee treats you fairly. The answer choices are: Very unfair; Somewhat unfair; Neither fair nor unfair; Somewhat fair; Very fair.

Subcomponent: Committee manages money well
How well the water committee manages money. The answer choices are: Very badly; Badly; Neither bad nor good; Good; Very good.

Subcomponent: Committee responds to breakdowns well
How well the water committee responds to breakdowns. The answer choices are: Very badly; Badly; Neither bad nor good; Good; Very good.

Subcomponent: Confidence committee will solve reported issue
Level of confidence that the water committee will solve an issue the respondent brings up. The answer choices are: Not confident at all; Not very confident; Somewhat confident; Very confident.

Subcomponent: Satisfaction with water access (1-5)
Level of satisfaction of the respondent with their access to water. The answer choices are: Not satisfied at all; Not satisfied; Indifferent; Satisfied; Very satisfied.

Secondary Outcome: Improved water source
1 if the household's primary source of drinking water is an improved source (JMP definition) and 0 otherwise.
● Improved main drinking source includes Piped into dwelling, Piped into plot, Piped/public tap, Tube well or borehole, Protected dug well, Protected spring, Rainwater, Tanker truck.
● Unimproved main drinking source includes Unprotected dug well, Unprotected spring, Surface water, No other main source.

Secondary Outcome: Time spent collecting water (IHS)
Each household was asked to list the household members who participated in water collection on the previous day, and the number of trips undertaken by the household. For each trip, the collector was identified and the time spent for the round trip was collected. For each household, we sum the time spent on each trip in minutes to get the total time spent collecting water per household. To estimate treatment effects, an inverse hyperbolic sine transformation was applied due to the large number of zeroes.

Secondary Outcome: HHs pay for water (Dummy)
1 if the household pays for water; 0 otherwise.

Secondary Outcome: Water use expenditure (weekly)
Estimated household's weekly expenditure for water use (in CDF, Congolese Franc).

Secondary Outcome: Improved sanitation
Household uses an improved latrine (JMP definition).
● Improved sanitation includes options such as Flush / Pour Flush: to Piped Sewer System, Flush / Pour Flush: to Septic Tank, Flush / Pour Flush: to Pit Latrine, Ventilated Improved Pit Latrine (VIP), Pit Latrine with Slab, Composting Latrine.
● Unimproved sanitation includes Flush / Pour Flush: to Elsewhere, Flush / Pour Flush: to Don't Know Where, Pit Latrine without Slab / Open Pit, Bucket Latrine, Hanging Toilet, No Facilities (Bush, Open Field, River), Other (specify).

Secondary Outcome: Hygiene and behaviour index
Self-reports of: Knowledge: Caregiver knows how and when to wash hands; what causes diarrhea.
Sanitation practices: cleanliness of household area and latrine (presence of flies and fecal matter); open defecation; observed indicators of toilet use (worn pathway, presence of water); improvements to latrine; disposes of child feces safely (JMP definition). Water storage practices: has a clean pot for water that is covered.

Subcomponent: Handwashing score (0-10)
Self-report of: Knowledge: Caregiver knows how and when to wash hands. For each critical juncture, the respondent scores 1 if they mentioned unprompted that they wash hands with soap/ash at that juncture (0 if they say no or mention it only when prompted). The counts are then added up to create a score. The junctures considered are: After toilet; after washing baby's bottom/changing; after eating; before preparing food; before eating; before feeding/breastfeeding baby; before or after handling children; after taking care of pets or farm animals; after coughing/sneezing; after coming back from the fields.

Subcomponent: Open defecation (%)
1 if no defecation facility is used and 0 otherwise.

Subcomponent: Self-reported handwashing with soap/ash (%)
1 if the respondent washed their hands with soap/ash at least once since the previous day; 0 otherwise.

Subcomponent: Frequency of latrine cleaning over past 2 weeks
Number of times the latrine has been cleaned in the past 2 weeks.

Subcomponent: Water pot is clean and covered (%)
1 if the pot has clean water and is covered; 0 otherwise. This indicator is measured via direct observation.

Subcomponent: Water treated for consumption, any method (%)
1 if drinking water stored in the household is treated with any product/method for safe consumption; 0 otherwise.

Secondary Outcome: Life satisfaction & self-esteem index
Summary index of 11 questions. The 11 questions include (1) the life satisfaction question as defined by the World Values Survey and (2) the 10 questions as defined in Rosenberg's Self-Esteem Scale.

Subcomponent: Life satisfaction (WVS)
All things considered, on a scale of 1 to 10, how satisfied are you with your life as a whole? 1 means completely dissatisfied; 10 means completely satisfied.

Subcomponent: Feel I am person of worth (Rosenberg)
I feel I am a person of worth, at least on an equal plane with others. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Feel that I have good qualities (Rosenberg)
I feel that I have a number of good qualities. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Inclined to feel I am a failure (Rosenberg)
All in all, I am inclined to feel that I am a failure. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Able to do things as well as other people (Rosenberg)
I am able to do things as well as most other people. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Feel have not much to be proud of (Rosenberg)
I feel I do not have much to be proud of. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Take a positive attitude towards self (Rosenberg)
I take a positive attitude toward myself. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: I am satisfied with myself (Rosenberg)
On the whole, I am satisfied with myself. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Wish could have more respect for myself (Rosenberg)
I wish I could have more respect for myself. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: Certainly feel useless at times (Rosenberg)
I certainly feel useless at times. Tell me to what extent you: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about you.

Subcomponent: At times think I am no good at all (Rosenberg)
At times I think I am no good at all. To what extent do respondents: Strongly agree, Agree, Disagree, or Strongly disagree with this statement about them.

Secondary Outcome: Psychological well-being index
Summary index of 9 questions on well-being in the last 2 weeks and stress in the last 4 weeks. The questions belong to two different sets: the first is the 5 WHO questions on well-being, and the second is the 4 questions of the Cohen stress scale. Each of the five WHO statements on well-being refers to how respondents have been feeling over the last two weeks. The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time. Questions in the Cohen stress scale ask the respondent about their feelings and thoughts during the last month. The answer choices are the following: Never; Almost never; Sometimes; Fairly often; Very often.

Subcomponent: Felt cheerful last 2 weeks (WHO)
Over the last two weeks, I have felt cheerful and in good spirits. The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time.

Subcomponent: Felt calm & relaxed last 2 weeks (WHO)
Over the last two weeks, I have felt calm and relaxed. The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time.

Subcomponent: Felt active & vigorous last 2 weeks (WHO)
Over the last two weeks, I have felt active and vigorous.
The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time.

Subcomponent: Woke up fresh & rested last 2 weeks (WHO)
Over the last two weeks, I woke up feeling fresh and rested. The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time.

Subcomponent: Daily life filled with things that interest last 2 weeks (WHO)
Over the last two weeks, my daily life has been filled with things that interest me. The answer choices are the following: At no time; Some of the time; Less than half of the time; More than half of the time; Most of the time; All of the time.

Subcomponent: Felt unable to control important things last month (Cohen)
In the last month, how often have you felt that you were unable to control the important things in your life? The answer choices are the following: Never; Almost never; Sometimes; Fairly often; Very often.

Subcomponent: Felt confident about ability to handle personal problems last month (Cohen)
In the last month, how often have you felt confident about your ability to handle your personal problems? The answer choices are the following: Never; Almost never; Sometimes; Fairly often; Very often.

Subcomponent: Felt confident things were going your way last month (Cohen)
In the last month, how often have you felt confident that things were going your way? The answer choices are the following: Never; Almost never; Sometimes; Fairly often; Very often.

Subcomponent: Felt difficulties were piling up could not overcome them last month (Cohen)
In the last month, how often have you felt difficulties were piling up so high that you could not overcome them?
The answer choices are the following: Never; Almost never; Sometimes; Fairly often; Very often.

Secondary Outcome: Handwashing action
Share of adult household members who were observed washing their hands at any juncture, measured via structured observations. The junctures considered are: Before obtaining water from a wide-mouthed storage container; Before cutting or preparing food; Before serving food; Before eating; Before feeding child under 5; Before breastfeeding child; After defecation; After toileting; After cleaning child post-toileting.

Secondary Outcome: Handwashing with soap/ash
Share of adult household members who were observed washing their hands with soap/ash at any juncture, measured via structured observations. The junctures considered are: Before obtaining water from a wide-mouthed storage container; Before cutting or preparing food; Before serving food; Before eating; Before feeding child under 5; Before breastfeeding child; After defecation; After toileting; After cleaning child post-toileting.

Secondary Outcome: School attendance
Number of days a child aged 6 to 18 years attended school in the past week, based on responses to "How many days has this child attended school in the past week?" Children who were not enrolled in school were coded as zero.

Secondary Outcome: Water point has water
1 if the water point provides water; 0 otherwise. (Water points that were locked during the survey are not considered.)

Secondary Outcome: WP Coliforms MPN/100mL
Water quality at point of collection (village water source) is the Most Probable Number (MPN) in 100 mL as defined for Aquagenx CBT EC+TC MPN water quality testing kits and follows WHO standards. For nondetects, we substitute half the lower detection limit.

Secondary Outcome: HH Coliforms MPN/100mL
Water quality at point of use (drinking water stored in the household) is the Most Probable Number (MPN) in 100 mL as defined for Aquagenx CBT EC+TC MPN water quality testing kits and follows WHO standards.
For nondetects, we substitute half the lower detection limit.

Secondary Outcome: Weight for age
Weight-for-age z score for children under 5. The sample does not include observations with implausible z-scores.

Secondary Outcome: Weight for length
Weight-for-length z score for children under 5. The sample does not include observations with implausible z-scores.

S1 Text. CONSORT 2010 checklist of information to include when reporting a randomized trial*

| Section/Topic | Item No | Checklist item | Reported in Section/Paragraph |
|---|---|---|---|
| Title and abstract | 1a | Identification as a randomized trial in the title | Title |
| | 1b | Structured summary of trial design, methods, results, and conclusions (for specific guidance see CONSORT for abstracts) | Abstract |
| Introduction | | | |
| Background and objectives | 2a | Scientific background and explanation of rationale | Introduction |
| | 2b | Specific objectives or hypotheses | Last paragraph, Introduction |
| Methods | | | |
| Trial design | 3a | Description of trial design (such as parallel, factorial) including allocation ratio | Study design |
| | 3b | Important changes to methods after trial commencement (such as eligibility criteria), with reasons | NA |
| Participants | 4a | Eligibility criteria for participants | Study design, 9th paragraph |
| | 4b | Settings and locations where the data were collected | Study design, 5th paragraph |
| Interventions | 5 | The interventions for each group with sufficient details to allow replication, including how and when they were actually administered | Procedures |
| Outcomes | 6a | Completely defined pre-specified primary and secondary outcome measures, including how and when they were assessed | Outcomes |
| | 6b | Any changes to trial outcomes after the trial commenced, with reasons | NA |
| Sample size | 7a | How sample size was determined | Statistical analysis |
| | 7b | When applicable, explanation of any interim analyses and stopping guidelines | NA |
| Randomization: | | | |
| Sequence generation | 8a | Method used to generate the random allocation sequence | Randomization and masking |
| | 8b | Type of randomization; details of any restriction (such as blocking and block size) | Randomization and masking |
| Allocation concealment mechanism | 9 | Mechanism used to implement the random allocation sequence (such as sequentially numbered containers), describing any steps taken to conceal the sequence until interventions were assigned | Randomization and masking |
| Implementation | 10 | Who generated the random allocation sequence, who enrolled participants, and who assigned participants to interventions | Randomization and masking |
| Blinding | 11a | If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how | NA |
| | 11b | If relevant, description of the similarity of interventions | NA |
| Statistical methods | 12a | Statistical methods used to compare groups for primary and secondary outcomes | Statistical analysis |
| | 12b | Methods for additional analyses, such as subgroup analyses and adjusted analyses | Statistical analysis |
| Results | | | |
| Participant flow (a diagram is strongly recommended) | 13a | For each group, the numbers of participants who were randomly assigned, received intended treatment, and were analyzed for the primary outcome | Study design |
| | 13b | For each group, losses and exclusions after randomization, together with reasons | Study design |
| Recruitment | 14a | Dates defining the periods of recruitment and follow-up | Results |
| | 14b | Why the trial ended or was stopped | NA |
| Baseline data | 15 | A table showing baseline demographic and clinical characteristics for each group | Table 2 |
| Numbers analysed | 16 | For each group, number of participants (denominator) included in each analysis and whether the analysis was by original assigned groups | Tables 3-4 |
| Outcomes and estimation | 17a | For each primary and secondary outcome, results for each group, and the estimated effect size and its precision (such as 95% confidence interval) | Results, Tables 3-4 |
| | 17b | For binary outcomes, presentation of both absolute and relative effect sizes is recommended | Tables 3-4 |
| Ancillary analyses | 18 | Results of any other analyses performed, including subgroup analyses and adjusted analyses, distinguishing pre-specified from exploratory | Results; Supp Info |
| Harms | 19 | All important harms or unintended effects in each group (for specific guidance see CONSORT for harms) | NA |
| Discussion | | | |
| Limitations | 20 | Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses | Discussion, 2nd to last paragraph |
| Generalizability | 21 | Generalizability (external validity, applicability) of the trial findings | Discussion |
| Interpretation | 22 | Interpretation consistent with results, balancing benefits and harms, and considering other relevant evidence | Discussion |
| Other information | | | |
| Registration | 23 | Registration number and name of trial registry | Study design |
| Protocol | 24 | Where the full trial protocol can be accessed, if available | Study design |
| Funding | 25 | Sources of funding and other support (such as supply of drugs), role of funders | Abstract |

*We strongly recommend reading this statement in conjunction with the CONSORT 2010 Explanation and Elaboration for important clarifications on all the items. If relevant, we also recommend reading CONSORT extensions for cluster randomized trials, non-inferiority and equivalence trials, non-pharmacological treatments, herbal interventions, and pragmatic trials. Additional extensions are forthcoming: for those and for up-to-date references relevant to this checklist, see www.consort-statement.org.