Policy Research Working Paper 10457

Struggling with the Rain: Weather Variability and Food Insecurity Forecasting in Mauritania

Paul Blanchard, Oscar Ishizawa, Thibaut Humbert, Rafael Van Der Borght

Urban, Disaster Risk Management, Resilience and Land Global Practice
May 2023

Abstract

Weather-related shocks and climate variability contribute to hampering progress toward poverty reduction in Sub-Saharan Africa. Droughts have a direct impact on weather-dependent livelihood means and the potential to affect key dimensions of households’ welfare, including food consumption. Yet, the ability to forecast food insecurity for intervention planning remains limited and current approaches mainly rely on qualitative methods. This paper incorporates microeconomic estimates of the effect of the rainy season quality on food consumption into a catastrophe risk modeling approach to develop a novel framework for early forecasting of food insecurity at sub-national levels. The model relies on three usual components of catastrophe risk models that are adapted for estimation of the impact of rainy season quality on food insecurity: natural hazards, households’ vulnerability and exposure. The paper applies this framework in the context of rural Mauritania and optimizes the model calibration with a machine learning procedure. The model can produce fairly accurate lean season food insecurity predictions very early on in the agricultural season (October-November), that is six to eight months ahead of the lean season. Comparisons of model predictions with survey-based estimates yield a mean absolute error of 1.2 percentage points at the national level and a high degree of correlation at the regional level (0.84).

This paper is a product of the Urban, Disaster Risk Management, Resilience and Land Global Practice.
It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at oishizawa@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Struggling with the Rain: Weather Variability and Food Insecurity Forecasting in Mauritania∗

Paul Blanchard (The World Bank), Thibaut Humbert (The World Bank), Oscar Ishizawa (The World Bank), Rafael Van der Borght (The World Bank)

Keywords: food security, drought, early warning system, adaptive social protection, climate vulnerability, probabilistic risk modeling
JEL classifications: Q54, O55, I31

∗ We are grateful to the Government of Mauritania and the World Food Programme in Mauritania for sharing the Food Security Monitoring Survey data. We thank Javier Baez, Nicola Fontana, Ruth Hill, Edmundo Murrugarra, Juan Carlos Parra, Emmanuel Skoufias, Eric Strobl, as well as participants to Trinity College Dublin PhD seminar for useful comments. We also thank Dieynaba Diallo, Matthieu Lefebvre, Franck Muller, Mira Saidi and Samantha Sarria for their comments and invaluable support.
This work has been financed by the Sahel Adaptive Social Protection Program (SASPP), which is funded by a multi-donor trust fund (MDTF) with contributions from the German Federal Ministry for Economic Cooperation and Development (BMZ); Agence Française de Développement (AFD); Denmark Royal Ministry of Foreign Affairs; and the United Kingdom Foreign, Commonwealth, and Development Office (FCDO), and by the Sub-Saharan Africa Disaster Risk Analytics Program, which is funded through the Global Facility for Disaster Reduction and Recovery (GFDRR). The author(s) may be contacted at pblanchard@worldbank.org or paul.baptiste.blanchard@gmail.com.

1 Introduction

Despite a slow decline in the poverty rate over the last decades, the number of poor continues to rise in Sub-Saharan Africa. Although the poverty rate has declined from 56% in 1990 to 40% in 2018, this progress has not been sufficient to keep up with population growth: it was estimated that 433 million Africans lived in extreme poverty in 2018, against 284 million in 1990.[1] Considering a higher poverty line (i.e. US$ 5.50 a day) reveals an even more fragile situation, with a poverty rate remaining almost constant and a doubling of the number of poor during the same period. These trends indicate that a large share of the population remains highly vulnerable to poverty and that the modest progress observed in terms of extreme poverty reduction could easily be reversed in the face of adverse shocks.
Looking forward, climate change is expected to further exacerbate weather-related shocks and climate variability, which could worsen the current situation, especially in rural areas that concentrate 82% of the people living in extreme poverty in Sub-Saharan Africa and where livelihood means are heavily reliant on weather conditions (Beegle and Christiaensen, 2019).

There is already ample evidence on the impact of weather shocks on households’ consumption in Africa, not only in terms of short-term fluctuations but also for long-term consumption growth due to the lack of insurance and the adoption of adverse coping strategies (Baez et al., 2020; Dercon, 2004). Some studies have specifically focused on food consumption and found that rainfall variability had a significant impact (Demeke et al., 2011; Huho and Mugalavai, 2010; Lewis, 2017), identifying climate as a clear threat to the achievement of the second Sustainable Development Goal (SDG) of zero hunger. Most of those studies employ panel methods to quantify the impacts of specific events on poverty outcomes. However, to the best of our knowledge, no study has further exploited these results with the objective of effectively predicting the welfare impacts of weather shocks at sub-national levels. This is mainly due to data limitations for investigating a highly complex and heterogeneous relationship between environmental conditions and welfare. Moreover, it is now well-established that both idiosyncratic and covariate shocks, including weather shocks, induce some degree of inter-temporal variability in households’ welfare and measuring poverty at one point in time can be misleading; the non-poor today may be the poor of tomorrow (Hill and Porter, 2017; Skoufias et al., 2021).
As a result, another strand of literature has moved the focus away from event-specific impacts to grasp a broader understanding of welfare volatility by developing methods for quantifying the risk of being poor, usually referred to as “vulnerability to poverty”[2] (see Gallardo (2018) for a review). Attempts to measure vulnerability to poverty in African contexts have highlighted the existence of sizeable welfare fluctuations and fractions of households exposed to poverty risk typically larger than realized poverty rates as measured in traditional cross-sectional poverty surveys (Dercon and Krishnan, 2000; Christiaensen and Subbarao, 2005; Demissie and Kasie, 2017; Hill and Porter, 2017; Skoufias et al., 2021).

[1] Source: World Bank, World Development Indicators and World Bank (2020).
[2] In this paper, we revise the concept of vulnerability and provide a definition that is consistent with the notion of vulnerability used in catastrophe risk models. We define vulnerability at the household-level as the relationship between rainy season conditions and food consumption in the following lean season.

In this paper, we develop a new framework inspired by catastrophe risk modeling methods (henceforth referred to as cat risk models or cat models) with the broad objective of improving our understanding of the nature, magnitude and distribution of impacts of weather shocks on household welfare. The primary aim of the proposed methodology is to produce reliable food insecurity forecasts that can support the design and targeting of annual interventions. We focus more specifically on food consumption for at least two reasons. First, ensuring food security in Sub-Saharan Africa remains a prime challenge and food consumption is still an essential welfare dimension considered for the design of anti-poverty programs in the region.
Second, poverty surveys traditionally exploited to examine the impact of climate shocks on household welfare often have a comprehensive consumption module but are only carried out every 5 to 10 years. On the other hand, food security surveys primarily focus on food consumption and coping strategies but are generally conducted at least once a year. They therefore provide consumption realizations from a larger set of rainy season conditions scenarios. From an econometric perspective, this simply means that we can rely on relatively more identifying variation to estimate the relation between the rainy season quality (RSQ) and food consumption.

We focus first on the estimation of a reduced form that quantifies the impact of the RSQ on food consumption. We investigate the heterogeneity of the RSQ impacts considering both geographic and household characteristics, and our findings do reveal marked differences across locations (e.g. regions, livelihood zones) and according to key household features, such as income sources or livestock ownership. We determine an optimized heterogeneity structure via a machine learning procedure that allows us to form distinct groups of households for which food consumption in the lean season reacts to rainy season conditions in comparable ways: those are called “household typologies”. An RSQ-food consumption relationship, called a “vulnerability function”, is calibrated for each household typology. For any given rainy season conditions scenario, RSQ-induced changes in food consumption relative to a baseline are inferred by evaluating vulnerability functions at RSQ values corresponding to that particular scenario. An exposure component encapsulating the spatial distribution of households and baseline food consumption levels for each household typology then allows us to produce food insecurity predictions at sub-national levels. Finally, we incorporate a probabilistic dimension to the hazard component (i.e. 
the rainy season quality) to build a food insecurity risk profile that makes it possible to assess the volatility of food consumption caused by variability in the RSQ.

The paper first contributes to the climate economy literature by providing new evidence of the distributional impacts of weather conditions on welfare in Africa. However, the most important contribution is methodological with the introduction of a novel framework that incorporates microeconomic estimates into a cat model structure in order to produce food insecurity forecasts that can be used to support the design and targeting of annual interventions and other social protection programs. The methodology exploits the lagged effect of rainy season conditions on food consumption observed during the hunger season to produce sub-national estimates of food insecurity up to 8 months ahead of the hunger season. This represents a clear improvement over the food insecurity prediction techniques currently used in Mauritania and other countries in the region, which are mainly based on qualitative approaches. The resulting forecasts can feed into early warning systems, with promising prospects to support intervention planning and inform the design and targeting mechanisms of adaptive social protection programs. Our model not only provides spatially disaggregated predictions of food insecurity for specific scenarios, but also allows us to produce a probabilistic assessment of the welfare risk induced by weather variability.

We take our model to the data in the context of rural Mauritania. Mauritania is a West African country located in the Sahelian and Saharan zones that is greatly exposed to climate hazards. In particular, the high inter-annual rainfall variability leads to the frequent occurrence of drought conditions that have devastating effects on households’ well-being.
In 2011, marked rainfall deficits led to crop failures, water shortages, lack of pasture and livestock losses, which in turn triggered a major food crisis during the 2012 lean season with an estimated one million people (approximately 27% of total population) in a situation of food insecurity.[3] Mauritania has only one rainy season running from June to October and the main harvest takes place between September and November. Following the main harvest, food stocks gradually decline until the lean season (May-August), which is typically characterized by lower food reserves, low demand for agricultural labor and rising food insecurity. The magnitude of these seasonal patterns is largely determined by the quality of the rainy season, and drought conditions therefore exacerbate lean season food insecurity peaks. The lagged effect of the rainy season quality on food consumption is a feature of interest because it implies some degree of predictability of the forthcoming impact on households as soon as the RSQ is observable, which can be exploited to support early action.

[3] Source: authors’ calculations based on the 2012 Food Security Monitoring Survey.

When tested against historical data, the prediction model produces national food insecurity estimates with a mean absolute error of 1.2 p.p., with historical rates ranging between 29% and 42% over the period 2011-2015. At the regional level, model predictions yield a mean absolute error of 4.9 p.p. and a correlation coefficient of 0.84. More importantly, the methodology reveals that inter-year variations in rural food consumption are large and that they are largely driven by climate variability.

The remainder of the paper is organized as follows. Section 2 provides definitions of our outcome of interest and the rainy season quality.
Section 3 explains our methodology with an overview of the food insecurity prediction model based on a cat model framework, a description of the approach used to construct RSQ indices, and the empirical strategy for the estimation of vulnerability functions. Results are presented in section 4 and section 5 concludes.

2 Defining and measuring food security and rainy season quality

2.1 Food security

Food security is defined as a situation in which “all people, at all times, have physical and economic access to sufficient, safe and nutritious food that meets their dietary needs and food preferences for an active and healthy life.”[4] The concept of food security at the household-level rests on four pillars: (i) the (physical) availability of food that depends on local food production, food stocks and net trade, (ii) the economic and physical access to food, (iii) the utilization of food that reflects the nutritional status of individuals through feeding practices, food preparation, diet diversity and intra-household distribution of food and (iv) stability that requires that the first three components be stable over time.

In this paper, we measure food security at the household-level with the Food Consumption Score (FCS). It is calculated as a weighted sum of reported consumption frequencies of various food groups over a period of seven days. The 8 food groups considered are assigned weights according to their nutritional importance. The highest weights are placed on food groups providing proteins such as meat (4), dairy (4) and pulses (3); the complete list of food groups and their associated weights is provided in Table A.1 in the appendix. Consumption frequencies for each food group can vary from 0 to 7 days and FCS values therefore range from 0 to 112. Standard threshold values of 28.5 and 42 are used to define situations of severe and moderate food insecurity respectively (a lower score indicating a poorer diet), although these food insecurity lines may be adjusted to local contexts.
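The weighted sum just described can be illustrated with a minimal Python sketch. The weights for meat, dairy and pulses are those quoted above; the remaining weights are the standard World Food Programme values and are an assumption here, since the paper's Table A.1 is not reproduced in this section. The group labels, the function names and the example household are hypothetical, and the food insecurity line is passed explicitly rather than hard-coded.

```python
# Food group weights. Meat (4), dairy (4) and pulses (3) are quoted in the text;
# the other values are the standard WFP weights (assumption: Table A.1 matches them).
# Group labels are illustrative, not taken from the survey questionnaire.
FCS_WEIGHTS = {
    "staples": 2.0,
    "pulses": 3.0,
    "vegetables": 1.0,
    "fruits": 1.0,
    "meat_fish": 4.0,
    "dairy": 4.0,
    "sugar": 0.5,
    "oil": 0.5,
}


def food_consumption_score(days_consumed):
    """Weighted sum of 7-day consumption frequencies (0 to 7 days per group)."""
    score = 0.0
    for group, weight in FCS_WEIGHTS.items():
        days = days_consumed.get(group, 0)
        if not 0 <= days <= 7:
            raise ValueError(f"{group}: frequency must be between 0 and 7 days")
        score += weight * days
    return score


def is_below_line(fcs, line):
    """True when the score falls strictly below the chosen food insecurity line."""
    return fcs < line


# Hypothetical household: staples, sugar and oil daily, little protein.
household = {"staples": 7, "pulses": 2, "vegetables": 3, "dairy": 1, "sugar": 7, "oil": 7}
fcs = food_consumption_score(household)
below_42 = is_below_line(fcs, line=42)
```

With eight groups whose weights sum to 16 and a maximum frequency of 7 days, the score is bounded by 16 × 7 = 112, which matches the range stated in the text.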
Higher FCS values correspond to more diversified and protein-rich diets while low values may reflect unbalanced diets with low protein intakes. The FCS essentially has the ability to capture differences in food security status across households and disparities in food insecurity rates across space. It also allows for consistent and meaningful comparisons over time, in particular because it does not require food prices in its calculation. However, the FCS does not capture the quality of food or quantities consumed per day (or meal), nor does it account for intra-household distribution of food. With these limitations in mind, we adopt the FCS as a measure of food consumption and later show that it proves sufficient to reveal the effect of the quality of the rainy season on food consumption.

[4] Source: World Food Programme/FAO.

In Mauritania, Food Security Monitoring Surveys (FSMS) are conducted twice a year and provide first-hand information on the food situation for intervention planning purposes, including FCS-based food insecurity rates at sub-national levels. We focus mainly on the first round that is carried out during lean seasons when food consumption is usually at the lowest in the annual cycle.[5] In our study case, we use a pooled sample of 10,969 household observations from five FSMS rounds collected during lean seasons between 2011 and 2015.[6] Each household observation has consumption frequencies in the food groups required for the calculation of the FCS as well as key household characteristics. Over the five lean seasons, we observe large variations in the moderate food insecurity rates in rural areas, from 29% in 2013 to 42% in 2012 (see Figure 1).
Note that the worst two years (2012 and 2015) followed the poorest rainy seasons (2011 and 2014) observed in the period in terms of aggregate rainfall.[7] Moreover, our data clearly reveal seasonal patterns between lean and post-harvest seasons with an average food insecurity rate of 34% for the former against 22% for the latter.

[5] Another round is conducted during the post-harvest period (December-January).
[6] There was no survey conducted in 2016 and, from 2017 onwards, data were collected late in the rainy season, which poses comparability issues with the interaction of the preceding and ongoing rainy season quality in food consumption outcomes.
[7] We calculate a rural population-weighted average of department-level 5-month precipitation z-scores and find values of -0.7 and -0.3 in 2011 and 2014, against 2.1, 1.7 and 0.2 in 2010, 2012 and 2013 respectively.

Figure 1: Rural food insecurity rates in lean and post-harvest seasons, 2011-2015.
Source: Food Security Monitoring Surveys. Note: The blue and brown dotted lines represent the average food insecurity rate in post-harvest and lean seasons respectively.

2.2 The rainy season quality

Our study focuses on the southern half of the country where the climate qualifies as semi-arid with an average of 171 mm of precipitation received per year.[8] It is worth noting that the area considered accounts for over 97% of the total rural population in Mauritania. There is only one rainy season in Mauritania that runs from June to October with most rainfall typically occurring in August-September and almost no precipitation in the rest of the year (see box plot in Figure 2a). We observe only marginal variations in the onset of the rainy season across departments. However, there are marked spatial disparities in average annual rainfall received with values ranging from 54 mm in the department of Tichitt to 416 mm in Selibaby, with significant implications for agro-pastoral activities that can be carried out in each location.
The map provided in Figure 2b illustrates the variation in annual precipitations at the communal-level for the entire country, with values ranging from 16 mm to 496 mm.

[8] Our studied area is described in the map of Figure A.1 in appendix A. We exclude the northern part of the country that is mostly desert and where rainfall variability is therefore irrelevant.

Figure 2: Average monthly precipitations in the studied area. (a) Average monthly precipitations (b) Annual precipitations by commune
Note: Averages are calculated for the period 1981-2021 based on CHIRPS v2.0 precipitation data.

Agricultural production fundamentally depends on resources in water, light energy and nutrients available for crop growth, although requirement levels vary across both crop types and growth stages. In particular, soil water plays a central role in the plant development cycle and is largely driven by weather parameters that include precipitations, temperature and wind. To some extent, the same applies to livestock production where weather conditions are a key determinant for the production of fodder and the availability of water and pasture at watering points and grazing areas. In this paper, we broadly define the Rainy Season Quality (RSQ) at local levels for any given year as the extent to which observed weather conditions depart from long-term historical conditions, indicating the degree of suitability for agricultural and pastoral activities with respect to water resources. Due to data limitations, we do not consider plant water requirements in absolute terms to measure the RSQ as this would require household-level information on agricultural practices that FSMS do not provide. Instead, we assume that households adapt farming and breeding practices to local climate conditions and experience year-to-year variability in agricultural and livestock productivity conditional on observed deviations of weather conditions from the long-term mean.
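As an illustration of what a departure from long-term historical conditions can look like numerically, the following Python sketch computes a standardized anomaly (z-score) of one season's rainfall against a historical record for the same location. It is a sketch under stated assumptions rather than the paper's index construction, which is detailed in section 3.2: the function name, the use of the population standard deviation, and the rainfall figures are all hypothetical.

```python
from statistics import mean, pstdev


def standardized_anomaly(history, value):
    """Departure of `value` from the long-term mean, in standard deviations.

    `history` holds the same quantity (e.g. seasonal rainfall in mm) for the
    reference years of one location. Using the population standard deviation
    is an assumption of this sketch.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        raise ValueError("no variability in the historical record")
    return (value - mu) / sigma


# Made-up seasonal rainfall totals (mm) for one department.
rainfall_history = [100.0, 200.0, 100.0, 200.0]

normal_year = standardized_anomaly(rainfall_history, 150.0)  # rainfall at the mean
dry_year = standardized_anomaly(rainfall_history, 75.0)      # rainfall below the mean
```

Because the anomaly is expressed relative to each location's own history, a value of -1.5 means the same thing in a dry department as in a wetter one, which is what makes such measures comparable across areas with different rainfall regimes.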
We propose two distinct categories of RSQ measures. The first one considers precipitations as the main weather parameter for soil water availability and therefore calculates the RSQ as a rainfall anomaly. Although water requirements are location-specific and depend on local temperatures, land-cover prevalence or irrigation and agricultural practices, precipitations provide a consistent proxy for water resources available for agro-pastoral activities. The second type of RSQ measure abstracts from weather conditions and rather focuses on their direct observable outcomes in the form of vegetation state and dry matter productivity, which can both be measured via remote sensing. The former is a measure of vegetation greenness and a viable alternative to precipitation indices for capturing agricultural production dynamics, while the latter is directly related to vegetation nutrient content required for livestock and is often used as a local measure of animal carrying capacity. In this case, the RSQ is also calculated in relative terms as a deviation of vegetation greenness or dry matter productivity from the local long-term mean. This multi-index approach has been extensively used in the literature, where drought as a natural hazard is found to be best characterized by multiple climatological and hydrological parameters (Beguería et al., 2014; Mishra and Singh, 2010). We provide more details on the data and methods effectively used to produce the RSQ indices that we pair with household data in section 3.2.

3 Methodology

3.1 Catastrophe risk models: From physical assets to human welfare

Catastrophe risk models are traditionally used to estimate probabilistic loss distributions for physical assets exposed to a given natural hazard. They essentially combine three components.
The hazard component is comprised of a set of (a large number of) synthetic scenarios of natural events that together provide a probabilistic representation of the possible events for the area considered. For instance, the hazard component of a windstorm risk model is generally made up of synthetic windstorm events associated with an annual rate of occurrence and a spatial footprint in the form of a gridded representation of the maximum sustained wind speed. The exposure component describes the spatial distribution of the asset stock considered. In our example, this could be a map giving the replacement value of residential buildings for each grid-cell. Then, the vulnerability module maps hazard intensity onto expected damages. Building characteristics imply disparities in the response to physical constraints so that building typologies are formed based on, for instance, building materials and building codes, and vulnerability functions are calibrated for each individual building group. For a given hazard scenario, the loss value at each grid-cell is calculated by evaluating vulnerability functions at the corresponding hazard intensity value. The inferred expected damages are summed across building typologies to obtain a total loss value, and the procedure is repeated for all cells and all synthetic events in the stochastic set. Risk metrics such as losses for different return periods can be calculated from the set of values obtained, at both national and sub-national scales.

We adapt this analytical framework to the modeling of the impact of the quality of the rainy season on households’ food consumption. We first focus on the development of a food insecurity forecasting model in which the methodology described above is applied to historical hazard scenarios. Each hazard scenario is represented by a spatial footprint of the RSQ in the form of RSQ index values at the department-level. A subtle though fundamental difference with other types of hazards (e.g. 
windstorms, earthquakes, floods) is worth noting here, which has important implications in the model. Hazard intensity measures are typically lower-bounded with a minimum value defining the boundary between a situation of an event occurring and that of a “no-event” scenario. For instance, earthquakes are measured based on the Peak Ground Acceleration (PGA) and a strictly positive PGA effectively reflects the occurrence of an earthquake event. This means that cat risk models implicitly calculate losses with respect to a “no-event” scenario in which losses are null. By contrast, the RSQ is measured with variables that encompass a range of hazard conditions going from extremely bad (i.e. severe drought) to exceptionally good rainy seasons, and for which a “no-event” scenario cannot be defined in an equivalent way. Instead, for a given RSQ scenario, we calculate changes in the level of food consumption relative to a baseline scenario that we conveniently define as a scenario in which the RSQ index of interest is equal to 0, henceforth called “normal conditions”.[9] The level of food consumption in normal conditions is called the “baseline food consumption”.

Then, following the cat model definition above, the vulnerability module is comprised of vulnerability functions representing the relationship between hazard intensity (i.e. an RSQ index value) and a change in the level of household food consumption relative to the baseline food consumption. Just as houses made of wood exhibit higher windstorm damage ratios compared to houses made of bricks, some households will suffer higher food consumption losses than others following a bad rainy season, and conversely for a good rainy season. Data limitations obviously prevent the calibration of vulnerability functions for individual households and, following our analogy with cat risk models, we form typologies of households that exhibit comparable food consumption responses to the quality of rainy seasons.
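To make the baseline-relative logic concrete, here is a minimal Python sketch of typology-specific vulnerability functions and of how an RSQ-induced change could be applied to a baseline distribution of food consumption scores. Everything specific in it is an assumption made for illustration: the linear functional form, the two typology names and their slopes, the small baseline sample and the food insecurity line of 42 are hypothetical and do not come from the paper's estimates.

```python
# Hypothetical slopes: change in FCS points per unit of RSQ index, by typology.
# A linear form through the origin is an assumption; it only guarantees that
# the change is zero in "normal conditions" (RSQ index equal to 0).
VULNERABILITY_SLOPES = {
    "typology_a": 4.0,  # more sensitive households
    "typology_b": 1.0,  # less sensitive households
}


def delta_fcs(typology, rsq_index):
    """Change in food consumption relative to the baseline for one typology."""
    return VULNERABILITY_SLOPES[typology] * rsq_index


def food_insecurity_rate(baseline_fcs, delta, line):
    """Share of households below the line once the baseline is shifted by delta."""
    shifted = [score + delta for score in baseline_fcs]
    return sum(score < line for score in shifted) / len(shifted)


# Made-up baseline scores for the households of one modeling unit.
baseline = [30, 40, 45, 50, 60]

rate_normal = food_insecurity_rate(baseline, delta_fcs("typology_a", 0.0), line=42)
rate_a = food_insecurity_rate(baseline, delta_fcs("typology_a", -2.0), line=42)
rate_b = food_insecurity_rate(baseline, delta_fcs("typology_b", -2.0), line=42)
```

In this toy setup, the same poor rainy season (index of -2) moves the more sensitive typology from 40% to 60% of households below the line, while the less sensitive typology stays at 40%, which is the kind of heterogeneity that household typologies are meant to capture.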
The optimal design of household typologies based on both location and household-level factors is a central outcome of this paper and an issue we explore in depth in section 4.[10] In our case, the elements at risk are not physical assets but correspond to households’ food consumption levels (although this could of course be generalized to other welfare dimensions) and the exposure component is comprised of two sub-components. The first one describes the spatial distribution of households across modeling units[11] and can be understood as the level of exposure at the extensive margin. Because the exposure to the risk of food insecurity also depends on households’ distance to the food insecurity line, the second sub-component provides the probability distribution of the baseline food consumption for each modeling unit and reflects the level of exposure at the intensive margin.

For a given hazard scenario, our model combines the hazard, exposure, and vulnerability components as follows. For each modeling unit (i.e. department-typology pair), the vulnerability function is evaluated at the RSQ index value to estimate the change in food consumption relative to the baseline, which is then applied to the baseline food consumption distribution in order to infer the food consumption distribution for the hazard scenario considered. A positive (respectively negative) change in food consumption induces a translation to the right (respectively left) of the distribution of the baseline food consumption. Food insecurity being defined as food consumption below the food insecurity line, the distribution obtained allows us to estimate the fraction of food insecure households, which is finally applied to the modeling unit household count to estimate the absolute number of food insecure households. A graphic illustration of the estimation procedure is provided in Figure 3 where we consider a scenario in which the RSQ index is equal to -2. The evaluation of the vulnerability function at -2 yields a change in the FCS, ∆FCS, of -8 (Figure 3a). This food consumption loss implies a leftward shift of the baseline food consumption distribution (blue line in Figure 3b) and the intersection of the shifted distribution (blue dotted line in Figure 3b) with the food insecurity line (red line) gives the estimated food insecurity rate (49%). This procedure is repeated for every modeling unit and values are aggregated across typologies to get food insecurity estimates at the department-level.

Figure 3: Graphic illustration of the food insecurity estimation procedure. (a) Vulnerability function (b) Food insecurity estimation.

In section 4.3, we simply extend the hazard component to a full stochastic set of RSQ scenarios to build a comprehensive food insecurity risk model.

[9] As we later explain in section 3.2, when RSQ indices are SPI-like anomalies of an underlying weather/environmental variable, a value of 0 corresponds to a situation where the underlying variable is equal to its historical median. For instance, an SPI of 0 corresponds to precipitations being equal to their historical median. When anomalies are calculated as z-scores, a value of 0 reflects a scenario in which the underlying variable is equal to its historical mean. Because historical distributions of precipitations/vegetation/dry matter productivity are usually concentrated around their mean/median, we naturally label as “normal” conditions corresponding to the mean or median value.
[10] As highlighted in section 1, the concept of household vulnerability is usually understood as the “vulnerability to poverty”, which is the current probability for a household of being poor in the future. Note that in this context and consistent with the cat model definition, we specifically define the vulnerability of a household’s food consumption to the quality of the rainy season as the relationship between an RSQ index and the subsequent change in the level of food consumption in hunger season relative to the baseline food consumption, i.e. the vulnerability function. In the language of cat risk modeling that we adopt in this paper, a household’s vulnerability to poverty is simply the risk of poverty caused by one or several types of hazard.
[11] Modeling units are the largest set of households within which hazard conditions and vulnerability parameters are considered homogeneous. In our food insecurity model, we define hazard conditions at the department-level and household typologies are based on the region of residence and household-level characteristics so that modeling units correspond to department-typology pairs.

3.2 Hazard component: Construction of RSQ indices

The RSQ has been broadly defined as the degree of suitability for agricultural and livestock activities offered by observed weather conditions during any given rainy season. In this section, we explain more specifically how we build relevant quantitative measures that effectively reflect the RSQ experienced by households, bearing in mind the final objective of pairing those with FSMS household data to estimate the RSQ impact on food consumption. This immediately raises the question of the adequate spatial extent over which conditions should be considered to characterize the quality of a rainy season at the household-level. In a rural context dominated by agricultural activities, local conditions in the near vicinity of the household location (e.g. buffered areas around household locations, varying from a few hundred meters to a few kilometers) are most likely relevant, assuming households possess (or rent) and cultivate plots close to their home.
Measures at wider scales can also be informative to the extent they capture conditions at the level of local markets, which in turn are a plausible determinant of the availability of and access to food. Compared to highly local measures, they can also provide an indication of the demand for agricultural labour within reasonable travel distances from home. Finally, rural activities such as nomadic pastoralism rely on conditions at larger scales since herders may travel long distances away from home to access grazing areas and dedicated water points. In this paper, we do not consider household-specific local measures due to missing information on household location[12] and we rather focus on RSQ measures based on administrative boundaries.

To allow for comparability across areas with different rainfall regimes, we build RSQ indices as normalized deviations of underlying weather or environmental variables (such as precipitations, vegetation greenness and dry matter productivity levels) with respect to their historical distributions, which are referred to as “anomalies”. For an underlying variable $x$, we denote the value of $x$ in year $t$ for a spatial extent $a$ and a time window $\Delta$ as $x(a, \Delta, t)$. For instance, $x$ can be precipitations, $a$ a department and $\Delta$ the July-August two-month time window, so that $x(a, \Delta, t)$ represents cumulated rains over the July-August period in year $t$ for department $a$. We employ two standard methods for calculating the anomaly associated with $x(a, \Delta, t)$. The first is a simple normalization that measures a departure from the mean in standard deviations and is called a z-score, which we denote $rsq^{z}_{a,t}(x, \Delta)$. Using our notation, this can be written as:

$$rsq^{z}_{a,t}(x, \Delta) = \frac{x(a, \Delta, t) - \mu_x(a, \Delta)}{\sigma_x(a, \Delta)} \tag{1}$$

where $\mu_x(a, \Delta)$ and $\sigma_x(a, \Delta)$ are the long-term mean and standard deviation of $x(a, \Delta)$ respectively. Another way of computing anomalies is based on the methodology developed by McKee et al. (1993) for the calculation of the Standardized Precipitation Index (SPI). They use a normalization procedure that transforms historical distributions into normal distributions so that the resulting SPI series have a mean of 0 and a standard deviation of 1.[13] We apply this method to precipitation values to build SPI series but also to our vegetation and dry matter productivity variables to calculate what we call “spi-like” anomalies, which we denote $rsq^{s}_{a,t}(x, \Delta)$. For both types of anomalies, positive values reflect wetter than usual conditions while negative values indicate drier than usual conditions. However, z-scores are linear in the underlying variable $x(a, \Delta, t)$ by construction and are therefore easier to interpret than spi-like anomalies,[14] and for that reason, we mostly prefer z-scores in our study of the impacts of the RSQ on food consumption in section 4. In broad terms, z-scores measure hazard intensity as the distance to the mean in absolute terms, whereas spi-like anomalies directly relate the intensity of a hazard to its probability of occurrence.

[12] FSMS are geo-referenced only from 2013 onwards.

[13] More formally, a distribution $F_X$ is fitted to a long-term time series $(x(a, \Delta, t))_{t \in \{1, \dots, T\}}$ and transformed to a normal distribution $F_N$ via a method of percentile matching, so that the corresponding RSQ index value $rsq^{s}_{a,t}(x, \Delta)$ for any given value $x(a, \Delta, t)$ is given by:

$$rsq^{s}_{a,t}(x, \Delta) = F_N^{-1}\big(F_X(x(a, \Delta, t))\big) \tag{2}$$

[14] For instance, a one unit increase in a precipitation z-score is interpreted as a one standard deviation increase in precipitations in absolute terms and irrespective of the starting point. On the other hand, a one unit increase in SPI is interpreted in probabilistic terms with respect to the normal distribution values. Precipitation values yielding SPI values of 0, 1 and 2 correspond to probabilities of 0.5, 0.84 and 0.98 of observing rainfall amounts less than or equal to those values respectively, so that increases in the SPI from 0 to 1 and from 1 to 2 do not result in equivalent changes in probability terms.

Since there is little consensus around an optimal underlying variable for describing the RSQ, we test a range of usual inputs including precipitations, the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI) and Dry Matter Productivity (DMP). We acknowledge that other variables typically used for monitoring droughts and agricultural production could be employed and leave it to future work to test other indices such as the Standardized Precipitation Evapotranspiration Index (SPEI), the Palmer Index, other vegetation indices, temperature-based indices or more elaborate direct measures of agricultural production and availability of pasture. That said, our results suggest that RSQ measurement choices matter less than the choice of heterogeneity structure (i.e. vulnerability factors) for optimizing the explanatory power and performance of the model.

One final consideration for the calculation of RSQ indices concerns the temporal dimension, namely the choice of time window $\Delta$. First of all, in any given year $t$, we account for differences in the onset of the rainy season that may exist across administrative units. For any geographic unit $a$, we calculate monthly historical averages for our underlying variable $x$ and, for any time window width of $m$ months, $\Delta$ is defined such that it covers the $m$ consecutive months where $x$ is historically highest. Second of all, shorter time scales imply that conditions at the peak of the rainy season are the prime determinant of future food consumption whereas larger time scales consider that aggregate measures over the entire rainy season are more relevant to appraise agricultural campaigns.
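As an illustration of the index construction described in this section, the following minimal sketch computes a z-score (equation 1), an spi-like anomaly (equation 2) and the peak time window. It is not the authors' implementation. The function names and the rainfall values are hypothetical, the empirical distribution stands in for the fitted distribution $F_X$ (McKee et al. (1993) fit a parametric distribution instead), the population standard deviation is used, and the time window is assumed not to wrap around the calendar year.

```python
from statistics import NormalDist, mean, pstdev


def peak_window(monthly_means, m):
    """Start index (0 = January) of the m consecutive months with the
    highest historical total. Assumption: no wrap-around across years."""
    starts = range(0, len(monthly_means) - m + 1)
    return max(starts, key=lambda s: sum(monthly_means[s:s + m]))


def z_score(history, value):
    """Equation (1): departure from the long-term mean in standard deviations."""
    return (value - mean(history)) / pstdev(history)


def spi_like(history, value):
    """Equation (2) with an empirical CDF standing in for F_X (assumption)."""
    n = len(history)
    rank = sum(1 for x in history if x <= value)
    # Plotting position rank / (n + 1), clamped so that p stays inside (0, 1).
    p = min(max(rank, 1), n) / (n + 1)
    return NormalDist().inv_cdf(p)


# Hypothetical monthly rainfall averages (mm) for one department, Jan to Dec.
monthly = [0, 0, 1, 3, 8, 25, 70, 95, 45, 10, 2, 0]
# Hypothetical July-August totals (mm) over ten years for the same department.
rain = [100, 120, 80, 150, 90, 110, 130, 70, 140, 110]

start_2m = peak_window(monthly, 2)   # index of the first month of the window
z_dry = z_score(rain, 70)            # negative: drier than the mean
s_dry = spi_like(rain, 70)           # negative: low historical percentile
```

Unlike the z-score, the spi-like value depends only on the percentile of the observation in its historical distribution.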
We do not favour one assumption over the other and simply test different time scales ranging from one to five months.[15]

Gridded precipitations are retrieved from the CHIRPS-2.0 product of the Climate Hazards Group. Monthly precipitations at a resolution of 0.05° are available for the period 1981 to present. We produce monthly grids of NDVI and EVI at a 500m resolution from NASA’s MODIS Terra satellite products that cover the period 2000 to present. Similarly, we create monthly 1km grids of DMP using data from the Global Land Service of Copernicus (PROBA-V satellite mission) for the period 1999 to 2019. We aggregate monthly gridded values at administrative levels, from regions (level 1) to communes (level 3), and we calculate RSQ indices at various temporal scales.

We provide a correlogram of department-level RSQ indices in Figure 4 to appreciate differences between index types.[16] While we observe high pairwise correlations between vegetation- and DMP-based indices, the relation between these and the precipitation index appears to be much weaker. This highlights the possibility that different RSQ indices might complement each other, with some being more appropriate than others in specific contexts. In the same vein, we check the correlation between indices based on the same underlying variable but with time scales varying from 1 to 5 months. We find nearly perfect correlations between indices involving time windows greater than 2 months but somewhat weaker correlations between 1-month indices and other time windows, and to some extent between 2-month indices and other time scales (see Figures B.2 to B.4 in appendix B). Conditions at the peak of the rainy season are thus not necessarily representative of the entire rainy season quality and we test several time scales to uncover which is more relevant to explain food consumption in lean season. Finally, our study period (2011-2015) comprises a succession of poor to good rainy seasons (see maps in Figure B.5 in appendix B) and we can rely on a reasonable amount of variation in the RSQ to estimate its impact on food consumption in our sample. For instance, department-level 4-month SPI values range from -1.52 to 3.21.

[15] The maximum time scale of five months corresponds to the typical duration of the rainy season in Mauritania.

[16] We also provide a similar correlogram in Figure B.1 considering a shorter time scale for our four indices (1 month) in appendix B and make comparable observations.

Figure 4: Correlogram of department-level RSQ indices for the period 2000-2019.

3.3 Vulnerability component

As aforementioned, we adopt in this paper a definition of vulnerability at the household-level that is consistent with the concept of vulnerability used in cat risk modeling. More specifically, we define the vulnerability of a household’s food consumption with respect to the Rainy Season Quality (RSQ) as the relationship between a measure of the RSQ and a measure of food consumption in the following lean season, which we call a vulnerability function.[17] In order to elucidate the notion of vulnerability used in our model, we provide in Figure 5 a simple structural model of the channels through which rainy season conditions may affect a household’s food consumption. Although minimal, this model provides a conceptual basis for the existence of explicit vulnerability factors that mediate the relationship between RSQ and food consumption. The model assumes that rainy season conditions have a direct impact on agro-pastoral production, which in turn affects both food prices and income. Prices and income together determine real income, which affects access to food. On the other hand, the level of agro-pastoral production has an impact on the availability of food, either on markets or in households’ stocks intended for self-consumption.
The impacts on access to food and availability of food govern the final effect on food consumption. We then identify four groups of vulnerability factors that correspond to household and market characteristics affecting the shape of the four main structural relations of the model:

- Agricultural vulnerability: Agricultural vulnerability comprises factors that determine the shape of the relationship between rainy season conditions and agro-pastoral production, such as the types of crops cultivated, seeds used (e.g. ordinary versus drought-resistant) or the level of access to irrigation infrastructure.
- Income vulnerability: Income vulnerability reflects the degree to which changes in the level of agro-pastoral production translate into income gains or losses. For instance, households generating larger fractions of their income from agro-pastoral activities will have a total income that is, all else equal, more sensitive to losses in that sector. Agricultural insurance can reduce the volatility of agricultural income so that the share of non-insured agricultural production can also be seen as a vulnerability factor. In general, the availability of alternative revenue sources that compensate for agricultural losses (e.g. aid or (temporary) migration) decreases income vulnerability.
- Price stability: Several factors may influence the sensitivity of food prices to agro-pastoral production. Markets usually supplied with local production will be more sensitive to agricultural shocks as consumers will potentially face higher prices of imported food products. On the other hand, the implementation of national stocks or subsidy programs can increase price stability in a context of highly volatile production levels in the primary sector (Dorosh, 2009; World Bank, 2012).
- Consumption smoothing capacity: Finally, conditional on levels of food availability and access at a certain point in time, households still have the ability to adjust intertemporal consumption choices to curb the impact on current food consumption. In short, access to credit and savings markets allows households to smooth consumption over time and reduce welfare losses by redistributing food consumption losses over multiple time periods.[18]

Figure 5: From rainy season conditions to household food consumption: a schematic view.

[17] Note that, more generally, this definition can be extended to the vulnerability of any dimension of welfare (e.g. total consumption, nutrition outcome...) to any given type of hazard (e.g. floods, windstorms, locust invasion...).

[18] Of course, this implicitly assumes that household preferences can be represented by a concave utility function. It is also consistent with preferences that include a minimum subsistence consumption level, which would be, for instance, the food insecurity line.

The proposed structural model provides a conceptual basis to elucidate the notion of vulnerability in our approach. In practice, we simply estimate a reduced form and investigate the heterogeneity in the RSQ-food consumption relation with respect to household or market characteristics that are plausible vulnerability factors.

Our empirical strategy therefore exploits the exogenous spatial and temporal variation in the RSQ over five rainy seasons (2010-2014) to estimate its effect on food consumption at the household-level. We use repeated cross-sectional data compiled from our five FSMS in lean season that provide household-level information on food consumption along with key socio-economic characteristics.
We thus propose the following general reduced form that relates the FCS to an RSQ index:

$$y_{i,d,r,t} = \beta_0 + \beta_1\, rsq_{d,t} + \beta_2\, rsq_{d,t}\, V_{i,d,r,t} + \beta_3 X_{i,d,r,t} + \gamma_r + \delta_t + \varepsilon_{i,d,r,t} \tag{3}$$

$y_{i,d,r,t}$ is the FCS in lean season of year $t$ for household $i$, living in department $d$ and region $r$. $rsq_{d,t}$ is an RSQ index measuring the quality of the previous rainy season in department $d$.[19] $X_{i,d,r,t}$ is a vector of controls and $V_{i,d,r,t}$ is a broadly defined vector of vulnerability factors such that the marginal effect of the RSQ on household $i$’s food consumption is given by $\beta_1 + \beta_2 V_{i,d,r,t}$. Of course, the proposed model assumes a linear relationship between food consumption and the RSQ and the vulnerability function for household $i$ is therefore given by $v_i(rsq) = (\beta_1 + \beta_2 V_{i,d,r,t}) \times rsq$, although our results also include estimations with more complex polynomial functional forms. Regional fixed-effects $\gamma_r$ absorb time-invariant unobserved spatial characteristics at the regional level that are not captured in $X_{i,d,r,t}$. Those include, for instance, access to food markets, to urban markets for trade or non-agricultural work, climatic and geographic factors, institutions and cultural characteristics. Year fixed-effects $\delta_t$ capture yearly variations at the country level such as those due to exogenous price shocks on food commodities or other goods typically consumed by households and affecting their real disposable income. In fact, exogenous food price shocks can significantly worsen the food security situation in Mauritania[20] where more than half of cereal needs are typically covered with imports.[21] For practical reasons, we exclusively use categorical variables for vulnerability factors where each value defines a household typology. For instance, assuming there is only one vulnerability factor $V_{i,d,r,t}$ that takes on 5 values $k \in \{1, \dots, 5\}$, we have 5 household typologies $(H_k)_{k \in \{1, \dots, 5\}}$ defined by $H_k = \{i \in H \mid V_{i,d,r,t} = k\}$, where $H$ is the full set of households.
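To make the role of typology-specific vulnerability functions concrete, the estimation procedure of section 3.1 can be sketched as follows for a single department. This is a minimal illustration, not the authors' code. It assumes a linear vulnerability function with one slope per typology, represents each baseline food consumption distribution by a small weighted sample, and uses the food insecurity line of 42 mentioned in section 4. All names and figures in the example are hypothetical.

```python
FOOD_INSECURITY_LINE = 42.0  # FCS threshold used in the paper


def food_insecurity_rate(baseline_fcs, weights, beta, rsq,
                         line=FOOD_INSECURITY_LINE):
    """Weighted share of households whose shifted FCS is below the line."""
    delta = beta * rsq  # linear vulnerability function evaluated at rsq
    below = sum(w for y, w in zip(baseline_fcs, weights) if y + delta < line)
    return below / sum(weights)


def department_estimate(units, rsq):
    """Aggregate the modeling units (typologies) of one department.

    Returns (number of food insecure households, department-level rate).
    """
    total = sum(u["households"] for u in units)
    insecure = sum(
        u["households"]
        * food_insecurity_rate(u["fcs"], u["weights"], u["beta"], rsq)
        for u in units
    )
    return insecure, insecure / total


# Two hypothetical typologies in one department (illustrative values only).
units = [
    {"households": 1000, "beta": 4.0,
     "fcs": [30, 40, 45, 50, 60], "weights": [1, 1, 1, 1, 1]},
    {"households": 500, "beta": 2.0,
     "fcs": [35, 44, 48, 55, 70], "weights": [1, 1, 1, 1, 1]},
]

baseline_count, baseline_rate = department_estimate(units, rsq=0.0)
shock_count, shock_rate = department_estimate(units, rsq=-2.0)
```

With these illustrative inputs, an RSQ index of -2 shifts the two baseline samples by -8 and -4 FCS points respectively, raising the estimated number of food insecure households from 500 to 800 out of 1,500.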
[19] The choice of spatial resolution for the hazard component is another dimension that we only briefly explore in this paper. We ultimately use department-level RSQ indices because they consistently yielded better results compared to indices at regional and communal levels. We could also have tested household-level indices considering buffers of different sizes (e.g. 5km, 10km, 20km...) around household locations, although only three (2013-2015) out of our five FSMS rounds include households’ GPS coordinates, which would lead to a substantial reduction in the sample size.

[20] For instance, in 2008, Mauritania was hard hit by a global food price shock that deteriorated a structurally fragile food situation.

[21] Cereal imports (rice, wheat, millet, sorghum and corn) represent a yearly average of 62.5% of cereal consumption in the country, although we do not have information on how much this fraction differs between rural and urban markets. Source: authors’ calculation based on data from the Office National de la Statistique de la République Islamique de Mauritanie (Statistical Office of Mauritania).

We can then rewrite equation 3 as:

$$y_{i,d,r,t} = \beta_0 + \sum_{k=1}^{5} \beta_{1,k}\, \mathbf{1}_{H_k}(i)\, rsq_{d,t} + \beta_3 X_{i,d,r,t} + \gamma_r + \delta_t + \varepsilon_{i,d,r,t} \tag{4}$$

where $\beta_{1,k}$ is the marginal effect of the RSQ on food consumption for households belonging to typology $H_k$. The estimation of coefficients $(\beta_{1,k})_{k \in \{1, \dots, 5\}}$ in equation 4 allows us to infer a vulnerability function for each household typology $k \in \{1, \dots, 5\}$:

$$v_k(rsq) = \beta_{1,k} \times rsq \tag{5}$$

Several issues challenge the identification of the coefficients of interest in practice. First of all, the objective is to capture the net effect of the RSQ on food consumption, so the control variables $X_{i,d,r,t}$ cannot be outcomes of the RSQ to avoid an over-controlling problem (Dell et al., 2014).
Our primary vector of controls therefore includes: the sex, age, education level and marital status of the household head, the household size, the dependency ratio, average yearly precipitations in the department of residence and reported idiosyncratic shocks (death of a household member, health issue). We also consider a second set of controls (job loss, main source of income, livestock ownership, aid received) that may imply over-controlling issues, and we check for the robustness of our results to their inclusion. Second of all, the choice of vulnerability factors is constrained for several reasons. Complex interactions implying higher numbers of typologies are associated with an increased risk of over-fitting and lower statistical power within each typology, a serious concern given that prediction is our primary goal. In addition to estimating vulnerability functions, a key objective of the calibration exercise is therefore to determine an optimal set of vulnerability factors under constraints imposed by the data in terms of sample size and reported household characteristics. We address this by implementing a machine learning procedure that cross-validates the form of our final specification. Also, and more importantly, more complex heterogeneity structures limit the variation in the RSQ values available for estimating our coefficients of interest, by decreasing the average sample size across typologies. Ideally, we require a uniform distribution of RSQ values ranging from negative to positive extreme values in order to get precise estimates of parameters defining the RSQ-FCS relation. Our study period is a succession of good, bad and average years, which results in a variation in RSQ indices that is realistically in line with our objective to calibrate vulnerability functions (see section 3.2). Finally, vulnerability factors cannot be outcomes of the RSQ as this would induce some bias caused by the possibility of households moving across typologies.
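The trade-off between richer typologies and over-fitting can be illustrated with a generic k-fold cross-validation that compares candidate heterogeneity structures by out-of-sample error. The sketch below is only an illustration of that general idea and should not be read as the paper's machine learning procedure. It fits one simple linear FCS-RSQ relation per typology, omits controls and fixed effects, and runs on synthetic data generated in the code; the group labels, slopes and function names are all hypothetical.

```python
import random


def fit_line(xs, ys):
    """Closed-form simple OLS. Returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def cv_mae(rows, typology_of, k=5, seed=0):
    """k-fold CV mean absolute error with one fitted line per typology."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    errors = []
    for fold in range(k):
        test = rows[fold::k]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        # Pooled line, used as a fallback for typologies unseen in training.
        pooled_fit = fit_line([r["rsq"] for r in train],
                              [r["fcs"] for r in train])
        models = {}
        for key in {typology_of(r) for r in train}:
            sub = [r for r in train if typology_of(r) == key]
            models[key] = fit_line([r["rsq"] for r in sub],
                                   [r["fcs"] for r in sub])
        for r in test:
            a, b = models.get(typology_of(r), pooled_fit)
            errors.append(abs(r["fcs"] - (a + b * r["rsq"])))
    return sum(errors) / len(errors)


# Synthetic data: two hypothetical groups with different true slopes.
rng = random.Random(1)
rows = []
for i in range(200):
    group = "livestock" if i % 2 else "other"
    rsq = rng.uniform(-2, 2)
    true_slope = 6.0 if group == "livestock" else 1.0
    rows.append({"group": group, "rsq": rsq,
                 "fcs": 50 + true_slope * rsq + rng.gauss(0, 1)})

mae_pooled = cv_mae(rows, lambda r: "all")          # single typology
mae_by_group = cv_mae(rows, lambda r: r["group"])   # one slope per group
```

On data generated this way, the specification with one slope per group achieves a lower cross-validated error than the pooled one. With real survey data, finer typologies would eventually stop improving out-of-sample error as the sample size per typology shrinks.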
Since there is no consensus around how to best measure the RSQ, we test the range of precipitation-, vegetation- and dry matter-based indices that we introduced in section 3.2. We first follow a conventional approach by conducting regression analyses that allow us to highlight key stylized facts on the effect of the RSQ on food consumption. Then, we use those results to guide the implementation of the machine learning method in which we determine an optimal choice of both vulnerability factors and RSQ measures in the context of rural Mauritania.

4 Results

4.1 Impact of the rainy season quality on food consumption

4.1.1 Main effect

In this section, we present results of pooled OLS estimations for specifications derived from the reduced form given by equation 3. We explore both the impact of RSQ measurement choices on our results and, more importantly, the heterogeneity of the effect of the RSQ on food consumption based on spatial fixed-effects and household characteristics.[22] All standard errors shown in the results are clustered at the department level.[23] As mentioned in section 3, we consider two separate groups of control variables. “Group A” comprises household characteristics that are presumably unrelated to the RSQ, while “group B” contains those control variables that may plausibly induce over-controlling issues (job loss, main source of income, livestock ownership, aid received). We start by estimating a model without interaction, i.e. setting $V_{i,d,r,t} = 0$ in equation 3, to identify the overall marginal effect of the RSQ on food consumption. We show results in Table 1,[24] where we use the 5-month precipitation z-score as a measure of the RSQ.[25] Including both groups of controls (group A and group B) and controlling for regional fixed-effects, we find that a one unit decrease in the RSQ index (in this case corresponding to a one standard deviation decrease in precipitations) is associated with a 3.2 points loss in the Food Consumption Score (column 4). The estimated coefficient is significant at a 1 percent level and both the statistical significance and magnitude of the coefficient are robust to the exclusion of group B controls (column 3), of regional fixed-effects (column 2) and when excluding all types of controls (column 1). To better understand the magnitude of the estimated impact we carry out two complementary analyses. First, we simply estimate a log-linear version of the model so that we can interpret the estimated coefficient in terms of a relative change in food consumption. We find that a one standard deviation decrease in precipitations is associated with a 6.4 percent loss in the FCS (see Table C.2 in the appendix). Second, we apply the estimation procedure of the prediction model described in section 3 to calculate that a hypothetically uniform one standard deviation decrease from the mean in precipitations would lead to a 7.3 percentage points (p.p.) increase in the national food insecurity rate in rural areas, from 34% to 41.3%.[26] In 2020, this would roughly represent an additional 150,000 people in a situation of food insecurity compared to the baseline scenario.

Table 1: Pooled OLS regression of food consumption on 5-month z-rain.

Dep. variable: Food Consumption Score
                     (1)          (2)          (3)          (4)
5-month z-rain       3.183***     3.386***     2.894***     3.194***
                     (0.442)      (0.440)      (0.445)      (0.497)
Constant             53.478***    42.223***    38.098***    35.085***
                     (1.029)      (2.594)      (4.392)      (4.383)
Controls-A           No           Yes          Yes          Yes
Controls-B           No           Yes          No           Yes
Region FE            No           No           Yes          Yes
Observations         10,966       10,924       10,925       10,924
Adjusted R2          0.035        0.127        0.125        0.170
Note: *p<0.1; **p<0.05; ***p<0.01

[22] Although we include year fixed effects in our general specification (equation 3) for reasons explained in section 3, we omit those in our results as they absorb most of the effect of the RSQ on food consumption. The main reason lies in the high spatial correlation in RSQ experienced across departments for each individual year, which causes the inclusion of year fixed-effects to induce an over-controlling problem. We provide maps of the 5-month SPI at the department-level for each year of the studied period (2010-2014) in appendix (Figure B.5) to illustrate the fact that, for any given year, the rainy season quality is fairly homogeneous across departments. We also use our household-level dataset to regress 5-month SPI values against year fixed-effects and find that the latter explain 82% of the variance in the 5-month SPI in our sample, whereas cross-sectional variation represents only 10%. To confirm that year fixed-effects effectively absorb the effect of the RSQ, we estimate a basic model without interactions and including all controls and year fixed-effects (column (1) of Table C.1 in appendix). We find that year fixed-effects are almost perfectly correlated with a (rural) population-weighted average of department-level 5-month SPI (Pearson coefficient equal to 0.96).

[23] There are only 34 clusters (i.e. departments) so we use a bootstrapping method to calculate our clustered standard errors.

[24] Full results including coefficients associated with all controls are provided in Table C.1 in appendix.

[25] The 5-month precipitation z-score yields the highest goodness-of-fit for this specification although results of estimations using other indices are provided in appendix, more on this below.

[26] To arrive at this result, we first subtract the estimated total effect of the RSQ (i.e. the product of the RSQ index value with the estimated marginal effect 3.194) from our historical FCS values to obtain a nationally representative synthetic sample of baseline food consumption values. Those represent the estimated food consumption level for an RSQ scenario in which precipitation levels are equal to their historical mean. Then, we infer the FCS values for a scenario where precipitations are one standard deviation below their historical mean by subtracting 3.194 from baseline FCS values. Accounting for sampling weights, we find that 41.3% of households have an FCS below the food insecurity line (42) in this scenario against 34% in the baseline RSQ scenario corresponding to average precipitations.

In order to evaluate the impact of RSQ measurement choices on our results, we replicate the results of Table 1 considering different time windows (1 to 5 months) for the precipitation z-score as well as other types of RSQ indices (NDVI, EVI, DMP anomalies). We provide results in Tables C.3 to C.10 in the appendix. We compare the performance of RSQ indices for the main specification including all controls (column 4 in Table 1) and based on the goodness-of-fit as measured by the adjusted R2. This is in no way a measure of the overall accuracy of the food insecurity prediction model but rather an indication of the goodness-of-fit for the relationship between the RSQ and food consumption and a simple means to support specification choices.[27] Our main conclusion is that the results hold across the range of RSQ indices tested and that RSQ measurement choices seem to have only a limited impact on the results, at least when estimating a national-level marginal effect of the RSQ. Considering the optimal time window for each index type, we find that the adjusted R-squared varies only between 16.6% and 17%. Some marginal differences are still worth noting. For any given index type, the goodness-of-fit consistently increases with the time window used, indicating that anomalies of aggregate rainfall/greenness/dry matter productivity for the entire rainy season carry more relevant information than those relative to peak months only.
For instance, the adjusted R-squared increases from 15.8% to 17% between specifications using the 1-month and 5-month precipitation z-scores respectively (Table C.3 in the appendix). Also, we do not observe any marked differences between spi-like and z-score anomalies, although the for- mer perform marginally better for vegetation indices.28 However, we do not rule out the possibility of variation in the optimal RSQ index in space and across household groups in our final calibration. Our cross-validation exercise will therefore allow for RSQ indices to vary across household typologies. 4.1.2 Heterogeneity across space and household groups Next, we turn to the heterogeneity of the marginal effect of the RSQ on food consumption. First, we investigate how the effect varies across space by taking regional fixed-effects as a vulnerability factor.29 We find substantial variation across regions with an estimated marginal effect ranging from -0.448 in Gorgol (not significant at a 10% level) to 5.9 in the Hodh El Gharbi as well as a marginal increase in the explaining power, highlighting the importance of accounting for spatial disparities in the RSQ effect for sub-national 27 The performance of the prediction model also depends on the exposure component, in particular our ability to calibrate baseline food consumption distributions. 28 Because they are also easier to interpret and rely on weaker distributional assumptions, we generally prefer z-score over spi-like RSQ indices. 29 We replicate the results with other spatial fixed-effects including livelihood zones and regions de- fined based on rainfall regime. We provide the corresponding results in appendix (Table C.11). We keep regional fixed-effects interaction in our core results because they yield the highest improvement in explaining power. 23 predictions. 
Interpreting those differences based on regional characteristics remains dif- ficult at this point, although allowing for more complex functional forms provides some insights into this issue – more on this below. Second, we try to explain differences in the marginal effect of the RSQ on food consumption based on relevant household character- istics. We estimate equation 4 considering a range of household-level interactions: age, sex, education level and marital status of household head, dependency ratio, household size, livestock ownership30 and primary income source.31 Only livestock ownership and primary income source yield statistically significant coefficients and we report the cor- responding results in Table 3.32 The magnitude of the marginal effect of the RSQ on the FCS shows a clear increasing trend with livestock ownership, from 1.9 for households without any form of livestock (base category) to 4.1 for those in the top quartile. This tends to suggest that livestock ownership increases the sensitivity of livelihood means to climate variability. Further analyses will be needed to clearly identify the channel through which this results in a more elastic food consumption.33 Column 2 in Table 3 shows the same results replacing livestock ownership by the reported primary income source. Consistent with our previous finding, households relying mainly on livestock ac- tivities showcase the highest marginal effect of the RSQ on food consumption. Somehow surprisingly, those living from agriculture have the lowest marginal effect (1.8). In fact, only 12% of households report agriculture as their primary source of income whereas 64% declare possessing farmland so the sample of households generating agricultural income is most likely biased towards a subset of agricultural households farming larger plots,34 perhaps with higher rates of technology adoption and irrigation access, making them rel- atively more climate-resilient. 
Overall, our findings point to the existence of important disparities in the marginal effect of the RSQ on food consumption across regions and among household groups, supporting the notion that well-designed household typologies are critical to the calibration of a sub-national level prediction model.

[30] Livestock ownership is measured in Tropical Livestock Units (TLU), which allow livestock numbers to be aggregated into a common unit by assigning different weights to livestock types: 0.7 for camels, 0.5 for cattle and 0.1 for sheep and goats. We create a categorical variable that takes on 5 values: 1 if the household has no livestock, and 2, 3, 4, 5 if the TLU value is in the first, second, third and top quartile of the rural TLU distribution, respectively.
[31] Income source categories are not consistent across our five surveys but we harmonize them to a common set of 6 categories: agriculture, livestock, fishing, small business/informal, formal sector and remittances. Households relying primarily on fishing are excluded as they represent only 1% of the original sample.
[32] Results for other household interactions are available upon request to the authors.
[33] However, one can hypothesize a direct effect on the self-consumption of meat due to livestock losses and distress sales, as well as an income effect caused by livestock losses and by a potential drop in livestock selling prices. On the other hand, a closer look at livestock ownership as a control variable (Table C.1 in appendix C.1) reveals that it is, all else equal, a significant determinant of the FCS, with households in the top quartile having on average an additional 13.3 points compared to those with no livestock. The overall effect on food insecurity is therefore ambiguous, since households owning more livestock are more sensitive to climate variability but are also farther away from the food insecurity line. Note that our prediction model resolves this ambiguity by explicitly quantifying the distance from the food insecurity line through the baseline food consumption distributions for each typology.
[34] We find disproportionately more households owning more than 0.5 ha of farmland in the subset of households reporting agriculture as a primary income source (41%), compared to the full sample (28%).

Table 2: Regression of food consumption on the RSQ with regional interaction.

5-month z-rain × region:Assaba             4.232∗∗∗
                                           (1.116)
5-month z-rain × region:Brakna             2.418∗
                                           (1.235)
5-month z-rain × region:Gorgol             −0.448
                                           (1.742)
5-month z-rain × region:Guidimakha         4.549∗∗∗
                                           (0.606)
5-month z-rain × region:Hodh Ech Chargi    3.217∗∗∗
                                           (0.505)
5-month z-rain × region:Hodh El Gharbi     5.875∗∗∗
                                           (1.787)
5-month z-rain × region:Tagant             3.239∗∗∗
                                           (0.723)
5-month z-rain × region:Trarza             4.718∗∗∗
                                           (1.026)
Constant                                   37.679∗∗∗
                                           (4.855)
Controls-A                                 Yes
Controls-B                                 Yes
region FE                                  Yes
Observations                               10,924
Adjusted R2                                0.180
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table 3: Regression of food consumption on the RSQ with household-level interactions.

                                                 (1)          (2)
5-month z-rain                                   1.862∗∗      1.777∗∗
                                                 (0.946)      (0.805)
5-month z-rain × livestock:1st quartile          0.905
                                                 (0.715)
5-month z-rain × livestock:2nd quartile          1.118∗
                                                 (0.630)
5-month z-rain × livestock:3rd quartile          1.988∗∗∗
                                                 (0.754)
5-month z-rain × livestock:Top quartile          2.278∗
                                                 (1.256)
5-month z-rain × inc. source:Livestock                        2.619∗∗∗
                                                              (0.950)
5-month z-rain × inc. source:Small business                   1.605∗∗
                                                              (0.727)
5-month z-rain × inc. source:Formal                           1.013∗
                                                              (0.549)
5-month z-rain × inc. source:Remittances                      1.171
                                                              (0.938)
Constant                                         36.121∗∗∗    36.907∗∗∗
                                                 (2.192)      (2.513)
Controls-A                                       Yes          Yes
Controls-B                                       Yes          Yes
region FE                                        Yes          Yes
Observations                                     10,924       10,924
Adjusted R2                                      0.172        0.172
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

4.1.3 Quantile regressions

As illustrated in Figure 3, our food insecurity estimation procedure assumes a uniform effect of the RSQ on food consumption along the baseline FCS distribution to infer the FCS distribution under a given RSQ scenario. The standard OLS methods used above provide estimates of the RSQ effect at the mean of the outcome distribution, which we have so far implicitly considered to coincide with the uniform effect applied in the estimation procedure. To the extent that the conditional mean of the distribution may be far away from the food insecurity line and that marginal effects may differ across the distribution, this can be problematic for our food insecurity estimation method. We test the plausibility of our assumption by investigating the heterogeneity in the RSQ effects along the food consumption distribution: we estimate quantile regression models to uncover differences in RSQ effects across conditional quantiles. We first consider our initial specification without interaction and estimate the marginal effect of the RSQ on the FCS for quantiles ranging from 0.05 to 0.95. The results presented in Figure 6a reveal an inverted U-shaped pattern, with a marginal RSQ effect peaking at above 4 around the 6th decile and falling to around 2 at the 5th percentile. We are particularly interested in understanding how the marginal effect varies within the part of the distribution that corresponds broadly to the subset of households around the food insecurity line, which are thus liable to move in and out of food insecurity due to climate variability.
Based on historical national food insecurity rates, we estimate that this is roughly the 25th-45th percentile interval.[35] Our estimated marginal effects vary only slightly over this interval (between approximately 3 and 3.5) and are broadly in line with our OLS estimate, which does not invalidate the assumption made in the food insecurity estimation method. We also estimate quantile regressions on logged food consumption, which permits a utility interpretation of the results. Interestingly, we find the highest marginal effects (around 9%) at the bottom of the distribution; they then decrease to around 7% between the 3rd and 6th deciles before declining markedly to approximately 2.5% at the top of the distribution. This is broadly consistent with the notion that, in relative terms, the poorest are more vulnerable to climate variability. We replicate the analysis on the specification including a regional fixed-effects interaction and leave the results in appendix C.4 for the sake of conciseness. It is worth noting that, for some regions, the OLS estimate differs substantially from the marginal effects estimated in the percentile interval of interest. In our final prediction model calibration, we partially address this issue by allowing for distinct vulnerability functions across household groups within regions, defined based on characteristics that also capture differences in food consumption ceteris paribus. We also estimate our final model with quantile regressions at different percentile values and choose the value that minimizes prediction errors at the regional level.

[35] We estimate moderate food insecurity rates at the national level from the FSMS for the period 2011-2015 and we find a minimum of 28.8% in 2013 and a maximum of 41.9% in 2012 as a result of the 2011 drought. The 25th-45th percentile interval considered is thus a conservative interval.

Figure 6: Quantile regression of food consumption on 5-month z-rain.
(a) Estimated marginal effect of RSQ on FCS level (b) Estimated marginal effect of RSQ on logged FCS

4.1.4 Non-linear functional forms

All results presented so far assume a linear relationship between the RSQ index and food consumption.[36] Although it is a natural starting point in the analysis, we also investigate the existence of non-linearities by estimating third-order polynomial models. As previously, we start with our basic specification without interaction, using the 5-month z-rain as the RSQ measure.[37] All three coefficients of interest are statistically significant (see Table C.12 in appendix C.5) and we show the inferred vulnerability function in Figure 7 below. The relationship is mostly positive and linear on the [−0.5, 2.5] interval but we observe a marked decrease in the marginal effect, which becomes negative after 2.5. One possible explanation lies in the potential adverse effects of excess rainfall, which may result in crop losses due to flooding events and the spread of crop pests and diseases, while also causing mobility frictions and reducing market interconnectedness. We once again estimate a model including a regional fixed-effects interaction and find substantial spatial heterogeneity in RSQ effects, which we illustrate in Figure 8,[38] which provides vulnerability functions for four regions. Those functions remain strictly increasing for the Assaba and Guidimakha regions and broadly consistent with our expectation on the RSQ-food consumption relationship.

[36] More specifically, between the RSQ index and the conditional mean or conditional quantiles of the outcome of interest.
[37] The specification is simply:
$$y_{i,d,r,t} = \beta_0 + \beta_{11}\, rsq_{d,t} + \beta_{12}\, rsq_{d,t}^{2} + \beta_{13}\, rsq_{d,t}^{3} + \beta_3 X_{i,d,r,t} + \gamma_r + \delta_t + \epsilon_{i,d,r,t}$$
and the coefficients $\beta_{11}$, $\beta_{12}$ and $\beta_{13}$ define the vulnerability function $v(rsq) = \beta_{11}\, rsq + \beta_{12}\, rsq^{2} + \beta_{13}\, rsq^{3}$.
[38] Estimated polynomial vulnerability functions for all 8 regions are provided in appendix C.5.
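As an illustration of how a cubic vulnerability function of this form behaves, the following Python sketch evaluates v(rsq) and its derivative (the marginal effect). The three coefficients are hypothetical: they are not the estimates of Table C.12, and were picked only so that the marginal effect changes sign at an RSQ value of 2.5, mirroring the shape described in the text.

```python
# HYPOTHETICAL coefficients (not the paper's estimates), chosen so that the
# marginal effect is positive below rsq = 2.5 and negative above it.
B1, B2, B3 = 3.0, 0.15, -0.2


def vulnerability(rsq, b1=B1, b2=B2, b3=B3):
    """Cubic vulnerability function v(rsq) = b1*rsq + b2*rsq^2 + b3*rsq^3."""
    return b1 * rsq + b2 * rsq ** 2 + b3 * rsq ** 3


def marginal_effect(rsq, b1=B1, b2=B2, b3=B3):
    """First derivative of v with respect to rsq."""
    return b1 + 2 * b2 * rsq + 3 * b3 * rsq ** 2


for rsq in (-0.5, 1.0, 2.5, 3.0):
    print(f"rsq={rsq:+.1f}  v={vulnerability(rsq):7.3f}  dv/drsq={marginal_effect(rsq):7.3f}")
```

With a negative cubic term, the function keeps rising over moderate RSQ values and bends downward at the high end, which is the pattern attributed above to excess rainfall.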
On the other hand, we again find a decreasing pattern for high RSQ values in the vulnerability function for the Tagant region and, to some extent, for the Trarza region, as well as for Gorgol and Brakna (see Figure C.3 in appendix C.5). Consistent with our prior interpretation, we find that a significant flooding event that affected exactly these four regions in 2010 is reported in the EM-DAT database,[39] while all RSQ values above 2.5 are observed in that same year. Finally, the Trarza vulnerability function exhibits a surprising decreasing trend within the negative RSQ index range, indicating a significant improvement in food consumption as the RSQ worsens. One plausible explanation is gains from trade enabled by a surge in staple prices due to rainfall deficits, which would benefit the Trarza region: it produces irrigated rice, which is most likely more climate-resilient than other forms of agriculture in the rest of the country. Attrition caused by selective migration of the poorest households is another plausible explanation. We leave it to future research to investigate those facts in more depth and simply use them to highlight the complexity and heterogeneity of the RSQ-food consumption relationship.

[39] EM-DAT, CRED / UCLouvain, Brussels, Belgium, www.emdat.be

Figure 7: Country-level polynomial vulnerability function.
Note: Red dashed lines represent 95% confidence intervals and the vulnerability function is plotted on the RSQ interval actually observed in the sample. The background histogram represents the distribution of RSQ values in the sample used for the estimation.

Figure 8: Polynomial vulnerability functions for four regions.
(a) Assaba (b) Guidimakha (c) Tagant (d) Trarza
Note: Red dashed lines represent 95% confidence intervals and vulnerability functions are plotted on the RSQ interval actually observed in the sample.
The background histograms represent the distributions of RSQ values in the subset of observations for each of the corresponding regions.

4.2 Forecasting model predictions

The analysis conducted in section 4.1 reveals highly complex and heterogeneous relationships between the quality of rainy seasons and food consumption. There are marked disparities across regions and household groups as well as strong evidence of non-linearities that add further complexity to the RSQ-food consumption nexus. Moreover, although we found that the 5-month precipitation z-score best explains food consumption in our basic specification (Table 1) and thus used it throughout our analysis, we do not rule out the possibility that optimal time windows and index types may also differ across space and household groups. Heterogeneity in all those dimensions implies spatial disparities in the RSQ-food consumption relationship, and failing to account for these differences may result in large errors when making predictions at sub-national scales. On the other hand, identifying an optimal solution to this multi-dimensional problem remains a challenging task. In light of our results in section 4.1, we allow vulnerability functions to differ across geographic zones, and across household groups within these geographic units. Assuming we have at least three viable geographic breakdowns (regions, livelihood zones, rainfall zones) and two household characteristics (primary income source, livestock ownership) to define household groups within geographic units, this already represents 328 possible combinations, each of which determines a set of household typologies. For instance, a regional breakdown (8 regions) with household groups defined based on livestock ownership (5 categories) in all regions would result in 40 household typologies.
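The text does not detail how the figure of 328 is obtained. One reading that reproduces it, offered here as an assumption rather than as the authors' stated derivation, is that each geographic unit independently uses one of the two household characteristics, so that a breakdown with n units admits 2^n structures. With the 8 regions mentioned in the text, and assuming 6 and 3 units for the two other breakdowns (these two counts are hypothetical and were chosen only because they reproduce the total), the sum is 2^8 + 2^6 + 2^3 = 328. A minimal Python check:

```python
# Number of geographic units per candidate breakdown.
# 8 regions is stated in the text; 6 and 3 are ASSUMED values (not given in the text).
UNITS_PER_BREAKDOWN = {"regions": 8, "livelihood_zones": 6, "rainfall_zones": 3}
N_HOUSEHOLD_FACTORS = 2      # primary income source, livestock ownership
CATEGORIES_PER_FACTOR = 5    # e.g. the 5 livestock ownership categories


def count_structures(units_per_breakdown, n_factors):
    """Each unit picks one household factor: n_factors ** n_units per breakdown."""
    return sum(n_factors ** n_units for n_units in units_per_breakdown.values())


n_structures = count_structures(UNITS_PER_BREAKDOWN, N_HOUSEHOLD_FACTORS)
n_typologies_regional = UNITS_PER_BREAKDOWN["regions"] * CATEGORIES_PER_FACTOR
print(n_structures, n_typologies_regional)
```

The second number is the 40 typologies of the regional example: 8 regions times 5 livestock ownership categories.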
Conditional on a typology structure, we further assume that RSQ measures can vary across typologies, which brings an additional dimension to the optimization problem. We allow for 1- to 5-month precipitation anomalies and 1- to 4-month NDVI, EVI and DMP anomalies, which represent a total of 17 possible RSQ indices for each typology.

We present the results of a food insecurity prediction model where the vulnerability component is optimized with respect to both the heterogeneity structure and the RSQ indices at the typology level, based on a machine learning procedure (cross-validation), details of which are provided in appendix D. The cross-validation algorithm uses the Mean Squared Error (MSE) between modeled household food consumption and ground-truth FSMS values to evaluate the performance of the models tested. Our optimal structure has a geographic breakdown based on regions and uses primary income source and livestock ownership as household-level factors to define typologies within regions.[40] We follow the estimation procedure described in section 3.1 and produce food insecurity rate predictions at the national and regional levels for the period 2011-2015, which we compare with estimates from the FSMS. We benchmark the results against a simple model with a unique household typology[41] and baseline food consumption distributions calibrated at the regional level, in order to assess the gains obtained from a more complex vulnerability component. Figure 9a shows the results at the national level, where predictions of food insecurity rates closely follow FSMS estimates for the period 2011-2015: we find a mean absolute error of 1.2 p.p.[42] The benchmark model that does not account for heterogeneity in the vulnerability component yields a slightly higher deviation from FSMS estimates, with a mean absolute error of 2.3 p.p., although it still captures food insecurity trends over the period (Figure 9b). However, the regional-level comparisons in Figures 9c and 9d clearly reveal the improvement brought about by our optimized vulnerability structure relative to the benchmark model. The correlation coefficient between regional predictions and FSMS estimates increases from 0.75 to 0.84 and the mean absolute error drops by nearly 2 p.p. Our main conclusions from these results are twofold. First, our ability to reproduce the important variations in national food insecurity observed, even with a rough national-level estimate of the RSQ-food consumption relation, indicates that climate variability is a key driver of food consumption in rural Mauritania and that the proposed cat model-based methodology effectively models this relationship. Second, the regional-level comparison highlights the critical importance of a fine-tuned understanding of the heterogeneity within the vulnerability component for producing usable sub-national predictions.

Figure 9: Prediction versus FSMS estimates, 2011-2015.
(a) Final model - national (b) Benchmark model - national (c) Final model - regional (d) Benchmark model - regional

[40] The primary income source is found to be the optimal factor in the Brakna, Guidimakha, Hodh Ech Chargi and Gorgol regions, while the livestock ownership category is preferred in the other four regions (Assaba, Hodh El Gharbi, Trarza, Tagant).
[41] The corresponding vulnerability function is calibrated using the 5-month precipitation z-score.
[42] These results are based on a food insecurity line of 41 because the mode of the FCS in our historical sample coincides with the usual food insecurity line (nearly 6% of FCS values are equal to 42), which makes historical food insecurity rates highly sensitive to the usual food insecurity threshold. As a robustness check, we reproduced the results with other food insecurity lines and obtained comparable performance levels (results available upon request to the authors).
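The selection step described in this section, choosing an RSQ index by out-of-sample MSE, can be sketched in a few lines of Python. This is a generic K-fold cross-validation on a one-regressor OLS fit, not the authors' algorithm (whose details are in appendix D). The data are synthetic, and the index names `z_rain_5m` and `z_ndvi_3m` are placeholders introduced only for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """OLS intercept and slope for y ~ a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def kfold_mse(xs, ys, n_folds=5, seed=0):
    """Mean squared error on held-out folds."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    sq_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return mean(sq_errors)


def select_index(candidates, ys):
    """Return the candidate RSQ index with the lowest cross-validated MSE."""
    scores = {name: kfold_mse(xs, ys) for name, xs in candidates.items()}
    return min(scores, key=scores.get), scores


# SYNTHETIC data: food consumption is driven by the first candidate index only.
rng = random.Random(42)
n = 300
candidates = {
    "z_rain_5m": [rng.gauss(0, 1) for _ in range(n)],
    "z_ndvi_3m": [rng.gauss(0, 1) for _ in range(n)],
}
fcs = [40 + 3.0 * x + rng.gauss(0, 2) for x in candidates["z_rain_5m"]]

best_index, cv_scores = select_index(candidates, fcs)
print(best_index, {name: round(score, 2) for name, score in cv_scores.items()})
```

In a setting like the one described above, the same comparison would be run per typology over all candidate indices, with the household controls included in the regression.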
4.3 Extension to a food insecurity risk model

The regression analyses in section 4.1 advance our understanding of the heterogeneous impact of the RSQ on food consumption, and we have largely exploited those results to calibrate the vulnerability component of our prediction model in section 4.2. Predictions for the period 2011-2015 were produced to show that the model is very much in line with historical values. In this section, we move from historical RSQ scenarios to a probabilistic hazard component in order to extend the prediction model to a food insecurity risk model. The objective is to illustrate how the proposed modeling framework can also be used to estimate a probability distribution of future food insecurity, which makes it possible to depict the volatility of national food insecurity caused by the variability in the quality of rainy seasons.

To that end, we produce a catalogue of 10,000 synthetic RSQ scenarios based on a statistical analysis of historical monthly precipitation in Mauritania.[43] The set of synthetic events is a probabilistic representation of all possible RSQ scenarios in the country. We calibrate a simplified food insecurity model using only the 5-month SPI in the hazard component, in order to simplify the hazard simulation procedure,[44] and we compute national-level food insecurity estimates for each individual synthetic scenario. We can then infer a probability distribution of future food insecurity from the resulting sample of 10,000 estimates. We show the result in Figure 10. This is an example of how our methodology can be applied to welfare risk assessment, which is critical to the design of adaptive social protection programs. Improved risk information can support governments’ preparedness for food welfare shocks through adequately scaled response systems and coherent risk financing strategies.

In line with previous studies on the risk of poverty such as that of Hill and Porter (2017) and Skoufias et al.
(2021), we also adopt a second approach in which we directly use our nationally representative sample of households and simulate food consumption responses under the 10,000 synthetic scenarios to infer each household's probability distribution of food consumption. This in turn allows us to estimate the share of households exposed to the risk of food insecurity, defined, for instance, as those with a probability of being food insecure greater than 0.5. With this definition, we find that 31% of households are at risk of food insecurity in rural Mauritania.

[43] We fit a multivariate normal distribution to historical 5-month SPI values in the 34 departments of the study area over the period 1981-2022, and we generate 10,000 scenarios based on the estimated mean and covariance matrix.
[44] A comparison of this model’s estimates against historical data for the period 2011-2015 shows a performance that is comparable to that obtained with the optimal model of section 4.2.

Figure 10: Probability distribution of the number of food insecure households at the national level.

5 Conclusion

We develop a methodology that incorporates microeconomic estimates of the impact of hazard conditions on welfare into a cat model framework for producing sub-national predictions and quantifying welfare risk. We focus on the impact of the rainy season quality on food consumption, and we apply our framework in the context of rural Mauritania. We pair household observations from five FSMS rounds with RSQ indices to estimate the impact of weather anomalies on food consumption. We find that a one standard deviation decrease in the 5-month precipitation z-score is associated with a 6.4% loss in the food consumption score on average, although the effect varies substantially across regions and key household characteristics such as livestock ownership and primary income source.
We also find significant differences in the marginal effect of the RSQ along the outcome distribution, with a one standard deviation decrease in rainfall anomalies being associated with a 2.5% loss in food consumption at the top of the distribution versus 9% at the bottom. Our results show the existence of non-linearities, some of which are most likely explained by adverse effects of excess rainfall. Based on those findings, we calibrate the vulnerability component of our prediction model, where the choice of heterogeneity structure and RSQ measurement is optimized via a cross-validation method. The final model predicts national (resp. regional) food insecurity rates with a 1.2 p.p. (resp. 4.9 p.p.) mean absolute error over the period 2011-2015, and forecasts can be produced approximately 8 months ahead of the lean season. Historical rates exhibit large inter-year variations (between 29% and 42%), and the ability of the model to reproduce them indicates that climate variability is a key driver of food consumption in rural Mauritania and that the proposed framework effectively models this relationship. We extend the prediction model to a food insecurity risk model that illustrates how our methodology can also produce probabilistic assessments that quantify the degree of welfare volatility caused by weather variability. Following previous efforts to quantify the vulnerability to poverty, we also apply the model to the estimation of the risk of food insecurity at the household level and we find that 31% of households in rural Mauritania have at least a 50% chance of being food insecure in any given year. The methodology contributes to advancing our understanding of the impact of hazard conditions on welfare in developing contexts and, more importantly, our ability to model it.
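The household-level risk measure summarized above (the share of households with more than a 50% chance of being food insecure) can be illustrated with a deliberately simplified Python sketch. It assumes a single standard-normal RSQ draw per scenario shared by all households and one linear coefficient, whereas the paper draws department-level scenarios from a multivariate normal and uses typology-specific vulnerability functions. The household baseline scores are hypothetical, not survey data.

```python
import random

BETA = 3.194                # OLS estimate for the 5-month z-rain (Table C.3, column 5)
FOOD_INSECURITY_LINE = 42   # usual FCS threshold mentioned in the text
N_SCENARIOS = 10_000


def share_at_risk(baseline_fcs, beta, line, n_scenarios, seed=0):
    """Share of households whose simulated probability of food insecurity exceeds 0.5."""
    rng = random.Random(seed)
    # ASSUMPTION: one standard-normal RSQ value per scenario, common to all households.
    scenarios = [rng.gauss(0.0, 1.0) for _ in range(n_scenarios)]
    at_risk = 0
    for fcs in baseline_fcs:
        # Uniform shift of the baseline score by beta * rsq under each scenario.
        n_insecure = sum(fcs + beta * z < line for z in scenarios)
        if n_insecure / n_scenarios > 0.5:
            at_risk += 1
    return at_risk / len(baseline_fcs)


# HYPOTHETICAL baseline scores for five households (illustrative only).
households = [30.0, 41.0, 43.0, 50.0, 60.0]
risk_share = share_at_risk(households, BETA, FOOD_INSECURITY_LINE, N_SCENARIOS)
print(risk_share)
```

Under these simplifying assumptions the RSQ has a median of zero, so a household is classified as at risk exactly when its baseline score lies below the line. The richer hazard and vulnerability structure of the paper is what breaks this one-to-one correspondence.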
As climate change is expected to further exacerbate weather shocks and climate variability, a better knowledge of the weather-welfare relationship is instrumental in informing the design of a wide array of public policies, ranging from adaptive social protection policies, early recovery programs and risk-financing strategies to public investments that increase resilience to climate shocks.

Given the high level of within-country heterogeneity found in our case study, the vulnerability relations derived in this paper must be considered as entirely specific to the Mauritanian case and should not be directly applied to other contexts. The accuracy of our prediction model must also be considered in the particular case of Mauritania, where weather variability has a direct impact on livelihood means and is the main driver of food insecurity. We can only encourage future work to apply our approach to other contexts, especially in the Sahel, while bearing in mind the potential coexistence of other hazard types such as conflicts, locust outbreaks or exogenous price shocks. That being said, the proposed framework is flexible enough to be adapted to other hazard types and welfare dimensions where data allow it.

References

Baez, J. E., G. Caruso, and C. Niu (2020): “Extreme Weather and Poverty Risk: Evidence from Multiple Shocks in Mozambique,” Economics of Disasters and Climate Change, 4, 103–127.

Beegle, K. and L. Christiaensen (2019): “Accelerating Poverty Reduction in Africa,” Washington, DC: World Bank. © World Bank. https://openknowledge.worldbank.org/handle/10986/32354, license: CC BY 3.0 IGO.

Beguería, S., S. M. Vicente-Serrano, F. Reig, and B. Latorre (2014): “Standardized precipitation evapotranspiration index (SPEI) revisited: Parameter fitting, evapotranspiration models, tools, datasets and drought monitoring,” International Journal of Climatology, 34, 3001–3023.

Christiaensen, L. J. and K. Subbarao (2005): “Towards an understanding of household vulnerability in rural Kenya,” Journal of African Economies, 14, 520–558.

Dell, M., B. F. Jones, and B. A. Olken (2014): “What do we learn from the weather? The new climate-economy literature,” Journal of Economic Literature, 52, 740–798.

Demeke, A. B., A. Keil, and M. Zeller (2011): “Using panel data to estimate the effect of rainfall shocks on smallholders food security and vulnerability in rural Ethiopia,” Climatic Change, 108, 185–206.

Demissie, B. S. and T. A. Kasie (2017): “Rural Households’ Vulnerability to Poverty in Ethiopia,” Journal of Poverty, 21, 528–542.

Dercon, S. (2004): “Growth and shocks: Evidence from rural Ethiopia,” Journal of Development Economics, 74, 309–329.

Dercon, S. and P. Krishnan (2000): “Vulnerability, seasonality and poverty in Ethiopia,” Journal of Development Studies, 36, 25–53.

Dorosh, P. (2009): “Price stabilization, international trade and national cereal stocks: world price shocks and policy response in South Asia,” The Science, Sociology and Economics of Food Production and Access to Food, 1, 137–149.

Gallardo, M. (2018): “Identifying Vulnerability To Poverty: a Critical Survey,” Journal of Economic Surveys, 32, 1074–1105.

Hill, R. V. and C. Porter (2017): “Vulnerability to Drought and Food Price Shocks: Evidence from Ethiopia,” World Development, 96, 65–77.

Huho, J. M. and E. M. Mugalavai (2010): “The Effects of Droughts on Food Security in Kenya,” The International Journal of Climate Change: Impacts and Responses, 2, 61–72.

Lewis, K. (2017): “Understanding climate as a driver of food insecurity in Ethiopia,” Climatic Change, 144, 317–328.

McKee, T. B., N. J. Doesken, and J. Kleist (1993): “The relationship of drought frequency and duration with time scales,” Eighth Conference on Applied Climatology, American Meteorological Society, Jan 17-23, 1993, Anaheim CA, pp. 179-186.

Mishra, A. K. and V. P. Singh (2010): “A review of drought concepts,” Journal of Hydrology, 391, 202–216.

Skoufias, E., K. Vinha, and M. B. Beyene (2021): “Quantifying Vulnerability to Poverty in the Drought-Prone Lowlands of Ethiopia,” Policy Research Working Paper No. 9534. World Bank, Washington, DC, https://openknowledge.worldbank.org/handle/10986/35107, license: CC BY 3.0 IGO.

World Bank (2012): “Using Public Food Grain Stocks to Enhance Food Security,” Washington, DC: World Bank. © World Bank. https://openknowledge.worldbank.org/handle/10986/11878, license: CC BY 3.0 IGO.

World Bank (2020): “Poverty and Shared Prosperity 2020: Reversals of Fortune,” Washington, DC: World Bank. doi: 10.1596/978-1-4648-1602-4, license: Creative Commons Attribution CC BY 3.0 IGO.

Appendix

A Data description

Table A.1: Food groups and weights used in the Food Consumption Score.

Food group                                               Weight
Staples (rice, wheat, sorghum, maize), tubers, roots     2.00
Pulses (beans, peas, nuts)                               3.00
Vegetables                                               1.00
Fruits                                                   1.00
Meat, fish, egg                                          4.00
Dairy                                                    4.00
Sugar                                                    0.50
Oil                                                      0.50

Figure A.1: Geographic extent of the studied area.

B Rainy season quality descriptives

Figure B.1: Correlogram of department-level RSQ indices (short time scale) for the period 2000-2019.
Figure B.2: Correlogram of department-level SPI indices for the period 2000-2019, 1 to 5 months.
Figure B.3: Correlogram of department-level z-NDVI indices for the period 2000-2019, 1 to 4 months.
Figure B.4: Correlogram of department-level z-DMP indices for the period 2000-2019, 1 to 4 months.
Figure B.5: 5-month SPI by department, 2010-2014.

C Additional regression results

C.1 Main specification without interaction

Table C.1: Pooled OLS regression of food consumption on 5-month z-rain, including all controls. Dep.
variable: Food Consumption Score

                          (1)          (2)          (3)          (4)
5-month z-rain            2.849∗∗∗     3.194∗∗∗     −0.084       0.167
                          (0.413)      (0.513)      (1.020)      (1.035)
sex                       −1.263∗∗     −1.184∗∗     −1.059∗      −1.066∗∗
                          (0.605)      (0.547)      (0.597)      (0.521)
age                       0.080∗∗∗     0.044∗∗∗     0.073∗∗∗     0.039∗∗
                          (0.016)      (0.015)      (0.017)      (0.016)
size                      0.373∗∗∗     0.159∗∗      0.407∗∗∗     0.202∗∗∗
                          (0.075)      (0.072)      (0.072)      (0.069)
annual rain               0.005        0.006        0.004        0.005
                          (0.013)      (0.014)      (0.014)      (0.014)
divorced                  −1.559       0.391        −1.173       0.627
                          (1.027)      (1.003)      (1.014)      (1.064)
widowed                   −2.769∗∗∗    −1.590∗      −2.736∗∗∗    −1.612∗
                          (0.844)      (0.877)      (0.832)      (0.838)
single                    0.633        0.313        −0.160       −0.304
                          (2.449)      (2.226)      (2.346)      (2.103)
edu:literate              5.145∗∗      3.738∗       5.261∗∗      3.866∗
                          (2.295)      (2.007)      (2.377)      (2.044)
edu:prim./coranic         7.328∗∗∗     5.916∗∗∗     7.201∗∗∗     5.855∗∗∗
                          (0.918)      (0.876)      (0.879)      (0.837)
edu:sec./higher           12.359∗∗∗    10.306∗∗∗    12.318∗∗∗    10.316∗∗∗
                          (1.365)      (1.338)      (1.377)      (1.327)
dep. ratio:1st quart.     5.356∗∗∗     4.223∗∗∗     4.875∗∗∗     4.007∗∗
                          (1.623)      (1.507)      (1.619)      (1.567)
dep. ratio:2nd quart.     4.060∗∗∗     3.494∗∗∗     3.560∗∗      3.245∗∗
                          (1.370)      (1.347)      (1.395)      (1.397)
dep. ratio:3rd quart.     2.796∗       2.504∗       2.223        2.181
                          (1.448)      (1.391)      (1.484)      (1.451)
dep. ratio:top quart.     1.347        1.233        0.469        0.786
                          (1.659)      (1.515)      (1.742)      (1.604)
livestock:1st quart.                   1.601∗                    1.446
                                       (0.915)                   (0.985)
livestock:2nd quart.                   4.166∗∗∗                  3.915∗∗∗
                                       (0.893)                   (0.971)
livestock:3rd quart.                   7.782∗∗∗                  7.639∗∗∗
                                       (0.895)                   (0.974)
livestock:top quart.                   13.463∗∗∗                 13.282∗∗∗
                                       (1.083)                   (1.150)
income:livestock                       −1.208                    −1.250
                                       (1.363)                   (1.416)
income:small business                  −0.437                    −0.110
                                       (0.873)                   (0.913)
income:formal                          0.973                     1.219
                                       (1.289)                   (1.346)
income:remittances                     −0.083                    0.273
                                       (2.133)                   (2.203)
unemployment                           −2.634∗∗                  −2.888∗∗∗
                                       (1.075)                   (1.014)
health shock              −0.053       −0.092       −0.449       −0.499
                          (0.879)      (0.762)      (0.842)      (0.744)
death                     −0.556       0.491        −0.814       0.286
                          (0.725)      (0.698)      (0.730)      (0.695)
aid-free food                          −1.799∗                   −1.696∗
                                       (1.023)                   (0.953)
aid-food bank                          1.449                     1.249
                                       (1.103)                   (1.119)
aid-Emel                               3.264∗∗∗                  2.725∗∗∗
                                       (0.773)                   (0.817)
region:Brakna             8.526∗∗∗     9.537∗∗∗     9.053∗∗∗     10.128∗∗∗
                          (2.735)      (2.890)      (2.972)      (3.224)
region:Gorgol             6.119∗       6.884∗∗      7.254∗∗      8.212∗∗
                          (3.191)      (3.173)      (3.488)      (3.565)
region:Guidimakha         −2.702       −2.570       −2.273       −2.050
                          (3.366)      (3.635)      (3.713)      (3.857)
region:Hodh Ech Chargi    −5.592∗      −5.413∗      −6.521∗∗     −6.317∗
                          (3.129)      (3.014)      (3.256)      (3.351)
region:Hodh El Gharbi     4.483        3.964        3.567        3.090
                          (3.066)      (2.943)      (3.301)      (3.383)
region:Tagant             0.099        1.062        0.070        1.157
                          (3.792)      (4.226)      (4.311)      (4.895)
region:Trarza             7.170∗∗      8.834∗∗∗     7.429∗∗      9.098∗∗∗
                          (2.872)      (3.000)      (3.138)      (3.401)
year:2012                                           −7.268∗∗     −8.362∗∗
                                                    (3.700)      (3.500)
year:2013                                           4.418∗∗      2.381
                                                    (1.859)      (1.782)
year:2014                                           −3.541       −4.569∗
                                                    (2.668)      (2.766)
year:2015                                           −6.316∗∗     −7.796∗∗∗
                                                    (2.740)      (2.807)
Constant                  38.337∗∗∗    35.085∗∗∗    43.876∗∗∗    41.565∗∗∗
                          (4.433)      (4.404)      (5.302)      (5.576)
Controls-A                Yes          Yes          Yes          Yes
Controls-B                No           Yes          No           Yes
region FE                 Yes          Yes          Yes          Yes
Year FE                   No           No           Yes          Yes
Observations              10,925       10,924       10,925       10,924
Adjusted R2               0.120        0.170        0.131        0.178
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table C.2: Pooled OLS regression of logged food consumption on 5-month z-rain. Dep.
variable: log Food Consumption Score

                    (1)         (2)         (3)         (4)
5-month z-rain      0.063∗∗∗    0.066∗∗∗    0.057∗∗∗    0.064∗∗∗
                    (0.008)     (0.008)     (0.007)     (0.009)
Constant            3.890∗∗∗    3.682∗∗∗    3.652∗∗∗    3.565∗∗∗
                    (0.021)     (0.055)     (0.087)     (0.087)
Controls-A          No          Yes         Yes         Yes
Controls-B          No          Yes         No          Yes
region FE           No          No          Yes         Yes
Observations        10,966      10,924      10,925      10,924
Adjusted R2         0.035       0.119       0.112       0.159
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

C.2 Main specification testing all RSQ indices.

Table C.3: Pooled OLS main specification results using 1- to 5-month z-rain.
Dep. variable: Food Consumption Score

                    1-month     2-month     3-month     4-month     5-month
                    (1)         (2)         (3)         (4)         (5)
z-rain              2.622∗∗∗    3.049∗∗∗    3.172∗∗∗    3.183∗∗∗    3.194∗∗∗
                    (0.417)     (0.604)     (0.675)     (0.578)     (0.549)
Constant            35.929∗∗∗   34.689∗∗∗   34.799∗∗∗   34.986∗∗∗   35.085∗∗∗
                    (2.527)     (2.427)     (2.882)     (2.640)     (2.649)
Controls-A          Yes         Yes         Yes         Yes         Yes
Controls-B          Yes         Yes         Yes         Yes         Yes
region FE           Yes         Yes         Yes         Yes         Yes
Observations        10,924      10,924      10,924      10,924      10,924
Adjusted R2         0.158       0.165       0.167       0.169       0.170
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table C.4: Pooled OLS main specification results using 1- to 5-month SPI.
Dep. variable: Food Consumption Score

                    1-month     2-month     3-month     4-month     5-month
                    (1)         (2)         (3)         (4)         (5)
SPI                 2.995∗∗∗    3.567∗∗∗    3.499∗∗∗    3.459∗∗∗    3.474∗∗∗
                    (0.523)     (0.697)     (0.748)     (0.644)     (0.599)
Constant            35.973∗∗∗   35.000∗∗∗   35.194∗∗∗   35.414∗∗∗   35.483∗∗∗
                    (2.605)     (2.418)     (2.904)     (2.741)     (2.683)
Controls-A          Yes         Yes         Yes         Yes         Yes
Controls-B          Yes         Yes         Yes         Yes         Yes
region FE           Yes         Yes         Yes         Yes         Yes
Observations        10,924      10,924      10,924      10,924      10,924
Adjusted R2         0.158       0.166       0.167       0.168       0.170
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table C.5: Pooled OLS main specification results using 1- to 4-month z-NDVI. Dep.
variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) z-NDVI 3.055∗∗∗ 3.421∗∗∗ 3.624∗∗∗ 3.681∗∗∗ (0.492) (0.533) (0.492) (0.484) Constant 34.964∗∗∗ 35.672∗∗∗ 35.184∗∗∗ 35.335∗∗∗ (4.287) (4.385) (4.074) (4.103) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.155 0.161 0.165 0.166 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 47 Table C.6: Pooled OLS main specification results using 1- to 4-month spi-like NDVI anomalies. Dep. variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) NDVI anom. 2.938∗∗∗ 2.938∗∗∗ 3.747∗∗∗ 3.717∗∗∗ (0.446) (0.446) (0.464) (0.488) Constant 34.592∗∗∗ 34.592∗∗∗ 35.026∗∗∗ 35.265∗∗∗ (4.286) (4.286) (4.058) (4.131) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.154 0.154 0.164 0.166 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 Table C.7: Pooled OLS main specification results using 1- to 4-month z-EVI. Dep. variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) z-EVI 3.086∗∗∗ 3.282∗∗∗ 3.539∗∗∗ 3.727∗∗∗ (0.561) (0.665) (0.492) (0.494) Constant 34.729∗∗∗ 35.572∗∗∗ 35.178∗∗∗ 35.219∗∗∗ (4.375) (4.338) (4.100) (4.167) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.156 0.159 0.164 0.166 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 48 Table C.8: Pooled OLS main specification results using 1- to 4-month spi-like EVI anoma- lies. Dep. variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) EVI anom. 
3.010∗∗∗ 3.010∗∗∗ 3.582∗∗∗ 3.893∗∗∗ (0.523) (0.523) (0.516) (0.638) Constant 34.322∗∗∗ 34.322∗∗∗ 35.204∗∗∗ 35.192∗∗∗ (4.488) (4.488) (4.084) (4.310) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.155 0.155 0.162 0.165 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 Table C.9: Pooled OLS main specification results using 1- to 4-month z-DMP. Dep. variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) z-DMP 3.191∗∗∗ 3.454∗∗∗ 3.491∗∗∗ 3.541∗∗∗ (0.384) (0.440) (0.393) (0.392) Constant 33.752∗∗∗ 34.893∗∗∗ 33.886∗∗∗ 33.930∗∗∗ (3.822) (4.459) (4.331) (4.018) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.162 0.165 0.167 0.167 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 49 Table C.10: Pooled OLS main specification results using 1- to 4-month spi-like DMP anomalies. Dep. variable : Food Consumption Score 1-month 2-month 3-month 4-month (1) (2) (3) (4) DMP anom. 3.378∗∗∗ 3.378∗∗∗ 3.674∗∗∗ 3.654∗∗∗ (0.465) (0.465) (0.399) (0.417) Constant 33.664∗∗∗ 33.664∗∗∗ 33.787∗∗∗ 34.199∗∗∗ (4.052) (4.052) (4.395) (4.214) Controls-A Yes Yes Yes Yes Controls-B Yes Yes Yes Yes region FE Yes Yes Yes Yes Observations 10,924 10,924 10,924 10,924 Adjusted R2 0.161 0.161 0.168 0.167 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 50 C.3 Additional regression models with spatial and household- level interactions. Table C.11: Pooled OLS with livelihood zones and rainfall zones interaction. Dep. variable : Food Consumption Score 5-month z-rain × LZ:Agropastoral 2.793∗∗∗ (0.647) 5-month z-rain × LZ:Rainfed ag. 
4.478∗∗∗ (1.151) 5-month z-rain × LZ:Nomadic pastoralism 3.763 (4.136) 5-month z-rain × LZ:Pastoralism - oases 2.522∗∗∗ (0.284) 5-month z-rain × LZ:Pastoralism and trade 3.593∗∗∗ (1.102) 5-month z-rain × LZ:Senegal river 3.170∗∗∗ (1.154) 5-month z-rain × Rain zone:High 4.211∗∗ (1.645) 5-month z-rain × Rain zone:Low 3.186∗∗∗ (0.898) 5-month z-rain × Rain zone:Medium 2.701∗∗∗ (0.552) Constant 35.354∗∗∗ 36.135∗∗∗ (4.468) (4.566) Controls-A Yes Yes Controls-B Yes Yes region FE Yes Yes Observations 10,924 10,924 Adjusted R2 0.171 0.171 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 51 C.4 Quantile regressions Figure C.1: Quantile regression of logged food consumption on 5-month z-rain. 52 Figure C.2: Quantile regression results by region. (a) Assaba (b) Brakna (c) Gorgol (d) Guidimakha (e) Hodh Ech Chargi (f) Hodh El Gharbi 53 (g) Tagant (h) Trarza C.5 Polynomial models Table C.12: Third-order polynomial model estimation. Dep. variable : Food Consumption Score 5-month z-rain 3.194∗∗∗ 3.668∗∗∗ 3.984∗∗∗ (0.549) (1.085) (0.648) (5-month z-rain)2 −0.269 1.642∗∗ (0.469) (0.814) (5-month z-rain)3 −0.648∗∗∗ (0.191) ∗∗∗ ∗∗∗ Constant 35.085 35.800 34.624∗∗∗ (2.649) (2.354) (2.968) Controls-A Yes Yes Yes Controls-B Yes Yes Yes region FE Yes Yes Yes Observations 10,924 10,924 10,924 Adjusted R2 0.170 0.170 0.174 ∗ ∗∗ ∗∗∗ Note: p<0.1; p<0.05; p<0.01 54 Figure C.3: Polynomial vulnerability functions at the regional-level. (a) Assaba (b) Brakna (c) Gorgol (d) Guidimakha (e) Hodh Ech Chargi (f) Hodh El Gharbi 55 (g) Tagant (h) Trarza D Cross-validation procedure We implement a cross-validation procedure with the objective of optimizing the calibra- tion of the vulnerability component with respect to choices of both vulnerability factors and RSQ indices. We have seen in section 4.1 that several variables could reasonably qualify as vulnerability factors, including regional fixed-effects, livestock ownership and primary income source. 
We found substantial heterogeneity across regions, although other geographic breakdowns could be envisaged, such as livelihood zones, rainfall zones or geographic zones corresponding to the main urban markets, as well as other household-level factors (e.g., dependency ratio, age of household head, household size, sex of household head). More importantly, we can also combine both types of vulnerability factors to allow for more complex structures where RSQ effects also differ within geographic zones across household groups defined by a household-level factor (e.g., livestock ownership, primary income source, dependency ratio). Moreover, for any choice of vulnerability structure, we allow RSQ indices to differ across household typologies to account for the possibility that some index types and time scales are better suited to describe the quality of the rainy season for specific household groups. With up to 34 possible RSQ indices and a myriad of potential combinations for defining household typologies, the number of possible models is such that only an automated approach can identify an optimum while minimizing the risk of overfitting.

We implement a k-fold cross-validation procedure where we randomly partition our sample of 10,969 household observations into k subsets. One sub-sample is defined as the test dataset while the other k − 1 are used as training data. For a given choice of vulnerability factors and RSQ indices, we estimate the corresponding regression model with OLS on the training dataset and we use the estimated parameters to make FCS predictions on the test dataset. A score is then calculated to evaluate the performance of the model on the test dataset; we use the Mean Squared Error (MSE) between FCS predictions and true values, but we also check the robustness of our results to different metrics such as the Mean Absolute Error (MAE).
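A minimal sketch of this k-fold scoring loop may help fix ideas. It assumes a single-regressor OLS (FCS on one RSQ index) rather than the full specification with controls and fixed effects, and it runs on synthetic data; the function names `fit_ols` and `kfold_mse` and all numerical values are hypothetical.

```python
import random

def fit_ols(x, y):
    """Closed-form OLS for y = a + b*x (single regressor, illustration only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def kfold_mse(x, y, k=10, seed=0):
    """Average test MSE over k folds; each fold is the test set exactly once."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)          # random partition of the sample
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_ols([x[i] for i in train], [y[i] for i in train])
        mse = sum((y[i] - (a + b * x[i])) ** 2 for i in test) / len(test)
        scores.append(mse)
    return sum(scores) / k

# Synthetic data (hypothetical): FCS = 35 + 3 * z_rain + noise
rng = random.Random(1)
z_rain = [rng.gauss(0, 1) for _ in range(500)]
fcs = [35 + 3 * z + rng.gauss(0, 5) for z in z_rain]
noise_index = [rng.gauss(0, 1) for _ in range(500)]  # uninformative candidate

score_rain = kfold_mse(z_rain, fcs)       # informative index
score_noise = kfold_mse(noise_index, fcs)  # uninformative index
```

Comparing `score_rain` with `score_noise` mirrors how candidate specifications are ranked: the one with the lower out-of-sample error is preferred. Replacing the squared error with an absolute error in `kfold_mse` would give the MAE variant.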
The process is repeated k times so that each split is used as a test dataset exactly once, and the k scores obtained are averaged to get the final skill of the model. We finally select the model exhibiting the highest skill, and final parameters are estimated on the full dataset. We take a widely used value of 10 for parameter k, as it has been shown empirically to be associated with an acceptable compromise between high bias and high variance in score estimations.

The set of regression models tested follows the specification given by equation 4 with all controls and regional fixed effects (as in column (4) of Table 1), and we allow for both a linear and a third-order polynomial functional form for the RSQ variable. We test vulnerability structures based on geographic breakdowns, household-level factors and combinations of both, where we allow for different vulnerability functions within geographic units for household groups defined by a household-level factor. We test four geographic breakdowns: regions (first administrative level), rainfall zones, livelihood zones and groups of departments assigned to major urban markets. We provide the corresponding maps in Figure D.1 below. We allow for six household-level factors: livestock ownership, primary income source, dependency ratio, age of household head, sex of household head and household size.⁴⁵ Overall, this represents 6,706,343 vulnerability structures, each associated with a unique set of household typologies. In addition, for a given vulnerability structure, we allow for 34 different RSQ indices, either with a linear or third-order polynomial functional form, which results in 68 possible models. For a single vulnerability structure with 30 household typologies, this results in over 2 × 10^18 possible models. We do not estimate all these models in practice, and we make some assumptions to allow for tractability.
⁴⁵ For livestock ownership (in TLU), dependency ratio, age of household head and household size, categorical variables are computed based on sample quartiles. Age of household head and household size have four categories as a result. There is an additional category for livestock ownership corresponding to households with no livestock (TLU equal to 0), as well as for the dependency ratio for households with no active member.

For any given choice of vulnerability structure, we do not evaluate all possible combinations of RSQ indices for the resulting set of household typologies; rather, we consider that optimal choices of RSQ indices across typologies are independent, i.e., the optimal choice of RSQ index for one typology is independent from choices made for other typologies. Conditional on a vulnerability structure, we thus carry out a separate cross-validation for each household typology to determine the optimal RSQ index. For any given typology, we select one fold in our k-fold sample split from which we take the subset of observations belonging to the household typology considered, and we define this subset as the test set. To evaluate the performance of an RSQ index rsq_test, either with a linear or third-order polynomial functional form, we estimate a model on the remaining k − 1 sub-samples where the RSQ index is rsq_test for observations in the household typology and the 5-month precipitation z-score for the rest of the training dataset that acts as a control, and we calculate a score (MSE or MAE) based on a comparison between true values and predictions on the test set. We iterate k times and evaluate the skill of the model, and we repeat this process for all RSQ indices and functional forms. We assign to the household typology the RSQ index yielding the highest skill and we repeat this operation for all possible household typologies. Given the geographic breakdowns and household-level factors allowed, there is a grand total of 650 typologies.
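The per-typology selection described above can be sketched as follows. This is only an illustration under strong simplifying assumptions: a pooled single-regressor OLS with a linear functional form, two hypothetical typologies ("A" and "B"), two hypothetical candidate indices ("rain" and "ndvi"), and synthetic data. The helper names `typology_score` and `select_indices` are not from the paper.

```python
import random

def fit_ols(x, y):
    """Closed-form OLS for y = a + b*x (single regressor, illustration only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def typology_score(cand, base, y, in_typ, k=10, seed=0):
    """k-fold MSE of one candidate index, scored only on the typology's households."""
    n = len(y)
    # Candidate index inside the typology, baseline index for all other households.
    x = [cand[i] if in_typ[i] else base[i] for i in range(n)]
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # same folds for every candidate
    scores = []
    for f in range(k):
        fold = idx[f::k]
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        test = [i for i in fold if in_typ[i]]  # test set: typology members of the fold
        if not test:
            continue
        a, b = fit_ols([x[i] for i in train], [y[i] for i in train])
        scores.append(sum((y[i] - a - b * x[i]) ** 2 for i in test) / len(test))
    return sum(scores) / len(scores)

def select_indices(candidates, base, y, typology):
    """Choose, independently for each typology, the index with the lowest CV error."""
    best = {}
    for t in sorted(set(typology)):
        in_typ = [ti == t for ti in typology]
        best[t] = min(
            candidates,
            key=lambda name: typology_score(candidates[name], base, y, in_typ),
        )
    return best

# Synthetic data (hypothetical): typology A responds to rain, typology B to NDVI.
rng = random.Random(42)
n = 600
rain = [rng.gauss(0, 1) for _ in range(n)]
ndvi = [rng.gauss(0, 1) for _ in range(n)]
typology = ["A" if i % 2 == 0 else "B" for i in range(n)]
fcs = [35 + 3 * (rain[i] if typology[i] == "A" else ndvi[i]) + rng.gauss(0, 2)
       for i in range(n)]

best = select_indices({"rain": rain, "ndvi": ndvi}, base=rain, y=fcs, typology=typology)
```

Because each typology is handled separately, the number of cross-validations grows with the number of typologies times the number of candidate indices, rather than with the number of joint combinations across typologies.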
Similarly, for any given choice of geographic breakdown, we do not evaluate all possible combinations of household-level factors for the set of geographic units, but we assume that optimal choices of household-level factor across geographic units are independent. As for the selection of optimal RSQ indices, we thus carry out a 10-fold cross-validation to select an optimal household-level factor for each individual geographic unit. Note that RSQ indices used in the specifications are the optimal indices selected in the previous step. At that point, we have determined optimal household-level factors and RSQ indices for each typology, conditional on a choice of geographic breakdown. In the last step, we therefore perform a final cross-validation to select the optimal geographic breakdown among the regional divide, livelihood zones, rainfall zones, urban markets and a model without geographic breakdown in the vulnerability structure.

The optimal vulnerability structure yielded by the cross-validation procedure adopts a regional geographic breakdown with either livestock ownership (for Assaba, Hodh El Gharbi, Trarza and Tagant regions) or primary income source (for Brakna, Guidimakha, Hodh Ech Chargi and Gorgol) as household-level factor for each region, which results in 40 household typologies. The optimal RSQ index is a precipitation-based index for most typologies (23) and only a minority relies on DMP indices (4).

Figure D.1: Geographic breakdown tested in the cross-validation procedure. (a) Regions (b) Urban markets (c) Rainfall zones (d) Livelihood zones

E Exposure component calibration

As explained in section 3.1, the exposure component comprises two sub-components. The first one gives the spatial distribution of households across modeling units and can be understood as the level of exposure at the extensive margin.
Recall that modeling units are defined as the largest sets of households within which hazard conditions and vulnerability parameters are considered homogeneous. Their definition is therefore contingent on the structure of the vulnerability component. The second sub-component, on the other hand, provides a fitted probability distribution of the baseline food consumption for each modeling unit and therefore describes the level of exposure at the intensive margin.

In our case study, 40 household typologies are defined based on the region of residence and a household-level factor (either livestock ownership or primary income source) that varies across regions. Hazard conditions are defined at the department level so that modeling units are all typology-department pairs. Each of the 34 departments is strictly included in a region, so the typologies represented in a department only depend on the household-level factor. Both livestock ownership and primary income source factors take on 5 values, so there are exactly 5 typologies within each department, which results in a total of 170 modeling units.

To calibrate the first sub-component of the exposure, we first get the total number of rural households by department from census data.⁴⁶ Then, we further break it down into household counts by typology based on statistics computed on our pooled FSMS sample, to which we also add the 2017 round. Individual FSMS rounds are only representative at the regional level and we thus rely on six survey rounds to approach department-level representativity. In our pooled sample, 31 departments were visited in at least 5 survey rounds, 2 departments were visited twice and only one (Bassikounou) was visited in 2 years.
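The extensive-margin calibration amounts to splitting each department's census total across typologies using pooled-sample shares. The sketch below illustrates the arithmetic under stated assumptions: department names, census totals, category labels and sample counts are all hypothetical, and the function `allocate` is not from the paper.

```python
# Hypothetical census totals of rural households by department.
census_households = {"Dept A": 12000, "Dept B": 8000}

# Hypothetical pooled-sample counts of surveyed households by factor category
# (5 categories per department, as for livestock ownership or income source).
sample_counts = {
    "Dept A": {"cat1": 30, "cat2": 50, "cat3": 20, "cat4": 60, "cat5": 40},
    "Dept B": {"cat1": 10, "cat2": 25, "cat3": 15, "cat4": 30, "cat5": 20},
}

def allocate(census, sample):
    """Split each department's census total across typologies using sample shares."""
    units = {}
    for dept, total in census.items():
        n_sampled = sum(sample[dept].values())
        for cat, count in sample[dept].items():
            # One modeling unit per (department, typology) pair.
            units[(dept, cat)] = total * count / n_sampled
    return units

units = allocate(census_households, sample_counts)

# With 34 departments and 5 typologies per department, the count of
# modeling units stated in the text follows directly.
n_modeling_units = 34 * 5
```

In this toy input, each department contributes five modeling units, and the allocated counts within a department sum back to its census total.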
We calculate the sample size needed for proportion estimation at the department level⁴⁷ and we find that our pooled sample allows us to reach the required size for 22 out of 34 departments; only 3 departments have less than half the required sample size (Boumdeid, Bassikounou, Tichitt).

⁴⁶ Recensement général de la population et de l’habitat (RGPH) 2013, République Islamique de Mauritanie, Office National de la Statistique (ONS).
⁴⁷ We use the R function sample.size.prop from the samplingbook package considering a precision of 0.05 and a 90% confidence level.

Next, we construct a synthetic sample of baseline food consumption values to calibrate the second component of the exposure. We do not observe food consumption outcomes in normal conditions in practice, and we thus infer baseline food consumption values by subtracting estimated RSQ effects from observed FCS values in our sample of household observations. We then fit Gaussian kernel densities to obtain probability distributions of baseline FCS for each modeling unit. Due to sample size limitations, we estimate probability distributions at the typology level, although distinct distributions across modeling units within typologies would most likely improve the model accuracy.
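The intensive-margin step can be illustrated with a short sketch: subtract the estimated RSQ effect from observed FCS, then fit a Gaussian kernel density to the resulting baseline values. The sketch assumes a linear RSQ effect, uses the pooled 5-month z-rain coefficient from Table C.3 (3.194) purely as an example, and picks the bandwidth with the normal-reference rule of thumb, since the bandwidth choice is not stated in the text. The data values and function names are hypothetical.

```python
import math
import statistics

def baseline_fcs(observed_fcs, rsq, beta):
    """Baseline FCS = observed FCS minus the estimated (linear) RSQ effect."""
    return [f - beta * z for f, z in zip(observed_fcs, rsq)]

def gaussian_kde(sample, bandwidth=None):
    """Return a density function built from Gaussian kernels centred on the sample."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the paper's choice).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

    return density

# Hypothetical households from one typology: observed FCS and 5-month z-rain.
observed = [33.2, 41.0, 28.5, 37.1, 45.6, 39.3]
z = [-0.8, 0.5, -1.2, 0.1, 1.0, 0.3]

base = baseline_fcs(observed, z, beta=3.194)  # example coefficient (Table C.3)
pdf = gaussian_kde(base)                      # fitted baseline FCS density
```

Calling `pdf(x)` returns the estimated density of baseline FCS at `x`; with a polynomial vulnerability function, `baseline_fcs` would subtract the fitted polynomial instead of `beta * z`.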