WPS8403


Policy Research Working Paper               8403




             To Impute or Not to Impute?
    A Review of Alternative Poverty Estimation Methods
     in the Context of Unavailable Consumption Data

                         Hai-Anh H. Dang




Development Economics
Development Data Group
April 2018


  Abstract
 There is an increasingly strong demand for more frequent and accurate poverty estimates, despite the oftentimes
 unavailable household consumption data. This paper offers a review of alternative imputation methods that have been
 employed to provide poverty estimates in such contexts. These range from estimates on a nonmonetary basis, estimates
 for specific project targeting or tracking trends at the national level, to estimates at a more disaggregated level,
 as well as estimates of poverty dynamics. The paper provides a concise and accessible synthesis, which serves as an
 introduction to the literature. The focus is on intuition and practical insights that highlight the nuanced differences
 between the existing methods rather than technical aspects.




  This paper is a product of the Development Data Group, Development Economics. It is part of a larger effort by the
  World Bank to provide open access to its research and make a contribution to development policy discussions around
  the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be
  contacted at hdang@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
                 To Impute or Not to Impute? A Review of Alternative Poverty
             Estimation Methods in the Context of Unavailable Consumption Data



                                             Hai-Anh H. Dang*




Key words: poverty, imputation, consumption, wealth index, synthetic panels, household survey

JEL: C15, I32, O15




*
  Dang (hdang@worldbank.org) is an economist in the Survey Unit, Development Data Group, World Bank, a non-
resident research scholar with School of Public and Environmental Affairs, Indiana University, and a non-resident
senior research fellow with Vietnam's Academy of Social Sciences. We would like to thank Gero Carletto, Dean
Jolliffe, and Peter Lanjouw for helpful discussions on related work. We are grateful to the UK Department of
International Development for funding assistance through its Strategic Research Program (SRP) and Knowledge for
Change (KCP) program.
Introduction

    Fighting poverty is a challenging and complex undertaking faced by policy makers in

developing and richer countries alike. This undertaking can include different policy options

ranging from crafting short-term safety-net programs that prevent vulnerable households from

sliding into poverty during a time of crisis, to designing long-term plans that invest in education

and skills formation to promote economic growth. But regardless of the specific poverty reduction

strategies, a prerequisite for the success of the whole process is a clear understanding of poverty

trends and dynamics (either at any single snapshot in time or over different time periods). Indeed,

inaccurate measurement naturally results in an inefficient (and even a wasteful) use of resources

if, say, short-term employment programs are employed to address a chronic poverty situation

caused by inadequate infrastructure.

    Perhaps the greatest challenge with poverty measurement is the fact that household

consumption (or income) data, the underlying data source that provides poverty estimates, do

not often meet the necessary requirements.1 For example, such data may simply be unavailable, or

may not be comparable from one survey round to the next. As another example, household

consumption data are seldom collected on the same households (or individuals) over time, thus

making it difficult, if not impossible, to track the dynamics of these households' movements into

or out of poverty in different periods.

    We offer in this paper a review of alternative poverty estimation methods in contexts where

consumption data are unavailable. The economic literature on poverty imputation has grown

rapidly in the past 20 years, and various methods have been developed. In fact, methods that have


1
 We use the terms "income" and "consumption" interchangeably in this review. The latter is often considered to offer
better measures of household welfare, especially in developing countries (see, e.g., Deaton (1997)). We focus on
monetary poverty in this paper; see Alkire et al. (2015) for a comprehensive discussion of the alternative approach of
multi-dimensional poverty.

been proposed and discussed using different frameworks and terminology may turn out to be more

similar than one might think.2 A typical development practitioner who does not keep regular track

of the latest advances in the field may find it time-consuming, and perhaps even quite challenging,

to stay abreast of this literature.

    At the same time, to our knowledge, currently very few studies provide a systematic

introduction to this literature. We thus aim to fill in this gap by providing a succinct but user-

friendly synthesis of existing poverty imputation methods. In particular, we aim to achieve the

following objectives in this review:

    i) Offer a classification of the various poverty imputation situations

    ii) Lay out clearly the contexts where imputation is most relevant

    iii) Provide an accessible description of imputation techniques, with explicit flags for common

         (or potential) mistakes

    iv) Point out "gray" areas with current imputation methods that need more research.

    Our goal is to offer a systematic discussion of imputation methods in a consistent format, which

starts with each method's motivation, followed by a brief description of the method, some recent

application examples, and the remaining challenges. While this format may appear rigid, it offers

a (somewhat) self-contained treatment of different methods. It also helps facilitate comparison and

cross-reference between the various methods and highlight the nuanced differences among them.

To help readers, particularly those who are new to the topic, obtain a quick overview of the

literature and the "feel" behind each method, we will focus on offering the intuition rather than the




2
  This is not to mention the (beneficial) interactions between the economic literature and the well-established, and
more general statistical literature on missing data imputation. We return to more discussion in later sections. Given
our focus on intuition and practical insights in this paper, we only provide a superficial description of the imputation
methods, and we leave more technical details to footnotes where it is useful to do so. A more technical review with
software instructions is offered in Dang, Jolliffe, and Carletto (2017).

technical details. Readers who are more familiar with these methods may also find some useful

practical insights and suggestions for further research. We will discuss several existing (well-cited)

studies as examples, with a focus on developing countries.

   This paper consists of eight sections. We provide a simple, but new classification of poverty

imputation methods in the next section, which also offers a roadmap to the other sections in the

paper. This roadmap can function both as a graphical illustration that links the different methods,

and as a "user's guide" that can help readers better identify the issues of their interest. It is

subsequently followed by more detailed discussion for poverty estimates on a non-monetary basis

(i.e., using wealth indexes) (Section II), poverty estimates for project targeting (Section III), and

monitoring poverty trends at the national level (Section IV). We then discuss poverty estimates

that are disaggregated below the national level (Section V) and estimates of poverty dynamics

(Section VI) before offering some further reflections (Section VII) and the conclusion (Section

VIII).



I. A "Roadmap" of Poverty Imputation Methods
I.1. Roadmap
    The need to provide imputed poverty estimates varies from context to context, and nuanced

differences exist between seemingly similar imputation methods. It can thus be useful to be explicit

about the outcomes of interest, as well as the desired types of analysis and their associated

challenges, before starting an imputation project. We offer in Figure 1 a simple list of five common

questions that can be asked about key aspects of the poverty imputation process, together with the

suggested strategy to address each question, which is discussed in more detail in the section listed in

parentheses. Put differently, this figure proposes a checklist for the decision process to identify

the relevant imputation method.



   Figure 1 suggests that the first question a researcher can ask is whether poverty estimates are

to be produced on a monetary (or money-metric) basis. If the answer to this question is no, the

relevant imputation method is to construct an asset (wealth) index, and this method is discussed in

more detail in Section II. If the answer is yes, the next useful question is whether the desired

poverty estimate is nationally representative. If the answer to this question is no, proxy means

testing is likely the relevant method (Section III). A yes to this question leads to the third question

of whether the poverty estimate will be used for dynamics analysis, which involves an examination

of household movements into or out of poverty based on synthetic panel data (Section VI).

   A no to this question leads to the fourth question of whether the poverty estimate will be

disaggregated below the national level. If yes, techniques commonly known as "poverty mapping"

(or small-area estimation techniques) are most relevant. Another name for these techniques is

survey-to-census imputation, where the imputation model is first built from a survey and

subsequently applied to a census, since the latter offers reliable disaggregate data (Section V). If

no, survey-to-survey imputation methods are likely most appropriate. The final question is whether

the imputation will involve surveys of the same design. Within-survey imputation (Section IV.1)

and across-survey imputation (Section IV.2) should be respectively employed if imputation is done

on surveys of the same design or of a different design.
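
The sequence of questions described above can be written down as a small decision function. The following is only a minimal sketch of that checklist: the function name, argument names, and returned labels are illustrative assumptions and do not come from Figure 1 itself.

```python
def choose_imputation_method(monetary, nationally_representative=False,
                             dynamics=False, disaggregated=False,
                             same_survey_design=False):
    """Walk through the five roadmap questions in the order given in the text.

    All argument names are hypothetical; each is a yes/no answer to one question.
    """
    if not monetary:
        return "wealth (asset) index (Section II)"
    if not nationally_representative:
        return "proxy means testing (Section III)"
    if dynamics:
        return "synthetic panels (Section VI)"
    if disaggregated:
        return "poverty mapping / survey-to-census imputation (Section V)"
    if same_survey_design:
        return "within-survey imputation (Section IV.1)"
    return "across-survey imputation (Section IV.2)"


# Example: a national, monetary, non-dynamic, non-disaggregated estimate
# built from a survey of a different design.
print(choose_imputation_method(monetary=True, nationally_representative=True))
```

Later questions are only reached when the earlier ones have been answered in the way the text describes, which is why the arguments after the first default to "no".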

   A related but different classification is offered in Dang et al. (2017a), where a poverty

imputation situation is classified according to the degree of missing consumption data. We present

a slightly modified version of this classification in Table 1, which offers three categories in a

roughly decreasing order of the severity of missing consumption data: completely missing

(Category A), partially missing (Category B), and available cross-sectional data but missing panel

data (Category C). In other words, while Figure 1 focuses on the functional or practical aspects of



poverty imputation methods (and the associated roadmap pointing to the relevant discussion in this

paper), Table 1 offers another classification that is more technical and data-oriented.

    Table 1 also lists the typical poverty imputation (or corresponding data) situation, examples of

surveys, and some recent studies corresponding to each missing data category. In particular,

Category A can be further broken down into two data situations: one where the available survey

produces no consumption data, and the other where the available survey is designed for project

targeting purposes. Corresponding to these data situations are the Demographic and Health Surveys

(DHS) and most small-scale surveys. Category B can also be further disaggregated into three

different but related data situations: consumption data are incomparable across survey rounds,

consumption data are unavailable in the current survey but available in some other related surveys,

and consumption data are unavailable at more disaggregated administrative levels than those

offered in the current survey. Finally, Category C addresses the widespread situation that most

surveys in developing countries do not provide (nationally representative) household panel data.

Table 1 also offers some recent studies that we will discuss in more detail in later sections.



I.2. Typical Estimation Equations
    While we focus in this paper on the intuition behind existing studies, it can be useful to briefly

discuss the commonly used empirical framework for clarity. Let $x_j$ be a vector of characteristics

that represent all the factors that determine a household's consumption, where j indicates the

survey type. More generally, j can indicate either another round of the same household expenditure

survey, or a different survey (census), for j = 1, 2.3 Subject to data availability, $x_j$ can include

household variables such as the household head's age, sex, education, ethnicity, religion, language



3
  More generally, j can indicate any type of relevant surveys that collect household data sufficiently relevant for
imputation purposes such as labor force surveys or demographic and health surveys.

(which can represent household tastes), occupation, and household assets or incomes.

Occupation-related characteristics can generally include whether the household head works, the

share of household members that work, the type of work that household members participate in,

as well as context-specific variables such as the share of female household members that

participate in the labor force, or some variables at the region level. Other community or regional

variables can also be added since these can help control for different labor market conditions.

     The following linear model is typically employed in empirical studies to project household

consumption on household and other characteristics (x):

                                         $$y_{ij} = \beta' x_{ij} + \mu_{ij}$$                                              (1)

for household i in survey j, for i = 1, ..., N. Equation (1) is often extended to a more general model

                                         $$y_j = \beta_j' x_j + \upsilon_{cj} + \varepsilon_j$$                                          (2)

where the error term $\mu_{ij}$ is now broken down into two components: one ($\upsilon_{cj}$) is a cluster random

effect and the other ($\varepsilon_j$) is the idiosyncratic error term. Note that we suppress the subscript that

indexes households to make the notation less cluttered.4 For convenience, we also refer to the

survey that we are interested in imputing poverty estimates for as the target survey, and the survey

that we can estimate Equation (1) on as the base survey. The former survey is usually more recent

(or offers more disaggregated information, as in the case of a census) and has no consumption

data, while the latter is usually older and has consumption data.
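
To make the error structure of Equation (2) concrete, the sketch below simulates a hypothetical base survey with a single household characteristic, a cluster random effect, and an idiosyncratic error, and then fits Equation (1) by ordinary least squares. All parameter values, the single-regressor setup, and the normal draws are assumptions chosen for illustration only; they are not taken from any survey discussed in this paper.

```python
import random
import statistics


def simulate_base_survey(n_clusters=200, households_per_cluster=10, beta=0.5,
                         sigma_cluster=0.3, sigma_idio=0.4, seed=0):
    """Draw (cluster, x, y) rows from y = beta*x + cluster effect + idiosyncratic error."""
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        cluster_effect = rng.gauss(0.0, sigma_cluster)  # shared within the cluster
        for _ in range(households_per_cluster):
            x = rng.gauss(0.0, 1.0)                     # one household characteristic
            y = beta * x + cluster_effect + rng.gauss(0.0, sigma_idio)
            rows.append((cluster, x, y))
    return rows


def fit_ols(xs, ys):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rows = simulate_base_survey()
xs = [x for _, x, _ in rows]
ys = [y for _, _, y in rows]
intercept, slope = fit_ols(xs, ys)

# The OLS residual estimates the composite error of Equation (1); its variance
# should be close to sigma_cluster**2 + sigma_idio**2 = 0.09 + 0.16 = 0.25.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
composite_error_variance = statistics.pvariance(residuals)
print(round(slope, 3), round(composite_error_variance, 3))
```

The point of the exercise is that the composite error in Equation (1) carries both components of Equation (2), so its variance is roughly the sum of the two component variances.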

     We discuss the various poverty imputation methods starting from the next section.


II. Non-Monetary Poverty Estimates
4
  Conditional on household characteristics, the cluster random effects and the error terms are usually assumed
uncorrelated with each other and to follow a normal distribution such that $\upsilon_{cj} | x_j \sim N(0, \sigma^2_{\upsilon_j})$
and $\varepsilon_j | x_j \sim N(0, \sigma^2_{\varepsilon_j})$.
While the normal distribution assumption results in the standard linear random effects model that is more convenient
for mathematical manipulations and computation, it is not necessary for this type of model. As can be seen later, we
can remove this assumption and use the empirical distribution of the error terms instead, albeit at the cost of somewhat
more computing time.

Motivation
   Consumption data are not always collected in a household survey for various reasons. The

main reason is that since a typical consumption module (e.g., like that of a Living Standards

Measurement Survey) usually consists of up to hundreds of items of food and non-food

consumption, it takes time and a certain level of technical expertise to design such a module well.

Furthermore, these items need to be updated from time to time to reflect changes in household

consumption patterns (e.g., buying a smart phone is becoming an increasingly common purchase

these days, but might not have been so just 10 years ago). Other factors such as seasonal variations in

household consumption (e.g., consumption during holidays can be larger than that during regular

times) could only be appropriately accounted for by fielding the consumption module in different

months throughout the year. Consequently, collecting household consumption data requires

considerable financial resources, time, and logistical capacity, which result in most household

consumption surveys being implemented every few years rather than on an annual basis.5 Given

this data situation, for the years when consumption data are not collected, but other non-

consumption surveys such as labor force surveys (LFS) or small-scale surveys are available, these

surveys may be "repurposed" to generate some substitute variable for consumption data. Indeed,

one well-known example is the DHS that (usually) do not collect consumption data but offer a

wealth index instead.


Method Description
   A well-established method to produce poverty estimates for surveys that have no consumption

data but have data on household assets and the physical characteristics of the house is to construct

a wealth index from these assets. This method typically consists of two steps. The first step is to

identify the list of assets to be used in the construction of the index, and the second step is to apply


5
    We return to more discussion on this point in Section VII.

some aggregation method to convert these assets into a (single-valued) index. Filmer and Pritchett

(2001), who popularized the use of such indexes in the economic literature, employ the following

variables from the India National Sample Survey (NSS): household ownership of consumer

durables (including clock/watch, bicycle, radio, television, sewing machine, refrigerator, and car),

characteristics of the household's dwelling (including indicators about toilet facilities, the source

of drinking water, the rooms in the dwelling, the building materials used, and the main source of

lighting and cooking), and household land ownership. They then utilize the principal component

analysis (PCA) technique to create a wealth index, which offers different combinations of these

assets that aim to capture as much variation in the data as possible.6

    Variations exist on both of these steps. Since household questionnaires vary in different

contexts, the list of assets to be employed depends on data availability. For example, Filmer and

Scott (2012) examine all available asset variables in the DHS from four different countries and

find these variables to vary across countries. In particular, their data sets contain between 12

(Uganda) and 29 (Nicaragua) indicators of asset ownership, and between 4 (Ghana) and 12

(Albania) and even as many as 37 (Zambia) indicators of housing characteristics. Besides the PCA

technique, the aggregation method can range from simply counting (or adding up) all the indicators

of asset ownership to using other techniques such as factor analysis (Sahn and Stifel, 2000).7
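
The two steps described above (choose the assets, then aggregate them) can be sketched as follows. This is only an illustration under stated assumptions: the three assets and six households are made-up data, the indicators are standardized before aggregation, and the first principal component is obtained by simple power iteration on the correlation matrix rather than by any particular statistical package.

```python
import math
import statistics


def standardize(column):
    """Rescale one asset indicator to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)  # must be non-zero: drop assets owned by all or by none
    return [(value - mean) / sd for value in column]


def first_component_weights(z_columns, n_iter=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    k, n = len(z_columns), len(z_columns[0])
    corr = [[sum(a * b for a, b in zip(z_columns[i], z_columns[j])) / n
             for j in range(k)] for i in range(k)]
    weights = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        product = [sum(corr[i][j] * weights[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in product))
        weights = [v / norm for v in product]
    return weights


def wealth_index(asset_columns):
    """Score each household on the first principal component of its assets."""
    z_columns = [standardize(col) for col in asset_columns]
    weights = first_component_weights(z_columns)
    n = len(z_columns[0])
    index = [sum(w * col[i] for w, col in zip(weights, z_columns)) for i in range(n)]
    return index, weights


# Hypothetical ownership indicators (1 = owns) for six households.
assets = {
    "radio":        [0, 1, 1, 1, 1, 1],
    "bicycle":      [0, 0, 1, 1, 0, 1],
    "refrigerator": [0, 0, 0, 1, 0, 1],
}
index, weights = wealth_index(list(assets.values()))
ranking = sorted(range(len(index)), key=lambda i: index[i])  # poorest household first
print(ranking)
```

The index only orders households relative to one another; it carries no monetary unit, so any poverty cutoff applied to it is a position in this ranking rather than a consumption threshold.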


Examples and Remaining Challenges



6
  See, for example, Jolliffe (2002) for a comprehensive treatment of PCA methods.
7
  Moser and Felton (2009) propose a variant of the PCA technique, where analysis is done on the components of each
of several types of capital including physical capital, productive capital, human capital, and social capital. Another
approach is to produce households' ranks in the population with the number of consumption items they own, if we
make the additional assumption that households place an order of importance on their consumption items when having
to reduce their consumption expenditure (Deutsch, Silber, and Wang, 2017). Notably, a common mistake when
constructing wealth indexes is to convert ordinal asset variables to dummy variables and then aggregate; see Kolenikov
and Angeles (2009) for a careful analysis of this issue. An alternative approach is to collect data on a reduced set of
consumption items that may offer strong correlation with the total consumption aggregate (Morris et al., 2000).

    Sahn and Stifel (2000) offer a well-cited study that constructs wealth indexes for two or more

periods from the DHS for 11 Sub-Saharan African countries between 1986 and 1998. They find

that poverty generally declined, largely due to improvements in rural areas. A recent study by

Harttgen, Klasen, and Vollmer (2013) analyzes 160 DHS surveys from 33 African and 34 non-

African countries and constructs wealth indexes in different ways. This study argues, however,

that employing wealth indexes as a proxy for trends in household consumption is subject to various

types of biases. These include changing preferences for certain assets (e.g., the increasing

ownership of smart phones) and changing relative prices among different assets leading to more

demand for one asset at the expense of others (e.g., the dramatically decreasing price of smart

phones).

    Additional challenges exist with using wealth indexes to measure poverty. First, as

suggested by the title of this section, wealth indexes offer a non-monetary or relative measure of

poverty. Put differently, a wealth index can only identify a household as poor by its relative

position on the population's wealth distribution. This renders it difficult, if not impossible, to

compare the poverty rate among different countries without assuming that all the countries under

comparison have comparable distributions of wealth. Practically speaking, the list of assets must

be either identical or comparable for all these countries, which is certainly harder to satisfy than

comparing a consumption distribution denominated on a monetary basis such as the international

Purchasing Power Parity (PPP) dollars. For example, an air-conditioner may be a valuable asset in

countries in a tropical climate, but it may not even be found in those in a frigid climate, and vice

versa with assets such as a heater. Another inconvenience is that the poverty line (threshold) is

relative and would need to be fixed for the whole distribution.8 Furthermore, wealth indexes tend


8
 A similar concern applies to comparing the wealth index within a country (or countries) over time. In this case, assets
for all countries from two periods must be pooled together to construct the wealth index.

to provide biased estimates of poverty rates (as measured by household consumption) and may

only be able to help track poverty trends over time under special conditions (Dang et al., 2017a).

    We end this section on a cautious note that even if all the challenges discussed above are

overcome, to our knowledge, several other technical challenges remain with the usage of wealth

assets to measure poverty. For example, it remains unclear how many assets are sufficient in

constructing a wealth index. Assets may be more comparable within a country, but it is unclear to

what extent assets are comparable across countries (for cross-country analysis). It also remains

unclear how to take into account the issue of quality versus quantity of assets (e.g., the ownership

of one brand-new luxurious car can be different from that of three old and cheap cars).9


III. Poverty Estimates for Project Targeting
Motivation
    Identifying poor households that are beneficiaries of social transfer projects is a common task

which is also known as proxy means testing, but completing this task is not simple when there are

no data on household consumption. Put differently, this case shares a similar constraint with that

of the previous section: consumption data are not collected because of various costs or logistical

constraints. One major difference with proxy means testing, however, is targeting a subset of the

poor population that are to benefit from (government or NGO) subsidies rather than measuring

poverty for the whole population. Another important feature is that proxy means testing is (mostly)

implemented when there are two sources of household survey data: one is a nationally

representative survey that collects both household consumption data and the variables ($x_{ij}$, which

are used in Equation (1)), and the other is the special and (much) lighter survey conducted to collect

only the variables ($x_{ij}$) for the purpose of proxy means testing.


9
 Ngo (2018) offers a method to construct a utility-based living standards index based on the values of assets that may
address this issue to some extent (assuming that asset quality is correctly measured by asset values). But one practical
challenge with this approach is that it requires the collection of asset price data.

Method Description
   Proxy means testing involves two steps. The first step is to estimate the household consumption

model using the nationally representative (or larger) household survey, as expressed in Equation

(1). Once this is done, the second step is to combine the estimated coefficients $\hat{\beta}$ obtained from the

first step with the variables $x_{ij}$ in the smaller survey to generate the predicted consumption (i.e.,

estimating the term $\hat{\beta}' x_{ij}$) for household i in the smaller survey. In other words, proxy means

testing employs the estimated coefficients $\hat{\beta}$ that come from the larger consumption survey and the

variables $x_{ij}$ in the smaller survey to generate the predicted household consumption, which is

subsequently utilized to generate poverty estimates.10
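
The two steps can be sketched in a few lines. The example below is a simplified illustration, not an implementation of any study cited here: it assumes a single proxy variable (a hypothetical count of assets), made-up consumption values, and a made-up poverty line of 2.0 in the same hypothetical units.

```python
import statistics


def fit_ols(xs, ys):
    """Step 1: estimate Equation (1) on the larger survey (one regressor only)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def proxy_means_test(base_x, base_y, target_x, poverty_line):
    """Step 2: apply the estimated coefficients to the smaller survey."""
    intercept, slope = fit_ols(base_x, base_y)
    predicted = [intercept + slope * x for x in target_x]
    flags = [p < poverty_line for p in predicted]  # True = classified as poor
    return flags, predicted


# Hypothetical larger survey: proxy variable and observed consumption.
base_x = [0, 1, 2, 3, 4]
base_y = [1.0, 2.1, 2.9, 4.2, 4.8]

# Hypothetical smaller survey: only the proxy variable is collected.
target_x = [0, 2, 4]

flags, predicted = proxy_means_test(base_x, base_y, target_x, poverty_line=2.0)
print(flags)  # [True, False, False]
```

Note that the classification uses predicted consumption alone; no error term is added back, which matters for the bias discussed in the following subsection.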


Examples and Remaining Challenges
   A major advantage of proxy means testing over wealth indexes is that the former offers an

estimate of household consumption, while the latter an estimate of household wealth. As such,

proxy means testing perhaps provides a better estimate of poverty rates (that are based on

household consumption data). Still, proxy means testing tends to offer biased poverty estimates.

Indeed, Brown, Ravallion, and van de Walle (2016) offer a recent assessment of the performance

of various proxy-means testing methods using data for nine African countries. They find that

standard proxy-means testing is useful with targeting but excludes many poor people. Furthermore,

even with some methodological adjustments, there is room for improvement with proxy-means

tests, particularly with identifying the poorest.

     The intuition behind this result is rather straightforward: the household poverty status is based

on the household consumption that consists of two terms on the right hand side of Equation (1),


10
  For recent reviews of proxy means tests, see Grosh et al. (2008), Coady et al. (2014), and Brown, Ravallion and van
de Walle (2016).

but proxy means testing likely offers biased estimates since it offers the estimate for only one of

these terms (i.e., $\hat{\beta}' x_{ij}$).11 Perhaps the largest advantage of proxy means testing is that for a smaller

population group, it is rather simple to implement (relatively speaking, compared with other

imputation methods). Consequently, it can offer a quick and inexpensive estimate of poverty in

the absence of household consumption data. This, however, may come at the expense of estimate

precision.
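
The source of this bias can be illustrated with a small simulation. The sketch below assumes normally distributed characteristics and errors, treats the coefficients as known, and uses made-up parameter values; it compares the share of households whose actual consumption falls below a poverty line with the share whose predicted consumption (the first term only) falls below the same line.

```python
import random


def poverty_rates(poverty_line, n=20000, mean=10.0, beta=1.0,
                  sigma_x=0.5, sigma_u=0.5, seed=1):
    """Return (actual rate, rate based on predicted consumption only).

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    actual_poor = predicted_poor = 0
    for _ in range(n):
        fitted = mean + beta * rng.gauss(0.0, sigma_x)  # the beta'x term
        actual = fitted + rng.gauss(0.0, sigma_u)       # adds the error term
        actual_poor += actual < poverty_line
        predicted_poor += fitted < poverty_line
    return actual_poor / n, predicted_poor / n


# Poverty line below mean consumption: predicted consumption is less dispersed
# than actual consumption, so fewer households fall below the line.
actual_low, predicted_low = poverty_rates(poverty_line=9.5)

# Poverty line at mean consumption: the two rates roughly coincide.
actual_mean, predicted_mean = poverty_rates(poverty_line=10.0)

print(actual_low, predicted_low, actual_mean, predicted_mean)
```

Under these assumptions, dropping the error term compresses the consumption distribution toward its mean, which understates poverty whenever the line sits below the mean and leaves the rate roughly unchanged only when the line equals the mean.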



IV. Poverty Estimates for Tracking Trends at the National Level
Motivation
    While proxy means testing targets a subset of the poor population that are to benefit from some

social transfer programs, tracking trends in poverty rates at the national level requires poverty

estimates for the whole population. The motivation for poverty imputation in this case is related

to discussions regarding national poverty trends. For example, the Sustainable Development Goals

(SDGs) require frequent monitoring of (national and global) poverty trends that would perhaps

require much more frequent collection of household consumption data than most existing

household consumption surveys allow (see our earlier discussion with Figure 1).12 In particular,

most developing countries, say India, do not collect household consumption data annually. As

such, poverty estimates for these countries would have to be obtained using alternative methods

such as poverty imputation methods.

     In this case, similar to proxy means testing, we also need two sources of household survey

data: one survey that collects both household consumption data and the variables ($x_{ij}$, which are used


11
    Proxy means tests offer unbiased estimates of poverty only in the special case that the poverty line is set exactly at
the mean consumption level (see Dang et al. (2017a) for a more detailed technical discussion). See also Diamond et
al. (2016) for a careful comparison of the poverty score card (a variant of proxy means testing) and other
(regression-based) poverty imputation methods.
12
   The first goal of the SDGs is to eliminate extreme poverty and reduce national poverty levels at least by half by
2030. For details see https://sustainabledevelopment.un.org/sdg1.

in Equation (1)), and the other is a non-consumption survey that offers only the variables

($x_{ij}$). One major difference is that both surveys should provide nationally representative data.

   We can divide this case into two subcategories depending on whether the non-consumption

survey is of the same design or a different design from the consumption survey. The former

subcategory (hereafter referred to as within-survey imputation) includes situations where the

household consumption data collected in the two surveys are not comparable over time due to

changes to the consumption items (while the $x_{ij}$ remain similar). This situation occurs more often

than one might think; for example, national statistical agencies regularly update the list of

consumption items to reflect changes in household consumption patterns

regarding high-technology goods such as smart phones or smart televisions. The latter

subcategory (hereafter referred to as across-survey imputation) includes situations where the

consumption survey was implemented a few years back, and a newer round is yet to be fielded. At

the same time, there exists another more recent non-consumption survey such as a labor force

survey (LFS) that can be combined with the consumption survey to provide more recent poverty

estimates. Both subcategories, within-survey imputation and across-survey imputation,

are also commonly known as survey-to-survey imputation (which is different from survey-to-

census imputation or the poverty mapping technique discussed in the next section).

   We discuss next the techniques and challenges for both these subcategories.



Method Description
   Similar to proxy means testing, survey-to-survey imputation methods consist of two main

steps. The first step is to estimate the household consumption model using the nationally

representative household (or larger) survey, as expressed in Equation (1). Once this is done, the

second step is to combine the predicted coefficients $\hat{\beta}$ and the distribution of the error term

$\mu_{ij}$ obtained from the first step with the variables $x_{ij}$ in the smaller survey to generate the predicted

consumption. However, different from proxy means testing, the predicted household consumption

generated using survey-to-survey imputation methods is composed of both terms on the

right-hand side of Equation (1), that is, $\hat{\beta}' x_{ij}$ and $\hat{\mu}_{ij}$.13
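
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under strong simplifying assumptions, not the estimator used in the studies cited here: a single covariate `x` stands in for the vector $x_{ij}$, the model is fitted by plain OLS, and the distribution of the error term is approximated by resampling the empirical residuals. All function names and the simulated data are hypothetical. For comparison, `pmt_poverty_rate` keeps only the fitted values, as proxy means testing does.

```python
import random


def fit_line(x, y):
    """OLS fit of y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


def imputed_poverty_rate(base_x, base_y, target_x, poverty_line, n_sim=50, seed=0):
    """Step 1: fit the model on the base survey. Step 2: predict consumption in
    the target survey, adding a resampled residual to each fitted value, and
    average the simulated headcount rates."""
    rng = random.Random(seed)
    a, b, residuals = fit_line(base_x, base_y)
    rates = []
    for _ in range(n_sim):
        poor = sum(a + b * xi + rng.choice(residuals) < poverty_line for xi in target_x)
        rates.append(poor / len(target_x))
    return sum(rates) / n_sim


def pmt_poverty_rate(base_x, base_y, target_x, poverty_line):
    """Proxy-means-style prediction: fitted values only, no error term."""
    a, b, _ = fit_line(base_x, base_y)
    return sum(a + b * xi < poverty_line for xi in target_x) / len(target_x)


def simulate_survey(n, seed):
    """Hypothetical data: log consumption = 1 + 0.5*x + noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [1 + 0.5 * xi + rng.gauss(0, 0.5) for xi in x]
    return x, y


if __name__ == "__main__":
    base_x, base_y = simulate_survey(2000, seed=1)
    # In practice consumption is not observed in the target survey;
    # it is simulated here only to allow a comparison.
    target_x, target_y = simulate_survey(2000, seed=2)
    line = 0.3
    actual = sum(y < line for y in target_y) / len(target_y)
    print("actual :", round(actual, 3))
    print("imputed:", round(imputed_poverty_rate(base_x, base_y, target_x, line), 3))
    print("PMT    :", round(pmt_poverty_rate(base_x, base_y, target_x, line), 3))
```

In this simulated setting the poverty line sits below mean consumption, so the fitted-values-only rate tends to understate poverty, while the rate that adds the error term stays close to the observed one.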

     Notably, the estimation framework utilized by most existing economic studies is largely based

on the seminal survey-to-census imputation method offered by Elbers, Lanjouw, and Lanjouw

(2003) (which we discuss further in the next section).14 Most recently, building on the

Elbers et al. (2003) method, Dang, Lanjouw, and Serajuddin (2017b) attempt to bring some further

improvements to the survey-to-survey poverty imputation method, which include simpler variance

formulas, more guidance on the selection of control variables for model building, and formulas for

standardization of variables from surveys with different sampling designs. They validate

estimation results against both household consumption data and LFS data from Jordan before

combining these two sources of data to provide more recent poverty estimates.



Examples and Remaining Challenges
   Poverty reduction in India has been subject to intense debates in the past, which were caused

by comparability issues with different rounds of the National Sample Survey (NSS), the country's


13
     Furthermore, $\hat{\mu}_{ij}$ is usually disaggregated into a cluster random effects term ($\hat{\upsilon}_{cj}$) and an idiosyncratic error term
($\hat{\varepsilon}_{j}$) as in Equation (2). This feature, as well as the addition of $\hat{\mu}_{ij}$ to the error term, results in more accurate estimates
of household consumption and poverty estimates than proxy means testing (see Dang et al. (2017a) for more detailed
discussion). Also note that for consistency, the poverty line in the base survey (rather than that in the target survey)
should be used together with the predicted consumption to obtain poverty estimates.
14
    Variants on this method exist. For example, Tarozzi (2007) proposes a two-step inverse probability weighting probit
estimator, with the relevant weights derived in the first step from the change in the distribution of household
characteristics across the two surveys. Mathiassen (2009) also employs a probit estimator, but proposes an exact
expression for the standard errors and imposes a stricter parametric functional form on the error term. On the empirical
front, Christiaensen et al. (2012) and Mathiassen (2013) apply the Elbers et al. (2003) framework to provide poverty
estimates based on within-survey imputation for several countries including China, Kenya, the Russian Federation,
Uganda, and Vietnam. Using the same technique, other studies combine the household consumption survey with other
surveys such as the DHS (Stifel and Christiaensen, 2007) or the LFS (Mathiassen (2009) and Douidich et al. (2016))
to provide across-survey imputation.

mainstream consumption survey data, over time (Deaton and Kozel 2005). Similar concerns,

albeit to a lesser extent, were raised about the dramatic poverty reduction between 2004 and 2012

as well. One main reason is that the questionnaire design of the consumption module in the 2011/12

(68th) round of the NSS is not comparable to that in the 2009/10 (66th) round (and 2004/05 or 61st

round), which may result in inconsistently constructed and incomparable consumption data. Dang

and Lanjouw (in press) apply imputation methods to provide checks on the poverty trend. They

first build an imputation model using the 2004/05 round as the base survey to obtain poverty

estimates for 2009/10, which are satisfactorily close to the actual poverty rates. They

subsequently employ the same model using the 2009/10 round as the base survey to obtain poverty

estimates for 2011/12. These estimates are close to the actual rates in that year, thus providing

supportive evidence for the swift fall in poverty observed in the data.

       A key assumption for survey-to-survey imputation is that the coefficients $\beta$ estimated from the

previous consumption survey can be combined with the variables in the more recent survey to

obtain poverty estimates.15 While concerns exist that this assumption is likely to be valid only

under normal circumstances, rather than during periods of fast (economic growth and) poverty

reduction, it has been shown to hold during a period of dramatic economic growth in China and

Vietnam where poverty incidence was cut by around half (Christiaensen et al., 2012). Furthermore,

a weaker version of this assumption has been proposed and validated for data from various

countries such as India, Jordan, and Vietnam (Dang et al., 2017a; Dang et al., 2017b; Dang and

Lanjouw, in press). Yet, we would like to note that the validity of this assumption can be context-

specific, and it can be useful to check it using at least two previous rounds of household

consumption surveys wherever such data are available. Common sense also suggests that the



15
     This is also commonly known as the constant parameter assumption.

longer the time interval between the base survey and the target survey, the more likely this

assumption is to be violated.
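
One way to operationalize the check suggested above, when two earlier consumption rounds exist, is to impute from the older round into the newer one and compare the result with the rate actually observed there. The sketch below assumes the imputed rate has already been computed. The 95 percent interval and the simple-random-sampling standard error are illustrative assumptions (real surveys call for design-based standard errors), and the function names are hypothetical.

```python
import math


def headcount_with_se(consumption, poverty_line):
    """Poverty headcount rate and its standard error under simple random sampling."""
    n = len(consumption)
    rate = sum(c < poverty_line for c in consumption) / n
    se = math.sqrt(rate * (1 - rate) / n)
    return rate, se


def imputation_validates(imputed_rate, actual_consumption, poverty_line, z=1.96):
    """True if the imputed rate lies inside the confidence interval of the
    rate observed in the validation round."""
    rate, se = headcount_with_se(actual_consumption, poverty_line)
    return rate - z * se <= imputed_rate <= rate + z * se


if __name__ == "__main__":
    observed = [0.8, 1.2, 0.4, 2.0, 0.9, 1.5, 0.7, 3.1, 1.1, 0.6]  # made-up values
    print(imputation_validates(0.35, observed, poverty_line=1.0))
```

A failed check on past rounds would be a warning sign that the constant parameter assumption may not hold for the period of interest either.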

     Another concern with survey-to-survey imputation, or more accurately speaking, across-

survey imputation methods, is that the variables used in the imputation in both the base survey and

the target survey should have the same distribution. This seemingly innocuous condition

often appears to be taken for granted in many studies, but if it is not satisfied, it may result in severely

biased estimates (Dang et al., 2017b).16 The intuition is rather straightforward: variables such as

household size or labor force participation may be defined differently in a household consumption

survey and a labor force survey, and the data can be collected accordingly in different ways.17

Consequently, this condition should be carefully checked, and appropriate adjustments (e.g.,

standardizing the variables) should be made before imputation is implemented on surveys of a

different design.
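
A simple version of such a check and adjustment is sketched below. It is a generic mean-and-variance rescaling chosen for illustration and is not necessarily the standardization formula proposed by Dang et al. (2017b). The function names and the household-size values are hypothetical, and the code assumes the variable is not constant in either survey.

```python
import statistics


def standardized_mean_difference(base, target):
    """Difference in means scaled by the pooled standard deviation."""
    pooled_sd = ((statistics.pvariance(base) + statistics.pvariance(target)) / 2) ** 0.5
    return (statistics.fmean(target) - statistics.fmean(base)) / pooled_sd


def rescale_to_base(base, target):
    """Rescale the target-survey variable so that its mean and standard
    deviation match those of the base survey."""
    base_mean, base_sd = statistics.fmean(base), statistics.pstdev(base)
    target_mean, target_sd = statistics.fmean(target), statistics.pstdev(target)
    return [base_mean + (v - target_mean) / target_sd * base_sd for v in target]


if __name__ == "__main__":
    size_consumption_survey = [3, 4, 5, 4, 6, 2, 5, 4]  # made-up values
    size_labor_force_survey = [4, 5, 6, 5, 7, 3, 6, 5]  # made-up values
    print(standardized_mean_difference(size_consumption_survey, size_labor_force_survey))
    print(rescale_to_base(size_consumption_survey, size_labor_force_survey))
```

A large standardized difference flags a variable whose definition or collection method may differ between the two surveys and that should be adjusted or dropped before imputation.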



V. Disaggregated Poverty Estimates at the Sub-National Level
Motivation
   Household consumption surveys rarely provide consumption data at a

disaggregated level (such as the state or province level) due to their typically limited sample

size. However, there exists a strong demand to produce poverty estimates at more

disaggregated levels for various purposes such as social transfer targeting or budget allocation. For

example, statistical agencies such as the U.S. Census Bureau routinely implement this task to




16
   Dang et al. (2017b) also offer a simple method to standardize the variables from the two different surveys. This
study also provides more discussion on another related issue of selecting variables in estimating the consumption
model as in Equation (1).
17
   The inconsistency between different surveys is well documented in studies using data from richer countries. For
example, Abraham et al. (2013) examine the differences in employment data between the U.S. Current
Population Surveys and employer-reported administrative data. See also Angrist and Krueger (1999) for a related
review of comparability and other data issues with a focus on labor force surveys.

identify poor communities.18 This task of identifying poor households is commonly known as

"poverty mapping" in most studies on developing countries, since poverty estimates are usually

graphed on a map at a lower administrative level (such as that of a district or a local

community).19 In other words, this case typically involves survey-to-census imputation, since only

censuses can offer more disaggregated data than those available in a household survey. Put

differently, survey-to-census imputation can often be regarded as some type of geographical

imputation, which differs from the (mostly) temporal imputation offered with survey-to-survey

imputation.



Method Description
   Similar to proxy means testing and survey-to-survey imputation methods, survey-to-census

imputation methods also consist of two main steps. The first step is to estimate the household

consumption model in Equation (2) using the more aggregated-level household survey. The second

step is to combine the predicted coefficients $\hat{\beta}$ and the distribution of the two error terms $\hat{\upsilon}_{cj}$ and

$\hat{\varepsilon}_{j}$ obtained from the first step with the variables $x_{ij}$ in the census to generate predicted

consumption data at a more disaggregated level.
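
The two steps can be illustrated with the simplified Python sketch below. It is not the Elbers et al. (2003) estimator: it uses one covariate, plain OLS, and a naive split of the residuals into cluster means and within-cluster remainders, and it ignores heteroskedasticity and parameter uncertainty. The tuple layouts, function names, and simulated data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def fit_line(x, y):
    """OLS fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    return mean_y - b * mean_x, b


def split_residuals(residuals, clusters):
    """Cluster means of the residuals and the within-cluster remainders."""
    grouped = defaultdict(list)
    for r, c in zip(residuals, clusters):
        grouped[c].append(r)
    cluster_effect = {c: sum(v) / len(v) for c, v in grouped.items()}
    idiosyncratic = [r - cluster_effect[c] for r, c in zip(residuals, clusters)]
    return list(cluster_effect.values()), idiosyncratic


def area_poverty_rates(survey, census, poverty_line, n_sim=50, seed=0):
    """survey: list of (x, y, cluster); census: list of (x, cluster, area).
    Returns {area: simulated poverty rate}."""
    rng = random.Random(seed)
    a, b = fit_line([row[0] for row in survey], [row[1] for row in survey])
    residuals = [y - (a + b * x) for x, y, _ in survey]
    effects, idiosyncratic = split_residuals(residuals, [row[2] for row in survey])

    poor = defaultdict(int)
    size = defaultdict(int)
    for _, _, area in census:
        size[area] += 1
    for _ in range(n_sim):
        drawn = {}  # one cluster effect per census cluster in each simulation
        for x, cluster, area in census:
            if cluster not in drawn:
                drawn[cluster] = rng.choice(effects)
            y_hat = a + b * x + drawn[cluster] + rng.choice(idiosyncratic)
            poor[area] += y_hat < poverty_line
    return {area: poor[area] / (n_sim * size[area]) for area in size}


def simulate(n_clusters, per_cluster, x_mean, seed, area=None):
    """Hypothetical data with a cluster effect; survey rows if area is None,
    census rows (without consumption) otherwise."""
    rng = random.Random(seed)
    rows = []
    for c in range(n_clusters):
        u = rng.gauss(0, 0.3)
        for _ in range(per_cluster):
            x = rng.gauss(x_mean, 1)
            y = 1 + 0.5 * x + u + rng.gauss(0, 0.4)
            rows.append((x, y, c) if area is None else (x, f"{area}-{c}", area))
    return rows


if __name__ == "__main__":
    survey = simulate(40, 10, 0.0, seed=1)
    census = simulate(20, 25, -1.0, seed=2, area="north") + simulate(
        20, 25, 1.0, seed=3, area="south"
    )
    print(area_poverty_rates(survey, census, poverty_line=0.3))
```

Drawing one cluster effect per census cluster, rather than one per household, is what lets households in the same cluster share a common shock in each simulation.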

     Elbers et al. (2003) is perhaps the first study that introduces a formal framework for survey-

to-census imputation in economics. Building on this framework, Tarozzi and Deaton (2007)

propose another condition where the conditional distribution of household consumption given $x_j$

is the same for both the survey and the census. This assumption ensures that the estimated

parameters from the smaller areas (as representative in the survey) can be imposed on the data for

the larger areas (as representative in the census). Recent developments have been proposed, mostly

18
  See, e.g., https://www.census.gov/srd/csrm/SmallArea.html.
19
  Another name for this topic in the statistical literature is "small-area estimation" (see, e.g., Rao and Molina (2015)
for a recent textbook treatment).

in the statistics literature, to offer extensions or alternative estimation techniques to the Elbers et

al. (2003) method.20


Examples and Remaining Challenges
   Elbers et al. (2007) use "poverty maps" from three countries for an ex ante evaluation of the

distributional incidence of geographic targeting of public resources. They simulate the impact on

poverty of transferring an exogenously given budget to geographically defined sub-groups of the

population according to their relative poverty status. They find large gains from targeting smaller

administrative units, such as districts or villages. They also suggest that poverty map-based

geographic targeting can be combined with within-community targeting mechanisms for better

estimation results. Lanjouw, Marra, and Nguyen (2017) employ small area estimation techniques

to estimate the poverty indexes of Vietnam's provinces and districts in 2009. They find poverty

rates to become more spatially concentrated over time, which is consistent with agglomeration-

related growth processes. They offer simulation results suggesting that in both 1999 and 2009

geographic targeting for poverty alleviation improves upon a uniform lump-sum transfer,

particularly for the more spatially disaggregated target populations.

     We note that survey-to-census imputation shares an issue with across-survey

imputation methods: the variables used in the imputation in both the (base) survey and the (target)

census should have the same distribution. Perhaps we cannot overemphasize the importance of

this condition, given both the theoretical results and empirical evidence (offered by Tarozzi and

Deaton (2007) and Dang et al. (2017b)). However, to our knowledge, few studies offer explicit




20
  See, e.g., Bilton et al. (2017) for a proposal to use a classification trees technique for poverty mapping, and Das and
Chambers (2017) for alternative standard error formulae with the Elbers et al. (2003) method. See also Pratesi (2016)
for a recent collection of studies discussing various technical aspects of poverty mapping. Another study by Steele et
al. (2017) applies machine learning techniques and big data (i.e., cell phone and satellite data) to evaluate poverty
mapping.

checks of this condition before implementing the imputation. Even fewer studies, if any, attempt

to standardize the variables in both the survey and the census.

     Another interesting and useful area that needs more research is how to produce and interpret

the evolution of poverty maps over time. Multiple challenges exist with generating such dynamic

poverty maps. One is that we would need survey and census data at two points in time, and both

these data sources should be comparable both at each point in time (i.e., the issue discussed

immediately above) and over time (i.e., the issue of the constant parameter assumption as discussed

with survey-to-survey imputation methods). While some alternative techniques have been

proposed,21 it seems that this topic still needs more development.


VI. Dynamic Poverty Estimates
Motivation
    Different poverty situations are best addressed with different policy responses. In particular,

transitory and chronic poverty typically require different policy instruments and no single policy

may successfully address both. For example, while transitory poverty can be alleviated with safety

net programs, chronic poverty would need to be tackled with structural and longer-term

interventions such as investment in human capital and building infrastructure.22

     However, poverty estimates based on cross-sectional data provide only static snapshots of

poverty rates, rather than the dynamics of poverty transitions over time. Absent a clear

understanding of poverty dynamics, a seemingly unchanged poverty rate of, say, 15 percent in two

periods could conceal dynamic processes ranging from zero mobility (i.e., where all households

see no change in their welfare) to perfect mobility (i.e., where all poor households in the first


21
   For example, Nguyen (2011) offers an innovative study that uses panel data from household surveys to estimate the
relation between expenditure in the second period and household characteristics in the first period. The estimated
parameters are then applied to a census in the second period to predict expenditure and poverty measures in a future
third period. This approach may address, partially but not completely, the issues raised above.
22
   See, e.g., Barrett (2005) and Ravallion (2016) for more discussion on different policy approaches to reduce poverty.

period escaped poverty and were all replaced by households that had previously been non-poor in

the first period) and any scenario between these two extremes. Dynamics analysis is crucial for the

design of effective and efficient poverty reduction policies, but such analysis requires panel survey

data that are usually unavailable (particularly for developing countries).



Method Description
   In the absence of actual panel data, Dang et al. (2014) and Dang and Lanjouw (2013) recently

propose methods to construct synthetic panels from repeated cross sections, which have provided

encouraging results in various settings.23 Their techniques share certain similarities with survey-

to-survey imputation methods and include the following steps. First, estimate the household

consumption model in Equation (1) using the available cross sections to obtain the predicted

coefficients $\hat{\beta}$. Second, estimate the correlation coefficient $\rho$ of the error terms $\mu_{ij}$, using cohort-aggregated

household consumption between the two surveys. Third, combine $\hat{\beta}$ and $\hat{\rho}$ to obtain

estimates of poverty mobility using bivariate probability formulae.
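
The third step can be sketched as follows, assuming the error terms in the two rounds are bivariate normal. The sketch takes the predicted (log) consumption values, the error standard deviations, and $\hat{\rho}$ as given inputs; it does not show how $\hat{\rho}$ is derived from cohort-aggregated consumption. The function names are hypothetical, and the bivariate normal probability is obtained by a simple numerical integration rather than a library routine.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bivariate_normal_cdf(a, b, rho, steps=2000):
    """P(U <= a, V <= b) for standard normal U, V with correlation rho (|rho| < 1),
    computed by trapezoidal integration over u of pdf(u) * P(V <= b | U = u)."""
    lower = -8.0
    upper = min(a, 8.0)
    if upper <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(u):
        return _STD_NORMAL.pdf(u) * _STD_NORMAL.cdf((b - rho * u) / scale)

    h = (upper - lower) / steps
    total = 0.5 * (integrand(lower) + integrand(upper))
    total += sum(integrand(lower + k * h) for k in range(1, steps))
    return total * h


def transition_probabilities(xb1, xb2, sigma1, sigma2, rho, z1, z2):
    """Joint probabilities of poverty status in two rounds for one household.
    xb1, xb2: predicted (log) consumption in rounds 1 and 2;
    sigma1, sigma2: error standard deviations; z1, z2: poverty lines."""
    d1 = (z1 - xb1) / sigma1
    d2 = (z2 - xb2) / sigma2
    return {
        "poor_poor": bivariate_normal_cdf(d1, d2, rho),
        "poor_nonpoor": bivariate_normal_cdf(d1, -d2, -rho),
        "nonpoor_poor": bivariate_normal_cdf(-d1, d2, -rho),
        "nonpoor_nonpoor": bivariate_normal_cdf(-d1, -d2, rho),
    }


if __name__ == "__main__":
    probs = transition_probabilities(
        xb1=1.0, xb2=1.1, sigma1=0.6, sigma2=0.6, rho=0.5, z1=0.8, z2=0.8
    )
    for status, p in probs.items():
        print(status, round(p, 3))
```

Averaging these household-level probabilities over the sample would give population shares for each transition; the sign flips on the thresholds and on `rho` follow from rewriting "above the line" as "below the negated line".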

     A key difference, however, with survey-to-survey imputation methods is that the $x_j$ in Equation

(1) should consist of time-invariant characteristics alone. These include such variables as ethnicity,

religion, language, place of birth, and parental education, which provide the connectors between

different rounds of cross sections with the appropriate age adjustments. For example, the cohorts



23
  For example, recent applications and further validations of these synthetic panels methods include Ferreira et al.
(2013), Cruces et al. (2015), and Vakis et al. (2015) for Latin American countries, Bourguignon, Moreno, and Dang
(2018) and Foster and Rothbaum (2015) for Mexico, Balcazar et al. (2018) for Colombia, Martinez et al. (2013) for
the Philippines, Garbero (2014) for Vietnam, Cancho et al. (2015) for countries in Europe and Central Asia, Dang and
Ianchovichina (forthcoming) for countries in the Middle East and North Africa region, Dang and Dabalen (in press)
and Dang, Lanjouw and Swinkels (2017) for Sub-Saharan African countries, and Dang and Lanjouw (2017 and
forthcoming) for India, Vietnam, and the United States. Researchers at international organizations including the
UNDP and the Asian Development Bank have also applied these methods for analysis of welfare mobility (UNDP,
2016; Jha et al., 2018); see also OECD (2015) for an application by the OECD to study labor transitions in richer
countries. See also Gibson (2001) for a related study on how panel data on a subset of individuals can be used to infer
chronic poverty for a larger sample.

aged 25 to 55 in the first survey round are likely the same cohorts aged 30 to 60 in the second survey

round five years later.24 The effects of the time-varying variables are thus subsumed in the

correlation coefficient $\rho$ of the error terms. This feature stands in contrast with the survey-to-

survey imputation methods discussed in earlier sections that aim to capture as many relevant (time-

invariant and time-varying) variables as possible on the right-hand side of their estimation models.25
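
The age adjustment that links cohorts across rounds is simple arithmetic, shown here as a small Python helper. The `head_age` field and the function names are hypothetical, and the sketch assumes the cohort is defined by the age of the household head.

```python
def shift_age_window(min_age, max_age, years_between_rounds):
    """Age window in the later round that corresponds to the same birth cohorts."""
    return min_age + years_between_rounds, max_age + years_between_rounds


def select_cohort(households, min_age, max_age):
    """Keep households whose head's age falls in [min_age, max_age]."""
    return [h for h in households if min_age <= h["head_age"] <= max_age]


if __name__ == "__main__":
    round2 = [{"head_age": 28}, {"head_age": 30}, {"head_age": 60}, {"head_age": 61}]
    low, high = shift_age_window(25, 55, years_between_rounds=5)
    print(low, high, len(select_cohort(round2, low, high)))
```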



Examples and Remaining Challenges
   Despite a growing collection of nationally representative panel surveys for African countries,

data coverage is unfortunately available for only seven countries, and the time periods spanned by

these panel surveys are mostly limited to short periods of three years or less. To overcome this

data shortage, Dang and Dabalen (in press) construct synthetic panel data for more than 20

countries accounting for two-thirds of the population in Sub-Saharan Africa; these synthetic panels

span an average of six years for each country. Their analysis suggests that all these countries as a

whole have had pro-poor growth, with one-third of the poor population escaping poverty, which

is larger than the proportion of the population that fell into poverty in the same period. Chronic

poverty, however, remains high and a considerable proportion of the population is vulnerable to

falling into poverty.

      Despite their increasing popularity, synthetic panels are not a perfect substitute for actual

panel data. In particular, not much is known about whether, and how usefully, synthetic panels can be

employed in applications that involve studying a causal relationship or regression analysis. The analysis

offered to date in terms of profiling the poverty trajectories for population groups with these

24
   The age range can be adjusted similarly if there is a different time interval between the two survey rounds. Other
time-varying household characteristics can also be used if retrospective questions about the round-1 values of such
characteristics are asked in the second-round survey.
25
   This difference is further accentuated with some missing data imputation methods in the statistical literature where
sample design variables such as sampling weights, strata and cluster identifiers are also included in the estimation
model (see, e.g., Rubin (1987)).

synthetic panels is mostly descriptive, with little explicit attention to underlying causal

mechanisms. More research is thus needed on the application of these synthetic panels in such

contexts. Put differently, it is useful to know to what extent synthetic panels can substitute for

actual panel survey data.



VII. Further Reflections on Related Issues

   We are more often than not faced with contexts where we either have no consumption data, or

the available consumption data have quality issues. Indeed, Serajuddin et al. (2015) find that over the

period 2002-2011, almost one-fifth (i.e., 28) of the 155 countries have only one poverty data point

in the WDI database, and as many as 29 countries do not have any poverty data point in the same

period. Another recent survey by Beegle et al. (2016) indicates that slightly more than half (i.e.,

27) of the 48 countries in Sub-Saharan Africa had two or more comparable household surveys for

the period between 1990 and 2012. Clearly, poverty imputation is useful, and may likely be the

only choice in these cases. But what about other contexts where we have a choice over poverty

imputation and implementing a full-scale household consumption survey?

   Dang et al. (2017a) suggest that even in these contexts, there are still a couple of advantages

to poverty imputation methods, particularly in the following scenarios:

   i)      In the immediate term (when micro survey data are not fully available for all countries)

   ii)     Survey costs and/or survey implementation pose a challenge

   iii)    Back-casting consumption from a more recent survey for better comparison with older

           surveys.

It can be useful to offer some additional commentary on these cases. Cases (i) and (ii) are closely

related and are perhaps the main driving factor behind poverty imputation, since very few, if any,

developing countries can afford the financial and logistical challenges of fielding a household

consumption survey every year. Consequently, most developing countries are likely to implement

the household consumption survey every few years. In such contexts, poverty imputation methods

can offer a (far) less costly option to provide estimates for the intervening years between the

surveys.26 Case (iii), although less common, certainly represents the scenario where poverty

imputation is the only route to provide poverty estimates for surveys fielded in the past.

     Another advantage with poverty imputation methods that has received little attention is the fact

that such methods, particularly survey-to-survey imputation, can help us bypass the oftentimes

thorny issues of obtaining the appropriate (intertemporal and intraregional) price deflators. This

issue worsens for cross-country comparison, since in that case we have to employ conversion

factors to convert the different currencies to the same base.

     Notably, a promising direction for further development with poverty imputation methods is

that they need not be restricted to the topic of poverty alone but can be utilized in other fields as

well. For example, Fujii (2010) and Sohnesen et al. (2017) employ the Elbers et al. (2003) method

to provide a map of child malnutrition respectively in Cambodia and Ethiopia; Gibson (2018) uses

the same method to study the effect of deforestation on subsequent inequality in the rural Solomon

Islands. As another example, a recent study by Dang and Ianchovichina (forthcoming) constructs

synthetic panels from the cross sections in the Gallup Poll for 16 countries in the Middle East to

offer analysis of dynamics of subjective well-being during the Arab Spring period. Figure 3, taken

from this study, plots the percentage of the poor or vulnerable in the first year who move up one

or two welfare categories in the second year for major population groups classified by gender,




26
   Recent estimates suggest that the average cost of implementing a household consumption survey (in 2014 or later)
ranges from approximately US$800,000 to US$5 million depending on the context and sample sizes (Kilic et al.,
2017). At the same time, implementing poverty imputation may require only a fraction of this cost since its major
expense is to cover analytical time. Indeed, selective pairing of international experts with national statistical agencies'
staff can form small teams that provide both cost-effective analysis and local capacity building.

education levels, work status, migration status, and residence areas. This figure suggests that

upward mobility is weaker than downward mobility both for Arab Spring and other Arab countries,

and migrants tend to be less upwardly mobile (and more downwardly mobile) in Arab

Spring countries, but the opposite holds for non-Arab Spring countries. As such, in a similar spirit,

other potentially useful applications of poverty imputation methods may include emerging policy-

relevant topics such as vulnerability, multidimensional poverty, or gender equality.

       Yet, we end this section on a cautious note that poverty imputation methods, like most other

statistical models, rely heavily on the modeling techniques and their accompanying assumptions.

If the model assumptions are satisfied by the data, poverty imputation can yield encouraging and

low-cost results. However, the opposite holds where the model assumptions are invalid. It would

thus be useful to offer careful checks on the relevant modeling assumptions as well as the variable

selection process before providing imputation-based poverty estimation.



VIII. Conclusion
   The growing demand for more frequent and accurate poverty estimates is not satisfied by the

current availability of household consumption data, at least in the short run. Imputation methods

offer a promising solution against this background and have been widely used.27 These methods

have also received increasing attention. For example, a recent and high-profile report on

monitoring global poverty (Atkinson, 2017) explicitly calls for further exploration of imputation

techniques for poverty measurement purposes in data-scarce contexts. Yet, there is currently a

severe dearth of research that can bridge the gaps between the typical development practitioners

and the latest advances in the field. We aim to help fill this gap with this review.




27
     See, e.g., Jolliffe et al. (2015) for a recent review.

References
Abraham, Katharine G., John Haltiwanger, Kristin Sandusky, and James R. Spletzer. (2013)
   "Exploring Differences in Employment between Household and Establishment Data." Journal
   of Labor Economics, 31, S129-S172.

Alkire, Sabina, James Foster, Suman Seth, Jose Manuel Roche, and Maria Emma Santos. (2015).
   Multidimensional Poverty Measurement and Analysis. USA: Oxford University Press.

Angrist, J. D. and Krueger, A. B. (1999) "Empirical Strategies in Labor Economics." In
   Ashenfelter, Orley and David E. Card. (Eds.). Handbook of Labor Economics, Vol. 3c.
   Amsterdam: North-Holland.

Atkinson, Anthony B. (2017). Monitoring Global Poverty: Report of the Commission on Global
   Poverty. Washington, DC: The World Bank.

Balcazar, Carlos Felipe, Hai-Anh Dang, Eduardo Malasquez, Sergio Olivieri and Julieth Pico.
   (2018). "Welfare Dynamics in Colombia: Results from Synthetic Panels." Working paper.

Barrett, Christopher B. (2005). "Rural Poverty Dynamics: Development Policy Implications."
   Agricultural Economics, 32: 45-60.

Beegle, Kathleen, Luc Christiaensen, Andrew Dabalen, and Isis Gaddis. (2016). Poverty in a
   Rising Africa. Washington, DC: The World Bank.

Bilton, Penny, Geoff Jones, Siva Ganesh, and Steve Haslett. (2017). "Classification Trees for
    Poverty Mapping." Computational Statistics & Data Analysis, 115: 53-66.

Bourguignon, Francois, Hector Moreno, and Hai-Anh Dang. (2018). â€œOn the Construction of
   Synthetic Panelsâ€. Working paper. Paris School of Economics.

Brown, Caitlin, Martin Ravallion, and Dominique van de Walle. (2016). â€œA Poor Means Test?
   Econometric Targeting in Africaâ€. World Bank Policy Research Working Paper No. 7915.
   Washington DC: The World Bank.

Cancho, César, María E. Dávalos, Giorgia Demarchi, Moritz Meyer, and Carolina Sánchez
   Páramo. (2015). “Economic Mobility in Europe and Central Asia: Exploring Patterns and
   Uncovering Puzzles”. World Bank Policy Research Paper No. 7173.

Christiaensen, Luc, Peter Lanjouw, Jill Luoto, and David Stifel. (2012). “Small Area Estimation-
   based Prediction Models to Track Poverty: Validation and Applications.” Journal of Economic
   Inequality, 10(2):267-297.

Coady, David, Margaret Grosh, and John Hoddinott. (2014). “Targeting Outcomes Redux”. World
   Bank Research Observer, 19:61–85.




Cruces, Guillermo, Peter Lanjouw, Leonardo Lucchetti, Elizaveta Perova, Renos Vakis, and
   Mariana Viollaz. (2015). “Estimating Poverty Transitions Using Repeated Cross-Sections: A Three-
   country Validation Exercise”. Journal of Economic Inequality, 13:161–179.

Cuesta, Jose and Gabriel Lara Ibarra. (2018). “Comparing Cross-Survey Micro Imputation and
   Macro Projection Techniques: Poverty in Post Revolution Tunisia”. Journal of Income
   Distribution.

Dang, Hai-Anh and Andrew L. Dabalen. (in press). “Is Poverty in Africa Mostly Chronic or
   Transient? Evidence from Synthetic Panel Data”. Journal of Development Studies.

Dang, Hai-Anh and Elena Ianchovichina. (forthcoming). “Welfare Dynamics with Synthetic
   Panels: The Case of the Arab World in Transition”. Review of Income and Wealth.

Dang, Hai-Anh and Peter Lanjouw. (2013). “Measuring Poverty Dynamics with Synthetic Panels
   Based on Cross-Sections”. World Bank Policy Research Working Paper No. 6504, World
   Bank, Washington, DC.

---. (2016). “Toward a New Definition of Shared Prosperity: A Dynamic Perspective from Three
     Countries”. In Kaushik Basu and Joseph Stiglitz. (Eds.). Inequality and Growth: Patterns and
     Policy. Palgrave MacMillan Press.

---. (2017). “Welfare Dynamics Measurement: Two Definitions of a Vulnerability Line and Their
     Empirical Application”. Review of Income and Wealth, 63(4): 633-660.

---. (in press). “Poverty and Vulnerability Dynamics for India during 2004-2012: Insights from
     Longitudinal Analysis Using Synthetic Panel Data”. Economic Development and Cultural
     Change.

Dang, Hai-Anh, Dean Jolliffe, and Calogero Carletto. (2017a). "Data Gaps, Data Incomparability,
   and Data Imputation: A Review of Poverty Measurement Methods for Data-Scarce
   Environments". World Bank Policy Research Paper # 8282. World Bank: Washington, DC.

Dang, Hai-Anh, Peter Lanjouw, and Umar Serajuddin. (2017b). “Updating Poverty Estimates at
   Frequent Intervals in the Absence of Consumption Data: Methods and Illustration with
   Reference to a Middle-Income Country.” Oxford Economic Papers, 69(4): 939-962.

Dang, Hai-Anh, Peter Lanjouw, Jill Luoto, and David McKenzie. (2014). “Using Repeated Cross-
   Sections to Explore Movements in and out of Poverty”. Journal of Development Economics,
   107: 112-128.

Das, Sumonkanti and Ray Chambers. (2017). “Robust mean-squared error estimation for
   poverty estimates based on the method of Elbers, Lanjouw and Lanjouw”. Journal of the Royal
   Statistical Society: Series A (Statistics in Society), 180(4): 1137-1161.

Deaton, Angus. (1997). The Analysis of Household Surveys: A Microeconometric Approach to
   Development Policy. MD: The Johns Hopkins University Press.

Deaton, Angus and Valerie Kozel. (2005). The Great Indian Poverty Debate. New Delhi:
   Macmillan.

Deutsch, Joseph, Jacques Silber, and Guanghua Wan. (2017). “Curbing One’s Consumption and
   the Impoverishment Process: The Case of Western Asia”. Research on Economic Inequality,
   25: 1-24.

Diamond, Alexis, Michael Gill, Miguel Rebolledo Dellepiane, Emmanuel Skoufias, Katja Vinha,
   and Yiqing Xu. (2016). “Estimating Poverty Rates in Target Populations: An Assessment of
   the Simple Poverty Scorecard and Alternative Approaches”. Policy Research Working Paper
   No. 7793. World Bank, Washington, DC.

Douidich, Mohamed, Abdeljaouad Ezzrari, Roy van der Weide, and Paolo Verme. (2016).
   “Estimating Quarterly Poverty Rates Using Labor Force Surveys: A Primer.” World Bank
   Economic Review, 30(3): 475-500.

Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw. (2003). “Micro-Level Estimation of Poverty
   and Inequality.” Econometrica, 71(1): 355-364.

Elbers, Chris, Tomoki Fujii, Peter Lanjouw, Berk Özler, and Wesley Yin. (2007). "Poverty
   Alleviation through Geographic Targeting: How Much Does Disaggregation Help?" Journal
   of Development Economics, 83(1): 198-213.

Ferreira, Francisco H. G., Julian Messina, Jamele Rigolini, Luis-Felipe López-Calva, Maria Ana
   Lugo, and Renos Vakis. (2012). Economic Mobility and the Rise of the Latin American
   Middle Class. Washington DC: World Bank.

Filmer, Deon and Lant Pritchett. (2001). “Estimating Wealth Effects without Expenditure Data--
    or Tears: An Application to Educational Enrollments in States of India”. Demography, 38(1):
    115–132.

Filmer, Deon and Kinnon Scott. (2012). “Assessing Asset Indices.” Demography, 49(1): 359–92.


Fujii, Tomoki. (2010). “Micro-Level Estimation of Child Undernutrition Indicators in Cambodia”.
    World Bank Economic Review, 24(3): 520–553.

Garbero, Alessandra. (2014). “Estimating Poverty Dynamics Using Synthetic Panels for IFAD-
   supported Projects: A Case Study from Vietnam”. Journal of Development Effectiveness, 6(4):
   490-510.

Gibson, John. (2001). “Measuring Chronic Poverty without a Panel”. Journal of Development
   Economics, 65(2): 243-66.




---. (2018). “Forest Loss and Economic Inequality in the Solomon Islands: Using Small-Area
     Estimation to Link Environmental Change to Welfare Outcomes”. Ecological Economics, 148:
     66–76.

Grosh, M., C. Del Ninno, E. Tesliuc, and A. Ouerghi. (2008). For Protection and Promotion: The
   Design and Implementation of Effective Safety Nets. Washington, DC: World Bank.

Harttgen, Kenneth, Stephan Klasen, and Sebastian Vollmer. (2013). “An African Growth Miracle?
   Or: What Do Asset Indices Tell Us about Trends in Economic Performance?” Review of
   Income and Wealth, 59(S1): S37–S61.

Jha, S., A. Martinez, P. Quising, Z. Ardaniel, and L. Wang. (2018). “Natural Disasters, Public
    Spending, and Creative Destruction: A Case Study of the Philippines”. ADBI Working Paper
    817. Tokyo: Asian Development Bank Institute.

Jolliffe, Dean, Peter Lanjouw, Shaohua Chen, Aart Kraay, Christian Meyer, Mario Negre, Espen
    Prydz, Renos Vakis, and Kyla Wethli. (2015). A Measured Approach to Ending Poverty and
    Boosting Shared Prosperity: Concepts, Data, and the Twin Goals. Washington DC: The World
    Bank.

Kilic, Talip, Umar Serajuddin, Hiroki Uematsu, and Nobuo Yoshida. (2017). "Costing Household
    Surveys for Monitoring Progress toward Ending Extreme Poverty and Boosting Shared
    Prosperity." World Bank Policy Research Paper no. 7951, World Bank, Washington, DC.


Kolenikov, S. and Angeles, G. (2009). “Socioeconomic status measurement with discrete proxy
   variables: Is principal component analysis a reliable answer?” Review of Income and Wealth,
   55(1): 128-165.

Lanjouw, Peter, Marleen Marra, and Cuong Nguyen. (2017). "Vietnam’s Evolving Poverty Index
   Map: Patterns and Implications for Policy." Social Indicators Research, 133(1): 93-118.

Martinez, Arturo Jr., Mark Western, Michele Haynes, and Wojtek Tomaszewski. (2013). “Measuring
  Income Mobility Using Pseudo-Panel Data”. Philippine Statistician, 62(2): 71-99.

Mathiassen, Astrid. (2009). “A Model Based Approach for Predicting Annual Poverty Rates
   without Expenditure Data”. Journal of Economic Inequality, 7:117–135.

---. (2013). “Testing Prediction Performance of Poverty Models: Empirical Evidence from
    Uganda”. Review of Income and Wealth, 59(1): 91–112.

Morris, Saul S., Calogero Carletto, John Hoddinott, and Luc JM Christiaensen. (2000). "Validity
  of Rapid Estimates of Household Wealth and Income for Health Surveys in Rural Africa."
  Journal of Epidemiology & Community Health, 54(5): 381-387.




Moser, Caroline and Andrew Felton. (2009). “The Construction of an Asset Index: Measuring
  Asset Accumulation in Ecuador”. In Addison, T., Hulme, D. and Kanbur, R. (Eds.) Poverty
  Dynamics: Interdisciplinary Perspectives. Oxford: Oxford University Press.

Ngo, Diana K. L. (2018). “A Theory-based Living Standards Index for Measuring Poverty in
   Developing Countries”. Journal of Development Economics, 130: 190-202.

Nguyen, Cuong V. (2011). “Poverty Projection Using a Small Area Estimation Method: Evidence
   from Vietnam”. Journal of Comparative Economics, 39:368–382.

OECD. (2015). OECD Employment Outlook 2015. OECD Publishing, Paris.
  http://dx.doi.org/10.1787/empl_outlook-2015-en

Pratesi, Monica. (Ed.). (2016). Analysis of Poverty Data by Small Area Estimation. John Wiley &
    Sons.

Rao, J. N. K. and Isabel Molina. (2015). Small Area Estimation, 2nd edition, New York: Wiley.

Ravallion, Martin. (2016). The Economics of Poverty: History, Measurement, and Policy. New
   York: Oxford University Press.

Rubin, Donald B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

Sahn, David E. and David C. Stifel. (2000). “Poverty Comparison over Time and across Countries
   in Africa”. World Development, 28(12): 2123-2155.

Serajuddin, Umar, Hiroki Uematsu, Christina Wieser, Nobuo Yoshida, and Andrew Dabalen.
   (2015). "Data deprivation: another deprivation to end." World Bank Policy Research Paper no.
   7252, World Bank, Washington, DC.

Sohnesen, Thomas Pave, Alemayehu Azeze Ambel, Peter Fisker, Colin Andrews, and Qaiser
   Khan. (2017). "Small Area Estimation of Child Undernutrition in Ethiopian Woredas." PloS
   one 12(4): e0175445.

Steele, Jessica E., Pål Roe Sundsøy, Carla Pezzulo, Victor A. Alegana, Tomas J. Bird, Joshua
    Blumenstock, Johannes Bjelland, Kenth Engø-Monsen, Yves-Alexandre de Montjoye, Asif M.
    Iqbal, Khandakar N. Hadiuzzaman, Xin Lu, Erik Wetter, Andrew J. Tatem, and Linus
    Bengtsson. (2017). “Mapping Poverty Using Mobile Phone and Satellite Data”. Journal of the
    Royal Society Interface. DOI: 10.1098/rsif.2016.0690.

Stifel, D. and Christiaensen, L. (2007). “Tracking Poverty over Time in the Absence of Comparable
    Consumption Data”. World Bank Economic Review, 21, 317-341.

Tarozzi, Alessandro. (2007). “Calculating Comparable Statistics from Incomparable Surveys,
   With an Application to Poverty in India”. Journal of Business and Economic Statistics,
   25(3): 314-336.


Tarozzi, Alessandro and Angus Deaton. (2009). “Using Census and Survey Data to Estimate
   Poverty and Inequality for Small Areas”. Review of Economics and Statistics, 91(4): 773-792.

United Nations Development Programme (UNDP). (2016). Multidimensional Progress: Well-
   being beyond Income. New York: United Nations Development Programme.

Vakis, Renos, Jamele Rigolini, and Leonardo Lucchetti. (2015). Left Behind: Chronic Poverty in
   Latin America and the Caribbean. Washington, DC: World Bank.




Table 1: Categories of Missing Household Consumption Data and Recent Sample Studies

Type | Extent of Missing Consumption Data | Typical Situation | Example | Recent Sample Studies
A | Completely missing (e.g., wealth index) | i) Non-consumption surveys | Demographic and Health Surveys and most small-scale surveys | Sahn and Stifel (2000); Filmer and Pritchett (2001); Filmer and Scott (2012)
A | Completely missing (e.g., wealth index) | ii) Proxy means test/project targeting | Most small-scale surveys | Grosh et al. (2008); Coady et al. (2014); Brown, Ravallion, and van de Walle (2016)
B | Partially missing (e.g., imputed consumption) | i) Consumption data not comparable across survey rounds | Some rounds of India's National Sample Surveys | Tarozzi (2007); Christiaensen et al. (2012); Mathiassen (2013)
B | Partially missing (e.g., imputed consumption) | ii) Consumption data unavailable in current survey but available in another related survey | The annual LFS does not have consumption data, but the household consumption survey is implemented every few years | Mathiassen (2009); Douidich et al. (2016); Dang, Lanjouw, and Serajuddin (2017b)
B | Partially missing (e.g., imputed consumption) | iii) Consumption data unavailable at more disaggregated administrative levels than those in current survey | Population census data are representative at a lower administrative level than a household consumption survey, but do not include consumption data | Elbers, Lanjouw, and Lanjouw (2003); Elbers et al. (2007); Tarozzi and Deaton (2009)
C | Available cross sections, but missing panel data (e.g., synthetic panels) | Most surveys in developing countries do not offer panel data | | Dang et al. (2014); Dang and Lanjouw (2013); Bourguignon, Moreno, and Dang (2018)

Note: LFS stands for Labor Force Surveys. This table is a modified and expanded version of Table 1 in Dang, Jolliffe, and Carletto (2017a).




Figure 1: Decision Process to Select Appropriate Poverty Imputation Methods

[Flowchart. The decision nodes and their outcomes are:]
   1. Money-metric (absolute) poverty? No: Asset index (Section II). Yes: go to 2.
   2. Nationally representative? No: Proxy means test (Section III). Yes: go to 3.
   3. Poverty dynamics? Yes: Synthetic panels (Section VI). No: go to 4.
   4. Disaggregation below national level? Yes: Poverty mapping (Section V). No: go to 5.
   5. Imputation using same survey design? Yes: Within-survey imputation (Section IV.1).
      No: Across-survey imputation (Section IV.2).

Note: Rhombuses represent the desired poverty estimates, and circles represent the suggested poverty imputation methods.
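
The decision process in Figure 1 can be read as a sequence of yes/no questions. The Python sketch below is one illustrative encoding of that sequence, not part of the original paper; the function name `suggest_method` and its argument names are hypothetical, and the branch order follows our reading of the flowchart.

```python
def suggest_method(money_metric,
                   nationally_representative=False,
                   poverty_dynamics=False,
                   below_national=False,
                   same_survey_design=False):
    """Return the imputation method suggested by the Figure 1 flowchart.

    All argument names are hypothetical labels for the flowchart's
    yes/no questions; each takes True (yes) or False (no).
    """
    # Non-monetary poverty measures lead to an asset index.
    if not money_metric:
        return "Asset index (Section II)"
    # Estimates that need not be nationally representative (e.g., project targeting).
    if not nationally_representative:
        return "Proxy means test (Section III)"
    # Poverty dynamics without panel data.
    if poverty_dynamics:
        return "Synthetic panels (Section VI)"
    # Estimates below the national level.
    if below_national:
        return "Poverty mapping (Section V)"
    # Remaining cases depend on whether the survey design is the same.
    if same_survey_design:
        return "Within-survey imputation (Section IV.1)"
    return "Across-survey imputation (Section IV.2)"


if __name__ == "__main__":
    # Example: national money-metric poverty trends imputed from a
    # survey with a different design.
    print(suggest_method(money_metric=True, nationally_representative=True))
```

In practice the questions are rarely this clear-cut, so the function is best treated as a reading aid for the figure rather than a selection rule.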

Figure 2: Number of Household Surveys vs. Countries’ Income Level, 1981-2014

[Scatter plot with fitted line. Horizontal axis: log of mean consumption (4 to 8). Vertical axis: number of surveys (5 to 30). Fitted line: y = -7.6 + 2.9x, with standard errors (2.9) and (0.5).]

Note: Estimated coefficients are shown from an OLS regression of the number of surveys on log of mean consumption; standard errors are in parentheses.

Source: Dang et al. (2017a).




Figure 3: Subjective Wellbeing Dynamics in Arab Spring Countries and Other Arab Countries Based on Synthetic Panels, 2007-2012

[Two-panel chart. Panel A: Upward mobility. Panel B: Downward mobility. Horizontal axis: population groups. Vertical axis: percentage (%), 10 to 80. Series: Arab Spring; Others.]

Note: Dashed lines represent the regional averages for upward mobility and downward mobility, respectively.

Source: Dang and Ianchovichina (forthcoming).



