METHODOLOGY NOTE FOR REFUGEE PROFILING

                            METHODOLOGY NOTE

                                     26.02.2019

Table of contents
  1. BACKGROUND AND OBJECTIVES
  2. SURVEY DESIGN, SAMPLING AND SAMPLE WEIGHTS
  3. QUESTIONNAIRE DESIGN
  4. ENUMERATOR TRAINING AND DATA COLLECTION MONITORING
  5. CHALLENGES IN COMBINING A HOUSEHOLD SURVEY WITH A REGISTRATION DATA UPDATE
  REFERENCES
  ANNEX 1: DATA COLLECTION MONITORING FIGURES
  ANNEX 2: QUESTIONNAIRE CONTENTS




Table of tables
Table 1: The VRX and SEP interview types
Table 2: Robustness check of consumption item removal: poverty headcount rates comparison
Table 3: Consumption shares of items in the optional module groups


Table of figures
Figure 1: Illustration of VRX and long and short SEP coverage in the camp
Figure 2: Allocation of consumption item questions using the RCM
Figure 3: Imputation of total consumption using the RCM
Figure 4: Data collection monitoring framework
Figure 5: Data collection monitoring: Daily individual enumerator print-outs
Figure 6: Evolution of data collection indicators over the course of fieldwork
Figure 7: Data collection monitoring: Overall time trends
Figure 8: Data collection monitoring: Checking questionnaire skipping patterns


List of abbreviations

 KCHS                             Kenya Continuous Household Survey

 KIHBS                            Kenya Integrated Household Budget Survey

 proGres                          Profile Global Registration System (UNHCR)

 RCM                              Rapid Consumption Methodology

 SEP                              Socio Economic Profiling

 UNHCR                            United Nations High Commissioner for Refugees

 VRX                              proGres Registration Verification Exercise




     1. Background and objectives

1.       Refugees in Kenya are not sufficiently covered in national surveys, contributing to a socio-
economic data gap that makes targeting and programming for this particularly vulnerable population
difficult. Kenya hosts about 470,000 refugees, the majority of whom originate from Somalia (55 percent).1
Other major nationalities are South Sudanese (24 percent), Congolese (9 percent) and Ethiopians (6
percent). Yet socio-economic data on refugees in Kenya, as in most of Africa, is scarce, which makes the
comparison of poverty and vulnerability between refugees, host communities and nationals difficult. Such
data is, however, urgently needed to inform targeting and programs for refugees, as well as host
communities.2 Forcibly displaced persons face specific vulnerabilities, including loss of assets and
psychological trauma, limited rights, lack of opportunities, protection risks, and the lack of a planning
horizon.3 Host communities have to pursue their own development efforts in an environment that has
been transformed by a large inflow of newcomers, posing challenges while also introducing new
opportunities.4
2.       In Kalobeyei, the need for better baseline welfare information coincided with a planned update
of UNHCR registration records. Kalobeyei was established in 2015 as an extension of the four Kakuma
settlements, with an overall population of 183,000 at the time. The population of 43,000 presently living
in Kalobeyei is largely South Sudanese, and the majority arrived in 2016 and 2017. Due to the
emergency situation at arrival, and the fact that most were recognized on a prima facie basis, there was
a need to complete and update registration records. Such planned data collection constitutes an
opportunity to complement registration records, which typically include 60-80 standard socio-
demographic variables, with greater detail about socio-economic conditions, including
consumption/income information. However, the design of such surveys is fundamentally different from
typical census-like registration data collection.5
3.      The Kalobeyei Socio-Economic Profiling (SEP) was designed to help fill this data gap and provide
a template for future refugee surveys. The SEP includes a host of socio-economic indicators, both on the
household and on the individual household member level, based on the forthcoming World Bank-
supported national Kenya Continuous Household Survey (KCHS), the 2015/16 Kenya Integrated Household
Budget Survey (KIHBS) and key information from the 2016 Kakuma Refugee Vulnerability Study, among
other sources.6 The survey also comprises a consumption module using the new Rapid Consumption
Methodology (RCM) for improved efficiency. The SEP and the lessons learned from its design and
implementation can therefore inform future refugee surveys in Kenya and beyond, including the planned
UNHCR global module for socio-economic analysis.
4.      Collecting household consumption data is methodologically challenging. Living standards are
most widely measured using consumption aggregates, constructed from data collected in household
surveys, such as the SEP.7 However, there is considerable variation in the exact methodology of
consumption surveys, which has been shown to significantly affect the resulting aggregates.8 The SEP
adopts a cost-efficient yet reliable approach by going without a consumption diary but using an

1 According to the UNHCR, as of the end of November 2018 (https://www.unhcr.org/ke/figures-at-a-glance).
2 Beegle et al., Poverty in a Rising Africa.
3 World Bank, “Forcibly Displaced: Toward a Development Approach Supporting Refugees, the Internally Displaced, and Their Hosts”; Etang-Ndip,
Hoogeveen, and Lendorfer, “Socioeconomic Impact of the Crisis in North Mali on Displaced People.”
4 Verwimp and Jean-Francois, “Forced Displacement and Refugees in Sub-Saharan Africa: An Economic Inquiry”; Kreibaum, “Their Suffering, Our
Burden?”
5 Estadística, Division, and Programme, Household Surveys in Developing and Transition Countries.
6 See Annex 2 for the full questionnaire.
7 Deaton and Zaidi, “Guidelines for Constructing Consumption Aggregates for Welfare Analysis.”
8 Beegle et al., “Methods of Household Consumption Measurement through Surveys”; Kilic and Sohnesen, “Same Question but Different Answer.”

extensive list of items, where households are asked to recall their recent consumption over periods
ranging from 7 days for food to 1 year for some durable goods.
5.      The SEP allows the Kalobeyei refugees to be compared with other populations in Kenya, and with
both displaced and non-displaced populations in other countries. The importance of comparability between
questionnaires for meaningful comparative analysis, particularly of consumption data, is well
documented.9 By using standardized and widely used questionnaire modules and adopting a standard
household definition, the SEP provides estimates of socio-economic indicators and poverty that are
comparable to those from other surveys, including the national Kenyan 2015/16 KIHBS and surveys on
South Sudanese IDPs and South Sudanese refugees in Ethiopia.10
6.      The Kalobeyei SEP is conducted in parallel to the UNHCR Registration Verification Exercise
(VRX). The latter is designed to update the UNHCR’s Profile Global Registration System v3 (proGres)
database, which covers all refugees registered by UNHCR in the settlement. Verification is conducted
every two years, on average. In Kenya, proGres contains approximately 60-80 variables on individual
characteristics, addresses, documentation, education, employment, language, relatives and specific
needs. As part of the fieldwork, enumerators first completed a VRX questionnaire for each household,
followed by a second questionnaire for the SEP.11 For technical and confidentiality reasons, the VRX and
the SEP interviews are administered on different devices and platforms.

     2. Survey design, sampling and sample weights

7.      The exercise encompasses three different questionnaires: the VRX questionnaire, one for the
basic SEP and one for an extended SEP interview. All households in the camp need to be visited to update
the proGres database and collect basic socio-economic information. The VRX covers the full camp to
exhaustively update the proGres database. However, it is inefficient to administer more detailed
questions or the consumption modules to all households. Rather, households are randomly sampled for
the extended SEP interview. Those not selected are administered the basic SEP questionnaire instead,
which is less extensive and does not contain the consumption modules (Figure 1). This results in three
different questionnaires, where all refugees are subject to the VRX and the basic SEP questionnaire, while
a representative sample of households is administered the extended SEP (Table 1).




9 Beegle et al., Poverty in a Rising Africa; Beegle et al., “Methods of Household Consumption Measurement through Surveys”; Backiny-Yetna,
Steele, and Djima, “The Impact of Household Food Consumption Data Collection Methods on Poverty and Inequality Measures in Niger.”
10 For example, the South Sudan Crisis Recovery Survey 2017 and the Ethiopia Skills Profile Survey 2017.
11 During the verification exercise, households were asked to confirm the accuracy of their registration records. Any new household members
(new births or arrivals) are referred to formal registration centers. Similarly, absent members are noted and, if not eventually located, their status
is changed to inactive.

                           Figure 1: Illustration of VRX and long and short SEP coverage in the camp
[Figure: illustration of the camp with the legend entries "VRX + extended SEP" and "VRX + basic SEP".]
                                                     Source: Authors' illustration.
8.       Households are identified based on the groupings established through the UNHCR registration
process, which may vary from the functional households. In the proGres database individuals are
generally organized into nuclear families. These are defined, upon first registration, as a group which “lives
together and identifies as a family and for whom a relationship of either social, emotional or economic
dependency is assumed”.12 By contrast, the unit of observation for the SEP is households, to ensure
comparability with the national KIHBS survey and most other consumption surveys. According to the
Kenya National Bureau of Statistics, households are groups of people who are living together, have a
common household head and share “a common source of food and/or income as a single unit in the sense
that they have common housekeeping arrangements [..]”.13 Since proGres families and households are not
necessarily the same, the VRX and the SEP surveys do not use the same unit of observation (Table 1). For
instance, someone may at the time of registration have identified a group of people as her family, yet they
do not or no longer live together. She would thus be in the same proGres family but not the same
household as them. Or, a person may live and eat with a group of people, but not identify them as her
family. They will then be in the same household but not in the same family. The correct identification of
the household and all its members must therefore be captured before the start of a SEP interview to
ensure the comparability of the data.




12 As defined by the UNHCR, see for example UNHCR, “Implementing Registration within an Identity Management Framework.”
13 KNBS, “Basic Report 2015/16 KIHBS.”

                                                       Table 1: The VRX and SEP interview types

     Survey        Coverage               Interviews  Unit of observation  Administering time  Modules
     VRX           All                    8,000       proGres families     ~15 min             proGres v3 questions
     Extended SEP  Representative sample  1,500       Households           ~100 min            Detailed socio-economic questions + consumption module
     Basic SEP     Remaining households   6,500       Households           ~25 min             Essential socio-economic questions

                                                                  Source: Authors' illustration.
9.      A sample size of 1,500 for the extended SEP questionnaire allows differences in a proportion,
such as the poverty rate, to be detected statistically between two (balanced) groups. The survey is designed
to identify small but meaningful differences in proportion between two groups in the sample. A sensible
threshold is a 15 percent relative difference in the poverty rate (7.5 percentage points at a poverty rate
of 50 percent), detectable between any two halves of the sample.
For the Kalobeyei SEP, to obtain these results at a confidence level of 95 percent and a power of 80 percent
while allowing for about 5 percent invalid interviews, a targeted sample size of 1,500 households is
needed.14
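
The sample size reasoning in paragraph 9 can be reproduced with a few lines of Python. This is a minimal sketch that assumes the standard two-proportion formula with the z-scores and proportions quoted in footnote 14; the function name and variable names are illustrative only.

```python
def n_per_group(p1, p2, z_power=0.84, z_alpha=1.96):
    """Sample size per group for detecting p1 versus p2 in two balanced groups."""
    variance_term = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_power + z_alpha) ** 2 * variance_term / (p1 - p2) ** 2

# Hardest case: one group at 50 percent, the other 15 percent (relative) higher.
n = n_per_group(0.5, 0.575)
total_minimum = 2 * n                    # both balanced groups together
planned = total_minimum / (1 - 0.05)     # allow for about 5 percent invalid interviews

print(f"per group: {n:.0f}, total: {total_minimum:.0f}, planned: {planned:.0f}")
```

The result is slightly below the 1,453 quoted in the footnote because the footnote rounds the minimum total to 1,380 before adjusting for invalid interviews; both figures are comfortably covered by the target of 1,500.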
10.       The basic SEP produces a list of all refugee households in the camp to serve as the sample frame
for the extended SEP, making a separate listing exercise unnecessary. Drawing the sample requires a list
of all households in the camp to serve as the sample frame. If it does not exist beforehand, usually a
separate listing exercise has to be conducted before data collection where all households are visited and
recorded. Since all refugee dwellings in the camp are visited for the VRX and basic SEP to interview all
families and households, however, such advance listing becomes unnecessary. A complete list of refugee
households can be produced during data collection and the sampling can be done on-the-fly during the
visit, using the survey software on the mobile devices. The parallel design thus improves the efficiency of
sampling as compared to stand-alone household surveys. However, it also requires thorough monitoring
of whether records that appear in the VRX data also come up in the SEP data and vice versa. In addition,
it is essential that a record be made of refused or otherwise unsuccessful interviews, so that the sample
frame and non-response rate are accurate.
11.      Households can be sampled on the spot and with a fixed probability. Because the number of
households in the camp is not known with certainty beforehand, the selection probability needed to implement
the random draw in the survey software has to be determined from an estimate. A straightforward approach is to
take the number of families registered in proGres before the exercise and divide the target sample size of
1,500 by this total to obtain the selection probability. Note that this assumes that households and families
are on average made up of a similar number of people. Households can then be randomly selected for the
extended SEP before the start of the interview using the tablet software.
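
The on-the-spot draw described in paragraph 11 amounts to an independent random draw per household with a fixed probability. The sketch below is illustrative only: it uses the approximate VRX interview count from Table 1 as a stand-in for the pre-exercise proGres family count, and the function names are hypothetical rather than part of any survey software.

```python
import random

TARGET_SAMPLE = 1500
REGISTERED_FAMILIES = 8000  # assumption: stand-in for the proGres family count

def selection_probability(target, frame_estimate):
    """Fixed probability of selection into the extended SEP."""
    return min(1.0, target / frame_estimate)

def select_for_extended(rng, probability):
    """One independent draw, made on the device before the interview starts."""
    return rng.random() < probability

p0 = selection_probability(TARGET_SAMPLE, REGISTERED_FAMILIES)
rng = random.Random(2019)  # fixed seed only to make the sketch reproducible
assignments = [
    "extended" if select_for_extended(rng, p0) else "basic"
    for _ in range(REGISTERED_FAMILIES)
]
realized_share = assignments.count("extended") / len(assignments)
print(f"p0 = {p0:.4f}, realized share = {realized_share:.4f}")
```

Because each draw is independent, the realized share of extended interviews will generally differ slightly from the selection probability, which is why the weights are rescaled after data collection.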



14 Detecting the difference is most difficult when the proportion of one of the groups is $p = 0.5$. The formula for the sample size $n$ of one of the
two balanced groups is
$$n = \frac{(z_{1-\beta} + z_{1-\alpha/2})^2 \, \left(p_1 (1 - p_1) + p_2 (1 - p_2)\right)}{(p_1 - p_2)^2}.$$
Given the z-scores of $z_{1-\beta} = 0.84$ and $z_{1-\alpha/2} = 1.96$ for a power of 80 percent and a
95 percent confidence interval, and the proportions $p_1 = 0.5$ and $p_2 = 0.575$ or $p_2 = 0.425$, this yields a minimum total sample size of $N_{\min} = 2n \approx 1380$.
Allowing for around 5 percent non-response, this leads to a planned sample of $N = 1453 \approx 1500$.

12.      Implicit stratification balances the sample in case systematic differences in household
characteristics are expected between different parts of the camp. There may be important systematic
differences between the populations of different parts of the camp, say in the date of arrival, which makes
it desirable to ensure that each neighborhood is represented proportionally in the sample. A
straightforward way to ensure such a balanced representation in the sample is to stratify implicitly by
neighborhood. Households then need to be linked to the families and their existing proGres records
before the sampling, and are stratified based on the addresses in the database.15
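
Implicit stratification as outlined in paragraph 12 can be sketched as a systematic draw from a list ordered by neighborhood. The example below is a simplified illustration, not the procedure used in the field: the household tuples are hypothetical, and the random starting point is an added assumption.

```python
import random

def implicit_stratified_sample(households, step, rng):
    """Systematic sample from a list ordered by neighborhood.

    households: list of (household_id, neighborhood) tuples.
    step: sampling interval, e.g. 5 to select one fifth of the list.
    """
    shuffled = households[:]
    rng.shuffle(shuffled)                            # random order within neighborhoods
    ordered = sorted(shuffled, key=lambda h: h[1])   # stable sort groups by neighborhood
    start = rng.randrange(step)                      # assumption: random starting point
    return ordered[start::step]

# Hypothetical frame with three neighborhoods of different sizes.
frame = (
    [(f"A{i}", "A") for i in range(100)]
    + [(f"B{i}", "B") for i in range(50)]
    + [(f"C{i}", "C") for i in range(25)]
)
sample = implicit_stratified_sample(frame, step=5, rng=random.Random(1))
counts = {}
for _, neighborhood in sample:
    counts[neighborhood] = counts.get(neighborhood, 0) + 1
print(counts)
```

Since every fifth entry is taken from a list grouped by neighborhood, each neighborhood contributes in proportion to its size without explicit per-stratum quotas.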
13.      The single-stage sample design implies uniform sample weights, both for the basic and
extended SEP. Sample weights are essentially the inverse of the probability of an observation being
included in the data. For the basic SEP, all households in the camp are selected and the selection
probability is 1. For the extended SEP, denote the selection probability by $p_0$. In a next step, the weights
need to be adjusted for unit non-response. The final weights for analyzing the basic SEP data are then
$w_{basic} = 1 \cdot \frac{1}{\hat{r}}$, where $\hat{r}$ is the estimated propensity of response, i.e. the overall response rate for all SEP
interviews. For the extended SEP, the implementation of the sampling also has to be accounted for. If after
data collection the final ratio $p_1$ of extended interviews to the overall number of households differs slightly
from $p_0$, the sample weights need to be scaled to sum up to the overall population.16 The extended SEP
sample weight for a given household is therefore calculated as
$$w_{ext} = \frac{1}{p_0} \cdot \frac{1}{\hat{r}} \cdot \frac{p_0}{p_1},$$
where the factor $\frac{p_0}{p_1}$ corrects for variations in the surveyed proportion of households.
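
As a worked illustration of the weight calculation in paragraph 13, the sketch below plugs in the planned selection probability (0.19) and the realized proportion of extended interviews (0.184) reported in footnote 16. The response rate of 0.98 is an assumption derived from the non-response rate of about 2 percent; the function names are illustrative.

```python
def basic_weight(response_rate):
    """Basic SEP: selection probability 1, adjusted for unit non-response."""
    return 1.0 * (1.0 / response_rate)

def extended_weight(planned_prob, realized_prob, response_rate):
    """Extended SEP: inverse selection probability, non-response adjustment,
    and rescaling by the ratio of planned to realized sampling proportion."""
    return (1.0 / planned_prob) * (1.0 / response_rate) * (planned_prob / realized_prob)

w_basic = basic_weight(0.98)
w_extended = extended_weight(0.19, 0.184, 0.98)
print(f"basic weight: {w_basic:.3f}, extended weight: {w_extended:.3f}")
```

Note that the planned probability cancels out, so the extended weight reduces to the inverse of the realized proportion times the response rate.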

14.      The SEP data can be linked to the proGres database to cross-check between the data and
explore the correlation of VRX variables with SEP indicators. The SEP survey can record the proGres ID
for the data to be linked to the proGres database and enable cross-checks and comparisons between the
datasets.17 This allows verifying the accuracy and plausibility of the data in the analysis. In addition, the
correlation between variables in the proGres database and the more detailed SEP indicators can be
explored. This helps to better understand the implications of the proGres variables, which are available
for a large number of refugee populations worldwide.
15.     The importance of the proGres registration to the refugees reduces non-response in the SEP.
Both the basic and the extended SEP have a household non-response rate of about 2 percent, largely due
to households without adults being ineligible for the SEP questionnaires. The low non-response rate can
be explained by the combination of the SEP questionnaires with the VRX, given the importance of the
proGres registration for the refugees to receive support. However, this link may also have compromised
truthful reporting, as respondents may have sought to maximize the support they receive.

     3. Questionnaire design

16.     The SEP questionnaire is designed to produce data comparable with the national household
survey. The questionnaire modules for the SEP are largely taken verbatim from the KIHBS 2015/16 survey.
As the household definitions are also aligned, indicators on demographics, education, labor, household

15 In practice, implicit stratification entails making a list of families ordered by their neighborhoods and randomizing the order within
neighborhoods. If then e.g. one fifth of the households needs to be sampled, one can just select every fifth household in the list.
16 In the Kalobeyei SEP, the probability of selection was $p_0 = 0.19$, while the actual proportion of extended SEP interviews to the sample frame
was $p_1 = 0.184$. Note that it is important that this difference does not result from significantly lower response rates to the long interviews, which
would have to be accounted for separately.
17 For technical and confidentiality reasons, the SEP and VRX surveys may have to be conducted with different devices and on different platforms.

characteristics and consumption are directly comparable to those in the national household data, allowing
for more comparative analysis and better contextualization. However, the analysis must consider that the
comparability is limited by the gap of four years between the surveys, during which national indicators
may have changed considerably.
17.     The Rapid Consumption Methodology (RCM) improves the efficiency of collecting consumption
data. Measuring consumption levels increases questionnaire administering times considerably. The RCM
reduces the number of questions in the consumption module, while still providing reliable poverty
estimates.18 The method consists of five steps: First, core consumption items are selected based on their
importance for welfare and consumption. Second, the remaining consumption items are partitioned into
three different optional consumption modules. Third, these optional modules are randomly assigned to
the households, which are then only administered the core module and their respective optional module
(Figure 2). Fourth, after data collection, a model imputes the consumption of items contained in the
optional modules for all households based on the households’ characteristics and their found association
with consumption levels (Figure 3). Finally, the resulting consumption aggregate is used to estimate
poverty.
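
The mechanics of the assignment and imputation steps can be sketched as follows. This is a deliberately simplified illustration and not the model used for the SEP: the imputation here replaces each skipped module with the mean observed among households that were asked it, whereas the RCM imputes from household characteristics. All names and numbers are hypothetical.

```python
import random
from statistics import mean

OPTIONAL_MODULES = (1, 2, 3)

def assign_modules(household_ids, rng):
    """Step 3: each household is randomly assigned one optional module."""
    return {hh: rng.choice(OPTIONAL_MODULES) for hh in household_ids}

def impute_totals(core, observed, assignment):
    """Steps 4 and 5 with a mean-based stand-in for the imputation model.

    core: {household: consumption recorded in the core module}
    observed: {household: consumption recorded in its assigned optional module}
    """
    module_means = {
        m: mean(observed[hh] for hh, assigned in assignment.items() if assigned == m)
        for m in OPTIONAL_MODULES
    }
    totals = {}
    for hh, assigned in assignment.items():
        imputed = sum(module_means[m] for m in OPTIONAL_MODULES if m != assigned)
        totals[hh] = core[hh] + observed[hh] + imputed
    return totals

# Hypothetical example with one household per optional module.
assignment = {"hh1": 1, "hh2": 2, "hh3": 3}
core = {"hh1": 100.0, "hh2": 80.0, "hh3": 120.0}
observed = {"hh1": 10.0, "hh2": 6.0, "hh3": 8.0}
totals = impute_totals(core, observed, assignment)
print(totals)
```

In practice `assign_modules` would be called at interview time, and the imputation would be run once after data collection on the full sample.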
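       Figure 2: Allocation of consumption item questions using the RCM
[Figure: all questions comprise the core module and optional modules 1, 2 and 3; the questions asked of a household comprise the core module and one of modules 1, 2 or 3.]
                     Source: Authors' illustration.

       Figure 3: Imputation of total consumption using the RCM
[Figure: households HH 1, HH 2 and HH 3 each answer the core module and one optional module and skip the other two; the skipped modules are filled in by imputation.]
                     Source: Authors' illustration.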
18.      To further minimize administration times and reduce enumerator and respondent fatigue, the
list of consumption items used in the survey is optimized based on national consumption patterns.
Findings from earlier consumption surveys can inform the selection of consumption goods into the
questionnaire. Items that were found in the 2015/16 KIHBS to have a national democratic consumption
share of less than 0.1 percent and were consumed by fewer than 50 households are excluded from the RCM
consumption module.19 In addition, items that were listed as “other” (e.g. “other bread”) in the KIHBS
survey, which did not employ the RCM, are ignored. Including them in the RCM would affect
measurement, as respondents would report on consumption of items outside their optional group as an
“other” item, leading to double counting after imputation. Based on this procedure, the extended SEP

18 Pape and Mistiaen, “Household Expenditure and Poverty Measures in 60 Minutes.”
19 The national democratic consumption share of an item is the share of total household budget spent on that item, averaged over all households.

questionnaire includes 58 fewer food and 63 fewer non-food items than the KIHBS questionnaire (Table
3). A robustness test estimates the expected impact of this optimization by re-calculating the consumption
aggregates from the 2015/16 KIHBS consumption data based on the reduced list of items. The result is an
increase of the national poverty headcount rate by only 0.05 percentage points, and a change in rural and
urban poverty of 0.1 and -0.3 percentage points respectively (Table 2). These impacts are deemed
acceptable for the SEP, given that measurement and sampling errors are generally considerably larger.
               Table 2: Robustness check of consumption item removal: poverty headcount rates comparison

                                               KIHBS 2015/16        Low share items removed
                                National            36.1%                     36.2%
                                Rural               40.1%                     40.3%
                                Urban               29.4%                     29.1%
                                Peri-Urban          27.5%                     28.3%
                                             Source: Authors’ calculations.
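
The item exclusion rule from paragraph 18 can be illustrated with a short sketch. It computes the democratic consumption share as defined in footnote 19 and drops items that fall below both thresholds. The data layout (one dictionary of item spending per household) and the example items are assumptions made for illustration.

```python
def democratic_shares(budgets):
    """Average over households of each item's share in the household budget.

    budgets: list of {item: spending} dictionaries, one per household.
    Assumes every household has positive total spending.
    """
    items = {item for budget in budgets for item in budget}
    share_sums = dict.fromkeys(items, 0.0)
    for budget in budgets:
        total = sum(budget.values())
        for item, spent in budget.items():
            share_sums[item] += spent / total
    return {item: s / len(budgets) for item, s in share_sums.items()}

def items_to_keep(budgets, min_share=0.001, min_households=50):
    """Drop items with a share below 0.1 percent AND fewer than 50 consuming households."""
    shares = democratic_shares(budgets)
    consumers = {
        item: sum(1 for budget in budgets if budget.get(item, 0) > 0)
        for item in shares
    }
    return {
        item for item in shares
        if not (shares[item] < min_share and consumers[item] < min_households)
    }

# Hypothetical data: 100 households, one staple and two minor items.
budgets = []
for i in range(100):
    budget = {"maize": 100.0}
    if i < 60:
        budget["salt"] = 0.05      # tiny share, but consumed by 60 households
    if i < 3:
        budget["saffron"] = 0.05   # tiny share and consumed by only 3 households
    budgets.append(budget)

kept = items_to_keep(budgets)
print(sorted(kept))
```

Only items failing both criteria are removed, so a cheap but widely consumed item such as the hypothetical salt entry stays in the questionnaire.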
19.      Allocation of items into the RCM modules is also informed by national consumption shares. The
consumption items of the SEP questionnaire are allocated into one core module and three optional
modules, which allows sufficient reduction of items for individual households while still producing reliable
poverty estimates. 20 The allocation is informed by consumption shares retrieved from the KIHBS 2015/16.
The items in the optional modules are distributed such that similar items within categories are included
in different modules, to ensure orthogonality between groups. At the same time, items that are more
commonly consumed are spread across optional modules, for each module to represent similarly
meaningful consumption shares (Table 3).
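
One way to operationalize the allocation principles in paragraph 19 is a greedy assignment that first spreads items of the same category across modules and then balances cumulative consumption shares. The sketch below is a hypothetical illustration of that idea, not the allocation procedure actually used; item names, categories and shares are invented.

```python
def allocate_items(items, n_modules=3):
    """Greedy allocation of optional items to modules.

    items: list of (name, category, consumption_share) tuples.
    Each item goes to the module that holds the fewest items of the same
    category, breaking ties by the lowest cumulative consumption share.
    """
    modules = [[] for _ in range(n_modules)]
    totals = [0.0] * n_modules
    for name, category, share in sorted(items, key=lambda item: -item[2]):
        def cost(m):
            same_category = sum(1 for _, c, _ in modules[m] if c == category)
            return (same_category, totals[m])
        best = min(range(n_modules), key=cost)
        modules[best].append((name, category, share))
        totals[best] += share
    return modules, totals

# Hypothetical optional items with shares in percent.
optional_items = [
    ("rice", "cereal", 0.9), ("millet", "cereal", 0.6), ("sorghum", "cereal", 0.3),
    ("mango", "fruit", 0.8), ("banana", "fruit", 0.5), ("orange", "fruit", 0.2),
]
modules, totals = allocate_items(optional_items)
for number, (module, total) in enumerate(zip(modules, totals), start=1):
    print(number, [name for name, _, _ in module], round(total, 2))
```

Processing items from the largest share downward keeps the commonly consumed items apart, so each module ends up with a similar share, as in Table 3.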




20 Pape and Mistiaen.

                            Table 3: Consumption shares of items in the optional module groups

          Module groups                  Core     Module 1   Module 2   Module 3   Total    KIHBS 2015/16

          Food:
          National democratic share      90.8%    2.8%       2.8%       2.6%       99.0%    100%
          Number of items                78       26         27         29         160      218

          Non-food:
          National democratic share      86.9%    3.5%       3.5%       3.5%       97.4%    100%
          Number of items                87       40         40         41         208      271
                                                     Source: Authors’ calculations.

     4. Enumerator training and data collection monitoring

20.      Enumerators are trained on the questionnaire, both in the classroom and in the field, including
a final exam, to ensure high-quality data. Enumerators are trained for at least one week, and ideally two,
before fieldwork, so that they understand the questionnaire and how to administer it correctly. This
includes intensive training on different responsibilities, e.g. how to correctly identify a household before
the interview and understanding the difficulties of collecting consumption data. After being instructed in
the classroom, enumerators practice different sections of the questionnaire amongst themselves. At the
end of this training period, the enumerators’ understanding is tested in an exam. In the exam, the
questionnaire is administered to the trainer while the enumerators record the answers, so that they can be
graded on their accuracy. This is followed by a few days of field training in neighborhoods similar to the ones
sampled for the survey, to gain experience in administering the questionnaire in a realistic setting,
culminating in a comprehensive debriefing session to share experiences and challenges.
21.       Trained enumerators serve as mentors to less experienced ones, instructing them during
training and fieldwork to improve their performance. For enumerators who are not able to attend the
full training period, performance can be improved by using better trained enumerators as mentors in the
training and during fieldwork. The less experienced enumerators each accompany a mentor for 2-5 days
of data collection. As an additional quality control, data collected by less experienced enumerators can be
thoroughly compared to data from more experienced enumerators, to identify potential systematic
differences. In particular, t-tests for differences in key indicators performed regularly during data
collection help identify such biases early.21 In the Kalobeyei SEP no such differences were found,
indicating that less trained enumerators learned sufficiently well from their mentors.
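
A comparison of this kind can be sketched with a two-sample t-test on a per-interview indicator such as interview duration. The example below uses Welch's unequal-variance version and a normal approximation for the p-value, both of which are assumptions made to keep the sketch within the Python standard library; the durations are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t, df

def approx_p_value(t):
    """Two-sided p-value from a normal approximation (adequate for large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Hypothetical interview durations in minutes.
mentors = [95, 100, 105, 98, 102, 101, 99, 100]
mentees = [minutes + 20 for minutes in mentors]

t_stat, df = welch_t(mentors, mentees)
p_value = approx_p_value(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.1f}, p = {p_value:.4f}")
```

With the small daily samples typical during fieldwork, a statistics package with an exact t distribution would be preferable to the normal approximation used here.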
22.      Monitoring enumerator performance during the data collection enables supervising staff to
address issues quickly. Given the instant availability of data in the cloud, dashboards are produced
automatically to support staff on the ground, underpinned by a framework of automated data exports
and scripts for statistical software (Figure 4). Equipped with knowledge and materials on daily data
collection trends, enumerators and supervisors can track mistakes and improve their performance while

21 The indicators used for these t-tests during SEP fieldwork were the proportion of “don’t know” and “refused to respond” answers, the median
number of consumption items entered, the median interview duration in minutes and the median number of people entered per household.

data collection is still going on. The feedback includes summary print-outs both at the enumerator and
the team level, as well as trends over time (Figure 5, Figure 6 and Figure 7 in Annex 1). The featured indicators
inform on individual enumerator performance, such as the proportion of “don’t know” and “refused to
respond” answers, the median number of food and non-food consumption items recorded, the median
interview duration in minutes and the median household size. Furthermore, indicators on technical
problems such as missing GPS coordinates or incorrect date/time settings help address such challenges
quickly. The supervising staff in the field discusses the dashboards with individual enumerators.
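
The enumerator-level indicators described above can be computed along the following lines. This is a hedged sketch, not the dashboard code itself: the field names (`enumerator`, `answers`, `n_items`, `duration_min`, `hh_size`, `gps`), the non-response codes and the sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical answer codes for "don't know" and "refused to respond".
NONRESPONSE_CODES = {"DK", "REF"}


def enumerator_indicators(interviews):
    """Summarise submitted interviews per enumerator for a feedback print-out."""
    by_enumerator = defaultdict(list)
    for row in interviews:
        by_enumerator[row["enumerator"]].append(row)

    summary = {}
    for enumerator, rows in by_enumerator.items():
        answers = [a for r in rows for a in r["answers"]]
        nonresponse = sum(a in NONRESPONSE_CODES for a in answers)
        summary[enumerator] = {
            "interviews": len(rows),
            "share_dk_refused": nonresponse / len(answers) if answers else 0.0,
            "median_items": median(r["n_items"] for r in rows),
            "median_duration_min": median(r["duration_min"] for r in rows),
            "median_hh_size": median(r["hh_size"] for r in rows),
            # Technical problem indicator: interviews without GPS coordinates
            "missing_gps": sum(r["gps"] is None for r in rows),
        }
    return summary


# Illustrative records only; coordinates and answers are invented.
interviews = [
    {"enumerator": "E01", "answers": ["1", "DK", "2", "3"], "n_items": 18,
     "duration_min": 65, "hh_size": 5, "gps": (3.75, 34.70)},
    {"enumerator": "E01", "answers": ["1", "2", "REF", "4"], "n_items": 22,
     "duration_min": 75, "hh_size": 7, "gps": None},
    {"enumerator": "E02", "answers": ["2", "2", "1", "3"], "n_items": 9,
     "duration_min": 30, "hh_size": 3, "gps": (3.75, 34.70)},
]

summary = enumerator_indicators(interviews)
for enumerator, stats in sorted(summary.items()):
    print(enumerator, stats)
```

Comparing each enumerator's row against the team-level values makes outliers, such as unusually short interviews or few recorded consumption items, easy to spot in the daily discussion.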
                               Figure 4: Data collection monitoring framework

[Flow diagram. Components: data collection on mobile device; interview submission; data server; daily data download; STATA code; export of cleaned data; feedback dashboard; data corrections and feedback.]

                                        Source: Authors' illustration.
23.       Data collection monitoring also helps to check the functioning of skipping patterns in the
questionnaire and track unsuccessful interviews. To ensure that every group of respondents is being
asked all relevant questions while skipping what is not applicable, the percentage of missing answers for
each question is monitored for different household and individual groups, based on characteristics like
the gender of the household head (Figure 8 in Annex 1). The daily data monitoring also includes an up-to-date
list of all unsuccessful interviews, which makes it possible to track why and how many interviews fail, and
whether follow-up interviews are actually done if a new appointment is made after a failed interview
attempt.
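
The skip-pattern check can be illustrated with a short sketch that computes the percentage of missing answers per question within each respondent group. The question names, the grouping variable `head_gender` and the records are hypothetical and serve only to show the logic.

```python
from collections import defaultdict


def missing_share_by_group(records, questions, group_key):
    """Percentage of missing answers per question within each group."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        for q in questions:
            if rec.get(q) is None:  # unanswered or skipped question
                counts[group][q] += 1
    return {
        group: {q: 100 * counts[group][q] / totals[group] for q in questions}
        for group in totals
    }


# Hypothetical questionnaire: "q_all" applies to every household, while
# "q_female_head_only" should be skipped for male-headed households.
records = [
    {"head_gender": "female", "q_all": 1, "q_female_head_only": 2},
    {"head_gender": "female", "q_all": 3, "q_female_head_only": 1},
    {"head_gender": "male", "q_all": 2, "q_female_head_only": None},
    {"head_gender": "male", "q_all": None, "q_female_head_only": None},
]

shares = missing_share_by_group(
    records, ["q_all", "q_female_head_only"], "head_gender"
)
for group, by_question in shares.items():
    print(group, by_question)
```

In this invented example, a missing share of 100 percent for `q_female_head_only` among male-headed households is what the skip pattern intends, whereas any missing answers for `q_all` point to a skip-pattern or data entry problem.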


    5. Challenges in combining a household survey with a registration data update

24.      Conducting the SEP survey in parallel with the proGres verification exercise has significant
advantages, yet risks creating adverse incentives and complicates the identification of households during
fieldwork. Running the two exercises in parallel significantly reduces the
overall organizational costs, ensures data comparability between the two surveys, and is likely to decrease
non-response because respondents have an incentive to be present on the day of the survey. However, a proGres verification
exercise requires respondents to identify their families during the interviews, where aid and shelter
allocation may depend on how many people are registered for a family. This creates incentives for
respondents to overstate the number of people living with them, registering individuals in shelters and
with families where they expect the biggest benefits. For the subsequent SEP survey, enumerators then
face difficulties identifying the groups that constitute the functional households. Together with the
differing definitions of proGres families and households, this complicates the identification of households
during fieldwork.
25.     Matching the records between the two instruments during fieldwork is challenging. The proGres
database must be queried during an SEP interview to identify the corresponding family records. In a
second step, the form must allow for individuals from other proGres records to be added or replaced, and
for additional proGres families to be joined, depending on the constitution of functional households. This
complicated procedure makes mismatches between the two instruments likely. The parallel
implementation exacerbates the problem as the updated proGres data is not yet available in the SEP
questionnaire at the time of the interviews. Instead, the SEP questionnaire relies on the previous
version of the database, making more and repetitive adjustments necessary during fieldwork.
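
The two-step matching procedure can be sketched as a small data structure. This is an illustration under stated assumptions, not the logic of the actual SEP form: the class `FunctionalHousehold`, the registry layout (family identifier mapped to registered person identifiers) and all identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionalHousehold:
    """Household assembled during an interview from registration records."""
    family_ids: set = field(default_factory=set)
    member_ids: set = field(default_factory=set)

    def join_family(self, family_id, registry):
        """Step 1: pull in all registered members of one family record."""
        self.family_ids.add(family_id)
        self.member_ids.update(registry[family_id])

    def add_individual(self, person_id):
        """Step 2: add a person registered under another family record."""
        self.member_ids.add(person_id)

    def remove_individual(self, person_id):
        """Step 2: drop a registered person who does not live here."""
        self.member_ids.discard(person_id)

    def registered_but_absent(self, registry):
        """Registered members of the joined families not in the household."""
        registered = set()
        for family_id in self.family_ids:
            registered.update(registry[family_id])
        return registered - self.member_ids


# Hypothetical snapshot of the previous version of the registration database.
registry = {"F1": ["P1", "P2", "P3"], "F2": ["P4", "P5"]}

household = FunctionalHousehold()
household.join_family("F1", registry)   # family record found by the query
household.remove_individual("P3")       # registered with F1, lives elsewhere
household.add_individual("P4")          # registered with F2, lives here

print(sorted(household.member_ids))
print(sorted(household.registered_but_absent(registry)))
```

Each manual adjustment of this kind is an opportunity for a mismatch, which is why relying on an outdated registration snapshot makes the procedure more error-prone.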
26.     A phased implementation of the two surveys is less efficient but can mitigate some of the
practical challenges. While the losses in efficiency are potentially large, allowing a time gap between the
two surveys has some advantages. In a parallel implementation, it is often unclear whether refugees who no
longer live with the family that they registered with upon arrival in the camp should be with this original
family or their new household on the day of the interviews; a time gap avoids this conflict. Similarly, a phased implementation mitigates the
incentives to intentionally misreport in the SEP, since it is more credible that responses in the SEP have
no direct impact on the households’ aid benefits.

    References

Backiny-Yetna, Prospere, Diane Steele, and Ismael Yacoubou Djima. “The Impact of Household Food
Consumption Data Collection Methods on Poverty and Inequality Measures in Niger.” World Bank Policy
Research Working Paper, 2014. https://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-7090.
Beegle, Kathleen, Luc Christiaensen, Andrew Dabalen, and Isis Gaddis. Poverty in a Rising Africa.
Washington, DC: World Bank Publications, 2016.
Beegle, Kathleen, Joachim De Weerdt, Jed Friedman, and John Gibson. “Methods of Household
Consumption Measurement through Surveys: Experimental Results from Tanzania.” Journal of
Development Economics 98, no. 1 (2012): 3–18.
Deaton, A, and S Zaidi. “Guidelines for Constructing Consumption Aggregates for Welfare Analysis.” World
Bank Living Standards Measurement Study. Washington, D.C.: World Bank, 2002.
Estadística, Naciones Unidas División de, United Nations Statistical Division, and National Household
Survey Capability Programme. Household Surveys in Developing and Transition Countries. Vol. 96. United
Nations Publications, 2005.

Etang-Ndip, Alvin, Johannes Hoogeveen, and Julia Lendorfer. “Socioeconomic Impact of the Crisis in North
Mali on Displaced People.” World Bank Policy Research Working Paper, 2015, 32.
Kilic, Talip, and Thomas Sohnesen. “Same Question but Different Answer: Experimental Evidence on
Questionnaire Design’s Impact on Poverty Measured by Proxies.” Review of Income and Wealth 65, no. 1
(2019): 144–65.
KNBS. “Basic Report 2015/16 Kenya Integrated Household Budget Survey.” Nairobi, Kenya: KNBS, March
2018. http://statistics.knbs.or.ke/nada/index.php/catalog/88/related_materials.
Kreibaum, Merle. “Their Suffering, Our Burden? How Congolese Refugees Affect the Ugandan
Population.” World Development 78 (2016): 262–87.
Pape, Utz Johann, and Johan A. Mistiaen. “Household Expenditure and Poverty Measures in 60 Minutes:
A New Approach with Results from Mogadishu,” 2018.
UNHCR. “Implementing Registration within an Identity Management Framework,” 2018.
https://www.unhcr.org/registration-guidance/chapter5/.
Verwimp, Philip, and Jean-Francois Maystadt. “Forced Displacement and Refugees in Sub-Saharan Africa:
An Economic Inquiry.” Background Paper for “Poverty in a Rising Africa.” World Bank, 2015.
World Bank. “Forcibly Displaced: Toward a Development Approach Supporting Refugees, the Internally
Displaced, and Their Hosts.” Washington, DC: World Bank, 2017.




Annex 1: Data collection monitoring figures

            Figure 5: Data collection monitoring: Daily individual enumerator print-outs




                                   Source: Authors' illustration.


                      Figure 6: Evolution of data collection indicators over the course of fieldwork

[Two line charts with fieldwork dates from 11/22/2018 to 1/10/2019 on the horizontal axis. Left panel: "Median interview duration during fieldwork", vertical axis in minutes (0 to 140), series: median long interview duration in minutes and median short interview duration in minutes. Right panel: "Median no. of items consumed during fieldwork", vertical axis from 0 to 25, series: median food items consumed and median nonfood items consumed.]

                                 Source: Authors' calculations based on Kalobeyei SEP.
                                Figure 7: Data collection monitoring: Overall time trends




                                               Source: Authors' illustration.




Figure 8: Data collection monitoring: Checking questionnaire skipping patterns




                        Source: Authors' illustration.




Annex 2: Questionnaire contents



