Comparing a tablet-based rapid consumption with a paper-based full consumption survey








               9/20/2019







Acknowledgement
The report was led by Utz Pape (Senior Economist, EA1PV) and written together with Gonzalo Nunez
(Consultant, EA1PV) with substantial contributions from Nduati Kariuki (ET Consultant, EA1PV). The team
is grateful for inputs and comments from the peer reviewers Alvin Etang Ndip (Senior Economist, EA1PV) and Arden
Finn (Young Professional, EA1PV) as well as guidance received from Pierella Paci (Practice Manager,
EA1PV).
The World Bank collaborated closely with the Kenya National Bureau of Statistics (KNBS) on the 2015/16
Kenya Integrated Household Budget Survey, as well as on the tablet-based survey pilot. The team would
like to thank Mr. Zachary Mwangi, Ms. Mary Wanyonyi, Mr. Paul Samoei and Mr. Samuel Kipruto for their
tireless work and dedication toward the measurement of wellbeing in Kenya.







Table of Contents
A.        INTRODUCTION
B.        DATA COLLECTION METHODS
     - DIFFERENCES BETWEEN PAPI (PAPER) AND CAPI (TABLET)
     - EVIDENCE FOR THE USE OF CAPI
C.        RESULTS FROM THE TABLET-BASED SURVEY
     - USE OF SOFT CONSTRAINTS IN THE FOOD CONSUMPTION MODULE
     - IMPLEMENTATION OF THE RAPID CONSUMPTION METHODOLOGY
D.        RESULTS FROM THE PILOT COMPARING CAPI VS. PAPI
     - POPULATION AND HOUSEHOLD CHARACTERISTICS
     - CONSUMPTION AGGREGATES AND POVERTY ESTIMATES
E.        CONCLUSIONS AND RECOMMENDATIONS
ANNEX
     1.     ADDITIONAL FIGURES AND TABLES
     2.     SAMPLING DESIGN AND WEIGHTS
     3.     ALLOCATION OF HOUSEHOLDS TO PSUS
     4.     QUALITY CONTROL OF CAPI SUBMISSIONS
     5.     IMPUTED RENT VALUES FOR URBAN HOUSEHOLDS
     6.     POVERTY LINE AND CONSUMPTION AGGREGATES
REFERENCES







List of Figures
Figure 1: Characteristics of the paper and tablet-based survey.
Figure 2: Structure of the questionnaire for both surveys.
Figure 3: Distribution of unit prices for cocoa & cocoa products.
Figure 4: Distribution of quantity consumed for eggs.
Figure 5: CAPI total food and nonfood expenditure collected and imputed.
Figure 6: Total food and nonfood expenditure for rural areas.
Figure 7: Total food and nonfood expenditure for urban areas excluding Nairobi.
Figure 8: Total food and nonfood expenditure for urban areas.
Figure 9: Allocation of households and EAs.


List of Tables
Table 1: Main characteristics of data collection methods.
Table 2: Example of upper bound soft constraints for cocoa & cocoa products.
Table 3: Conversion factors to kilogram for non-standard unit bowl.
Table 4: Allocation of consumption items into modules.
Table 5: Dwelling characteristics.
Table 6: Water, sanitation, cooking and lighting source.
Table 7: Population by age, gender and education.
Table 8: Characteristics of the household head.
Table 9: Poverty estimates.
Table 10: Gini inequality index.
Table 11: Average response rate by province.
Table 12: Characteristics of the sample.
Table 13: Corrections made to produce comparable consumption aggregates.


List of Boxes
Box 1: Soft constraints in the tablet-based survey.
Box 2: Rapid consumption methodology.
Box 3: Measures of poverty and inequality.


Abbreviations
 CAPI                           Computer Assisted Personal Interview
 EA                             Enumeration area
 KIHBS                          Kenya Integrated Household Budget Survey
 KNBS                           Kenya National Bureau of Statistics
 KCHS                           Kenyan Continuous Household Survey
 KSH                            Kenyan Shillings
 KSPforR                        Kenya Statistics Program-for-Results
 NASSEP                         National Sample Survey and Evaluation Programme
 OLS                            Ordinary Least Squares
 PAPI                           Pen-and-Paper Personal Interview
 RCM                            Rapid Consumption Methodology







Executive Summary
The Kenyan Continuous Household Survey (KCHS), funded by the World Bank, will make available timely
high-quality data. A ten-year gap between the Kenya Integrated Household Budget Survey (KIHBS)
2005/06 and 2015/16 prevented Kenya from understanding the profile of the poor and trends in poverty
reduction. In addition, a processing and analysis time of two years for the 2015/16 data kept
stakeholders from having timely information. The Kenya National Bureau of Statistics (KNBS), with support
of the World Bank through the Kenya Statistics Program-for-Results (KSPforR), implements the KCHS to
collect quarterly and annual data on labor and poverty indicators. The survey has the objective to provide
frequent, timely and high-quality evidence to support stakeholders in decision-making.
Implementing the KCHS provides an opportunity for building a reliable survey infrastructure with the
best choice of consumption module and data collection method. Designing a large-scale and continuous
survey such as the KCHS involves defining the best strategy to support frequent collection and a timely
dissemination. The feasibility of including a full consumption module against the alternative of a rapid
consumption methodology (RCM) needs to be carefully assessed. The latter can reduce the administration
time of the questionnaire, while generating consumption and poverty estimates comparable to those
obtained with a full consumption module. Moreover, the data collection method can support the
monitoring of fieldwork and facilitate data availability. Collecting data with computer-assisted personal
interviewing (CAPI or tablet survey) can serve as the basis in building a modern data monitoring system,
allowing near real-time monitoring of interviews and shortening the time between data collection and its
availability.
In a preparatory step, the KNBS conducted a tablet-based pilot alongside the implementation of the
2015/16 paper-based KIHBS. KIHBS 2015/16 collected data with pen-and-paper interviewing (referred to as
PAPI or paper survey), while at the same time a pilot was implemented using CAPI. Both surveys are based
on the same sampling frame and design. Data was collected in the same randomly drawn geographic
locations or enumeration areas (EAs), with 10 households interviewed for KIHBS and 6 for the tablet-based
pilot. The questionnaires of both surveys differ in some important aspects. The paper survey utilized a full
consumption module as well as detailed information on many livelihood characteristics, while the tablet-
based survey used the rapid consumption methodology combined with a focus on essential questions
throughout all other questionnaire modules.
CAPI has the potential to improve data quality, to eliminate the need for data entry, to support
near real-time monitoring of data collection and to reduce the time between fieldwork and
analysis.
PAPI is the traditional method in which enumerators fill out paper questionnaires, while CAPI refers to
interviews being conducted with the assistance of a computer or tablet. In PAPI, the enumerator records
respondents’ answers on a printed paper questionnaire with a pen. The data then needs to
be digitized, either manually or automatically (by scanning). In CAPI, the enumerator uses an electronic
version of the questionnaire to record responses directly on the device, eliminating the need for data
entry while also allowing for near real-time monitoring of fieldwork as completed interviews can be
available immediately, as long as the devices are connected to a cloud server and submissions transmitted
on a regular basis.
Data quality can be improved with CAPI by preventing enumerators from skipping questions and
introducing dynamic checks and constraints. Skip patterns prevent enumerators from asking a question
that should be skipped, or conversely skipping a question that should be asked. This decreases the length
of the interview, avoids confusing the respondent and saves time when processing the data. It can also
improve data quality by reducing item non-response. Dynamic validations can also be included in the
tablet-based questionnaire to flag suspiciously high or low data entries, which the enumerator must
confirm before proceeding. Dynamic checks and constraints make it possible to identify and confirm such
entries while the enumerator is still with the respondent, when the best contextual knowledge is
available.
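As a minimal sketch of how a skip pattern might be encoded in a tablet questionnaire, the Python function below picks the next question from the answers recorded so far. The question names (`attended_school`, `highest_grade`, `literate`) and the skip rule itself are hypothetical illustrations and are not taken from the KIHBS or pilot questionnaires.

```python
from typing import Optional


def next_question(answers: dict) -> Optional[str]:
    """Return the next question to ask, or None when the section is complete.

    Hypothetical skip rule: 'highest_grade' is asked only of respondents
    who report having attended school.
    """
    if "attended_school" not in answers:
        return "attended_school"
    # Skip pattern: the follow-up is shown only when it applies.
    if answers["attended_school"] == "yes" and "highest_grade" not in answers:
        return "highest_grade"
    if "literate" not in answers:
        return "literate"
    return None


# A respondent who never attended school is routed past 'highest_grade'.
route_no_school = next_question({"attended_school": "no"})
route_school = next_question({"attended_school": "yes"})
```

Because the routing is decided by the software rather than by the enumerator, a question cannot be skipped by mistake or asked when it does not apply.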
The tablet survey produces point estimates that are statistically indistinguishable from those of the
paper survey for household and population characteristics.
Dwelling characteristics, water sources, sanitation, lighting and cooking fuel are similar between the
paper-based survey and tablet-based survey. The material of the floor, roof and walls, as well as
sanitation and lighting, are not significantly different between the paper and tablet survey, either at
the national level or for rural and urban areas. Similarly, water sources and cooking fuel of Kenyan households
are also not significantly different between the CAPI and PAPI data at the national level. For urban and
rural areas, there are some differences between paper and tablet, mainly in categories that refer to other
water sources or cooking fuels. However, it seems unlikely that the differences are associated with the
data collection method employed since some CAPI estimates are larger compared to PAPI estimates, while
others smaller, suggesting the lack of a systematic bias.
Age, gender, literacy and educational characteristics of the household head are also comparable in the
paper and tablet survey. There are only small differences between surveys for some 5-year age groups,
such as population aged 20-24 and 65-69, representing relatively small shares of the overall population.
The share of Kenyans who have ever attended school, the adult literacy rate and school enrolment for the
population aged 6-13 are not statistically different between the CAPI and PAPI surveys. Also, the age, gender and education
of the household head are equivalent between the tablet-based survey and KIHBS 2015/16.
The food module of the tablet-based pilot included soft constraints based on outdated
assumptions affecting quantities and prices collected.
Incorrect soft constraints encouraged misreporting of quantities and prices in the CAPI pilot. Soft
constraints or consistency checks alert enumerators to unusual responses or values for certain questions.
They are set when coding the questionnaire and applied automatically during the interview, usually in the
form of a warning or error message, so that they can be clarified while the enumerator is still with the
respondent. The tablet questionnaire included soft constraints in the consumption module, yet they were
not set to the right levels according to the latest information available. Incorrect upper bound constraints
for prices induced lower unit values in the CAPI survey relative to PAPI. Also, non-standard conversion
factors to kilograms or liters induced higher quantities consumed in the tablet survey. Finally, incorrect
non-standard conversion factors combined with relatively high lower bound soft constraints for quantities
consumed induced higher quantities for items reported in ‘handful’ and ‘piece’ units.
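The behaviour of a soft constraint can be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the bounds (KSh 50 to 400 per kilogram) are invented for the example rather than taken from the pilot questionnaire.

```python
from dataclasses import dataclass


@dataclass
class SoftConstraint:
    """Plausible range for an entry; values outside it trigger a warning."""
    lower: float
    upper: float


def check_entry(value: float, bounds: SoftConstraint, confirmed: bool = False) -> str:
    """Return 'accept' or 'warn'.

    A soft constraint never blocks an entry: once the enumerator confirms
    an out-of-range value with the respondent, it is accepted.
    """
    if bounds.lower <= value <= bounds.upper:
        return "accept"
    return "accept" if confirmed else "warn"


# Hypothetical bounds for a unit price in KSh per kilogram.
price_bounds = SoftConstraint(lower=50.0, upper=400.0)

in_range = check_entry(200.0, price_bounds)                    # within bounds
flagged = check_entry(900.0, price_bounds)                     # needs confirmation
confirmed = check_entry(900.0, price_bounds, confirmed=True)   # confirmed by enumerator
```

If the upper bound is set too low, genuine prices trigger the warning on a routine basis, and an enumerator may lower the entry instead of confirming it. This is consistent with the lower unit values observed in CAPI relative to PAPI.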
Various adjustments are made to obtain comparable prices and quantities between the paper and the
tablet survey. Comparing the paper and tablet data in terms of consumption and poverty required
adjusting both datasets. The tablet food consumption aggregates are derived using the median PAPI prices
for each item and EA. Then, the incorrect conversion factors to standard units for food items in CAPI are
replaced by the median implicit conversion factor for each item and non-standard unit from PAPI. Finally,
other minor adjustments are introduced to produce equivalent consumption aggregates between both
surveys.
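The first adjustment, valuing CAPI quantities at the median PAPI price for each item and EA, can be sketched in a few lines of Python. The record layout (`item`, `ea`, `unit_price`, `hh`, `quantity_kg`) and the example figures are assumptions made for illustration and do not reproduce the actual survey data or processing code.

```python
from collections import defaultdict
from statistics import median


def median_prices(papi_records):
    """Median unit price for each (item, EA) pair in the PAPI records."""
    grouped = defaultdict(list)
    for rec in papi_records:
        grouped[(rec["item"], rec["ea"])].append(rec["unit_price"])
    return {key: median(prices) for key, prices in grouped.items()}


def value_capi(capi_records, prices):
    """Food consumption value per CAPI household, priced at PAPI medians."""
    totals = defaultdict(float)
    for rec in capi_records:
        # Raises KeyError if PAPI has no price for this item and EA;
        # a real pipeline would need a fallback (assumption, not in the source).
        totals[rec["hh"]] += rec["quantity_kg"] * prices[(rec["item"], rec["ea"])]
    return dict(totals)


# Invented example: three PAPI price reports for eggs in EA 1.
papi = [
    {"item": "eggs", "ea": 1, "unit_price": 300.0},
    {"item": "eggs", "ea": 1, "unit_price": 320.0},
    {"item": "eggs", "ea": 1, "unit_price": 340.0},
]
capi = [{"hh": "A", "item": "eggs", "ea": 1, "quantity_kg": 0.5}]

prices = median_prices(papi)
values = value_capi(capi, prices)
```

The same grouping logic would apply to the second adjustment, with the median implicit conversion factor computed for each item and non-standard unit instead of each item and EA.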
The rapid consumption methodology with CAPI produces similar consumption and poverty results
to a full consumption module in PAPI.
Households were asked for the consumption details of all items in the paper survey but not in the
tablet-based survey, in line with the rapid consumption methodology. PAPI used a full consumption module
asking households about their consumption details of 193 food and 126 nonfood items. In the RCM,
households were only asked about consumption details of 125 food items and between 79 and 83 nonfood
items, depending on the optional module allocated to the household. After data collection, multiple
imputation techniques are used to estimate the consumption aggregate of CAPI households.
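To make the idea concrete, the sketch below fills in the optional modules a household was not asked about by drawing repeatedly from households that were asked, and then averages the resulting totals. This repeated random hot-deck draw is a much simpler stand-in for the multiple imputation techniques mentioned above, whose model is not described in this section. The data layout, the function name and all parameters are hypothetical.

```python
import random
from statistics import mean


def impute_totals(households, n_modules, n_draws=20, seed=0):
    """Average imputed total consumption per household.

    households: list of dicts with
        'core'     - consumption from the core module (asked of everyone)
        'module'   - index of the optional module assigned to the household
        'optional' - consumption observed for that assigned module
    For each unassigned module, a value is drawn at random from households
    that were assigned it (donors). Requires at least one donor per module.
    """
    rng = random.Random(seed)
    donors = {
        m: [h["optional"] for h in households if h["module"] == m]
        for m in range(n_modules)
    }
    totals = []
    for h in households:
        draws = []
        for _ in range(n_draws):
            total = h["core"] + h["optional"]
            for m in range(n_modules):
                if m != h["module"]:
                    total += rng.choice(donors[m])
            draws.append(total)
        totals.append(mean(draws))
    return totals


# Invented example with two optional modules and one donor each.
sample = [
    {"core": 100.0, "module": 0, "optional": 10.0},
    {"core": 200.0, "module": 1, "optional": 30.0},
]
imputed = impute_totals(sample, n_modules=2)
```

A model-based imputation would also condition the draws on household characteristics and core consumption, and would carry the between-draw variation into the standard errors, which this sketch does not attempt.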
Poverty point estimates are statistically the same between CAPI and PAPI based on core food
consumption, which allows isolating the effect of data collection method. The consumption details of 91
core food items were collected directly from all households in both CAPI and PAPI, so the consumption
module does not differ between surveys for this aggregate, and poverty estimates based on it isolate the
effect of the data collection method. The differences in poverty incidence, depth and severity between
CAPI and PAPI are not statistically significant at the national level or for urban and rural areas.
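Poverty incidence, depth and severity are assumed here to be the standard Foster-Greer-Thorbecke (FGT) measures with parameter 0, 1 and 2 respectively. The function below is a minimal weighted implementation under that assumption; the function name, the strict inequality used to define the poor and the example figures are illustrative choices, not the report's own code.

```python
def fgt(consumption, weights, poverty_line, alpha):
    """Weighted Foster-Greer-Thorbecke index.

    alpha = 0: poverty incidence (headcount ratio)
    alpha = 1: poverty depth (poverty gap)
    alpha = 2: poverty severity (squared poverty gap)
    A household counts as poor when consumption is strictly below the line.
    """
    total_weight = sum(weights)
    acc = 0.0
    for c, w in zip(consumption, weights):
        if c < poverty_line:
            gap = (poverty_line - c) / poverty_line
            acc += w * gap ** alpha
    return acc / total_weight


# Invented example: three equally weighted households, poverty line of 100.
cons = [50.0, 100.0, 200.0]
wts = [1.0, 1.0, 1.0]
incidence = fgt(cons, wts, 100.0, 0)
depth = fgt(cons, wts, 100.0, 1)
severity = fgt(cons, wts, 100.0, 2)
```

In the example only the first household is poor, with a relative gap of 0.5, so incidence is one third, depth is 0.5 divided by 3 and severity is 0.25 divided by 3.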
CAPI with a rapid consumption methodology produces poverty point estimates that are statistically
equivalent to those obtained from PAPI and a full consumption module. Two equivalent consumption
aggregates, employing the RCM for CAPI, are created and compared: i) total food consumption; and ii)
total food and nonfood expenditure. The point estimates of poverty incidence, depth and severity are
statistically equivalent between KIHBS 2015/16 and the tablet-based pilot at the national level. In urban
areas, poverty incidence from both consumption aggregates is similar between surveys, as are poverty
depth and severity from total food consumption. Poverty depth and severity are slightly different when
considering total expenditure. In rural areas, poverty incidence from total food consumption is also
slightly different between surveys, with all other poverty measures being equivalent between CAPI and
PAPI. The relatively small differences in some poverty measures for urban and rural areas, all at the 10
percent level of significance, do not seem to be associated with the choice of survey method and
consumption module, since there is no evidence of a systematic bias when comparing the paper and tablet
estimates for the two consumption aggregates that employed the RCM for CAPI.
In addition, the distribution of total food and nonfood expenditure is similar, regardless of whether it
is generated from a full consumption module or the rapid consumption methodology. The RCM
combines asking households about their consumption of a subset of items and estimating the
consumption aggregates with multiple imputation techniques. Even so, the distribution of total food
and nonfood expenditure for rural areas is equivalent between PAPI and CAPI, and the Gini inequality
index is not statistically different for any of the consumption aggregates considered. In urban areas, the
distribution of total food and nonfood consumption is similar between PAPI and CAPI for expenditure
values below KSh 9,500. Lower response rates in wealthier areas like Nairobi for the paper survey (77
percent for PAPI vs. 99 percent for CAPI) can potentially distort consumption at the top of the distribution.
After excluding the capital city, the distribution of total expenditure for urban areas is statistically
equivalent between the paper and tablet survey. The Gini index is similar using a core food consumption
aggregate, and only statistically different at the 10 percent level of significance for total food consumption
and total food and nonfood expenditure. Furthermore, the Gini inequality index increases for both CAPI
and PAPI after excluding Nairobi, while the difference between surveys decreases for all the consumption
aggregates. This ultimately suggests that differences in the urban Gini index between CAPI and PAPI are
more likely to be explained by different response rates in other urban areas beyond the capital city, and
less likely to be associated with the data collection method and the consumption module considered.
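For readers less familiar with the Gini index, the sketch below computes it from household expenditure and survey weights as one minus twice the area under the Lorenz curve. The function name and the example values are illustrative assumptions; the report's own estimates may rely on a different estimator or software.

```python
def gini(values, weights=None):
    """Weighted Gini index from the area under the Lorenz curve.

    Returns 0 for perfect equality; approaches 1 as consumption becomes
    concentrated in a single household.
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))  # order households from poorest to richest
    total_weight = sum(w for _, w in pairs)
    total_value = sum(v * w for v, w in pairs)
    cum_share = 0.0  # cumulative share of total consumption
    area = 0.0       # area under the Lorenz curve (trapezoid rule)
    for v, w in pairs:
        previous = cum_share
        cum_share += v * w / total_value
        area += (w / total_weight) * (previous + cum_share) / 2
    return 1 - 2 * area


# Invented examples.
equal = gini([5.0, 5.0, 5.0, 5.0])       # everyone consumes the same
unequal = gini([0.0, 0.0, 0.0, 1.0])     # one household holds everything
two_households = gini([1.0, 3.0])
```

Because the index depends on the whole distribution, missing wealthy households (for example through lower response rates in Nairobi) can shift it even when the data collection method has no effect.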
The KCHS would benefit from using the rapid consumption methodology and CAPI technology, but
other elements also need to be considered in this decision.
Using the rapid consumption methodology and CAPI would make it easier to provide frequent, timely and
high-quality data for the KCHS. Only around 6 percent of the items that households were not asked about
in CAPI with the RCM were actually consumed. These items represent 21 percent of total food and nonfood
consumption, suggesting a full consumption module might not be necessary to obtain precise
consumption estimates. Including the RCM in the KCHS can considerably reduce the administration time
of the questionnaire, while producing household, population, consumption and poverty estimates that are
not statistically different from those of a paper survey with a full consumption module. In
addition, collecting data with CAPI has the potential to improve data quality, eliminates the need for data
entry, supports near real-time monitoring of data collection and reduces the time between fieldwork and
analysis.
Implementing the KCHS with the RCM and CAPI implies considering other aspects to leverage their
benefits. CAPI increases the time needed to design the questionnaire and thus must be combined with
constant efforts in coding and testing it before data collection. The tablet or device, together with the
software and hardware, will need to be procured and tested in advance. In addition, the training of
enumerators will need to emphasize how to use the tablets and how to record answers in them correctly. A
successful monitoring system needs to ensure appropriate data management and transfer from the
devices to the cloud server and define how to provide timely feedback to teams in the field. Additionally,
training and capacity building will be needed to derive total consumption aggregates with multiple
imputation techniques as part of the rapid consumption methodology, in addition to providing documentation
and support to data users.








           A.         INTRODUCTION
1.      After 10 years, the Kenya National Bureau of Statistics (KNBS) conducted a nationwide
household survey between September 2015 and August 2016 to assess the progress made in living
standards. The Kenya Integrated Household Budget Survey (KIHBS) 2015/16 collected representative data
on various socio-demographic and welfare indicators in each of the 47 counties. The aim was to provide
an update on poverty, and to review the progress achieved over the last decade, comparing against data
from the previous household survey from 2005/06.1
2.      The Kenyan Continuous Household Survey (KCHS), funded by the World Bank, has the objective
to close crucial data gaps and make available timely high-quality data. A ten-year gap between KIHBS
2005/06 and 2015/16 prevented Kenya from understanding the profile of the poor and trends in poverty
reduction. In addition, a processing and analysis time of two years for the 2015/16 data kept
stakeholders from having timely information. The KNBS, with support of the World Bank through the
Kenya Statistics Program-for-Results (KSPforR), implements the KCHS to collect quarterly and annual
data on labor and poverty indicators.
3.      Improving data availability and quality with the KCHS will contribute to increasing the evidence
base and facilitate decision-making in Kenya. The KCHS will generate nationally representative quarterly
data, and annual data representative at the county level. Frequent and high-quality data will
allow producing more precise and credible analysis, which will support national and local stakeholders in
making informed decisions in terms of monitoring, planning and resource allocation in Kenya.
4.       The KCHS would benefit from setting up a modern data monitoring system and the best choice
of consumption module and data collection method. The design and monitoring of the survey need to
consider various trade-offs associated with the choice of consumption modules and data collection
methods, such as the preparation and administration time and the quality of the monitoring system. Using
a tablet or computer-assisted personal interviewing (CAPI) method can create additional challenges, including
configuring and trouble-shooting tablets, data transmission and storage, as well as others.2 Yet, it offers
new opportunities to increase the quality of data collection, real-time monitoring of incoming data with
effective feedback loops as well as near real-time analysis of the data to facilitate a timely release of
results. In addition, the design of the survey needs to assess carefully the feasibility of including a full
consumption module against the alternative of a rapid consumption methodology.3

1 The analysis has been published in (Kenya National Bureau of Statistics 2018) and (World Bank 2019).
2 Demombynes, Gubbins, and Romeo 2013.
3 U. Pape and Mistiaen 2015b and U. J. Pape and Mistiaen 2018a.

5.      In a preparatory step, the KNBS conducted a tablet-based pilot alongside the implementation
of the 2015/16 paper-based KIHBS. KIHBS 2015/16 collected data with pen-and-paper interviewing
(referred to as PAPI or paper survey). At the same time, a pilot was implemented using computer-assisted
personal interviewing (referred to as CAPI or tablet survey). The sampling design of both surveys
involved three phases. First, 2,400 clusters were drawn from the National Sample Survey and Evaluation
Programme (NASSEP) V sampling frame. Next, 16 households were selected in each cluster or enumeration
area (EA). Finally, 10 households out of these 16 were selected for the KIHBS survey and the remaining 6
for the tablet-based pilot (Figure 1). The questionnaires administered differ in some important aspects.
The paper survey utilized a full consumption module as well as detailed information on many livelihood
characteristics, while the tablet-based survey used the rapid consumption methodology (RCM) combined
with a focus on essential questions throughout all other questionnaire modules (Figure 2).
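The final sampling step, splitting the 16 selected households in a cluster into 10 for PAPI and 6 for CAPI, can be sketched as below. The text does not state how the 10 households were chosen, so the random assignment used here is an assumption, and the function name and parameters are hypothetical.

```python
import random


def split_cluster(household_ids, n_papi=10, n_capi=6, seed=None):
    """Randomly split a cluster's selected households between the two surveys.

    Returns (papi_households, capi_households). Random assignment is an
    assumption made for this sketch.
    """
    if len(household_ids) != n_papi + n_capi:
        raise ValueError("expected exactly n_papi + n_capi households per cluster")
    rng = random.Random(seed)
    shuffled = list(household_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_papi], shuffled[n_papi:]


# One cluster with 16 selected households, identified here by 0..15.
papi_hh, capi_hh = split_cluster(range(16), seed=1)
```

Every household ends up in exactly one survey, which is what allows the two samples to be compared within the same EAs.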

                          Figure 1: Characteristics of the paper and tablet-based survey.

Survey   Survey method                 Objective                     Sampling frame                    No. of households per EA
PAPI     Pen-and-paper interviewing    Assess the progress made      5th National sample survey &      10 households per EA
         (PAPI)                        in living standards           Evaluation Programme (NASSEP V)
CAPI     Computer-assisted personal    Compare survey methods &      5th National sample survey &      6 households per EA
         interviewing (CAPI)           inform future surveys         Evaluation Programme (NASSEP V)

                                               Source: Authors’ elaboration.


                              Figure 2: Structure of the questionnaire for both surveys.

PAPI
  Household member roster    Age, gender, education and labor
                             Health, fertility & deaths, child health, ITC service & domestic tourism
  Rent                       Rent values for urban households
  Details of the dwelling    Material, water, sanitation, lighting and cooking
                             Energy use, agriculture holding & output, livestock, enterprises, other
                             income, transfers, shocks, food security, justice, credit & ITC
  Consumption modules        Food and non-food consumption details
  Assets                     Detailed module on assets

CAPI
  Household member roster    Age, gender, education and labor
  Rent                       Rent values for urban households
  Details of the dwelling    Material, water, sanitation, lighting and cooking
  Assets                     Ownership Yes/No
  Consumption modules        Food and non-food consumption details

                                               Source: Authors’ elaboration.

6.      This technical note compares indicators estimated from the paper- and tablet-based
surveys, focusing on consumption and poverty, and provides an assessment of the RCM vis-à-vis the
traditional full consumption module. The setup of the tablet-based pilot allows for direct comparison of
the data collection methods (CAPI vs. PAPI) on estimated indicators. In the context of consumption and
poverty estimates, the paper-based traditional full consumption methodology is compared with the rapid
consumption results, even though the different forms of administration (CAPI vs. PAPI) can influence the
comparison. The results of the analysis can help inform the design of the KCHS, for example whether
consumption can be accurately measured by the rapid consumption methodology, considerably reducing
the administering time of the questionnaire. The note documents the design recommendations and their
justification, which fosters transparency and provides learning resources, with potential impacts beyond
Kenya.







             B.        DATA COLLECTION METHODS

                       - Differences between PAPI (paper) and CAPI (tablet)

7.      PAPI is the traditional method in which enumerators fill out paper questionnaires, while CAPI
refers to interviews being conducted with the assistance of a computer or tablet. In the pen-and-paper
interviewing (PAPI) data collection method, the interviewer or enumerator proceeds question by question,
asking and recording the respondents’ answers on a printed paper template or questionnaire. As a result,
the data needs to be transferred to digital format after the interview has concluded, either through
manual data entry or automatically by scanning. In CAPI, the enumerator uses a tablet, smartphone or
computer preloaded with an electronic version of the questionnaire to record responses directly on the
device (Table 1).
8.       Using CAPI eliminates the need for data entry, allowing for data to be available in real time.
One of the main features of CAPI is that interviews are conducted with computers or tablets and the
answers are immediately entered into the device, which eliminates the need for data entry of each
questionnaire. Completed CAPI interviews can be available immediately, as long as the devices are
connected to a cloud server and submissions are transmitted on a regular basis. This facilitates
real-time processing and regular monitoring of fieldwork by verifying data collection progress and
identifying challenges in the field, such as low-performing enumerators.
9.      Data quality can be improved with CAPI by preventing enumerators from skipping questions
and introducing dynamic checks and constraints. Introducing skip patterns and consistency checks in the
questionnaire can improve the quality of the data collected. Skip patterns, or automated routing, are one
of the most relevant characteristics of CAPI: they reduce the number of incorrect entries by preventing
enumerators from asking a question that should be skipped or, conversely, skipping a question that should
be asked.4 They can decrease the length of the interview, avoid confusing the
respondent and save time when processing the data. They can also improve data quality by reducing item
non-response. CAPI allows complex skip patterns that would be difficult for enumerators to implement
manually. Moreover, dynamic checks and constraints can be included in the tablet-based
questionnaire to flag suspiciously high or low data entries, which the enumerator must confirm
before proceeding. These checks allow data entries to be identified and confirmed while
the enumerator is still with the respondent, where the best contextual knowledge is available.
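
As an illustration of how automated routing and a dynamic check might be expressed when coding a questionnaire, the following is a minimal Python sketch. The question names, the routing rule and the plausible range are hypothetical, and the structure is an assumption for illustration rather than the logic of any particular CAPI package.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Question:
    name: str
    # Routing rule: the question is asked only if this returns True.
    ask_if: Callable[[dict], bool] = lambda answers: True
    # Optional plausible range for numeric answers (soft constraint).
    soft_range: Optional[tuple] = None


def next_question(questions, answers):
    """Return the first unanswered question whose routing rule applies."""
    for q in questions:
        if q.name not in answers and q.ask_if(answers):
            return q
    return None


def needs_confirmation(question, value):
    """Flag a numeric entry outside the plausible range for confirmation."""
    if question.soft_range is None:
        return False
    low, high = question.soft_range
    return not (low <= value <= high)


# Hypothetical three-question module: the quantity is only asked
# when the household reports having consumed the item.
QUESTIONS = [
    Question("consumed_rice"),
    Question("rice_kg",
             ask_if=lambda a: a.get("consumed_rice") == "yes",
             soft_range=(0.1, 50.0)),
    Question("consumed_maize"),
]
```

With this setup, a "no" answer to `consumed_rice` routes the enumerator directly to `consumed_maize`, so the quantity question can neither be asked by mistake nor left blank when it applies.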
10.      Other features of CAPI include recording of date and time, GPS positions and multimedia
content. Conducting interviews with CAPI allows exploiting some of the functionalities available in the
devices used to collect the data. Date and time stamps can be recorded automatically at the beginning
and end of each module, which facilitates monitoring of fieldwork. In addition, recording GPS positions
permits tracking the locations of teams in the field to determine whether interviews are conducted at the
correct locations. Audio, photos and video can also be recorded using the devices employed in the survey.
Other features include complex functions such as randomly allocating a module to a household or
selecting a household for interview among a list of potential respondents.




4 Banks and Laurie 2000; Caeyers, Chalmers, and De Weerdt 2012.






                             Table 1: Main characteristics of data collection methods.

Characteristic            PAPI or paper-based survey                          CAPI or tablet-based survey

Hardware and software     Requires several computers with data-entry          Each enumerator requires a device with the
                          software.                                           survey’s software. Also, a cloud server is
                                                                              needed for centralized management of
                                                                              submissions.

Training of enumerators   In addition to usual topics covered (such as        In addition to usual topics covered (such as
                          sampling, questionnaire content and survey          sampling, questionnaire content and survey
                          objectives among others), the emphasis is on        objectives among others), it concentrates on
                          complex or numerous skip patterns.                  how to use the device and record the answers.

Design of the             The questionnaire and a data entry template         The questionnaire needs to be coded and
questionnaire             must be designed and printed before                 tested before loading it into the devices.
                          fieldwork.

Type of questions         Single option, multiple choice and written          Single option, multiple choice and written
                          responses.                                          responses. In addition, GPS locations, time
                                                                              stamps, multimedia content and other
                                                                              complex operations.

Transportation and        A secure system is designed to transport paper      Interviews are submitted daily to the cloud
storage                   interviews from the field to the offices, and to    server via 3G or Wifi connection, where the
                          store them securely.                                information is stored securely.

Quality controls          Survey managers need to review completed            Real-time and remote monitoring of fieldwork
                          interviews to identify errors.                      is possible, since submitted interviews are
                                                                              available immediately.

Data entry                Responses must be entered in the pre-defined        Responses are immediately entered in a digital
                          template by data-entry operators.                   format as they are recorded with the tablet or
                                                                              computer.

Data processing           The data is ready for processing only after the     The data processing or cleaning can begin
                          interviews have been entered, which typically       immediately after fieldwork has concluded.
                          involves two data-entry exercises and
                          corrections for discrepancies between the two.

Management of             A secure protocol must be put in place to           Respondents can type their own answers
sensitive and personal    handle personal data.                               directly on the device, and completed
data                                                                          interviews are encrypted.


                                             Source: Authors’ elaboration.

11.    Besides improving data quality, migrating from PAPI to CAPI can bring about additional benefits
in terms of cost savings. Using a CAPI data collection method increases the time needed to set up the
survey and the fixed up-front costs from procuring the hardware and hiring staff to code the
questionnaire. However, both time and cost are compensated in CAPI since data entry is no longer
required, and the time needed to conduct the interview and clean the data is reduced.5 The costs in CAPI
can be around 75 percent of those from PAPI.6 In addition, since the largest costs in CAPI are
fixed costs, the method is better suited to large and ongoing surveys like the KCHS.

                     - Evidence for the use of CAPI

12.      CAPI was tested in Europe in the mid-80s and used for the first time to collect data as part of
the Netherlands Labor Force Survey in 1987. While the first Computer-Assisted Telephone Interviews
were implemented in the USA in 1971 by marketing companies, the use of CAPI technology to conduct
face-to-face interviews took longer. CAPI was tested in Europe by Statistics Sweden in 1982 and by
Statistics Netherlands in 1984.7 A few years later, the first nationwide CAPI survey was implemented to
collect all the data from the Netherlands Labor Force Survey in 1987. 8 Since then, the use of CAPI
techniques to conduct surveys has proliferated, pushed by the availability of cheaper computers or tablets
and higher network connectivity.9 More importantly, CAPI has proven to work in most contexts, even with
low infrastructure and institutional capacity.10
13.     CAPI tends to improve data quality compared to collecting data with a paper-based method. A
randomized controlled trial with 1,840 households in Zanzibar comparing data obtained from CAPI and
PAPI surveys found that the PAPI data contained many errors, which are practically eliminated with the
use of CAPI.11 Moreover, these errors in PAPI are not randomly distributed across the sample but
correlated with certain household characteristics, so dropping the affected observations from the analysis
would introduce a bias. Other common benefits from using CAPI compared to PAPI are associated with
increased speed and efficiency, and savings in time and cost from implementing the survey.12
14.     Automated routing or skip patterns are crucial features for reducing errors in a tablet-based
survey. Skip patterns are an essential element of CAPI software. They are configured to ask or skip certain
questions conditional on the respondent’s answer to a previous question. With this functionality, both
the interviewed household and enumerator are guided through the tablet questionnaire. Most errors
found in PAPI due to skip patterns can be avoided in CAPI.13 Also, CAPI leads to less missing data, relative
to PAPI, since enumerators cannot skip some questions, ultimately improving the quality of the
information collected.14
15.      Similarly, soft constraints or consistency checks help reduce the number of incorrect data
entries, especially for consumption data. The number of interviews with impossible or missing values can
be reduced to nearly zero when using CAPI, as a direct result of introducing consistency checks.15 The
impact on data quality is especially relevant for the consumption modules, which are the basis for
estimating poverty rates. In the case of sales and profits data from microenterprises, consistency checks
also help decrease the number of implausible values recorded.16



5 Caeyers, Chalmers, and De Weerdt 2012; Glewwe and Hoang Dang 2008; King et al. 2013; Zhang et al. 2012.
6 Leisher 2014.
7 Danielsson and Maarstad 1982; Bemelmans-Spork and Sikkel 1985.
8 E. de Leeuw and Nicholls 1996.
9 Schräpler, Schupp, and Wagner 2010.
10 Prydz 2013.
11 Caeyers, Chalmers, and De Weerdt 2012.
12 E. D. de Leeuw 2008; Banks and Laurie 2000; Rosero-Bixby et al. 2005.
13 Caeyers, Chalmers, and De Weerdt 2012.
14 Sebestik et al. 1988; Olsen, n.d.; E. D. de Leeuw 2008.
15 Caeyers, Chalmers, and De Weerdt 2012.
16 Fafchamps et al. 2014.






16.      Moreover, respondents’ perceptions tend to improve or remain unchanged with the use of CAPI
technology relative to PAPI. In the early days of CAPI, both interviewers and respondents were found to
have positive opinions about conducting the survey with CAPI technology.17 This
positive attitude was also found among respondents over 70 years of age, who are in general less exposed to
technology.18 Where perceptions do not improve with the migration to CAPI, there are usually no differences in
respondents’ perceptions between CAPI and PAPI.19 Furthermore, when migrating from a PAPI to a CAPI
method, response rates are unlikely to change, as seen in the British Household Panel Survey in 1998.20
Similarly, the migration to CAPI in the German Socio-Economic Panel (SOEP) of 1998 had no effect on
respondents’ acceptance of CAPI or response rates.21




17 Couper and Burt 1994; Nicholls and De Leeuw 1996.
18 Taylor 1998.
19 Caeyers, Chalmers, and De Weerdt 2012.
20 Banks and Laurie 2000.
21 Schräpler, Schupp, and Wagner 2010.






        C.       RESULTS FROM THE TABLET-BASED SURVEY

                 - Use of soft constraints in the food consumption module

17.      Incorrect soft constraints can encourage misreporting of quantities and prices. One of the
advantages of CAPI technology is the use of soft constraints in the questionnaire. They allow data entries
to be confirmed in the field, where the best contextual knowledge is available. The aim is to minimize
the number of incorrect data entries. Nevertheless, soft constraints need to be implemented carefully to
avoid flagging entries as unusual when in fact they are within a normal range, especially when
other surveys are available to inform the level at which to set these thresholds.
18.     The tablet questionnaire included soft constraints in the food consumption module based on
outdated assumptions affecting quantities and prices collected. For each item considered in the
tablet-based survey, the questionnaire included soft constraints for the lower and upper bound of unit prices,
the lower and upper bound of quantities consumed, as well as implicit conversion factors from
non-standard units of quantities reported to kilograms or liters. If the respondent’s answer was above or below
these thresholds, enumerators were prompted with a message saying the quantity/price was too low/high
and they had to confirm or correct the value reported (Box 1). In the CAPI survey, the incorrect use of soft
constraints based on outdated assumptions led to reported values that were higher or lower than the
corresponding prices and quantities from the paper survey.

                  Table 2: Example of upper bound soft constraints for cocoa & cocoa products.

                       Item                   Unit            Minimum value             Maximum value
                                                               (KSh per Kg)              (KSh per Kg)

             Cocoa & cocoa products            Cup                  50                       400

             Cocoa & cocoa products           Grams                 50                       400

             Cocoa & cocoa products         Kilograms               50                       400


               Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

19.       Incorrect upper bound constraints for prices induced lower unit values of food items in the CAPI
survey relative to PAPI. The upper bound of the soft constraint for unit prices was set at a relatively low
level compared to the PAPI data. During fieldwork, enumerators were constantly reporting unit values
beyond this threshold, and thus they were repeatedly asked to confirm the entries. The constant flagging
of values as unusually high, even though incorrect, affected the data collection process and ultimately led
to lower unit prices. For example, for cocoa and cocoa products, the upper bound of the soft constraint
was set at KSh 400 per kilogram, regardless of the unit in which households reported the consumption of
this item (Table 2). As a result, unit prices collected with CAPI range between KSh 56 and KSh 333, compared
with KSh 750 to KSh 857 in the PAPI dataset (Figure 3). A soft constraint of KSh 400 per kilogram is
unrealistically low when considering the PAPI values, and is likely to have induced lower unit values in
CAPI. Around 70 percent of the core food items with the largest price differences between CAPI and PAPI
have median prices in the paper survey that would place them outside of the CAPI soft constraints. Thus,
this is a generalized issue across items that systematically affected unit prices of food items in the
tablet-based survey.
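
To make the cocoa example concrete, the following minimal Python sketch applies the bounds from Table 2 to the endpoints of the unit-price ranges quoted above. The function and variable names are assumptions for illustration; the sketch only reproduces the comparison described in the text and is not the check coded in the questionnaire.

```python
def outside_bounds(value, lower, upper):
    """Return True when a unit price would trigger the soft constraint."""
    return value < lower or value > upper


# Bounds for cocoa & cocoa products from Table 2 (KSh per Kg).
COCOA_LOWER, COCOA_UPPER = 50, 400

# Endpoints of the unit-price ranges quoted in the text (KSh per Kg).
PAPI_PRICES = [750, 857]
CAPI_PRICES = [56, 333]

# Prices that an enumerator would have been asked to confirm.
papi_flagged = [p for p in PAPI_PRICES
                if outside_bounds(p, COCOA_LOWER, COCOA_UPPER)]
capi_flagged = [p for p in CAPI_PRICES
                if outside_bounds(p, COCOA_LOWER, COCOA_UPPER)]
```

Both PAPI endpoints fall above the upper bound, whereas both CAPI endpoints sit inside the bounds, which is consistent with the constraint having shaped the prices recorded on the tablet.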







                               Box 1: Soft constraints in the tablet-based survey

 Soft constraints or consistency checks alert enumerators to unusual values recorded. Including
 consistency checks in the tablet questionnaire helps prevent the collection of incorrect information.
 Enumerators are alerted to unusual responses to certain questions. These checks are set when coding the
 questionnaire and applied automatically during the interview so that entries can be corrected while the
 enumerator is still with the respondent, where the best contextual knowledge is available.
 In such cases, enumerators receive a warning or error message. A warning message flags an unlikely
 response, outlier or a contradictory answer, and asks the enumerator to confirm the response. For
 consumption values, warnings are commonly used to identify quantities and prices not lying within
 reasonable boundaries. An example of this is presented in the figure below from the tablet interface
 flagging an unusually high price for rice. An error message is a stronger type of constraint preventing
 enumerators from moving on to the next question until the issue is corrected.




20.      Non-standard conversion factors to kilograms or liters induced higher quantities consumed in
the tablet survey relative to the paper survey. In the tablet-based survey, households were able to report
the consumption of food items in standard units (kilograms or liters), but also in non-standard units (e.g. a
bowl, bunch or cup). The CAPI questionnaire included automatic conversion factors for these
non-standard units into kilograms or liters, yet they were set at a higher level than the conversion factors
implicit in the PAPI data. For the non-standard unit ‘bowl’, the conversion factor to kilograms in the tablet
questionnaire was the same for all items and corresponded to 0.7 kilograms, which is not in line with the
results from the paper survey. The implicit conversion factor from the paper data ranges between 0.21
and 0.25 kilograms (Table 3), depending on the item considered. The conversion factors included in CAPI
are around 3 times larger than the observed values from PAPI, inducing artificially higher quantities
consumed in the former compared to the latter. In 80 percent of the core food items, the median quantity
purchased in CAPI was equal to or higher than the respective value from PAPI. While the original data
collected was not affected, quantities estimated in non-standard units were systematically affected by the
conversion factors considered.
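
The size of the discrepancy can be checked directly from the values reported in Table 3. The short Python sketch below computes the ratio between the CAPI conversion factor for a bowl and the median quantity implied by the paper data; the variable and function names are assumptions for illustration.

```python
# Conversion factor for one bowl coded in the tablet questionnaire (Kg).
CAPI_BOWL_KG = 0.70

# Median quantity in kilograms implied by the paper data (Table 3).
PAPI_BOWL_KG = {
    "Aromatic unbroken rice": 0.22,
    "Broken white rice": 0.25,
    "Brown rice": 0.21,
    "Millet grain": 0.25,
}

# Ratio of the CAPI factor to the PAPI value, item by item.
ratios = {item: CAPI_BOWL_KG / kg for item, kg in PAPI_BOWL_KG.items()}


def bowls_to_kg(n_bowls, kg_per_bowl):
    """Convert a quantity reported in bowls to kilograms."""
    return n_bowls * kg_per_bowl
```

The ratios fall between roughly 2.8 and 3.3, in line with the "around 3 times larger" stated above. For instance, two bowls of brown rice convert to 1.4 kilograms with the CAPI factor but to 0.42 kilograms with the value implied by the paper data.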

                                Figure 3: Distribution of unit prices for cocoa & cocoa products.




          [Plot: density of the median unit price across EAs (Kshs per Kg), from 0 to 800, for PAPI and CAPI.]



                Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

                             Table 3: Conversion factors to kilogram for non-standard unit bowl.

          Item                          Unit          CAPI: Conversion             PAPI: Median quantity
                                                      factor to kilograms          consumed in kilograms

          Aromatic unbroken rice        Bowl                 0.70                          0.22

          Broken white rice             Bowl                 0.70                          0.25

          Brown rice                    Bowl                 0.70                          0.21

          Millet grain                  Bowl                 0.70                          0.25


                Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

21.      Moreover, incorrect non-standard conversion factors combined with relatively high lower
bound soft constraints for quantities consumed induced higher quantities for items reported in
‘handful’ and ‘piece’ units. The soft constraints on quantities consumed included in the CAPI questionnaire
are set in standard units (kilograms or liters). Hence, quantities reported in non-standard units were
automatically converted to standard units with CAPI technology, using the incorrect conversion factors,
and the soft constraints were then applied to the converted values. The incorrect non-standard conversion
factors together with a relatively high lower bound soft constraint for quantities consumed implied that
quantities reported by households were usually flagged as being too low for items specified in ‘handful’
and ‘piece’ units, ultimately affecting the data collected. For example, the consumption of eggs in the
tablet-based survey was exclusively reported in ‘piece’ units, with a mean quantity consumed of 0.6
kilograms, compared to 0.14 kilograms in the paper data (Figure 4). This is a separate issue from the
previous one described since the data collected was directly affected and adjusting the conversion factor
is unlikely to produce figures equivalent to those from PAPI. However, the issue is concentrated in values
reported in ‘handful’ and ‘piece’ units.
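
The convert-then-validate sequence described above can be sketched as follows. The conversion factor and the lower bound used here are hypothetical values chosen only to illustrate the mechanism; the actual values coded in the tablet questionnaire are not reported in this note.

```python
import math

# Hypothetical values for illustration only; not taken from the questionnaire.
KG_PER_PIECE = 0.06      # assumed conversion factor for one 'piece'
LOWER_BOUND_KG = 0.5     # assumed lower-bound soft constraint (Kg)


def to_kilograms(quantity, factor):
    """Convert a quantity in a non-standard unit to kilograms."""
    return quantity * factor


def flagged_too_low(quantity, factor, lower_bound_kg):
    """Apply the lower-bound soft constraint to the converted quantity."""
    return to_kilograms(quantity, factor) < lower_bound_kg


def smallest_unflagged_count(factor, lower_bound_kg):
    """Smallest whole number of units that avoids the 'too low' warning."""
    return math.ceil(lower_bound_kg / factor)
```

Under these assumed values, a household reporting three pieces would be flagged as too low, and only reports of nine pieces or more would pass without a warning. This illustrates how a lower bound set in kilograms can push quantities reported in small non-standard units upward.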

                                           Figure 4: Distribution of quantity consumed for eggs.




          [Plot: density of the quantity consumed (Kg per AE), from 0 to 2, for PAPI and CAPI.]



                    Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.


                     - Implementation of the rapid consumption methodology

22.     Households were asked about the consumption details of all items in the paper survey but not
in the tablet-based survey, in line with the rapid consumption methodology. PAPI used a full consumption
module asking households about their consumption details of 193 food and 126 nonfood items. In the
rapid consumption methodology, households were only asked about their consumption of 125 food items
(91 core and 34 additional items) and between 79 and 83 nonfood items (58 core and between 21 and 25
additional items, depending on the optional module systematically allocated to the household) (Table 4
and Box 2).22 After data collection, multiple imputation techniques are used to estimate the consumption
aggregates of CAPI households.
23.      Only around 6 percent of the items excluded in the RCM, representing 21 percent of total food
and nonfood consumption, were actually consumed by households. The tablet survey asked households
about their consumption of 149 core items, as well as an additional 34 food items and 21 to 25 nonfood
items, depending on the optional module (Table 4). Only around 6 percent of the items not asked of
households in CAPI with the RCM were actually consumed. These items represent up to 21 percent of
total food and nonfood consumption. Items asked of households represent around 80 percent of the total
food and nonfood expenditure. Furthermore, out of the 68 food items and 43 to 47 nonfood items not asked,
CAPI households only consumed on average around 3.2 to 3.4 and 2.8 to 3.0 respectively. As a result, the


22 The RCM allocates food and nonfood items into a core module and nonoverlapping optional modules according to their consumption shares.
The tablet-based pilot employed a conservative approach by allocating a large number of items to the core module, which was asked of all
households. However, the RCM can be implemented by assigning fewer items to the core module and further reducing the duration of interviews,
while still deriving accurate poverty estimates (U. J. Pape and Mistiaen 2018).






distribution of total expenditure is similar for collected and imputed items, indicating the rapid
consumption methodology produces precise consumption aggregates, while contributing to reduced
fatigue of respondents and the time spent during fieldwork (Figure 5).
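
The item counts quoted in the two paragraphs above can be reconciled with simple arithmetic. The Python sketch below uses only the figures stated in the text; the variable names are assumptions for illustration.

```python
# Item counts stated in the text.
FOOD_TOTAL, NONFOOD_TOTAL = 193, 126      # full consumption module (PAPI)
FOOD_CORE, FOOD_OPTIONAL = 91, 34         # asked per household (CAPI)
NONFOOD_CORE = 58
NONFOOD_OPTIONAL_RANGE = (21, 25)         # depends on the optional module

# Items asked per household in the tablet survey.
food_asked = FOOD_CORE + FOOD_OPTIONAL                                    # 125
nonfood_asked = tuple(NONFOOD_CORE + n for n in NONFOOD_OPTIONAL_RANGE)   # (79, 83)
core_items = FOOD_CORE + NONFOOD_CORE                                     # 149

# Items not asked per household, whose consumption is imputed.
food_not_asked = FOOD_TOTAL - food_asked                                  # 68
nonfood_not_asked = tuple(sorted(NONFOOD_TOTAL - n for n in nonfood_asked))  # (43, 47)
```

The results match the 149 core items, the 68 food items and the 43 to 47 nonfood items not asked that are reported above.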

                                                Box 2: Rapid consumption methodology.


     Poverty is an indicator of paramount importance for gauging socio-economic wellbeing of a
     population. Consumption-based poverty measures, in which the poor are defined as those whose
     consumption level falls below the poverty line (the threshold consumption level for sustaining a
     minimum level of welfare for healthy living), are widely used in the developing world and play a critical
     role in policy decisions.
     Measurement of consumption, however, has traditionally been very time consuming, which is a
     serious issue particularly in the context of a continuous survey. A household consumption
     questionnaire contains a series of questions about the monetary value of each consumption item,
     whether it is purchased, self-produced, or bartered. With around 300 to 400 items, including both food
     and nonfood items, the time for administering the questionnaire often exceeds two hours. In addition
     to the high administration cost due to the long interview time, measurement errors may become
     significant towards the end of the questionnaire as respondents get tired.
     To overcome this challenge, a new methodology can be used in which we combine an innovative
     questionnaire design with standard imputation techniques. The new methodology allows us to
     substantially shorten the consumption questionnaire and reduce the administering time by imputing
     missing consumption values for items that are not explicitly asked. The proposed methodology allows
     accurate poverty estimates to be derived in less than 60 minutes of administering time per household.23
     The rapid consumption survey methodology involves five main steps. First, core items are selected
     based on their importance for consumption. Second, the remaining items are partitioned into optional
     modules. Third, optional modules are randomly assigned to groups of households. Fourth, after data
     collection, consumption of optional modules is imputed for all households. Fifth, the resulting
     consumption aggregate is used to estimate poverty indicators.
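
     The first three steps can be illustrated with a minimal Python sketch. It is not the authors'
     implementation: the function names, the ranking of items by expenditure share, the round-robin
     partition, the per-household random draw and the example shares are all assumptions made for
     illustration, and the imputation step is left out.

```python
import random

def split_items(shares, n_core, n_modules):
    """Split items into a core list and optional modules.

    shares maps item name -> share of total expenditure (hypothetical input).
    """
    # Step 1: the items with the largest expenditure shares form the core.
    ranked = sorted(shares, key=shares.get, reverse=True)
    core, rest = ranked[:n_core], ranked[n_core:]
    # Step 2: deal the remaining items round-robin into optional modules.
    modules = [rest[i::n_modules] for i in range(n_modules)]
    return core, modules

def assign_modules(household_ids, n_modules, seed=0):
    """Step 3: draw one optional module index at random for each household."""
    rng = random.Random(seed)
    return {hh: rng.randrange(n_modules) for hh in household_ids}

# Hypothetical expenditure shares, for illustration only.
shares = {"maize": 0.30, "milk": 0.20, "sugar": 0.15, "tea": 0.10,
          "rice": 0.10, "eggs": 0.08, "salt": 0.04, "honey": 0.03}
core, modules = split_items(shares, n_core=2, n_modules=3)
assignment = assign_modules(["hh1", "hh2", "hh3", "hh4"], n_modules=3)
```

     Each household would then be asked about the core items plus the items in its assigned module;
     consumption of the modules it was not asked about is what the fourth step imputes.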

24.      The median duration of interview in CAPI with the RCM was 162 minutes, with food and
nonfood modules representing nearly two thirds of the total administering time. The rapid consumption
methodology saves time during data collection by not asking every household for the consumption
details of every item. In the tablet survey, the median duration was 59 minutes for the food module
and 37 minutes for the nonfood module. On average, the food and nonfood modules accounted for
38 and 24 percent, respectively, of the total duration of the interview. The RCM reduces time during
fieldwork by asking each household about only a subset of items, which is worthwhile because
i) consumption modules tend to be time-consuming; and ii) the consumption aggregates can be
precisely estimated using multiple imputation techniques after fieldwork.




23 U. Pape and Mistiaen 2015a; U. J. Pape and Mistiaen 2018b.






                                          Table 4: Allocation of consumption items into modules.

                              Food                                              Nonfood
Module      Total no.   Share of total    Mean no. of       Total no.   Share of total    Mean no. of
            of items    expenditure (%)   items consumed    of items    expenditure (%)   items consumed

Core        91          57                17.3              58          14                11.7
Module 1    34          6                 1.7               22          3                 1.3
Module 2    34          7                 1.7               21          5                 1.5
Module 3    34          7                 1.5               25          3                 1.5


                                        Source: Authors’ calculation based on the tablet-based pilot.


                             Figure 5: CAPI total food and nonfood expenditure collected and imputed.

        [Plot: % of population (vertical axis, 0 to 100) against monthly per adult equivalent food and
        nonfood expenditure, deflated (horizontal axis, 0 to 20,000). Series: Core; Core + assigned
        module; Imputed.]


                                        Source: Authors’ calculation based on the tablet-based pilot.







         D.          RESULTS FROM THE COMPARISON OF CAPI VS. PAPI
25.     The data from both surveys was processed in almost identical ways. Both surveys are based on
the same sampling frame and design. Data was collected in the same randomly drawn geographic
locations or enumeration areas. The data collected with CAPI was validated and cleaned in a similar way
as the KIHBS 2015/16 data. Moreover, the sampling weights for households included in the CAPI survey
are estimated from the weights of households in the same enumeration area (EA) in the PAPI survey
(Annex).

                     - Population and household characteristics

26.      Easy-to-verify dwelling characteristics show no statistical difference between the paper-based
survey and tablet-based survey. The details of dwelling characteristics were asked in a similar sequence
in both the paper and tablet survey and yield no significant differences. The materials of the floor, roof
and walls are not significantly different between the paper and tablet survey, either at the national level
or for rural and urban areas (Table 5).

                                           Table 5: Dwelling characteristics.

                                                  National                Urban                   Rural
Characteristic                                    CAPI    PAPI    Diff.   CAPI    PAPI    Diff.   CAPI    PAPI    Diff.

Roof: Corrugated iron sheets (%)                  80.8    81.7    No      75.3    77.0    No      83.9    84.3    No
Roof: Concrete (%)                                7.5     6.7     No      19.8    17.8    No      0.5     0.3     No
Roof: Other (%)                                   11.7    11.6    No      4.9     5.2     No      15.6    15.3    No
Floor: Cement (%)                                 48.3    47.3    No      73.0    72.6    No      34.2    32.8    No
Floor: Carpet or ceramic (%)                      7.5     7.1     No      17.0    15.9    No      2.0     2.0     No
Floor: Earth, sand & other (%)                    44.2    45.6    No      10.0    11.5    No      63.8    65.2    No
Walls: Wood planks or shingles (%)                8.4     8.6     No      2.2     2.1     No      12.0    12.4    No
Walls: Corrugated iron sheets (%)                 7.3     8.3     No      12.7    15.8    No      4.3     4.0     No
Walls: Mud & stone or bamboo with wood (%)        32.5    32.9    No      8.8     8.4     No      46.2    46.9    No
Walls: Stone w/lime, cement, bricks & other (%)   51.7    50.2    No      76.2    73.7    No      37.5    36.7    No

                  Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                                   Significance level: 1% (***), 5% (**), and 10% (*).

27.     Water sources, sanitation, lighting and cooking fuel of households are also not significantly
different between the paper and tablet survey. Similar to the previous result, there are no significant
differences at the national level in point estimates for water sources, type of sanitation, lighting or
cooking source between households included in the paper and tablet-based survey (Table 6). For urban
and rural areas, there are some differences between paper and tablet, mainly in categories that refer to
other water sources or cooking fuels. However, it seems unlikely that the differences are associated with
the data collection method employed, since some CAPI estimates are larger than the PAPI estimates
while others are smaller, suggesting the lack of a systematic bias.

                                 Table 6: Water, sanitation, cooking and lighting source.

                                                      National                Urban                   Rural
Characteristic                                        CAPI    PAPI    Diff.   CAPI    PAPI    Diff.   CAPI    PAPI    Diff.

Water: Piped into dwelling, plot or yard (%)          30.8    30.3    No      54.6    52.4    No      17.0    17.6    No
Water: Public tap or stand pipe (%)                   13.4    13.9    No      24.0    24.1    No      7.3     8.1     No
Water: Tubewell or borehole with pump (%)             7.4     6.6     No      6.1     3.9     No      8.2     8.1     No
Water: Protected well or spring (%)                   15.0    15.6    No      3.5     4.3     No      21.6    22.1    No
Water: Unprotected well or spring & other (%)         13.8    15.1    No      9.0     12.2    -3.3**  16.6    16.7    No
Water: nature/rain (%)                                19.6    18.5    No      2.9     3.1     No      29.2    27.3    No
Sanitation: VIP or pit latrine with slab (%)          47.1    46.0    No      44.3    43.5    No      48.7    47.5    No
Sanitation: pit latrine without slab or open pit (%)  24.1    25.1    No      6.7     6.8     No      34.1    35.6    No
Sanitation: flush & other (%)                         28.8    28.9    No      48.9    49.7    No      17.3    16.9    No
Lighting: is electricity (%)                          41.5    41.4    No      80.0    80.0    No      19.4    19.3    No
Lighting: is paraffin or pressure lamp (%)            35.6    35.0    No      14.1    13.3    No      48.0    47.5    No
Lighting: is other (%)                                22.9    23.6    No      5.9     6.7     No      32.6    33.2    No
Cooking: charcoal (%)                                 15.5    14.6    No      26.8    23.3    3.5**   8.9     9.6     No
Cooking: kerosene (%)                                 13.7    14.0    No      32.7    32.8    No      2.8     3.1     No
Cooking: LPG & other (%)                              15.8    16.9    No      32.6    37.2    No      6.2     5.2     1.1*
Cooking: firewood (%)                                 55.0    54.6    No      7.9     6.8     No      82.1    82.1    No

                   Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                                    Significance level: 1% (***), 5% (**), and 10% (*).

28.      Age, gender, literacy and educational characteristics from the tablet survey are not statistically
different from those from the paper survey. The household roster is asked early in both paper and tablet
interviews following the same set of questions. When comparing the tablet pilot against KIHBS 2015/16,
there are only small differences for certain 5-year age groups, representing relatively small shares of the
population. Moreover, the share of Kenyans that have ever attended school, the adult literacy rate and
school enrolment for the population aged 6-13 are similar for both paper and tablet surveys (Table 7).





29.     In line with previous findings, the profile of the household head is the same regardless of the
data collection method considered. The age, gender and education of the household head are also
equivalent between the tablet-based survey and KIHBS 2015/16 for urban areas and at the national level
(Table 8). For rural areas, the share of women as the head of the household is slightly larger in the
paper-based survey.

                                     Table 7: Population by age, gender and education.

                                        National                             Urban                                Rural
      Characteristic
                             CAPI         PAPI        Diff.       CAPI        PAPI        Diff.       CAPI        PAPI        Diff.

Share of women (%)            50.8        50.6         No         51.4        49.6        1.8**        50.5        51.1        No

0-4 Years (%)                 13.5        13.4         No         13.9        13.0         No          13.4        13.5        No

5-9 Years (%)                 14.1        14.3         No         11.8        11.6         No          15.1        15.5        No

10-14 Years (%)               13.7        13.3        0.5*        10.3         9.3         No          15.2        14.8        No

15-19 Years (%)               11.0        11.1         No            9.3       8.6         No          11.7        12.1        No

20-24 Years (%)               8.4          9.0       -0.6**       11.1        12.8        -1.7**       7.2         7.4         No

25-29 Years (%)               8.4          8.1         No         12.8        12.5         No          6.6         6.3         No

30-34 Years (%)               6.7          6.8         No            9.1      10.0         No          5.7         5.5         No

35-39 Years (%)               5.7          5.5         No            7.2       6.9         No          5.1         4.9         No

40-44 Years (%)               4.4          4.5         No            4.9       4.8         No          4.2         4.3         No

45-49 Years (%)               3.3          3.2         No            3.4       3.3         No          3.2         3.2         No

50-54 Years (%)               2.8          2.7         No            2.3       2.7         No          3.0         2.8         0.2*

55-59 Years (%)               2.2          2.4         No            1.4       1.7         No          2.6         2.7         No

60-64 Years (%)               2.2          2.1         No            1.3       1.2         No          2.5         2.5         No

65-69 Years (%)               1.1          0.9       0.2***          0.6       0.3       0.3***        1.3         1.1         0.2*

70-74 Years (%)               0.9          0.9         No            0.3       0.3         No          1.2         1.2         No

75-79 Years (%)               0.7          0.6         No            0.3       0.3         No          0.8         0.8         No

80-84 Years (%)               0.5          0.5         No            0.1       0.2         No          0.6         0.6         No

85+Years (%)                  0.5          0.6         No            0.2       0.3         No          0.7         0.7         No

Ever attended school (%)      89.3        89.5         No         94.7        94.9         No          87.0        87.4        No

Adult literacy (% of 25+ years)    66.1     66.9         No         81.1        82.5         No          62.1        62.7        No

Enrolment rate (% of 6-13 years)   94.9     94.4         No         96.5        96.2         No          94.4        93.9        No

                  Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                                   Significance level: 1% (***), 5% (**), and 10% (*).

30.     Additionally, imputed monthly rent expenditure for urban households is not statistically
different between the paper and tablet survey. Information on rent expenditure was collected in both
surveys. However, the data is not available for those households that own their dwelling. Thus, rent
expenditure was imputed by estimating a stepwise log-linear Ordinary Least Squares (OLS) regression of
reported rents on housing characteristics (Annex). The average imputed monthly rent expenditure for
urban households from KIHBS 2015/16 is KSh 3,741, compared to KSh 4,026 generated from the CAPI data
and regression model. However, the difference between the two is not statistically significant.

                                          Table 8: Characteristics of the household head.

                                            National                Urban                   Rural
Characteristic                              CAPI    PAPI    Diff.   CAPI    PAPI    Diff.   CAPI    PAPI    Diff.

Share of women (%)                          32.2    32.4    No      28.8    26.8    No      34.2    35.7    -1.4*
Average age (years)                         43.47   43.43   No      37.40   37.27   No      46.95   46.97   No
Education: None or other (%)                13.9    13.2    No      5.1     4.6     No      18.7    18.2    No
Education: Pre-primary or primary (%)       46.0    46.1    No      32.8    34.1    No      53.2    53.0    No
Education: Secondary or tertiary (%)        40.1    40.6    No      62.0    61.3    No      28.0    28.8    No

                    Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                                     Significance level: 1% (***), 5% (**), and 10% (*).


                       - Consumption aggregates and poverty estimates

31.     Various adjustments are made to obtain comparable consumption aggregates between the
paper and the tablet survey. To compare the paper and tablet surveys in terms of consumption and
poverty incidence, equivalent consumption aggregates are created with some adjustments to both
datasets. The comparable consumption aggregates are obtained first by considering the same food and
nonfood items included in both surveys. Given the issue with prices collected in CAPI, the tablet aggregate
was derived using the median PAPI prices for each item and EA. Moreover, the conversion factors of food
items to standard units in CAPI are replaced by the median implicit conversion factor for each item and
non-standard unit from PAPI. Finally, other minor adjustments are introduced to produce equivalent
consumption aggregates between both surveys (Table 13 in the Annex).
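
The price adjustment described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (EA, item, price), the function names and the example prices are hypothetical, and the sketch does not cover what happens when an item has no PAPI price in a given EA.

```python
from collections import defaultdict
from statistics import median

def median_prices(papi_records):
    """Median unit price per (EA, item) from paper-survey price records."""
    grouped = defaultdict(list)
    for ea, item, price in papi_records:
        grouped[(ea, item)].append(price)
    return {key: median(values) for key, values in grouped.items()}

def value_of_consumption(capi_quantities, prices):
    """Value tablet-survey quantities (in standard units) at the PAPI medians."""
    return sum(qty * prices[(ea, item)] for ea, item, qty in capi_quantities)

# Hypothetical price records: (EA, item, price per standard unit).
papi = [("EA1", "maize", 50.0), ("EA1", "maize", 60.0), ("EA1", "maize", 90.0),
        ("EA1", "milk", 40.0), ("EA1", "milk", 44.0)]
prices = median_prices(papi)

# Hypothetical quantities reported by one tablet-survey household.
total = value_of_consumption([("EA1", "maize", 2.0), ("EA1", "milk", 1.5)], prices)
```

Using the median rather than the mean keeps a single outlying price report in an EA from shifting the value assigned to every tablet-survey household in that EA.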
32.     A core food consumption aggregate from information collected for all households in both
surveys allows a direct comparison of the data collection methods. The first consumption aggregate
considers a subset of 91 food consumption items, which are classified under the rapid consumption
methodology as core items given their relatively high consumption share (Box 2). The consumption details
of these 91 items were collected from all households directly in both the paper and tablet survey. This
allows a direct comparison of PAPI against CAPI, as the consumption modules of the two surveys do not
differ for these items. To obtain the poverty status of households, the core food consumption
aggregate was then compared against a national food poverty line, which was rescaled using the PAPI
share of core food items in the total comparable food consumption aggregate.24
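
As a rough illustration of the rescaling, the sketch below multiplies a poverty line by the share of a sub-aggregate in the full aggregate. It assumes the share is the ratio of weighted totals across PAPI households, which the text does not specify; the function names and all numbers are hypothetical.

```python
def papi_share(subset_values, full_values, weights):
    """Share of a sub-aggregate in the full aggregate (ratio of weighted totals)."""
    subset_total = sum(w * v for v, w in zip(subset_values, weights))
    full_total = sum(w * v for v, w in zip(full_values, weights))
    return subset_total / full_total

def rescale_line(poverty_line, share):
    """Scale a poverty line down to match a narrower consumption aggregate."""
    return poverty_line * share

# Hypothetical PAPI households: core food and total comparable food consumption.
core_food = [30.0, 60.0]
total_food = [50.0, 100.0]
weights = [1.0, 1.0]

share = papi_share(core_food, total_food, weights)
core_line = rescale_line(2000.0, share)  # 2000.0 is a made-up food poverty line
```

Rescaling in this way keeps the line and the aggregate it is compared against on the same footing, so a household is not classified as poor merely because fewer items were counted.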
33.    Two additional consumption aggregates, total food consumption and total food and nonfood
expenditure, employed the rapid consumption methodology for CAPI. The paper survey included a full
consumption module, while CAPI households were only asked about their consumption of certain food
and nonfood items. Thus, multiple imputation techniques are employed to estimate total food
consumption and total expenditure on food and nonfood items from the tablet-based survey (Box 2 and

24 Olson Lanjouw and Lanjouw 2001.






Annex).25 Total food consumption was compared against the national food poverty line to estimate the
share of the population that is unable to meet minimum basic food consumption needs.26 The list of
nonfood items in the tablet-based survey excluded some items that are considered in KIHBS 2015/16
and in the computation of the national poverty line. Therefore, a comparable total expenditure aggregate
(from food and nonfood items) for CAPI and PAPI was compared against a rescaled national poverty
line, using the PAPI share of items in the comparable aggregate from the total KIHBS 2015/16 aggregate
as the scaling factor.

                                                       Table 9: Poverty estimates.

                        National                              Urban                                 Rural
                        CAPI           PAPI           Diff.   CAPI           PAPI           Diff.   CAPI           PAPI           Diff.

Core food consumption
Poverty incidence (%)   33.0 (0.746)   33.2 (0.648)   No      26.9 (1.505)   26.6 (1.290)   No      35.7 (0.834)   35.8 (0.723)   No
Poverty depth (%)       10.1 (0.328)   9.9 (0.266)    No      8.4 (0.694)    8.1 (0.441)    No      10.9 (0.359)   10.7 (0.321)   No
Poverty severity (%)    4.7 (0.226)    4.4 (0.166)    No      4.1 (0.532)    3.6 (0.248)    No      5.0 (0.228)    4.8 (0.208)    No

Total food consumption
Poverty incidence (%)   32.4 (0.736)   33.8 (0.640)   No      27.4 (1.568)   27.4 (1.264)   No      34.5 (0.805)   36.3 (0.721)   -1.8*
Poverty depth (%)       10.2 (0.326)   9.8 (0.248)    No      9.2 (0.713)    8.3 (0.432)    No      10.6 (0.355)   10.4 (0.296)   No
Poverty severity (%)    4.4 (0.210)    4.3 (0.145)    No      4.2 (0.522)    3.7 (0.234)    No      4.5 (0.201)    4.5 (0.179)    No

Total food and nonfood consumption
Poverty incidence (%)   36.2 (0.736)   37.1 (0.667)   No      31.7 (1.592)   31.1 (1.472)   No      38.1 (0.801)   39.5 (0.719)   No
Poverty depth (%)       11.8 (0.344)   11.1 (0.269)   No      11.5 (0.759)   9.8 (0.552)    1.7*    12.0 (0.372)   11.7 (0.304)   No
Poverty severity (%)    5.3 (0.237)    4.9 (0.163)    No      5.6 (0.571)    4.5 (0.310)    1.1*    5.2 (0.235)    5.1 (0.190)    No


                   Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                    Standard errors in parentheses. Significance level: 1% (***), 5% (**), and 10% (*).
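
Poverty incidence, depth and severity are conventionally computed as the Foster-Greer-Thorbecke (FGT) indices with exponents 0, 1 and 2. The report does not spell out its formula here, so the Python sketch below assumes that convention; the function name, the weights and the consumption values are hypothetical.

```python
def fgt(consumption, weights, line, alpha):
    """Weighted FGT index: alpha=0 incidence, alpha=1 depth, alpha=2 severity."""
    total_weight = sum(weights)
    index = 0.0
    for c, w in zip(consumption, weights):
        if c < line:  # poor: consumption strictly below the poverty line
            index += w * ((line - c) / line) ** alpha
    return index / total_weight

# Hypothetical per adult equivalent consumption for four equally weighted people.
consumption = [50.0, 100.0, 150.0, 200.0]
weights = [1.0, 1.0, 1.0, 1.0]
line = 100.0

incidence = fgt(consumption, weights, line, 0)
depth = fgt(consumption, weights, line, 1)
severity = fgt(consumption, weights, line, 2)
```

Multiplying each index by 100 gives the percentage form used in the table. Depth and severity put more weight on people far below the line, so they can differ between surveys even when incidence does not.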

34.     Poverty estimates are similar between CAPI and PAPI based on core food consumption, which
allows isolating the effect of data collection method. At the national level, poverty incidence from a core
food consumption aggregate is 33.0 percent for the tablet-based pilot and 33.2 percent for the paper
survey. The differences between estimates from the two surveys are not statistically significant at the
national level or for urban and rural areas (Table 9). Poverty estimates from core food consumption
allow isolating the effect of data collection method, as there are no differences in the consumption


25 The nonfood component excludes rent, energy and educational expenses.
26 The food poverty line was defined considering basic food items which attain the 2,250 Kcal minimum nutritional requirements (Kenya National Bureau of Statistics 2018).






 modules of CAPI and PAPI for core food items. Collecting household-level consumption data with CAPI
 results in poverty estimates equivalent to those obtained from a paper survey.
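
 A reader can roughly check the significance flags in Table 9 with a two-sample z-test on the reported
 estimates and standard errors. The sketch below is an approximation, not the authors' procedure: it
 assumes the two estimates are independent and normally distributed, even though both surveys share
 the same enumeration areas. The function names are hypothetical.

```python
import math

def diff_test(est_a, se_a, est_b, se_b):
    """Difference of two estimates with a two-sided z-test, assuming independence."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return diff, z, p_value

def stars(p_value):
    """Significance flag in the notation of the table notes."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

# National poverty incidence from core food consumption (CAPI vs. PAPI, Table 9).
_, _, p_core = diff_test(33.0, 0.746, 33.2, 0.648)
# Urban poverty depth from total food and nonfood consumption (Table 9).
_, _, p_urban_depth = diff_test(11.5, 0.759, 9.8, 0.552)

flag_core = stars(p_core)
flag_urban_depth = stars(p_urban_depth)
```

 Under these assumptions the first comparison is not significant and the second is significant only at
 the 10 percent level, in line with the flags shown in Table 9.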

 Figure 6: Total food and nonfood expenditure for rural areas.
 Figure 7: Total food and nonfood expenditure for urban areas excluding Nairobi.

 [Both plots: % of population (vertical axis, 0 to 100) against monthly per adult equivalent total
 expenditure, deflated (horizontal axis, 0 to 12,000 in Figure 6 and 0 to 20,000 in Figure 7).
 Series: CAPI; PAPI.]



                                Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

 35.      The rapid consumption methodology with CAPI produces poverty point estimates that are
 statistically indistinguishable from those of a full consumption module in PAPI. The point estimates of
 poverty incidence, depth and severity derived from i) total food consumption and ii) total food and nonfood
 expenditure, both of which rely on the rapid consumption methodology for CAPI, are statistically equivalent
 between KIHBS 2015/16 and the tablet-based pilot at the national level (Table 9 and Box 3 in the Annex). 27
 In urban areas, poverty incidence is also similar between CAPI and PAPI for both consumption aggregates,
 as are poverty depth and severity from total food consumption. Poverty depth and severity are slightly
 different between surveys when considering total food and nonfood expenditure in urban areas. In rural
 areas, poverty incidence from total food consumption is also slightly different between surveys, with all
 other poverty measures being equivalent between CAPI and PAPI. The relatively small differences in some
 poverty measures for urban and rural areas, all at the 10 percent level of significance, do not seem to be
 associated with the choice of survey method and consumption module, since there is no evidence of a
 systematic bias when comparing the paper and tablet estimates for the two consumption aggregates that
 employed the RCM for CAPI. Overall, the poverty estimates from CAPI with a rapid consumption
 methodology are mostly equivalent to those derived from the KIHBS 2015/16 paper survey.
 36.     In addition, the distribution of total food and nonfood expenditure for rural areas is similar
 between PAPI with a full consumption module and CAPI with a rapid consumption methodology.
 Despite the difference in the number of items asked of respondents in the full consumption module and the
 RCM, the distribution of total food and nonfood expenditure for rural areas is equivalent between PAPI
 and CAPI (Figure 6). A two-sample Kolmogorov–Smirnov test indicates there are no differences in the




 27 Food poverty is slightly higher from KIHBS 2015/16 (32 percent) because the comparable consumption aggregate excluded a few items and the
 same national food poverty line was considered. Similarly, poverty from a total food and nonfood consumption aggregate is different from the
 absolute poverty rate from KIHBS 2015/16 (36.1 percent) because the consumption aggregates only considered items that were included in both
 surveys and excluded rent, energy and educational expenses.






distribution of total food and nonfood expenditure between CAPI and PAPI for rural areas. 28 CAPI
technology combined with a rapid consumption methodology produces total food and nonfood
expenditure results consistent with those obtained from a full consumption module in KIHBS 2015/16. In line
with this, the Gini inequality index in rural areas is not statistically different between CAPI and PAPI for
any of the three consumption aggregates considered (Table 10).
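
The two-sample Kolmogorov–Smirnov comparison described above can be illustrated with a minimal sketch. It is not the code used for the report: it ignores sampling weights and survey design, the function names are hypothetical, the expenditure values are invented for illustration, and the p-value uses a standard large-sample approximation for the one-sided statistic.

```python
import math


def ks_one_sided(x, y):
    """Return (d_plus, d_minus) for two unweighted samples.

    d_plus  = largest value of F_x(v) - F_y(v); large when x tends to hold smaller values than y
    d_minus = largest value of F_y(v) - F_x(v); large when x tends to hold larger values than y
    F_x and F_y are the empirical cumulative distribution functions.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d_plus = d_minus = 0.0
    for v in sorted(set(xs) | set(ys)):
        # Advance both counters to the number of observations <= v.
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        diff = i / n - j / m
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return d_plus, d_minus


def one_sided_p_value(d, n, m):
    """Large-sample approximation: P(D > d) is roughly exp(-2 * n*m/(n+m) * d^2)."""
    return math.exp(-2.0 * n * m / (n + m) * d * d)


# Hypothetical monthly per adult equivalent expenditures (KSh), for illustration only.
papi = [2100, 3400, 4800, 5200, 7600, 9100]
capi = [2000, 3500, 4700, 5400, 7500, 9300]
d_plus, d_minus = ks_one_sided(papi, capi)
p_plus = one_sided_p_value(d_plus, len(papi), len(capi))
p_minus = one_sided_p_value(d_minus, len(papi), len(capi))
```

Large p-values for both one-sided statistics, as reported in the footnotes for rural areas and for urban areas excluding Nairobi, mean that neither direction of difference between the two distributions can be detected.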

                                                    Table 10: Gini inequality index.

 Consumption aggregate     National level            Urban areas               Urban areas               Rural areas
                           excluding Nairobi         excluding Nairobi
                           CAPI    PAPI    Diff.     CAPI    PAPI    Diff.     CAPI    PAPI    Diff.     CAPI    PAPI    Diff.

 Core food consumption     32.3    32.2    No        31.5    31.8    No        29.9    31.2    -1.2*     31.4    31.4    No
 Total food consumption    33.3    34.2    -0.9**    32.9    36.0    -3.0*     31.6    35.2    -3.6***   32.0    32.2    No
 Total food and nonfood    34.1    35.2    -1.1**    33.1    35.5    -2.4*     32.2    35.1    -2.8***   32.3    32.8    No
 consumption

                   Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.
                                    Significance level: 1% (***), 5% (**), and 10% (*).

37.      In urban areas, some differences at the top of the consumption distribution between surveys do not
seem to be associated with the choice of data collection method and consumption module. The distribution
of total food and nonfood consumption for urban areas is similar between PAPI and CAPI for expenditure
values below KSh 9,500 (Figure 8 in the Annex). Higher nonresponse rates in wealthier areas such as Nairobi
can potentially distort consumption at the top of the distribution. KIHBS 2015/16 had a lower response
rate than the tablet-based pilot in every province, with the largest difference in Nairobi (77
percent for PAPI vs. 99 percent for CAPI; Table 11 in the Annex). After excluding the capital city, the
distribution of total expenditure for urban areas is similar between the paper and the tablet survey (Figure
7). The two-sample Kolmogorov–Smirnov test for urban areas without Nairobi indicates there are no
differences in the distribution of total food and nonfood expenditure between CAPI and PAPI.29 The Gini
inequality index is statistically equivalent for the core food consumption aggregate in urban areas
without Nairobi, and only slightly different, at the 10 percent level of significance, for total food
consumption and total food and nonfood expenditure (Table 10). Furthermore, the Gini inequality index
increases for both CAPI and PAPI after excluding Nairobi, while the difference between surveys decreases
for all the consumption aggregates. This ultimately suggests that differences in the urban Gini index
between CAPI and PAPI are more likely to be explained by different response rates in other urban areas
beyond the capital city, and less likely to be associated with the data collection method and the
consumption module considered (Table 10).




28 The two-sample Kolmogorov–Smirnov test verifies the equality of distributions considering the null hypotheses that the distribution of total
food and nonfood expenditure for PAPI i) contains smaller values than for CAPI; and ii) contains larger values than for CAPI. The associated p-
values for rural areas are 0.990 and 0.779 respectively.
29 For urban areas excluding Nairobi, the p-values for the null hypotheses of the two-sample Kolmogorov–Smirnov test are 0.961 and 0.779.






              E.         CONCLUSIONS AND RECOMMENDATIONS
38.      The tablet-based pilot produced point estimates that are statistically indistinguishable from those
of the paper survey in terms of household and population characteristics. Easy-to-verify dwelling
characteristics as well as water sources, sanitation, lighting and cooking fuel of households show no
statistical differences at the national level between the paper-based survey and the tablet-based survey.
Gender, literacy, education and other characteristics of the household head are also similar in the
paper and tablet-based surveys.
39.     The rapid consumption methodology with CAPI produces similar consumption and poverty
results to a full consumption module in PAPI. The point estimates of poverty incidence, depth and
severity derived from i) total food consumption and ii) total food and nonfood expenditure, both of which
rely on the rapid consumption methodology for CAPI, are statistically equivalent between KIHBS 2015/16 and
the tablet-based pilot at the national level. In addition, the distribution of total food and nonfood
expenditure is similar between CAPI and PAPI for rural areas and for urban areas excluding Nairobi,
regardless of whether it is generated from a full consumption module or the rapid consumption methodology.
40.      Using the rapid consumption methodology and CAPI would facilitate providing frequent, timely
and high-quality data for the KCHS. Including the RCM in the KCHS can considerably reduce the
time needed to administer the questionnaire, while producing household, population, consumption and
poverty estimates that are not statistically significantly different from those using a paper survey with a
full consumption module. In addition, collecting data with CAPI has the potential to improve data quality,
eliminates the need for data entry, supports near real-time monitoring of data collection and reduces the
time between fieldwork and analysis. CAPI can help close data gaps in the collection and
dissemination of poverty data. 30 Even in a context of conflict and violence, technological innovations
based on CAPI have proven successful in establishing a survey infrastructure to obtain valid and
reliable information.31
41.     Implementing the KCHS with the RCM and CAPI requires attention to other aspects in order to leverage
their benefits. CAPI increases the time needed to design the questionnaire, which must be coded and
tested thoroughly before data collection. The tablet or device, together with
the software and hardware, would need to be procured and tested in advance. In addition, the training of
enumerators will need to emphasize how to use the tablets and how to record answers correctly. A successful
monitoring system would need to ensure appropriate data management and transfer from the devices to
the cloud server and define how to provide timely feedback to teams in the field. Additionally, training
and capacity building will be needed to derive total consumption aggregates with multiple imputation
techniques as part of the rapid consumption methodology, in addition to providing documentation and
support to data users.




30   Serajuddin et al. 2015.
31   U. J. Pape and Parisotto 2019.






      ANNEX


      1.       Additional figures and tables



                                Box 3: Measures of poverty and inequality

The poverty incidence is the most common poverty measure. The poverty incidence, or headcount
ratio, is the share of the population that is poor, that is, whose total consumption is lower than the
poverty line. It is derived from the total consumption of the household on food, non-food and durable
goods, the number of members that comprise the household and a specific consumption threshold or
poverty line. This measure describes the extent of poverty in a country or region.

The poverty gap index measures how far poor households are from overcoming poverty, while the
poverty severity index measures the level of inequality among the poor. The poverty gap index is
the shortfall of current consumption below the poverty line, expressed as a proportion of the poverty
line, for the poor population. It can be interpreted as the minimum amount of resources that would have
to be transferred to the poor, under a perfect targeting scheme, to eradicate poverty. The poverty
severity index is estimated from the square of the poverty gap. It attributes a larger weight to the
poorest among the poor, thus reflecting inequality conditions for the poor.
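
The three poverty measures above can be written as one family of indices (commonly known as Foster-Greer-Thorbecke measures) that differ only in the power applied to the relative shortfall. The sketch below is illustrative and assumes weighted person-level data; the function name, the consumption values and the poverty line of 100 are hypothetical and are not taken from the surveys.

```python
def fgt(consumption, weights, poverty_line, alpha):
    """Weighted poverty index of order alpha.

    alpha = 0 gives the poverty incidence (headcount ratio),
    alpha = 1 gives the poverty gap index,
    alpha = 2 gives the poverty severity index.
    The non-poor contribute zero to the sum but count in the population total.
    """
    total_weight = sum(weights)
    accumulated = 0.0
    for c, w in zip(consumption, weights):
        if c < poverty_line:
            relative_gap = (poverty_line - c) / poverty_line
            accumulated += w * relative_gap ** alpha
    return accumulated / total_weight


# Hypothetical per adult equivalent consumption for four equally weighted people.
consumption = [50, 75, 100, 200]
weights = [1, 1, 1, 1]
incidence = fgt(consumption, weights, poverty_line=100, alpha=0)  # 0.5
gap = fgt(consumption, weights, poverty_line=100, alpha=1)        # 0.1875
severity = fgt(consumption, weights, poverty_line=100, alpha=2)   # 0.078125
```

In this example two of the four people fall below the line, so the incidence is 50 percent; the person consuming 50 contributes four times as much to severity as the person consuming 75, which is how the index gives more weight to the poorest.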

While poverty measures absolute deprivation with respect to a given threshold, inequality is a
relative measure indicating how little some parts of a population have relative to the entire population.
In the context of monetary poverty, equality can be defined as an equal distribution of resources across
the population. This means that each share of the population owns the same share of
consumption/income. The Lorenz Curve compares the cumulative share of the population with their
cumulative share of consumption/income. A perfectly equal distribution is indicated by a diagonal. The
other extreme is complete inequality where one individual owns all the consumption/income. These
two (theoretical) extremes define the boundaries for observed inequality.
The Gini index is the most commonly used measure of inequality. A Gini index of 0 indicates perfect
equality while 100 corresponds to complete inequality. The Gini index is associated with the Lorenz curve
as it measures the area between the distribution of consumption or income represented by the Lorenz
Curve and the diagonal line that implies perfect equality.
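
As an illustration of the link between the Lorenz curve and the Gini index, the sketch below builds the Lorenz curve from sorted, weighted observations and converts the area under it into an index between 0 and 100. It is a simplified example with a hypothetical function name and made-up data, not the estimation routine used for Table 10.

```python
def gini(values, weights=None):
    """Gini index (0 to 100) from the area under the Lorenz curve.

    The Lorenz curve is traced by sorting observations from poorest to richest
    and accumulating population and consumption shares; the area under it is
    summed with the trapezoid rule. The index equals 100 * (1 - 2 * area).
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))
    total_weight = sum(w for _, w in pairs)
    total_value = sum(v * w for v, w in pairs)
    cumulative_value = 0.0
    area = 0.0
    for v, w in pairs:
        previous = cumulative_value
        cumulative_value += v * w
        # Trapezoid between two consecutive points of the Lorenz curve.
        area += (w / total_weight) * (previous + cumulative_value) / (2.0 * total_value)
    return 100.0 * (1.0 - 2.0 * area)


equal = gini([5, 5, 5, 5])          # everyone consumes the same
concentrated = gini([0, 0, 0, 10])  # one person holds all consumption
```

With equal consumption the Lorenz curve coincides with the diagonal and the index is 0. With one of four people holding everything the index is 75, the maximum attainable with four observations; it approaches 100 as the population grows.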







                        Figure 8: Total food and nonfood expenditure for urban areas.

              [Figure 8 plot: % of population (y-axis, 0 to 100) against monthly per adult equivalent total
              expenditure, deflated (x-axis, 0 to 20,000); series: CAPI and PAPI.]


              Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.



                                                    Table 11: Average response rate by province.

                     Province              Response rate (%)
                                           KIHBS 2015/16     Tablet-based pilot

                     Central               92.4              98.7
                     Coast                 91.3              99.8
                     Eastern               92.2              100.0
                     Nairobi               76.9              99.5
                     North Eastern         87.7              99.9
                     Nyanza                93.1              99.1
                     Rift Valley           91.0              96.7
                     Western               93.7              100.0
               Source: Authors’ calculation based on KIHBS 2015/16 and the tablet-based pilot.



        2.      Sampling design and weights

The tablet pilot surveyed six households per EA, while KIHBS 2015/16 surveyed ten (Table 12).

                                                      Table 12: Characteristics of the sample.

                Characteristic                  CAPI        PAPI

                Number of EAs                   2,251       2,387
                Number of households            12,851      21,773

              Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.







To compare indicators from the paper and tablet surveys, the sampling weights for CAPI are derived from
PAPI at the EA level. That is, the sum of sampling weights at the EA level in PAPI was divided among the
households included in CAPI. Sampling weights are then scaled to the totals from PAPI at the stratum level.
The resulting weights for CAPI add up to the same totals as PAPI at urban/rural and county levels. The equation below
provides the formal description:

$$ w^{c}_{ij} = w^{p}_{ij} \times \frac{nh^{p}_{ij}}{nh^{c}_{ij}} $$

where
$w^{c}_{ij}$ = CAPI cluster weight for cluster i in stratum j;
$w^{p}_{ij}$ = PAPI cluster weight for cluster i in stratum j;
$nh^{p}_{ij}$ = Number of PAPI households interviewed in cluster i in stratum j;
$nh^{c}_{ij}$ = Number of CAPI households interviewed in cluster i in stratum j.

In the scenario where no CAPI households are interviewed in a cluster, the weights of the other clusters in
the same stratum must be adjusted:
$$ w^{c2}_{ij} = w^{c1}_{ij} \times \frac{\sum w^{p}_{j}}{\sum w^{c}_{j}} $$

such that
$w^{c1}_{ij}$ = CAPI cluster weight for cluster i in stratum j after adjusting for differences in number of
households between CAPI and PAPI clusters only;
$w^{c2}_{ij}$ = CAPI cluster weight for cluster i in stratum j after adjusting for differences in number of
households and CAPI cluster in stratum j missing all households;
$\sum w^{p}_{j}$ = Sum of PAPI cluster weights in stratum j;
$\sum w^{c}_{j}$ = Sum of CAPI cluster weights in stratum j.
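
The two adjustments can be sketched as follows. This is an illustrative reading, not the code used for the survey: the function name and the example numbers are hypothetical, and the stratum sums are interpreted here as totals of household weights (cluster weight times number of interviewed households), an assumption chosen so that the rescaled CAPI weights reproduce the PAPI totals at the stratum level as described in the text.

```python
def capi_cluster_weights(papi_weight, papi_households, capi_households):
    """Derive CAPI cluster weights from PAPI cluster weights.

    All arguments are dicts keyed by (stratum, cluster):
      papi_weight      PAPI cluster weight
      papi_households  number of PAPI households interviewed
      capi_households  number of CAPI households interviewed (0 or absent if none)
    """
    # Step 1: divide the PAPI weight total of each cluster among its CAPI households.
    w_c1 = {}
    for key, w_p in papi_weight.items():
        n_c = capi_households.get(key, 0)
        if n_c > 0:
            w_c1[key] = w_p * papi_households[key] / n_c

    # Step 2: rescale within each stratum so that clusters without any CAPI
    # household are compensated by the remaining clusters (assumed totals).
    papi_total, capi_total = {}, {}
    for key, w_p in papi_weight.items():
        stratum = key[0]
        papi_total[stratum] = papi_total.get(stratum, 0.0) + w_p * papi_households[key]
        if key in w_c1:
            capi_total[stratum] = capi_total.get(stratum, 0.0) + w_c1[key] * capi_households[key]

    return {key: w * papi_total[key[0]] / capi_total[key[0]] for key, w in w_c1.items()}


# Hypothetical stratum "A" with three clusters; cluster 3 has no CAPI interviews.
papi_weight = {("A", 1): 100.0, ("A", 2): 120.0, ("A", 3): 80.0}
papi_households = {("A", 1): 10, ("A", 2): 10, ("A", 3): 10}
capi_households = {("A", 1): 6, ("A", 2): 5, ("A", 3): 0}
weights = capi_cluster_weights(papi_weight, papi_households, capi_households)
capi_stratum_total = weights[("A", 1)] * 6 + weights[("A", 2)] * 5
```

In this example the PAPI weights in the stratum add up to 3,000 households, and after both steps the eleven CAPI households carry the same total.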




           3.    Allocation of households to PSUs

During data collection some enumerators selected the wrong EA for the interview
conducted. As a result, many EAs had more than six households (Figure 9).







                                  Figure 9: Allocation of households and EAs.




              Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

The KNBS was able to validate the correct EA number for some submissions based on their GPS positions.
However, this was only possible for 9,146 households out of 13,795, so various interviews were not
validated. To solve the issue, the midpoint (median latitude/longitude) of each EA was obtained from the
GPS positions of households interviewed in PAPI. Each remaining household was then allocated to the EA
with the closest midpoint, after computing its distance to the midpoints of all the EAs. As a robustness
check, the same process was applied to households with a correct EA ID from fieldwork records, and the
approach identified the correct EA in 94 percent of the cases.
As a result, 9,146 households were assigned to their EA using the information from KNBS; 334 did not
have GPS coordinates and were dropped; and 4,315 were allocated to an EA using the distance from their GPS
coordinates to the closest midpoint from PAPI. Finally, an additional correction was introduced excluding
households that were more than 1 km away from the midpoint of the EA, but only among EAs identified through
GPS positions from PAPI which had more than 6 interviews in the EA. With this correction, 168 additional
households were excluded.
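
The allocation rule can be illustrated with a short sketch. The source does not state which distance measure was used; the great-circle (haversine) distance below is an assumption, and the function names, EA identifiers and coordinates are hypothetical.

```python
import math
from statistics import median


def ea_midpoints(papi_points):
    """Median latitude/longitude per EA from PAPI household GPS positions.

    papi_points maps an EA identifier to a list of (latitude, longitude) pairs.
    """
    return {
        ea: (median(p[0] for p in points), median(p[1] for p in points))
        for ea, points in papi_points.items()
    }


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def allocate(household, midpoints):
    """Return the EA with the closest midpoint and the distance to it in km."""
    ea = min(midpoints, key=lambda e: haversine_km(household, midpoints[e]))
    return ea, haversine_km(household, midpoints[ea])


# Hypothetical PAPI positions for two EAs and one unvalidated CAPI household.
papi_points = {
    "EA-A": [(-1.27, 36.81), (-1.28, 36.82), (-1.29, 36.83)],
    "EA-B": [(-4.04, 39.65), (-4.05, 39.66), (-4.06, 39.67)],
}
midpoints = ea_midpoints(papi_points)
assigned_ea, distance_km = allocate((-1.30, 36.80), midpoints)
```

The returned distance can then be compared with a threshold to flag households that lie far from the midpoint of the EA they were assigned to.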



        4.      Quality control of CAPI submissions

The submissions from the tablet-based pilot were validated before proceeding to the cleaning and analysis
stage of the survey. The following three validation rules were applied to exclude submissions:
    -   Duration: Submissions with a total duration of less than 30 minutes, those with a missing duration
        in the food or nonfood modules, and those whose duration was less than 10 minutes in the food
        module and less than 5 minutes in the nonfood module. 76 interviews were deemed invalid for this
        reason.
    -   Location: Submissions whose GPS coordinates do not fall within the EA boundaries from the PAPI
        records. 346 interviews were deemed invalid for this reason.







        -    Completeness: 18 interviews with no information on food consumption and 2 submissions with
             zero food consumption recorded were also excluded.
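
A minimal sketch of how the duration, location and completeness rules could be applied to a single submission is shown below. The field names are hypothetical, and the third duration condition is read as requiring both modules to be short, which is an assumption about the wording of the rule.

```python
def exclusion_reasons(submission):
    """Return the list of validation rules that a submission fails.

    Hypothetical fields: total_minutes, food_minutes, nonfood_minutes,
    inside_ea_boundary (bool) and food_consumption (None if not recorded).
    """
    reasons = []
    food = submission.get("food_minutes")
    nonfood = submission.get("nonfood_minutes")

    # Duration: interview too short overall, module durations missing,
    # or both modules completed implausibly fast (assumed reading).
    if submission["total_minutes"] < 30 or food is None or nonfood is None:
        reasons.append("duration")
    elif food < 10 and nonfood < 5:
        reasons.append("duration")

    # Location: GPS position outside the EA boundaries from the PAPI records.
    if not submission["inside_ea_boundary"]:
        reasons.append("location")

    # Completeness: no food consumption information, or zero food consumption.
    if submission.get("food_consumption") in (None, 0):
        reasons.append("completeness")

    return reasons


valid = {"total_minutes": 55, "food_minutes": 20, "nonfood_minutes": 12,
         "inside_ea_boundary": True, "food_consumption": 3400.0}
too_short = dict(valid, total_minutes=25)
misplaced_and_empty = dict(valid, inside_ea_boundary=False, food_consumption=0)
```

A submission is kept only when the returned list is empty; otherwise the listed reasons document why it was excluded.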



             5.         Imputed rent values for urban households

Rent was imputed by estimating a stepwise log-linear Ordinary Least Squares (OLS) regression of reported
rents on housing characteristic variables.32
The same structural model used in PAPI was employed to obtain rent expenditure for urban households
in CAPI. The variables considered in the model correspond to the following:
        -    Location
        -    Number of rooms
        -    Construction materials
        -    Type of water supply and sanitation
        -    Type of energy source for cooking
        -    Household head employment and educational characteristics



             6.         Poverty line and consumption aggregates

The food consumption aggregate was obtained from four sub-groups: purchased items, stocks, own
production and gifts. Data on non-food consumption by households was also collected. The set of nonfood
items included in both surveys was considered, excluding rent, energy and educational expenses.
To obtain comparable consumption aggregates between the paper and tablet surveys, various
adjustments were made. Table 13 presents each source of discrepancy and the correction introduced to
obtain comparable consumption measures between surveys. For the CAPI consumption aggregates, multiple
imputation is used in the following way:
        -    Model selection.
                   ▪    First, the best model for the multiple imputation process is selected using the
                        Furnival-Wilson leaps-and-bounds algorithm, which explores all possible combinations
                        of independent variables.
                   ▪    The dependent variable is the natural logarithm of collected consumption, from
                        the core module as well as from the optional module allocated to the household.
                   ▪    The independent variables include strata, optional module, urban/rural location of the
                        household, dwelling characteristics (material of floor, roof and walls), source of water,
                        sanitation, cooking fuel and lighting, number of habitable rooms, demographic
                        characteristics (household size, proportion of children and elderly), ownership of assets


32   Kenya National Bureau of Statistics 2018.






                  (radio, tv, cellphone, bicycle and mosquito net), as well as characteristics of the household
                  head (gender, education and employment status).
             ▪    Once the best specification is identified, the model is complemented with two dummy
                  variables indicating the quartile to which each household belongs in terms of core
                  food consumption and core non-food expenditure.
   -    Multiple imputation.
             ▪    A two-step multiple imputation process is then implemented for each optional module
                  using the log-space for consumption and an indicator of whether consumption collected
                  on the respective module is zero or not.
             ▪    In the first step, the model estimates whether the module has non-zero consumption
                  using a logit regression, while in the second step the consumption of the module is
                  estimated with an OLS regression using the best specification previously identified and
                  drawing 100 point estimates.
             ▪    Finally, consumption aggregates are obtained from the 100 point estimates of each
                  household.
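
The drawing stage of the two-step imputation can be sketched as below. This is a simplified illustration rather than the procedure used for the pilot: it assumes that the logit and OLS models have already been fitted, so the predicted probability of non-zero consumption, the predicted mean of log consumption and the residual standard deviation are taken as given inputs. All names and numbers are hypothetical.

```python
import math
import random


def impute_module(p_nonzero, log_mean, log_sd, n_draws=100, seed=0):
    """Draw imputed consumption values for one optional module of one household.

    Step 1 decides whether consumption is non-zero (probability from a logit model).
    Step 2 draws log consumption around the OLS prediction and converts it to levels.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        if rng.random() < p_nonzero:
            draws.append(math.exp(rng.gauss(log_mean, log_sd)))
        else:
            draws.append(0.0)
    return draws


def total_consumption_draws(collected, imputed_modules):
    """Add collected consumption to the imputed modules, draw by draw."""
    n_draws = len(imputed_modules[0])
    return [collected + sum(module[k] for module in imputed_modules)
            for k in range(n_draws)]


# Hypothetical household: 4,000 collected from the core and its assigned optional
# module, with two optional modules that were not administered and are imputed.
module_b = impute_module(p_nonzero=0.9, log_mean=math.log(600), log_sd=0.4, seed=1)
module_c = impute_module(p_nonzero=0.6, log_mean=math.log(300), log_sd=0.5, seed=2)
totals = total_consumption_draws(4000.0, [module_b, module_c])
```

Each household ends up with 100 values of total consumption, from which the consumption aggregate and its uncertainty can be summarized.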

                    Table 13: Corrections made to produce comparable consumption aggregates.

Issue                                                         Correction

Upper bound constraints for unit prices induced lower values  - Use median prices for urban and rural areas from the paper
                                                                survey
                                                              - Use the median deflator by EA from the paper survey

Implicit non-standard conversion factors induced higher       Convert to standard units (Kg/Lt) using the median values by
quantities consumed                                           item & non-standard unit from the paper survey

Non-standard conversion factors combined with lower bound     - Replace the consumption value of 3 items that were only
constraints for quantities reported in ‘handful’ and ‘piece’  reported in ‘handful’ or ‘piece’ using the median values by
units induced higher values                                     EA from the paper survey
                                                              - For the rest of the items, entries reported in ‘handful’ or
                                                                ‘piece’ were replaced by the median value by item from the
                                                                tablet survey

Upper bound constraints for quantities did not prevent        Outliers in the tablet survey with values beyond the
outliers                                                      maximum value in the paper survey were replaced by the
                                                              median value by item from the tablet survey

Food items were different in the paper vs. tablet survey      - 1 non-core item included in the tablet but not the paper
                                                                survey was excluded from the tablet consumption aggregate
                                                              - 25 non-core items included in the paper but not the tablet
                                                                survey were excluded from the KIHBS consumption aggregate

For item 'food eaten outside the household' only the price    Replace consumption values of this item in the tablet survey
was considered to derive the consumption value in the paper   by the median value by EA from the paper survey
survey (i.e. not the quantity)

The consumption value of item 'cost of milling' was set to    For consistency, the consumption value of item 'cost of
zero in the paper survey                                      milling' was also set to zero in the tablet survey







The consumption aggregates are converted to adult equivalent measures using the equivalence scales
developed by Anzagi and Bernard. These adult equivalence scales weight children aged 0-4 years as 0.24 of
an adult and children aged 5-14 years as 0.65, and assign a value of one to all people aged 15 years
and older.
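
As a worked illustration of these scales, the sketch below converts a household's total consumption into a per adult equivalent value. The household composition and the consumption figure are hypothetical, as are the function names.

```python
def adult_equivalents(ages):
    """Number of adult equivalents in a household, given members' ages in years."""
    total = 0.0
    for age in ages:
        if age <= 4:
            total += 0.24   # children aged 0-4 years
        elif age <= 14:
            total += 0.65   # children aged 5-14 years
        else:
            total += 1.0    # people aged 15 years and older
    return total


def per_adult_equivalent(household_consumption, ages):
    """Household consumption divided by its number of adult equivalents."""
    return household_consumption / adult_equivalents(ages)


# Hypothetical household: two adults, one child aged 10 and one child aged 3.
ages = [35, 32, 10, 3]
scale = adult_equivalents(ages)              # 1 + 1 + 0.65 + 0.24 = 2.89
monthly = per_adult_equivalent(14450, ages)  # about KSh 5,000 per adult equivalent
```

The resulting per adult equivalent value is the quantity compared with the poverty lines, which are also expressed in monthly adult equivalent terms.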
The poverty lines used correspond to those from KIHBS 2015/16 and are based on the cost-of-basic-needs
approach. The rural and urban food poverty lines were set by costing two separate bundles of basic food
items which attain the 2,250 Kcal minimum nutritional requirement in a way that is consistent with food
tastes in rural and urban areas.33 As for PAPI, the overall poverty lines in monthly adult equivalent terms
considered are KSh 3,252 for rural areas and KSh 5,995 for urban areas. The food poverty lines in monthly
adult equivalent terms are KSh 1,952 and KSh 2,551 for rural and urban areas, respectively. The
food poverty line was used directly to obtain the incidence of poverty from the total food consumption
aggregate. The overall poverty line, however, was rescaled using the share, computed from PAPI, that the
food and nonfood items considered in the total expenditure aggregate for CAPI and PAPI represent in the
total KIHBS 2015/16 consumption aggregate.




33   Kenya National Bureau of Statistics 2018.






REFERENCES
Banks, Randy, and Heather Laurie. 2000. “From Papi to Capi: The Case of the British Household Panel
Survey.�?        Social     Science       Computer       Review        18         (4):     397–406.
https://doi.org/10.1177/089443930001800403.
Bemelmans-Spork, M.E., and D. Sikkel. 1985. “Data Collection with Hand-Held Computers.�? Netherlands
Central Bureau of Statistics.
Caeyers, Bet, Neil Chalmers, and Joachim De Weerdt. 2012. “Improving Consumption Measurement and
Other Survey Data through CAPI: Evidence from a Randomized Experiment.�? Journal of Development
Economics 98 (1): 19–33.
Couper, Mick P., and Geraldine Burt. 1994. “Interviewer Attitudes Toward Computer-Assisted Personal
Interviewing      (CAPI.�?    Social     Science     Computer       Review    12     (1):    38–54.
https://doi.org/10.1177/089443939401200103.
Danielsson, L., and P. Maarstad. 1982. “Statistical Data Collection with Hand-Held Computers: A
Consumer Price Index.�?
Demombynes, Gabriel, Paul Gubbins, and Alessandro Romeo. 2013. “Challenges and Opportunities of
Mobile Phone-Based Data Collection: Evidence from South Sudan.�?
Fafchamps, Marcel, David McKenzie, Simon Quinn, and Christopher Woodruff. 2014. “Microenterprise
Growth and the Flypaper Effect: Evidence from a Randomized Experiment in Ghana.�? Journal of
Development Economics 106: 211–26. https://doi.org/10.1016/j.jdeveco.2013.09.010.
Glewwe, Paul, and Hai-Anh Hoang Dang. 2008. “Impact of Decentralized Data Entry on the Quality of
Household Survey Data in Developing Countries : Evidence from a Randomized Experiment in Vietnam.�?
The World Bank Economic Review 22 (1 (January 2008)): 165–85.
Kenya National Bureau of Statistics. 2018. “Basic Report on Well-Being in Kenya.�?
King, Jonathan D., Joy Buolamwini, Elizabeth A. Cromwell, Andrew Panfel, Tesfaye Teferi, Mulat Zerihun,
Berhanu Melak, et al. 2013. “A Novel Electronic Data Collection System for Large-Scale Surveys of
Neglected Tropical Diseases.�? PLOS ONE 8 (9): e74570. https://doi.org/10.1371/journal.pone.0074570.
Leeuw, E. D. de. 2008. “The Effect of Computer-Assisted Interviewing on Data Quality: A Review of the
Evidence.” Preprint. 2008. http://dspace.library.uu.nl/handle/1874/44502.
Leeuw, Edith de, and William Nicholls. 1996. “Technological Innovations in Data Collection: Acceptance,
Data Quality and Costs.” Sociological Research Online 1 (4): 1–15. https://doi.org/10.5153/sro.50.
Leisher, Craig. 2014. “A Comparison of Tablet-Based and Paper-Based Survey Data Collection in
Conservation Projects.” Social Sciences 3 (2): 264–71. https://doi.org/10.3390/socsci3020264.
Nicholls, W.L., and E. De Leeuw. 1996. “Factors in Acceptance of Computer-Assisted Interviewing
Methods: A Conceptual and Historic Review.” Proceedings of the Section on Survey Research Methods,
American Statistical Association.
Olsen, R. J. n.d. “The Effects of Computer Assisted Interviewing on Data Quality.” University of Essex.
Olson Lanjouw, Jean, and Peter Lanjouw. 2001. “How to Compare Apples and Oranges: Poverty
Measurement Based on Different Definitions of Consumption.” Review of Income and Wealth 47 (1): 25–42.




Pape, U., and J. Mistiaen. 2015. “Measuring Household Consumption and Poverty in 60 Minutes: The
Mogadishu High Frequency Survey.” Proceedings of ABCA Conference 2015. Washington DC:
World Bank.
Pape, Utz Johann, and Johan A. Mistiaen. 2018. “Household Expenditure and Poverty Measures in 60
Minutes: A New Approach with Results from Mogadishu.”
Pape, Utz Johann, and Luca Parisotto. 2019. “Estimating Poverty in a Fragile Context -- The High
Frequency Survey in South Sudan.” World Bank Policy Research Working Paper No. 8722 (January): 1–42.
Pape, Utz, and Johan Mistiaen. 2015. “Measuring Household Consumption and Poverty in 60 Minutes:
The Mogadishu High Frequency Survey.”
Prydz, Espen. 2013. “‘Knowing in Time’: How Technology Innovations in Statistical Data Collection Can
Make a Difference in Development.” OECD.
Rosero-Bixby, L., J. Hidalgo-Céspedes, D. Antich-Montero, and M. A. Seligson. 2005. “Improving the
Quality and Lowering Costs of Household Survey Data Using Personal Digital Assistants (PDAs). An
Application for Costa Rica.” In Meeting of the Population Association of America, Philadelphia.
Schräpler, Jörg-Peter, Jürgen Schupp, and Gert G. Wagner. 2010. “Changing from PAPI to CAPI:
Introducing CAPI in a Longitudinal Study.” Journal of Official Statistics 26 (2): 239–69.
Sebestik, J., H. Zelon, D. DeWitt, J. M. O’Reilly, and K. McGowan. 1988. “Initial Experiences with CAPI.”
Proceedings of the Bureau of the Census Fourth Annual Research Conference.
Serajuddin, Umar, Hiroki Uematsu, Christina Wieser, Nobuo Yoshida, and Andrew L. Dabalen. 2015. “Data
Deprivation: Another Deprivation to End.” WPS7252. The World Bank.
http://documents.worldbank.org/curated/en/700611468172787967/Data-deprivation-another-deprivation-to-end.
Taylor, Sue. 1998. “Setting up Computer-Assisted Personal Interviewing in the Australian Longitudinal
Study of Ageing.” Statistical Science 13 (1): 14–18.
World Bank. 2019. “Kenya Gender and Poverty Assessment 2015/16; Reflecting on a Decade of Progress
and the Road Ahead.” Washington, DC: World Bank.
Zhang, Shuyi, Qiong Wu, Michelle HMMT van Velthoven, Li Chen, Josip Car, Igor Rudan, Yanfeng Zhang,
Ye Li, and Robert W Scherpbier. 2012. “Smartphone Versus Pen-and-Paper Data Collection of Infant
Feeding Practices in Rural China.” Journal of Medical Internet Research 14 (5).
https://doi.org/10.2196/jmir.2183.



