Comparing a tablet-based rapid consumption with a paper-based full consumption survey

9/20/2019

Acknowledgement

The report was led by Utz Pape (Senior Economist, EA1PV) and written together with Gonzalo Nunez (Consultant, EA1PV) with substantial contributions from Nduati Kariuki (ET Consultant, EA1PV). The team is grateful for inputs and comments from the peer reviewers Alvin Etang Ndip (Senior Economist, EA1PV) and Arden Finn (Young Professional, EA1PV) as well as guidance received from Pierella Paci (Practice Manager, EA1PV). The World Bank collaborated closely with the Kenya National Bureau of Statistics (KNBS) on the 2015/16 Kenya Integrated Household Budget Survey, as well as on the tablet-based survey pilot. The team would like to thank Mr. Zachary Mwangi, Ms. Mary Wanyonyi, Mr. Paul Samoei and Mr. Samuel Kipruto for their tireless work and dedication toward the measurement of wellbeing in Kenya.

Table of Contents

A. INTRODUCTION
B. DATA COLLECTION METHODS
- DIFFERENCES BETWEEN PAPI (PAPER) AND CAPI (TABLET)
- EVIDENCE FOR THE USE OF CAPI
C. RESULTS FROM THE TABLET-BASED SURVEY
- USE OF SOFT CONSTRAINTS IN THE FOOD CONSUMPTION MODULE
- IMPLEMENTATION OF THE RAPID CONSUMPTION METHODOLOGY
D. RESULTS FROM THE PILOT COMPARING CAPI VS. PAPI
- POPULATION AND HOUSEHOLD CHARACTERISTICS
- CONSUMPTION AGGREGATES AND POVERTY ESTIMATES
E. CONCLUSIONS AND RECOMMENDATIONS
ANNEX
1. ADDITIONAL FIGURES AND TABLES
2. SAMPLING DESIGN AND WEIGHTS
3. ALLOCATION OF HOUSEHOLDS TO PSUS
4. QUALITY CONTROL OF CAPI SUBMISSIONS
5. IMPUTED RENT VALUES FOR URBAN HOUSEHOLDS
6. POVERTY LINE AND CONSUMPTION AGGREGATES
REFERENCES

List of Figures

Figure 1: Characteristics of the paper and tablet-based survey.
Figure 2: Structure of the questionnaire for both surveys.
Figure 3: Distribution of unit prices for cocoa & cocoa products.
Figure 4: Distribution of quantity consumed for eggs.
Figure 5: CAPI total food and nonfood expenditure collected and imputed.
Figure 6: Total food and nonfood expenditure for rural areas.
Figure 7: Total food and nonfood expenditure for urban areas excluding Nairobi.
Figure 8: Total food and nonfood expenditure for urban areas.
Figure 9: Allocation of households and EAs.

List of Tables

Table 1: Main characteristics of data collection methods.
Table 2: Example of upper bound soft constraints for cocoa & cocoa products.
Table 3: Conversion factors to kilogram for non-standard unit bowl.
Table 4: Allocation of consumption items into modules.
Table 5: Dwelling characteristics.
Table 6: Water, sanitation, cooking and lighting source.
Table 7: Population by age, gender and education.
Table 8: Characteristics of the household head.
Table 9: Poverty estimates.
Table 10: Gini inequality index.
Table 11: Average response rate by province.
Table 12: Characteristics of the sample.
Table 13: Corrections made to produce comparable consumption aggregates.

List of Boxes

Box 1: Soft constraints in the tablet-based survey
Box 2: Rapid consumption methodology.
Box 3: Measures of poverty and inequality

Abbreviations

CAPI: Computer Assisted Personal Interview
EA: Enumeration area
KIHBS: Kenya Integrated Household Budget Survey
KNBS: Kenya National Bureau of Statistics
KCHS: Kenyan Continuous Household Survey
KSH: Kenyan Shillings
KSPforR: Kenya Statistics Program-for-Results
NASSEP: National Sample Survey and Evaluation Programme
OLS: Ordinary Least Squares
PAPI: Pen-and-Paper Personal Interview
RCM: Rapid Consumption Methodology

Executive Summary

The Kenyan Continuous Household Survey (KCHS), funded by the World Bank, will make available timely high-quality data. A ten-year gap between the Kenya Integrated Household Budget Survey (KIHBS) 2005/06 and 2015/16 prevented Kenya from understanding the profile of the poor and trends in poverty reduction. In addition, a processing and analysis time of two years for the 2015/16 data kept stakeholders from having timely information. The Kenya National Bureau of Statistics (KNBS), with support of the World Bank through the Kenya Statistics Program-for-Results (KSPforR), implements the KCHS to collect quarterly and annual data on labor and poverty indicators. The survey aims to provide frequent, timely and high-quality evidence to support stakeholders in decision-making.

Implementing the KCHS provides an opportunity for building a reliable survey infrastructure with the best choice of consumption module and data collection method. Designing a large-scale and continuous survey such as the KCHS involves defining the best strategy to support frequent collection and timely dissemination. The feasibility of including a full consumption module against the alternative of a rapid consumption methodology (RCM) needs to be carefully assessed. The latter can reduce the administering time of the questionnaire, while generating consumption and poverty estimates comparable to those obtained with a full consumption module.
Moreover, the data collection method can support the monitoring of fieldwork and facilitate data availability. Collecting data with computer-assisted personal interviewing (CAPI or tablet survey) can serve as the basis for building a modern data monitoring system, allowing near real-time monitoring of interviews and shortening the time between data collection and its availability.

In a preparatory step, the KNBS conducted a tablet-based pilot alongside the implementation of the 2015/16 paper-based KIHBS. KIHBS 2015/16 collected data with pen-and-paper interviewing (referred to as PAPI or paper survey), while at the same time a pilot was implemented using CAPI. Both surveys are based on the same sampling frame and design. Data was collected in the same randomly drawn geographic locations or enumeration areas (EAs), with 10 households interviewed for KIHBS and 6 for the tablet-based pilot. The questionnaires of both surveys differ in some important aspects. The paper survey used a full consumption module and collected detailed information on many livelihood characteristics, while the tablet-based survey used the rapid consumption methodology combined with a focus on essential questions throughout all other questionnaire modules.

CAPI has the potential to improve data quality, to eliminate the need for data entry, to support near real-time monitoring of data collection and to reduce the time between fieldwork and analysis. PAPI is the traditional method in which enumerators fill out paper questionnaires, while CAPI refers to interviews being conducted with the assistance of a computer or tablet. In PAPI, the enumerator records respondents’ answers on a printed paper questionnaire with a pen. Thus, the data needs to be digitized either manually or automatically (scanned).
In CAPI, the enumerator uses an electronic version of the questionnaire to record responses directly on the device. This eliminates the need for data entry and allows near real-time monitoring of fieldwork, since completed interviews can be available immediately as long as the devices are connected to a cloud server and submissions are transmitted on a regular basis. Data quality can be improved with CAPI by preventing enumerators from skipping questions and introducing dynamic checks and constraints. Skip patterns prevent enumerators from asking a question that should be skipped, or conversely skipping a question that should be asked. This decreases the length of the interview, avoids confusing the respondent and saves time when processing the data. It can also improve data quality by reducing item non-response. Dynamic validations can also be included in the tablet-based questionnaire to flag suspiciously high or low data entries, requiring confirmation from the enumerators before proceeding. Dynamic checks and constraints allow identifying and confirming data entries while the enumerator is still with the respondent, where the best contextual knowledge is available.

The tablet survey produces point estimates that are statistically indistinguishable from those of the paper survey for household and population characteristics. Dwelling characteristics, water sources, sanitation, lighting and cooking fuel are similar between the paper-based survey and tablet-based survey. The material of the floor, roof and walls, as well as sanitation and lighting, are not significantly different between the paper and tablet survey at the national level or for rural and urban areas. Similarly, water sources and cooking fuel of Kenyan households are also not significantly different between the CAPI and PAPI data at the national level.
For urban and rural areas, there are some differences between paper and tablet, mainly in categories that refer to other water sources or cooking fuels. However, it seems unlikely that the differences are associated with the data collection method employed, since some CAPI estimates are larger than the PAPI estimates while others are smaller, suggesting the lack of a systematic bias. Age, gender, literacy and educational characteristics of the household head are also comparable in the paper and tablet survey. There are only small differences between surveys for some 5-year age groups, such as the population aged 20-24 and 65-69, representing relatively small shares of the overall population. The share of Kenyans who have ever attended school, the adult literacy rate and school enrolment for the population aged 6-13 are not statistically different between the CAPI and PAPI surveys. Also, the age, gender and education of the household head are equivalent between the tablet-based survey and KIHBS 2015/16.

The food module of the tablet-based pilot included soft constraints based on outdated assumptions, affecting the quantities and prices collected. Incorrect soft constraints encouraged misreporting of quantities and prices in the CAPI pilot. Soft constraints or consistency checks alert enumerators to unusual responses or values for certain questions. They are set when coding the questionnaire and applied automatically during the interview, usually in the form of a warning or error message, so that they can be clarified while the enumerator is still with the respondent. The tablet questionnaire included soft constraints in the consumption module, yet they were not set to the right levels according to the latest information available. Incorrect upper bound constraints for prices induced lower unit values in the CAPI survey relative to PAPI. Also, non-standard conversion factors to kilograms or liters induced higher quantities consumed in the tablet survey.
Finally, incorrect non-standard conversion factors combined with relatively high lower bound soft constraints for quantities consumed induced higher quantities for items reported in ‘handful’ and ‘piece’ units.

Various adjustments are made to obtain comparable prices and quantities between the paper and the tablet survey. Comparing the paper and tablet data in terms of consumption and poverty required adjusting both datasets. The tablet food consumption aggregates are derived using the median PAPI prices for each item and EA. Then, the incorrect conversion factors to standard units for food items in CAPI are replaced by the median implicit conversion factor for each item and non-standard unit from PAPI. Finally, other minor adjustments are introduced to produce equivalent consumption aggregates between both surveys.

The rapid consumption methodology with CAPI produces similar consumption and poverty results to a full consumption module in PAPI. Households were asked about the consumption details of all items in the paper survey but not in the tablet-based survey, in line with the rapid consumption methodology. PAPI used a full consumption module asking households about their consumption details of 193 food and 126 nonfood items. In the RCM, households were only asked about consumption details of 125 food items and between 79 and 83 nonfood items, depending on the optional module allocated to the household. After data collection, multiple imputation techniques are used to estimate the consumption aggregate of CAPI households.

Poverty point estimates are statistically the same between CAPI and PAPI based on core food consumption, which allows isolating the effect of data collection method.
Poverty estimates from a core food consumption aggregate allow isolating the effect of the data collection method, as there are no differences in the consumption module of both surveys: the consumption details of 91 core food items were collected directly from all households in CAPI and PAPI. The differences in poverty incidence, depth and severity between CAPI and PAPI are not statistically significant at the national level or for urban and rural areas.

CAPI with a rapid consumption methodology produces poverty point estimates that are statistically equivalent to those obtained from PAPI and a full consumption module. Two equivalent consumption aggregates, employing the RCM for CAPI, are created and compared: i) total food consumption; and ii) total food and nonfood expenditure. The point estimates of poverty incidence, depth and severity are statistically equivalent between KIHBS 2015/16 and the tablet-based pilot at the national level. In urban areas, poverty incidence from both consumption aggregates is similar between surveys, as are poverty depth and severity from total food consumption. Poverty depth and severity are slightly different when considering total expenditure. In rural areas, poverty incidence from total food consumption is also slightly different between surveys, with all other poverty measures being equivalent between CAPI and PAPI. The relatively small differences in some poverty measures for urban and rural areas (all at the 10 percent level of significance) do not seem to be associated with the choice of survey method and consumption module, since there is no evidence of a systematic bias when comparing the paper and tablet estimates for the two consumption aggregates that employed the RCM for CAPI.

In addition, the distribution of total food and nonfood expenditure is similar, regardless of whether it is generated from a full consumption module or the rapid consumption methodology.
The RCM combines asking households about their consumption of a subset of items and estimating the consumption aggregates with multiple imputation techniques. Nonetheless, the distribution of total food and nonfood expenditure for rural areas is equivalent between PAPI and CAPI, and the Gini inequality index is not statistically different for any of the consumption aggregates considered. In urban areas, the distribution of total food and nonfood consumption is similar between PAPI and CAPI for expenditure values below KSh 9,500. Lower response rates in wealthier areas like Nairobi for the paper survey (77 percent for PAPI vs. 99 percent for CAPI) can potentially distort consumption at the top of the distribution. After excluding the capital city, the distribution of total expenditure for urban areas is statistically equivalent between the paper and tablet survey; the Gini index is similar using a core food consumption aggregate and statistically different only at the 10 percent level of significance for total food consumption and total food and nonfood expenditure. Furthermore, the Gini inequality index increases for both CAPI and PAPI after excluding Nairobi, while the difference between surveys decreases for all the consumption aggregates. This ultimately suggests that differences in the urban Gini index between CAPI and PAPI are more likely to be explained by different response rates in other urban areas beyond the capital city, and less likely to be associated with the data collection method and the consumption module considered.

The KCHS would benefit from using the rapid consumption methodology and CAPI technology, but other elements also need to be considered in this decision. Using the rapid consumption methodology and CAPI would facilitate providing frequent, timely and high-quality data for the KCHS.
Only around 6 percent of the items that households were not asked about in CAPI with the RCM were actually consumed. These items represent 21 percent of total food and nonfood consumption, suggesting a full consumption module might not be necessary to obtain precise consumption estimates. Including the RCM in the KCHS can considerably reduce the administering time of the questionnaire, while producing household, population, consumption and poverty estimates that are not statistically different from those of a paper survey with a full consumption module. In addition, collecting data with CAPI has the potential to improve data quality, eliminates the need for data entry, supports near real-time monitoring of data collection and reduces the time between fieldwork and analysis.

Implementing the KCHS with the RCM and CAPI implies considering other aspects to leverage their benefits. CAPI increases the time needed to design the questionnaire and thus must be combined with constant efforts in coding and testing it before data collection. The tablet or device, together with the software and hardware, will need to be procured and tested in advance. In addition, the training of enumerators will need to emphasize how to use the tablets and how to correctly record answers in them. A successful monitoring system needs to ensure appropriate data management and transfer from the devices to the cloud server, and to define how to provide timely feedback to teams in the field. Additionally, training and capacity building will be needed to derive total consumption aggregates with multiple imputation techniques as part of the rapid consumption methodology, besides providing documentation and support to data users.

A. INTRODUCTION

1. After 10 years, the Kenya National Bureau of Statistics (KNBS) conducted a nationwide household survey between September 2015 and August 2016 to assess the progress made in living standards. The Kenya Integrated Household Budget Survey (KIHBS) 2015/16 collected representative data on various socio-demographic and welfare indicators in each of the 47 counties. The aim was to provide an update on poverty, and to review the progress achieved over the last decade, comparing against data from the previous household survey from 2005/06.¹

2. The Kenyan Continuous Household Survey (KCHS), funded by the World Bank, has the objective to close crucial data gaps and make available timely high-quality data. A ten-year gap between KIHBS 2005/06 and 2015/16 prevented Kenya from understanding the profile of the poor and trends in poverty reduction. In addition, a processing and analysis time of two years for the 2015/16 data kept stakeholders from having timely information. The KNBS, with support of the World Bank through the Kenya Statistics Program-for-Results (KSPforR), implements the KCHS to collect quarterly and annual data on labor and poverty indicators.

3. Improving data availability and quality with the KCHS will contribute to increasing the evidence base and facilitate decision-making in Kenya. The KCHS will generate nationally representative quarterly data, and annual data representative at the county level. Frequent and high-quality data will allow producing more precise and credible analysis, which will support national and local stakeholders in making informed decisions in terms of monitoring, planning and resource allocation in Kenya.

4. The KCHS would benefit from setting up a modern data monitoring system and the best choice of consumption module and data collection method.
The design and monitoring of the survey need to consider various trade-offs associated with the choice of consumption module and data collection method, such as the preparation and administration time and the quality of the monitoring system. Using a tablet or computer-assisted personal interviewing (CAPI) method can create additional challenges, including configuring and troubleshooting tablets and data transmission and storage, among others.² Yet it offers new opportunities to increase the quality of data collection, to monitor incoming data in real time with effective feedback loops, and to analyze the data in near real time to facilitate a timely release of results. In addition, the design of the survey needs to assess carefully the feasibility of including a full consumption module against the alternative of a rapid consumption methodology.³

5. In a preparatory step, the KNBS conducted a tablet-based pilot alongside the implementation of the 2015/16 paper-based KIHBS. KIHBS 2015/16 collected data with pen-and-paper interviewing (referred to as PAPI or paper survey). At the same time, a pilot was implemented using computer-assisted personal interviewing (referred to as CAPI or tablet survey). The sampling design of both surveys involved three phases. First, 2,400 clusters from the National Sample Survey and Evaluation Programme (NASSEP) V sampling frame were drawn. Next, 16 households were selected in each cluster or enumeration area (EA). Finally, 10 households out of these 16 were selected for the KIHBS survey, while the remaining 6 were selected for the tablet-based pilot (Figure 1). The questionnaires administered differ in some important aspects. The paper survey used a full consumption module and collected detailed information on many livelihood characteristics, while the tablet-based survey used the rapid consumption methodology (RCM) combined with a focus on essential questions throughout all other questionnaire modules (Figure 2).

¹ The analysis has been published in (Kenya National Bureau of Statistics 2018) and (World Bank 2019).
² Demombynes, Gubbins, and Romeo 2013.
³ U. Pape and Mistiaen 2015b and U. J. Pape and Mistiaen 2018a.

Figure 1: Characteristics of the paper and tablet-based survey.
Survey | Survey method | Objective | Sampling frame | No. of households per EA
PAPI | Pen-and-paper interviewing (PAPI) | Assess the progress made in living standards | 5th National Sample Survey & Evaluation Programme (NASSEP V) | 10 households per EA
CAPI | Computer-assisted personal interviewing (CAPI) | Compare survey methods & inform future surveys | 5th National Sample Survey & Evaluation Programme (NASSEP V) | 6 households per EA
Source: Authors’ elaboration.

Figure 2: Structure of the questionnaire for both surveys.
Survey | Household member roster | Rent | Details of the dwelling | Assets | Consumption modules
PAPI | Age, gender, education and labor | Rent values for urban households | Material, water, sanitation, lighting and cooking | Detailed module on assets | Food and non-food consumption details
CAPI | Age, gender, education and labor | Rent values for urban households | Material, water, sanitation, lighting and cooking | Ownership Yes/No | Food and non-food consumption details
Additional modules in PAPI only: health, fertility & deaths, child health, ITC service & domestic tourism; energy use, agriculture holding & output, livestock, enterprises, other income, transfers, shocks, food security, justice, credit & ITC.
Source: Authors’ elaboration.

6. This technical note compares indicators estimated from the paper- and tablet-based surveys, focusing on consumption and poverty, and provides an assessment of the RCM vis-à-vis the traditional full consumption module.
The setup of the tablet-based pilot allows for direct comparison of the data collection methods (CAPI vs. PAPI) on estimated indicators. In the context of consumption and poverty estimates, the paper-based traditional full consumption methodology is compared with the rapid consumption results, even though the different forms of administration (CAPI vs. PAPI) can influence the comparison. The results of the analysis can help inform the design of the KCHS, for example whether consumption can be accurately measured by the rapid consumption methodology, which would considerably reduce the administering time of the questionnaire. The note documents the design recommendations and their justification, fostering transparency and providing learning resources, with potential impacts beyond Kenya.

B. DATA COLLECTION METHODS

- Differences between PAPI (paper) and CAPI (tablet)

7. PAPI is the traditional method in which enumerators fill out paper questionnaires, while CAPI refers to interviews being conducted with the assistance of a computer or tablet. In the pen-and-paper interviewing (PAPI) data collection method, the interviewer or enumerator proceeds question by question, asking and recording the respondents’ answers on a printed paper template or questionnaire. As a result, the data needs to be transferred to digital format through manual data entry or automatically (scanned) after the interview has concluded. In CAPI, the enumerator uses a tablet, smartphone or computer preloaded with an electronic version of the questionnaire to record responses directly on the device (Table 1).

8. Using CAPI eliminates the need for data entry, allowing for data to be available in real time.
One of the main features of CAPI is that interviews are conducted with computers or tablets and the answers are immediately entered into the device, which eliminates the need for data entry of each questionnaire. Completed CAPI interviews can be available immediately, as long as the devices are connected to a cloud server and submissions are transmitted on a regular basis. This ultimately facilitates real-time processing and regular monitoring of fieldwork by verifying data collection progress and identifying challenges in the field, such as low-performing enumerators.

9. Data quality can be improved with CAPI by preventing enumerators from skipping questions and introducing dynamic checks and constraints. Introducing skip patterns and consistency checks in the questionnaire can improve the quality of the data collected. Skip patterns, or automated routing, are among the most relevant features of CAPI: they reduce the number of incorrect entries by preventing enumerators from asking a question that should be skipped, or conversely skipping a question that should be asked to the respondent.⁴ They can decrease the length of the interview, avoid confusing the respondent and save time when processing the data. They can also improve data quality by reducing item non-response. CAPI allows complex skip patterns that would be complicated for enumerators to implement manually. Moreover, dynamic checks and constraints can also be included in the tablet-based questionnaire to flag suspiciously high or low data entries, requiring confirmation from the enumerators before proceeding. Dynamic checks and constraints allow identifying and confirming data entries while the enumerator is still with the respondent, where the best contextual knowledge is available.

10. Other features of CAPI include recording of date and time, GPS positions and multimedia content.
Conducting interviews with CAPI makes it possible to exploit some of the functionalities available in the devices used to collect the data. Date and time stamps can be recorded automatically at the beginning and end of each module, which facilitates monitoring of fieldwork. In addition, recording GPS positions permits tracking the locations of teams in the field to determine whether interviews are conducted at the correct locations. Audio, photos and video can also be recorded using the devices employed in the survey. Other features include complex functions such as randomly allocating a module to a household or selecting a household for interview from a list of potential respondents.

4 Banks and Laurie 2000; Caeyers, Chalmers, and De Weerdt 2012.

Table 1: Main characteristics of data collection methods.
Characteristic | PAPI or paper-based survey | CAPI or tablet-based survey
Hardware and software | Requires several computers with data-entry software. | Each enumerator requires a device with the survey’s software. Also, a cloud server is needed for centralized management of submissions.
Training of enumerators | In addition to usual topics covered (such as sampling, questionnaire content and survey objectives among others), the emphasis is on complex or numerous skip patterns. | In addition to usual topics covered (such as sampling, questionnaire content and survey objectives among others), it concentrates on how to use the device and record the answers.
Design of the questionnaire | The questionnaire and a data entry template must be designed and printed before fieldwork. | The questionnaire needs to be coded and tested before loading it into the devices.
Type of questions | Single option, multiple choice and written responses. | Single option, multiple choice and written responses. In addition, GPS locations, time stamps, multimedia content and other complex operations.
Transportation and storage | A secure system is designed to transport paper interviews from the field to the offices, and to store them securely. | Interviews are submitted daily to the cloud server via 3G or Wi-Fi connection, where the information is stored securely.
Quality controls | Survey managers need to review completed interviews to identify errors. | Real-time and remote monitoring of fieldwork is possible, since submitted interviews are available immediately.
Data entry | Responses must be entered in the pre-defined template by data-entry operators. | Responses are immediately entered in a digital format as they are recorded with the tablet or computer.
Data processing | The data is ready for processing only after the interviews have been entered, which typically involves two data-entry exercises and corrections for discrepancies between both. | The data processing or cleaning can begin immediately after fieldwork has concluded.
Management of sensitive and personal data | A secure protocol must be put in place to handle personal data. | Respondents can type their own answers directly on the device, and completed interviews are encrypted.
Source: Authors’ elaboration.

11. Besides improving data quality, migrating from PAPI to CAPI can bring about additional benefits in terms of cost savings. Using a CAPI data collection method increases the time needed to set up the survey and the fixed up-front costs from procuring the hardware and hiring staff to code the questionnaire. However, both time and cost are compensated in CAPI since data entry is no longer required, and the time needed to conduct the interview and clean the data is reduced.5 The costs in CAPI can be around 75 percent of those from PAPI.6 In addition, since the largest costs in CAPI are fixed costs, the method is better suited to large and ongoing surveys like the KCHS.

- Evidence for the use of CAPI

12.
CAPI was tested in Europe in the mid-1980s and used for the first time to collect data as part of the Netherlands Labor Force Survey in 1987. While the first Computer-Assisted Telephone Interviews were implemented in the USA in 1971 by marketing companies, the use of CAPI technology to conduct face-to-face interviews took longer. CAPI was tested in Europe by Statistics Sweden in 1982 and by Statistics Netherlands in 1984.7 A few years later, in 1987, the first nationwide CAPI survey was implemented to collect all the data for the Netherlands Labor Force Survey.8 Since then, the use of CAPI techniques to conduct surveys has proliferated, pushed by the availability of cheaper computers or tablets and higher network connectivity.9 More importantly, CAPI has proven to work in most contexts, even with low infrastructure and institutional capacity.10

13. CAPI tends to improve data quality compared to collecting data with a paper-based method. A randomized controlled trial with 1,840 households in Zanzibar comparing data obtained from CAPI and PAPI surveys found that the PAPI data contained many errors, which are practically eliminated with the use of CAPI.11 Moreover, these errors in PAPI are not randomly distributed across the sample but correlated with certain household characteristics, so dropping the affected observations from the analysis would introduce a bias. Other common benefits from using CAPI compared to PAPI are associated with increased speed and efficiency, and savings in time and cost from implementing the survey.12

14. Automated routing or skip patterns are crucial features for reducing errors in a tablet-based survey. Skip patterns are an essential element of CAPI software. They are configured to ask or skip certain questions conditional on the respondent’s answer to a previous question. With this functionality, both the interviewed household and the enumerator are guided through the tablet questionnaire.
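The routing logic described above can be illustrated with a minimal sketch. It is not taken from the survey software used in the pilot: the question names, the `Question` class and the `run_interview` function are hypothetical, and the only assumption is that each question may carry a condition evaluated on earlier answers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Answers = Dict[str, str]


@dataclass
class Question:
    name: str
    text: str
    # Condition on earlier answers; None means the question is always asked.
    relevant: Optional[Callable[[Answers], bool]] = None


# Hypothetical three-question module, for illustration only.
QUESTIONNAIRE: List[Question] = [
    Question("owns_dwelling", "Does the household own this dwelling?"),
    Question(
        "monthly_rent",
        "How much rent is paid per month (KSh)?",
        relevant=lambda a: a.get("owns_dwelling") == "no",
    ),
    Question("roof_material", "What is the main material of the roof?"),
]


def run_interview(
    questionnaire: List[Question], responses: Answers
) -> Tuple[List[str], Answers]:
    """Walk through the questionnaire, applying skip patterns automatically."""
    answers: Answers = {}
    asked: List[str] = []
    for q in questionnaire:
        # Skip pattern: the enumerator never sees a question that does not apply.
        if q.relevant is not None and not q.relevant(answers):
            continue
        asked.append(q.name)
        answers[q.name] = responses[q.name]
    return asked, answers


owner_asked, _ = run_interview(
    QUESTIONNAIRE, {"owns_dwelling": "yes", "roof_material": "iron sheets"}
)
renter_asked, _ = run_interview(
    QUESTIONNAIRE,
    {"owns_dwelling": "no", "monthly_rent": "4000", "roof_material": "concrete"},
)
```

In this sketch the rent question is asked only of households that do not own their dwelling, so an enumerator can neither ask it by mistake nor omit it when it applies.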
Most errors found in PAPI due to skip patterns can be avoided in CAPI.13 Also, CAPI leads to less missing data relative to PAPI, since enumerators cannot skip some questions, ultimately improving the quality of the information collected.14

15. Similarly, soft constraints or consistency checks help reduce the number of incorrect data entries, especially for consumption data. The number of interviews with impossible or missing values can be reduced to nearly zero when using CAPI, as a direct result of introducing consistency checks.15 The impact on data quality is especially relevant for the consumption modules, which are the basis for estimating poverty rates. In the case of sales and profits data from microenterprises, consistency checks also help decrease the number of implausible values recorded.16

5 Caeyers, Chalmers, and De Weerdt 2012; Glewwe and Hoang Dang 2008; King et al. 2013; Zhang et al. 2012.
6 Leisher 2014.
7 Danielsson and Maarstad 1982; Bemelmans-Spork and Sikkel 1985.
8 E. de Leeuw and Nicholls 1996.
9 Schräpler, Schupp, and Wagner 2010.
10 Prydz 2013.
11 Caeyers, Chalmers, and De Weerdt 2012.
12 E. D. de Leeuw 2008; Banks and Laurie 2000; Rosero-Bixby et al. 2005.
13 Caeyers, Chalmers, and De Weerdt 2012.
14 Sebestik et al. 1988; Olsen, n.d.; E. D. de Leeuw 2008.
15 Caeyers, Chalmers, and De Weerdt 2012.
16 Fafchamps et al. 2014.

16. Moreover, respondents’ perceptions tend to improve or remain unchanged with the use of CAPI technology relative to PAPI. Both interviewers and respondents were found to have positive opinions about conducting the survey with CAPI technology in the early days of CAPI.17 This positive attitude was also observed among people over 70 years of age, who are, in general, less exposed to technology.18 Where the perception of migrating to CAPI is not positive, there are usually no differences in respondents’ perceptions between CAPI and PAPI.19 Furthermore, when migrating from PAPI to CAPI, response rates are unlikely to change, as seen in the British Household Panel Survey in 1998.20 Similarly, the migration to CAPI in the German Socio-Economic Panel (SOEP) of 1998 had no effect on respondents’ acceptance of CAPI or on response rates.21

17 Couper and Burt 1994; Nicholls and De Leeuw 1996.
18 Taylor 1998.
19 Caeyers, Chalmers, and De Weerdt 2012.
20 Banks and Laurie 2000.
21 Schräpler, Schupp, and Wagner 2010.

C. RESULTS FROM THE TABLET-BASED SURVEY

- Use of soft constraints in the food consumption module

17. Incorrect soft constraints can encourage misreporting of quantities and prices. One of the advantages of CAPI technology is the use of soft constraints in the questionnaire. They allow data entries to be confirmed in the field, which is where the best contextual knowledge is available. The aim is to minimize the number of incorrect data entries. Nevertheless, soft constraints need to be implemented carefully to avoid flagging entries as unusual when in fact they are within a normal range, especially when there are other surveys that can help inform the level at which to set these thresholds.

18. The tablet questionnaire included soft constraints in the food consumption module based on outdated assumptions affecting quantities and prices collected.
For each item considered in the tablet-based survey, the questionnaire included soft constraints for the lower and upper bound of unit prices, the lower and upper bound of quantities consumed, as well as implicit conversion factors from non-standard units of quantities reported to kilograms or liters. If the respondent’s answer was above or below these thresholds, enumerators were prompted with a message saying the quantity or price was too low or too high, and they had to confirm or correct the value reported (Box 1). In the CAPI survey, the incorrect use of soft constraints based on outdated assumptions induced reported values that were higher or lower than the prices and quantities from the paper survey.

Table 2: Example of upper bound soft constraints for cocoa & cocoa products.
Item | Unit | Minimum value (KSh per Kg) | Maximum value (KSh per Kg)
Cocoa & cocoa products | Cup | 50 | 400
Cocoa & cocoa products | Grams | 50 | 400
Cocoa & cocoa products | Kilograms | 50 | 400
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

19. Incorrect upper bound constraints for prices induced lower unit values of food items in the CAPI survey relative to PAPI. The upper bound of the soft constraint for unit prices was set at a relatively low level compared to the PAPI data. During fieldwork, enumerators were constantly reporting unit values beyond this threshold, and thus they were repeatedly asked to confirm the entries. The constant, if incorrect, flagging of unusually high values affected the data collection process and ultimately led to lower unit prices. For example, for cocoa and cocoa products, the upper bound of the soft constraint was set at KSh 400 per kilogram, regardless of the unit in which households reported the consumption of this item (Table 2). As a result, unit prices collected with CAPI range between KSh 56 and KSh 333, compared with KSh 750 to KSh 857 in the PAPI dataset (Figure 3).
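A minimal sketch of such a soft constraint is given below, using the cocoa bounds from Table 2. The function names and the confirmation flag are hypothetical and do not reproduce the pilot's questionnaire code; the point is only that a soft constraint warns and asks for confirmation rather than rejecting the entry.

```python
from typing import Dict, Optional, Tuple

# Bounds in KSh per kilogram, taken from Table 2.
SOFT_BOUNDS: Dict[str, Tuple[float, float]] = {
    "Cocoa & cocoa products": (50.0, 400.0),
}


def check_unit_price(item: str, price_per_kg: float) -> Optional[str]:
    """Return a warning message if the price falls outside the soft bounds."""
    low, high = SOFT_BOUNDS[item]
    if price_per_kg < low:
        return f"Price for {item} is too low ({price_per_kg} < {low}). Confirm or correct."
    if price_per_kg > high:
        return f"Price for {item} is too high ({price_per_kg} > {high}). Confirm or correct."
    return None


def accept_entry(item: str, price_per_kg: float, confirmed: bool = False) -> bool:
    """A soft constraint accepts a flagged value once the enumerator confirms it."""
    warning = check_unit_price(item, price_per_kg)
    return warning is None or confirmed


# A price of KSh 800 per kg lies inside the PAPI range but above the CAPI bound.
papi_like_warning = check_unit_price("Cocoa & cocoa products", 800.0)
capi_like_warning = check_unit_price("Cocoa & cocoa products", 333.0)
```

Under these bounds every price in the range observed in the paper survey triggers a warning, which is the mechanism by which a mis-set threshold can push enumerators toward lower values.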
A soft constraint of KSh 400 per kilogram is unrealistically low when compared with the PAPI values, and is likely to have induced lower unit values in CAPI. Around 70 percent of the core food items with the largest price differences between CAPI and PAPI have median prices in the paper survey that would place them outside of the CAPI soft constraints. Thus, this is a generalized issue across items that systematically affected unit prices of food items in the tablet-based survey.

Box 1: Soft constraints in the tablet-based survey
Soft constraints or consistency checks alert enumerators to unusual values recorded. Including consistency checks in the tablet questionnaire helps prevent the collection of incorrect information. Enumerators are alerted to unusual responses to certain questions. These checks are set when coding the questionnaire and applied automatically during the interview, so that entries can be corrected while the enumerator is still with the respondent, where the best contextual knowledge is available. In such cases, enumerators receive a warning or error message. A warning message flags an unlikely response, an outlier or a contradictory answer, and asks the enumerator to confirm the response. For consumption values, warnings are commonly used to identify quantities and prices not lying within reasonable boundaries. An example of this is presented in the figure below from the tablet interface flagging an unusually high price for rice. An error message is a stronger type of constraint, preventing enumerators from moving on to the next question until the issue is corrected.

20. Non-standard conversion factors to kilograms or liters induced higher quantities consumed in the tablet survey relative to the paper survey. In the tablet-based survey, households were able to report the consumption of food items in standard units (kilograms or liters), but also in non-standard units (e.g.
a bowl, bunch, cup, etc.). The CAPI questionnaire included automatic conversion factors for these non-standard units into kilograms or liters, yet they were set at a higher level than the conversion factors implicit in the PAPI data. For the non-standard unit ‘bowl’, the conversion factor to kilograms in the tablet questionnaire was the same for all items and corresponded to 0.7 kilograms, which is not in line with the results from the paper survey. The implicit conversion factor from the paper data ranges between 0.21 and 0.25 (Table 3), depending on the item considered. The conversion factors included in CAPI are around 3 times larger than the observed values from PAPI, inducing artificially higher quantities consumed in the former compared to the latter. In 80 percent of the core food items, the median quantity purchased in CAPI was equal to or higher than the respective value from PAPI. While the original data collected was not affected, quantities estimated in non-standard units were systematically affected by the conversion factors considered.

Figure 3: Distribution of unit prices for cocoa & cocoa products.
[Density plot; x-axis: median unit price across EAs (KSh per Kg); y-axis: density; series: PAPI, CAPI.]
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

Table 3: Conversion factors to kilogram for non-standard unit bowl.
Item | Unit | CAPI: Conversion factor to kilograms | PAPI: Median quantity consumed in kilograms
Aromatic unbroken rice | Bowl | 0.70 | 0.22
Broken white rice | Bowl | 0.70 | 0.25
Brown rice | Bowl | 0.70 | 0.21
Millet grain | Bowl | 0.70 | 0.25
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

21. Moreover, incorrect non-standard conversion factors combined with relatively high lower bound soft constraints for quantities consumed induced higher quantities for items reported in ‘handful’ and ‘piece’ units.
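The size of the conversion-factor gap for the ‘bowl’ unit can be checked with a short calculation on the Table 3 values. The snippet below is only an arithmetic illustration; the variable names and the example of three bowls are hypothetical.

```python
# CAPI conversion factor and PAPI implicit factors (kilograms per bowl), from Table 3.
CAPI_BOWL_KG = 0.70
PAPI_BOWL_KG = {
    "Aromatic unbroken rice": 0.22,
    "Broken white rice": 0.25,
    "Brown rice": 0.21,
    "Millet grain": 0.25,
}

# Ratio of the CAPI factor to the PAPI implicit factor, item by item.
ratios = {item: CAPI_BOWL_KG / papi for item, papi in PAPI_BOWL_KG.items()}

# Hypothetical household reporting three bowls of brown rice.
bowls_reported = 3
kg_with_capi_factor = bowls_reported * CAPI_BOWL_KG
kg_with_papi_factor = bowls_reported * PAPI_BOWL_KG["Brown rice"]

for item, ratio in ratios.items():
    print(f"{item}: CAPI factor is {ratio:.1f} times the PAPI implicit factor")
```

The ratios fall between 2.8 and 3.3, consistent with the statement that the CAPI factors are around 3 times larger, so the same reported number of bowls translates into roughly three times as many kilograms.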
The soft constraints on quantities consumed included in the CAPI questionnaire are set in standard units (kilograms or liters). Hence, quantities reported in non-standard units were automatically converted by the CAPI software to standard units using the incorrect conversion factors, and then validated against the soft constraints. The incorrect non-standard conversion factors, together with a relatively high lower bound soft constraint for quantities consumed, implied that quantities reported by households were usually flagged as being too low for items specified in ‘handful’ and ‘piece’ units, ultimately affecting the data collected. For example, the consumption of eggs in the tablet-based survey was exclusively reported in ‘piece’ units, with a mean quantity consumed of 0.6 kilograms, compared to 0.14 kilograms from the paper data (Figure 4). This is a separate issue from the one described previously, since the data collected was directly affected and adjusting the conversion factor is unlikely to produce figures equivalent to those from PAPI. However, the issue is concentrated in values reported in ‘handful’ and ‘piece’ units.

Figure 4: Distribution of quantity consumed for eggs.
[Density plot; x-axis: quantity consumed (Kg per AE); y-axis: density; series: PAPI, CAPI.]
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

- Implementation of the rapid consumption methodology

22. The consumption details of all items were asked of households in the paper survey but not in the tablet-based survey, in line with the rapid consumption methodology. PAPI used a full consumption module asking households about their consumption details of 193 food and 126 nonfood items.
In the rapid consumption methodology, households were only asked about their consumption of 125 food items (91 core and 34 additional items) and between 79 and 83 nonfood items (58 core and between 21 and 25 additional items, depending on the optional module systematically allocated to the household) (Table 4 and Box 2).22 After data collection, multiple imputation techniques are used to estimate the consumption aggregates of CAPI households.

23. Only around 6 percent of the items excluded in the RCM, representing 21 percent of total food and nonfood consumption, were actually consumed by households. The tablet survey asked households about their consumption of 149 core items, as well as an additional 34 food items and 21 to 25 nonfood items, depending on the optional module (Table 4). Only around 6 percent of the items not asked of households in CAPI with the RCM were actually consumed. These items represent up to 21 percent of total food and nonfood consumption. Items asked of households represent around 80 percent of the total food and nonfood expenditure. Furthermore, out of the 68 food items and 43 to 47 nonfood items not asked, CAPI households only consumed on average around 3.2 to 3.4 and 2.8 to 3.0 respectively. As a result, the distribution of total expenditure is similar for collected and imputed items, indicating that the rapid consumption methodology produces precise consumption aggregates, while reducing respondent fatigue and the time spent during fieldwork (Figure 5).

22 The RCM allocates food and nonfood items into a core module and nonoverlapping optional modules according to their consumption shares. The tablet-based pilot employed a conservative approach by allocating a large number of items to the core module, which was asked of all households. However, the RCM can be implemented by assigning fewer items to the core module and further reducing the duration of interviews, while still deriving accurate poverty estimates (U. J. Pape and Mistiaen 2018).

Box 2: Rapid consumption methodology.
Poverty is an indicator of paramount importance for gauging the socio-economic wellbeing of a population. Consumption-based poverty measures, in which the poor are defined as those whose consumption level falls below the poverty line (the threshold consumption level for sustaining a minimum level of welfare for healthy living), are widely used in the developing world and play a critical role in policy decisions. Measurement of consumption, however, has traditionally been very time consuming, which is a serious issue particularly in the context of a continuous survey. A household consumption questionnaire contains a series of questions about the monetary value of each consumption item, whether it is purchased, self-produced, or bartered. With around 300 to 400 items, including both food and nonfood items, the time for administering the questionnaire often exceeds two hours. In addition to the high administration cost due to the long interview time, measurement errors may become significant towards the end of the questionnaire as respondents get tired. To overcome this challenge, a new methodology can be used in which we combine an innovative questionnaire design with standard imputation techniques. The new methodology allows us to substantially shorten the consumption questionnaire and reduce the administering time by imputing missing consumption values for items that are not explicitly asked. The proposed methodology makes it possible to derive accurate poverty estimates in less than 60 minutes of administering time per household.23 The rapid consumption survey methodology involves five main steps.
First, core items are selected based on their importance for consumption. Second, the remaining items are partitioned into optional modules. Third, optional modules are randomly assigned to groups of households. Fourth, after data collection, consumption of optional modules is imputed for all households. Fifth, the resulting consumption aggregate is used to estimate poverty indicators.

24. The median duration of interview in CAPI with the RCM was 162 minutes, with food and nonfood modules representing nearly two thirds of the total administering time. The rapid consumption methodology saves time during data collection by not asking all households the consumption details of every item. The median duration of interview in the tablet survey was 162 minutes, with a median of 59 and 37 minutes for the food and nonfood module respectively. Responding to the food and nonfood modules represented an average of 38 and 24 percent respectively of the total duration of the interview. Reducing time during fieldwork can be achieved by employing the RCM, which concentrates on asking certain items of each household, since i) consumption modules tend to be time-consuming; and ii) the consumption aggregates can be precisely estimated using multiple imputation techniques after fieldwork.

23 U. Pape and Mistiaen 2015a; U. J. Pape and Mistiaen 2018b.

Table 4: Allocation of consumption items into modules.
Module | Food: Total no. of items | Food: Share of total expenditure (%) | Food: Mean no. of items consumed | Nonfood: Total no. of items | Nonfood: Share of total expenditure (%) | Nonfood: Mean no. of items consumed
Core | 91 | 57 | 17.3 | 58 | 14 | 11.7
Module 1 | 34 | 6 | 1.7 | 22 | 3 | 1.3
Module 2 | 34 | 7 | 1.7 | 21 | 5 | 1.5
Module 3 | 34 | 7 | 1.5 | 25 | 3 | 1.5
Source: Authors’ calculation based on the tablet-based pilot.

Figure 5: CAPI total food and nonfood expenditure collected and imputed.
[Plot; x-axis: monthly per adult equivalent food and nonfood expenditure (deflated); y-axis: % of population; series: Core, Core + assigned module, Imputed.]
Source: Authors’ calculation based on the tablet-based pilot.

D. RESULTS FROM THE COMPARISON OF CAPI VS. PAPI

25. The data from both surveys was processed in almost identical ways. Both surveys are based on the same sampling frame and design. Data was collected in the same randomly drawn geographic locations or enumeration areas. The data collected with CAPI was validated and cleaned in a similar way to the KIHBS 2015/16 data. Moreover, the sampling weights for households included in the CAPI survey are estimated from the weights of households in the same enumeration area (EA) in the PAPI survey (Annex).

- Population and household characteristics

26. Easy-to-verify dwelling characteristics show no statistical difference between the paper-based survey and the tablet-based survey. The details of dwelling characteristics were asked in a similar sequence in both the paper and tablet surveys and yield no significant differences. The materials of the floor, roof and walls are not significantly different between the paper and tablet surveys at the national level, nor for rural and urban areas (Table 5).

Table 5: Dwelling characteristics.
Characteristic | National CAPI | National PAPI | National Diff. | Urban CAPI | Urban PAPI | Urban Diff. | Rural CAPI | Rural PAPI | Rural Diff.
Roof: Corrugated iron sheets (%) | 80.8 | 81.7 | No | 75.3 | 77.0 | No | 83.9 | 84.3 | No
Roof: Concrete (%) | 7.5 | 6.7 | No | 19.8 | 17.8 | No | 0.5 | 0.3 | No
Roof: Other (%) | 11.7 | 11.6 | No | 4.9 | 5.2 | No | 15.6 | 15.3 | No
Floor: Cement (%) | 48.3 | 47.3 | No | 73.0 | 72.6 | No | 34.2 | 32.8 | No
Floor: Carpet or ceramic (%) | 7.5 | 7.1 | No | 17.0 | 15.9 | No | 2.0 | 2.0 | No
Floor: Earth, sand & other (%) | 44.2 | 45.6 | No | 10.0 | 11.5 | No | 63.8 | 65.2 | No
Walls: Wood planks or shingles (%) | 8.4 | 8.6 | No | 2.2 | 2.1 | No | 12.0 | 12.4 | No
Walls: Corrugated iron sheets (%) | 7.3 | 8.3 | No | 12.7 | 15.8 | No | 4.3 | 4.0 | No
Walls: Mud & stone or bamboo with wood (%) | 32.5 | 32.9 | No | 8.8 | 8.4 | No | 46.2 | 46.9 | No
Walls: Stone w/lime, cement, bricks & other (%) | 51.7 | 50.2 | No | 76.2 | 73.7 | No | 37.5 | 36.7 | No
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Significance level: 1% (***), 5% (**), and 10% (*).

27. Water sources, sanitation, lighting and cooking fuel of households are also not significantly different between the paper and tablet survey. Similar to the previous result, there are no significant differences at the national level in point estimates for water sources, type of sanitation, lighting or cooking source between households included in the paper and tablet-based surveys (Table 6). For urban and rural areas, there are some differences between paper and tablet, mainly in categories that refer to other water sources or cooking fuels. However, it seems unlikely that the differences are associated with the data collection method employed, since some CAPI estimates are larger than the PAPI estimates while others are smaller, suggesting the lack of a systematic bias.

Table 6: Water, sanitation, cooking and lighting source.
Characteristic | National CAPI | National PAPI | National Diff. | Urban CAPI | Urban PAPI | Urban Diff. | Rural CAPI | Rural PAPI | Rural Diff.
Water: Piped into dwelling, plot or yard (%) | 30.8 | 30.3 | No | 54.6 | 52.4 | No | 17.0 | 17.6 | No
Water: Public tap or stand pipe (%) | 13.4 | 13.9 | No | 24.0 | 24.1 | No | 7.3 | 8.1 | No
Water: Tubewell or borehole with pump (%) | 7.4 | 6.6 | No | 6.1 | 3.9 | No | 8.2 | 8.1 | No
Water: Protected well or spring (%) | 15.0 | 15.6 | No | 3.5 | 4.3 | No | 21.6 | 22.1 | No
Water: Unprotected well or spring & other (%) | 13.8 | 15.1 | No | 9.0 | 12.2 | -3.3** | 16.6 | 16.7 | No
Water: Nature/rain (%) | 19.6 | 18.5 | No | 2.9 | 3.1 | No | 29.2 | 27.3 | No
Sanitation: VIP or pit latrine with slab (%) | 47.1 | 46.0 | No | 44.3 | 43.5 | No | 48.7 | 47.5 | No
Sanitation: Pit latrine without slab or open pit (%) | 24.1 | 25.1 | No | 6.7 | 6.8 | No | 34.1 | 35.6 | No
Sanitation: Flush & other (%) | 28.8 | 28.9 | No | 48.9 | 49.7 | No | 17.3 | 16.9 | No
Lighting: Electricity (%) | 41.5 | 41.4 | No | 80.0 | 80.0 | No | 19.4 | 19.3 | No
Lighting: Paraffin or pressure lamp (%) | 35.6 | 35.0 | No | 14.1 | 13.3 | No | 48.0 | 47.5 | No
Lighting: Other (%) | 22.9 | 23.6 | No | 5.9 | 6.7 | No | 32.6 | 33.2 | No
Cooking: Charcoal (%) | 15.5 | 14.6 | No | 26.8 | 23.3 | 3.5** | 8.9 | 9.6 | No
Cooking: Kerosene (%) | 13.7 | 14.0 | No | 32.7 | 32.8 | No | 2.8 | 3.1 | No
Cooking: LPG & other (%) | 15.8 | 16.9 | No | 32.6 | 37.2 | No | 6.2 | 5.2 | 1.1*
Cooking: Firewood (%) | 55.0 | 54.6 | No | 7.9 | 6.8 | No | 82.1 | 82.1 | No
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Significance level: 1% (***), 5% (**), and 10% (*).

28. Age, gender, literacy and educational characteristics from the tablet survey are not statistically different from those from the paper survey. The household roster is asked early in both paper and tablet interviews, following the same set of questions. When comparing the tablet pilot against KIHBS 2015/16, there are only small differences for certain 5-year age groups, representing relatively small shares of the population. Moreover, the share of Kenyans who have ever attended school, the adult literacy rate and school enrolment for the population aged 6-13 are similar for both paper and tablet surveys (Table 7).

29.
In line with previous findings, the profile of the household head is the same regardless of the data collection method considered. The age, gender and education of the household head are also equivalent between the tablet-based survey and KIHBS 2015/16 for urban areas and at the national level (Table 8). For rural areas, the share of women as the head of the household is slightly larger in the paper-based survey.

Table 7: Population by age, gender and education.
Characteristic | National CAPI | National PAPI | National Diff. | Urban CAPI | Urban PAPI | Urban Diff. | Rural CAPI | Rural PAPI | Rural Diff.
Share of women (%) | 50.8 | 50.6 | No | 51.4 | 49.6 | 1.8** | 50.5 | 51.1 | No
0-4 Years (%) | 13.5 | 13.4 | No | 13.9 | 13.0 | No | 13.4 | 13.5 | No
5-9 Years (%) | 14.1 | 14.3 | No | 11.8 | 11.6 | No | 15.1 | 15.5 | No
10-14 Years (%) | 13.7 | 13.3 | 0.5* | 10.3 | 9.3 | No | 15.2 | 14.8 | No
15-19 Years (%) | 11.0 | 11.1 | No | 9.3 | 8.6 | No | 11.7 | 12.1 | No
20-24 Years (%) | 8.4 | 9.0 | -0.6** | 11.1 | 12.8 | -1.7** | 7.2 | 7.4 | No
25-29 Years (%) | 8.4 | 8.1 | No | 12.8 | 12.5 | No | 6.6 | 6.3 | No
30-34 Years (%) | 6.7 | 6.8 | No | 9.1 | 10.0 | No | 5.7 | 5.5 | No
35-39 Years (%) | 5.7 | 5.5 | No | 7.2 | 6.9 | No | 5.1 | 4.9 | No
40-44 Years (%) | 4.4 | 4.5 | No | 4.9 | 4.8 | No | 4.2 | 4.3 | No
45-49 Years (%) | 3.3 | 3.2 | No | 3.4 | 3.3 | No | 3.2 | 3.2 | No
50-54 Years (%) | 2.8 | 2.7 | No | 2.3 | 2.7 | No | 3.0 | 2.8 | 0.2*
55-59 Years (%) | 2.2 | 2.4 | No | 1.4 | 1.7 | No | 2.6 | 2.7 | No
60-64 Years (%) | 2.2 | 2.1 | No | 1.3 | 1.2 | No | 2.5 | 2.5 | No
65-69 Years (%) | 1.1 | 0.9 | 0.2*** | 0.6 | 0.3 | 0.3*** | 1.3 | 1.1 | 0.2*
70-74 Years (%) | 0.9 | 0.9 | No | 0.3 | 0.3 | No | 1.2 | 1.2 | No
75-79 Years (%) | 0.7 | 0.6 | No | 0.3 | 0.3 | No | 0.8 | 0.8 | No
80-84 Years (%) | 0.5 | 0.5 | No | 0.1 | 0.2 | No | 0.6 | 0.6 | No
85+ Years (%) | 0.5 | 0.6 | No | 0.2 | 0.3 | No | 0.7 | 0.7 | No
Ever attended school (%) | 89.3 | 89.5 | No | 94.7 | 94.9 | No | 87.0 | 87.4 | No
Adult literacy (% of 25+ years) | 66.1 | 66.9 | No | 81.1 | 82.5 | No | 62.1 | 62.7 | No
Enrolment rate (% of 6-13 years) | 94.9 | 94.4 | No | 96.5 | 96.2 | No | 94.4 | 93.9 | No
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Significance level: 1% (***), 5% (**), and 10% (*).

30.
Additionally, imputed monthly rent expenditure for urban households is also not statistically different between the paper and tablet survey. Information on rent expenditure was collected in both surveys. However, the data is not available for those households that own their dwelling. Thus, rent expenditure was imputed by estimating a stepwise log-linear Ordinary Least Squares (OLS) regression of rents reported on housing characteristics (Annex). The average imputed monthly rent expenditure for urban households from KIHBS 2015/16 is KSh 3,741, compared to KSh 4,026 generated from the CAPI data and regression model. Yet, the difference between the two is not statistically significant.

Table 8: Characteristics of the household head.
Characteristic | National CAPI | National PAPI | National Diff. | Urban CAPI | Urban PAPI | Urban Diff. | Rural CAPI | Rural PAPI | Rural Diff.
Share of women (%) | 32.2 | 32.4 | No | 28.8 | 26.8 | No | 34.2 | 35.7 | -1.4*
Average age (years) | 43.47 | 43.43 | No | 37.40 | 37.27 | No | 46.95 | 46.97 | No
Education: None or other (%) | 13.9 | 13.2 | No | 5.1 | 4.6 | No | 18.7 | 18.2 | No
Education: Pre-primary or primary (%) | 46.0 | 46.1 | No | 32.8 | 34.1 | No | 53.2 | 53.0 | No
Education: Secondary or tertiary (%) | 40.1 | 40.6 | No | 62.0 | 61.3 | No | 28.0 | 28.8 | No
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Significance level: 1% (***), 5% (**), and 10% (*).

- Consumption aggregates and poverty estimates

31. Various adjustments are made to obtain comparable consumption aggregates between the paper and the tablet survey. To compare the paper and tablet surveys in terms of consumption and poverty incidence, equivalent consumption aggregates are created with some adjustments to both datasets. The comparable consumption aggregates are obtained first by considering the same food and nonfood items included in both surveys. Given the issue with prices collected in CAPI, the tablet aggregate was derived using the median PAPI prices for each item and EA.
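The valuation step just described can be sketched as follows. This is an illustration under stated assumptions, not the processing code used for the report: the record layout, the field names and the numbers are hypothetical, and the sketch assumes every item and EA combination observed in CAPI also has at least one PAPI price.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (item, enumeration area)


def median_prices(papi_records: List[dict]) -> Dict[Key, float]:
    """Median PAPI unit price for each item and EA."""
    grouped: Dict[Key, List[float]] = defaultdict(list)
    for rec in papi_records:
        grouped[(rec["item"], rec["ea"])].append(rec["unit_price"])
    return {key: median(prices) for key, prices in grouped.items()}


def value_capi_consumption(capi_records: List[dict], prices: Dict[Key, float]) -> Dict[str, float]:
    """Value CAPI quantities (in kg) at the median PAPI price of the same item and EA."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in capi_records:
        # A real pipeline would need a fallback when the (item, EA) pair is missing.
        price = prices[(rec["item"], rec["ea"])]
        totals[rec["household"]] += rec["quantity_kg"] * price
    return dict(totals)


# Hypothetical records, for illustration only.
papi = [
    {"item": "rice", "ea": "EA1", "unit_price": 100.0},
    {"item": "rice", "ea": "EA1", "unit_price": 120.0},
    {"item": "rice", "ea": "EA1", "unit_price": 110.0},
    {"item": "eggs", "ea": "EA1", "unit_price": 300.0},
]
capi = [
    {"household": "hh1", "item": "rice", "ea": "EA1", "quantity_kg": 2.0},
    {"household": "hh1", "item": "eggs", "ea": "EA1", "quantity_kg": 0.5},
]

prices = median_prices(papi)
capi_values = value_capi_consumption(capi, prices)
```

Using the median rather than the mean keeps the reference price robust to a few extreme unit values within an EA.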
Moreover, the conversion factors of food items to standard units in CAPI are replaced by the median implicit conversion factor for each item and non-standard unit from PAPI. Finally, other minor adjustments are introduced to produce equivalent consumption aggregates between both surveys (Table 13 in the Annex).

32. A core food consumption aggregate from information collected for all households in both surveys allows a direct comparison of the data collection method. The first consumption aggregate considers a subset of 91 food consumption items, which are classified under the rapid consumption methodology as core items given their relatively high consumption share (Box 2). The consumption details of these 91 items were collected from all households directly in both the paper and tablet survey. This allows a direct comparison of PAPI against CAPI, as there are no differences in the consumption module of both surveys for these items. To obtain the poverty status of households, the core food consumption aggregate was then compared against a national food poverty line, which was rescaled using the PAPI share of core food items in the total comparable food consumption aggregate.24

33. Two additional consumption aggregates, total food consumption and total food and nonfood expenditure, employed the rapid consumption methodology for CAPI. The paper survey included a full consumption module, while CAPI households were only asked about their consumption of certain food and nonfood items. Thus, multiple imputation techniques are employed to estimate total food consumption and total expenditure on food and nonfood items from the tablet-based survey (Box 2 and Annex).25 Total food consumption was compared against the national food poverty line to estimate the share of the population that is unable to meet the minimum basic food consumption needs.26 The list of nonfood items from the tablet-based survey excluded some items that are considered in KIHBS 2015/16 and in the computation of the national poverty line. Therefore, a comparable total expenditure aggregate (from food and nonfood items) for CAPI and PAPI was compared against a rescaled national poverty line, using the PAPI share of items in the comparable aggregate in the total KIHBS 2015/16 aggregate as the scaling factor.

24 Olson Lanjouw and Lanjouw 2001.

Table 9: Poverty estimates.
Measure | National CAPI | National PAPI | National Diff. | Urban CAPI | Urban PAPI | Urban Diff. | Rural CAPI | Rural PAPI | Rural Diff.
Core food consumption
Poverty incidence (%) | 33.0 (0.746) | 33.2 (0.648) | No | 26.9 (1.505) | 26.6 (1.290) | No | 35.7 (0.834) | 35.8 (0.723) | No
Poverty depth (%) | 10.1 (0.328) | 9.9 (0.266) | No | 8.4 (0.694) | 8.1 (0.441) | No | 10.9 (0.359) | 10.7 (0.321) | No
Poverty severity (%) | 4.7 (0.226) | 4.4 (0.166) | No | 4.1 (0.532) | 3.6 (0.248) | No | 5.0 (0.228) | 4.8 (0.208) | No
Total food consumption
Poverty incidence (%) | 32.4 (0.736) | 33.8 (0.640) | No | 27.4 (1.568) | 27.4 (1.264) | No | 34.5 (0.805) | 36.3 (0.721) | -1.8*
Poverty depth (%) | 10.2 (0.326) | 9.8 (0.248) | No | 9.2 (0.713) | 8.3 (0.432) | No | 10.6 (0.355) | 10.4 (0.296) | No
Poverty severity (%) | 4.4 (0.210) | 4.3 (0.145) | No | 4.2 (0.522) | 3.7 (0.234) | No | 4.5 (0.201) | 4.5 (0.179) | No
Total food and nonfood consumption
Poverty incidence (%) | 36.2 (0.736) | 37.1 (0.667) | No | 31.7 (1.592) | 31.1 (1.472) | No | 38.1 (0.801) | 39.5 (0.719) | No
Poverty depth (%) | 11.8 (0.344) | 11.1 (0.269) | No | 11.5 (0.759) | 9.8 (0.552) | 1.7* | 12.0 (0.372) | 11.7 (0.304) | No
Poverty severity (%) | 5.3 (0.237) | 4.9 (0.163) | No | 5.6 (0.571) | 4.5 (0.310) | 1.1* | 5.2 (0.235) | 5.1 (0.190) | No
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Standard errors in parentheses.
Significance level: 1% (***), 5% (**), and 10% (*).

34. Poverty estimates are similar between CAPI and PAPI based on core food consumption, which allows isolating the effect of the data collection method. At the national level, poverty incidence from a core food consumption aggregate is 33.0 percent for the tablet-based pilot and 33.2 percent for the paper survey. The differences between estimates from the two surveys are not statistically significant at the national level or for urban and rural areas (Table 9). Poverty estimates from core food consumption allow isolating the effect of the data collection method, as there are no differences in the consumption module of CAPI and PAPI for core food items. Collecting household-level consumption data with CAPI results in poverty estimates equivalent to those obtained from a paper survey.

25 The nonfood component excludes rent, energy and educational expenses.
26 The food poverty line was defined considering basic food items which attain the 2,250 Kcal minimum nutritional requirements (Kenya National Bureau of Statistics 2018).

Figure 6: Total food and nonfood expenditure for rural areas. [Plot of % of population against monthly per adult equivalent total expenditure (deflated), KSh 0 to 12,000, CAPI vs. PAPI.]
Figure 7: Total food and nonfood expenditure for urban areas excluding Nairobi. [Plot of % of population against monthly per adult equivalent total expenditure (deflated), KSh 0 to 20,000, CAPI vs. PAPI.]
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

35. The rapid consumption methodology with CAPI produces poverty point estimates that are statistically indistinguishable from those of a full consumption module in PAPI.
The point estimates of poverty incidence, depth and severity derived from i) total food consumption and ii) total food and nonfood expenditure, including the rapid consumption methodology for CAPI, are statistically equivalent between KIHBS 2015/16 and the tablet-based pilot at the national level (Table 9 and Box 3 in the Annex).27 Poverty incidence in urban areas is also similar between CAPI and PAPI for both consumption aggregates, as are poverty depth and severity from total food consumption. Poverty depth and severity differ slightly between surveys when considering total food and nonfood expenditure in urban areas. In rural areas, poverty incidence from total food consumption also differs slightly between surveys, with all other poverty measures being equivalent between CAPI and PAPI. The relatively small differences in some poverty measures for urban and rural areas, all at the 10 percent level of significance, do not seem to be associated with the choice of survey method and consumption module, since there is no evidence of a systematic bias when comparing the paper and tablet estimates for the two consumption aggregates that employed the RCM for CAPI. Overall, the poverty estimates from CAPI with a rapid consumption methodology are mostly equivalent to those derived from the KIHBS 2015/16 paper survey.

36. In addition, the distribution of total food and nonfood expenditure for rural areas is similar between PAPI with a full consumption module and CAPI with a rapid consumption methodology. Despite the differences in the number of items asked of respondents in the full consumption module and the RCM, the distribution of total food and nonfood expenditure for rural areas is equivalent between PAPI and CAPI (Figure 6).
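As a rough check, the significance flags in Table 9 can be approximated from the published point estimates and standard errors. The sketch below assumes that the CAPI and PAPI estimates are independent and uses a two-sided normal approximation. Both are simplifying assumptions made here for illustration, not the authors' documented procedure, and the function name `diff_z_test` is hypothetical.

```python
import math

def diff_z_test(est_a, se_a, est_b, se_b):
    """Two-sided z-test for the difference between two independent estimates."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes zero covariance
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return diff, z, p_value

# Rural poverty incidence from total food consumption (Table 9):
# CAPI 34.5 (SE 0.805) vs. PAPI 36.3 (SE 0.721).
diff, z, p = diff_z_test(34.5, 0.805, 36.3, 0.721)
print(f"diff={diff:.1f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions the p-value for this row falls between 0.05 and 0.10, which is consistent with the single asterisk reported in the table.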
A two-sample Kolmogorov–Smirnov test indicates there are no differences in the distribution of total food and nonfood expenditure between CAPI and PAPI for rural areas.28 CAPI technology combined with a rapid consumption methodology produces total food and nonfood expenditure results consistent with those obtained from a full consumption module in KIHBS 2015/16. In line with this, the Gini inequality index for rural areas is not statistically different between CAPI and PAPI for any of the three consumption aggregates considered (Table 10).

27 Food poverty here is slightly higher than the KIHBS 2015/16 estimate (32 percent) because the comparable consumption aggregate excluded a few items while the same national food poverty line was considered. Similarly, poverty from a total food and nonfood consumption aggregate differs from the absolute poverty rate from KIHBS 2015/16 (36.1 percent) because the consumption aggregates only considered items that were included in both surveys and excluded rent, energy and educational expenses.

Table 10: Gini inequality index.

| Consumption aggregate | National level excluding Nairobi: CAPI | PAPI | Diff. | Urban areas excluding Nairobi: CAPI | PAPI | Diff. | Urban areas: CAPI | PAPI | Diff. | Rural areas: CAPI | PAPI | Diff. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Core food consumption | 32.3 | 32.2 | No | 31.5 | 31.8 | No | 29.9 | 31.2 | -1.2* | 31.4 | 31.4 | No |
| Total food consumption | 33.3 | 34.2 | -0.9** | 32.9 | 36.0 | -3.0* | 31.6 | 35.2 | -3.6*** | 32.0 | 32.2 | No |
| Total food and nonfood consumption | 34.1 | 35.2 | -1.1** | 33.1 | 35.5 | -2.4* | 32.2 | 35.1 | -2.8*** | 32.3 | 32.8 | No |

Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot. Significance level: 1% (***), 5% (**), and 10% (*).

37. In urban areas, some differences at the top of the consumption distribution between surveys do not seem to be associated with the choice of data collection method and consumption module.
The distribution of total food and nonfood consumption for urban areas is similar between PAPI and CAPI for expenditure values below KSh 9,500 (Figure 8 in the Annex). Higher nonresponse rates in wealthier areas like Nairobi can potentially distort consumption at the top of the distribution. KIHBS 2015/16 had a lower response rate than the tablet-based pilot in every province, with the largest difference in Nairobi (77 percent for PAPI vs. 99 percent for CAPI; Table 11 in the Annex). After excluding the capital city, the distribution of total expenditure for urban areas is similar between the paper and the tablet survey (Figure 7). The two-sample Kolmogorov–Smirnov test for urban areas without Nairobi indicates there are no differences in the distribution of total food and nonfood expenditure between CAPI and PAPI.29 The Gini inequality index is statistically equivalent from a core food consumption aggregate for urban areas without Nairobi, and only slightly different, at the 10 percent level of significance, for total food consumption and total food and nonfood expenditure (Table 10). Furthermore, the Gini inequality index increases for both CAPI and PAPI after excluding Nairobi, while the difference between surveys decreases for all the consumption aggregates. This ultimately suggests that differences in the urban Gini index between CAPI and PAPI are more likely to be explained by different response rates in other urban areas beyond the capital city, and less likely to be associated with the data collection method and the consumption module considered (Table 10).

28 The two-sample Kolmogorov–Smirnov test verifies the equality of distributions considering the null hypotheses that the distribution of total food and nonfood expenditure for PAPI i) contains smaller values than for CAPI; and ii) contains larger values than for CAPI. The associated p-values for rural areas are 0.990 and 0.779 respectively.
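The two one-sided comparisons described in footnote 28 can be illustrated with a minimal sketch. It computes the one-sided two-sample Kolmogorov–Smirnov statistics and their large-sample p-value approximation using only the Python standard library. It ignores sampling weights and the survey design, the expenditure values are hypothetical, and the function name `one_sided_ks` is an assumption made here, so it should not be read as the authors' exact procedure.

```python
import math
from bisect import bisect_right

def one_sided_ks(x, y):
    """One-sided two-sample Kolmogorov-Smirnov statistics with asymptotic p-values.

    d_plus is the largest amount by which the empirical CDF of x exceeds that
    of y (x tends to hold smaller values); d_minus is the reverse.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d_plus = d_minus = 0.0
    for v in xs + ys:  # the CDF gap can only change at observed values
        gap = bisect_right(xs, v) / n - bisect_right(ys, v) / m
        d_plus = max(d_plus, gap)
        d_minus = max(d_minus, -gap)
    scale = n * m / (n + m)
    p_plus = math.exp(-2.0 * scale * d_plus ** 2)   # large-sample approximation
    p_minus = math.exp(-2.0 * scale * d_minus ** 2)
    return d_plus, p_plus, d_minus, p_minus

# Hypothetical per adult equivalent expenditure values, for illustration only.
papi = [2100, 3400, 4800, 5200, 7600, 9100]
capi = [2000, 3500, 4700, 5400, 7500, 9300]
print(one_sided_ks(papi, capi))
```

Large p-values in both directions, as reported in the footnotes, mean that neither distribution is detectably shifted relative to the other.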
29 For urban areas excluding Nairobi, the p-values for the null hypotheses of the two-sample Kolmogorov–Smirnov test are 0.961 and 0.779.

E. CONCLUSIONS AND RECOMMENDATIONS

38. The tablet-based pilot produced point estimates that are statistically indistinguishable from those of the paper survey in terms of household and population characteristics. Easy-to-verify dwelling characteristics, as well as the water sources, sanitation, lighting and cooking fuel of households, show no statistically significant differences at the national level between the paper-based survey and the tablet-based survey. Gender, literacy, education and other characteristics of the household head are also similar in the paper and tablet-based surveys.

39. The rapid consumption methodology with CAPI produces similar consumption and poverty results to a full consumption module in PAPI. The point estimates of poverty incidence, depth and severity derived from i) total food consumption and ii) total food and nonfood expenditure, including the rapid consumption methodology for CAPI, are statistically equivalent between KIHBS 2015/16 and the tablet-based pilot at the national level. In addition, the distributions of total food and nonfood expenditure are similar between CAPI and PAPI for rural areas and for urban areas excluding Nairobi, regardless of whether they are generated from a full consumption module or the rapid consumption methodology.

40. Using the rapid consumption methodology and CAPI would facilitate providing frequent, timely and high-quality data for the KCHS. Including the RCM in the KCHS can considerably reduce the time needed to administer the questionnaire, while producing household, population, consumption and poverty estimates that are not statistically significantly different from those using a paper survey with a full consumption module.
In addition, collecting data with CAPI has the potential to improve data quality, eliminates the need for data entry, supports near real-time monitoring of data collection and reduces the time between fieldwork and analysis. CAPI can help close data gaps in the collection and dissemination of poverty data.30 Even in a context of conflict and violence, technological innovations based on CAPI have proven to be successful in establishing a survey infrastructure to obtain valid and reliable information.31

41. Implementing the KCHS with the RCM and CAPI requires considering other aspects to leverage their benefits. CAPI increases the time needed to design the questionnaire and thus must be combined with constant efforts in coding and testing it before data collection. The tablets or other devices, together with the software and hardware, would need to be procured and tested in advance. In addition, the training of enumerators will need to emphasize how to use the tablets and how to record answers correctly. A successful monitoring system would need to ensure appropriate data management and transfer from the devices to the cloud server, and define how to provide timely feedback to teams in the field. Additionally, training and capacity building will be needed to derive total consumption aggregates with multiple imputation techniques as part of the rapid consumption methodology, besides providing documentation and support to data users.

30 Serajuddin et al. 2015.
31 Pape and Parisotto 2019.

ANNEX

1. Additional figures and tables

Box 3: Measures of poverty and inequality

The poverty incidence is the most common poverty measure. The poverty incidence or headcount ratio refers to the share of the population that is poor, that is, whose total consumption is lower than the poverty line.
It is derived from the total consumption of the household on food, non-food and durable goods, the number of members that comprise the household, and a specific consumption threshold or poverty line. This measure describes the extent of poverty in a country or region.

The poverty gap index measures how far poor households are from overcoming poverty, while the poverty severity index measures the level of inequality among the poor. The poverty gap index is the difference between current consumption and the poverty line as a proportion of the poverty line for the poor population. It can be interpreted as the minimum amount of resources that would have to be transferred to the poor, under a perfect targeting scheme, to eradicate poverty. The poverty severity index is estimated by averaging the squared poverty gaps. It attributes a larger weight to the poorest among the poor, thus reflecting inequality conditions for the poor.

While poverty measures absolute deprivation with respect to a given threshold, inequality is a relative measure indicating how little some parts of a population have relative to the entire population. In the context of monetary poverty, equality can be defined as an equal distribution of resources across the population. This means that each share of the population owns the same share of consumption/income. The Lorenz curve compares the cumulative share of the population with their cumulative share of consumption/income. A perfectly equal distribution is indicated by a diagonal. The other extreme is complete inequality, where one individual owns all the consumption/income. These two (theoretical) extremes define the boundaries for observed inequality.

The Gini index is the most commonly used measure of inequality. A Gini index of 0 indicates perfect equality, while 100 corresponds to complete inequality.
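The measures described in this box can be sketched in a few lines. The example below is a minimal, unweighted illustration under stated assumptions: it treats each value as one person's consumption per adult equivalent, ignores sampling weights, and uses hypothetical function names (`fgt`, `gini`). The Gini index is obtained from the area under the Lorenz curve with the trapezoid rule.

```python
def fgt(consumption, poverty_line, alpha):
    """Foster-Greer-Thorbecke index: alpha=0 incidence, 1 gap, 2 severity."""
    total = 0.0
    for c in consumption:
        if c < poverty_line:
            total += ((poverty_line - c) / poverty_line) ** alpha
    return total / len(consumption)

def gini(values):
    """Gini index (0 to 100) from the area under the Lorenz curve."""
    xs = sorted(values)
    n, grand_total = len(xs), sum(xs)
    area, prev_share, running = 0.0, 0.0, 0.0
    for x in xs:
        running += x
        share = running / grand_total
        area += (prev_share + share) / (2 * n)  # trapezoid of width 1/n
        prev_share = share
    return 100 * (1 - 2 * area)

# Hypothetical consumption values and poverty line, for illustration only.
sample = [50, 100, 150, 200]
print(fgt(sample, 100, 0), fgt(sample, 100, 1), fgt(sample, 100, 2))
print(gini(sample))
```

With one of four people below the line at half its value, incidence is 0.25, the gap index 0.125 and the severity index 0.0625, which shows how the squared gap gives the measure its sensitivity to the poorest.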
The Gini index is related to the Lorenz curve, as it measures the area between the distribution of consumption or income represented by the Lorenz curve and the diagonal line that implies perfect equality.

Figure 8: Total food and nonfood expenditure for urban areas. [Plot of % of population against monthly per adult equivalent total expenditure (deflated), KSh 0 to 20,000, CAPI vs. PAPI.]
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

Table 11: Average response rate by province.

| Province | KIHBS 2015/16 response rate (%) | Tablet-based pilot response rate (%) |
|---|---|---|
| Central | 92.4 | 98.7 |
| Coast | 91.3 | 99.8 |
| Eastern | 92.2 | 100.0 |
| Nairobi | 76.9 | 99.5 |
| North Eastern | 87.7 | 99.9 |
| Nyanza | 93.1 | 99.1 |
| Rift Valley | 91.0 | 96.7 |
| Western | 93.7 | 100.0 |

Source: Authors’ calculation based on KIHBS 2015/16 and the tablet-based pilot.

2. Sampling design and weights

The tablet pilot surveyed six households per EA, while KIHBS 2015/16 surveyed ten (Table 12).

Table 12: Characteristics of the sample.

| Characteristic | CAPI | PAPI |
|---|---|---|
| Number of EAs | 2,251 | 2,387 |
| Number of households | 12,851 | 21,773 |

Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

To compare indicators from the paper and tablet surveys, the sampling weights for CAPI are derived from PAPI at the EA level. That is, the sum of sampling weights at the EA level in PAPI was divided among the households included in CAPI. Sampling weights are then scaled to the totals from PAPI at the stratum level. The resulting weights for CAPI add up to the same totals as PAPI at the urban/rural and county levels. The equation below provides the formal description:

$$ w^{c}_{ij} = w^{p}_{ij} \times \frac{nh^{p}_{ij}}{nh^{c}_{ij}} $$

where
$w^{c}_{ij}$ = CAPI cluster weight for cluster i in stratum j;
$w^{p}_{ij}$ = PAPI cluster weight for cluster i in stratum j;
$nh^{p}_{ij}$ = Number of PAPI households interviewed in cluster i in stratum j;
$nh^{c}_{ij}$ = Number of CAPI households interviewed in cluster i in stratum j.

For example, in a cluster where ten PAPI households and six CAPI households were interviewed, the CAPI cluster weight is the PAPI cluster weight multiplied by 10/6.

In the scenario where no CAPI households are interviewed in a cluster, the weight of the other clusters in the same stratum must be adjusted:

$$ w^{c2}_{ij} = w^{c1}_{ij} \times \frac{\sum w^{p}_{j}}{\sum w^{c}_{j}} $$

such that
$w^{c1}_{ij}$ = CAPI cluster weight for cluster i in stratum j after adjusting for differences in the number of households between CAPI and PAPI clusters only;
$w^{c2}_{ij}$ = CAPI cluster weight for cluster i in stratum j after adjusting for differences in the number of households and for CAPI clusters in stratum j missing all households;
$\sum w^{p}_{j}$ = Sum of PAPI cluster weights in stratum j;
$\sum w^{c}_{j}$ = Sum of CAPI cluster weights in stratum j.

3. Allocation of households to PSUs

During data collection some enumerators selected the wrong EA for the interview conducted. As a result, many EAs had more than six households (Figure 9).

Figure 9: Allocation of households and EAs.
Source: Authors’ calculation based on the KIHBS 2015/16 and the tablet-based pilot.

The KNBS was able to validate the correct EA number for some submissions based on their GPS positions. However, this was only possible for 9,146 households out of 13,795, and the remaining interviews were not validated. To solve the issue, the midpoint (median latitude/longitude) of each EA was obtained from the GPS positions of households interviewed in PAPI. Each remaining household was then allocated to the EA with the closest midpoint, based on its distance to the midpoints of all EAs. As a robustness check, the same process was applied to households with a correct EA ID from fieldwork records, and the approach assigned the correct EA in 94 percent of the cases.
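The allocation procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not specify the distance measure, so great-circle (haversine) distance is assumed here, and the record layout and function names (`ea_midpoints`, `haversine_km`, `nearest_ea`) are hypothetical.

```python
import math
from statistics import median

def ea_midpoints(papi_households):
    """Median latitude/longitude per EA from (ea_id, lat, lon) records."""
    by_ea = {}
    for ea_id, lat, lon in papi_households:
        by_ea.setdefault(ea_id, []).append((lat, lon))
    return {
        ea_id: (median(p[0] for p in pts), median(p[1] for p in pts))
        for ea_id, pts in by_ea.items()
    }

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (Earth radius assumed 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_ea(lat, lon, midpoints):
    """Return (ea_id, distance_km) for the closest EA midpoint."""
    return min(
        ((ea_id, haversine_km(lat, lon, mlat, mlon))
         for ea_id, (mlat, mlon) in midpoints.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical PAPI household positions, for illustration only.
papi = [("EA1", -1.28, 36.82), ("EA1", -1.29, 36.83), ("EA1", -1.30, 36.81),
        ("EA2", -4.04, 39.66), ("EA2", -4.05, 39.67)]
print(nearest_ea(-1.285, 36.825, ea_midpoints(papi)))
```

Returning the distance alongside the EA makes it straightforward to apply a distance cut-off afterwards.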
As a result, 9,146 households were assigned to their EA using the information from KNBS; 334 did not have GPS coordinates and were dropped; and 4,315 were allocated to an EA using their GPS coordinates and the closest midpoint from PAPI. Finally, an additional correction excluded households that were more than 1 km away from the midpoint of the EA, but only among EAs identified through GPS positions from PAPI which had more than 6 interviews in the EA. With this correction, 168 additional households were excluded.

4. Quality control of CAPI submissions

The submissions from the tablet-based survey were validated before proceeding to the cleaning and analysis stage of the survey. The following validation rules were considered to exclude submissions:
- Duration: Submissions with a duration of less than 30 minutes, those with missing duration in the food or nonfood modules, and those interviews whose duration was less than 10 minutes in the food module and less than 5 minutes in the nonfood module. 76 interviews were deemed invalid for this reason.
- Location: Submissions whose GPS coordinates do not fall within the EA boundaries from the PAPI records. 346 interviews were deemed invalid for this reason.
- Completeness: 18 interviews with no information on food consumption and 2 submissions with zero food consumption recorded were also excluded.

5. Imputed rent values for urban households

Rent was imputed by estimating a stepwise log-linear Ordinary Least Squares (OLS) regression of reported rents on housing characteristic variables.32 The same structural model used in PAPI was employed to obtain rent expenditure for urban households in CAPI.
The variables considered in the model are the following:
- Location
- Number of rooms
- Construction materials
- Type of water supply and sanitation
- Type of energy source for cooking
- Household head employment and
- Educational characteristics

32 Kenya National Bureau of Statistics 2018.

6. Poverty line and consumption aggregates

The food consumption aggregate was obtained from 4 sub-groups: purchased items, stocked, own-production and gifts. Data on non-food consumption by households was also collected. The set of nonfood items included in both surveys was considered, excluding rent, energy and educational expenses. To obtain comparable consumption aggregates between the paper and tablet surveys, various adjustments are made. Table 13 presents each source of discrepancy and the correction introduced to obtain comparable consumption measures between surveys.

For the CAPI consumption aggregates, multiple imputations are used in the following way:
- Model selection.
  ▪ First, the best model for the multiple imputation process is selected based on the Furnival-Wilson leaps-and-bounds algorithm to explore all possible combinations of independent variables.
  ▪ The dependent variable corresponds to the natural logarithm of collected consumption, that is, consumption from the core module as well as the optional module allocated to the household.
  ▪ The independent variables include strata, optional module, urban/rural location of the household, dwelling characteristics (material of floor, roof and walls), source of water, sanitation, cooking fuel and lighting, number of habitable rooms, demographic characteristics (household size, proportion of children and elderly), ownership of assets (radio, tv, cellphone, bicycle and mosquito net), as well as characteristics of the household head (gender, education and employment status).
  ▪ Once the best specification is identified, the model is complemented with two dummy variables indicating the quartile associated with each household from the consumption of core food and core non-food expenditure.
- Multiple imputation.
  ▪ A two-step multiple imputation process is then implemented for each optional module using the log-space for consumption and an indicator of whether consumption collected on the respective module is zero or not.
  ▪ In the first step, the model estimates whether the module has non-zero consumption using a logit regression, while in the second step the consumption of the module is estimated with an OLS regression using the best specification previously identified and drawing 100 point estimates.
  ▪ Finally, consumption aggregates are obtained from the 100 point estimates of each household.

Table 13: Corrections made to produce comparable consumption aggregates.

| Issue | Correction |
|---|---|
| Upper bound constraints for unit prices induced lower values | Use median prices for urban and rural areas from the paper survey. Use the median deflator by EA from the paper survey. |
| Implicit non-standard conversion factors induced higher quantities consumed | Convert to standard units (Kg/Lt) using the median values by item and non-standard unit from the paper survey. |
| Non-standard conversion factors combined with lower bound constraints for quantities reported in ‘handful’ and ‘piece’ units induced higher values | Replace the consumption value of 3 items that were only reported in ‘handful’ or ‘piece’ using the median values by EA from the paper survey. For the rest of the items, entries reported in ‘handful’ or ‘piece’ were replaced by the median value by item from the tablet survey. |
| Upper bound constraints for quantities did not prevent outliers | Outliers in the tablet survey with values beyond the maximum value in the paper survey were replaced by the median value by item from the tablet survey. |
| Food items were different in the paper vs. tablet survey | 1 non-core item included in the tablet but not the paper survey was excluded from the tablet consumption aggregate. 25 non-core items included in the paper but not the tablet survey were excluded from the KIHBS consumption aggregate. |
| For item 'food eaten outside the household' only the price was considered to derive the consumption value in the paper survey (i.e. not the quantity) | Replace consumption values of this item in the tablet survey by the median value by EA from the paper survey. |
| The consumption value of item 'cost of milling' was set to zero in the paper survey | For consistency, the consumption value of item 'cost of milling' was also set to zero in the tablet survey. |

The consumption aggregates are converted to adult equivalent measures using the equivalence scales developed by Anzagi and Bernard. These adult equivalence scales prescribe that children aged 0-4 years are weighted as 0.24 of an adult, children aged 5-14 years are weighted as 0.65, and all people aged 15 years and older are assigned a value of unity.

The poverty lines used correspond to those from KIHBS 2015/16 and are based on the cost of basic needs. The rural and urban food poverty lines were set by costing two separate bundles of basic food items which attain the 2,250 Kcal minimum nutritional requirements in a way that is consistent with food tastes in rural and urban areas.33 As for PAPI, the overall poverty lines in monthly adult equivalent terms considered are KSh 3,252 for rural areas and KSh 5,995 for urban areas. The food poverty lines in monthly adult equivalent terms are computed as KSh 1,952 and KSh 2,551 for rural and urban areas, respectively. The food poverty line was used directly to obtain the incidence of poverty from a total food consumption aggregate.
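The adult equivalence scale and the comparison against a poverty line can be illustrated with a short sketch. The scale values and the rural poverty line come from the text; the household, its consumption and the function names (`adult_equivalents`, `is_poor`) are hypothetical, and the unscaled rural line is used only for illustration.

```python
def adult_equivalents(ages):
    """Household size in adult equivalents using the scale quoted in the text."""
    total = 0.0
    for age in ages:
        if age < 5:        # ages 0-4
            total += 0.24
        elif age < 15:     # ages 5-14
            total += 0.65
        else:              # ages 15 and older
            total += 1.0
    return total

def is_poor(monthly_household_consumption, ages, poverty_line):
    """True if monthly consumption per adult equivalent is below the line."""
    per_adult_equivalent = monthly_household_consumption / adult_equivalents(ages)
    return per_adult_equivalent < poverty_line

RURAL_POVERTY_LINE = 3252  # KSh per adult equivalent per month, from the text

# Hypothetical household: two adults, one child aged 10, one child aged 3.
ages = [35, 32, 10, 3]
print(adult_equivalents(ages))
print(is_poor(9000, ages, RURAL_POVERTY_LINE))
```

This household counts as roughly 2.89 adult equivalents rather than 4 members, so its per adult equivalent consumption is higher than a simple per capita figure would suggest.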
However, the national poverty line was rescaled using the PAPI share of food and nonfood items considered in the total expenditure aggregate for CAPI and PAPI from the total KIHBS 2015/16 consumption aggregate.

33 Kenya National Bureau of Statistics 2018.

REFERENCES

Banks, Randy, and Heather Laurie. 2000. “From Papi to Capi: The Case of the British Household Panel Survey.” Social Science Computer Review 18 (4): 397–406. https://doi.org/10.1177/089443930001800403.

Bemelmans-Spork, M.E., and D. Sikkel. 1985. “Data Collection with Hand-Held Computers.” Netherlands Central Bureau of Statistics.

Caeyers, Bet, Neil Chalmers, and Joachim De Weerdt. 2012. “Improving Consumption Measurement and Other Survey Data through CAPI: Evidence from a Randomized Experiment.” Journal of Development Economics 98 (1): 19–33.

Couper, Mick P., and Geraldine Burt. 1994. “Interviewer Attitudes Toward Computer-Assisted Personal Interviewing (CAPI).” Social Science Computer Review 12 (1): 38–54. https://doi.org/10.1177/089443939401200103.

Danielsson, L., and P. Maarstad. 1982. “Statistical Data Collection with Hand-Held Computers: A Consumer Price Index.”

Demombynes, Gabriel, Paul Gubbins, and Alessandro Romeo. 2013. “Challenges and Opportunities of Mobile Phone-Based Data Collection: Evidence from South Sudan.”

Fafchamps, Marcel, David McKenzie, Simon Quinn, and Christopher Woodruff. 2014. “Microenterprise Growth and the Flypaper Effect: Evidence from a Randomized Experiment in Ghana.” Journal of Development Economics 106: 211–26. https://doi.org/10.1016/j.jdeveco.2013.09.010.

Glewwe, Paul, and Hai-Anh Hoang Dang. 2008. “Impact of Decentralized Data Entry on the Quality of Household Survey Data in Developing Countries: Evidence from a Randomized Experiment in Vietnam.” The World Bank Economic Review 22 (1): 165–85.

Kenya National Bureau of Statistics. 2018. “Basic Report on Well-Being in Kenya.”

King, Jonathan D., Joy Buolamwini, Elizabeth A. Cromwell, Andrew Panfel, Tesfaye Teferi, Mulat Zerihun, Berhanu Melak, et al. 2013. “A Novel Electronic Data Collection System for Large-Scale Surveys of Neglected Tropical Diseases.” PLOS ONE 8 (9): e74570. https://doi.org/10.1371/journal.pone.0074570.

Leeuw, E. D. de. 2008. “The Effect of Computer-Assisted Interviewing on Data Quality: A Review of the Evidence.” Preprint. 2008. http://dspace.library.uu.nl/handle/1874/44502.

Leeuw, Edith de, and William Nicholls. 1996. “Technological Innovations in Data Collection: Acceptance, Data Quality and Costs.” Sociological Research Online 1 (4): 1–15. https://doi.org/10.5153/sro.50.

Leisher, Craig. 2014. “A Comparison of Tablet-Based and Paper-Based Survey Data Collection in Conservation Projects.” Social Sciences 3 (2): 264–71. https://doi.org/10.3390/socsci3020264.

Nicholls, W.L., and E. De Leeuw. 1996. “Factors in Acceptance of Computer-Assisted Interviewing Methods: A Conceptual and Historic Review.” Proceedings of the Section on Survey Research Methods, American Statistical Association.

Olsen, R. J. n.d. “The Effects of Computer Assisted Interviewing on Data Quality.” University of Essex.

Olson Lanjouw, Jean, and Peter Lanjouw. 2001. “How to Compare Apples and Oranges: Poverty Measurement Based on Different Definitions of Consumption.” Review of Income and Wealth 47 (1): 25–42.

Pape, U, and J Mistiaen. 2015. “Measuring Household Consumption and Poverty in 60 Minutes: The Mogadishu. Washington DC: World Bank.” Proceedings of ABCA Conference 2015. Washington DC: World Bank.

Pape, Utz Johann, and Johan A. Mistiaen. 2018. “Household Expenditure and Poverty Measures in 60 Minutes: A New Approach with Results from Mogadishu.”

Pape, Utz Johann, and Luca Parisotto. 2019. “Estimating Poverty in a Fragile Context -- The High Frequency Survey in South Sudan.” World Bank Policy Research Working Paper no 8722 (January): 1–42.

Pape, Utz, and Johan Mistiaen. 2015. “Measuring Household Consumption and Poverty in 60 Minutes: The Mogadishu High Frequency Survey.”

Prydz, Espen. 2013. “‘Knowing in Time’: How Technology Innovations in Statistical Data Collection Can Make a Difference in Development.” OECD.

Rosero-Bixby, L., J. Hidalgo-Céspedes, D. Antich-Montero, and M. A. Seligson. 2005. “Improving the Quality and Lowering Costs of Household Survey Data Using Personal Digital Assistants (PDAs). An Application for Costa Rica.” In meeting of the Population Association of America Philadelphia.

Schräpler, Jörg-Peter, Jürgen Schupp, and Gert G. Wagner. 2010. “Changing from PAPI to CAPI: Introducing CAPI in a Longitudinal Study.” Journal of Official Statistics 26 (2): 239–69.

Sebestik, J., H. Zelon, D. DeWitt, J. M. O’Reilly, and K. McGowan. 1988. “Initial Experiences with CAPI.” Proceedings of the Bureau of the Census Fourth Annual Research Conference.

Serajuddin, Umar, Hiroki Uematsu, Christina Wieser, Nobuo Yoshida, and Andrew L. Dabalen. 2015. “Data Deprivation: Another Deprivation to End.” WPS7252. The World Bank. http://documents.worldbank.org/curated/en/700611468172787967/Data-deprivation-another-deprivation-to-end.

Taylor, Sue. 1998. “Setting up Computer-Assisted Personal Interviewing in the Australian Longitudinal Study of Ageing.” Statistical Science 13 (1): 14–18.

World Bank. 2019. “Kenya Gender and Poverty Assessment 2015/16; Reflecting on a Decade of Progress and the Road Ahead.” Washington, DC: World Bank.

Zhang, Shuyi, Qiong Wu, Michelle HMMT van Velthoven, Li Chen, Josip Car, Igor Rudan, Yanfeng Zhang, Ye Li, and Robert W Scherpbier. 2012. “Smartphone Versus Pen-and-Paper Data Collection of Infant Feeding Practices in Rural China.” Journal of Medical Internet Research 14 (5). https://doi.org/10.2196/jmir.2183.