Labour Force Survey
Methodology
National Institute of Statistics of Rwanda
Labour Force Survey, Methodology
© NISR, 2024

Copyright © 2024 National Institute of Statistics of Rwanda (NISR). All rights reserved.

The Labour Force Survey Methodology is produced by the National Institute of Statistics of Rwanda (NISR).

Additional information about the Labour Force Survey Methodology report may be obtained from NISR:
P.O. Box: 6139 Kigali, Rwanda
Tel: +250 788 383103
Hotline: 4321
Email: [email protected]

Recommended citation: National Institute of Statistics of Rwanda (NISR), Labour Force Survey, Methodology

Contents

Foreword
1 Sampling Methodology
1.2 Technical Definitions
1.3 LFS Objectives
1.4 Current LFS Sample Review and Parameters for the New Design
1.5 Sampling Frame and Stratification in the New Sample Design
1.6 New Sample Size
1.7 Sample Allocation
1.8 Panel Rotation Scheme
1.9 Sample Selection
1.10 Weighting
1.11 Weighting Nonresponse Adjustment
1.12 Weighting Calibration Adjustment
1.13 Estimation
1.14 Sampling Error Estimation
2 Questionnaire design
3 Fieldwork operations
3.1 Preparations
3.2 Fieldwork Data collection
3.3 Data quality control
4 Data security and processing
5 Data analysis and reporting writing
6 Annexes
Annex 1: Rwanda LFS theoretical Sample Rotation Scheme 2-2-2
Annex 2: Main concepts and definitions
Annex 3: LFS Methodology Contributors
Annex 4: LFS Questionnaire
Foreword

The National Institute of Statistics of Rwanda (NISR) introduced the Labour Force Survey (LFS) program to provide labour market statistics to the Ministry of Public Service and Labour (MIFOTRA), the Ministry of Finance and Economic Planning (MINECOFIN), the Ministry of Education (MINEDUC), the International Labour Organization (ILO), and other key stakeholders.

Labour statistics are fundamental to Rwanda’s efforts to achieve decent work for all. They are needed to develop evidence-based policies and to assess progress toward this goal. Additionally, the Government of Rwanda requires updated information to monitor the implementation of programs and policies as stipulated in the second National Strategy for Transformation (NST2), the Sustainable Development Goals (SDGs), and Vision 2050. To track progress toward these targets effectively, it is essential to produce relevant, reliable, coherent, timely, and accessible labour statistics.

The primary objective of the LFS is to provide reliable and timely data on the structure and dynamics of the labour force, including employment, unemployment, and other labour market indicators. These data support the formulation, implementation, and evaluation of economic and social policies, particularly those related to employment creation, income generation, skills development, and the promotion of decent work.

The LFS program began with a pilot survey conducted in February 2016. The first official round was implemented in August 2016, and the survey continued on a biannual basis until August 2018. Starting in 2019, the survey was redesigned to produce quarterly estimates of key labour market aggregates, allowing for more frequent monitoring and analysis.

Following the 2022 Rwanda Population and Housing Census, which provided updated population structures and spatial distributions, a new sampling frame was developed. As a result, the LFS methodology was revised to align with this updated frame. This report presents the sampling methodology and other operational procedures applied in the Labour Force Survey from 2024 onward, until further revisions are made.

NISR encourages policymakers, program managers, researchers, and other users to refer to this LFS Methodology Report. It offers essential insights into the data production process and enhances understanding of the quarterly and annual LFS data products.

MURENZI Ivan
Director General of NISR
1 Sampling Methodology

1.1 Introduction

The Rwanda Labour Force Survey (LFS) aims to generate regular and timely data on the dimensions of the labour market that matter most for labour-related policy. It provides quarterly and yearly estimates of employment, unemployment, underemployment, and several demographic characteristics. The National Institute of Statistics of Rwanda (NISR) decided to update the LFS sampling design and increase its sample size, starting in Quarter 1 of 2024, to obtain more precise quarterly national estimates and reliable yearly estimates by district and by urban and rural area.

This report describes the new sampling design and the corresponding estimation procedures. It starts by outlining the main features of the current LFS sampling design. It then presents the new sample size calculations, the allocation among strata, and the updated sample rotation scheme. Finally, the report describes the sample weighting process and the estimation procedures.

1.2 Technical Definitions

This section defines the most commonly used terms to make the report accessible to non-specialist readers.

– Unit of analysis: element for which the information is collected. In the case of the Rwanda LFS, the units of analysis are the private households and the people living in each household.
– Population: total units of analysis whose characteristics are to be estimated. For the LFS, the universe covers the population living in private households in all districts of Rwanda.
– Sample: subset of analysis units selected to represent the population. A probability sample is one where each element in the population has a positive and known probability of selection.
– Sampling unit: the unit selected at each stage of selection to represent the units of analysis. The LFS has two stages of selection. The primary sampling units (PSUs) are census enumeration areas (or a set of merged small enumeration areas), and the secondary sampling units (SSUs) are the households listed within each PSU selected in the sample.
– Sampling frame: complete list of all the sampling units in the population. In the case of the LFS, the sampling frame of PSUs is based on cartography from the last Census. The second-stage sampling frame is based on listing households in each sample PSU.
– Sample estimate: numerical quantity computed from sample observations of a characteristic to provide inferences about an unknown population parameter.
– Sampling error: variability in the value of an estimate that arises because it is based on a sample rather than on the whole population.
– Coefficient of variation: the standard error of an estimate divided by the value of the estimate, usually expressed as a percentage. It is a measure of the relative precision of an estimate.
– Stratification: dividing the population into independent groups (strata) defined to provide homogeneity of the sampling units within each stratum. It is used to improve the efficiency of the sample design, i.e., to obtain more precise estimates with a given sample size. The sampling units are selected independently within each stratum, and each unit is part of exactly one stratum.
– Cluster sampling: clusters are defined as area units, such as census enumeration areas (EAs), with well-defined boundaries. The clusters are selected at the first sampling stage to make the sample more cost-effective.
– Design effect: the ratio between the variance of an estimate based on a complex sample (such as the one used for the LFS, with stratification and multiple stages of selection) and the corresponding variance from a simple random sample of the same size.

1.3 LFS Objectives

The objectives of the 2024-2034 LFS are similar to those of 2022 and before, i.e., producing quarterly and yearly labour market estimates. However, the new LFS sampling design provides estimates with improved precision, especially for districts and for urban and rural areas.

1.4 Current LFS Sample Review and Parameters for the New Design

The parameters for redesigning the sample were obtained from data of the current (2022) LFS and previous rounds. The following paragraphs describe the main features of the current sample and the outcomes of the review and analysis of the available data.

The current LFS sample:

– Has a two-stage stratified probability design. PSUs are selected in the first stage. All households are then listed in each sample PSU to ensure an updated household sampling frame, and households are selected in the second stage.
– Selects 25 households per PSU.
– Has a quarterly rotation scheme with three rotating panels.
– Delivers quarterly estimates, with fieldwork carried out entirely in the middle month of each quarter.
– Yields quarterly estimates for the entire country and for countrywide urban and rural areas. Yearly estimates are reported for each of the thirty districts.

The review of the 2022 data shows that frame errors (inclusion of ineligible units during the listing, such as vacant dwellings) are very few and that nonresponse is low, at less than 5%.
The efficiency of the current sample rotation scheme was assessed in terms of the resulting gains in precision for the change estimates across consecutive quarters and years. As a result, it was decided to shift from the current three rotating panels to four panels and implement a 2-2-2 rotation scheme, in which each sample household is interviewed over two consecutive quarters, rests for the following two quarters, and is interviewed again over the two subsequent quarters. The analysis of 2022 data also concluded that sampling errors were too large for most of the districts and for the countrywide urban domain to derive reliable sampling parameters that could be used for the new design in these domains, such as design effects and intra-class correlations.
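The 2-2-2 pattern described above can be illustrated with a small Python sketch. It assumes that quarters are numbered with consecutive integers and that one new rotation group enters the sample every quarter; this entry rule, and the function names, are assumptions made for illustration and are not taken from the report (the actual scheme is given in Annex 1).

```python
# Minimal sketch of a 2-2-2 rotation pattern.
# Assumption (not from the report): one new rotation group enters every quarter.

PATTERN = (0, 1, 4, 5)  # in for two quarters, out for two, in for two more


def interview_quarters(entry_quarter: int) -> list[int]:
    """Quarters in which a group entering at `entry_quarter` is interviewed."""
    return [entry_quarter + k for k in PATTERN]


def groups_in_sample(quarter: int) -> list[int]:
    """Entry quarters of the rotation groups interviewed in `quarter`."""
    return sorted(quarter - k for k in PATTERN)


def overlap(q1: int, q2: int) -> float:
    """Share of the groups interviewed in q1 that are also interviewed in q2."""
    first, second = set(groups_in_sample(q1)), set(groups_in_sample(q2))
    return len(first & second) / len(first)


def mean_interviews_per_year(first_quarter: int) -> float:
    """Average number of interviews per group over four consecutive quarters."""
    counts: dict[int, int] = {}
    for quarter in range(first_quarter, first_quarter + 4):
        for group in groups_in_sample(quarter):
            counts[group] = counts.get(group, 0) + 1
    return sum(counts.values()) / len(counts)


print(interview_quarters(1))        # [1, 2, 5, 6]
print(len(groups_in_sample(10)))    # 4 groups (panels) per quarter
print(overlap(10, 11))              # 0.5: half the sample is shared between consecutive quarters
print(overlap(10, 14))              # 0.5: half the sample is shared one year apart
print(mean_interviews_per_year(1))  # 16/9, about 1.78
```

Under these assumptions, four groups are in the sample each quarter, half of the sample overlaps between consecutive quarters and between the same quarters of consecutive years, and each household is interviewed about 1.78 times within a year, which is close to the factor of 1.8 used in the sample size calculation. Whether NISR derived that factor this way is not stated in the text.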
1.5 Sampling Frame and Stratification in the New Sample Design

The new LFS design uses the 2022 Population and Housing Census as the frame for the first sampling stage. The PSUs are the 2022 census enumeration areas (EAs), except for some small EAs that were merged to ensure an adequate number of households per PSU.

The list of all PSUs in the country was stratified by district. A sample of PSUs was drawn independently in each district, ensuring enough sample to reach adequate precision for yearly district-level estimates. In addition, PSUs within the six districts with the largest share of urban population outside Kigali were substratified into urban and rural PSUs.

For the second sampling stage, households will be listed in each selected PSU, and this listing will serve as an updated frame for household selection.

1.6 New Sample Size

According to the 2022 Population and Housing Census, Rwanda has about 13 million people distributed across 30 districts. One of the main estimation objectives of the new LFS is to achieve yearly district-level unemployment estimates with adequate precision. In particular, the new sample size should be able to estimate a district-level unemployment ratio of 6% with a coefficient of variation no larger than 12%. The unemployment ratio is the proportion of unemployed people aged 16 or above in the total population aged 16 or above. An unemployment ratio of 6% corresponds to an unemployment rate of about 9%.

Let $y$ denote an individual-level variable that takes the value 1 if the person is unemployed and 0 if the person is either employed or out of the labour force. Let $N$ represent the population 16 years or older per district. Then the unemployment ratio is $P = \frac{1}{N}\sum_{i=1}^{N} y_i$, and the sample size of individuals per district needed to achieve the target coefficient of variation ($CV_0$) under simple random sampling can be calculated as¹:

$$n = \frac{\frac{N}{N-1}\,\frac{1-P}{P}}{CV_0^2 + \frac{1}{N-1}\,\frac{1-P}{P}} \approx \frac{1-P}{P\,CV_0^2},$$

where the approximation in the second term results from considering a large population $N$. Under $P = 0.06$ and $CV_0 = 0.12$,

$$n \approx \frac{0.94}{0.06 \times 0.12^2} \approx 1{,}088.$$

Thus, the minimum sample size required to estimate a district-level unemployment ratio of 6% with a coefficient of variation no larger than 12%, under simple random sampling, is 1,088 individuals.

¹ Valliant, R., Dever, J., & Kreuter, F. (2013). Practical Tools for Designing and Weighting Survey Samples (p. 56).

Using the average number of persons 16 years or older per household ($\bar{m}$) from the 2022 Census, the required sample of households per district in the new LFS is

$$n_{hh} = \frac{n}{\bar{m}}.$$

Furthermore, based on the current LFS, the expected response rate in the new LFS should be no smaller than 90%, so the required household sample size per district, under simple random sampling, should be

$$n_{hh}^{*} = \frac{n_{hh}}{0.90}.$$

However, since the LFS sample is clustered in PSUs, its design is less efficient than simple random sampling. The efficiency loss due to clustering is measured through the design effect, and the design effect for the new sample can be estimated as

$$deff(y) = 1 + \rho_y\,(\bar{b} - 1),$$

where $deff(y)$ is the design effect of variable $y$; $\rho_y$ is the intraclass correlation coefficient of variable $y$; and $\bar{b}$ is the average number of sample households per PSU (subsample size).

In the 2019 LFS, the estimated average intraclass correlation coefficient for variable $y$ was approximately 0.01085; that value is employed here for the new LFS. The number of households to be interviewed per PSU in the new LFS was determined using this value of the intraclass correlation and by evaluating the impact that the number of sample households per PSU would have on the design effect of variable $y$. The decision also took into account the workload distribution among the enumerators of the field team working in a PSU. As a result, it was decided that the new LFS would interview 12 households per PSU, instead of the 25 households per PSU in the current LFS. Therefore, the expected PSU subsample size of individuals 16 years of age or above is

$$\bar{b} = 12 \times \bar{m} \times 1.8.$$

The factor 1.8 is included in the calculation because each individual is expected to be interviewed 1.8 times on average due to the chosen rotation pattern. This is explained below, in Section 1.8.

Finally, applying this design effect to the nonresponse-adjusted household sample size gives the required sample per district. Thus, about 789.3 households need to be interviewed in each of the 30 districts yearly to attain the yearly district-level estimation target, yielding an annual national sample size of 23,680.5 household interviews (30 times the unrounded per-district figure).

This annual sample must be randomly distributed across quarters to obtain quarterly estimates, so each quarter will include 23,680.5/4 = 5,920.1 household interviews. At the end of each year, the four quarterly samples will be aggregated to attain a large enough sample for the yearly district-level estimates.

Given that the number of PSUs in the sample must allow for a 2-2-2 rotation scheme within each stratum (see Section 1.8), and since the number of selected households per PSU is fixed, the sample size had to be slightly increased in some strata.
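The sample size reasoning in this section can be retraced with a short Python sketch. The target ratio (6%), target coefficient of variation (12%), response rate (90%), intraclass correlation (0.01085), households per PSU (12 and 25) and the factor 1.8 come from the text. The average number of persons aged 16 or above per household is a hypothetical placeholder, not the census figure, so the household totals computed here are not expected to reproduce the 789.3 reported above. Function names are illustrative.

```python
# Minimal sketch of the district sample size calculation.
# PERSONS_16_PLUS_PER_HH is a HYPOTHETICAL placeholder, not the 2022 Census value.

PERSONS_16_PLUS_PER_HH = 2.4  # assumption for illustration only


def srs_sample_size(p: float, cv: float) -> float:
    """Individuals needed under simple random sampling (large-population approximation)."""
    return (1.0 - p) / (p * cv ** 2)


def design_effect(rho: float, cluster_size: float) -> float:
    """Design effect for an average cluster (PSU) subsample of `cluster_size` units."""
    return 1.0 + rho * (cluster_size - 1.0)


def households_per_district(
    p: float,
    cv: float,
    persons_per_hh: float,
    response_rate: float,
    rho: float,
    hh_per_psu: int,
    interviews_per_person: float,
) -> float:
    """Yearly household interviews per district after nonresponse and clustering."""
    n_households = srs_sample_size(p, cv) / persons_per_hh
    n_adjusted = n_households / response_rate
    cluster_size = hh_per_psu * persons_per_hh * interviews_per_person
    return n_adjusted * design_effect(rho, cluster_size)


print(round(srs_sample_size(p=0.06, cv=0.12)))  # 1088 individuals

# Compare the new (12) and current (25) number of households per PSU.
for hh_per_psu in (12, 25):
    total = households_per_district(
        p=0.06,
        cv=0.12,
        persons_per_hh=PERSONS_16_PLUS_PER_HH,
        response_rate=0.90,
        rho=0.01085,
        hh_per_psu=hh_per_psu,
        interviews_per_person=1.8,
    )
    print(hh_per_psu, round(total, 1))
```

The comparison shows the trade-off behind the choice of 12 households per PSU: a smaller subsample per PSU lowers the design effect, so fewer household interviews are needed for the same precision, at the cost of visiting more PSUs.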
The resulting sample size was therefore set to 6,624 household interviews per quarter and 26,496 per year.

1.7 Sample Allocation

Once the national quarterly and yearly sample sizes were established, it was necessary to decide how they would be allocated across Rwanda’s 30 districts. To calculate the sample size per district, the final quarterly sample size, the number of households in the country, the number of households by district from the census, and the standard deviation of the unemployment indicator variable were used. Four different allocation criteria were assessed: equal allocation, proportionate allocation, Neyman allocation, and a customized allocation.

Equal allocation

This allocation assumes that there are no differences between districts that would justify assigning them different sample sizes, so every district receives the same sample. This is the simplest way of allocating the sample, but it is inefficient since it increases the variance of the national estimates.

Proportionate allocation

With this allocation, the sample assigned to each district is proportionate to its population share, so the more populated districts receive a larger sample. This results in Kigali receiving the largest portion of the national sample, producing very precise estimates for Kigali but quite imprecise estimates for the other districts, especially those with the smallest populations.

Neyman allocation

The sample assigned to each district is proportionate to the product of the district's population share and the standard deviation of the unemployment indicator variable in that district. More sample is assigned to districts where this variable varies the most, in order to generate the least possible national sampling variance.
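The first three allocation criteria can be compared with a small Python sketch. The district names, household counts, standard deviations and total sample size below are hypothetical values chosen for illustration; they are not LFS or census figures, and the function names are illustrative.

```python
# Minimal sketch of equal, proportionate and Neyman allocation.
# All inputs below are HYPOTHETICAL, not LFS or census figures.


def equal_allocation(n_total: float, households: dict[str, int]) -> dict[str, float]:
    """Every district receives the same sample size."""
    share = n_total / len(households)
    return {district: share for district in households}


def proportionate_allocation(n_total: float, households: dict[str, int]) -> dict[str, float]:
    """Sample proportional to each district's share of households."""
    total = sum(households.values())
    return {d: n_total * m / total for d, m in households.items()}


def neyman_allocation(
    n_total: float, households: dict[str, int], std_dev: dict[str, float]
) -> dict[str, float]:
    """Sample proportional to households times the within-district standard deviation."""
    weights = {d: households[d] * std_dev[d] for d in households}
    total = sum(weights.values())
    return {d: n_total * w / total for d, w in weights.items()}


households = {"District A": 100_000, "District B": 200_000, "District C": 300_000}
std_dev = {"District A": 0.30, "District B": 0.24, "District C": 0.20}
n_total = 600

for name, allocation in (
    ("equal", equal_allocation(n_total, households)),
    ("proportionate", proportionate_allocation(n_total, households)),
    ("Neyman", neyman_allocation(n_total, households, std_dev)),
):
    print(name, {d: round(n, 1) for d, n in allocation.items()})
```

In this toy example, Neyman allocation shifts sample toward the smallest district because its standard deviation is assumed to be the largest; when all districts have the same standard deviation, Neyman allocation reduces to proportionate allocation.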