en-1718790709-Statistical Methods Standards and Guidelines 2nd Edition 2017.pdf


Statistical Methods, Standards and Guidelines 16

(c) Describe frame problems, such as missing units on the frame (under-coverage) and duplicates on the frame (over-coverage);

(d) Describe what was done to improve the coverage of the frame;

(e) Describe how data quality and item non-response on the frame may have affected the coverage of the frame; and
(f) Explain limitations of the frame including the timeliness and accuracy of the frame (e.g., misclassification, eligibility, etc.).

Guideline 2.2.1.2: Every 3 years, evaluate the coverage rates and the coverage of the target population in survey frames that are used for ongoing surveys.
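A coverage evaluation of this kind can be sketched in a few lines. The example below is only an illustration: it assumes an independent listing of the target population is available for comparison, and the function name `coverage_summary`, the output keys and the EA identifiers are hypothetical, not taken from this document.

```python
from collections import Counter

def coverage_summary(frame_ids, target_ids):
    """Compare a survey frame with an independent listing of the target population."""
    frame_counts = Counter(frame_ids)
    frame_units = set(frame_counts)
    target_units = set(target_ids)
    if not target_units:
        raise ValueError("target listing is empty")
    return {
        # Share of target units that appear on the frame at least once.
        "coverage_rate": len(frame_units & target_units) / len(target_units),
        # Target units missing from the frame (under-coverage).
        "under_coverage": sorted(target_units - frame_units),
        # Units listed more than once on the frame (over-coverage).
        "duplicates": sorted(u for u, n in frame_counts.items() if n > 1),
        # Frame units that are not part of the target population.
        "not_in_target": sorted(frame_units - target_units),
    }

# Illustrative (made-up) enumeration area identifiers.
summary = coverage_summary(
    ["EA01", "EA02", "EA02", "EA03", "EA99"],
    ["EA01", "EA02", "EA03", "EA04"],
)
```

In practice the independent listing is itself imperfect, so the resulting rate is best read as an indicator rather than an exact measure.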

2.2.2 Awareness to Prospective Survey Respondents
Standard 2.2.2: MDAs, LGAs and other stakeholders must ensure that all prospective survey respondents are aware of the study and understand the purpose of the survey.

The guidelines for this standard are: Guideline 2.2.2.1: Provide pre-notification letters or survey brochures to respondents that:
(a) Inform potential respondents that they have been selected to participate in a survey;
(b) Inform potential respondents about the name and nature of the survey; and
(c) Assure them of the confidentiality of the information to be collected.

Guideline 2.2.2.2: Intensify the Information, Education and Communication (IEC) campaign through the media (such as television, radio, newspapers and magazines).

Guideline 2.2.2.3: Involve LGA leaders (Ward and Village/Mtaa Executive Officers) throughout data collection in the EA.

2.2.3 Methods of Data Collection
Standard 2.2.3: MDAs, LGAs and other stakeholders must design and administer their data collection instruments and methods in a manner that achieves the best balance between maximizing data quality and controlling measurement error while minimizing cost and the burden on respondents.

The guidelines for this standard are: Guideline 2.2.3.1: Design the data collection instruments in a manner that minimizes respondents’ burden, while maximizing data quality. The following strategies may be used to achieve these goals:
(a) Questions should be written clearly;
(b) Observe logical flow of questions and design proper skip patterns;
(c) Do not overload the questionnaire; and
(d) The questionnaire should include only items/variables that have been pre-tested or pilot-tested.

Guideline 2.2.3.2: Encourage respondents to participate in order to maximize response rates and improve data quality. The following data collection strategies can also be used to achieve high response rates:
(a) Ensure that the data collection reference period is of adequate and reasonable length (at most 12 months);

(b) Allow three interview attempts (call-backs) before declaring unit non-response;

(c) Use competent interviewers and other staff who can learn techniques for obtaining cooperation and building rapport with respondents. Techniques for building rapport include respect for respondents’ rights and culture, observing appointments, follow-up skills, knowledge of the goals and objectives of the data collection and uses of the data;

(d) Although incentives are not recommended or used in surveys, MDAs, LGAs and other stakeholders may consider using respondent incentives if they believe incentives are necessary for a specific survey to achieve data of sufficient quality for the intended use(s). Incentives that can be offered to respondents may include:

(i) Small portable radios;
(ii) T-shirts and caps;
(iii) Hoes;
(iv) Mosquito nets;
(v) Key holders;
(vi) School bags/safari bags;
(vii) Footballs;
(viii) Exercise books;
(ix) Pens and pencils; and
(x) Survey badges.

Guideline 2.2.3.3: The way data collection is designed and administered contributes to data quality. The following are important to consider:
(a) Collect data at the most appropriate time of the year, when relevant;

(b) Establish the data collection protocol to be followed by the field staff;

(c) Provide training for field staff on survey protocols;

(d) Establish mechanisms to minimize interviewer falsification, such as protocols for monitoring interviewers and re-interviewing respondents;

(e) Establish procedures for field edits of data collected. Enumerators and supervisors should ensure that questionnaires are duly filled before moving to another respondent or cluster.

Guideline 2.2.3.4: Develop supervision procedures for data collection activities, with strategies to correct identified problems. The following are important to consider:
(a) Design control report forms and supervision checklists;

(b) Implement quality control by following the procedures set out in the data collection manuals; and

(c) Use internal reporting systems that provide timely reporting of response rates and the reasons for non-response throughout the data collection.
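An internal report of response rates and non-response reasons could look like the sketch below. It is an illustration under stated assumptions: the outcome labels ("complete", "refused", "not_home"), the household identifiers and the function names are hypothetical, the three-attempt limit follows Guideline 2.2.3.2(b), and the response rate is computed here simply as completed cases over all assigned cases.

```python
from collections import Counter

MAX_ATTEMPTS = 3  # three interview attempts before declaring unit non-response

def classify(attempts):
    """Return the current status of a case from its list of attempt outcomes."""
    if "complete" in attempts:
        return "complete"
    if len(attempts) >= MAX_ATTEMPTS:
        return "non-response"
    return "pending"

def field_report(cases):
    """cases maps a case id to the ordered list of attempt outcomes so far."""
    if not cases:
        raise ValueError("no cases to report on")
    statuses = {cid: classify(attempts) for cid, attempts in cases.items()}
    tally = Counter(statuses.values())
    # The reason recorded for a non-responding case is its last attempt outcome.
    reasons = Counter(
        attempts[-1]
        for cid, attempts in cases.items()
        if statuses[cid] == "non-response"
    )
    response_rate = tally["complete"] / len(cases)
    return response_rate, tally, reasons

cases = {
    "HH1": ["not_home", "complete"],
    "HH2": ["refused", "refused", "refused"],
    "HH3": ["not_home"],
    "HH4": ["complete"],
}
rate, tally, reasons = field_report(cases)
```

Running such a report daily during fieldwork lets supervisors see pending cases and recurring non-response reasons while there is still time to act on them.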

2.3 DATA PROCESSING

2.3.1 Data Editing
Standard 2.3.1: MDAs, LGAs and other stakeholders must edit data appropriately, based on available information, to correct detectable errors.

The guidelines for this standard are: Guideline 2.3.1.1: Check and edit data to correct errors during and after data collection. Data editing is an iterative and interactive process that includes procedures for detecting and correcting errors in the data. When electronic data collection methods are used, data are usually edited both during and after data collection. Obtain inputs from subject matter specialists in the development of edit rules and edit parameters (edit specifications). As appropriate, check data for the following and edit if errors are detected:
(a) Responses that fall outside a pre-specified range (e.g., a person recorded as 4 years old, married and with 2 children);
(b) Contradictory responses and incorrect flow through prescribed skip patterns;
(c) Missing data that can be directly filled from other portions of the same record (including the sample frame, e.g., a missing location identification);
(d) The omission of records; and
(e) The duplication of records.
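Checks (c), (d) and (e) can be illustrated with a short sketch. It assumes each record is a dictionary with an `id` and a `location` field and that the sample frame is available as a mapping from `id` to location; these field names and the function `check_against_frame` are hypothetical.

```python
from collections import Counter

def check_against_frame(records, frame):
    """Find omitted and duplicated records and fill missing locations from the frame.

    records: list of dicts with at least the keys "id" and "location".
    frame:   dict mapping each sampled id to its location on the sample frame.
    """
    counts = Counter(r["id"] for r in records)
    omitted = sorted(set(frame) - set(counts))              # sampled but no record
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    filled = []
    for r in records:
        if r.get("location") is None and r["id"] in frame:
            # Fill the gap from the frame without modifying the input record.
            filled.append({**r, "location": frame[r["id"]]})
        else:
            filled.append(r)
    return omitted, duplicated, filled

frame = {1: "Ward A", 2: "Ward B", 3: "Ward C"}
records = [
    {"id": 1, "location": "Ward A"},
    {"id": 2, "location": None},
    {"id": 2, "location": None},
]
omitted, duplicated, filled = check_against_frame(records, frame)
```

The function only reports omitted and duplicated records; deciding which duplicate to keep is left to the edit specifications agreed with subject matter specialists.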

Guideline 2.3.1.2: Code the data set to indicate any actions taken during editing, and/or retain the unedited data along with the edited data (e.g. adding a column in the data set to identify the imputed/edited values).
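One way to follow this guideline is to keep an edit log inside each record. The sketch below is a minimal illustration; the `_edits` field, the function `apply_edit` and the sample values are assumptions made for the example.

```python
def apply_edit(record, field, new_value, reason):
    """Return an edited copy of the record that retains the unedited value.

    The original value and the reason for the edit are stored under "_edits",
    so edited and unedited data travel together.
    """
    edited = dict(record)
    log = dict(record.get("_edits", {}))
    # Keep the first recorded original if the field is edited more than once.
    log.setdefault(field, {"original": record.get(field), "reason": reason})
    edited["_edits"] = log
    edited[field] = new_value
    return edited

raw = {"id": 7, "district": None}
clean = apply_edit(raw, "district", "Ilala", "filled from sample frame")
```

Because the input record is never modified, the raw file can be archived unchanged alongside the edited file.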

2.3.2 Data Coding
Standard 2.3.2: MDAs, LGAs and other stakeholders must add codes to collected data to identify aspects of data quality from the collection (e.g., missing data) in order to allow users to appropriately analyze the data. Codes added to convert information collected as text into a form that permits immediate analysis must use standardized codes, when available, to enhance comparability.

The guidelines for this standard are:

Guideline 2.3.2.1: Insert codes into the data set that clearly identify missing data and cases where entry is not expected (e.g., skipped over by skip pattern). Do not use blanks and zeros as codes to identify missing data, as they tend to be confused with actual data.
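The sketch below shows one possible way to apply such codes. The code values -99 and -98 and the helper names are hypothetical choices for illustration; any reserved values work as long as they cannot occur as real answers.

```python
MISSING = -99          # respondent should have answered but no value was recorded
NOT_APPLICABLE = -98   # question legitimately skipped by the skip pattern
RESERVED = {MISSING, NOT_APPLICABLE}

def code_value(raw, applicable=True):
    """Replace blanks with explicit codes; real answers (including 0) pass through."""
    if not applicable:
        return NOT_APPLICABLE
    if raw is None or str(raw).strip() == "":
        return MISSING
    return raw

def valid_values(values):
    """Drop reserved codes before computing statistics."""
    return [v for v in values if v not in RESERVED]

coded = [code_value(0), code_value(""), code_value(3), code_value(None, applicable=False)]
```

Note that a genuine answer of zero is preserved, which is exactly why blanks and zeros should not double as missing-data codes.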

Guideline 2.3.2.2: When converting text data to codes to facilitate analysis, use standardized codes, if they exist. Use the Tanzania coding standards listed below, if applicable. Provide cross-referencing tables to the Tanzania standard codes for any coding that does not meet the Tanzania standards. Develop other types of codes using the existing practices of Tanzania MDAs, LGAs and other stakeholders, or standard codes from industry or international organizations, where they exist. Current Tanzania standard codes include the following:

(a) Region, District, Ward and EA/Village/Mtaa codes, which are maintained by NBS;
(b) International Standard Industrial Classification of All Economic Activities (ISIC codes). Use the ISIC to classify establishments. The ISIC is the United Nations classification that supports comparability of statistics on business activities across the globe. The codes can be downloaded from a website link;
(c) Classification of Individual Consumption by Purpose (COICOP);
(d) System of National Accounts (SNA), 1993 and 2008;
(e) Hotels and Tourism – Three plus stars, etc;
(f) Harmonized Commodity Description and Coding System (HS) codes for external trade;
(g) Central Product Classification (CPC) codes for industry;
(h) Geographic Information System (GIS);
(i) Government Finance Statistics (GFS); and
(j) Tanzania Standard Classification of Occupation (TASCO).

2.3.3 Data Entry
Standard 2.3.3: MDAs, LGAs and other stakeholders must use acceptable and easily compatible software that allows data to be transferred to different statistical applications. Suitable data entry software includes CSPro, MS Excel and MS Access.

The guidelines for this standard are: Guideline 2.3.3.1: Data may be entered twice (double entry) to check for consistency.
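Double entry is only useful if the two passes are compared field by field. The following sketch shows one way to do that, assuming each pass is held as a mapping from record id to a dictionary of fields; the function name `compare_entries` and the sample data are illustrative.

```python
def compare_entries(first, second):
    """List discrepancies between two independent data entry passes.

    first, second: dicts mapping record id -> dict of field -> value.
    Returns tuples of (record id, field, first value, second value);
    field is None when a whole record is missing from one pass.
    """
    discrepancies = []
    for rid in sorted(set(first) | set(second)):
        a, b = first.get(rid), second.get(rid)
        if a is None or b is None:
            discrepancies.append((rid, None, a, b))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((rid, field, a.get(field), b.get(field)))
    return discrepancies

pass_one = {1: {"age": 34, "sex": "F"}, 2: {"age": 40, "sex": "M"}}
pass_two = {1: {"age": 43, "sex": "F"}, 2: {"age": 40, "sex": "M"}}
diffs = compare_entries(pass_one, pass_two)
```

Each discrepancy should be resolved against the paper questionnaire rather than by assuming either pass is correct.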

2.3.4 Data Cleaning – Range, Consistency Checks and Validation
Standard 2.3.4: MDAs, LGAs and other stakeholders must make sure that all data entered into the system are consistent before further analysis. The values of related variables should agree with one another. For example, in demographic enquiries, a male should not be recorded as having had pregnancies in his lifetime.

The guidelines for this standard are: Guideline 2.3.4.1: Establish rules for range, consistency and validation checks to be applied to the data during and after data entry.

Guideline 2.3.4.2: Prepare a list and printout of the errors found in the entered data and submit them to quality control personnel for further action.

Guideline 2.3.4.3: Make appropriate corrections without altering the collected data from the field. Treat the remaining erroneous data as partial non-response.
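The range and consistency rules, and the resulting error list, can be expressed as a small rule table. This is a sketch only: the field names, the age limits and the rule names are assumptions chosen for illustration, and real rules should come from the survey's edit specifications.

```python
# Each rule is (name, test); the test returns True when the record is in error.
RULES = [
    ("age_out_of_range", lambda r: not 0 <= r["age"] <= 120),
    ("male_with_pregnancies",
     lambda r: r["sex"] == "M" and (r.get("pregnancies") or 0) > 0),
]

def list_errors(records):
    """Return (record id, rule name) pairs for every failed check."""
    return [
        (r["id"], name)
        for r in records
        for name, is_error in RULES
        if is_error(r)
    ]

records = [
    {"id": 1, "sex": "F", "age": 29, "pregnancies": 2},
    {"id": 2, "sex": "M", "age": 41, "pregnancies": 1},
    {"id": 3, "sex": "M", "age": 130, "pregnancies": None},
]
errors = list_errors(records)
```

The function only lists errors and leaves the records untouched, so corrections remain a separate, documented step.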

2.3.5 Data Protection
Standard 2.3.5: MDAs, LGAs and other stakeholders must observe individual data confidentiality throughout the production process to ensure that survey data are handled to avoid disclosure.

The guidelines for this standard are: Guideline 2.3.5.1: For surveys that include confidential data, establish procedures and mechanisms to ensure the information’s protection during the production, use, storage, transmittal, and disposition of the survey data in any format (e.g., completed survey forms, electronic files, and printouts).

Guideline 2.3.5.2: Ensure that:
(a) Individually-identifiable survey data are protected;
(b) Data systems and electronic products are protected from unauthorized intervention; and
(c) Data files, network segments, servers, and desktop PCs are electronically secure from malicious software and intrusion using best available information resource security practices that are periodically monitored and updated.

Guideline 2.3.5.3: Control access to data sets so that only specific, authorized individuals working on a particular data set have read-only, write-only, or read-and-write access to that data set. Data set access rights are to be periodically reviewed by the IT manager responsible for that data set in order to guard against unauthorized release or alteration.

2.3.6 Quality Evaluations
Standard 2.3.6: MDAs, LGAs and other stakeholders must evaluate the quality of the data and make the evaluation public (through technical notes and documentation included in reports of results or through a separate report) to allow users to interpret results of analyses, and to help designers of future surveys to focus on improvement efforts.

The guidelines for this standard are: Guideline 2.3.6.1: Include an evaluation component in the survey plan that evaluates survey procedures and results. Review past surveys similar to the one being planned to determine likely sources of error, appropriate evaluation methods, and problems that are likely to be encountered. Address the following areas:

(a) Potential sources of error, both sampling and non-sampling, which may include:
(i) Coverage error (including frame errors);
(ii) Non-response error;
(iii) Measurement error, including sources from the instrument, interviewers, respondents, changes over time in the object or phenomenon being measured, the type of questions (biased or leading ones) and the collection process; and
(iv) Data processing error (e.g., keying, coding, editing, and imputation error);

(b) How sampling error will be measured, including variance estimation and studies to isolate error components; and

(c) Make evaluation studies public to inform data users.

2.4 ESTIMATES AND PROJECTIONS

2.4.1 Developing Estimates and Projections
Standard 2.4.1: MDAs, LGAs and other stakeholders must use accepted theory and methods when deriving direct survey-based estimates, as well as model-based estimates and projections that use survey data. Error estimates must be calculated and disseminated to support assessment of the appropriateness of the uses of the estimates or projections. MDAs, LGAs and other stakeholders must plan and implement evaluations to assess the quality of the estimates and projections.

The guidelines for this standard are: Guideline 2.4.1.1: Develop direct survey estimates by employing sampling weights appropriate for the sample design to calculate population estimates. However, MDAs, LGAs and stakeholders may employ an alternative method (e.g., ratio estimators) to calculate population estimates if they have evaluated it and determined that it leads to acceptable results.
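As an illustration of the two approaches named in this guideline, the sketch below computes a weighted total, a weighted mean and a ratio estimate of a population total. It assumes the sampling weights already reflect the sample design and that the population total of the auxiliary variable is known; the function names and figures are made up for the example.

```python
def weighted_total(y, w):
    """Direct estimate of a population total: sum of weight times value."""
    return sum(wi * yi for wi, yi in zip(w, y))

def weighted_mean(y, w):
    """Direct estimate of a population mean."""
    return weighted_total(y, w) / sum(w)

def ratio_estimate(y, x, w, x_population_total):
    """Ratio estimator of the total of y, using an auxiliary variable x
    whose population total is known from another source."""
    return weighted_total(y, w) / weighted_total(x, w) * x_population_total

y = [2, 4, 6]       # survey variable
x = [1, 2, 3]       # auxiliary variable
w = [10, 10, 20]    # sampling weights

total = weighted_total(y, w)
mean = weighted_mean(y, w)
ratio_total = ratio_estimate(y, x, w, x_population_total=100)
```

The ratio estimator tends to help when y and x are strongly correlated, which is the kind of evaluation the guideline asks for before adopting an alternative method.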

Guideline 2.4.1.2: Calculate variance estimates by a method appropriate to a survey’s sample design, taking into account probabilities of selection, stratification, clustering, and the effects of non-response, post-stratification and raking. The estimates must reflect any design effect resulting from a complex design.
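A simplified example of a design-based variance estimate is sketched below. It uses the common "ultimate cluster" approximation for a stratified cluster sample, treating primary sampling units (PSUs) as if selected with replacement within strata. This is one technique chosen for illustration, not a method prescribed by this document; it ignores finite population corrections and the effects of non-response, post-stratification and raking adjustments, which replication methods handle more naturally.

```python
from collections import defaultdict

def total_and_variance(rows):
    """Estimate a population total and its variance.

    rows: iterable of (stratum, psu, weight, y) tuples, one per sampled unit.
    Variance: sum over strata of n/(n-1) * sum((z - mean_z)**2), where z is
    the weighted total of y within each PSU and n is the number of PSUs.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, weight, y in rows:
        psu_totals[stratum][psu] += weight * y

    estimate = 0.0
    variance = 0.0
    for stratum, psus in psu_totals.items():
        z = list(psus.values())
        n = len(z)
        if n < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        mean_z = sum(z) / n
        estimate += sum(z)
        variance += n / (n - 1) * sum((zi - mean_z) ** 2 for zi in z)
    return estimate, variance

rows = [
    ("A", 1, 2, 1), ("A", 1, 2, 3),   # stratum A, PSU 1
    ("A", 2, 2, 2), ("A", 2, 2, 4),   # stratum A, PSU 2
    ("B", 3, 5, 1), ("B", 4, 5, 1),   # stratum B, PSUs 3 and 4
]
estimate, variance = total_and_variance(rows)
```

Dividing such a variance by the variance expected under simple random sampling of the same size gives the design effect referred to in the guideline.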

Guideline 2.4.1.3: Document methods used to generate estimates and projections to help ensure objectivity, utility, transparency and reproducibility of the estimates and projections (For details on documentation, see 2.7.2). Also, archive data so the estimates/projections can be reproduced.

For population projections, compare the advantages and disadvantages of each candidate method (e.g., the exponential method or the natural growth method) before choosing one.
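The exponential method can be sketched as follows, using the standard form P_t = P_0 * exp(r * t), with the growth rate r derived from two earlier population counts. The function names and the figures are illustrative only; no actual census figures are used.

```python
import math

def exponential_growth_rate(p_start, p_end, years):
    """Average annual exponential growth rate between two population counts."""
    return math.log(p_end / p_start) / years

def project_exponential(p_base, rate, years):
    """Project a base population forward assuming a constant exponential rate."""
    return p_base * math.exp(rate * years)

# Made-up figures: a population that grew from 100,000 to 200,000 in 10 years.
rate = exponential_growth_rate(100_000, 200_000, 10)
projected = project_exponential(200_000, rate, 5)
```

The method assumes the growth rate stays constant, which is its main disadvantage over long projection horizons.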