Statistical Methods, Standards and Guidelines
(ii) Survey
Involves identifying and collecting data from a randomly selected portion (a sample) of a given population.
Conducted annually for different subjects during the intercensal period.
Many questions are asked that produce indicators for measuring early as well as long-term results (outcome and impact) of service delivery.
Supplements census and administrative data.
(iii) Surveillance and Longitudinal Studies
These involve the ongoing, systematic collection, analysis, interpretation, and dissemination of data from a specific area or population.
They collect data on vital events (births, deaths and migration), health, education and other demographic, social and economic variables.
Examples of data produced by the surveillance method include HIV testing at Ante-Natal Care clinics at Kisesa in Magu District and demographic surveillance sites in Ifakara, Rufiji and Hai districts.
Examples of longitudinal studies include the National and Kagera Panel Surveys.
(iv) Experimental and Case Studies
An experimental study involves taking measurements of the system under study, manipulating the
system and then taking additional measurements using the same procedure to determine if the
manipulation has modified the values of the measurements.
A case study is based on an in-depth investigation of a person, a small group, a single situation, or a specific "case". It involves extensive research, including documented evidence of a particular issue or situation; symptoms, reactions, effects of certain stimuli, and the conclusion reached following the study. A case study may show a correlation between two factors, whether or not a causal relationship can also be proven. Case studies may be descriptive or explanatory.
1.2 Overview of Stages in Statistical Production
In producing statistics, a number of stages have to be considered. These are elaborated below.
1.2.1 Users' Demand for Statistical Data
Internal and external users approach the national statistical offices and statistical units in MDAs to request data and indicators for planning and decision-making purposes. The statistical experts have to consult data users and other stakeholders to identify the data needs to be addressed, and then translate those needs into objectives of the statistical production. They also need to determine which method to use to generate the required statistical data.
1.2.2 Establishing Technical Committees
Determine the composition of the technical committee based on the type of data required.
Involve expertise from different socio-economic fields and disciplines.
Carry out critical analysis of the subject matter in question during the technical committee meetings.
Plan the statistical production, including the budget and roadmap.
1.2.3 Formulation of Statistical Problem
Data needs are normally presented in non-statistical language.
Statistical formulations are needed to produce the desired data and indicators.
Determine the appropriate study design and the type of data needed to estimate population
parameters, statistical association and evaluation.
1.2.4 Information needs
The technical committee has to determine what information has to be collected to meet the objectives and user needs.
An overall statement of the information needed for socio-economic planning and decision-making purposes, such as health, education, employment and welfare of the population.
1.2.5 Tabulation and Analysis Plan
Before collecting data, there is a need to develop a tabulation / analysis plan, as explained below.
It is a planned way of summarizing and presenting the collected data.
It includes frequency tables, cross tabulations and graphs.
It involves computation of indicators and measures of association (correlations) and assessing possible cause-effect relationships (regressions).
Determine the analysis variables (sex, age, locality, education, income levels, etc.) to use in cross tabulations and regressions to unearth disparities.
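As an illustration of what a tabulation plan might specify, the following minimal sketch builds a frequency table and a cross tabulation using only the Python standard library. The records, the field names (`sex`, `locality`) and the function names are hypothetical and are not taken from any actual survey.

```python
from collections import Counter

# Hypothetical respondent records; the field names are illustrative only.
records = [
    {"sex": "female", "locality": "urban"},
    {"sex": "female", "locality": "rural"},
    {"sex": "male", "locality": "rural"},
    {"sex": "female", "locality": "rural"},
]


def frequency_table(rows, variable):
    """Count how many records fall into each category of one variable."""
    return Counter(row[variable] for row in rows)


def cross_tabulation(rows, row_variable, column_variable):
    """Count records for each (row category, column category) pair."""
    return Counter((row[row_variable], row[column_variable]) for row in rows)


sex_counts = frequency_table(records, "sex")
sex_by_locality = cross_tabulation(records, "sex", "locality")

# Print the cross tabulation as simple aligned rows.
for (sex, locality), count in sorted(sex_by_locality.items()):
    print(f"{sex:<8}{locality:<8}{count}")
```

Deciding on the row and column variables before fieldwork makes it easier to check that the questionnaire actually collects every variable the tables require.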
1.2.6 Formulation of Statistical Questions
Converting information needs, tables and indicators into questions.
Mock interviews among experts to test the questions.
Pre-testing – checking how respondents understand the questions, what responses to expect, sensitivity and neutrality of questions, how to improve them, etc.
General or specific questions for different respondent groups.
1.2.7 Data collection instruments / questionnaire design
Combining all questions into a form, questionnaire or checklist.
Logical flow of questions.
Separate or single instrument for different respondent categories.
Compile instruction manuals.
Develop publicity and advocacy materials.
1.2.8 Sample design
Available resources can determine whether to collect data from the whole or part of the total population. In addition, the required level of accuracy can determine the sample design.
Identifying and selecting respondents to represent others, including stratification of sub-groups.
Adequate number of different respondent categories (adequate sample size).
Sampling weights for estimation of population parameters.
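The relationship between stratification, sample size and sampling weights can be sketched as follows. This is a minimal illustration that assumes proportional allocation and simple random sampling within each stratum. The stratum sizes, the total sample size and the function names are hypothetical.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split the total sample across strata in proportion to population size.

    Rounding means the allocated sizes may not add up exactly to total_sample.
    """
    population = sum(stratum_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }


def design_weights(stratum_sizes, allocation):
    """Weight = N_h / n_h, the number of population units each sampled unit represents.

    Assumes every stratum was allocated at least one sampled unit.
    """
    return {name: stratum_sizes[name] / allocation[name] for name in stratum_sizes}


# Hypothetical population counts per stratum.
strata = {"urban": 30_000, "rural": 70_000}
allocation = proportional_allocation(strata, total_sample=1_000)
weights = design_weights(strata, allocation)

print(allocation)
print(weights)
```

Other allocation rules (for example, oversampling small sub-groups) lead to unequal weights across strata, which is why the weights need to be carried through to estimation.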
1.2.9 Recruitment and training
Determine the number and type of personnel (e.g. supervisors and enumerators) who will be involved in data collection.
Criteria for recruiting and selecting data collection personnel.
Training to build the capacity of the personnel for the data collection.
During training, impart general and specific skills to data collection personnel.
1.2.10 Pilot testing
Testing all data production and logistical procedures before main fieldwork.
Determine areas of strengths and weaknesses of the data collection system.
Improvement of data production and logistical procedures.
1.2.11 Main fieldwork
Conduct advocacy and publicity campaigns before and during data collection phase to
improve the response rates.
Dispatching data collection equipment, instruments and personnel to and from the field.
Collect data from the earmarked respondents using appropriate instruments.
Strengthen field supervision mechanisms and teamwork to improve data quality.
Conduct post-enumeration survey (evaluation) immediately after main fieldwork to
determine coverage, content and quality aspects of the data collected.
1.2.12 Data processing
Institute field procedures for checking quality of data by supervisors and manual editors.
Transfer data from data collection instruments into computer files (data entry).
Institute office procedures for checking quality of data before, during and after data entry.
Build capacity of data entry operators in terms of speed and accuracy.
1.2.13 Tabulation / Analysis
Implementing the tabulation / analysis plan.
Summarizing the collected data into tables and statistics / indicators.
Disaggregation of data - presenting data such that socio-economic differentials are clearly seen.
Analyzing within and among socio-economic categories – column or row totals.
Sex and geographical location as major analysis variables – analyzing important population characteristics by sex and location such that gender and urban/rural differences are clearly reflected.
Make statistical inference from sample data to total population.
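To illustrate disaggregation and inference from a sample to the population, the sketch below computes a weighted proportion for each category of an analysis variable. The records, the field names (`sex`, `employed`, `weight`) and the function name are hypothetical. The weights are assumed to be sampling weights of the kind produced at the sample design stage.

```python
from collections import defaultdict


def weighted_proportion_by_group(records, group_key, indicator_key, weight_key="weight"):
    """Estimate the population share with the indicator, separately per group."""
    weight_total = defaultdict(float)
    weight_with_indicator = defaultdict(float)
    for record in records:
        group = record[group_key]
        weight_total[group] += record[weight_key]
        if record[indicator_key]:
            weight_with_indicator[group] += record[weight_key]
    return {group: weight_with_indicator[group] / weight_total[group]
            for group in weight_total}


# Hypothetical sample records, each carrying its sampling weight.
sample = [
    {"sex": "female", "employed": True, "weight": 120.0},
    {"sex": "female", "employed": False, "weight": 80.0},
    {"sex": "male", "employed": True, "weight": 150.0},
    {"sex": "male", "employed": True, "weight": 50.0},
]

employment_rate_by_sex = weighted_proportion_by_group(sample, "sex", "employed")
print(employment_rate_by_sex)
```

A complete analysis would also report the sampling error of each estimate, which this sketch does not compute.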
1.2.14 Interpretation and report writing
This involves extracting the main messages from the tabulated / analyzed data.
Include experts from different fields among the authors.
An expert eye / lens is crucial at this stage to pick out the critical issues.
Write separate chapters or reports on related findings.
1.2.15 Dissemination and Statistical Literacy
Informing users and stakeholders of the results using various means such as reports, media and website.
General and specific packages for various users.
Promoting the policy agenda of the produced data.
Building statistical literacy so that users understand the data.
1.2.16 Documentation and Archiving
Preparing basic information datasheets describing the data.
Archiving the raw data and reports.
Institute procedures for accessing the raw data, including removing information that identifies respondents (anonymization).
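A first step in anonymization can be sketched as dropping direct identifiers before raw data are released. The identifier list, the field names and the function name below are hypothetical. Removing direct identifiers alone does not guarantee anonymity, because combinations of other variables may still identify a respondent.

```python
# Hypothetical list of direct identifiers; the real list depends on the questionnaire.
DIRECT_IDENTIFIERS = {"name", "phone_number", "household_address"}


def anonymize(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with direct identifiers removed."""
    return [
        {field: value for field, value in record.items() if field not in identifiers}
        for record in records
    ]


raw = [
    {"name": "A. Respondent", "phone_number": "000", "region": "Kagera", "age": 34},
]
public = anonymize(raw)
print(public)
```

The original records are left unchanged, so the archived raw data and the released file can be kept separately under different access procedures.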
B: STANDARDS AND GUIDELINES FOR STATISTICAL PRODUCTION
2.0 Introduction
There are various methods of data production, as outlined earlier in this document. This part covers these methods in detail by providing the relevant standards and guidelines.
2.1 SURVEY METHODOLOGY
2.1.1 Survey Planning
Standard 2.1.1: When starting a new survey or revising an existing survey, MDAs, LGAs
and other stakeholders must develop a written proposal (concept note) that sets forth a
justification, including: goals and objectives, potential users, related and previous surveys,
key survey estimates, the precision required of the estimates, the tabulation and analytic
results that will inform decisions and other uses, steps taken to prevent unnecessary
duplication with other sources of information, confidentiality of individual data, when and
how frequently users need the data, and public use of the data.
The guidelines for this standard are:
Guideline 2.1.1.1: Surveys (and related activities such as focus groups, pretesting, pilot
studies, field tests, etc.) are collections of information subject to the requirements of the
existing Statistics Act and the Tanzania Statistical Master Plan (TSMP). An initial step in
planning a new survey or a revision of an ongoing survey should be to contact the most
senior designated official of the sponsoring Development Partner (DP), MDA, LGA or
stakeholder to ensure the survey work is done in compliance with the law and regulations.
NBS approval will be required before the MDAs, LGAs and other stakeholders embark on a
data collection exercise from households and establishments.
Guideline 2.1.1.2: Planning is an important prerequisite when designing a new survey or implementing an amendment of an ongoing survey. Key planning activities include the following:
a) A justification for the survey
i The rationale for the survey
ii Relationship to previous surveys
iii Survey goals and objectives
iv Hypotheses to be tested
v Definitions of key variables; and
vi Consultations with potential stakeholders (to identify their requirements and expectations).
b) A review of related studies, surveys, and reports of Tanzania and non-Tanzania sources to ensure that part or all of the survey would not unnecessarily duplicate available data from an existing source, or could not be more appropriately obtained by adding questions to existing Tanzania statistical surveys. The goal here is to spend Tanzania funds effectively and minimize burden to data producers.
c) A review of the confidentiality and privacy policy of an existing Statistics Act on surveys that will collect individually-identifiable data from any survey respondent.
d) A complete review of all survey data items, the justification for each item, and the means of measurement (e.g., through questionnaires, tests, or administrative records).
e) A plan for pre-testing or cognitive interviewing, if applicable
f) A plan for quality assurance during each phase of the survey process to permit monitoring and
assessing performance during implementation.
i The plan should include the possibility of modifying the survey procedures if design
parameters appear unlikely to meet expectations (for example, if low response rates are
likely).
ii The plan should contain general specifications for an internal project management system that identifies critical activities and key milestones of the survey that will be monitored, and the timeframes among them.
g) A plan for evaluating survey procedures and results
h) An analysis plan that identifies analysis issues, objectives, key variables and proposed
statistical tests
i) An estimate of resources and target timeframe needed for completion of the survey cycle.
j) A dissemination plan that identifies target audiences, proposed major information products, and the timing of their release.
k) A data management plan for the preservation of survey data, documentation, and information products as well as the authorized disposition of survey records.
Guideline 2.1.1.3: Include standard elements of project management in the plan, including target completion dates, the resources needed to complete each activity, and risk planning.
Guideline 2.1.1.4: To maintain a consistent data series over time, use consistent data collection procedures for ongoing data collections on core statistics. Continuous improvement efforts sometimes result in a trade-off between the desire for consistency and a need to improve a data collection. If changes are needed in key variables or survey procedures for a data series, consider the justification or rationale for the changes in terms of their usefulness for policymakers, conducting analyses, and addressing information needs. Develop adjustment methods, such as crosswalks and bridge studies, that will be used to preserve trend analyses and inform users about the effects of changes.
2.1.2 Survey Designing
Standard 2.1.2: MDAs, LGAs and other stakeholders must develop a survey design,
including defining the study frame, target population and sampling plan, identifying the data
collection instruments and methods, developing a practical timetable, estimating survey cost,
and selecting samples using accepted statistical methods (e.g., probabilistic methods that can
provide estimates of sampling error). Any use of non-probability sampling methods (e.g.,
judgmental, quota and snowball samples) must be justified statistically and be able to
measure estimation error. The size and design of the sample must reflect the level of detail
needed in tabulations and other data products, and the precision required of key estimates.
Documentation of each of these activities and resulting decisions must be maintained in the
project files for use in documentation.
The guidelines for this standard are:
Guideline 2.1.2.1: Include the following in the survey design:
a) Frame for selection
b) Proposed target population;
c) Stratification levels/domain of study and analysis
d) Response rate from previous survey or expected response rate;
e) Survey frequency
f) Timing of data collection;
g) Data collection modes (such as paper and pencil, mail survey, telephone survey,
etc.);
h) Sample design;
i) Precision requirements;
j) Effective sample size determination based on power analyses for key variables; and
k) Overall sample size.
Guideline 2.1.2.2: Ensure the sample design will yield the data required to meet the
objectives of the survey. Include the following in the sample design:
a) identification of the sampling frame (address, name, location);
b) identification of the sampling unit used (at each stage if a multistage design);
c) identification of sampling strata;
d) power analyses to determine sample sizes;
e) effective sample sizes for key variables by reporting domains (Urban/Rural where
appropriate);
f) criteria for stratifying or clustering, sample size by stratum, and the known
probabilities of selection;
g) response rate goals (see Standard 2.1.3); estimation and weighting plan; variance
estimation techniques appropriate to the survey design; and
h) expected precision of estimates for key variables.
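How precision requirements, design effects and expected response rates feed into the overall sample size can be sketched as follows. This is a minimal illustration that uses the common precision-based formula for estimating a proportion, n0 = z^2 p(1 - p) / e^2, rather than a formal power analysis. The function name, the parameter names and the example values are assumptions for illustration only.

```python
import math


def required_sample_size(p, margin_of_error, z=1.96, design_effect=1.0,
                         expected_response_rate=1.0):
    """Sample size needed to estimate a proportion p within +/- margin_of_error.

    z=1.96 corresponds to a 95 percent confidence level. The base size is
    inflated by the design effect (for clustering) and divided by the expected
    response rate (to allow for non-response). No finite population correction
    is applied.
    """
    base = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    adjusted = base * design_effect / expected_response_rate
    return math.ceil(adjusted)


# Simple random sample, full response assumed.
n_simple = required_sample_size(p=0.5, margin_of_error=0.05)

# Hypothetical clustered design with a design effect of 2 and 80 percent response.
n_clustered = required_sample_size(p=0.5, margin_of_error=0.05,
                                   design_effect=2.0, expected_response_rate=0.8)

print(n_simple, n_clustered)
```

Using p = 0.5 gives the largest base sample size for a given margin of error, which makes it a cautious default when no prior estimate is available.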
Guideline 2.1.2.3: When a non-probabilistic sampling method is employed, include the
following in the survey design documentation:
a) a discussion of what options were considered and why the final design was chosen, and
b) an estimate of the potential bias in the estimates.
Guideline 2.1.2.4: Include a statement of confidentiality along with instructions required to
complete the survey.
Guideline 2.1.2.5: Include the following in the data collection plans:
a) frequency and timing of data collection;
b) methods of collection for achieving acceptable response rates;
c) training of enumerators and of persons coding and editing the data; and
d) cost estimates, including the costs of pretests, non-response follow-up, and evaluation studies.
2.1.3 Response Rates
Standard 2.1.3: MDAs, LGAs and other stakeholders must design the survey to achieve the
highest rates of response to ensure that survey results are representative of the target
population so that they can be used with confidence to inform decisions. Non-response bias
analyses must be conducted when unit or item response rates or other factors suggest the
potential for bias to occur.
The guidelines for this standard are:
Guideline 2.1.3.1: Calculate sample survey unit response rates without substitutions.
Guideline 2.1.3.2: Design data collections that will be used for sample frames for other surveys (e.g., the Population and Housing Census enumeration areas (EAs), and the Central Register of Establishments) to meet a target unit response rate of at least 80 percent, or provide a justification for a lower anticipated rate.
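The calculation described in these guidelines can be sketched as below. The sketch assumes an unweighted unit response rate defined as responding originally sampled units divided by eligible originally sampled units. The function name and the example counts are hypothetical.

```python
TARGET_RATE = 0.80  # target unit response rate stated in Guideline 2.1.3.2


def unit_response_rate(responding_original_units, eligible_original_units):
    """Unweighted unit response rate, calculated without substitutions.

    Only originally sampled units are counted. A unit that was replaced stays
    in the denominator as a non-respondent, and its substitute is not added
    to the numerator.
    """
    if eligible_original_units <= 0:
        raise ValueError("eligible_original_units must be positive")
    return responding_original_units / eligible_original_units


# Hypothetical fieldwork outcome: 1,000 eligible sampled units, 820 responded.
rate = unit_response_rate(responding_original_units=820, eligible_original_units=1000)
meets_target = rate >= TARGET_RATE

print(f"{rate:.1%}", meets_target)
```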
2.1.4 Focus Group Discussions (for instrument development)
Standard 2.1.4: MDAs, LGAs and other stakeholders must ensure that the survey collects the information required by its intended producers and users. The purpose of this standard is to identify key issues regarding the planned survey before developing the survey questionnaire.
The guidelines for this standard are:
Guideline 2.1.4.1: Identify key stakeholders in the subject matter area who will participate in the focus group discussion.
Guideline 2.1.4.2: Prepare a semi-structured (focused) discussion with members of the target population to find out what they know about the subject that the questionnaire will cover, how they think about it, and what terms they use when talking about the study topics/variables.
Guideline 2.1.4.3: Recruit volunteers (10-20 from the data collectors, producers and users side) who are at least familiar with the study, or are expected to be its data producers or users, to participate in a systematic discussion guided by a moderator about the survey topic(s). Questions for discussion should be prepared before convening the volunteers. Lessons learned from the discussion will be the basis for questionnaire design.
2.1.5 Designing Survey Instrument(s) (Questionnaire(s))
Standard 2.1.5: Based on the experiences and lessons drawn from the literature review and
focus group discussion, but mainly reflecting on the objectives of the proposed study, MDAs,
LGAs and other stakeholders should design a questionnaire that will capture the intended
information to be collected. The instrument shall probe and systematically record
comprehensive information that answers the study questions.
The guidelines for this standard are:
Guideline 2.1.5.1: Identify subject matter specialists (mostly statisticians, researchers, sociologists, economists, etc.) who will draft the questionnaire.
Guideline 2.1.5.2: Review the literature and instruments from previous similar studies for comparability purposes.
Guideline 2.1.5.3: Check whether the identification and demographic variables of the existing instruments meet the requirements of the intended study, and update them accordingly.
Guideline 2.1.5.4: Design the questionnaire using the question banks available in Tanzania
(e.g. NADA) or outside Tanzania.
Guideline 2.1.5.5: Prepare instruction manuals for data collectors and supervisors.
2.1.6 Pre-testing of Survey Instruments
Standard 2.1.6: MDAs, LGAs and other stakeholders must ensure that interviewers pre-test the draft questionnaire on randomly chosen participants to probe respondents' understanding of the study questions, to measure the time needed to complete one questionnaire, and to learn how respondents formulate their answers. The outcome of each interview is then recorded for questionnaire improvement. Conducting a pretest of the survey components helps to control measurement error.
The guidelines for this standard are:
Guideline 2.1.6.1: Randomly choose participants to participate in pre-test interviews.
Guideline 2.1.6.2: Key researchers and survey desk officers should participate fully in the cognitive interviews and, if possible, record the interviews for quality checking.
Guideline 2.1.6.3: Arrange a technical meeting to discuss the lessons learned from the cognitive interviews and use the results to improve the questionnaire.
Guideline 2.1.6.4: Record starting and finishing times for questionnaire interviews to determine the average time spent per questionnaire. The technical committee can then allocate the number of questionnaires to be completed per interviewer per day.
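The calculation in Guideline 2.1.6.4 can be sketched as follows. The recorded times, the length of the working day and the travel allowance between interviews are all hypothetical values chosen for illustration, as are the function names.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M"


def average_interview_minutes(times):
    """Average duration in minutes from (start, finish) clock times on the same day."""
    durations = []
    for start, finish in times:
        delta = datetime.strptime(finish, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
        durations.append(delta.total_seconds() / 60)
    return sum(durations) / len(durations)


def questionnaires_per_day(avg_minutes, working_minutes=360, travel_minutes=15):
    """Whole questionnaires one interviewer can complete in a working day.

    working_minutes and travel_minutes are assumed planning values.
    """
    return int(working_minutes // (avg_minutes + travel_minutes))


# Hypothetical start and finish times recorded during the pre-test.
pretest_times = [("09:00", "09:40"), ("10:15", "11:05"), ("13:00", "13:45")]

avg_minutes = average_interview_minutes(pretest_times)
daily_load = questionnaires_per_day(avg_minutes)

print(avg_minutes, daily_load)
```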
2.1.7 Training of Trainers (TOT), Supervisors and Training of Enumerators (TOE)
Standard 2.1.7: MDAs, LGAs and other stakeholders should recruit field staff to participate in TOT and TOE (supervisors and enumerators) on the basis of their competence and experience in the planned data production exercise, such as a census or a survey.
The guidelines for this standard are:
Guideline 2.1.7.1: Identify key staff (trainers) who will train supervisors and enumerators.
Guideline 2.1.7.2: If there are not enough supervisors and enumerators available within an organization, consider recruiting them. In addition, recruit reserve supervisors and enumerators to replace any who drop out during the main survey.
Guideline 2.1.7.3: Prepare a conducive environment for training in terms of location, conference facilities and accommodation for participants.
Guideline 2.1.7.4: One supervisor should supervise a group of at most 5 enumerators.
Guideline 2.1.7.5: Prepare a mock exam to test the understanding of trainees.
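The supervision ratio in Guideline 2.1.7.4 translates directly into a staffing calculation. The sketch below is illustrative only. The function names and the enumerator identifiers are hypothetical.

```python
import math

MAX_ENUMERATORS_PER_SUPERVISOR = 5  # ratio stated in Guideline 2.1.7.4


def supervisors_needed(enumerators, max_per_supervisor=MAX_ENUMERATORS_PER_SUPERVISOR):
    """Smallest number of supervisors such that no team exceeds the maximum size."""
    if enumerators < 0 or max_per_supervisor <= 0:
        raise ValueError("enumerators must be >= 0 and max_per_supervisor must be > 0")
    return math.ceil(enumerators / max_per_supervisor)


def assign_teams(enumerator_ids, max_per_supervisor=MAX_ENUMERATORS_PER_SUPERVISOR):
    """Split enumerators into consecutive teams of at most max_per_supervisor."""
    return [
        enumerator_ids[start:start + max_per_supervisor]
        for start in range(0, len(enumerator_ids), max_per_supervisor)
    ]


# Hypothetical field team of 23 enumerators.
enumerator_ids = [f"E{number:02d}" for number in range(1, 24)]
teams = assign_teams(enumerator_ids)

print(supervisors_needed(len(enumerator_ids)), [len(team) for team in teams])
```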