The United Republic of Tanzania
A HANDBOOK OF QUALITY GUIDELINES FOR
STATISTICAL PRODUCTION IN TANZANIA
National Bureau of Statistics
Ministry of Finance
Dar-es-Salaam
November, 2012
A Handbook of Quality Guidelines for Statistical Production in Tanzania
TABLE OF CONTENTS
LIST OF ACRONYMS
PREFACE
PART ONE: GENERAL OVERVIEW
1.0 Background
1.1 Basic processes of statistical production
1.1.1 Identifying data demand
1.1.2 Preparation
1.1.3 Data collection
1.1.4 Data processing
1.1.5 Data analysis
1.1.6 Dissemination
1.1.7 Optimizing statistics
1.2 Quality concepts and definitions in statistics
1.2.1 Relevance
1.2.2 Accuracy
1.2.3 Timeliness and punctuality
1.2.4 Accessibility
1.2.5 Interpretability
1.2.6 Coherence
PART TWO: QUALITY FRAMEWORK
2.0 Introduction
2.1 Total survey error
2.2 Fitness for intended use
2.2.1 Relevance
2.2.2 Accuracy
2.2.3 Timeliness and punctuality
2.2.4 Accessibility
2.2.5 Interpretability
2.2.6 Coherence
2.2.7 Comparability
2.3 Survey process quality management
2.3.1 Develop a sustainable quality management plan.
2.3.2 Perform quality assurance activities.
2.3.3 Perform quality control activities.
2.3.4 Create a quality profile
PART THREE: QUALITY INPUTS AND GUIDELINES IN DATA PRODUCTION STEPS
3.0 Introduction
3.1 Coverage and frames
3.1.1 Quality inputs
3.1.2 Guidelines
3.1.3 Quality indicators
3.2 Sample design
3.2.1 Quality inputs
3.2.2 Guidelines
3.2.3 Quality indicators
3.3 Questionnaire design
3.3.1 Quality inputs
3.3.2 Guidelines
3.3.3 Quality indicators
3.4 Translation of survey instruments
3.4.1 Quality inputs
3.4.2 Guidelines
3.4.3 Quality indicators
3.5 Interview recruitment and training
3.5.1 Quality inputs
3.5.2 Guidelines
3.5.3 Quality indicators
3.6 Pre-testing
3.6.1 Quality inputs
3.6.2 Guidelines
3.6.3 Quality indicators
3.7 Data collection
3.7.1 Quality inputs
3.7.2 Guidelines
3.7.3 Quality indicators
3.8 Data processing and statistical adjustment
3.8.1 Quality inputs
3.8.2 Guidelines
3.8.3 Quality indicators
3.9 Data dissemination
3.9.1 Quality inputs
3.9.2 Guidelines
3.9.3 Quality indicators
ANNEXES
LIST OF ACRONYMS
CPI Consumer Price Index
CSPro Census and Survey Processing System
CSI Customer Satisfaction Index
DP Development Partner
EDP Electronic Data Processing
GDP Gross Domestic Product
GNP Gross National Product
HBS Household Budget Survey
ICR Intelligent Character Recognition
LGA Local Government Authority
MDAs Ministries, Departments and Agencies
MDGs Millennium Development Goals
MKUKUTA Mkakati wa Kukuza Uchumi na Kupunguza Umaskini Tanzania
MKUZA Mpango wa Kukuza Uchumi na Kupunguza Umaskini Zanzibar
MCR Mark Character Recognition
MSE Mean Square Error
MTEF Medium Term Expenditure Framework
NBS National Bureau of Statistics
NER Net Enrolment Ratio
NPES National Poverty Eradication Strategy
NSO National Statistical Office
NSS National Statistical System
OCGS Office of the Chief Government Statistician
OCR Optical Character Recognition
PES Post Enumeration Survey
PORALG President’s Office – Regional Administration and Local Government
SNA System of National Accounts
SPSS Statistical Package for the Social Sciences
STATA Statistical Software Package
TDHS Tanzania Demographic and Health Survey
TNADA Tanzania National Data Archive
TSED Tanzania Socio-Economic Database
TSMP Tanzania Statistical Master Plan
PREFACE
The National Bureau of Statistics (NBS), as a Government Agency, is responsible for the production and dissemination of official statistics in Tanzania. To perform these activities properly, the Department of Statistical Methods, Standards and Coordination within the NBS has been producing guiding documents for use by statistical stakeholders within the National Statistical System (NSS). The documents include, among others: Concepts and Definitions for Production of Official Statistics; Statistical Methods, Standards and Guidelines for Producing Official Statistics; and Quality Guidelines for Statistical Production.
The main objective of the ‘Handbook on Quality Guidelines for Statistical Production’ is to provide guidelines for improving data quality within the National Statistical System (NSS).
This is the first publication of its kind to be produced by the NBS within the implementation
process of the Tanzania Statistical Master Plan (TSMP). The TSMP has important components
aimed at improving the quality of data by strengthening the statistical infrastructure within
the National Statistical System, an important prerequisite for producing quality statistics.
The publication has three parts. Part One is a General Overview covering Background, Basic Processes of Statistical Production, and Quality Concepts and Definitions in Statistics. Part Two is on the Quality Framework, covering Introduction, Total Survey Error, Fitness for Intended Use and Survey Process Quality. Part Three is on Quality Inputs and Guidelines in Data Production Steps, covering Introduction, Survey Coverage and Frames, Sample Design, Questionnaire Design, Translation of Survey Instruments, Interview Methods, Recruitment and Training of Enumerators/Supervisors, Pre-testing, Data Collection, Processing, Statistical Adjustment and Dissemination.
This document is subject to revision through the provision of inputs from various statistical stakeholders within the NSS. The ultimate goal is to improve the quality of statistics. This is possible by putting in place sustainable Quality Assurance and Quality Control Systems at all stages of data production.
Last but not least, any comments for improving future publications are welcome.
Dr. Albina A. Chuwa
Director General
PART ONE GENERAL OVERVIEW
1.0 Background
The increase in demand for both traditional and development statistics for the policy and development
agenda has led the Government of Tanzania to review the NBS’s mandate and to enable other data
producers to facilitate informed decision-making through the provision of relevant, timely
and reliable user-driven statistical information (“official statistics”). Careful consideration has been
given to how best to develop quality guidelines so that data producers can produce official
statistics effectively and efficiently across the whole National Statistical System.
In this regard, the National Bureau of Statistics (NBS) and the Office of the Chief Government Statistician (OCGS) of the Revolutionary Government of Zanzibar have prepared a Handbook on Quality Guidelines for Statistical Production in Tanzania to be used by all data producers and users. The handbook addresses possible errors produced at all stages of data collection, processing and dissemination, with possible measures to be undertaken to minimize them. This initiative is in line with the Tanzania Statistical Master Plan (TSMP), which aims at strengthening the National Statistical System so that quality statistics for decision making are made available objectively, in a coordinated manner, on time and cost-effectively.
The main purpose of this handbook is to enhance data quality and efficiency, to ensure that the
statistics produced by all data producers are relevant, reliable, available on time and easily
accessible within the NSS. In the NSS, the National Bureau of Statistics is the central institution in
Tanzania Mainland, while in Tanzania Zanzibar it is the Office of the Chief Government
Statistician. Other key Ministries, Departments and Agencies (MDAs) that collect economic,
social, demographic and environment statistics are the Ministries of Local Government;
Agriculture, Food Security and Co-operatives; Livestock and Fisheries Development; Finance;
Education and Vocational Training; Science and Technology; Labour, Employment and Youth
Development; Water and Irrigation; and Health and Social Welfare. Key agencies and departments
include the Registration Insolvency and Trusteeship Agency, the Bank of Tanzania, the Tanzania
Meteorological Agency, the Tanzania Revenue Authority and the Tanzania Police Force. Other ministries
and institutions also collect, use and provide statistical information and form part of the NSS.
1.1 Basic processes of statistical production
[Figure: Basic processes of statistical production. Source: Quality Standards in German Official Statistics]
1.1.1 Identifying data demand
This is usually done by establishing a dialogue between data users and data producers, both within and outside the National Statistical System, so as to reach a mutual agreement on the way forward. National statistical offices identify emerging data demands through observations and contacts with many institutions and groups that are relevant to society. If such demands cannot be met by means of the existing data, the statistical offices submit proposals as to how the problem might be solved. This often includes conducting a new survey to cover the new data demand, depending on the availability of resources.
For example, in 1997, when the Government adopted the National Poverty Eradication Strategy (NPES), an idea emerged to have a data collection system for monitoring poverty trends in the country. The aim of the NPES was based on the goals of Vision 2025 and took into account the Millennium Development Goals (MDGs) to reduce poverty, hunger, diseases, illiteracy and environmental destruction, and put emphasis on empowering vulnerable groups in the country such as women and children. The monitoring process required relevant data that could be collected, processed and disseminated.
1.1.2 Preparation
Based on the identified data demand, the main activities for data collection are prepared. These include: participating in the development of the legal basis by giving advice and comments; defining the group of respondents, obtaining their consent and ensuring confidentiality; specifying the survey mode and the variables in a questionnaire and performing the relevant tests; sample planning; budgeting of resources; selection of the survey method; and preparing for data processing, data analysis and dissemination. Conducting stakeholders’ meetings/workshops to identify the types of questions needed to capture the required indicators is among the measures aimed at minimizing the possibility of collecting irrelevant data.
1.1.3 Data collection
This phase covers the practical steps of data collection through field work or using administrative
data/records, including the technical-organizational preparations like recruiting qualified and well
trained data collectors. Thereafter, collection of data is done in accordance with the established data
collection instruments such as questionnaires, instruction manuals and control forms for controlling
movements of materials at all stages of the data collection exercise. Data collection is done under
close supervision in order to ensure that the survey operation is done properly and statistics
produced are of good quality.
1.1.4 Data processing
Data processing is done through several stages, starting with manual editing and coding of
questionnaires received from respondents, followed by data entry, then data control and computer
editing through relevant computer programs. For further processing, the data are brought into a
form allowing Electronic Data Processing (EDP) and errors are eliminated through corrections.
Plausible data are expanded or weighted in the case of sample surveys. Finally, the data are tabulated
and made available for further evaluations.
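The expansion (weighting) step for sample surveys can be illustrated with a minimal sketch. It assumes a simple stratified design in which each responding household in a stratum receives the same weight, the number of households in the frame divided by the number that responded. The `Stratum` class, the function names and all figures are hypothetical and serve only as an illustration; they are not the procedure prescribed by this handbook.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Hypothetical summary of one sampling stratum."""
    name: str
    population_households: int  # households in the sampling frame
    sampled_households: int     # households that actually responded
    sample_total: float         # sum of the study variable over the sample


def expansion_weight(stratum: Stratum) -> float:
    """Number of population households represented by each sampled household."""
    return stratum.population_households / stratum.sampled_households


def weighted_total(strata: list[Stratum]) -> float:
    """Expand each stratum's sample total to the population and add them up."""
    return sum(expansion_weight(s) * s.sample_total for s in strata)


# Illustrative figures only (not real survey data).
strata = [
    Stratum("urban", population_households=1000, sampled_households=50, sample_total=200.0),
    Stratum("rural", population_households=3000, sampled_households=100, sample_total=450.0),
]

print(weighted_total(strata))  # 20 * 200 + 30 * 450 = 17500.0
```

In practice, weights are often adjusted further (for example for non-response or calibration to known totals); the sketch covers only the basic expansion.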
1.1.5 Data analysis
The main steps in this process are further processing of the statistical results to form overall
systems, documenting the surveys and their data quality, as well as analysing and interpreting the
data. This must be done in a way that ensures that socio-economic indicators such as Gross
National/Domestic Product, Consumer Price Indices and Net Enrolment Ratios, which are useful
for measuring development outcomes, are revealed.
1.1.6 Dissemination
Dissemination of statistical information is the last step after the collection and analysis activities. It
ensures that the produced statistics are used for planning and decision-making processes at
different administrative levels, from National, Regional and District down to lower administrative
levels such as Wards and Villages. Statistical information is disseminated to stakeholders in various
forms of publications, including hard copies, soft copies, discs and postings on websites. To
make the results user-friendly, charts, tables and relevant attachments are normally
included. Dissemination is normally based on the marketing concepts of the statistical offices.
Depending on customer interest, and in line with the marketing model, the statistical information is
offered as free basic provision, as standard products or as customer-specific processing.
1.1.7 Optimizing statistics
The main goal is the continuous improvement of data quality and an increase in efficiency by continuously analysing and improving all the above-mentioned work processes and their results. This is usually done in order to achieve both value for the resources spent in producing the statistics and customer satisfaction. In addition, quality statistics are essential for making both evidence-based plans and results-oriented decisions.
1.2 Quality concepts and definitions in statistics
The quality of statistical data is assessed by means of a set of quality criteria: relevance,
accuracy, timeliness/punctuality, accessibility, interpretability and coherence. These
have been used by many statistical agencies and organizations in defining quality, though the
criteria may differ slightly between agencies/institutions.
1.2.1 Relevance
The relevance of statistical information reflects the degree to which it meets the actual needs of clients. It is concerned with whether the available information sheds light on the issues that are important to users. Assessing relevance is subjective and depends upon the varying needs of users. The statistical producers’ challenge is to weigh and balance the conflicting needs of current and potential users to produce a program that goes as far as possible in satisfying the most important needs within the given resource constraints.
1.2.2 Accuracy
The accuracy of statistical information is the degree to which the information correctly describes the phenomena it was designed to measure. It is usually characterized in terms of error in statistical estimates and is traditionally decomposed into bias (systematic error) and variance (random error) components. It may also be described in terms of the major sources of error that potentially cause inaccuracy (e.g. coverage, sampling, non-response and response errors).
1.2.3 Timeliness and punctuality
The timeliness and punctuality of statistical information refer to the delay between the reference point to which the information pertains and the date on which the information becomes available. The timeliness of information influences its relevance. Most users of official statistics are interested in up-to-date information. Therefore, statistics should be released as closely as possible to dates specified in advance. Early release of the results is becoming ever more important and is now a major point of emphasis for many statistics.
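As an illustration, the two notions can be expressed as simple date differences. The sketch below assumes the common reading in which timeliness is the lag between the end of the reference period and the release, while punctuality is the gap between the announced and the actual release date. The function names and dates are hypothetical.

```python
from datetime import date


def timeliness_days(reference_period_end: date, release_date: date) -> int:
    """Days between the end of the reference period and the actual release."""
    return (release_date - reference_period_end).days


def punctuality_days(announced_release: date, release_date: date) -> int:
    """Days by which the release missed the pre-announced date.

    Zero means on time; a positive value means the release was late.
    """
    return (release_date - announced_release).days


# Illustrative dates only.
reference_end = date(2012, 9, 30)  # end of the quarter the statistics describe
announced = date(2012, 11, 15)     # release date published in advance
released = date(2012, 11, 20)      # date the statistics actually appeared

print(timeliness_days(reference_end, released))  # 51
print(punctuality_days(announced, released))     # 5
```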
1.2.4 Accessibility
The accessibility of statistical information refers to the ease with which it can be obtained from the statistical producers. This includes the ease with which the existence of information can be ascertained, as well as the suitability of the form or medium through which the information can be accessed. The cost of the information may also be an aspect of accessibility for some users.
1.2.5 Interpretability
The interpretability of statistical information reflects the availability of the supplementary information and metadata necessary to interpret and utilize it appropriately. This information normally includes the underlying concepts, variables and classifications, the methodology of data collection and processing, and indications or measures of the accuracy of the statistical information.
1.2.6 Coherence
The coherence of statistical information reflects the degree to which it can be successfully brought together with other statistical information within a broad analytic framework and over time. The use of standard concepts, classifications and target populations promotes coherence, as does the use of common methodology across statistical productions. Coherence does not necessarily imply full numerical consistency.
These quality criteria are overlapping and interrelated. There is no general model that brings them together to optimize a level of quality. Achieving an acceptable level of quality is the result of addressing, managing and balancing these elements of quality over time with careful attention to cost, respondent burden, professionalism and design constraints that may affect information quality or user expectations. This balance is a critical aspect of the design of the statistical productions.
PART TWO QUALITY FRAMEWORK
2.0 Introduction
The quality framework is for assuring and assessing quality. It highlights three aspects of quality: total survey error, fitness for intended use and survey process quality, followed by guidelines for managing and assessing quality throughout the statistical production lifecycle.
2.1 Total survey error
The total survey error (TSE) paradigm is widely accepted as a conceptual framework for evaluating
survey data quality. It defines quality as the estimation and reduction of the mean square error
(MSE) of statistics of interest, which is the sum of the random error (variance) and the squared systematic
error (bias). TSE takes into consideration both measurement (construct validity, measurement
error and processing error), i.e. how well survey questions measure the constructs of interest, and
representation (coverage error, sampling error, non-response error and adjustment error), i.e.
whether one can generalize to the target population using sample survey data. In the TSE
perspective, there may be cost-error trade-offs; that is, there may be a tension between reducing
these errors and the cost of reducing them. In this framework, TSE may be viewed as being covered
by the accuracy dimension.
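The decomposition MSE = variance + bias² can be checked numerically. The sketch below is illustrative only: it assumes that the true population value is known and that several estimates of it are available (for example from repeated samples), which is rarely the case in practice. The function name and the figures are hypothetical.

```python
from statistics import fmean, pvariance


def mse_decomposition(estimates: list[float], true_value: float) -> tuple[float, float, float]:
    """Return (bias, variance, mse) of a set of estimates of a known true value."""
    mean_estimate = fmean(estimates)
    bias = mean_estimate - true_value                  # systematic error
    variance = pvariance(estimates, mu=mean_estimate)  # random error
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return bias, variance, mse


# Illustrative estimates of a quantity whose true value is taken to be 50.
estimates = [48.0, 52.0, 50.0, 54.0]
bias, variance, mse = mse_decomposition(estimates, true_value=50.0)

print(bias, variance, mse)  # 1.0 5.0 6.0
# The identity MSE = variance + bias squared holds up to rounding error.
print(abs(mse - (variance + bias ** 2)) < 1e-9)  # True
```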
2.2 Fitness for intended use
This is a more modern paradigm. It is multidimensional and focuses on criteria for assessing quality in terms of the degree to which survey data meet user requirements. By focusing on fitness for the intended use, study design strives to meet user requirements in terms of survey data accuracy and other dimensions of quality (such as comparability and timeliness). In this perspective, ensuring quality on one dimension (e.g. comparability) may conflict with ensuring quality on another dimension (e.g. timeliness), and there may be a conflict between meeting user requirements and the associated cost of doing so on one or more dimensions.
Dimensions of quality that are often used to assess the quality of national official statistics, in terms of both survey error and fitness for the intended use, are illustrated below, showing the indicators of quality and the guidelines related to these dimensions.
2.2.1 Relevance
The produced statistical data should be valuable inputs that can fulfill the needs of the clients or users. For example, a dataset for trends in age-specific fertility rates derived from Tanzania Demographic and Health Surveys (TDHS) is relevant to the Ministry of Education for the projections of number of pupils expected to start primary education at a certain period of time.
2.2.1.1 Indicators for relevance
2.2.1.1.1 Description of clients and users
This indicator identifies data producers and users. It describes how relevant the information/statistics are to the clients, whether the available information/statistics derived from various sources related to statistical production are important to users, and whether they meet clients’ needs.
2.2.1.1.2 Description of users’ needs (by main groups)
This indicator describes the needs of the users by categorizing their groups, with regard to the available information/statistics derived from various sources related to statistical production.
2.2.1.1.3 Assessment of user satisfaction
The indicator evaluates the level of customer satisfaction through the Customer Satisfaction Index (CSI) related to the importance of the derived data.
2.2.1.2 Guidelines for relevance
2.2.1.2.1 Goals and objectives of statistical production should be clearly stated.
2.2.1.2.2 While designing the questionnaire, ensure that all survey questions are relevant to the statistical production objectives.
2.2.1.2.3 Construct a data file with a data dictionary of all variables in the selected elements data file, with all variable names and accompanying descriptions which are relevant to the statistical production objectives.
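A data dictionary of the kind described in guideline 2.2.1.2.3 can be kept as a simple mapping from variable names to descriptions and checked against the columns of the data file. The following is a minimal sketch; the variable names, descriptions and the helper function are hypothetical examples.

```python
# Hypothetical data dictionary: variable name -> description.
data_dictionary = {
    "hh_id": "Household identifier",
    "region": "Region of residence",
    "age": "Age of respondent in completed years",
}


def undocumented_variables(column_names: list[str], dictionary: dict[str, str]) -> list[str]:
    """Return the variables in the data file that have no (or an empty) description."""
    return [name for name in column_names if not dictionary.get(name, "").strip()]


# Column names as they might appear in a survey data file (illustrative).
columns = ["hh_id", "region", "age", "q17"]

print(undocumented_variables(columns, data_dictionary))  # ['q17']
```

A check like this can be run before a data file is released, so that every variable reaches users with an accompanying description.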
2.2.2 Accuracy
Accuracy means that the derived data describe the phenomena they were designed to measure. It can be
assessed in terms of the Mean Square Error (MSE).