(or clerical) coding routines may assign numeric codes to text responses
according to a pre-determined statistical classification to facilitate data
capture and processing.
3.2.4.3 Review and validate
Reviewing and validating involves examining data to identify potential
problems, errors and discrepancies such as outliers, item non-response and
miscoding.
3.2.4.4 Edit and impute
Editing and imputation apply where data are considered incorrect, missing,
unreliable or outdated: new values may be inserted or outdated data may be
removed. The specific steps include determining whether to add or change
data; selecting the method to be used; adding or changing data values; writing
the new data values back to the data set and flagging them as changed; and
producing metadata on the editing and imputation process. When working with
administrative data, it is often useful to suggest to the data provider the
inclusion of automated editing rules within the collection system, as this can
substantially improve the quality of the administrative data.
3.2.4.5 Derive new variables and units
This process derives variables and units that are not explicitly provided in
the collection but are needed to deliver the required outputs. New variables
are derived by applying arithmetic formulae to one or more of the variables
already present in the dataset, or by applying different model assumptions.
This activity may need to be iterative, as some derived variables may
themselves be based on other derived variables. It is therefore important to
ensure that variables are derived in the correct order. New units may be
derived by aggregating or splitting data or by various estimation methods.
3.2.4.6 Calculate aggregates
Calculating aggregates creates aggregate data and population totals from
microdata or lower-level aggregates. It includes summing data for records
sharing certain characteristics (e.g. aggregation of data by demographic or
geographic classifications) and determining measures of average and
dispersion. It is in this process that indicators are computed.
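The aggregation described in 3.2.4.6 can be sketched as follows. This is a minimal, unweighted illustration only: the records, the field names (region, sex, income) and the function name aggregate are hypothetical, and real population totals would normally also require survey or estimation weights, which are not shown here.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical microdata: one record per unit (illustrative values only).
records = [
    {"region": "North", "sex": "F", "income": 120.0},
    {"region": "North", "sex": "M", "income": 80.0},
    {"region": "South", "sex": "F", "income": 100.0},
    {"region": "South", "sex": "F", "income": 140.0},
]

def aggregate(rows, key, value):
    """Group rows by `key`; return count, total, mean and dispersion of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        group: {
            "count": len(values),
            "total": sum(values),
            "mean": mean(values),
            "std_dev": pstdev(values),  # population standard deviation
        }
        for group, values in groups.items()
    }

by_region = aggregate(records, "region", "income")
by_sex = aggregate(records, "sex", "income")
```

The same function can be applied to any classification variable present in the records, which mirrors the aggregation by demographic or geographic classifications mentioned above.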
3.2.4.7 Finalise data files
Finalising data files brings together results from other processes in this
phase in a data file (usually macro-data), which is used as the input to the
"analyse" phase. Sometimes this may be an intermediate rather than a final file,
particularly for business processes where there are strong time pressures, and
a requirement to produce both preliminary and final estimates.
3.2.5 Data analysis
In this phase, statistics are produced, examined in detail and made ready for
dissemination. This phase includes the processes and activities that enable
statistical analysts to understand the statistics produced. The data analysis phase is
broken down into the following processes:
3.2.5.1 Prepare draft outputs
Preparing draft outputs is where the data collected are transformed into
statistical outputs. It includes the production of additional measurements such
as indices, trends or seasonally adjusted series, as well as the recording of
quality characteristics.
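As an illustration of one such additional measurement, a simple fixed-base index can be sketched as follows. The series values, the period labels and the function name to_index are hypothetical; real index compilation typically involves weighting and other methods that are not shown here.

```python
def to_index(series, base_period):
    """Rescale a {period: value} series so that the base period equals 100."""
    base = series[base_period]
    if base == 0:
        raise ValueError("base period value must be non-zero")
    return {period: 100.0 * value / base for period, value in series.items()}

# Hypothetical quarterly series (illustrative values only).
series = {"2023Q1": 200.0, "2023Q2": 210.0, "2023Q3": 190.0}
index = to_index(series, "2023Q1")
```

With these values the base quarter is set to 100 and the other quarters are expressed relative to it, which makes movements between periods easier to read than the raw levels.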
3.2.5.2 Validate outputs
Validating outputs checks the quality of the outputs produced, in accordance
with a general quality framework and with expectations. Validation activities
can include:
i. Checking that the population coverage and response rates are as
required;
ii. Comparing the statistics with previous cycles (if applicable);
iii. Confronting the statistics against other relevant data (both internal and
external);
iv. Investigating inconsistencies in the statistics;
v. Performing macro editing;
vi. Validating the statistics against expectations and domain intelligence.
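The comparison with previous cycles (item ii above) can be sketched as a simple check that flags large relative changes for investigation. The 10 per cent tolerance, the figures and the function name flag_large_changes are assumptions made for illustration; an appropriate tolerance depends on the statistic concerned.

```python
def flag_large_changes(current, previous, tolerance=0.10):
    """Return keys whose relative change from the previous cycle exceeds `tolerance`.

    Keys with no usable previous value are flagged with None for manual review.
    """
    flagged = {}
    for key, now in current.items():
        before = previous.get(key)
        if before is None or before == 0:
            flagged[key] = None  # no comparable previous value
            continue
        change = (now - before) / before
        if abs(change) > tolerance:
            flagged[key] = change
    return flagged

# Hypothetical totals for two consecutive cycles (illustrative values only).
previous = {"North": 1000, "South": 800, "East": 500}
current = {"North": 1040, "South": 1000, "East": 510, "West": 300}
flags = flag_large_changes(current, previous)
```

A flag is only a prompt for investigation: a large change may reflect a real development rather than an error.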
3.2.5.3 Interpret and explain outputs
Interpreting and explaining outputs is where the in-depth understanding of the
outputs is gained. Analysts use that understanding to scrutinize and explain
the statistics produced for this cycle by assessing how well the statistics
reflect the initial expectations, viewing the statistics from all perspectives
using different tools and media, and carrying out in-depth statistical analyses.
3.2.5.4 Apply disclosure control
Applying disclosure control ensures that the data (and metadata) to
be disseminated do not breach the appropriate rules on confidentiality. This
may include checks for primary and secondary disclosure, as well as the
application of data suppression or perturbation techniques.
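A minimal sketch of data suppression for primary disclosure is given below. The threshold of 3, the table contents and the function name suppress_small_cells are assumptions for illustration only; the applicable confidentiality rules determine the actual threshold, and secondary disclosure (recovering a suppressed cell from published totals) is not handled here.

```python
SUPPRESSED = None  # marker written in place of a suppressed cell

def suppress_small_cells(table, threshold=3):
    """Replace counts below `threshold` with a suppression marker.

    Covers primary suppression only; it does not protect against
    suppressed values being derived from row or column totals.
    """
    return {
        cell: (count if count >= threshold else SUPPRESSED)
        for cell, count in table.items()
    }

# Hypothetical frequency table keyed by (ward, sex), illustrative values only.
counts = {
    ("Ward A", "F"): 12,
    ("Ward A", "M"): 2,
    ("Ward B", "F"): 7,
    ("Ward B", "M"): 5,
}
safe_counts = suppress_small_cells(counts)
```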
3.2.5.5 Finalise outputs
Finalising outputs ensures that the statistics and associated
information are fit for purpose and reach the required quality level, and are
thus ready for use. It includes:
i. Completing consistency checks;
ii. Determining the level of release, and applying caveats;
iii. Collating supporting information, including interpretation, briefings,
measures of uncertainty and any other necessary metadata;
iv. Producing the supporting internal documents;
v. Pre-release discussion with appropriate internal subject matter experts;
and
vi. Approving the statistical content for release.
3.2.6 Dissemination of results
Dissemination of statistical products follows the collection and analysis
activities and ensures that the produced statistics are used for planning and
decision-making processes at different administrative levels, from national,
regional and district down to lower administrative levels such as wards/shehia
and villages.
Statistical products should be disseminated to stakeholders in various forms,
such as hard copies, soft copies and postings on websites and dashboards. In
order for the product to be user-friendly, charts, tables and relevant
attachments should normally be included.
During dissemination of statistical products, producers should ensure that the
products meet users’ needs as follows:
3.2.6.1 Update output systems
Make a dissemination and data preservation plan early in the statistical
production process, covering archiving, publishing and distribution. This
helps to verify and ensure that the statistical products released after all
the processing steps are consistent with the source data. For derived
variables, this means that one should be able to reproduce the same results
from the source data;
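The consistency requirement above can be sketched as a check that recomputes totals from the source records and compares them with the released figures. The field names, the figures and the function name check_reproducible are hypothetical, and summation stands in for whatever derivation was actually used.

```python
import math

def check_reproducible(source_rows, published_totals, key, value, rel_tol=1e-9):
    """Recompute totals from source rows; list groups that differ from the published figures."""
    recomputed = {}
    for row in source_rows:
        recomputed[row[key]] = recomputed.get(row[key], 0.0) + row[value]
    mismatches = []
    for group in sorted(set(recomputed) | set(published_totals)):
        from_source = recomputed.get(group)
        released = published_totals.get(group)
        if (
            from_source is None
            or released is None
            or not math.isclose(from_source, released, rel_tol=rel_tol)
        ):
            mismatches.append(group)
    return mismatches

# Hypothetical source records and released totals (illustrative values only).
source_rows = [
    {"district": "A", "births": 10},
    {"district": "A", "births": 15},
    {"district": "B", "births": 7},
]
published_totals = {"A": 25.0, "B": 8.0}
mismatches = check_reproducible(source_rows, published_totals, "district", "births")
```

An empty result indicates that every released total could be reproduced from the source data; any group listed needs to be traced back through the processing steps.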
3.2.6.2 Produce dissemination products
Produce dissemination products through: preparing the product components
(explanatory text, tables, charts, quality statements etc.); assembling the
components into products; and editing the products and checking that they
meet publication standards. This could include printed publications, press
releases, seminars, awareness programmes and websites;
3.2.6.3 Manage release of dissemination products
Manage the release of dissemination products to ensure that all elements for
the release are in place, including: timeliness and punctuality; accuracy and
reliability; transparency; accessibility and clarity; coherence and
comparability; and statistical confidentiality and security;
3.2.6.4 Promote dissemination products
Promote dissemination products for communicating statistical information to
users. This concerns the active promotion of the statistical products produced
in a specific statistical business process, to help them reach the widest
possible audience.
3.2.6.5 Manage user support
Manage user support to ensure that users’ queries and requests for services
such as micro-data access are recorded, and that responses are provided within
agreed deadlines. These queries and requests should be regularly reviewed to
provide an input to the overarching quality management process, as they can
indicate new or changing user needs.
3.2.7 Evaluation of the statistical program
This phase manages the evaluation of a specific instance of a statistical business
process. It logically takes place at the end of the instance of the process, but relies
on inputs gathered throughout the different phases. It includes evaluating the success
of a specific instance of the statistical business process, drawing on a range of
quantitative and qualitative inputs, and identifying and prioritising potential
improvements.
For statistical outputs produced regularly, evaluation should, at least in theory,
occur for each iteration, determining whether future iterations should take place.
However, in some cases, particularly for regular and well-established statistical
business processes, evaluation may not be formally carried out for each iteration. In
such cases, this phase can be seen as providing the decision as to whether the next
iteration should start from the Specify Needs phase, or from some later phase (often
the Collect phase).
This phase is made up of three processes, which are generally sequential, from left
to right, but which can overlap to some extent in practice. These processes are:
3.2.7.1 Gather evaluation inputs
The first process is gathering evaluation inputs. Evaluation material can be
produced in any other phase or sub-process. It may take many forms, including
feedback from users, process metadata, paradata, system metrics, and staff
suggestions. Reports of progress against an action plan agreed during a
previous iteration may also form an input to evaluations of subsequent
iterations. This sub-process gathers all of these inputs and makes them
available to the person or team producing the evaluation.
3.2.7.2 Conduct evaluation
The second process is to conduct the actual evaluation, which analyses the
evaluation inputs and synthesises them into an evaluation report. The
resulting report should note any quality issues specific to this iteration of
the statistical business process, and should make recommendations for changes
if appropriate. These recommendations can cover changes to any phase or
process for future iterations of the process, or can suggest that the process
is not repeated.
3.2.7.3 Agree an action plan
The third process is to develop and agree an action plan. It brings together
the necessary decision-making power to form and agree an action plan based on
the evaluation report. It should also include consideration of a mechanism for
monitoring the impact of those actions, which may, in turn, provide an input
to evaluations of future iterations of the process.
3.3 Fundamental Principles of Official Statistics
The Fundamental Principles of Official Statistics were adopted by the United Nations
Statistical Commission at the global level to ensure that the statistics produced
provide reliable, high-quality information to support evidence-based decision-making.
For administrative data to be recognized as official, the NSS has to comply with the
ten adopted principles of official statistics, as follows:
i. Relevance, impartiality and equal access: Official statistics provide an
indispensable element in the information system of a democratic society,
serving the government, the economy and the public with data about the
economic, demographic, social and environmental situation. To this end,
official statistics that meet the test of practical utility are to be compiled
and made available on an impartial basis by official statistical agencies to
honour citizens' entitlement to public information.
ii. Professional standards and ethics: To retain trust in official statistics, the
statistical agencies need to decide according to strictly professional
considerations, including scientific principles and professional ethics, on the
methods and procedures for the collection, processing, storage and
presentation of statistical data.
iii. Accountability and transparency: To facilitate a correct interpretation of the
data, the statistical agencies are to present information according to scientific
standards on the sources, methods and procedures of the statistics.
iv. Prevention of misuse: The statistical agencies are entitled to comment on
erroneous interpretation and misuse of statistics.
v. Sources of official statistics: Data for statistical purposes may be drawn
from all types of sources, be they statistical surveys or administrative records.
Statistical agencies are to choose the source with regard to quality,
timeliness, costs and the burden on respondents.
vi. Confidentiality: Individual data collected by statistical agencies for statistical
compilation, whether they refer to natural or legal persons, are to be strictly
confidential and used exclusively for statistical purposes.
vii. Legislation: The laws, regulations and measures under which the statistical
systems operate are to be made public.
viii. National coordination: Coordination among statistical agencies within
countries is essential to achieve consistency and efficiency in the statistical
system.
ix. Use of international standards: The use by statistical agencies in each
country of international concepts, classifications and methods promotes the
consistency and efficiency of statistical systems at all official levels.
x. International cooperation: Bilateral and multilateral cooperation in statistics
contributes to the improvement of systems of official statistics in all countries.
3.4 Integration of Gender Dimensions in Administrative Data
To ensure the integration of gender dimensions into administrative data, the following
are to be taken into account:
i. Specify gender data needs;
ii. Build the relationships with users of gender-specific data and create
awareness of data needs and the processes of data production;