A corporate approach to processing microdata in Eurostat

Similar documents
Integration of INSPIRE & SDMX data infrastructures for the 2021 population and housing census

A STRATEGY ON STRUCTURAL METADATA MANAGEMENT BASED ON SDMX AND THE GSIM MODELS

Proposals for the 2018 JHAQ data collection

Sinikka Laurila Statistics Finland,

Business microdata dissemination at Istat

METADATA MANAGEMENT AND STATISTICAL BUSINESS PROCESS AT STATISTICS ESTONIA

Validation report transmission guide for SBS Structural Business Statistics

Description of the European Big Data Hackathon 2019

Dissemination Web Service. Programmatic access to Eurostat data & metadata

EUROINDICATORS WORKING GROUP. Demetra+, a new seasonal adjustment tool 12 TH MEETING 3 RD & 4 TH DECEMBER 2009 EUROSTAT D5 DOC 287/09

7. Detail: Main SDMX objects for metadata exchange (What is SDMX? Part iii)

New IT solutions for item list management and data validation. 4 th Inter-Agency Coordinating Group Meeting October 23-25, 2017 Washington, DC

Economic and Social Council

Streamlining Data Compilation and Dissemination at ILO Department of Statistics Lessons Learned and Current Status

ESS Shared SERVices project Background, Status, Roadmap. Modernisation Workshop 16/17 March Bucharest

European Interoperability Framework

SDMX GLOBAL CONFERENCE

Statistics and Open Data

On the Design and Implementation of a Generalized Process for Business Statistics

CoE CENTRE of EXCELLENCE ON DATA WAREHOUSING

ESSnet on common tools and harmonised methodology for SDC in the ESS

UNCTAD Capacity Building on ICT measurement

Excel to SDMX Templates for Fisheries Statistics

Metadata Management in the FAO Statistics Division (ESS) Overview of the FAOSTAT / CountrySTAT approach by Julia Stone

3.4 Data-Centric workflow

MEASURING THE PERFORMANCE OF HEALTH SYSTEMS ACROSS EUROPEAN AND NON-EUROPEAN COUNTRIES: SPECIAL CONTENT AND FEATURES OF OECD HEALTH DATABASE

Directorate B: Quality, methodology and information systems

EUROSTAT and BIG DATA. High Level Seminar on integrating non traditional data sources in the National Statistical Systems

Information Infrastructure: Foundations for ABS Transformation. Stuart Girvan, Australian Bureau of Statistics MSIS Paris, April 2013.

A New Data Structure and Codification for Balance of Payments. Rodrigo Oliveira-Soares and René Piche SDMX Conference, Washington DC 2 May 2011

Business Architecture concepts and components: BA shared infrastructures, capability modeling and guiding principles

XML-Publishing Implementation Strategy of an XML-based publishing in Eurostat

Vision 2020 for Statistical Classifications of Economic Activities and Products

Sampling Error Estimation SORS practice

A new international standard for data validation and processing

ESSnet. Common Reference Architecture. WP number and name: WP2 Requirements collection & State of the art. Questionnaire

DIRECTORS OF METHODOLOGY/IT DIRECTORS JOINT STEERING GROUP 18 NOVEMBER 2015

Metadata projects in Sweden and the Nordic countries. C-G Hjelm, Statistics Sweden

The United Nations Crime Trends Survey (UN-CTS) Michael Jandl Statistics and Surveys Section UNODC

NordMAN: The Nordic Microdata Access Network

Item 3 Granting access to microdata for research purposes - an overview

EUDAT. Towards a pan-european Collaborative Data Infrastructure - A Nordic Perspective? -

INSPIRE & Environment Data in the EU

Joint UNECE/Eurostat/OECD work session on statistical metadata (METIS) (Geneva, 3-5 April 2006)

GUIDELINES ON DATA FLOWS AND GLOBAL DATA REPORTING FOR SUSTAINABLE DEVELOPMENT GOALS

Standard Operating Procedure. Data Collection and Validation. Public

edamis Web Forms for sending data to Eurostat

Privacy and Security Aspects Related to the Use of Big Data Progress of work in the ESS. Pascal Jacques Eurostat Local Security Officer 1

Business Architecture concepts and components: BA Process Flow

News from Nuremberg. 4 th Workshop on Data Access (WDA) 2012/03/26, Luxembourg. Stefan Bender, IAB David Schiller, IAB

Technical Group Health and Health Interview Survey (HIS) Statistics

INSPIRE status report

Jörg Frauenstein. EUGRIS European dimension for availability and accessibility of information sources for contaminated land, soil and water

METADATA FLOWS IN THE GSBPM. I. Introduction. Working Paper. Distr. GENERAL 26 April 2013 WP.22 ENGLISH ONLY

Document for item 4.2 of the agenda

Current status of WP3: smart meters

Guidelines. for sharing of ESS news. across NSI websites. using RSS

Technical aspects of VTL to SQL translation Prepared by Regional Statistical Office in Olsztyn, Poland

2011 INTERNATIONAL COMPARISON PROGRAM

Making Open Data work for Europe

The C3S Climate Data Store and its upcoming use by CAMS

on Air Quality Management and Assessment

European Marine Data Exchange

TOOP Introducing The Once-Only Principle project

Data exchange using EDAMIS. Item 2.1 of the agenda SISAI-Metadata Working Group meeting 2015

Metadata How it has evolved in EU

ULISSE. Carlo Albanese Telespazio. ASI Workshop on Space Foundations Rome, 29 May 2012

2011 INTERNATIONAL COMPARISON PROGRAM

USE CASES IN SEISMOLOGY. Alberto Michelini INGV

Generic Statistical Business Process Model

A Unit-base derived from the SBR. Springboard to a Data Lake

HORIZON 2020 WORK PROGRAMME I: INFORMATION AND COMMUNICATION TECHNOLOGIES

Global Assessment of the National Statistical System and National Strategy of Development of Statistics

Metadata and classification system development in Bosnia and Herzegovina

WP.25 ENGLISH ONLY. Supporting Paper. Submitted by Statistics Finland 1

INSPIRE on tools workshop

Building Semantic Interoperability in Europe

EDIT QUICK START USER GUIDE

CORA COmmon Reference Architecture

TAIEX Expert Database Guidelines for National Contact Points. Version 1.0

Interoperability and transparency The European context

Integrated Data Processing System (EAR)

ESSPROS Task Force on Methodology November Qualitative data review

StatDCAT-AP. A Common Layer for the Exchange of Statistical Metadata in Open Data Portals

ECAS User Guide for external EDAMIS users - How to request an account / reset a password -

EUROPEAN COMMISSION EUROSTAT. Directorate G :Global Business Statistics Unit G-2: Structural business statistics and global value chains

Microdata Management Toolkit (MMT) National Data Archive (NADA)

Open Access to Publications in H2020

Quick User Guide for TCO EDAMIS 3.3

Nationally Determined Contribution (NDC) Registry Submission Portal. User Guide for Parties. Version 1 (2016) UNFCCC

ESSnet. COmmon Reference Environment. WP number and name: WP1 Project Management. Deliverable number and name: 1.1 Preliminary Report

M403 ehealth Interoperability Overview

The European Commission s science and knowledge service. Joint Research Centre

A metadata-driven process for handling statistical data end-to-end

Access to Data / Access to Market

Reportnet Architecture

SDMX self-learning package No. 7 Student book. SDMX Architecture Using the Pull Method for Data Sharing

Report from UN-GGIM: Europe A year in review

Data Entry Tool, version 3.0

Functional & Technical Requirements for the THECB Learning Object Repository

Transcription:

A corporate approach to processing microdata in Eurostat Pál JANCSÓK and Christine WIRTZ Eurostat Unit B4 1

Agenda Introduction Generic SAS Tool (GSAST) architecture Microdata processing Architecture Metadata in GSAST Structure Management Possible extension of GSAST to data providers Current processing Self service Conclusions 2

Introduction Eurostat is the statistical office of the European Union and a General Directorate of the European Commission. Eurostat is the central institution of the European Statistical System (ESS) - a network of National Statistical Institutes (NSI) from all EU and EFTA Countries. Eurostat's mission: to be the leading provider of high quality statistics on Europe. 4

Introduction Streamline the statistical production: Internal efforts External efforts Harmonisation, and consolidation. Modern, generic tools for data processing. Cooperation with data providers, sharing infrastructure. Remote access for data quality. Internal Generic SAS Tool Shared Generic SAS Tool 5

Agenda Introduction Generic SAS Tool (GSAST) architecture Microdata processing Architecture Metadata in GSAST Structure Management Possible extension of GSAST to data providers Current processing Self service Conclusions 6

Microdata Processing Stage 1: Regular production chain Receipt of data Country processing Aggregation Disseminate Member States EDAMIS 7

Microdata Processing Stage 1: Regular production chain Receipt of data Country processing Aggregation Disseminate Data revision, justification Revised data Data quality checks Error and quality report 8

Microdata Processing Stage 1: Regular production chain Receipt of data Country processing Aggregation Disseminate Calculation of European statistics; Disclosure control, etc. 9

Microdata Processing Stage 1: Regular production chain Receipt of data Country processing Aggregation Dissemination www.ec.europa.eu/eurostat Anonymised microdata for researchers Data Explorer 10

Microdata Processing Stage 1: Regular production chain Receipt of data Country processing Aggregation Disseminate Stage 2: Additional ad-hoc processing further analyse the data provide statistics to answer external requests additional publications 11

GSAST, the corporate approach Requirements: Use state-of-the-art technologies; Apply a centralised approach; Modular design including generic and reusable modules for all data collections; Standard user interface; Possibility to perform additional computations on the data (Stage 2); Easy to use and easy to learn by statisticians; Maintenance of the tool should be easy and could preferably be performed by the statisticians. 12

GSAST, the corporate approach Implementation: SAS Business Intelligence platform: Server Client approach; Common server-based tool with central development and support services; SAS Enterprise Guide (EG) as user interface with stored processes; Full computational power of SAS is available for additional processing; Easy to learn common user interfaces; Use of metadata to parameterise the production processes. 13

GSAST Workflow example Continuous Vocational Training Survey 14

GSAST in Use GSAST is in use for 9 different surveys. In SAS EG the user initiates the operation one by one Parameters are provided by users. Feedback is provided about the operations. 15

Agenda Introduction Generic SAS Tool (GSAST) architecture Microdata processing Architecture Metadata in GSAST Structure Management Possible extension of GSAST to data providers Current processing Self service Conclusions 16

Metadata in GSAST: Structure Technical metadata Defines the computational environment SAS BI architecture Process definition metadata Input parameters Process parameters Structural metadata Datasets Variables Statistical metadata Flags Footnotes Code lists 17

Metadata in GSAST: Management Maintenance of the surveys mainly by means of metadata updates Complex metadata structure Integrity constraints Difficult to maintain manually GSAST Metadata Editor 18

Metadata in GSAST: Management GSAST Metadata Editor the maintenance tool Integrated to the existing GSAST as an EG add-in; User-friendly way to browse complex metadata structures; Parametric and metadata driven; Metadata editing and validation to maintain integrity; Centralised management of code lists by the SDMX registry. 19

Agenda Introduction Generic SAS Tool (GSAST) architecture Microdata processing Architecture Metadata in GSAST Structure Management Possible extension of GSAST to data providers Current processing Self service Conclusions 20

GSAST extension for data providers Well-established workflows for microdata collections at NSIs. Data Quality checks at Eurostat at different levels. There is an overlap between the processing workflows at the NSIs and at Eurostat. Provide remote access to NSIs for the Country processing in the GSAST chain. Self-Service approach. 21

GSAST: Current processing 1. Data Upload 2. Country Processing 3. Feedback on the data 4. Data Upload 5. Country Processing 6. Multiple Countries Processing EDAMIS GSAST Country Resposibility Eurostat Responsibility Country Loop 22

GSAST: Proposed workflow 1. Data Upload 2. Country Processing 3. Feedback on the data 4. Data Upload 5. Country Processing 6. Multiple Countries Processing GSAST 1: Web Based Interface GSAST 2 Country Resposibility Country Loop Eurostat Responsibility 23

Agenda Introduction Generic SAS Tool (GSAST) architecture Microdata processing Architecture Metadata in GSAST Structure Management Possible extension of GSAST to data providers Current processing Self service Conclusions 24

Conclusions: The GSAST platform is proven to be useful to process of several microdata collections. GSAST Metadata Editor solved the problem of the relatively heavy metadata for survey maintenance. Possible extension towards NSIs presents several challenges. While the proposed architecture ensures secure treatment of microdata, the adaptation of working methods presents a major challenge which is, however, fully in line with the joint strategy of the ESS. 25