WORKING GROUP ON PASSENGER MOBILITY STATISTICS

Document: PM-2003-05/EN
Original: English

"Transport Statistics"
Luxembourg, 24-25 April 2003
Jean Monnet Building, Room M5
Beginning 10:00 am

Database and Data Retrieval - Data Analysis Software ELMIS
Item 5.5 of the agenda

Design and Application of a Travel Survey for European Long-distance Trips Based on an International Network of Expertise
5th Framework Programme, Competitive and Sustainable Growth - European Commission DG TREN
University of Maribor, Slovenia

1 INTRODUCTION

The database brings together a consistent set of long-distance passenger travel information collected throughout the European Union and Switzerland. Information coded from paper questionnaires and CATI systems forms the basis of this database. Apart from coded information, the database also contains other valuable information such as weighting and projection factors. The results are available through the European Long-distance Mobility Information System (ELMIS), a web-based application supported by the NESSTAR statistical engine. ELMIS delivers information about the survey concepts as well as the collected data and tools for data analysis.

2 DATABASE, CODING AND DATA RETRIEVAL

2.1 DATABASE

The database provides information about long-distance mobility in the European Union. The contents describe long-distance mobility at several levels of detail. The database holds information at household and person levels. Travel behaviour information such as journeys and trips is linked to both levels. In addition, travel information itself is structured into two levels of detail: journey level and trip level. The journeys provide general information about travel behaviour, whilst trips give a deeper insight into the journeys of particular interest to the project. In addition to journeys and trips, the database also contains information about commuting journeys.

In order to avoid data redundancy and to preserve all the relationships, the database is structured as a set of normalised relational tables. Figure 1 in the Annex presents an entity-relationship diagram of the database structure.

The survey design allows for different survey methods to be applied, and the choice of method affects the data content. In telephone surveys, travel information is in most cases gathered on just one person from the household. In postal surveys, households are asked to provide information for all members of the household.
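The household, person, journey and trip levels described above can be pictured as a small set of linked tables. The following is a minimal sketch only: the table and column names are illustrative assumptions, not the project's actual definitions, and the commuting, participant and excursion entities are left out for brevity.

```python
import sqlite3

# Hypothetical, simplified schema. All table and column names are
# assumptions made for illustration; the real structure is the one
# documented in the project's entity-relationship diagram.
SCHEMA = """
CREATE TABLE households (
    hh_id   INTEGER PRIMARY KEY,
    country TEXT NOT NULL
);
CREATE TABLE persons (
    person_id INTEGER PRIMARY KEY,
    hh_id     INTEGER NOT NULL REFERENCES households(hh_id)
);
CREATE TABLE journeys (
    journey_id INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES persons(person_id),
    purpose    TEXT
);
CREATE TABLE trips (
    trip_id    INTEGER PRIMARY KEY,
    journey_id INTEGER NOT NULL REFERENCES journeys(journey_id),
    direction  TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO households VALUES (1, 'SI')")
    conn.executemany("INSERT INTO persons VALUES (?, ?)", [(10, 1), (11, 1)])
    conn.executemany(
        "INSERT INTO journeys VALUES (?, ?, ?)",
        [(100, 10, "holiday"), (101, 10, "business"), (102, 11, "holiday")],
    )
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?, ?)",
        [(1000, 100, "outward"), (1001, 100, "return")],
    )
    conn.commit()
    return conn


def household_of_trip(conn, trip_id):
    """Follow the keys from a trip up to its household; None if not found."""
    row = conn.execute(
        """
        SELECT p.hh_id
        FROM trips t
        JOIN journeys j ON j.journey_id = t.journey_id
        JOIN persons p ON p.person_id = j.person_id
        WHERE t.trip_id = ?
        """,
        (trip_id,),
    ).fetchone()
    return row[0] if row else None


def journeys_per_household(conn):
    """Count journeys per household by joining the person and journey levels."""
    rows = conn.execute(
        """
        SELECT p.hh_id, COUNT(j.journey_id)
        FROM persons p
        JOIN journeys j ON j.person_id = p.person_id
        GROUP BY p.hh_id
        """
    ).fetchall()
    return dict(rows)
```

Because each fact is stored once and linked by keys, a trip can be traced back to its household without copying household attributes into every trip record, which is the redundancy the normalised design is meant to avoid.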
The second specialty of the survey is its two-phase data collection process. As mentioned above, journeys, trips and excursions are collected. Information on journeys was collected during the first phase. Later, journeys were divided into trips and excursions; this information was collected during the second phase. However, not all journeys have information about trips and excursions. In the main survey this information was only collected for journeys of particular interest, chosen according to a defined selection rule. The database also contains journeys outside the agreed reporting period in cases where no journeys within the reporting period were discovered. Along with this information, the database contains information from exploration and non-response surveys.

The database can be accessed and downloaded from the ELMIS web site at: http://cgi.fg.uni-mb.si/elmis. It can also be obtained on CD-ROM free of charge from:

University of Maribor
Faculty of Civil Engineering
Construction IT Centre
Smetanova 17
SI 2000 Maribor
Email: elmis@uni-mb.si

The database includes data from the 15 European Union Member States and Switzerland. It is possible to acquire either subsets of the database by country or the complete database. The database is available as a set of comma-separated ASCII files. These files are exports from the relational tables of the MS SQL database system. Each relational table is extracted as a separate file. All files are put together, compressed and made available as a self-extracting executable file.

2.2 DATABASE BUILDING PROCESS

The database is the result of work carried out in the project by many partners from various European countries. The process of database building started concurrently with the data collection process. The data collection process began with the coding of the survey questionnaires, for which the survey organisations were responsible. The coding process required digitising the information provided by respondents and geocoding the places named in the questionnaires.

On a regular basis and according to a predefined schedule, site administrators sent their coded data to the data centre at the University of Maribor. At the data centre, all partial databases were integrated into one database, which was checked for consistency and errors. Error reports were generated for each survey organisation and sent back for correction. When all the errors had been corrected, several updates were performed on the database. Calculated values such as journey distance, number of journeys per household and person, NUTS codes, etc. were derived from existing information. The complete database was then sent for data analysis, in which two main tasks were carried out: first, the implementation of weighting and projection; second, the derivation of the main long-distance mobility indicators.

2.3 CODING

The project provided software tools, free of charge to all survey organisations, to support coding, geocoding, error checking and the transfer of information to a central location, although survey organisations were free to use any other software tool for the coding. Where survey organisations preferred to use their own coding software, the project defined a procedure for data preparation: these organisations prepared comma-delimited ASCII files for sending to the data centre. The European Coding Book defined the file structure, contents, code lists and other required characteristics of the resulting data files.

Geocoding was a particular problem for organisations that had not used the project software. For this purpose, the database of place names was extracted to a text file listing all places. If mapping support was needed, a survey organisation had to provide its own. Some survey organisations that used their own coding software took advantage of the Geocode It application to support the geocoding process.

Data integration started quite early, in parallel with the data collection and coding procedures.
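As an illustration of the kind of check a data centre might run on an incoming comma-delimited file, the sketch below validates column presence and coded values against code lists and produces an error list. It is not the project's error-checking software: the column names, code lists and message wording are hypothetical, and the real definitions are those of the European Coding Book.

```python
import csv
import io

# Hypothetical code lists and columns, invented for this example.
CODE_LISTS = {
    "mode": {"1", "2", "3", "4"},
    "purpose": {"1", "2", "3"},
}
REQUIRED_COLUMNS = ["journey_id", "mode", "purpose"]


def check_file(text):
    """Return a list of (line_number, message) tuples for one data file."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        # Structural problem: report it against the header line and stop.
        return [(1, "missing columns: " + ", ".join(missing))]

    errors = []
    for row in reader:
        for column, allowed in CODE_LISTS.items():
            value = (row[column] or "").strip()
            if value == "":
                errors.append((reader.line_num, f"{column}: item non-response"))
            elif value not in allowed:
                errors.append((reader.line_num, f"{column}: unknown code {value!r}"))
    return errors


sample = "journey_id,mode,purpose\n1,2,1\n2,9,1\n3,,2\n"
for line_number, message in check_file(sample):
    print(f"line {line_number}: {message}")
```

Reporting the line number with each message is what makes an error list usable by the organisation that has to correct the file and send it back.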
Survey organisations received a schedule for sending the data, which contained four sending intervals. Each interval consisted of two stages. In the first stage, survey organisations sent the data coded so far. At the data centre the error-checking procedure was performed, and the results, in the form of error lists and suggestions about possible causes, were sent back to the survey organisation. In the second stage, the survey organisation tried to correct all the errors that had been discovered, and after three weeks it sent a new, corrected dataset back to the data centre. According to the schedule, all questionnaires were to be coded after four such iteration cycles. The schedule was tailored to the needs of each country and survey organisation, because survey work did not start at the same time in all countries.

2.4 DATA INTEGRATION AND PROCESSING

After the data reached the data centre, they were processed and checked for plausibility, and feedback was provided to the survey organisations. Figure 2 in the Annex presents an overview of data processing.

The process of coding and error correction was followed by data preparation. The aim of this task was to prepare the database for weighting and projection, data analysis, finalisation of the database and the creation of exports. Data preparation included the calculation of derived values: in addition to the raw data, new variables were defined and calculated to support the processes that follow data collection and coding, such as weighting and statistical analyses.

Data preparation also included data imputation. Trip information was generated for each one-day journey. For such journeys, trip information was not coded; instead, variables from the journey record were copied to trip records. For each one-day journey two trip records were created: the outward trip and the return trip.

2.5 GEO-CODING

The project implemented a mobility study. On the one hand, the results therefore answer questions such as why, when and how someone travels. On the other hand, the information collected reveals another important dimension of mobility: the spatial distribution of passenger activities. For every geocoding activity applied after the actual journey, a reference list of geocoded places was needed.
Since the objective is to present the data at a regional level, using the NUTS administrative classification, it was decided to use the city (town, village) as the smallest unit for collecting geographical information. This place-name geocoding was also recommended by other projects, e.g. MEST and TEST (projects funded by the EC under the 4th Framework Programme dealing with methods and technologies for long-distance mobility studies), for several reasons: (a) people are not keen on giving out exact address information, (b) they easily forget more detailed information, or (c) they do not know the exact address.

Two databases were selected for compilation of the database of places. The main source for place data was the GEOnet Names Server (GNS). The GNS is a US system and provides access to the National Imagery and Mapping Agency's (NIMA) database of foreign geographic feature names. The database is the official repository for foreign place name decisions approved by the US Board on Geographic Names (US BGN). Since the GNS only includes places outside the US, the Geographic Names Information System (GNIS), developed by the USGS in co-operation with the US BGN, was used to cover places inside the US.

Survey organisations were responsible for coding and also for geocoding places. This proved to be a good solution, since the majority of places to be geocoded were in an area familiar to the coders. In addition, the coders speak the same language as the respondents, which is important with respect to different spellings of places. Most survey respondents use the spelling the coder understands.

To support geocoding, the project provided software tools to the survey organisations. This support was integrated into the Collect It coding application, so that geocoding was an integral part of the whole coding process. The application also included mapping functionality. Survey organisations that decided to use their own software for coding could use a standalone geocoding tool provided by the project or rely on their own implementation of geocoding. For this purpose, the project provided a list of all place names.

One important result of geocoding was distance calculation. During error checking, journeys were tested to see if the calculated distance from the journey origin to the journey destination was greater than 100 km. In the end, all short journeys were excluded from the final database.
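The distance test can be sketched as follows. The text does not state how the distance was calculated, so the haversine great-circle formula used here is an assumption, as are the function names, the example coordinates and the default threshold of 100 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees.

    Assumption: the project's own distance method is not documented here;
    haversine is used only as a plausible stand-in.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_long_distance(origin, destination, threshold_km=100.0):
    """True if the origin-destination distance exceeds the threshold.

    origin and destination are (latitude, longitude) tuples; the default
    threshold is an assumption for this sketch.
    """
    return great_circle_km(*origin, *destination) > threshold_km


# Approximate coordinates, for illustration only.
maribor = (46.5547, 15.6459)
luxembourg = (49.6116, 6.1319)
ptuj = (46.4200, 15.8700)

print(is_long_distance(maribor, luxembourg))  # far apart
print(is_long_distance(maribor, ptuj))        # neighbouring towns
```

A straight-line measure of this kind underestimates the distance actually travelled, so journeys close to the threshold would be the ones most sensitive to the choice of method.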

In addition to distance calculation, the second purpose for collecting geographic information was the preparation of origin-destination matrices (O-D matrices). The project delivers O-D matrices at the regional level, consistent with the NUTS classification. For the purpose of matrix building, it was necessary to create aggregations of geocoded places. Therefore, besides existing information, each place received a code for the region in which it is located. The project has the objective of analysing the data at the NUTS level. Despite this, more detailed NUTS 3 codes have been assigned to the places, since the maps obtained allow for that level of detail. For matrix building, the NUTS classification was used; however, users can utilise the more detailed codes for their own purposes.

3 EUROPEAN LONG-DISTANCE MOBILITY INFORMATION SYSTEM

3.1 OVERVIEW

The European Long-distance Mobility Information System (ELMIS) is a result of the project and is generally known as Deliverable 9 (D9). The aim of ELMIS is to make the data collected in the survey available to the general public. Since the project intends to deliver the results to a broad audience using contemporary media, the decision was made to develop an Internet-based data retrieval and analysis system. The system is available at the web address http://cgi.fg.uni-mb.si/elmis.

ELMIS combines all efforts of the project and allows the outside community to explore the outcomes. In this sense, ELMIS should be considered a medium that gives access to expertise on long-distance mobility and surveys, and to the information and experience gained through the large amount of work carried out in collecting and processing the data. ELMIS is a web site supported by applications that deliver the survey results in a highly interactive way.
To allow ELMIS users to browse through the database and perform statistical analyses, the system integrates the NESSTAR server application, which contains a statistical engine. When using ELMIS, the user interacts with the NESSTAR statistical engine through the NESSTAR Light Client (NLC). For the purposes of ELMIS, the client has been adapted slightly from the standard client. More information about NESSTAR can be found at www.nesstar.com.

The second application behind the system is the O-D matrices browser. This application is based on an O-D matrices viewer developed by the Peter Davidson Consultancy in England. The original matrix viewer is a Windows application and has been upgraded with a web interface to meet the needs of ELMIS.

To give the user a quick overview of basic results, a number of main indicators constructed from the survey data have been prepared and can be easily accessed online. For an extended analysis of the results, ELMIS provides the original database with all necessary descriptions. This database can be downloaded from the ELMIS site (web address will be announced). As a supplement to the results, further information about the survey is given via Internet links relevant to the project (e.g. location of deliverables) and through additional blocks of background information. ELMIS also includes fundamental project information about dissemination events, involved partners and other related activities.

3.2 TABULATIONS AND CHARTS

The core functionality of ELMIS is found in the NESSTAR Light Client. This is a set of web pages that form a user interface for statistical analysis and the construction of charts. The client allows users to search for, locate, browse, analyse and download a wide variety of statistical and background information from the NESSTAR server within a web browser. No specialised software is needed to view the data stored on the NESSTAR server. The server also gives access to metadata (descriptions of data tables and variables) and contains basic information about the survey itself, with links to other parts of ELMIS.

The metadata is structured according to the Data Documentation Initiative (DDI). The DDI is an internationally recognised standard for the creation, presentation and preservation of metadata. More information about the DDI can be found at http://www.icpsr.umich.edu/ddi/index.html

During data preparation, weighting variables were marked so that the NESSTAR system can recognise them. The user can select the variables and the weights he or she wants to use for the preparation of a desired set of statistics. The data can be exported in several formats compatible with common statistical analysis tools. Through the NESSTAR Light Client, ELMIS offers the following export formats:

- SPSS system file
- SPSS portable file
- NSDstat
- Statistica
- Stata
- Data Interchange Format (DIF) (suitable for use in Excel)
- Dbase 3
- SAS

3.3 O-D MATRICES

Since the project is concerned with long-distance mobility, one of the key project reports is about O-D matrices. For the purpose of the project, it was decided that for all 15 EU countries, the O-D matrices should be constructed following the regional differentiation of NUTS. If a destination lies outside the EU, a defined system of superimposed zones applies. Most, but not all, of these zones correspond to a country. During the development of the matrix, the modal split was taken into account as a third feature. The user is able to explore journeys undertaken by plane, train, car or some other mode of transport. Matrices showing all modes of transport are also available.

Through ELMIS, the user can view the O-D matrices in tabular or graphical form. In both cases, origins by region and destinations can be selected from the respective lists offered on the matrix page. In addition to places, the mode of transport can also be selected.
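The construction of a weighted O-D matrix with an optional mode filter can be sketched as below. This is not the project's matrix-building procedure: the region codes, the place-to-region lookup, the record layout and the weights are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical lookup from geocoded place to region code. "R1", "R2" and
# "R3" are invented labels, not real NUTS codes.
PLACE_TO_REGION = {
    "Maribor": "R1",
    "Ljubljana": "R1",
    "Graz": "R2",
    "Wien": "R3",
}


def build_od_matrix(journeys, mode=None):
    """Sum journey weights per (origin region, destination region) cell.

    journeys: iterable of (origin_place, destination_place, mode, weight).
    mode: if given, only journeys made by that mode are counted; if None,
    all modes are combined.
    """
    matrix = defaultdict(float)
    for origin, destination, journey_mode, weight in journeys:
        if mode is not None and journey_mode != mode:
            continue
        cell = (PLACE_TO_REGION[origin], PLACE_TO_REGION[destination])
        matrix[cell] += weight
    return dict(matrix)


# Made-up journey records with made-up weights.
journeys = [
    ("Maribor", "Graz", "car", 1.5),
    ("Ljubljana", "Graz", "train", 2.0),
    ("Maribor", "Wien", "car", 1.0),
]

all_modes = build_od_matrix(journeys)
car_only = build_od_matrix(journeys, mode="car")
```

Summing weights rather than counting records means each cell estimates journeys in the population rather than journeys in the sample, which is the reason for carrying the weighting variables through to the matrix stage.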

ANNEX

Figure 1: Entity-relationship diagram of the database structure
[Diagram not reproduced. Entities shown: HH_State, Households, Persons, Commuting, Journeys, Participants, Trips, Excursions.]

Figure 2: Data integration flowchart
[Flowchart not reproduced; the order of steps cannot be recovered reliably from the transcription. Steps shown: take one country's data from the FTP server; convert ASCII to Paradox (if the coded files are in ASCII format); check for errors, warnings, geo-errors and item non-response, producing error reports by received data packet; comment on errors and advise on future coding, producing comments and customised error reports; correct or exclude data packets with errors; join partial databases once all data packets are prepared; import data to the MSSQL database; assign geographic coordinates to geocoded places from the list of places with coordinates; import or generate the "interviewed person" flag from the lists of interviewed persons; import or calculate the number of household participants for journeys from the household group size files; calculate derived variables; prepare exports for weighting, producing data exports by country.]