GEOSS Common Infrastructure and the Big Data challenges

Similar documents
THE GEOSS PLATFORM TOWARDS A BIG EO DATA SYSTEM LINKING GLOBAL USERS AND DATA PROVIDERS

The GEO Discovery and Access Broker

Multi-disciplinary Interoperability: the EuroGEOSS Operating Capacities

DATA SHARING AND DISCOVERY WITH ARCGIS SERVER GEOPORTAL EXTENSION. Clive Reece, Ph.D. ESRI Geoportal/SDI Solutions Team

Next GEOSS der neue europäische GEOSS Hub

GENeric European Sustainable Information Space for Environment.

Multi-disciplinary interoperability challenges

SEXTANT 1. Purpose of the Application

Citizens Science and Smart Cities JRC Ispra 5-7 February The Brokering Approach

Esri Geoportal Server

Introduction to Prod-Trees

GEO Update and Priorities for 2014

Webservice-energy.org GEO Community Portal & Spatial Data Infrastructure for Energy

The GeoPortal Cookbook Tutorial

Discovery and Access of Geospatial Resources Using GIS Portal Toolkit Marten Hogeweg Product Manager GIS Portal Toolkit

IMAGERY FOR ARCGIS. Manage and Understand Your Imagery. Credit: Image courtesy of DigitalGlobe

Increasing dataset quality metadata presence: Quality focused metadata editor and catalogue queriables.

Call for Participation in AIP-6

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

How to become an INSPIRE node and fully exploit the investments made?

NextData System of Systems Infrastructure (ND-SoS-Ina)

Catalog-driven, Reproducible Workflows for Ocean Science

INSPIRE & Environment Data in the EU

natural disasters: the role of the Group

The geospatial metadata catalogue. FOSS4G Barcelona. Jeroen Ticheler. Founder and chair. Director

The EOC Geoservice: Standardized Access to Earth Observation Data Sets and Value Added Products ABSTRACT

GEOSS Common Infrastructure (GCI) Research Engineering Report GEO Architecture Implementation Pilot, Phase 5 GEOSS Architecture Implementation Pilot

PRODUCT BROCHURE ERDAS APOLLO MANAGING AND SERVING GEOSPATIAL INFORMATION

Achieving Interoperability using the ArcGIS Platform. Satish Sankaran Roberto Lucchi

Air Quality Community Scenario Engineering Report GEOSS Architecture Implementation Pilot, Phase 2 Version <1.4>

Uniform Resource Locator Wide Area Network World Climate Research Programme Coupled Model Intercomparison

Introduction to INSPIRE. Network Services

The Common Framework for Earth Observation Data. US Group on Earth Observations Data Management Working Group

EPOS a long term integration plan of research infrastructures for solid Earth Science in Europe

Addressing Geospatial Big Data Management and Distribution Challenges ERDAS APOLLO & ECW

SMARTERDECISIONS. Geospatial Portal 2013 Open Interoperable GIS/Imagery Services with ERDAS Apollo 2013 and ERDAS Imagine 2013

SPACE for SDGs a Global Partnership

ERDAS APOLLO Managing and Serving Geospatial Information

Growing Variety and Volume of Remote Sensing and In Situ Data

European Space Policy

GCI Research and GeoViQua Integration Engineering Report GEOSS Architecture Implementation Pilot Phase 6. Version 1.0

Earth Observation Imperative

Copernicus Space Component. Technical Collaborative Arrangement. between ESA. and. Enterprise Estonia

Leveraging OGC Services in ArcGIS Server. Satish Sankaran, Esri Yingqi Tang, Esri

Standards and business models transformations

Esri Support for Geospatial Standards

Leveraging OGC Standards on ArcGIS Server

Leveraging metadata standards in ArcGIS to support Interoperability. David Danko and Aleta Vienneau

Making Open Data work for Europe

Greece s Collaborative Ground Segment Initiatives

Understanding and Using Metadata in ArcGIS. Adam Martin Marten Hogeweg Aleta Vienneau

Global Earth Observation System of Systems. GEO Secretariat Geneva, Switzerland

EXTRA Examples of OGC standards in support of health applications

Initial Operating Capability & The INSPIRE Community Geoportal

/// INTEROPERABILITY BETWEEN METADATA STANDARDS: A REFERENCE IMPLEMENTATION FOR METADATA CATALOGUES

Compass INSPIRE Services. Compass INSPIRE Services. White Paper Compass Informatics Limited Block 8, Blackrock Business

Data discovery and access via the SeaDataNet CDI system

The Many Facets of THREDDS Thematic Real-time Environmental Distributed Data Services

Sharing Environmental Information In Action. Chris Steenmans European Environment Agency

Some Big Data Challenges

Implementing a Data Quality Strategy to simplify access to data

EUDAT-B2FIND A FAIR and Interdisciplinary Discovery Portal for Research Data

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation

Monitoring the Environment for Climate Change: The case of GMES

The NCAR Community Data Portal

HDF Product Designer: A tool for building HDF5 containers with granule metadata

EVOlution of EO Online Data Access Services (EVO-ODAS) ESA GSTP-6 Project by DLR, EOX and GeoSolutions (2015/ /04)

IMPROVING THE INFRASTRUCTURE FOR EARTH OBSERVATION, ACTIONS AND UPDATES FROM THE SENTINEL ALPINE OBSERVATORY AT EURAC RESEARCH

Oracle Spatial Users Conference

Interoperability Between GRDC's Data Holding And The GEOSS Infrastructure

The NextGEOSS Project

SDI SOLUTIONS FOR INSPIRE: TECHNOLOGIES SUPPORTING A FRAMEWORK OF COOPERATION

GSCB-Workshop on Ground Segment Evolution Strategy

Leveraging OGC Services in ArcGIS Server. Satish Sankaran Yingqi Tang

Introduction to SDIs (Spatial Data Infrastructure)

GMES Status November 2004

Enabling Efficient Discovery of and Access to Spatial Data Services. CHARVAT, Karel, et al. Abstract

International Organization for Standardization Technical Committee 211 (ISO/TC211)

Jeffery S. Horsburgh. Utah Water Research Laboratory Utah State University

Big Data Earth Observation Standardization elements Codrina Ilie TERRASIGNA TF7/SG5

GeoDCAT-AP Representing geographic metadata by using the "DCAT application profile for data portals in Europe"

The What, Why, Who and How of Where: Building a Portal for Geospatial Data. Alan Darnell Director, Scholars Portal

(Geo)DCAT-AP Status, Usage, Implementation Guidelines, Extensions

GEOSPATIAL ERDAS APOLLO. Your Geospatial Business System for Managing and Serving Information

CWIC Data Partner s Guide (OpenSearch) Approval Date:

SII Law Organization Coordination activities Examples of good practices Education Technical matters Success stories Challenges

Copernicus Data and Information Access Services(DIAS)

Interoperability with ArcGIS

Presenting Quality Information : From Dataset Quality to Individual Sample. INSPIRE Workshops. GeoViQua project

Providing Interoperability Using the Open GeoServices REST Specification

Outline. The Collaborative Research Platform for Data Curation and Repositories: CKAN For ANGIS Data Portal. Open Access & Open Data.

Earth Science Community view on Digital Repositories

Leveraging metadata standards in ArcGIS to support Interoperability. Aleta Vienneau and Marten Hogeweg

Italy - Information Day: 2012 FP7 Space WP and 5th Call. Peter Breger Space Research and Development Unit

NetCDF and Related Interna/onal Standards

FP7-INFRASTRUCTURES Grant Agreement no Scoping Study for a pan-european Geological Data Infrastructure D 4.4

INSPIRE Spatial Data on the Web building a user-friendly webby SDI

Achieving Interoperability Using Open Standards

Toward the Development of a Comprehensive Data & Information Management System for THORPEX

Spatial Data on the Web

Transcription:

16th Workshop on meteorological operational systems 1-3 March, 2017 GEOSS Common Infrastructure and the Big Data challenges S. Nativi (1), J. Van Bemmelen (2), M. Santoro (1), G. Colangeli (2) O. Ochiai (3), P. De Salvo (3) (1) Institute of Atmospheric Pollution Research, National Research Council of Italy (2) European Space Agency (3) GEO Secretariat

Group on Earth Observation and Global Earth Observation system of systems GEO AND GEOSS

The Group on Earth Observation (GEO) GEO is a partnership of more than 100 national governments and in excess of 100 Participating Organizations that envisions a future where decisions and actions for the benefit of humankind are informed by coordinated, comprehensive and sustained Earth observations. Ministers of the GEO member governments meet periodically to provide the political mandate and overall strategic direction for GEO. GEO The Mexico is acity unique Ministerial global Declaration networkfrom connecting the GEO Ministerial government institutions, Meeting 2015 academic saw world andleaders research commit institutions, to support data open providers, Earth businesses, observation data engineers, for the scientists next decade. and experts to create innovative solutions to global challenges at a time of exponential data growth, human development and climate change that transcend national and disciplinary boundaries. The unprecedented global collaboration of experts helps identify gaps and reduce duplication in the areas of sustainable development and sound environmental management.

104 Member States

106 Participating Organizations

Global Earth Observation System of Systems (GEOSS) Together, the GEO community is creating a Global Earth Observation System of Systems (GEOSS). Earth observations from diverse sources, including satellite, airborne, in-situ platforms, and citizen observatories, when integrated together, provide powerful tools for understanding the past and present conditions of Earth systems, as well as the interplay between them. GEOSS aims to better integrate observing systems and share data by connecting existing infrastructures. There are more than 200 million open data resources in GEOSS from more than 150 national and regional providers such as NASA and ESA; international organizations such as WMO and the commercial sector such as Digital Globe.

GEOSS Societal Benefit Areas

GEOSS Common Infrastructure (GCI) IMPLEMENTING GEOSS

DOWNSTREAM MIDSTREAM GEOSS end-users GEOSS GEOSS Applications GEOSS GEOSS Applications Applications GEOSS Application Developers (intermediate Users) GEOSS Common Infrastructure APIs Mediation modules GEOSS Portal UPSTREAM Enterprise System 1 Enterprise System 2 System Enterprise 4 System 3. Enterprise System 3 Enterpris e System 1 Enterprise SBA 1 System j SBA 2 System 4. Enterprise System 2 Enterpris e System K. Enterprise System 1 System 4 Enterprise System 2 SBA 8. Enterprise System 3 Enterprise System Z GEOSS Providers

GEOSS Common Infrastructure (GCI) Societal Benefit Areas Data Providers GCI Registration M2M > 200 million data resources spanning all SBAs

Enhanced GEOSS Portal - Overview Enhanced during 2016 Accessible from www.geoportal.org Coordinated with ESA, CNR-IIA, DG-RTD, DG-JRC and GeoSec Focus on engagement, delivery and advocating Structured in 3 phases 1 st phase 2016: interface restyling: completed 2 nd phase 2017/18: deployment of major upgrades 3 rd phase 2019 onwards operations and evolutions

GEO Discovery and Access Broker (DAB) GEO DAB is a brokering framework that interconnects hundreds of heterogeneous and autonomous supply systems (the enterprise systems constituting the GEO metasystem) by providing mediation, harmonization and transformation capabilities.

BIG DATA IN GEOSS

Big Data Enabling Technologies Computing Storage Monitoring Auto Scaling Load Balancing Routing NoSQL Database Clustering

Big Data challenges for the GCI VARIETY

Variety in GEOSS Variety is the most important V for GEOSS. More than 155 Brokered Systems About 200 M granules

OGC CSW 2.0.2 AP ISO 1.0 INPE OGC CSW 2.0.2 ebrim EO OGC CSW 2.0.2 ebrim CIM ESRI GEOPORTAL 10 GI-cat OAI-PMH 2.0 ESRI GEOPORTAL 10 OpenSearch 1.1 NCML-OD OpenSearch 1.1 ESIP BCODMO OpenSearch GENESI DR CKAN NetCDF-CF 1.4 CKAN DCAT Adopted Solutions GEO DAB NCML-CF Introduction of a brokering tier (GEO DAB) dedicated FTP populated with supported metadata types to mediation of service interfaces and WAF Web Accessible Folders GeoNetwork (2.2.0 or greater) models harmonization in a transparent way for both NERRS (National Estuarine Research Reserve System) users and data providers. CUAHSI HIS-Central ESRI REST API 10.3 OGC WCS OGC WMS Ecological Markup Language 2.1.1 OGC WFS 1.0.0, 1.1.0, 2.0.0 OGC WMTS HMA CSW 2.0.2 ebrim/cim OGC SOS 1.0.0, 2.0.0, 2.0.0 Hydro Profile HDF OGC WPS 1.0.0 OGC CSW 2.0.0 Core OGC CSW 2.0.2 AP ISO 1.0 OGC CSW 2.0.2 ebrim/eo AP OGC CSW 2.0.2 ebrim/cim AP IRIS Station SHAPE files (FTP) IRIS Event KISTERS Web - Environment of Canada HYRAX THREDDS SERVER 1.9 OAI-PMH 2.0 - Harvesting OpenSearch 1.1 GBIF DIF HYDRO UNAVCO SITAD (Sistema Informativo Territoriale Ambientale Diffuso) CDI 1.04, 1.3, 1.4 File System ISO19115-2 GDACS THREDDS 1.0.1, 1.0.2 GeoRSS 2.0 THREDDS-NCISO 1.0.1, 1.0.2 Degree catalog service 2.2 THREDDS-NCISO-PLUS 1.0.1, 1.0.2 OpenSearch GENESI DR The GEO DAB maps the diverse IADC DB (MySQL) models onto its own GrADS-DS FedEO internal model, which is general enough to comprise ARPA DB (based on Microsoft SQL) ESRI Map Server all the necessary concepts. The key features of the GEO Environment DAB Canada Hydrometric internal data (FTP) data and Earth Engine metadata models are flexibility and extensibility RASAQM EGASKRO allowing adding new concepts and related attributes.

Adopted Solutions GEOSS Portal User-centric, considering various user communities: GEO Flagships and Global initiatives ESA Thematic Exploitation Platforms SBA/Thematic Customization: Satellite: includes smart filters for imagery (Landsat, Sentinel 2) and SAR-type (Sentinel 1) satellite data; Disater Resilience SBA: Earthquake events filters

Big Data challenges for the GCI VOLUME

Volume in GEOSS GEOSS has to deal with the large amount of datasets provided by the end systems, e.g. millions of discoverable (small to medium size) products, and long EO time/space series. While GEOSS does not store the datasets, it has to collect metadata (at least for harvested catalogs) and provide effective discoverability.

Adopted Solutions Dealing with such numbers, normally constrained queries commonly match a large number of datasets. GCI addresses this challenge by returning a smaller and/or an ordered result sets. Ranking and Paging Views

Ranking and Paging No-SQL DB Good performances on large stores No preliminary constraint on data structure Need to preliminarily index queryable elements GEO DAB Internal Metadata Model Pre-calculated in batch, based on: Metadata Quality Accessibility Etc. Calculated on-the-fly, based on: Query Constraints Applied to scores (configurable)

GEOSS View Definition: Subset of the whole GEOSS resources defined by applying, via the DAB, a set of clauses Discovery clauses (e.g. spatial envelope, keywords, sources, etc.) Access clauses (e.g. data format, access protocol, CRS, etc.) Defined View exposed on the GEOSS Portal Consumer-defined View i.e. Client-side These views are available only for the client application which defined the view. Provider-defined View i.e. Server-side These views are available for all client applications.

Big Data challenges for the GCI VELOCITY

Velocity in GEOSS In GEOSS, Velocity related challenges include: Processing rate to transform and preview data Asynchronous approach for data access Real-time (or near real-time) data access

Adopted Solutions Fast Preview GEO DAB provides a fast preview service allowing to get data preview: Metadata record is augmented by adding a reference to data preview; preview tiles at different zoom levels are generated in a batch mode. To store and retrieve single tiles in an efficient way, GEO DAB utilizes a NoSQL key-value DB. When available, GEO DAB utilizes data provider fast prview services by implementing the required mediation. GEOSS Portal uses allows Users to quickly evaluate discovered data before deciding the download.

Adopted Solutions Asyncronous Approach In an environment such as GEOSS, no matter which technique is implemented there will always be cases in which the required processing is consuming too much time for a click-and-get pattern. The DAB + GEOSS Portal access transformation allows to deliver discovered datasets according to a common grid: format, Coordinate Reference System, spatial and temporal extent and resolution. Where this transformation workflow requires a long processing time, Users are allowed to opt for an asynchronous version of the same services.

Adopted Solutions Real-time (or near realtime) GEOSS must support near real-time data discovery and access (i.e. GEOSS must be able to broker near real-time systems) Two strategies have been pursued to broker these systems: Distribute GEOSS Users' queries to the near real-time systems, on-the-fly: Provides Users with the most updated content Lower performance Non-consistent ranking Harvest information of near real-time systems at regular and effective intervals: Global Biodiversity Facility (GBIF) INPE Steallite Imagery ESRI ArcGIS Online Does not provide Users with the most updated content... Good performance Consistent ranking

Big Data challenges for the GCI VISUALIZATION

Visualization in GEOSS In GEOSS, challenges related to Visualization stem from datasets heterogeneity and volume. In addition, GEOSS needs to address the requirement to support diverse (cross-)disciplinary applications targeting different Communities and User categories which have different needs, as for data visualization in an informative and significant way.

Adopted solutions GEOSS Portal customization: In addition to what was described in Variety challenge, GEOSS Portal is focusing on providing resuable Portlets (for integration in external Community Applications) and custom visualization of results (e.g. display seismic events according to magnitude) A set of high-level APIs (Application Program Interfaces) have been designed and developed along with documentation and usage examples (the GEO DAB APIs) to allow the development of ad-hoc applications exploiting GEOSS content.

DOWNSTREAM MIDSTREAM GEOSS end-users GEOSS GEOSS Applications GEOSS GEOSS Applications Applications GEOSS Application Developers (intermediate Users) GEOSS Common Infrastructure APIs Mediation modules GEOSS Portal Different APIs for serving diverse Application development use cases (environments) A set of standard Web service interfaces: e.g. OGC service interfaces, CKAN, OAI-PMH, FTP, etc. Enterprise System 1 A set of APIs for software Enterprise developers: System 2 System Client side Enterprise APIs: UPSTREAM System 3 (high-level). JavaScript library 1.. (Python) Server side APIs: Enterprise SBA 1 System j SBA 2 REST/JSON APIs OpenSearch APIs. 4 Enterprise System 3 Enterpris e System System 4 Enterprise System 2 Enterpris e System K. Enterprise System 1 Enterprise System System 3 Enterprise 4 System 2 SBA 8. Enterprise System Z

Big Data challenges for the GCI VERACITY AND VALUE

Veracity and Value in GEOSS Giving access to a huge amount of datasets coming from different systems with their own mandate and governance, GEOSS has to consider the veracity and value of the published information. Particularly true if considering that GEOSS targets not only research communities, but also decision and policy makers, and therefore the veracity and value of the pub- lished information may affect relevant decisions.

Adopted Solutions GEOSS Data Mangement Working Group provides a a set of Data Management Principles, including quality-related aspects; Essential Variables: EVs can be defined as those parameters required for study, reporting, and management of problems in a specific scientific or societal domains. This effort is particularly important for an infrastructure such as the GCI: the formalization and use of the EVs concept, and related instances, allows extracting the most valuable data matching User's request.

Conclusions In the past 10 years GEOSS has developed a truly Global and multidisciplinary System-of Systems A valuable framework to experiment and learn how to face Big Data challenges in particular Variaty and Volume ones. The new GEOSS Portal + DAB platform signifcantly improved the discoverability and accessibility of sahred GEOSS resources, addressing more and more User requirements.

Thank you

Backup