A5.2-D3 [3.7] Information Grounding Service Component Specification

Title: A5.2-D3 [3.7] Information Grounding Service Component Specification
Author(s)/Organisation(s): Ana Belén Antón/ETRA
Working Group: Architecture Team/WP05
References:
A1.8-D5 User Involvement Document, Third Version
A5.2-D3 [3.0] A Lightweight Introduction to the HUMBOLDT Framework V3
A5.2-D3 [3.1] Specification Introduction and Overview V3
A5.2-D3 [3.2] Mediator Service Component Specification
A5.2-D3 [3.3] Conceptual Schema Specification and Mapping
A5.2-D3 [3.3.1] The HUMBOLDT Alignment Editor
A5.2-D3 [3.4] Context Service Specification
A5.2-D3 [3.5] Workflow Design and Construction Service Specification
A5.2-D3 [3.6] Processing Components General Model and Implementations
A5.2-D3 [3.7] Information Grounding Service Component Specification
A5.3-D3 Humboldt Commons Specification / Framework Common Data Model V3
Planning and management of a protected area. User story
Quality Assurance:
Review WP Leader: Thorsten Reitz (FhG)
Review dependent WP leaders:
Review Executive Board:
Review others: Moses Gone (FhG), Daniel Fitzner (FhG)
Delivery Date: 30.11.2009

Short Description: This document describes the specification of the Information Grounding Service Component developed as part of the HUMBOLDT software framework. For an overview of the entire framework, please refer to the main specification document A5.2-D3 [3.1]. Each service component specification follows the RM-ODP (ISO/IEC 10746) and aims to describe the responsibilities of the service component and its collaborations with other components. This version is intended to make the framework easier to understand for readers outside the project; for that reason, more precise descriptions are given in the Enterprise Viewpoint. These descriptions have been aligned to the end-to-end example available in the Specification introduction and overview document (A5.2-D3 [3.1]).

The Information Grounding Service is the HUMBOLDT component responsible for the discovery of geospatial data services. It does not replace existing catalogues but serves as a catalogue of catalogues, capable of discovering a large number of different resources that cover different administrative areas and are registered in different catalogues. Furthermore, the Information Grounding Service discovers not only the datasets which fulfil a list of constraints, but also those that can be transformed by other HUMBOLDT components in order to satisfy it.

Keywords: Framework specification, logical architecture, physical architecture, requirements, use cases, Information Grounding Service, catalogue, service and data discovery.

History:
Version | Author(s) | Status | Comment
001 | Ana Belén Antón | new | Initial draft of the document from the last version of the Information Grounding Service specification document.
002 | Ana Belén Antón | Rfc | Descriptions aligned to the end-to-end example given in the documents A5.2-D3 [3.0] and [3.1].
003 | Ana Belén Antón | Rfc | Re-editing sections and annexes.
004 | Ana Belén Antón | Rfc | Clarifications on document after Moses Gone revision.
005 | Ana Belén Antón | Final | WSDL and xsd schemas describing the IGS interface added as Annexes. Modifications on the interfaces after Daniel Fitzner comments.

Table of contents
1 Introduction
1.1. Purpose of this document
1.2. Abbreviations and Definitions used in this document
1.3. Standards used in this document
1.3.1 OGC Catalogue Service Web (CSW)
1.3.2 ISO 19115:2003
1.3.3 ISO 19139
2 Enterprise viewpoint
2.1. Actors in this component
2.1.1 Data Integrators
2.1.2 End Users
2.1.3 WDCS System
2.1.4 Front End
2.2. Business process overview
2.2.1 Setting up the infrastructure
2.2.2 Discovery and automated harmonisation
2.3. Scenario Integration
2.3.1 Introduction
2.3.2 Some requirements
2.3.3 Discovery and automated harmonisation
2.4. Information Grounding Service Use Cases
2.5. Requirements
2.5.1 Functional Requirements
2.5.2 I/O Requirements
2.5.3 Interaction with other HUMBOLDT components requirements
2.5.4 Technical Requirements
3 Computational viewpoint
3.1. Overview
3.2. Description of the Component
3.3. Information Grounding Service Interface
3.4. Internal Architecture of the Information Grounding Service
3.4.1 Request Handler
3.4.2 Grounding Catalogue Manager
3.4.3 Grounding Catalogue Repository
3.4.4 Grounding Service Repository
3.4.5 Harvest Module
4 Information Viewpoint
4.1. Constraint
4.2. GroundingService
4.3. GroundingCatalogue
5 Summary & Outlook
Annex A: Examples
A.1 Request and Response XML Examples
A.2 Grounding Services metadata vs HUMBOLDT Constraints
A.3 Giving a relevancy weight to a Grounding Service
A.4 Grounding Service Model (XML-schema)
A.5 Information Grounding Service Description (IGS.wsdl)
Annex B: Use Case Descriptions
B.1 UC IGS01 Services/data discovery
B.2 UC IGS02 Harvesting process
B.3 UC IGS03 Catalogues management

Figures
Figure 1: BPMN diagram showing how the IGS gives support setting up the infrastructure.
Figure 2: BPMN diagram showing how the IGS gives support on automated harmonization.
Figure 3: Discovery of Workflow Inputs
Figure 4: UML Use Case diagram with the overview of the Use Cases for the Information Grounding Service component
Figure 5: Component diagram of the Information Grounding Service, with its logical structure and collaboration relations
Figure 6: Interface of the Information Grounding Service. It shows the overview and the public interfaces offered by the IGS. More specific description is given in next sections
Figure 7: Interface of the Request Handler. For a detailed description of the individual operations, please refer to the API documentation
Figure 8: Interface of the Grounding Catalogue Manager. For a detailed description of the individual operations, please refer to the API documentation.
Figure 9: Interface of the Grounding Catalogue Repository. See GroundingCatalogue description in Figure 8. For a detailed description of the individual operations, please refer to the API documentation.
Figure 10: Interface of the Grounding Service Repository. For a detailed description of the individual operations, please refer to the API documentation.
Figure 11: Interface of Harvest Module. For a detailed description of the individual operations, please refer to the API documentation
Figure 12: Class diagram for the Information Grounding Service Model showing the information processing. Note that only some of the GroundingServices interfaces are shown in the diagram
Figure 13: Grounding Service data model
Figure 14: Grounding Catalogue data model.
Figure 15: Grounding Service Model as XML-schema: groundingservice.xsd.
Figure 16: Information Grounding Service Description: IGS.wsdl

Tables
Table 1: Abbreviations used in this document
Table 2: Definitions of terms used in this document
Table 3: API descriptions of the Request Handler.
Table 4: API descriptions of the GC Manager
Table 5: API descriptions of the Grounding Catalogue Repository
Table 6: API Description of the Grounding Service Repository
Table 7: API Description of the Harvest Module
Table 8: Constraints used by the IGS for discovering
Table 9: Identification Info
Table 10: Distribution Info
Table 11: Service Identification Information
Table 12: Temporal Information
Table 13: Quality Information
Table 14: Reference System Information
Table 15: Bounding Box Information
Table 16: Mapping between requirements and requests to the IGS
Table 17: List of Grounding Services included in the response to the first request

1 Introduction

1.1. Purpose of this document

This document provides the specification of the Information Grounding Service Component, a component of the HUMBOLDT Framework (as such, it is part of Deliverable A5.2-D3). This component is responsible for the discovery of geospatial data services. It is mainly accessed by the Workflow Design and Construction Service, the HUMBOLDT component responsible for creating geospatial workflows that answer complex geospatial requests which cannot be answered by single data services and require further processing (see the A5.2-D3 [3.5] document).

The Information Grounding Service is especially important in geodata integration. It enables the integration of a large number of different geoservices and datasets available across the EU, and the discovery of the data which can answer end users' specific geographic requests. It is therefore of particular interest to the HUMBOLDT user groups Data Integrators and End Users, as defined in the introduction and overview document (A5.2-D3 [3.1]) and, more concretely, in the A1.8-D5 User Involvement Strategy Document, Third Version.

The specification follows the RM-ODP, a reference model based on precise concepts and, as far as possible, on the use of formal description techniques for specification of the architecture. This model uses a framework of viewpoints to provide separate views into the specification of a given complex system. Each viewpoint addresses an audience with an interest in a particular set of aspects of the system. Associated with each viewpoint is a viewpoint language that optimizes the vocabulary and presentation for the audience of that viewpoint.

The structure of this document is the following:
Section 2 Enterprise Viewpoint: This section focuses on the purpose, scope and added value of the component. It describes the business requirements and how to meet them. It has been written so that it is understandable to all stakeholders of the component.
Section 3 Computational Viewpoint: This section outlines the logical architecture of the component and describes in detail its interfaces and interactions with other components.
Section 4 Information Viewpoint: This section contains information on the internal data models and message structures used by the Information Grounding Service. Data structures shared with other components are part of the HUMBOLDT Commons specification (A5.3-D3).

The Technology and Engineering Viewpoints are explained in summary in the Specification introduction and overview (A5.2-D3 [3.1]) and are therefore not covered here.

1.2. Abbreviations and Definitions used in this document

This section summarizes the abbreviations and definitions used specifically in this document. It collects several kinds of abbreviations to provide a single point of reference, including names of modules, protocols, services, standards and tools. If a general abbreviation is not found here, please see the Specification introduction and overview document (A5.2-D3 [3.1]).

Abbrev. | Name | Definition
BPMN | Business Process Modelling Notation | Standard graphical representation for specifying business processes, based on a flowcharting technique very similar to UML activity diagrams (see http://www.bpmn.org/).
CSW | Catalogue Service Web | OGC standard for offering a catalogue service for geodata. It defines common interfaces to discover, browse and query metadata about data, services and other potential resources.
FE | Front End | Simple graphical user interface for managing several kinds of data; in this context it is used for the management of grounding catalogues. This component is not specified in the HUMBOLDT Framework. It should be implemented using the API offered by the Grounding Catalogue Manager (see section 3.4.2).
GC | Grounding Catalogue | See Table 2.
GCM | GC Manager | Module of the Information Grounding Service (see section 3.4.2).
GCR | GC Repository | Module of the Information Grounding Service (see section 3.4.3).
GS | Grounding Service | See Table 2.
GSR | Grounding Service Repository | Module of the Information Grounding Service (see section 3.4.4).
HM | Harvest Module | Module of the Information Grounding Service (see section 3.4.5).
IGS | Information Grounding Service | Component of the HUMBOLDT Framework described in this document. It is used for the integration of different geodata catalogues and the discovery of geospatial data services.
RH | Request Handler | Module of the Information Grounding Service (see section 3.4.1).
RM-ODP | Open Distributed Processing Reference Model | ISO standard for designing open, distributed processing systems (see A5.2-D3 [3.1]).
SDI | Spatial Data Infrastructure | Framework of spatial data, metadata, users and tools that are interactively connected in order to use spatial data in an efficient and flexible way. Another definition is the technology, policies, standards, human resources and related activities necessary to acquire, process, distribute, use, maintain and preserve spatial data.
XML | Extensible Markup Language | General-purpose specification for creating custom markup languages. Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet. It is classified as an extensible language because it allows its users to define their own tags.
XSD | XML Schema | XML schema language. An XML schema is an abstract collection of metadata components, chiefly element and attribute declarations and complex and simple type definitions.
XPath | XML Path Language | Expression language for addressing portions of an XML document, or for computing values (strings, numbers or boolean values) based on the content of an XML document.
WDCS | Workflow Design and Construction Service | Component of the HUMBOLDT system which provides workflows for accomplishing harmonization tasks, and for managing Workflows and Transformers (see A5.2-D3 [3.5]).
WSDL | Web Services Description Language | XML-based language that provides a model for describing Web services.
Table 1: Abbreviations used in this document

Concept | Definition
Constraint | Restriction on the data to be returned, expressed by users. HUMBOLDT defines many types of constraint, for example the language constraint, the spatial constraint and the service constraint (see A5.3-D2).
GeoNetwork | GeoNetwork opensource is a standards-based, free and open source catalogue application for managing spatially referenced resources through the web. It is a tool for data and metadata management. It is used as a Grounding Catalogue in the HUMBOLDT Framework.
Grounding Catalogue | Database with information (metadata) about the geospatial entities available to a user community; one example is GeoNetwork. The Grounding Catalogue stores the descriptive data of groundings, but not the groundings themselves. Groundings include services and also non-service resources, that is, file-based datasets available via URL. The interfaces which describe the catalogue are defined by the OGC Catalogue Service Web (CSW) standard.
Grounding Service | Geospatial information resources distributed on the Internet. They are the available services which provide an interface allowing requests for any subset of multidimensional and multitemporal geospatial data for a specific geographic region. They follow OGC specifications: WFS, WMS, WPS. In this document, grounding services also include file-based datasets available via URL.
Table 2: Definitions of terms used in this document

1.3. Standards used in this document

This section lists the standards specific to the Information Grounding Service and briefly describes how and why they are used. As with the previous sections, if a general description is not found here, please see the A5.2-D3 [3.1] document.

1.3.1 OGC Catalogue Service Web (CSW)

Catalogue services support the ability to publish and search collections of descriptive information (metadata) for data, services, and related information objects. Metadata in catalogues represent resource characteristics (properties of spatial data, such as geographic area of interest) that can be queried and presented for evaluation and further processing by both humans and software. Catalogue services are required to support the discovery of, and binding to, registered information resources within and between collaborating information communities that seek to share information efficiently. "Communities" in the OGC context typically refers to communities who use similar naming schemas for geospatial features and phenomena such as roads, wetlands, land use zones, population density, etc.

The OpenGIS Catalogue Services Specification (see footnote 1) defines the Catalogue Service Web (CSW). It specifies the common interfaces to discover, browse, and query metadata about data, services, and other potential resources. It addresses the controlled enterprise environment, where a-priori knowledge exists about the client and server, and it also addresses the global Internet case, where no a-priori knowledge exists between client and server. It is consistent with existing and pending geomatics and metadata standards under the ISO Technical Committee 211, and it is consistent with XML data discovery and processing and with the emerging Web Services infrastructure.

Footnote 1: The details of the OpenGIS Catalogue Services Specification are available at http://www.opengeospatial.org/standards/cat

The CSW GetRecords operation (see Annex A.1) is possibly the most important operation that the specification offers. It lets clients retrieve the metadata records on a server that match the parameter values specified by the client. A metadata record is a set of parameters describing a geospatial resource, which may be a dataset, a service, or any other information resource. These metadata records are usually encoded in XML. GetRecords is the primary function of the CSW; the full set of operations is:
GetCapabilities operation (mandatory): Allows clients to get CSW server metadata, including lists of the resource types catalogued and the filter capabilities implemented.
DescribeRecord operation (mandatory): Allows clients to access the information model supported by the specific catalogue server.
GetRecords operation (mandatory): Allows clients to access metadata by searching with constraints.
GetRecordById operation (mandatory): Allows clients to access metadata records by their identifiers.
GetDomain operation (optional): Allows clients to access the current domain of identified metadata record parameters.
Transaction operation (optional): Allows clients to add, modify, and delete metadata records by sending the desired changes.
Harvest operation (optional): Allows clients to request a server to retrieve new and modified metadata records from a network location.

1.3.2 ISO 19115:2003

International standard that defines the schema required for describing geographic information and services. It provides information about the identification, the extent, the quality, the spatial and temporal schema, the spatial reference and the distribution of digital geographic data. This is the standard accepted by the IGS for the discovery of grounding services.

1.3.3 ISO 19139

Technical specification that defines the geographic metadata XML (gmd) encoding. It is an XML schema implementation derived from ISO 19115. This is the implementation used by catalogues which interoperate with the Information Grounding Service. Section 4 describes how this implementation is used and how its elements map to HUMBOLDT constraints and to the information required for automated harmonisation.
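To illustrate the GetRecords operation described in section 1.3.1, the following minimal sketch builds a GetRecords request in key-value (HTTP GET) encoding that asks for ISO 19139 (gmd) records. It is an illustration only, not part of the IGS specification: the endpoint URL, the function name `build_getrecords_url`, the choice of CSW version 2.0.2 and the CQL text constraint are all assumptions made for this example. The request is only constructed, not sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical catalogue endpoint, used only for illustration.
CSW_ENDPOINT = "http://example.org/geonetwork/srv/en/csw"


def build_getrecords_url(endpoint, cql_constraint, max_records=10):
    """Build a CSW GetRecords request URL in KVP encoding (assumes CSW 2.0.2)."""
    params = {
        "service": "CSW",
        "version": "2.0.2",  # assumed version of the catalogue interface
        "request": "GetRecords",
        # Ask for ISO 19115/19139 metadata records.
        "typeNames": "gmd:MD_Metadata",
        "namespace": "xmlns(gmd=http://www.isotc211.org/2005/gmd)",
        "outputSchema": "http://www.isotc211.org/2005/gmd",
        "resultType": "results",
        "elementSetName": "full",
        # The search constraint, expressed here as CQL text.
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": cql_constraint,
        "startPosition": 1,
        "maxRecords": max_records,
    }
    return endpoint + "?" + urlencode(params)


url = build_getrecords_url(CSW_ENDPOINT, "AnyText LIKE '%protected area%'")
print(url)
```

A catalogue that supports this encoding would answer with a GetRecordsResponse document containing the matching gmd:MD_Metadata records; Annex A.1 gives the XML request and response examples used by the IGS itself.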

2 Enterprise viewpoint

The Enterprise viewpoint describes the functional purpose of the service component from a high-level point of view. First, the HUMBOLDT user groups who are stakeholders of the IGS are introduced in section 2.1. Section 2.2 outlines the general business process supported by the IGS, illustrated with BPMN diagrams. Section 2.3 then shows an application of the IGS to one of the HUMBOLDT scenarios, namely Protected Areas. This description is aligned to the end-to-end example introduced in A5.2-D3 [3.1]. Finally, the functional requirements, which outline the functionality expected from the component, are given.

2.1. Actors in this component

Although the Information Grounding Service is a rather upstream service that is not directly visible to end users, its capabilities bring important benefits to two groups defined in the HUMBOLDT user involvement (see the A1.8-D5 User Involvement Strategy Document, Third Version): the Data Integrators and the End Users of geodata/geoinformation.

The Information Grounding Service is mainly used by another component of the HUMBOLDT framework, namely the Workflow Design and Construction Service, to discover grounding services which can be subject to automated harmonisation. Another component that uses the IGS is a Front End (FE) for managing the integration of different grounding catalogues available on the network. This component is not part of the HUMBOLDT framework.

The actors for this component are consequently as follows:

2.1.1 Data Integrators

This is the standard data integrator actor from the specification introduction and overview document (A5.2-D3 [3.1]). Within the scenario description this actor is represented by Luigi, a programmer and IT expert responsible for maintaining the IT infrastructure of the protected areas management agency. The Information Grounding Service is of particular interest to him because it enables the registration and integration of the individual catalogues that hold the metadata of the data and services relevant to his agency.

2.1.2 End Users

This is the standard end-user actor as defined in the specification introduction and overview document (A5.2-D3 [3.1]). Within the scenario description this actor is embodied by Mario, a regional officer at the Territorial Planning Department of an Italian region. The IGS is used internally when Mario requests data about the park through an OGC-conformant client. The IGS discovers those datasets which can potentially satisfy his request.

2.1.3 WDCS System

This actor represents a deployed and configured instance of the WDCS, i.e. the HUMBOLDT framework component for specifying application-specific processing chains (workflows) and for automatically identifying the required harmonisation transformations based on the grounding services retrieved by the IGS.

2.1.4 Front End

This actor represents a front end for managing the integration of several grounding catalogues available on the network, which is one of the main functionalities of the IGS. It can be a web front end or a form integrated in the GIS application.

2.2. Business process overview

The Information Grounding Service is part of most business processes in the HUMBOLDT scenarios. The IGS adds value by discovering datasets that fulfil the constraints imposed by user requests or by the user context, and by rating them according to the degree to which they fulfil those constraints. The IGS returns information not only on those data sources / groundings that completely fulfil the constraints, but also on those that can fulfil them after transformation by the available processing components. Furthermore, it enables the discovery of services harvested from catalogue servers belonging to different administrations and agencies.

The IGS is part of two main business processes that the HUMBOLDT Framework supports: setting up the infrastructure, and discovery and automated harmonisation. They are shown in Figure 1 and Figure 2 respectively. Note that the process description is an abstraction of the concrete collaboration between HUMBOLDT components. The actors do not interact directly with the IGS, but the component provides them with valuable capabilities in an indirect way. The activities related to the IGS are highlighted in these figures.

2.2.1 Setting up the infrastructure

Before the HUMBOLDT framework can be employed, it must be set up and configured. For the IGS, this involves the registration of the external catalogues to be harvested. Figure 1 shows the two business processes involved:
1. Setting up the infrastructure: Within this business process, the data integrator registers the external catalogues to be used. This registration involves providing parameters such as URL, version, username and password.
2. Manage Catalogues: This business process involves all activities related to the management of the IGS and the registered catalogues, i.e. listing all catalogues as well as removing them.
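The two business processes above (registering catalogues, then listing and removing them) can be sketched as a small in-memory registry. This is a hedged illustration only: the class `CatalogueRegistry`, its method names and the example URLs are hypothetical and are not the API of the Grounding Catalogue Manager, which is specified in section 3.4.2. Only the registration parameters (URL, version, username, password) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class GroundingCatalogue:
    """Registration parameters of an external catalogue (fields from section 2.2.1)."""
    url: str
    version: str
    username: Optional[str] = None
    password: Optional[str] = None


class CatalogueRegistry:
    """Hypothetical in-memory registry; catalogues are keyed by their URL."""

    def __init__(self) -> None:
        self._catalogues: Dict[str, GroundingCatalogue] = {}

    def register(self, catalogue: GroundingCatalogue) -> None:
        # Reject a second registration of the same catalogue URL.
        if catalogue.url in self._catalogues:
            raise ValueError("catalogue already registered: " + catalogue.url)
        self._catalogues[catalogue.url] = catalogue

    def list_catalogues(self) -> List[GroundingCatalogue]:
        return list(self._catalogues.values())

    def remove(self, url: str) -> bool:
        # Return True if a catalogue was removed, False if the URL was unknown.
        return self._catalogues.pop(url, None) is not None


registry = CatalogueRegistry()
registry.register(GroundingCatalogue("http://example.org/catalogue-a/csw", "2.0.2"))
registry.register(
    GroundingCatalogue("http://example.org/catalogue-b/csw", "2.0.2", "luigi", "secret")
)
print([c.url for c in registry.list_catalogues()])
registry.remove("http://example.org/catalogue-b/csw")
print(len(registry.list_catalogues()))
```

Keying the registry by URL is a design assumption of this sketch; it makes duplicate registrations easy to detect but is not prescribed by the specification.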

Figure 1: BPMN diagram showing how the IGS supports setting up the infrastructure.

The IGS does not replace existing catalogues; it can be seen as a catalogue of catalogues capable of discovering a large number of different resources covering different geographic and thematic areas. Geographic data is distributed, and in most scenarios data is available from several administrations and/or agencies which have their own catalogues. The IGS provides access to the information available in each of them, provided they have been registered beforehand.

2.2.2 Discovery and automated harmonisation

The most important usage scenario of the Information Grounding Service is data discovery. The IGS receives a set of constraints (i.e. a context, expressed in the HUMBOLDT constraint model specified in the document A5.3-D3) and delivers those grounding services / data sources that satisfy the constraints directly or that can be transformed such that they satisfy the constraints. How the IGS achieves this is described through the business process depicted in Figure 2:

1. The end user requests harmonised geodata over a standard interface, requesting a specific format, conceptual schema, natural language and projection system. These constraints are specific to his/her goals.
2. This request is enriched with the constraints of the application-specific product (ASP) defined beforehand by the data integrator, building a Product Definition. This Product Definition is specific to this request and comprises the ASP constraints and those included in the request. Constraints included in the request override the corresponding ASP constraints. If the request contains no specific constraints, only the constraints defined in the ASP are used.
3. Based on the Product Definition, a basic workflow is retrieved if it has been previously defined.
3.1 In this case, each input description is passed as a discovery query to the Information Grounding Service.
3.2 In case a basic workflow is not defined, the constraints of the Product Definition are passed to the IGS as a single discovery query.
4. The IGS handles the discovery business process via two activities: it gets the grounding services available in the known catalogues, and it assigns them a relevancy depending on whether the satisfied constraints are mandatory and on the defined priority of each constraint type. A mandatory constraint is a constraint that cannot be harmonised. For example, the service constraint is not harmonisable because it is not possible to transform WMS to WFS; thus, it is a mandatory constraint. The language constraint, however, is harmonisable because the dataset can be transformed to the required language by the language transformer. If a grounding service fails to satisfy a mandatory (not harmonisable) constraint, it gets a relevancy of 0 percent. Grounding services receive more or less relevancy weight depending on the type of the satisfied constraints. For instance, failing to fulfil a metadata constraint, such as the keywords of the dataset, counts less than failing to fulfil the spatial constraint. Therefore, given two very similar grounding services, where the first fulfils the required bounding box but not one of the keywords given in the metadata constraint, and the second fulfils the metadata constraint but covers only part of the bounding box and would need to be combined with another data set to cover the full area, the first grounding service has a greater relevancy weight than the second. For more information about how the IGS handles these issues, see section 4.
5. If the IGS has not discovered any available data (that is, no grounding fulfils the mandatory constraints), the user request is cancelled unless the Product Definition can be relaxed.
6. If the IGS has not discovered perfect matches (groundings that completely satisfy all the constraints of the discovery request) but has found grounding services which could fulfil the Product Definition after some harmonisation processes, they are passed to the WDCS, which identifies the required transformations; these transformations are executed via the corresponding WPSes. These groundings satisfy at least the mandatory constraints and are returned ordered by their relevancy weight, i.e. by how close they are to the perfect match.
7. These grounding services (harmonised if necessary) are used for data retrieval and, when needed, for fusion, transformation and encoding, which is handled by the Mediator Service.
8. Finally, the specified product is delivered according to the definition given by the end user.

The relevancy weighting process mentioned above gives the IGS its added value. If a request with a list of constraints to be satisfied is sent to a standard catalogue, it will retrieve only the data services that completely satisfy the request. The IGS retrieves also those which can be subject to automated

transformation and the list retrieved is ordered by its relevancy.

Figure 2: BPMN diagram showing how the IGS gives support on automated harmonization.

2.3. Scenario Integration

The IGS is used as the catalogue in every HUMBOLDT scenario. The example given in this section is based on one of them, namely protected areas. The description is aligned with the end-to-end example in the reference specification document (A5.2-D3 [3.1]). For specific information on the user story named "Tourism valorization of protected areas and territorial integration", please refer to the scenario specification for protected areas. This section highlights the usage of the Information Grounding Service in the aforementioned example; therefore, only the parts where the IGS is involved are described. The goals of the scenario, the domain business and the full process behind this user experience are given in the document A5.2-D3 [3.1].

2.3.1 Introduction

This example deals with the task of creating a web portal that can be used by hikers for identifying suitable hiking routes. An important goal of the project is to keep the portal independent of concrete data sources in order to enable its reuse in different areas. Another technical goal is to enable users to employ OGC-conforming clients for retrieving data from the portal. Hence, the portal server must offer standardized OGC interfaces for data access, such as WFS or WMS. The aim of this example is to show how the task of setting up this portal and using it is achieved using the HUMBOLDT components, especially the HUMBOLDT Information Grounding Service.

2.3.2 Some requirements

Some steps are necessary before the IGS can take action within this example, and we assume that they have all been carried out previously. They involve setting up the infrastructure, e.g. deploying the HUMBOLDT Mediator Service and the IGS, registering external catalogues, etc. We assume the application-specific processing chain / workflow Sustainable Hiking Paths has already been created. Further, we assume that a user context has been defined, that a request has been issued on the portal surface (i.e. the HUMBOLDT Mediator Service), and that the workflow Sustainable Hiking Paths has been retrieved internally and enriched with request-specific constraints such as the bounding box. For a detailed introduction to all these steps see the document A5.2-D3 [3.1] Specification Introduction and Overview. Within this example, the data integrator responsible for setting up the infrastructure is embodied by Luigi, the end user by Mario.

2.3.3 Discovery and automated harmonisation

The input parameters of the Basic Workflow Sustainable Hiking Paths are four input layers (stopping places, protected areas, potential hiking paths and vegetation) and the buffer distance which defines the closeness of the stopping places (Figure 3). The description of each of these inputs, that is, the list of constraints according to the HUMBOLDT constraint model for each input to the Basic Workflow (such as stopping places), is passed as a discovery query to the Information Grounding Service. The IGS returns pointers to a number of data services (e.g. WFS) that can deliver the data requested.
Figure 3: Discovery of Workflow Inputs

The list of data services fulfilling the constraints contains at least one fully compliant grounding service (relevancy weight = 100%) for the description of each of the input layers stopping places, potential hiking paths and vegetation. Unfortunately, no perfect match has been found for the description of the protected areas, because Mario has requested data for an area that crosses the boundary between France and Italy and is therefore not served by a single data service. However, as Luigi, the data integrator, has also registered the catalogue of a French protected areas agency in charge of this area, the IGS discovers a set of grounding services which cover the French part. The IGS retrieves a list of grounding services with a relevancy weight below 100% (not perfect matches): the first one covers the Italian area and the second one the French area. They are the grounding services with the highest relevancy weight in this case, and they cover the requested area if they are combined. The IGS knows these transformation possibilities from the HUMBOLDT constraint definitions (mandatory / not mandatory and harmonisable / not harmonisable) given in Table 8.
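The relevancy weighting referred to here can be illustrated with a minimal Java sketch. The weights and mandatory flags are taken from Table 8; the normalisation formula (sum of the weights of the satisfied constraints divided by the sum of the weights of all requested constraints) is an assumption made for illustration only, since this specification does not state the exact formula. The class and method names are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of relevancy weighting; not the HUMBOLDT implementation. */
final class RelevancySketch {

    /** Constraint types with the weights and mandatory flags listed in Table 8. */
    enum ConstraintType {
        SERVICE(8, true), THEMATIC_TOPIC(8, true), RESOLUTION(8, true),
        THEMATIC_THEME(7, false), SPATIAL(6, false), SCALE(5, false),
        QUALITY(4, false), TEMPORAL(3, false), LANGUAGE(2, false), METADATA(1, false);

        final int weight;
        final boolean mandatory;

        ConstraintType(int weight, boolean mandatory) {
            this.weight = weight;
            this.mandatory = mandatory;
        }
    }

    /**
     * Returns a relevancy in percent. Assumed formula: weights of the satisfied
     * constraints divided by the weights of all requested constraints.
     * An unsatisfied mandatory constraint yields 0, as described in section 2.2.2.
     */
    static double relevancy(Set<ConstraintType> requested, Set<ConstraintType> satisfied) {
        int total = 0;
        int matched = 0;
        for (ConstraintType type : requested) {
            boolean ok = satisfied.contains(type);
            if (type.mandatory && !ok) {
                return 0.0; // a non-harmonisable constraint is not fulfilled
            }
            total += type.weight;
            if (ok) {
                matched += type.weight;
            }
        }
        // An empty request is trivially a perfect match.
        return total == 0 ? 100.0 : 100.0 * matched / total;
    }

    public static void main(String[] args) {
        Set<ConstraintType> requested = EnumSet.of(
                ConstraintType.SERVICE, ConstraintType.SPATIAL, ConstraintType.METADATA);
        // Grounding service that covers the bounding box but misses a keyword.
        System.out.println(relevancy(requested,
                EnumSet.of(ConstraintType.SERVICE, ConstraintType.SPATIAL)));
        // Grounding service that matches the keywords but not the full bounding box.
        System.out.println(relevancy(requested,
                EnumSet.of(ConstraintType.SERVICE, ConstraintType.METADATA)));
    }
}
```

Under this assumed formula the service that misses only a keyword ranks above the one that misses the spatial constraint, which agrees with the ordering described in step 4 of section 2.2.2.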

The two data sources discovered (Protected Areas WFS1 and WFS2) can deliver input data to the Basic Workflow if a suitable transformer is available to transform / merge them such that the output covers the required area. The WDCS identifies the needed transformations and an executable workflow description is returned to the Mediator Service, which, after executing the workflow, returns the result to the end user Mario.

2.4. Information Grounding Service Use Cases

This section summarizes the Use Cases this component has to fulfil. The following UML Use Case diagram shows the functionality offered by the IGS through the three main use cases identified:

IGS01: Services/Data Discovery: This Use Case is the main functionality of the component: the discovery of grounding services. It discovers the grounding services which fulfil or could fulfil a set of constraints by looking for them in OGC standard grounding catalogues (for instance GeoNetwork) and assesses their relevancy weight depending on how closely they match the input descriptions. The information can be accessed remotely or locally using the IGS repository (both specializations are shown in the following diagram).
IGS02: Harvesting Process: This Use Case involves finding changes in the catalogue server(s) and updating the local repository accordingly. It includes listing the available catalogues, checking connectivity and updating the local repository with these changes.
IGS03: Catalogues Management: The Data Integrator manages, via the Front End, the grounding catalogues to be used by his/her organization. This Use Case includes three specializations: removing, listing and adding grounding catalogues. Removing a grounding catalogue includes updating the repository; adding a grounding catalogue includes confirming the connectivity with the catalogue, the harvesting process (IGS02) and finally the repository update.
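The sequence described for IGS03 (confirm connectivity, harvest, then update the repository) can be sketched in Java as follows. This is an illustrative outline only: the class name, the two callback interfaces and the fields of the catalogue object are assumptions introduced for the example, and the in-memory list merely stands in for the IGS repository.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the IGS03 catalogue management flow. */
final class CatalogueManagementSketch {

    /** Minimal stand-in for a registered catalogue; the fields are assumptions. */
    static final class GroundingCatalogue {
        final String url;
        final String version;

        GroundingCatalogue(String url, String version) {
            this.url = url;
            this.version = version;
        }
    }

    /** Hypothetical hook for the connectivity confirmation. */
    interface ConnectivityCheck {
        boolean isAvailable(GroundingCatalogue gc);
    }

    /** Hypothetical hook for the harvesting process (IGS02). */
    interface Harvester {
        void harvest(GroundingCatalogue gc);
    }

    private final List<GroundingCatalogue> repository = new ArrayList<>();
    private final ConnectivityCheck check;
    private final Harvester harvester;

    CatalogueManagementSketch(ConnectivityCheck check, Harvester harvester) {
        this.check = check;
        this.harvester = harvester;
    }

    /** Adding: confirm connectivity, harvest, then update the repository. */
    boolean addGroundingCatalogue(GroundingCatalogue gc) {
        if (!check.isAvailable(gc)) {
            return false; // unreachable catalogues are not registered
        }
        harvester.harvest(gc);
        return repository.add(gc);
    }

    /** Removing only requires the repository update. */
    boolean removeGroundingCatalogue(GroundingCatalogue gc) {
        return repository.remove(gc);
    }

    /** Listing returns a copy so callers cannot modify the repository. */
    List<GroundingCatalogue> getAllGroundingCatalogues() {
        return new ArrayList<>(repository);
    }
}
```

Passing the connectivity check and the harvester in as interfaces keeps the sketch independent of any concrete catalogue protocol.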

Figure 4: UML Use Case diagram with the overview of the Use Cases for the Information Grounding Service component

The actors involved in these use cases have been described in section 2.1; the Grounding Catalogue has additionally been included to clarify the explanations. It represents the external catalogues registered in the IGS. A more detailed description of these use cases can be found in Annex B.

2.5. Requirements

This chapter contains both the functional requirements derived from the Scenarios/Use Cases and the non-functional requirements of the HUMBOLDT scenarios and other sources. Most of the requirements have been exported from previous versions of the IGS specification such as A5.2-D1 [1.4] and also from the Volere Requirements Management tool (http://humboldt.etra.es/). The rationales, acceptance criteria and some comments on the requirements are described on the aforementioned site. Furthermore, some new requirements derived from change requests, specification requests, etc. via the procedures for managing requests in specification process defined

in WP04 (see document A4.3-D1: Process specification evaluation and improvement) have been collected.

2.5.1 Functional Requirements

Catalogue Integration: The system must integrate and use catalogues from different organisations, administrations, agencies, etc. for the service discovery task in a way that is transparent to the end user.
Management of Catalogues: The system must offer tools for grounding catalogue management, enabling tasks such as registering, listing and unregistering catalogues.
Efficient management: The system should manage the information available in the catalogues efficiently. This concerns efficient use of the grounding catalogues and of the filter process used during discovery.
Web service discovery: The system must be able to discover OGC standard web services providing geodata (WMS, WFS) based on the user product definition.
o Given a set of constraints of the HUMBOLDT constraint model, as specified in the document A5.3-D3 Humboldt Commons Specification, the system should discover not only the services fully compliant with the constraints but also those that can be transformed in order to finally fulfil the aforementioned requirements.
Relevancy weight: The system should give a relevancy weight to the services discovered, prioritising them based on how closely they match the context.

2.5.2 I/O Requirements

CSW-2 usage: The system must be able to create OGC Catalogue Service Web (CSW) requests and to process CSW responses.
o The system must invoke Harvesting and GetRecords operations following the CSW-2 specification.
Configuration data: The system must use catalogue configuration data to access grounding catalogues.
Metadata processing: The system must process and filter metadata compliant with the ISO 19139 implementation, provided by the grounding catalogues.
Mapping user requirements: The system must be able to map the user product definition (user requirements) to the metadata defined by ISO 19115 in the grounding catalogue.
o The system shall be able to map the following parameters or constraints from the harmonisation product: spatial; temporal (TRS, start, end, interval resolution, runtime and forecast time); licenses and prices; thematic; situative context presentation/usage, which is reflected in predefined keywords of interest (qnames); definition of data provider; level of detail and scale; ISO quality model, geometric precision, completeness, and correctness; service type and URL.
Mandatory constraints: The system must know which HUMBOLDT constraints are mandatory and which are not in order to assign the grounding service relevancy weight. A mandatory constraint is usually a not harmonisable constraint (see Table 8).
WDCS interaction: The system must be able to receive/send well-formed requests/responses from/to the WDCS.
Access to catalogues: The system should check the availability of the grounding catalogues before registering them.
Authorized access: The system must block unauthorized access to grounding catalogues.

2.5.3 Interaction with other HUMBOLDT components requirements

Tools for the metadata management of geodata: The system must offer interoperability with the most important tools for metadata management of geodata, such as GeoNetwork, an open source catalogue application for managing spatially referenced resources.
Receiving requests from the WDCS: The system must be able to retrieve the constraints included in a WDCS request.
Sending responses to the WDCS: The system must be able to send a response to the WDCS with the list of grounding services which fulfil the request, containing the following information: relevancy weight, precondition id of the request (if it is given) and the lists of satisfied and unsatisfied constraints from the requirements provided in the precondition request.

2.5.4 Technical Requirements

OGC compliance: The system must be compliant with the OGC Catalogue Service Web (CSW) standard.
OGC interoperability: The system must comply with OGC interoperability specifications.
Platforms: The system should be able to run in different environments or platforms.
Object Oriented Programming: The system implementation must follow object-oriented programming paradigms to provide good reusability and encapsulation of the component.
Error messages: The system implementation should provide explicit error messages to developers and business users to verify concepts with end users and to assist developers with the software analysis.

3 Computational viewpoint

This chapter details the services and APIs that this component provides, as well as the internal logical architecture, the processes used and the interactions with other components.

3.1. Overview

The Information Grounding Service is a component which offers service and data search. It provides the links to spatially referenced resources which fulfil, or could fulfil after the harmonisation process, a set of constraints satisfying the end user's product definition and specific request. From the product definition and the concrete end user request, the Workflow Design and Construction Service determines the necessary inputs to the corresponding Basic Workflow (see document A5.2-D3 [3.5]) and discovers the available resources via the IGS. The IGS and the WDCS therefore collaborate closely. Furthermore, this component is responsible for managing the connection and integration of several grounding catalogues. A Data Integrator can register/unregister catalogues using the IGS via a Front End (section 2.1.4). The following component diagram shows an overview of the composition of this service component, including its internal architecture and its relations to the aforementioned components of the HUMBOLDT Framework.

Figure 5: Component diagram of the Information Grounding Service, with its logical structure and collaboration relations.

The Information Grounding Service architecture contains four main modules which are described in the next section.

3.2. Description of the Component

The Information Grounding Service achieves its functional goals using four modules with specific functionalities:

Request Handler (RH): This module manages the requests received from the WDCS. It accepts a request and invokes the IGS Repository to discover the suitable grounding services. It also builds the response, ordering the retrieved list of grounding services by relevancy weight.
GC Manager (GCM): This module allows registering, listing or removing grounding catalogues.
IGS Repository: This module is in charge of storing the data managed by the IGS. Two repositories can be distinguished, one for the catalogue configurations and the other for the grounding service metadata.
o Grounding Catalogue Repository (GCR): It offers the mechanisms for saving and retrieving the configuration data of the registered grounding catalogues.
o Grounding Service Repository (GSR): It provides the mechanisms for saving and retrieving the metadata of the available grounding services. For retrieval, it filters the suitable grounding services and assigns each a relevancy weight depending on the WDCS request. This relevancy parameter is saved in the repository together with its corresponding request id. Therefore, when a request with the same precondition id is received, the result only has to be read from the repository.
Harvest Module (HM): This module invokes the grounding catalogues. It builds the requests to send to the GCs (using the CSW protocol), establishes the SOAP communications and stores the responses in the aforementioned repository (GSR).

3.3. Information Grounding Service Interface

The Information Grounding Service is used by the WDCS for discovering suitable grounding services and by a Front End for managing grounding catalogues. The public interfaces offered by this service are shown in the following diagram:

Figure 6: Interface of the Information Grounding Service. It shows the overview and the public interfaces offered by the IGS. A more specific description is given in the next sections.

These interfaces are described in detail in the following sections (section 3.4.1 and 3.4.2).

3.4. Internal Architecture of the Information Grounding Service

3.4.1 Request Handler

Full Name: Service Integration Component Framework Information Grounding Service (IGS) Request Handler (RH)

1. Responsibilities of the Module

This module retrieves the Grounding Services which fulfil the imposed constraints, together with their main information (metadata). The list of grounding services is ordered by how close each one is to the perfect match (= all constraints are satisfied). If a precondition id is given in the request (this is equivalent to a request identifier), it retrieves the list of grounding services linked to this request.

2. Collaboration

The Request Handler is used by the IGS to retrieve the Grounding Services and their information. The Request Handler accesses the Grounding Service Repository (GSR) to discover and get information on the suitable Grounding Services.

It is a module of the Information Grounding Service component.

3. Actions fulfilled by this Module

To ask the GSR to discover the suitable grounding services.
To calculate the relevancy weight of each GS.
To build Grounding Service objects with their relevancy weight.
To order the grounding services by the degree to which they fulfil the imposed constraints.
To retrieve a list of grounding services (containing at most the maximum number of GS to retrieve) and their metadata. The retrieved grounding services are the groundings which fulfil the required constraints or which can fulfil them after available transformation processes.

4. Interface Overview

Figure 7: Interface of the Request Handler.

For a detailed description of the individual operations, please refer to the API documentation.

5. API descriptions

Request Handler

List<GroundingService> getgroundingservices(int maxmatches, List<Constraint> constraints)
    Retrieves an ordered list of Grounding Services according to the required constraints. The maximum number of matches to be retrieved by the IGS is defined with maxmatches.

List<GroundingService> getgroundingservices(int maxmatches, Map<UUID,List<Constraint>> precondition)
    Retrieves an ordered list of Grounding Services according to the required constraints. The list of constraints is linked to a precondition id. This id is used for retrieving the list of GS linked to this identifier. If it is the first time a request is invoked with this id, the id is stored with the results so that it can serve subsequent requests. The maximum number of matches to be retrieved by the IGS is defined with maxmatches.

Table 3: API descriptions of the Request Handler.

3.4.2 Grounding Catalogue Manager

Full Name: Service Integration Component Framework Information Grounding Service (IGS) Grounding Catalogue Manager (GCM)

1. Responsibilities of the Module

This module provides the mechanisms for managing grounding catalogues: registering, listing and unregistering.

2. Collaboration

The GC Manager is used by the IGS to retrieve the list of available Grounding Catalogues and to offer the possibility of adding/removing them. The GC Manager accesses the Grounding Catalogue Repository (GCR) to retrieve information on the available Grounding Catalogues and to read/write their configuration information. It is a module of the Information Grounding Service.

3. Actions fulfilled by this Module

To configure connected Grounding Catalogues.
To make Grounding Catalogues available or unavailable to the IGS.

4. Interface overview

Figure 8: Interface of the Grounding Catalogue Manager.

For a detailed description of the individual operations, please refer to the API documentation.

5. API descriptions

GC Manager

boolean addgroundingcatalogue(GroundingCatalogue gc)
    Adds a GC to the configuration of the IGS. Returns true if the operation was successful.

List<GroundingCatalogue> getallgroundingcatalogues()
    Retrieves a list of Grounding Catalogues which are configured in the IGS.

boolean removegroundingcatalogue(GroundingCatalogue gc)
    Removes a GC from the IGS configuration. Returns true if the operation was successful.

Table 4: API descriptions of the GC Manager.

3.4.3 Grounding Catalogue Repository

Full Name: Service Integration Component Framework Information Grounding Service (IGS) Grounding Catalogue Repository (GCR)

1. Responsibilities of the Module

This module provides the mechanisms to read/write the repository which contains the information about registered Grounding Catalogues.

2. Collaboration

The GCR is used by the GC Manager to read the information about Grounding Catalogues held in the repository. The GCR is used by the Harvest Module to obtain the connection parameters and the time of the last successful harvest, in order to update the GSR with the latest changes in the registered grounding catalogues. It is a module of the Information Grounding Service.

3. Actions fulfilled by this Module

To store Grounding Catalogue configurations.
To check the Grounding Catalogue availability.
To retrieve information on Grounding Catalogue configurations.

4. Interface overview

This is a local interface for internal usage within the service only, not a public interface for other components.

Figure 9: Interface of the Grounding Catalogue Repository. See GroundingCatalogue description in Figure 8.

For a detailed description of the individual operations, please refer to the API documentation.

5. API descriptions

Grounding Catalogue Repository

void add(GroundingCatalogue gc, boolean check)
    Adds a GC to the repository. The gc is the grounding catalogue to add and check defines whether the catalogue shall be checked before adding.

List<GroundingCatalogue> getall()
    Retrieves all Grounding Catalogues from the repository (GCs registered in the IGS).

int getsize()
    Retrieves the number of Grounding Catalogues available in the repository.

boolean isgroundingcatalogueavailable(GroundingCatalogue gc)
    Tests whether a Grounding Catalogue is available, i.e. whether it can answer a GetCapabilities request. Returns true if the GC can answer the GetCapabilities request.

void remove(GroundingCatalogue gc)
    Removes a Grounding Catalogue from the repository (it unregisters a GC in the IGS).

Table 5: API descriptions of the Grounding Catalogue Repository.

3.4.4 Grounding Service Repository

Full Name: Service Integration Component Framework Information Grounding Service (IGS) Grounding Service Repository (GSR)

1. Responsibilities of the Module

This module provides the mechanisms for saving and retrieving the metadata of the available grounding services. Furthermore, it filters the suitable grounding services and assigns each a relevancy weight. This relevancy parameter is saved in the repository together with its corresponding request id. Therefore, when a request with the same precondition id is received, it only has to look up the grounding services with this identifier in the repository.

2. Collaboration

The Grounding Service Repository is used by the Request Handler to retrieve the information about Grounding Services held in the repository. The GSR is used by the Harvest Module to store the responses received from the Grounding Catalogues with new and modified data. It is a module of the Information Grounding Service.

3. Actions fulfilled by this Module

To store the useful information about the grounding services available in the registered grounding catalogues.
To filter suitable grounding services based on the constraints to fulfil.
To calculate the relevancy weight of a grounding service based on the constraints fulfilled in a concrete request and to store both the relevancy weight and the request identifier.
To retrieve useful information about the grounding services.

4. Interface overview

This is a local interface for internal usage within the service only, not a public interface for other components.

Figure 10: Interface of the Grounding Service Repository.

For a detailed description of the individual operations, please refer to the API documentation.

5. API descriptions

Grounding Service Repository

void addpreconditionid(GroundingService groundingservice, UUID preconditionid, float relevancyweigh)
    Writes to the repository, adding the preconditionid and its corresponding relevancyweigh to the groundingservice. The preconditionid is the identifier of the request to the IGS and serves for linking the grounding services fulfilling the set of constraints given in the request. The relevancyweigh is based on the constraints satisfied.

boolean addrecord(Record record, TypeRecord typerecord)
    Adds a record to the repository. The typerecord identifies that the record describes a Grounding Service.

boolean existrepository()
    Retrieves false if the Repository is empty.

List<GroundingService> getallgroundingservices(UUID preconditionid)
    Reads from the repository all grounding services which satisfy the constraints given in the request with the preconditionid identifier.

List<GroundingService> getallgroundingservices(int maxmatches, List<Constraint> constraintlist, UUID preconditionid)
    Reads the maxmatches best grounding services which satisfy the constraints given in the constraintlist. The preconditionid is the request identifier. If a request was already received with this preconditionid, it is used for reading the list of grounding services directly. Otherwise, it is stored for each grounding service satisfying the request.

GroundingService getgroundingservice(String id)
    Reads from the repository the information on the grounding service given its identifier.

int getnumberofgroundingservices()
    Retrieves the number of Grounding Services available in the repository.

String getpath()
    Retrieves the path of the repository.

List<Record> getrecords(Document document, Xpath xpath)
    Retrieves the records which fulfil an XPath expression on the given document.

Document getrepository()
    Retrieves all information stored in the repository regarding Grounding Services.

boolean ismandatoryconstraint(Constraint constraint, int level)
    Determines whether the constraint is mandatory given a level. The highest level is for constraints that are always mandatory, such as ServiceConstraint.

float readrelevancyweigh(GroundingService groundingservice, UUID preconditionid)
    Reads from the repository the relevancy weight of a groundingservice given a preconditionid.

Table 6: API Description of the Grounding Service Repository

3.4.5 Harvest Module

Full Name: Service Integration Component Framework Information Grounding Service (IGS) Harvest Module (HM)

1. Responsibilities of the Module

This module requests new and modified metadata records from each Grounding Catalogue registered in the IGS.

2. Collaboration

The Harvest Module is used by the IGS to discover new and modified geospatial information resources using the registered Grounding Catalogues. The Harvest Module uses the Grounding Service Repository to store the information retrieved from the Grounding Catalogues. It is a module of the Information Grounding Service.

3. Actions fulfilled by this Module

To build the Harvest request to the Grounding Catalogue.
To manage the communication (requests/responses) with the Grounding Catalogue.
To retrieve and save the response received from the Grounding Catalogue.

4. Interface overview

Figure 11: Interface of Harvest Module.

For a detailed description of the individual operations, please refer to the API documentation.

5. API descriptions

Harvest Module

CatalogueRequest buildrequest()
    Builds the HarvestOperation request. See chapter 1.4.1

boolean executerequest()
    Executes the HarvestOperation request.

boolean saveresponse(String path, String response)
    Saves the response in the Repository.

Table 7: API Description of the Harvest Module
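To illustrate the kind of message a request-building step might produce, the following Java sketch assembles the XML body of a CSW GetRecords request, one of the two CSW operations named in section 2.5.2. It is a sketch under assumptions, not the content of the HarvestOperation request, which this specification does not detail: the CSW version (2.0.2), the ISO output schema, the paging parameters and the class and method names are all chosen for illustration.

```java
/** Hypothetical sketch: builds a CSW 2.0.2 GetRecords request body as a string. */
final class CswRequestSketch {

    private static final String CSW_NS = "http://www.opengis.net/cat/csw/2.0.2";
    private static final String GMD_NS = "http://www.isotc211.org/2005/gmd";

    /**
     * Builds a GetRecords request asking for full ISO metadata records.
     * startPosition is 1-based; maxRecords limits the size of one page.
     */
    static String buildGetRecords(int startPosition, int maxRecords) {
        if (startPosition < 1 || maxRecords < 0) {
            throw new IllegalArgumentException("invalid paging parameters");
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<csw:GetRecords xmlns:csw=\"").append(CSW_NS).append("\"");
        xml.append(" xmlns:gmd=\"").append(GMD_NS).append("\"");
        xml.append(" service=\"CSW\" version=\"2.0.2\" resultType=\"results\"");
        xml.append(" outputSchema=\"").append(GMD_NS).append("\"");
        xml.append(" startPosition=\"").append(startPosition).append("\"");
        xml.append(" maxRecords=\"").append(maxRecords).append("\">\n");
        // One query over ISO metadata records, returning all elements.
        xml.append("  <csw:Query typeNames=\"gmd:MD_Metadata\">\n");
        xml.append("    <csw:ElementSetName>full</csw:ElementSetName>\n");
        xml.append("  </csw:Query>\n");
        xml.append("</csw:GetRecords>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Prints the request for the first page of 25 records.
        System.out.println(buildGetRecords(1, 25));
    }
}
```

A real harvest would page through the catalogue by increasing startPosition and would send the body over the SOAP or HTTP POST binding offered by the registered catalogue.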

4 Information Viewpoint

This chapter describes the information viewpoint of the Information Grounding Service component as defined by RM-ODP. It covers the data structures used for storage and for exchange between the IGS component defined in the computational viewpoint and the WDCS defined in document A5.2-D3 [3.5]. These include the domain model of the component as well as the specific message structures it exchanges with the WDCS and the FE via its public interfaces.

Figure 12: Class diagram for the Information Grounding Service Model showing the information processing. Note that only some of the GroundingServices interfaces are shown in the diagram.

For the Information Grounding Service, the main data structures are:

- Constraint: This data structure is exchanged between the IGS and the WDCS. The IGS uses it to discover the grounding services that fulfil the constraints. A constraint is a single rule that a data set or service has to fulfil in order to complete a task. The constraint types in HUMBOLDT (language constraint, spatial constraint and service constraint, among others) are described in the HUMBOLDT Commons specification (document A5.3-D3).

- Grounding Service (GS): This data structure is exchanged between the IGS and the WDCS. It represents a grounding service, that is, a geospatial information resource distributed on the network. The IGS provides the WDCS with the metadata describing the available services, which offer an interface for requesting any subset of multi-dimensional and multi-temporal geospatial data for a specific geographic region. This metadata is stored in XML format.

- Grounding Catalogue (GC): This data structure is exchanged between the FE and the IGS. It represents a catalogue service available to a user community. It provides and stores the configuration information needed to access the catalogue server. This information is stored in XML format.

4.1. Constraint

This data structure is exchanged between several components of the HUMBOLDT Framework and is therefore part of the HUMBOLDT Commons. Specific definitions and explanations are given in the HUMBOLDT Commons specification (document A5.3-D3).

From the list of available constraints, the IGS uses only the following ones for discovering grounding services. Table 8 also gives the weight assigned to each constraint. The mandatory constraints (constraints representing non-harmonisable elements) have the highest weight. A grounding service that does not fulfil the mandatory constraints is never included in the list returned in a response to the WDCS.

Constraint Type              Weight  Mandatory  Harmonisable
ServiceConstraint            8       Yes        No
ThematicConstraint / topic   8       Yes        No
ResolutionConstraint         8       Yes        No
ThematicConstraint / theme   7       No         Yes
SpatialConstraint            6       No         Yes
ScaleConstraint              5       No         No
QualityConstraint            4       No         No
TemporalConstraint           3       No         No
LanguageConstraint           2       No         Yes
MetadataConstraint           1       No         No

Table 8: Constraints used by the IGS for discovering

Where the elements of a given constraint type are processed differently, they are listed separately. Note also the specific cases of the Metadata Constraint and the Temporal Constraint. The Metadata Constraint is not harmonisable; however, the IGS does not use it as a mandatory constraint, because elements such as keywords, telephone and address should not be restrictive. The case of the Temporal Constraint is similar.

Furthermore, Table 8 shows only the default prioritisation, which can be overridden by the constraint definition. A constraint can be defined explicitly as a mandatory constraint or as a harmonisable constraint, and this definition takes priority over the default classification above. The definition of a constraint as mandatory can be changed by the user, but the definition of a constraint as harmonisable can be changed only by the system (WDCS).

From Table 8 it can be derived that a grounding service which satisfies only the mandatory constraints included in the request has a relevancy weight of 12.5% (the highest-weighted constraint left unsatisfied is then ThematicConstraint / theme, with weight 7). This is the minimum relevancy weight returned, because grounding services with relevancyweigh = 0% are not retrieved. If the request also includes a Thematic Constraint (theme) and it is satisfied, the relevancy weight is 25%; if the request also includes a Spatial Constraint and it is satisfied, the relevancy weight is 37.5%. If all constraints included in the request are satisfied, the relevancy weight is 100%. It is calculated as follows:

(max_weigh - constraint_weigh) * (1 / max_weigh) * 100

where max_weigh is the highest weight in Table 8 and constraint_weigh is the highest weight among the constraints that are not fulfilled (0 if all are fulfilled).

4.2. GroundingService

The Grounding Service information model represents the metadata (based on ISO 19115) which the Grounding Catalogue provides to describe the Grounding Services. The information provided by this data structure is shown in the following figure. Note that the figure shows the full model: both the internal model used within the IGS (private attributes) and the information passed to other components (public attributes).
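The relevancy weight (relevancyweigh) formula of Section 4.1 can be checked with a short Python sketch. It assumes the default weights of Table 8; the dictionary keys, the function names relevancy_weight and best_matches, and the tie-breaking by service identifier are illustrative choices, not part of the HUMBOLDT API. The ranking helper only mirrors the stated rule that services with a relevancy weight of 0% are not retrieved.

```python
from typing import Dict, Iterable, List, Mapping, Tuple

# Default weights from Table 8 (keys are illustrative labels, not API names).
TABLE_8_WEIGHTS: Dict[str, int] = {
    "ServiceConstraint": 8,
    "ThematicConstraint/topic": 8,
    "ResolutionConstraint": 8,
    "ThematicConstraint/theme": 7,
    "SpatialConstraint": 6,
    "ScaleConstraint": 5,
    "QualityConstraint": 4,
    "TemporalConstraint": 3,
    "LanguageConstraint": 2,
    "MetadataConstraint": 1,
}


def relevancy_weight(
    requested: Iterable[str],
    satisfied: Iterable[str],
    weights: Mapping[str, int] = TABLE_8_WEIGHTS,
) -> float:
    """(max_weigh - constraint_weigh) * (1 / max_weigh) * 100, in percent."""
    max_weigh = max(weights.values())
    satisfied_set = set(satisfied)
    unsatisfied = [c for c in requested if c not in satisfied_set]
    # Highest weight among unfulfilled constraints; 0 when everything is fulfilled.
    constraint_weigh = max((weights[c] for c in unsatisfied), default=0)
    return (max_weigh - constraint_weigh) * (1 / max_weigh) * 100


def best_matches(
    requested: List[str],
    services: Mapping[str, Iterable[str]],
    max_matches: int,
) -> List[Tuple[str, float]]:
    """Rank services (id -> satisfied constraints), dropping those at 0%."""
    scored = [(relevancy_weight(requested, sat), sid) for sid, sat in services.items()]
    retrieved = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
    return [(sid, weight) for weight, sid in retrieved[:max_matches]]
```

For a request containing ServiceConstraint, ThematicConstraint/theme, SpatialConstraint and ScaleConstraint, a service satisfying only ServiceConstraint scores 12.5, one that also satisfies the theme scores 25.0, and one that satisfies everything scores 100.0, matching the percentages given in Section 4.1.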

Figure 13: Grounding Service data model.

The definitions of the attributes in the Grounding Service information model are equivalent to the definitions in ISO 19115. The information model of the Grounding Service is linked to the HUMBOLDT constraints. Annex 2 shows the mapping among the metadata described in ISO 19115, the metadata included in the grounding services and the HUMBOLDT constraints.

The information specific to a Grounding Service (not linked to ISO 19115) is:

- preconditioninfo: The information that allows a request from the WDCS to be mapped to the grounding services that fulfil it. It also provides the relevancy weight of the grounding service for this precondition identifier.
    o preconditionid: The unique identifier of the precondition (from the IGS viewpoint, the identifier of a concrete request).
    o relevancy:
        - isperfectmatch: True if the grounding service matches the full list of constraints passed.
        - relevancyweigh: Float that indicates how closely a grounding service matches a list of constraints.
- allconstraints: The list of constraints satisfied by the grounding service. It acts as a translation of the Grounding Service elements into HUMBOLDT constraints.
- constraintsnosatisfied: The list of constraints not satisfied by the grounding service for a concrete request (among the constraints given in that request).
- constraintssatisfied: The list of constraints satisfied by the grounding service for a concrete request (among the constraints given in that request).

The information passed to the WDCS is reduced to:

- distributioninfo: Information about the location for on-line access. It includes the address using a Uniform Resource Locator addressing schema, the connection protocol and the name of the online resource.
- preconditioninfo: Identifier of the precondition satisfied and how well it is satisfied (relevancy weight).
- constraintsnosatisfied: A map between the constraints required and the constraints available. This map contains only the constraints not satisfied by the grounding service for a concrete request (among the constraints given in that request).

The XML schema of the Grounding Service (groundingservice.xsd) is attached in Annex 4.

4.3. GroundingCatalogue

The GroundingCatalogue interface defines the following attributes (see also Figure 12), which are necessary to access a Grounding Catalogue:

1. Host: Machine name where the grounding catalogue is located (e.g. localhost)
2. Port: Port number on which the catalogue is served (e.g. 9090)
3. Address: Address at which the service is served (e.g. geonetwork/srv/en/csw)
4. Method: HTTP method of the request, POST or GET (e.g. POST)
5. usesoap: Boolean that indicates whether SOAP is used (e.g. true)
6. servicetype: Service type of the catalogue service (e.g. CSW)
7. version: Version of the catalogue service (e.g. 2.0.1)
8. login:
   a. loginaddress: Address used to log in to the server (e.g. geonetwork/srv/en/xml.user.login)
   b. username: User name for accessing the catalogue server (e.g. admin)
   c. password: Password for accessing the catalogue server (e.g. admin)
9. lastharvestsuccess: Date of the last harvest operation performed.

Figure 14 defines the required metadata for managing grounding catalogues. This UML diagram describes the content of the Grounding Catalogue Repository:

Figure 14: Grounding Catalogue data model.
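The GroundingCatalogue attributes of Section 4.3 can be sketched as a small Python data structure. This is an illustration only: the class names, the endpoint() helper, the use of plain HTTP, the validation of the method value and the choice to leave lastharvestsuccess unset before the first harvest are assumptions, not part of the specification. The example values are the ones given in the attribute list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Login:
    loginaddress: str
    username: str
    password: str


@dataclass
class GroundingCatalogue:
    host: str
    port: int
    address: str
    method: str
    usesoap: bool
    servicetype: str
    version: str
    login: Login
    # Assumption: no value until a harvest operation has been performed.
    lastharvestsuccess: Optional[datetime] = None

    def __post_init__(self) -> None:
        # Section 4.3 lists POST and GET as the possible request methods.
        self.method = self.method.upper()
        if self.method not in ("GET", "POST"):
            raise ValueError(f"unsupported method: {self.method}")

    def endpoint(self) -> str:
        # Assumption: plain HTTP, with the URL formed as host:port/address.
        return f"http://{self.host}:{self.port}/{self.address.lstrip('/')}"


example = GroundingCatalogue(
    host="localhost",
    port=9090,
    address="geonetwork/srv/en/csw",
    method="POST",
    usesoap=True,
    servicetype="CSW",
    version="2.0.1",
    login=Login("geonetwork/srv/en/xml.user.login", "admin", "admin"),
)
```

With the example values, example.endpoint() yields http://localhost:9090/geonetwork/srv/en/csw. In the specification itself this information is stored as XML in the Grounding Catalogue Repository; the dataclass only mirrors the attribute list.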

5 Summary & Outlook

The specification detailed in this document improves on the previous version of the IGS definition. It can be understood as a solid basis for one of the core services of the HUMBOLDT Framework. It incorporates the lessons learned from the state of the art, from the implementation experiences and, above all, from the scenario requirements. Some of the improvements added to this service specification are:

- Detailed descriptions in the enterprise viewpoint, making the specification easier to understand for outside readers.
- Scenario integration aligned to the end-to-end example in the introduction and overview specification document.
- Requirements discussed in the Enterprise Viewpoint and covered in the Computational Viewpoint.
- Integration of the scenario requirements.
- Definition of mandatory and harmonisable constraints.
- Mapping among grounding services metadata, HUMBOLDT constraints and ISO 19115.
- Explanations of how the grounding service relevancy weight is derived.
- The information passed to other components/services has been reduced in order to avoid unnecessary overhead.
- Precise descriptions of the information viewpoint and the computational viewpoint, to make implementation easier.
- The Grounding Service model has been provided as an XML schema and the Information Grounding Service description as WSDL.