D6.1.2 Piloting Plan


ICT Seventh Framework Programme (ICT FP7)
Grant Agreement No: 318497
Data Intensive Techniques to Boost the Real Time Performance of Global Agricultural Data Infrastructures

D6.1.2 Piloting Plan

Deliverable Form
Project Reference No.: ICT FP
Deliverable No.: D6.1.2
Relevant Workpackage: WP6: Real-life Deployment and User Evaluation
Nature: O = Other
Dissemination Level: PU = Public
Document version: 1.0
Date: 16/06/2014
Authors: FAO, DLO, AK, UAH
Document description: The document describes the demonstrators that will be implemented and evaluated in the project.

Document History

Version  Date        Author (Partner)    Remarks
v0.1     18/10/2013  FAO                 Initial version
v0.2     10/03/2014  FAO                 Intro, Chapter 1
v0.3     01/04/2014  FAO                 Chapter 2
v0.4     05/04/2014  FAO                 Chapter 3
v0.5     04/05/2014  AK                  Chapter 4
v0.6     08/05/2014  DLO                 Inclusions of latest contributions from DLO
v0.7     09/05/2014  AK                  Re-structure the deliverable and update the evaluation parts
v0.8     05/06/2014  FAO, UAH, DLO, AK   Revision by FAO and inclusion of updates and contributions by FAO, UAH, DLO, AK resulting from 4th Plenary Meeting discussions
v0.9     10/06/2014  NCSR-D              Internal review
v1.0     23/06/2014  FAO, UAH, DLO       Final updates in response to review and delivery as D

EXECUTIVE SUMMARY

This deliverable is an updated version of deliverable D It describes the design of the end-user applications that will be used in the SemaGrow pilot trials, the piloting plan, and the evaluation methodology that the pilot trials will use. It presents the final plan for implementing the applications selected as demonstrators of SemaGrow technologies, including more details about the evaluation methods and tools for the SemaGrow demonstrators.

Section 1 introduces the deliverable. Section 2 introduces the evaluation context for the SemaGrow demonstrators. Section 3 introduces the layered evaluation framework that we adopt. Section 4 describes the SemaGrow evaluation methods, tools, metrics, and evaluation experiments. Section 5 covers the implementation of the pilot trials. Finally, Section 6 presents the conclusions of the deliverable.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
  Purpose and Scope
  Approach
  Relation to Other Work Packages and Deliverables
  Big Data Aspects
SEMAGROW DEMONSTRATORS
  Trees4Future/AgMIP
    Rationale for selecting Trees4Future/AgMIP as a demonstrator
    Technical description
    Demonstrator development
    Evaluation objectives for stakeholders
  AGRIS
    Rationale for selecting AGRIS as a demonstrator
    Current architecture of AGRIS
    AGRIS demonstrator
    Foreseen evaluation
  Agricultural Discovery Space (ADS)
    Rationale for selecting ADS as a demonstrator
    Current architecture and status of the Agricultural Data Discovery Space
    Foreseen architecture & implementation
    Foreseen evaluation
  Overview of the evaluation stakeholders
SEMAGROW EVALUATION APPROACH
  Layered evaluation approach
  Towards a layered decomposition for SemaGrow Use Cases
EVALUATION METHODS AND TOOLS FOR SEMAGROW DEMONSTRATORS
  Review of evaluation methods and tools
  Evaluation methods and tools selected for SemaGrow demonstrators
    Overview of the SemaGrow evaluation
    Controlled pilot trials
    Hackathons
  Potential implementation of controlled pilot trials
    Introduction to Open Data in Agriculture
    Hands-On workshop
    Hackathon event
DEMONSTRATORS DEVELOPMENT AND EVALUATION PLAN
  Evaluation timeframes and deliverables
  Trees4Future-Agmip demonstrator
    Development plan
    Evaluation plan
  AGRIS
    Development plan
    Evaluation plan
  Agricultural Discovery Space
    Development plan
    Evaluation plan
CONCLUSIONS
REFERENCES
ANNEX
  Annex A: Example of user satisfaction questionnaire

LIST OF FIGURES

Figure 1-1: Dependencies between D6.1.2 and other deliverables
Figure 2-1: Trees4Future architecture
Figure 2-2: Trees4Future-AgMIP demonstrator user interface
Figure 2-3: Trees4Future application infrastructure
Figure 2-4: SemaGrow Trees4Future-AgMIP application infrastructure
Figure 2-5: AGRIS current architecture
Figure 2-6: AGRIS data flow
Figure 2-7: Agricultural Data Platform
Figure 2-8: Metadata Aggregation Workflow
Figure 2-9: Agricultural Data Platform supported by SemaGrow software stack
Figure 3-1: The SemaGrow Architecture and the evaluation layers
Figure 4-1: Overview of tasks for pilot evaluations, partners responsible, and evaluation activities
Figure 4-2: Piloting trials stages
Figure 4-3: Piloting trials stage
Figure 4-4: Piloting trials Stage
Figure 4-5: Piloting trials stage
Figure 5-1: Steps of the overall validation approach
Figure 5-2: Example of cohort analysis for a specific period

LIST OF TABLES

Table 2-1: List of properties used in AGRIS
Table 3-1: Mapping the evaluation layers with the SemaGrow Architecture [26]
Table 3-2: Mapping the evaluation layers with Trees4Future-AgMIP
Table 3-3: Mapping the evaluation layers with AGRIS
Table 3-4: Mapping the evaluation layers with ADS
Table 4-1: Evaluation methods (source: USINACTS, 1999)
Table 4-2: Mapping Trees4Future-AgMIP with evaluation methods and metrics
Table 4-3: Mapping AGRIS with evaluation methods and metrics
Table 4-4: Mapping ADS with evaluation methods and metrics
Table 5-1: Summary of planning, implementation and evaluation of SemaGrow demonstrators
Table 5-2: List of SemaGrow deliverables related to demonstrators evaluation
Table 5-3: Implementation planning for the Trees4Future-AgMIP demonstrator
Table 5-4: SemaGrow Trees4Future-AgMIP 1st Pilot Trial Evaluation
Table 5-5: SemaGrow Trees4Future-AgMIP 2nd Pilot Trial Evaluation
Table 5-6: Implementation planning for the AGRIS demonstrator
Table 8-1: Generic User satisfaction questionnaire

1. INTRODUCTION

1.1 Purpose and Scope

This deliverable presents a revised plan for implementing the SemaGrow service demonstrators. It also describes the methodology for carrying out pilot trials in which the SemaGrow Stack supports applications that address real-life data problems, and for using these trials as the basis for evaluating user satisfaction with the reactivity of the SemaGrow Stack. The demonstrators are implemented within task T6.2 (which started at M16 of the project), while their evaluation is carried out within T6.3 (starting at M19 of the project).

Following the terminology adopted in WP2, we use Use Case Categories to refer to the main areas of focus of WP6:
- Heterogeneous Data Collections & Streams (led by DLO);
- Reactive Data Analysis (led by FAO);
- Reactive Resource Discovery (led by AK).

Each partner has identified one or more relevant applications to serve as a basis for the project demonstrators. The demonstrators discussed in this deliverable are:
- Trees4Future/AgMIP (identified by DLO);
- AGRIS (FAO);
- ADS (AK).

By use case we mean a sequence of actions meant to address a given information goal. Use cases, as referred to in this deliverable, serve as a basis for evaluating the demonstrators' features.

By pilot trials we refer to the evaluation of the developed demonstrators. We call them pilots to emphasise that they will be performed with real users, as opposed to the technical evaluation that is performed using the automated methods and tools developed in WP4 Rigorous Experimental Testing.

Besides this introductory section, this document presents the context and current status of the applications that serve as a basis for the pilots, the envisaged updates and a plan for their implementation, and their relevance to the purpose of the SemaGrow pilots (Section 2). The document then introduces the layered evaluation framework that we adopt (Section 3), presents the evaluation methodology (Section 5), and concludes (Section 6).

1.2 Approach

The implementation of the demonstrators started at M16 of the project. UAH will provide technical support for the backend of the applications that will be used in the pilots, while the changes required to apply SemaGrow technology to existing applications and/or interfaces will be made by the partner responsible for each pilot (FAO, DLO and AK). The first implementation of the demonstrators will be delivered at M21, followed by a first evaluation phase to be concluded by M24. A second iteration of implementation and evaluation will then follow.

[Figure 1-1: Dependencies between D6.1.2 and other deliverables. Boxes shown: WP2 Use Cases & Architecture, with Task 2.1 Envisaged Applications and Use Cases (D2.1.2) and Task 2.2 Data Streams and Collections (D2.2.2); WP6 Real-life Deployment and User Evaluation, with Task 6.1 Piloting Plan (D6.1.1 and D6.1.2), Task 6.2 Pilot Deployment (D6.2.1) and Task 6.3 Pilot Trials (D6.3.1).]

1.3 Relation to Other Work Packages and Deliverables

This document builds on deliverable D6.1.1, where the first plan was presented. Among other updates, this version also incorporates the project's reaction to the recommendations of the first-year project review. The main changes with respect to the previous version of the plan are:

1. The number of demonstrators is reduced to three, one for each of the three use case categories.
2. Each demonstrator is described in more detail, so as to state which big data challenges it addresses and to cover the technologies and architectures of the current versions of the applications selected to serve as demonstrators, together with a description of the future demonstrators.
3. Trees4Future and AgMIP have been merged, while the AGRIS focus changed to include the automatic discovery of Web resources related to the AGRIS domain. At the same time the AgLR Toolkit was removed, and the repercussions of SemaGrow deployment for data management back-ends are demonstrated in ADS instead.

The demonstrators described in this deliverable are based on the work done in WP2 regarding the features requested by users and stakeholders (D2.1.2: Envisaged Applications and Use Cases) and the data to use (D2.2.2: Data Streams and Collections). This document refines the applications and use cases described in D2.1 from the perspective of evaluating user satisfaction and system reactivity in real-life deployments. This design and planning drives the deployment (D6.2.1) and execution (D6.3.1) of the first round of SemaGrow pilot trials (Figure 1-1).

Initial provisions for the second piloting round are also described, but will be refined and finalized as an outcome of the first piloting round.

1.4 Big Data Aspects

The piloting plan aims to design pilots that provide the user interactions needed to evaluate user experience from the perspective of the different stakeholders. The big data aspects of this design are discussed in the section "Evaluation objectives for stakeholders" for the Trees4Future/AgMIP demonstrator, and in the section "The AGRIS Demonstrator" for queries that join AGRIS data against Web crawl data.

2. SemaGrow Demonstrators

In this section we present the SemaGrow demonstrators, explaining the rationale behind their selection and the big data challenge each one addresses. For each demonstrator, we describe its underlying architecture, the technologies adopted, its main functionalities, the lifecycle of the data used, and the users involved. We also indicate briefly which functionalities we select for evaluation, why we select them, and how evaluating them contributes to the project and to the partners developing the demonstrators. Finally, we highlight the user groups involved in the evaluation.

2.1 Trees4Future/AgMIP

Trees4Future develops a research infrastructure for European forestry research. As part of the project, a geo-spatial Clearinghouse is being developed that allows forestry research stakeholders (researchers, modellers, policy makers, sector organisations etc.) to effectively discover relevant data for their work. The main mechanism for increasing the effectiveness of searches is the use of semantic technologies on top of (harvested) metadata.

AgMIP is a global network of researchers who work on agricultural modelling and on the intercomparison of different crop and agro-economic models. They run into issues similar to those of Trees4Future stakeholders regarding the discovery, selection and (pre)processing of the data required for their modelling exercises, and regarding the size, amount and complexity of the data involved.

Rationale for selecting Trees4Future/AgMIP as a demonstrator

The Trees4Future and AgMIP communities share many issues related to the discovery of large amounts of heterogeneous data sources. First of all, discovering targeted datasets is generally an issue. Discovery is usually based on the dataset's metadata, but the use of semantics is generally limited. In addition, a lot of valuable information that resides in the data content itself is not accessible.

Trees4Future therefore extends the classical approach to spatial dataset discovery by including knowledge of dataset attributes where possible (as an extension of the metadata, so still not using the data itself). Triplifying this extended metadata and enriching the triples with semantics allows for more efficient, semantically driven discovery of geo-datasets. This increases the effectiveness of searches by filtering out irrelevant results on the one hand and by surfacing otherwise undiscovered datasets on the other. It also allows search results to be assessed more effectively, e.g. through better relevance scoring.

AgMIP faces the same type of issues in discovering agronomic data sources for agricultural modelling exercises. The AgMIP community is expected to benefit considerably from adopting the concepts and principles used by Trees4Future. This might start as simply as adopting the Trees4Future application as a search mechanism on top of triplified and semantically enriched AgMIP datasets. Moreover, being able to analyse datasets based on their (data) contents should also allow querying for sub-selections of data.

[Figure 2-1: Trees4Future architecture]

2.1.2 Technical description

The Trees4Future Clearinghouse is a knowledge portal offering search and discovery functions on top of forestry research related datasets. The backend of the system is a semantic data store containing triplified metadata, which is harvested from registered catalogues and semantically enriched through linkage to relevant ontologies.

Figure 2-1 shows the architecture of the current Trees4Future Clearinghouse. Metadata is harvested from a range of metadata catalogues supporting different metadata standards (OAI-PMH, OGC-CSW, THREDDS (NetCDF) etc.). Harvested metadata is subsequently triplified and stored as RDF triples in a triple store, following the structure of a custom-developed ontology. After triplification, the concepts identified in the metadata are automatically linked to a set of relevant (external) ontologies.

The Trees4Future user interface provides a search interface on top of the triple store. The query interface between the GUI and the backend components is implemented as a SOAP interface, communicating with a server component that transforms each SOAP request into SPARQL queries.
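As an illustration of the server-side transformation from a search request to SPARQL, the following minimal Python sketch builds a query from a list of search terms. It is not the Clearinghouse implementation: the `ex:` namespace, the `ex:hasConcept` property and the function names are hypothetical, and the SOAP layer is left out. Only the `skos:` vocabulary is standard.

```python
def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_dataset_query(terms: list[str], limit: int = 50) -> str:
    """Build a SPARQL query for datasets linked to ALL given terms (implicit AND)."""
    if not terms:
        raise ValueError("at least one search term is required")
    patterns = []
    for i, term in enumerate(terms):
        label = escape_literal(term.lower())
        # ex:hasConcept is a hypothetical property linking a dataset to a concept.
        patterns.append(
            f"  ?dataset ex:hasConcept ?c{i} .\n"
            f"  ?c{i} skos:prefLabel ?l{i} .\n"
            f'  FILTER(LCASE(STR(?l{i})) = "{label}")'
        )
    body = "\n".join(patterns)
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "PREFIX ex: <http://example.org/t4f#>\n"  # hypothetical namespace
        "SELECT DISTINCT ?dataset WHERE {\n"
        f"{body}\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_dataset_query(["forest", "soil"]))
```

Each search term adds its own triple pattern and filter, so a dataset is returned only if it is linked to a concept for every term.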

[Figure 2-2: Trees4Future-AgMIP demonstrator user interface]

The technical specifications of the Trees4Future components are as follows:
- Graphical user interface: PHP, HTML/JavaScript/jQuery
- Query protocols: SOAP (GUI to server) + SPARQL (server to triple store)
- Server components: J2EE, Java
- Triple store: Sesame OpenRDF
- Catalogue harvester: GI-CAT, customized OAI-PMH harvester

Figure 2-2 shows a sketch of the user interface of the Trees4Future-AgMIP demonstrator, derived from the Trees4Future GUI. It shows the different dimensions that can be used to query the current Trees4Future metadata.

Search terms

The free search terms textbox allows users to perform queries based on one or more search terms (implicitly combined with the AND operator). The search terms are matched with concepts (skos:Concept) linked to the datasets in the Trees4Future triple store using semantic search mechanisms. The relevance of a dataset is derived from the distance between the dataset and the concept in the ontology.
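The distance-based relevance described above can be sketched as a breadth-first search over concept links. This is a minimal sketch under stated assumptions: the toy concept graph, the function names and the scoring formula 1 / (1 + distance) are all illustrative, since the source does not specify how distance is turned into a score.

```python
from collections import deque

# Toy concept graph (hypothetical): undirected links such as skos:broader/narrower.
GRAPH = {
    "forest": ["conifer forest", "land cover"],
    "conifer forest": ["forest", "pine stand"],
    "pine stand": ["conifer forest"],
    "land cover": ["forest"],
}


def concept_distance(graph, start, target):
    """Return the number of links between two concepts, or None if unconnected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def relevance(graph, dataset_concept, query_concept):
    """Assumed scoring: 1.0 for an exact match, decreasing with distance."""
    dist = concept_distance(graph, dataset_concept, query_concept)
    return 0.0 if dist is None else 1.0 / (1 + dist)


if __name__ == "__main__":
    print(relevance(GRAPH, "pine stand", "forest"))
```

A dataset annotated with a concept two links away from the query concept scores lower than one annotated with the query concept itself, which is the ordering the relevance ranking needs.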

As the user types, the system offers suggestions based on the first three characters and their matches with the contents of the triple store.

Spatial search

The spatial search allows users to enter spatially explicit search terms, using either geographical names or a spatial extent selected on a map. Geographical names are converted to a geographical extent, so these queries are essentially the same as extent-based queries. For the metadata/full dataset search, the application returns all datasets that overlap with the selected feature or extent. When retrieving data subsets, queries return only the subset of data that lies inside the selected feature or extent. The resolution refers to the spatial scale required by the end user. The application returns all data that has either the requested resolution or a resolution that can be transformed to it (that is, a higher resolution: a request for 25 x 25 km data will also return 10 x 10 km data).

Temporal search

Through the temporal search option, a user can define a start and/or end time point of interest for the queried datasets. The start and end dates can be exact dates or only years. For the metadata/full dataset search, the application returns all datasets that overlap with the selected period or, if only one time point is specified, the datasets containing data that (partially) concerns the period after the specified start date or before the specified end date respectively. For actual data selection, queries return only the subset of data that lies inside the selected period, after the selected start date or before the selected end date respectively. The resolution refers to the temporal scale required by the end user. The application returns all data that has either the requested resolution or a resolution that can be transformed to it (that is, a higher resolution: a request for weekly data will also return datasets with daily data).

Query results

The application returns references to datasets as search results. If available, the results include URLs to view services (e.g. WMS) and data services (e.g. WFS, HTTP-based file access). For the SemaGrow demonstrator a two-step process is foreseen, in which the data services retrieve data (sub)sets through the SemaGrow SPARQL endpoint. In the first step, a query is issued to retrieve the dataset references (metadata + links as shown in Figure 2-2). In the second step, the user can click on the data access link to retrieve the data content. This issues a second query, which retrieves the (subset of) data matching the query criteria through the SemaGrow endpoint.

Demonstrator development

The Trees4Future AgMIP demonstrator will combine T4F's Clearinghouse forestry research data, such as forestry genetics and forest management data, with the AgMIP data (e.g. climate data, covering both past climate and future estimations, soil data, crop trials) required to run agronomic and agro-economic models. All data will be triplified and stored in triple stores accessible by the SemaGrow Stack, along with triple stores containing triplified metadata of the aforementioned data.
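The spatial, temporal and resolution rules given for the search options earlier in this section can be sketched as simple predicates over dataset metadata. The `Dataset` record and all function names are hypothetical; the sketch assumes bounding boxes in (min_lon, min_lat, max_lon, max_lat) order and expresses spatial resolution as a cell size in km, so a smaller value means a higher resolution.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass
class Dataset:
    name: str
    bbox: BBox
    start: date
    end: date
    cell_km: float  # spatial resolution as cell size; smaller means finer


def bbox_overlaps(a: BBox, b: BBox) -> bool:
    """True if two bounding boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def period_overlaps(ds: Dataset, start: Optional[date], end: Optional[date]) -> bool:
    """Either bound may be omitted, matching the one-sided searches in the text."""
    if start is not None and ds.end < start:
        return False
    if end is not None and ds.start > end:
        return False
    return True


def resolution_ok(ds: Dataset, requested_km: float) -> bool:
    """Accept the requested cell size or anything finer (10 km satisfies 25 km)."""
    return ds.cell_km <= requested_km


def matches(ds, bbox, start, end, requested_km) -> bool:
    return (bbox_overlaps(ds.bbox, bbox)
            and period_overlaps(ds, start, end)
            and resolution_ok(ds, requested_km))


if __name__ == "__main__":
    spain = Dataset("demo", (-9.0, 36.0, 3.0, 44.0),
                    date(2000, 1, 1), date(2010, 12, 31), 10.0)
    print(matches(spain, (-4.0, 40.0, -3.0, 41.0), date(2005, 1, 1), None, 25.0))
```

In the demonstrator these checks would be expressed in the SPARQL queries themselves; the predicates here only make the matching rules explicit.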

[Figure 2-3: Trees4Future application infrastructure]

More specifically, the architecture of the Trees4Future AgMIP demonstrator comprises:
- The Trees4Future AgMIP user interface (GUI / demonstrator), where users fill in the parameters of their search and details such as resolution.
- The SemaGrow Stack, accessed by the GUI through a single SPARQL endpoint, where the query components are processed against metadata information, semantic alignments of pre-stored ontologies, and data stored in external triple stores.
- The data repositories, accessed by the SemaGrow Stack through SPARQL endpoints, one for each dataset.

The Trees4Future AgMIP semantic user interface (GUI) will provide fields for stating the search parameters (thematic, statistical, spatial, temporal, and combinations of these). Besides search terms, the user can also specify the required resolution of the data. When the user-defined query parameters are submitted, an initial query is executed on a metadata repository, which contains metadata harvested from dataset headers (e.g. the headers of NetCDF files) and compiled from external catalogues describing individual datasets or groups of datasets. The procedures and guidelines prepared in Task 5.1 for adding new datasets and the associated metadata to a SemaGrow federation also apply to NetCDF files, provided that the SemaGrow triplifiers have been applied and the resulting triple store is accessible to the SemaGrow Stack through a SPARQL endpoint.

The results of the initial search on the metadata are displayed to the user, who can then refine the initial search criteria and adjust the desired resolution of the parameters, in order to perform adapted queries and get more specific or relevant results. The results of the queries are displayed as datasets relevant to the search, along with descriptive metadata and (automatically produced) estimates of the number of results contained in them. The user can also state the required resolution for the results. Where applicable re-scaling mechanisms exist, this acts as a request to apply them in the final post-processing step described below; otherwise it acts as a filter.

Finally, the user selects the appropriate resolution and datasets among the displayed results and issues the last query. Rescaling and merging will not always be possible, so care will be taken to author the query templates filled in by the GUI in such a way that they only retrieve datasets with matching specifications and dimensions. In other words, some prior domain knowledge about what can reasonably be merged will be encoded in the query templates, relieving the user of the burden of wading through long lists of meaningless suggestions.

The final query is performed on the actual data contained in the datasets the user selected. This time the SemaGrow Stack queries the datasets containing the actual data, rather than their metadata as in the previous queries, through their SPARQL endpoints. First, the SemaGrow Stack creates a local, temporary dump containing all the data matching the search criteria. A final post-processing step (the NetCDF Creator, Figure 2-4) then merges and (if possible) re-scales the data collected from the different datasets. The output of the procedure is a NetCDF file that contains the requested combination of variables at the requested resolution and dimension constraints, and that can be used directly as input for research models.

As an example, a user interested in temperature and precipitation data in Spain might first query for data for a region in Spain for a specific time period. If no suitable data is available, the user might extend the query to search for datasets covering the whole of Spain or even Europe. The user can then choose the relevant datasets and select the desired spatial and temporal resolution (e.g. monthly temperature and rainfall at a resolution of 25 x 25 km or higher). With this final query the user can request the actual data, delivered as a file containing only the data for the parameters the user stated, restricted by the other criteria (space, time etc.).

Developing the Trees4Future AgMIP demonstrator on the basis of the Trees4Future system requires a range of development activities to migrate from the current Trees4Future application to the Trees4Future-AgMIP demonstrator. To clarify the required changes, Figure 2-3 and Figure 2-4 show the current Trees4Future application infrastructure and the foreseen implementation of the Trees4Future-AgMIP demonstrator for SemaGrow respectively.
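The temporal re-scaling mentioned for the post-processing step can be illustrated with a small aggregation from daily to monthly values. This is only a sketch of the idea, not the NetCDF Creator itself: the function name and the in-memory (date, value) input format are assumptions, and writing the result to NetCDF is omitted.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def to_monthly(daily, aggregate=mean):
    """Re-scale (date, value) pairs to one value per (year, month).

    Use aggregate=mean for quantities such as temperature and
    aggregate=sum for accumulated quantities such as rainfall.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[(day.year, day.month)].append(value)
    return {key: aggregate(values) for key, values in sorted(buckets.items())}


if __name__ == "__main__":
    temperature = [
        (date(2010, 1, 1), 4.0),
        (date(2010, 1, 2), 6.0),
        (date(2010, 2, 1), 10.0),
    ]
    print(to_monthly(temperature))
    print(to_monthly(temperature, aggregate=sum))
```

Aggregation only works from a finer to a coarser resolution, which is why a request for monthly data can be served from daily datasets but not the other way round.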

[Figure 2-4: SemaGrow Trees4Future-AgMIP application infrastructure. Components shown: Trees4Future/AgMIP Demonstrator GUI, ClearingHouse Server (SOAP), SPARQL Proxy, data services, NetCDF Creator, the SemaGrow Stack, and SPARQL endpoints for the metadata repository, the triplified Trees4Future repository, the triplified AgMIP repository and further triplified datasets.]

A stepwise approach will be used to migrate from the current Trees4Future implementation to the full SemaGrow Trees4Future-AgMIP demonstrator. The steps are defined as follows:

1) Transform the Trees4Future interface into a Trees4Future-AgMIP demonstrator working on top of the SemaGrow infrastructure, serving Trees4Future data and metadata:
- Adaptation of the Trees4Future server component to query the SPARQL endpoint of the SemaGrow infrastructure;
- Provision of access to Trees4Future metadata through a SPARQL endpoint on the Trees4Future triple store;
- Transfer of the parts of the Trees4Future ontology that the SemaGrow infrastructure requires to effectively query the Trees4Future SPARQL endpoint.

2) Extend the SemaGrow Trees4Future-AgMIP demonstrator with query functions on Trees4Future data content (using the same end-user interface):
- Design of additional / extended SPARQL queries that support (1) querying datasets based on criteria related to the data and (2) selecting a subset of data based on criteria related to the data;
- Extension of the Trees4Future ontology in the SemaGrow infrastructure with the semantics required to effectively query the Trees4Future data;
- Development of a data conversion component that translates the result set (RDF data) into a usable (spatial) format, e.g. WCS, WFS, NetCDF or ESRI shapefile.

3) Add AgMIP data and search and discovery support for AgMIP:
- Setup of a triple store for AgMIP (meta)data and triplification of the AgMIP datasets;
- Extension of the SemaGrow ontologies with the semantics required to effectively query the AgMIP (meta)data.

The development timeline for this demonstrator, its activity planning and the deliverable schedule, as well as the plan for evaluation, are further elaborated in the development and evaluation plan for this demonstrator.

Evaluation objectives for stakeholders

The stakeholders for the Trees4Future-AgMIP demonstrator are, on the one hand, the potential user community of the demonstrator (user perspective) and, on the other hand, the parties interested in exploiting the SemaGrow infrastructure in a broader domain for similar big data problems (project perspective).

The user community for the demonstrator consists of the following groups:
- Forestry researchers and students requiring big data (sub)sets for their modelling and analysis work (Trees4Future community).
- Forestry practitioners and consultants searching for data to support their advice and analysis work (Trees4Future community).
- Climate change adaptation researchers in the area of forestry and agriculture requiring big data (sub)sets for their modelling and analysis work (Trees4Future and AgMIP community).
- Policy makers in the area of agriculture and forestry searching for data related to their policy domain (Trees4Future and AgMIP community).

These users commonly have problems discovering the data required for their work in the wealth of available data in the domains of forestry, agriculture and climate change. They do not know all available datasets and in some cases cannot judge the relevance of datasets.
They therefore often depend on experts, and the process of deriving the required data can be a time-consuming and error-prone exercise.

The expectations for this demonstrator from the perspective of the user community are:
- A more effective way of searching datasets:
  o Finding the best data for the job without missing relevant datasets.
  o Finding data with the best possible (spatial and temporal) resolution.
  o Being able to assess the relevance of discovered datasets.
- Discovery of data and datasets should be possible with acceptable performance.
- The search mechanism should also be able to provide datasets or subsets in a usable format (which in general is not the RDF/XML format).

From the perspective of the project it is essential to evaluate the capacity of the SemaGrow infrastructure to handle heterogeneous big data sets. The user community and the use case elaborated here can serve as a model for the big data problems that exist in many knowledge-intensive domains that deal with heterogeneous and multi-dimensional big data sets. The expectations from the project are:
- To test the capabilities of the SemaGrow infrastructure to deliver the Trees4Future and AgMIP functionalities: thematic, spatial and temporal queries over (meta)datasets, returning references to datasets, the datasets themselves and sub-selections of datasets.
- To explore and test ways to increase the effectiveness of data discovery through semantic technologies.
- To test and compare the performance of comparable search queries over the classical Trees4Future system and the SemaGrow implementation.

The expectations described here will be translated into a set of evaluation criteria that can be assessed in the demonstrator pilot trials. The evaluation procedures are elaborated later in this deliverable.

AGRIS

AGRIS is one of the biggest and most important information systems in the agricultural domain. It is a database of more than 7.7 million bibliographic references in agriculture. The AGRIS Web portal ( receives an average of visits/month and is accessed worldwide (from more than 190 countries and territories, according to Google Analytics statistics). AGRIS is indexed by Google and its content ranks at the top of Google results. Moreover, AGRIS is already largely oriented to semantic technologies, as it uses RDF data and accesses various SPARQL endpoints. For all these reasons AGRIS is a natural choice for testing SemaGrow technologies.

Rationale for selecting AGRIS as a demonstrator

AGRIS may be considered a mash-up application based on semantic technologies: when returning documents relevant to a user's query, it also enhances them with links to a variety of related resources. This enhancement is based on the content of the resulting documents and not on the user's query.
By exploiting the technologies developed within SemaGrow, we expect to further enhance the ability of AGRIS to provide users with relevant resources in a reactive and robust manner; while also allowing for diverse, heterogeneous, and large-scale data sources to be easily incorporated by the AGRIS administrators. As a first step, the SemaGrow demonstrator will combine bibliographic results from the AGRIS database with relevant information already available in the Web. The latter will be retrieved from the FAO Web crawl database, holding metadata about Web resources relevant the agricultural domain and that may nicely integrate the information already available in AGRIS. This database is populated by a dedicated system (developed and maintained by FAO outside SemaGrow) for crawling the public Web and annotating Web pages with AGROVOC concepts. From the point of view of AGRIS users, SemaGrow technology may be the key to improve the informativeness of the service, as more data sources are combined with the bibliographic references they retrieve, including: Page 19 of 61

- Relevant pages crawled from the public Web and annotated with AGROVOC concepts, selected for their semantic similarity to the AGROVOC annotations of the bibliographic entries.
- Relevant meteorological, soil, and experimental results datasets from the Trees4Future/AgMIP collections, selected for their semantic similarity to the AGROVOC annotations of the bibliographic entries under the SemaGrow-produced vocabulary alignment between AGROVOC and the DLO vocabulary.

From the point of view of the AGRIS administrators, SemaGrow technology may contribute to consolidating AGRIS in its role as a major information service in the area of agriculture.

From the project point of view, AGRIS is also a good test bed because it is already largely based on semantic technologies. It is therefore interesting to understand the impact of adopting SemaGrow technologies in systems that are already in the area, i.e., without drastically changing the underlying architecture or the skills of the system administrators.

Current architecture of AGRIS

The logical view of the AGRIS high-level architecture may be described as consisting of four main components, plus the CIARD RING (Figure 2-5). At the centre is the AGRIS Web application, a Java application deployed in a Tomcat Web server. It comprises the Web interface and the algorithms that allow users to look for agricultural information in AGRIS.

Figure 2-5: AGRIS current architecture


The Apache Solr server allows the AGRIS Web application to quickly retrieve results and display them to the users; it also helps to retrieve statistical information and to perform some analysis on top of AGRIS data. A filesystem XML database is used as a bibliographic repository to store metadata coming from data providers. In the existing AGRIS architecture, the Apache Solr index is built on top of this database. Then, there is a triplestore (currently AllegroGraph) used to store the so-called AGRIS RDF, namely the RDF-ization of the filesystem XML database, enhanced with additional AGROVOC URIs computed by automatic procedures such as the AgroTagger. Finally, we mention the CIARD RING. This is an external component, in that it does not strictly belong to AGRIS, but it is necessary to retrieve information about AGRIS data providers.

Figure 2-6: AGRIS data flow

Technologies. The AGRIS application is entirely based on Java. Some APIs used by the application:
- Apache Struts 2.0 for the Web interface and the exchange of parameters between the user and the application itself;
- Apache Solr for the indexing of resources;
- Sesame 2 to query the triplestore.

Data workflow. AGRIS receives data from a variety of applications and institutions, each managing their data independently. Therefore, the very first phase of the data workflow in AGRIS is to map the received format into the AGRIS internal model. In this way, AGRIS data providers may continue to use their own data model and still contribute to AGRIS. Currently, the AGRIS internal data model is the AGRIS AP for the filesystem XML database, and the AGRIS RDF for the triplestore. In the future, the AGRIS RDF should be the only AGRIS format, possibly extended with more properties. Table 2-1 lists the AGRIS RDF properties currently in use. After the conversion to the AGRIS internal model, data is indexed by Solr and made accessible for search through the AGRIS web site.

Data coverage. The AGRIS core database consists of bibliographic metadata in the agricultural domain. However, other types of data (always related to agriculture and food) are interlinked with AGRIS, such as maps, statistics, etc. AGRIS now accesses the following external datasets: Europeana, species distribution data from GBIF, Nature, DBPedia, germplasm data from Biodiversity International, FAO Country Profiles, IFPRI, World Bank.

Users involved in AGRIS. The following groups of users (in terms of profile) are involved in AGRIS: software developers, agricultural researchers, students, librarians, information management specialists, agricultural journal editors, related data providers, and other interested people.

Table 2-1: List of properties used in AGRIS.
- bibo:article
- bibo:abstract
- bibo:doi
- bibo:isbn
- bibo:presentedat -> bibo:conference -> dct:title
- bibo:uri
- dct:alternative
- dct:type
- dct:issued
- dct:source
- dct:ispartof
- dct:creator -> foaf:organization -> foaf:name
- dct:creator -> foaf:person -> foaf:name
- dct:datesubmitted
- dct:description
- dct:extent
- dct:identifier
- dct:rights
- dct:publisher -> foaf:organization -> foaf:name
- dct:subject
- dct:language
- dct:title

2.2.3 AGRIS demonstrator

The AGRIS demonstrator for SemaGrow requires changes mostly at the level of data flow. The current architecture will not undergo substantial changes, while a new source of data will be added, coming from a massive harvesting of the Web and subsequent processing to find meaningful combinations between the AGRIS core database, the resulting LOD database, and other interesting databases provided by SemaGrow.

The core idea is to harvest the Web, starting from pre-selected sources of information in the agricultural domain; discovered resources will then be annotated with AGROVOC concepts and stored in a big triplestore (the crawler database). This triplestore can be used to define combinations with the AGRIS core database (in order to create a widget for the AGRIS Web portal) and with other databases.

In detail, the following technical components will have to be added to the current AGRIS in order to build a demonstrator for SemaGrow:

1. A customized Apache Nutch Web crawler to harvest data from the Web.
   o The Web crawler and the tagging component may be provided by FAO, by tuning existing tools. A Web crawler is needed to gather data from the Web, while an automatic tagging tool is needed to apply a first phase of filtering for relevant content, relying on the AGROVOC thesaurus.
2. A triplestore suitable to store the big data collected in the previous phase, i.e. the AGRIS core database (~200 million triples) and the crawler database (expected to quickly reach the order of magnitude of gigatriples).
   o The required triplestore will be provided by SemaGrow partners. The discussion about where to host this database is still open.
   o Once the triplestore is provided, FAO can provide the data to fill it, coming from the AGRIS core database and the crawler database.
3. A processing phase will then take place, in order to work on the collected dataset and discover meaningful combinations between the AGRIS core database and the crawler database. We also plan to connect AGRIS data with the data made available by DLO (see the Trees4Future-AgMIP demonstrator); this processing phase will therefore also include that data.
   o A domain expert in agriculture is needed to define possible combinations of databases in natural language. This domain expert has not been identified yet, but could be provided by SemaGrow partners.
   o Processing tools will be provided by SemaGrow partners, with the coordination and direct contribution of UAH. The type of processing envisioned would include the selection of resources that meet requirements such as: an AGRIS record and a crawler record having at least 4 (or 5) common AGROVOC URIs.
   o Support on mapping data, as needed, will also be provided by SemaGrow partners, with a specific role for UNITO.
4. A triplestore of relevant selected resources will have to be set up.
   o Technical support will be provided by SemaGrow partners.

5. A new widget, based on the information included in the triplestore of point 4, will be added to the AGRIS Web portal.
   o FAO will provide the technical support to write and deploy the widget in AGRIS.

In order to interlink DLO data with the AGRIS database, a mapping between AGRIS and DLO will be needed. Two options are available:
- DLO data could be indexed with AGROVOC URIs. In this way, the AGRIS engine would be able to automatically display DLO data when an AGRIS record comes with specific AGROVOC URIs.
- DLO data could be made accessible through a REST Web service and queried using scientific names.

These two options are valid for any dataset that has to be interlinked with AGRIS.

Skills required to build an AGRIS demonstrator. Given the current status of AGRIS, no major changes in skills would be needed on the side of the system administrators. No changes at all are required on the side of the application end users.

Foreseen evaluation

Two aspects of the demonstrator will have to be evaluated: on the one hand, the efficiency of the data enhancement phase; on the other hand, the improved experience of the end user, through a more user-oriented evaluation.
- For the former type of evaluation, support will be provided by the technical partners in the project, who have to ensure response times on the order of one second.
- For the latter, a survey (via AIMS) could be used once a widget is available in the AGRIS mashup page. This will affect the following evaluation stakeholders: researchers, domain specialists, librarians.

Note that, from a technical point of view, this step requires that the discovered combinations of point 4 be stored and made queryable via a SPARQL endpoint or a RESTful Web service.
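The selection rule mentioned for the processing phase of the AGRIS demonstrator (an AGRIS record and a crawler record having at least 4, or 5, common AGROVOC URIs) can be illustrated with a minimal Python sketch. The record identifiers, the example.org URIs and the function name select_combinations are hypothetical and introduced only for illustration; the actual processing tools to be provided by the SemaGrow partners are not specified in this deliverable.

```python
from itertools import product

# Threshold from the text: "at least 4 (or 5) common AGROVOC URIs"; 4 is assumed here.
MIN_SHARED_CONCEPTS = 4


def select_combinations(agris_records, crawler_records, min_shared=MIN_SHARED_CONCEPTS):
    """Pair AGRIS and crawler records that share enough AGROVOC URIs.

    Both arguments map a record identifier to an iterable of AGROVOC URIs.
    Returns a list of (agris_id, crawler_id, sorted shared URIs) tuples.
    """
    selected = []
    for (agris_id, agris_uris), (crawl_id, crawl_uris) in product(
        agris_records.items(), crawler_records.items()
    ):
        shared = set(agris_uris) & set(crawl_uris)
        if len(shared) >= min_shared:
            selected.append((agris_id, crawl_id, sorted(shared)))
    return selected


# Hypothetical records; the example.org URIs are placeholders, not real AGROVOC concepts.
BASE = "http://example.org/agrovoc/"
agris = {"agris:1": [BASE + c for c in ("c_1", "c_2", "c_3", "c_4", "c_5")]}
crawl = {
    "web:a": [BASE + c for c in ("c_1", "c_2", "c_3", "c_4")],  # 4 shared: selected
    "web:b": [BASE + c for c in ("c_1", "c_9")],                # 1 shared: rejected
}
matches = select_combinations(agris, crawl)
```

The pairwise loop is quadratic in the number of records and is only meant to make the rule concrete; at the scale discussed here (hundreds of millions to billions of triples) the same condition would more plausibly be evaluated inside the triplestore or with an index from AGROVOC URI to records.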
2.3 Agricultural Discovery Space (ADS)

For the use case Reactive Resource Discovery, based on the recommendations of the reviewers, of the two demonstrators that were originally selected, the AgLR toolkit and the Agricultural Discovery Space (ADS), we will focus on ADS with the support of the semantic search tools of the SemaGrow infrastructure. ADS will be realised to cover the research needs of educators and trainers in the areas of Food Safety and Agricultural Research Information, exploring specific ways to meet their requirements for finding material for their activities.

Any agricultural data discovery space, whether it is part of a web portal or part of a tool like the AgLR toolkit, is based on the Agricultural Data Platform that AK has developed. In order to improve the discovery experience of the user, we need to improve the Search API that the Agricultural Data Platform provides to the developers of discovery applications. Therefore, the use case of Reactive Resource Discovery will mainly focus on how the data platform can be improved through the results of the SemaGrow project. The improvements will be reflected both at the layer of the APIs that the data platform exposes and at the layer of the front-end discovery applications.

The ADS case will focus on how the agricultural data platform could be enhanced in order to allow multiple and diverse data sources with specialised educational and research content to be searched, accessed and interlinked with the content aggregated by the platform. Specifically, the existing agricultural data platform aggregates metadata describing mainly educational and bibliographic resources. However, the existing platform is neither scalable nor efficient when handling many different types of resources described by heterogeneous data.

Rationale for selecting ADS as a demonstrator

The expectation is that ADS may demonstrate the efficiency of SemaGrow technologies, in particular with regard to the reactive discovery of resources described in different contexts. In the ADS case, the perspective from which multiple heterogeneous and diverse data sources are considered is that of Food Safety and Agricultural Research Information, where users need to cope with reactive resource discovery in order to find, reuse and exploit data resources.

In order to provide meaningful and efficient agricultural data discovery services to the end users, AK plans to improve its data platform in the following directions:
- Support and link heterogeneous data sources. Currently only specific data types can be supported, namely bibliographic and educational, and any new data type requires setting up a new customized instance of the data platform. The customization consists in creating a new data model for the database, creating new transformers for the new data type, and revising existing processing components. In addition, the final index that is created can be connected to the other existing data types only through aligned or common classifications. This means that federated querying is not possible for all the data types supported by the data platform. The process of supporting a new data type is costly and time consuming.
- Support reactive resource discovery. Currently, querying two or more different data types is implemented as a parallel call to two or more APIs. This considerably degrades the user experience with regard to data discoverability. More specifically, users cannot perform complex queries and need more clicks to discover the content they are seeking.
- High efficiency. Currently AK needs to install a new data platform instance on new cloud infrastructure (at least 4 VMs) every time a new data type has to be supported. Moreover, in front-end applications with high visibility there are cases in which resources are consumed calling APIs that are not available or that cannot provide the requested information due to low content coverage.
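The current situation described above, in which each data type is served by its own API and a discovery application has to call them in parallel and merge the answers itself, can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the three search functions are in-memory stand-ins rather than the real Agricultural Data Platform APIs, and the names parallel_search and APIS are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_bibliographic(query):
    """Stand-in for a per-type search API over bibliographic records (hypothetical)."""
    catalogue = ["Soil salinity in rice fields", "Organic rice production"]
    return [title for title in catalogue if query.lower() in title.lower()]


def search_educational(query):
    """Stand-in for a per-type search API over educational resources (hypothetical)."""
    catalogue = ["Course: rice pest management", "Lecture: composting basics"]
    return [title for title in catalogue if query.lower() in title.lower()]


def search_unavailable(query):
    """Stand-in for an API that is down; the call still costs the client a request."""
    raise ConnectionError("API not available")


def parallel_search(query, apis):
    """Call every per-type API in parallel and collect results on the client side."""
    results, failures = {}, []
    with ThreadPoolExecutor(max_workers=len(apis)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in apis.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except ConnectionError:
                failures.append(name)  # resources were spent on a call with no answer
    return results, failures


APIS = {
    "bibliographic": search_bibliographic,
    "educational": search_educational,
    "datasets": search_unavailable,
}
results, failures = parallel_search("rice", APIS)
```

In this arrangement the client receives one separate result list per data type and cannot express a single query that spans types, which is the limitation that a federated search end point is expected to address.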


Figure 2-7: Agricultural Data Platform

The Agricultural Data Platform can be connected with a global Open Agricultural Data Registry provided by directories like the CIARD RING of FAO, where all the data sources are described and published in machine-readable format. Such global directories can work as information backbones for the ecosystem of stakeholders, such as data scientists, developers, and SMEs, that would like to use the available open agricultural data to develop new meaningful services for the end users. One of the main business objectives of AK is to build a data shop for open agricultural data based on the agricultural data platform. Since heterogeneity is a typical challenge in such a data shop, the SemaGrow software stack will be a basic enabler for it.

Figure 2-8: Metadata Aggregation Workflow

2.3.2 Current architecture and status of the Agricultural Data Discovery Space

The Agricultural Data Platform is an open system that can aggregate data from various data sources and store, enrich, transform and index them in order to prepare data to be consumed by developers and applications. The backend of the system is an aggregator with a number of steps supporting the acquisition and maintenance of metadata records from different content providers. The various steps of the aggregation workflow are presented in Figure 2-8. More specifically, the workflow for metadata acquisition includes:
- The ingestion step: the first step consists of ingesting all the metadata records from a remote site of a content provider. Standard protocols such as OAI-PMH are used in most cases.
- The filtering step: filtering consists of discarding incoming records considered inappropriate, either because the object a record describes is inappropriate (e.g., in a collection of educational resources, discarding metadata describing resources covering topics not related to Organic Agriculture and Agroecology) or because the record is syntactically incorrect. The latter can be seen as a light form of validation that focuses on detecting errors that can potentially compromise the correct functioning of the aggregation service.
- The identification and deduplication step: during this step, a software component compares new metadata records to the existing ones to see whether the objects they describe are already referenced in the catalogue.
- Transform into internal format: this step transforms the XML versions of the metadata records into JSON files that follow the principles of an abstract data model. It requires transformers capable of converting the various formats and application profiles of the metadata records collected in the ingestion step into the internal format.
- Link checking: this step checks whether the URL for accessing the learning object is broken. For all learning objects for which the location included in the metadata record has been recognised as broken, the index is updated accordingly in an automatic way.
- Post processing: there are cases in which metadata records need to be normalized in order to avoid problems in the front-end applications. One example is the normalization of the language attribute values for titles in English, which may be provided as either "en" or "eng". In this case the post-processing step normalizes all the values so that they use the correct ISO code for the language.
- Enrichment: this step can be used to enrich the metadata elements of some collections.


- Store and publish records: the final step of the metadata aggregation workflow is the storage in a repository of all the new metadata records that have successfully passed the deduplication and URL checking steps. They are stored on the file system, where they are organized by sets. This consolidated metadata store is exposed through a web server so that records can be easily accessed online. A typical URL is of the form e.g. /LOM/GREENOER/12345.xml. This step also includes publishing the metadata through standard protocols and APIs, one supported by the repository and the other by a stand-alone web application.

In order to provide a friendly way to access the aggregated and processed metadata, a RESTful API allows several search options over the indexed metadata records (JSON files) following the internal format. Specifically, it allows users (or applications) to make the following types of queries:
1) Simple text-based search,
2) Searching within specific fields (metadata),
3) Fetching specific resources given an identifier,
4) A combination of text-based search with faceted search,
5) Filtering resources according to dates mentioned in specific fields, but not with date ranges.

Regarding the described data platform, one major issue is that high-quality mappings and transformations need to be defined and implemented by experts in order to integrate new types of resources. Such a procedure is costly in terms of the human resources and time required until a new data source is available through the index.

Foreseen architecture & implementation

The following figure depicts the foreseen general architecture. In the new architecture, the new discovery services will be set up on top of the existing data platform, enhanced by providing access to more metadata through a SemaGrow-powered search API. The green parts in the diagram correspond to SemaGrow revisions in the Data Platform for Agricultural Data Discovery.

Figure 2-9: Agricultural Data Platform supported by SemaGrow software stack

More specifically, by taking advantage of the project's technology we plan to enhance the data platform along the following dimensions:
- Provide users with the ability to access and reuse more resources of several types.
- Provide the capability to cover more information needs of users through APIs with higher expressivity.
- Have a provenance mechanism for filtering the origin of resources in the response to a query.
- Enhance the current data platform to be more robust and automatic:
   o Minimize the effort required for vocabulary and metadata alignment.
   o Interlink and find interesting relationships among ingested and non-ingested resources of several types.

Foreseen evaluation

The changes in the ADS demonstrator will affect the following evaluation stakeholders:
- Developers, either individuals or working for an SME, that want to develop data products for the agricultural and food sector using the data APIs that the SemaGrow-powered data platform will provide. They should evaluate the data platform and the Search end point (SPARQL) powered by SemaGrow to identify the main problems that they face when using a Search API to build a discovery application.
- Data scientists that want to use the data APIs of AK's data platform to develop and test new data processing algorithms and components using data exposed by the Data Platform.
- Trainers that seek training courses and educational resources and that want to create a training pathway related to food safety. They will be affected by the new SemaGrow-powered discovery applications that focus on finding information related to agricultural research and food safety, whose performance will change in terms of response time, accuracy of the search results and user experience, and also in the flexibility of the different queries to external sources.

- Domain specialists (e.g. agronomists, food safety experts) from organizations and institutes that seek scientific information related to agricultural research and food safety topics. They will be affected in the same way as the trainers.
- Organizations and institutes that want to set up a discovery service that can use heterogeneous data sources. They will be affected in the same way as the trainers.

2.4 Overview of the evaluation stakeholders

In this section, we discuss the evaluation of the SemaGrow demonstrators from the point of view of the people involved in the evaluation, the evaluation stakeholders. A stakeholder-focused approach is the main part of the SemaGrow evaluation methodology, involving the relevant community of users (educators, researchers and information officers). The involvement of stakeholders can also add to the overall recognition of, and participation in, SemaGrow activities, such as the technical testing and the feedback on user satisfaction. Furthermore, such community involvement during the pilot deployment and evaluation phase may also help to improve users' understanding of the impact that using large volumes of data may have on the sector.

It is useful to distinguish between direct and indirect stakeholders:
- Direct stakeholders in the design, pilot deployment and evaluation of SemaGrow Web Applications are the envisaged users and their organizations.
- Indirect stakeholders are other parties who take an interest in the development and provision of the SemaGrow infrastructure (e.g. national and European decision makers in research policies, research associations and others), or who may be indirectly affected by its future use (e.g. existing organizations that offer individual services also covered by SemaGrow).

Primarily relevant for the SemaGrow evaluation process are the direct stakeholders, i.e. the envisaged user communities whose engagement and feedback are decisive for the successful conduct and results of the evaluation activities. SemaGrow aims at providing advanced data services for agricultural data infrastructures. Among these communities we distinguish different evaluation stakeholders, described in the next paragraphs.

Researchers
These are scientists who are involved in developing agricultural models and in running models for large-scale research studies and policy assessments (agronomists, biologists, environmentalists, climate change experts etc.). They are interested in web applications that require input from large to very large and heterogeneous datasets, covering various types of content (agricultural data, soil, measures, environmental information, climate change details, geographical information, economic and statistical data). This type of user includes data scientists (mathematicians, statisticians) who are interested in using the agricultural models to obtain complex results by combining data from different types of data sources, e.g. the prediction of production under specific environmental conditions (climate and soil data) and the correlation of production with the prices of agricultural goods.

Developers
Developers, either individuals or working for an SME, that want to develop data products for the agricultural and food sector using the data APIs that the SemaGrow-powered data platform will provide.

Domain specialists
Agricultural specialists, consultants and domain experts (PhD students and professionals with agricultural knowledge), who are interested in using agricultural models and tools to analyse their research data or related data from other external sources. Data specialists are looking for data from various types of sources, such as bibliographic / academic data, educational material, genetic information, geographical details, economic / statistical data etc. This type of stakeholder includes pupils and students who are interested in accessing not only educational material, but also the whole set of educational activities (including the supporting material: handbooks, manuals, research papers, videos, lectures) of an educational / research pathway, e.g. the analytic method for nutrient analysis of grapes. End users of agricultural models may be agricultural specialists with business-oriented needs who aim to improve the innovative perspective of their business idea (innovative product or method).

Educators/Trainers
This category of users includes agriculture-related consultants and educators (i.e. trainers and extension workers). They work in an array of domains of research and education, including agriculture and food safety. They are interested in identifying and accessing current research and educational material (scientific papers, research data, training material), as well as in sharing their own. These user groups create new educational / training pathways with their own material or using data from external resources.

Data owners / curators and annotators of data
This group of users works on the organisation and integration of data / content around various agricultural research and education topics (e.g. land use, soils, water and other natural resources) and broader thematic fields (e.g. organic agriculture, sustainable agriculture, food security). Some examples include curators and annotators ("inter-linkers") of bibliographic resources, agricultural learning collections, genetic resources and geographical data.

Librarians and other institutional information managers
This group of users includes librarians and other managers of institutional or subject/domain-based content databases / repositories of universities and other research centres, publishers, public administrations and NGOs. This user group manages the content provided by researchers and educators and has a certain level of IT capacity, both in terms of technical infrastructure and skills. Some of them manage advanced digital libraries or data centres, while many more have only limited capacity and small content collections, but aim to connect to, and collaborate with, content / data sharing initiatives.

3. SemaGrow evaluation approach

3.1 Layered evaluation approach

The SemaGrow evaluation approach will be based on a layered evaluation framework. Evaluating such complex systems can be a challenging task. Recent studies have suggested the adoption of layered approaches in order to identify the components of a system that may affect its overall performance (Pu et al., 2012). Layered evaluation (or decomposition) frameworks have attracted research attention for more than a decade, with several frameworks, methods and instruments being proposed and tested in the relevant literature (Paramythis et al. 2010; Manouselis and Verbert, 2013; Manouselis et al., 2014). They decompose a system into its constituent subsystems or layers, which can be evaluated one by one, and then apply particular evaluation methods that can assess the performance of each targeted layer. Pu et al. (2012) have suggested that layered evaluation can be used as a powerful technique for identifying areas of a system that require further improvement.

A series of layered evaluation frameworks have been proposed in the system evaluation literature, advocating that each component of a system should be evaluated separately, in order to collect valuable feedback on the pros and cons of each part of the system. The idea can be traced back to the early 90s, when Totterdell & Boyle (1990) proposed that (i) the accuracy of the user model and (ii) the effectiveness of the changes (adaptations) made by adaptive systems should be evaluated separately. Ten years later, Karagiannidis & Sampson (2000) proposed the term layered evaluation, and suggested that the evaluation should address the main components of each system separately.
A similar layered framework was proposed by Weibelzahl (2001), with the decomposition of the adaptation into three layers: Evaluation of input data Evaluation of the inference mechanism Evaluation of the adaptation decisions On the other hand, Paramythis et al. (2010) further elaborated their decomposition of the layered evaluation framework by proposing five layers (or modules): Interaction monitoring Interpretation and interface Modelling Adaptation decision making Applying adaptations Additional approaches that adopted to some extent a layered- or component-based approach were also proposed by Herder (2003), Magoulas et al. (2003), and Tobar (2003). Reviewing the state of the art in related work, Paramythis et al. (2010) grouped together the main approaches and suggested the following main layers of adaptation: Collection of input data Interpretation of the collected data Modelling the current state of the world Page 33 of 61

34 Deciding upon adaptation Applying (or instantiating) adaptation They argued that these adaptation layers serve as the core components upon which evaluation can take place, aiming to isolate and evaluate separately, as many as possible given the particularities of a given system. 3.2 Towards a layered decomposition for SemaGrow Use Cases To illustrate how a layered de-composition can serve as a starting point for the development of a more concrete and practical evaluation framework, as proposed by Manouselis et al. (2014a and 2014b), we elaborate on the mapping an adapted version of the layers presented by Karagiannidis & Sampson (2001) to the components of SemaGrow use cases, in order to provide some generic principles and guidelines that the three SemaGrow use cases and the respective demonstrators could explore. More specifically, we focus on each layer and breakdown the interaction components to distinguishable elements. Then, using an existing analysis of each SemaGrow use cases to various dimensions (as it is described in the deliverable D2.2.2 Envisaged Applications & Use Cases), we further analyse the interaction components to more fine-grained sub-components. This analysis can down to the level of granularity, that the evaluation framework designers believe that it will provide meaningful results to the researchers. For each SemaGrow use case, there is the need to clarify the dimensions that have to be evaluated, using a similar approach that it was proposed for the recommendation systems (Manouselis & Costopoulou, 2007; Manouselis et al., 2014a). The high level analysis for all SemaGrow demonstrators is presented in Table 3-1, while a more detailed analysis per demonstrator is presented in Tables 3-2, 3-3 and 3-4. Based on this analysis, different evaluation methods and criteria are set for each evaluation layer of each demonstrator. This analysis is presented in section after an overview of the available appropriate evaluation methods. 
Figure 3-1 presents the mapping of the SemaGrow Architecture with the evaluation layers.

Table 3-1: Mapping the evaluation layers with the SemaGrow Architecture [26]

| Evaluation layers (Karagiannidis & Sampson, 2000) | Interaction Components (Pu et al, 2012) | SemaGrow Dimension (Objectives) |
|---|---|---|
| Interaction Assessment | Resource Presentation (Client) | Client front-end interaction; Client back-end interaction |
| Adaptation Assessment | Resource Discovery (SemaGrow Stack) | Indexing algorithms for efficient storage and retrieval; Query decomposition and rewriting; Schema alignment methods |

Figure 3-1: The SemaGrow Architecture and the evaluation layers
[Diagram labels: Interaction layer (Client); Adaptation layer (SemaGrow Stack, off-stack SemaGrow components); SemaGrow SPARQL endpoint; Resource Indexing; Query Decomposition; Alignment; Query Transformation; Query Manager and Execution Engine; Data Source #1 to Data Source #n]

Table 3-2: Mapping the evaluation layers with Trees4Future-AgMIP

| Evaluation layers | Interaction Components |
|---|---|
| Client Interaction Assessment | Client Usability / Performance |
| SemaGrow Stack Assessment | Search; Ability to select data sub-sets |

Table 3-3: Mapping the evaluation layers with AGRIS

| Evaluation layers | Interaction Components |
|---|---|
| Agricultural data crawled (SemaGrow Stack) | Client Usability / Performance |
| Agricultural data combinations / front-end application | Content integration |

Table 3-4: Mapping the evaluation layers with ADS

| Evaluation layers | Interaction Components |
|---|---|
| Agricultural Data Discovery front-end applications | Client Usability / Performance |
| Agricultural data aggregation and processing | Content integration |
| Agricultural data retrieval and publishing (SemaGrow Stack) | Search effectiveness; Usage of end points (Activation) |

4. Evaluation Methods and Tools for SemaGrow Demonstrators

In this section, we describe the methods, tools and evaluation metrics that will be used in the context of the SemaGrow pilot trials. We also explain the proposed experiments for evaluation.

4.1 Review of evaluation methods and tools

Evaluation methods can be either quantitative or qualitative in nature. Table 4-1 provides a summary of various testing and evaluation methods that allows for comparison (Holzinger, 2005, Matera et al., 2006, Rohrer, 2008, USINACTS, 1999). The selection of the evaluation methods to adopt was made by taking into account our requirements on evaluation, which cover the following aspects:
- Use of qualitative and quantitative methods in order to ensure an appropriate number of users (quantitative) and depth of involvement (qualitative);
- Consideration of the difference between opinions and actual behaviour (i.e. what users say about the three tested service demonstrators vs. what they actually do with them); and
- Consideration of different contexts of actual use: i.e. evaluation with selected users in the lab environment versus open, online use of tools and services.

The assessment of which methods should be selected took account of the use cases described in deliverable D2.1.2 Envisaged Applications & Use Cases. More use cases will be described in the next version of the present deliverable in order to define the pilot specifications for each demonstrator. Thus, different dimensions have been considered in the selection of the set of evaluation methods. In the table below the selected methods are highlighted (in grey) and briefly described.

Evaluation methods used in IT projects include experiments, interviews, surveys, observations, focus groups, etc. Table 4-1 provides a summary of the properties of each method reviewed in the USINACTS guideline (USINACTS, 1999), in order to compare them and choose the most appropriate ones for the SemaGrow requirements in every evaluation phase.

The SemaGrow controlled pilot trials will be based on structured interviews, usability evaluation, surveys (on-site and on-line questionnaires) and input logging, which support the aims of the evaluation methodology.

Table 4-1: Evaluation methods (source: USINACTS, 1999)

| Method | Lifecycle Stage | Users | Main Advantage | Main Disadvantage |
|---|---|---|---|---|
| Experiments | Components design (hardware or software). Establishing generic principles for system design. | Usually few, but depends on complexity | It allows testing design hypotheses or alternatives in an optimal way. | Complex techniques involved, which require expert knowledge for maximum benefit. Usually made in the usability laboratory, and not in the real use environment. |
| Interviews | User requirements. Task analysis | 5 | Flexible, in-depth attitude and experience probing. | Time consuming. Hard to analyse and compare. |
| Observation | Task analysis. Usability testing | Several (>3) | It is made in the real use environment. | Very costly. Difficult to analyse, and to know the reasons for behaviour. |
| Usability testing | Early design, "inner cycle" of iterative design | None (it is made by experts) | Finds out individual usability problems. Can address expert user issues. | Does not involve real users, so does not find "surprises" relating to their needs. |
| Focus groups | User group feedback | < 10 / group | Spontaneous reactions and group dynamics. Allows finding out opinions or factors to be incorporated in other methods (e.g. surveys). | Hard to analyse. Low validity. |
| Input logging (Web analytics) | Final testing, follow-up studies | At least 20 | Finds highly used (or unused) features. Can run continuously. | Analysis programs needed for huge mass of data. Violation of users' privacy must be prevented. |
| Surveys (User Feedback) | Follow-up studies. Also for user requirements. | Hundreds | Tracks changes in user requirements. Analysis of users' opinion of the working system in its real environment. | Special organization needed to handle replies. |

4.2 Evaluation methods and tools selected for SemaGrow demonstrators

Overview of the SemaGrow evaluation

The evaluation process will assess the three service demonstrators built on top of the semantic store infrastructure. It will take place in different phases and involve different stakeholder / user groups. The evaluation will take place in five (5) different phases, including a number of evaluation activities:
- Deployment I - First functional version of the integrated SemaGrow components
- Controlled pilot trials - Cycle I
- Deployment II - Refinement and alignment of the second integrated version
- Controlled pilot trials - Cycle II

The following figure (Figure 4-1) provides a first overview of important elements of the SemaGrow evaluation experiments. We would like to highlight that FAO, ALTERRA and AK have strong experience in their fields, and that their knowledge and previous experience in the evaluation of ICT systems, tools and services will contribute greatly to the professional execution of the testing and evaluation activities.

The following tables present a mapping of the evaluation methods and metrics to be used for each evaluation layer of each specific demonstrator, based on the initial analysis presented in Section 3.

Table 4-2: Mapping Trees4Future-AgMIP with evaluation methods and metrics

| Evaluation layers | Interaction Components | Evaluation Metric | Evaluation Method |
|---|---|---|---|
| Client Interaction Assessment | Client Usability / Performance | Usability and User satisfaction | Usability testing and User satisfaction |
| SemaGrow Stack Assessment | Search | Correctness | Experts testing, Controlled Physical Trial, Online Controlled Trial |
| SemaGrow Stack Assessment | Search | Completeness | Experts testing |
| SemaGrow Stack Assessment | Search | Ranking Accuracy | Experts testing, Controlled Physical Trial, Online Controlled Trial |
| SemaGrow Stack Assessment | Ability to select data subsets | Correctness | Experts testing |
| SemaGrow Stack Assessment | Ability to select data subsets | Completeness | Experts testing |


Figure 4-1: Overview of tasks for pilot evaluations, partners responsible, and evaluation activities

Controlled pilot trials

The controlled pilot trials will take place at partners' sites, where selected groups of stakeholders will be invited to test and evaluate the SemaGrow service demonstrators and provide their feedback. This task will be organized and run with selected users who belong to the different communities being considered. The group of users (between 10 and 20) will give feedback on how they can overcome their data problems by using the SemaGrow-enhanced demonstrators. As discussed earlier, two rounds of controlled pilot trials have been planned:
- Controlled pilot - Cycle I: This pilot will take place immediately after the first deployment of the SemaGrow integrated tools, giving input for the second SemaGrow integration.
- Controlled pilot - Cycle II: After the second integrated version of the SemaGrow tools, the second phase of controlled pilots will take place in order to ensure a realistic vision of how the SemaGrow results may be deployed in real-life environments. This phase will follow an iterative approach.

Table 4-3: Mapping AGRIS with evaluation methods and metrics

| Evaluation layers | Interaction Components | Evaluation Metric | Evaluation Method |
|---|---|---|---|
| Agricultural data crawled (SemaGrow Stack) | Client Usability / Performance | Usability; Speed | Usability testing, Observation, Experiments |
| Agricultural data combinations / front-end application | Content integration | Correctness; Completeness | Surveys, Interviews, Website feedback |

Results will be collected from each pilot trial and analysed in an integrated report that will provide recommendations for the further improvement of the SemaGrow components and ideas for the possible deployment of the demonstrators under real-life conditions. Due to the diversity of the demonstrators and the respective evaluation stakeholders, different types of pilot trials will be used, depending on each specific case.

Experts testing

A number of developments in the demonstrators can only be tested with a small number of experts, who will offer their expertise and examine the demonstrator extensively. One such case is the back-end of the ADS demonstrator, which deals with agricultural data aggregation and processing. Only experts can evaluate it, and they are few in number and have limited availability. In this case a physical pilot trial with multiple evaluators cannot take place. This approach will be used in other demonstrators as well, according to the needs of each specific case.

Physical trials

The controlled physical pilot trials will take place at partners' sites, where selected groups of stakeholders will be invited to test and evaluate the SemaGrow service demonstrators and provide their feedback. This task will be organized and run with selected users who belong to the different communities being considered. The group of users (between 10 and 20) will give feedback on how they can overcome their data problems by using the SemaGrow-enhanced demonstrators.
Table 4-4: Mapping ADS with evaluation methods and metrics

| Evaluation layers | Interaction Components | Evaluation Metric | Evaluation Method |
|---|---|---|---|
| Agricultural Data Discovery front-end applications | Client Usability / Performance | Search depth; Number of searches transformed to content access; Relevance; Precision/Recall; Speed; Ranking Accuracy | Domain experts testing, Controlled Physical Trial, Online Controlled Trial |
| Agricultural data aggregation and processing | Content integration | Time/effort needed to support a new data type; Data types supported by data platform | Data experts testing |
| Agricultural data retrieval and publishing (SemaGrow Stack) | Search effectiveness | Precision/Recall; Speed | Developers testing, Controlled Physical Trial |
| Agricultural data retrieval and publishing (SemaGrow Stack) | Usage of end points (Activation) | Number of active registered developers | Developers testing, Controlled Physical Trial |
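The Precision/Recall metric that appears in the table above can be computed directly from logged query results. The following is a minimal sketch, assuming that for each test query the evaluators hold a set of record identifiers judged relevant and the list of identifiers returned by a demonstrator. The function name and the sample identifiers are illustrative assumptions and are not part of the SemaGrow tooling.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    returned: list of result identifiers from a demonstrator (hypothetical).
    relevant: set of identifiers that experts judged relevant (hypothetical).
    """
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    # Guard against empty inputs so the function never divides by zero.
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four returned records, three judged relevant.
returned = ["doc1", "doc2", "doc3", "doc4"]
relevant = {"doc1", "doc3", "doc5"}
precision, recall = precision_recall(returned, relevant)
```

Averaging these two values over the full set of predefined test queries would give one figure per demonstrator and deployment, which makes the two pilot cycles comparable.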

Interviews

In addition to the controlled physical pilot trials engaging groups of users, we will also conduct one-to-one meetings that will include a presentation of the demonstrator, a hands-on session that will allow us to observe the user, and an interview to collect more qualitative results. Engaging a single user each time will allow us to gather higher-quality feedback and to address issues related to the availability and flexibility of the users.

Online trials (experiments)

Online pilots will give the designers of each SemaGrow use case the opportunity to measure changes in users' behaviour when they interact with more than one demonstrator, or when they pose queries that require more time than is allocated in a controlled pilot (in the context of a half-day event). Additionally, an online experiment provides evidence that the candidate approaches are reasonable, which gradually reduces the risk of causing significant user dissatisfaction. The online evaluations will be conducted using the same methods and tools as will be developed for the controlled pilots. They are not obligatory, but offer an alternative way of evaluating a demonstrator that may be more suitable depending on the circumstances (e.g. to allow more flexibility for remote users to participate).

Hackathons

Hackathons are piloting events in which groups external to the consortium will use the SemaGrow Stack components and datasets in competitions (and benchmarking) for developing real-world applications. Such events provide the opportunity to verify that the SemaGrow results address not only the needs of the participating user partners, but also the needs of the Semantic Web community in general, and they also provide performance evaluation measurements in real-world applications.

The first hackathon already took place during the first year to test the datasets, and the 2nd will take place on 4-7 July. The 3rd will take place during the final phase of controlled pilot trials.

4.3 Potential implementation of controlled pilot trials

This section presents a possible implementation of each cycle of the controlled pilot trials for the demonstrators, consisting of three stages: (1) Introduction to Open Data in Agriculture, (2) Hands-on workshop and (3) Open event - Hackathon (Figure 5-1). Stage 1 and Stage 2 last one day each. Stage 3 is optional and can last around 2 days. The implementation of such a structured approach in the pilot trials is not obligatory, but it provides a way to combine Hackathons and the pilot trials with events that will help attract more stakeholders by offering them information that is of great value to them.


4.3.1 Introduction to Open Data in Agriculture

Figure 4-2: Piloting trials stages

Stage 1, Introduction to Open Data in Agriculture, provides a full introduction to the selected use cases for the specific controlled pilots. A series of presentations will be given to the participants in order to empower the stakeholder group and expose them to the real business opportunities and challenges in the agricultural value chain.

Figure 4-3: Piloting trials stage 1

This stage will present the importance of open data in agriculture and the opportunities that it offers. During this one-day event, organizers will identify the participants' business ideas that are related to the service demonstrator that will be presented and tested during the Stage 2 hands-on workshop. The intro day is a full-day event, based on the involvement of instructors with strong domain knowledge and strong business experience, and on blended activities with theoretical and hands-on sessions that define the data products that participants are interested in building on. Participants will be educators, trainers, researchers, data owners or information officers, depending on the type of pilot (Heterogeneous Data Collections & Streams, Reactive Data Analysis and Reactive Resource Discovery).

In more detail, a typical agenda for the workshop will include:
- Introduction to a data-powered agricultural business ecosystem (typical data collections, examples of service/application providers and representative end users)
- Overview of open data types, sources and sets for agriculture
- Case study: research data and examples of data challenges
- Case study: an agricultural data processing platform and hands-on with the selected use cases

Hands-on workshop

Stage 2 is the actual pilot trial event. It aims to present a tutorial on how to use the service demonstrator under test, and then to give each invited group of stakeholders (researchers, educators, data owners and information officers) an experimental session in which they can report their satisfaction with the service demonstrator powered by semantic technologies. The participants' feedback will be collected using specific evaluation tools, selected each time by the organizers (e.g. online questionnaire, interview).

Figure 4-4: Piloting trials Stage 2

The objective of Stage 2 is to familiarize all the stakeholders (10-20 participants) with the service demonstrator that is going to be evaluated each time. The second stage is a one-day event, including theory and hands-on:
- Theory session delivered by data experts from the implementing partners (FAO, ALTERRA, AK)
- Hands-on workshop with technology experts, in order to familiarize participants with the use case being tested each time (e.g. how to use the SEEMLESS integrated database)

In more detail, a typical agenda for the workshop will include:
- Introduction to the service demonstrator that is powered by semantic infrastructure
- Hands-on with the selected service demonstrator
- Feedback on user satisfaction

Hackathon event

The implementation of this event is not required for the evaluation of all demonstrators, but we include it here to support the teams that are interested in it. This stage consists of a Hackathon weekend focused on agricultural data-powered ideas / start-ups. It aims to attract top talent from the data-powered domain and participants from the previous two stages, and to connect them with data scientists and developers who are interested in using SemaGrow semantic technologies in the design of their ideas.
A typical Hackathon is organised during a weekend and focuses on the provision of working prototypes for one or more relevant technological problems. The pre-defined challenges (related to the objective of each selected use case) should be well introduced to the participants, including the actual business background and the user need that each challenge tries to cover. Following that, participating teams undertake the task of providing working solutions for each challenge within the tight timeframe provided.

Figure 4-5: Piloting trials stage 3

As a key success factor for the organisation of this stage, organisers should invest in good dissemination in order to attract the most appropriate candidates and ensure a rich pipeline of talent for the rest of the process. The dissemination effort is most effective when it addresses the participants in a targeted way (universities, start-up associations, etc.). The challenges, which are related to the selected use cases, should be relevant and inspiring, and should have clear impact potential for the agricultural research community and for business start-ups with innovative ideas. The typical technology start-up audience is strongly attracted to meaningful challenges.

The participants in these types of events can be either existing start-ups or teams of programmers / technologists who have not officially formed a company and are interested in testing and exploring the SemaGrow use cases. Organizers may also bring additional individual participants into the teams. Other types of audience are also possible, such as data scientists and domain experts with a special interest in analysing data from agricultural and related research fields.

One or more meet-ups before or during the first day of the hackathon event will contribute to the success of the Hackathon. Usually, meet-ups are a series of contextualised events, each hosted at a specific agricultural company, that visit typical examples of the agricultural industry related to the use case to be tested during the pilot event.

The key objective of the meet-ups is to expose start-ups to the real expressions of each presented use case, facing real-world stakeholders. The key success factors are the careful selection of the hosting companies and/or the invitation of appropriate keynote speakers.

5. Demonstrators development and evaluation plan

5.1 Evaluation timeframes and deliverables

The evaluation plan includes the schedule of all the corresponding SemaGrow activities for the proper evaluation of the SemaGrow integrated components. Each evaluation aspect should be considered independently, taking into consideration: a) the expected deployment time of the SemaGrow integrated components, and b) the nature of each evaluation step, including details on the methods that will be used. The evaluation plan is divided into five different phases that are presented in the table below (months refer to project month). Each phase is linked to a number of tasks that should be undertaken with certain methods and tools. Based on the recommendations of the reviewers from the 1st review meeting, the second cycle of pilot trials will follow an iterative approach with smaller cycles, depending on the demonstrator.

The results of testing and controlled pilots will be integrated into the integrated evaluation report, aimed at supporting further development, improvement or refinement of the integrated services and tools. Table 5-2 presents the evaluation-related products that need to be delivered during the project and their respective deadlines (months refer to project month).

Table 5-1: Summary of planning, implementation and evaluation of SemaGrow demonstrators

| Phases | Tasks | Tools | Start Month | End Month |
|---|---|---|---|---|
| Piloting Plan | First version of the evaluation plan, documenting the plan for the development of the three (3) demonstrators and the methodology and materials for the pilot trials. | - | M7 | M12 |
| Piloting Plan | Second version of the evaluation plan, documenting the plan for the development of the three (3) demonstrators and the methodology and materials for the pilot trials. | - | M14 | M19 |
| Pilot Deployment | First functional version of the integrated SemaGrow components | Test cases | M19 | M21 |
| Controlled Pilots - Cycle I | First controlled pilot will provide input for the second integrated SemaGrow platform | Interviews, Questionnaires, Input logging | M22 | M24 |
| Pilot Deployment II | Second functional version of the integrated SemaGrow components | Test cases | M22 | M27 |
| Controlled Pilots - Cycle II | Second cycle of pilot trials. Iterative approach | Interviews, Questionnaires, Input logging | M28 | M30 |

Table 5-2: List of SemaGrow deliverables related to demonstrators evaluation

5.2 Trees4Future-AgMIP demonstrator

5.2.1 Development plan

The following stepwise approach will be followed for the implementation of the Trees4Future-AgMIP demonstrator.

Phase 1: Transform the Trees4Future interface into a Trees4Future-AgMIP demonstrator working on top of the SemaGrow infrastructure, serving Trees4Future data and metadata:
a. Adaptation of the Trees4Future server component to query the SPARQL endpoint of the SemaGrow infrastructure;
b. Provide access to Trees4Future metadata through a SPARQL endpoint on the Trees4Future triple store;
c. Transfer the parts of the Trees4Future ontology required by the SemaGrow infrastructure to effectively query the Trees4Future SPARQL endpoint.

Phase 2: Extend the SemaGrow Trees4Future-AgMIP demonstrator with query functions on Trees4Future data content (using the same end user interface):
a. Design and implementation of additional / extended SPARQL queries that support (1) the querying of datasets based on criteria related to the data and (2) the selection of a subset of data based on criteria related to the data;
b. Extending the Trees4Future ontology in the SemaGrow infrastructure with the semantics required to effectively query the Trees4Future data;
c. Development of a data conversion component translating the result set (RDF data) into a usable (spatial) format, e.g. WCS, WFS (or NetCDF, ESRI shape if required).

Phase 3: Add AgMIP data and search and discovery support for AgMIP:
a. Connection to the triple store for AgMIP (meta)data as implemented in WP2;
b. Set up an ontology to support effective data selection from AgMIP data sources, including parts of AGROVOC and the standardized variable list offered through the ICASA version 2.0 data standards;
c. Extending SemaGrow ontologies with the semantics required to effectively query the AgMIP (meta)data.
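Step 1a amounts to having the server component send its queries to a SPARQL endpoint. As a hedged illustration only, the sketch below builds such a request following the SPARQL 1.1 Protocol convention of passing the query text in the `query` parameter of an HTTP GET request. The endpoint address, the use of Dublin Core titles and the helper name `build_query_url` are assumptions made for this example; the actual SemaGrow endpoint and the Trees4Future vocabulary are not specified here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/semagrow/sparql"

# Illustrative query: list up to ten datasets with their titles.
QUERY = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title
WHERE { ?dataset dct:title ?title }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The resulting URL can be fetched with any HTTP client, typically with an `Accept` header such as `application/sparql-results+json` to request JSON results.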

The deployment of the first version and the execution of the associated pilot trial for the demonstrator will include the full implementation work of phase 1 as well as some of the work performed for phase 2. The phase 2 functions included in the first pilot trial will be limited to those required to show basic (big) data querying. The second deployment and pilot trial will be performed on the full demonstrator, including the implementation work from phases 2 and 3. The time planning of the implementation of the Trees4Future demonstrator and its pilot trials is given in the table below.

Table 5-3: Implementation planning for the Trees4Future-AgMIP demonstrator

| Phase / Task | Delivery Month | Deployment | Partners |
|---|---|---|---|
| 1a - Adaptation of the Trees4Future server component | M22 | 1st | DLO |
| 1b - Provide access to Trees4Future metadata through SPARQL endpoint | M22 | 1st | DLO, NCSR-D |
| 1c - Transfer parts of the Trees4Future ontology required by the SemaGrow infrastructure | M22 | 1st | NCSR-D, DLO |
| 2a - Design of additional / extended SPARQL queries on datasets | M24 | 1st, 2nd | DLO, UAH |
| 2b - Extending current Trees4Future ontologies | M27 | 1st, 2nd | DLO |
| 2c - Development of a data conversion component | M27 | 2nd | UAH, DLO |
| 3a - Connection to triple store for AgMIP (meta)data | M27 | 2nd | NCSR-D, DLO |
| 3b - Set up of an ontology to support effective data selection from AgMIP data sources | M27 | 2nd | DLO, UNITOV |
| 3c - Extension of current ontologies with semantics for AgMIP | M27 | 2nd | DLO, UNITOV |

Evaluation plan

The evaluation plan for the Trees4Future-AgMIP demonstrator is fully aligned with the implementation plan described in the previous paragraph. Thus, the first controlled pilot trial will focus on the Trees4Future community and end users, evaluating similar functionalities over both the Trees4Future application and the SemaGrow demonstrator, as well as some new, data-oriented queries demonstrating SemaGrow big data query capabilities. The second controlled pilot trial will also focus on the AgMIP user community and on evaluating the full set of offered functionalities from their perspective.

1st Controlled Trial

The first controlled pilot trial focusses on the Trees4Future user community and the evaluation criteria that are most relevant for that community. Specification of these criteria concentrates on (1) evaluating specific Trees4Future queries against metadata versus their SemaGrow demonstrator analogues and (2) evaluating some basic data-oriented queries against current (manual or semi-automated) procedures. The 1st controlled pilot trial will be performed as an off-line experiment with a limited group of (3-5) Trees4Future users. It will consist of:
- Evaluating a set of pre-defined queries on the Trees4Future metadata against both the Trees4Future application and the SemaGrow demonstrator. This will be a list of 5-10 predefined queries covering thematic, spatial, temporal and combined queries.
- Evaluating a limited and pre-defined set of queries on the data content of Trees4Future. Since the query results will not be processed into a user format (this functionality is planned for the 2nd pilot deployment), limited evaluation will be performed, focussing on performance and correctness of the results.

Table 5-4: SemaGrow Trees4Future-AgMIP 1st Pilot Trial

| Evaluation Component | Objects of evaluation | Evaluation methods |
|---|---|---|
| Metadata queries | 5-10 predefined queries on Trees4Future metadata | Correctness / Completeness: Objective / quantitative assessment of returned datasets against the pre-assessed expected output datasets |
| Metadata queries | 5-10 predefined queries on Trees4Future metadata | Performance: Objective / quantitative comparison of SemaGrow end user query performance with the pre-assessed performance of the Trees4Future application |
| Metadata queries | 5-10 predefined queries on Trees4Future metadata | User experience: User questionnaire, assessing general and query-specific opinions regarding demonstrator functionality and behaviour |
| Data queries | 3-5 predefined queries on Trees4Future data | Correctness / Completeness: Objective / quantitative assessment of returned data against the pre-calculated expected output data |
| Data queries | 3-5 predefined queries on Trees4Future data | User experience: User questionnaire, assessing general and query-specific opinions regarding demonstrator functionality and behaviour |
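The objective assessments planned for the first pilot trial could be scripted along the following lines. This is a sketch under stated assumptions, not the project's evaluation tool: it treats a query result as a set of dataset identifiers, reports which expected items are missing and which returned items are unexpected, and compares measured response times with a pre-assessed baseline. All function names, identifiers and timings are hypothetical.

```python
from statistics import median


def compare_result(returned, expected):
    """Compare returned identifiers with the pre-assessed expected output."""
    returned, expected = set(returned), set(expected)
    return {
        "missing": sorted(expected - returned),     # completeness problems
        "unexpected": sorted(returned - expected),  # correctness problems
        "exact_match": returned == expected,
    }


def relative_performance(demo_times, baseline_times):
    """Ratio of median response times; values above 1 mean slower than baseline."""
    return median(demo_times) / median(baseline_times)


# Illustrative values only (identifiers and timings in seconds).
report = compare_result(["ds1", "ds2", "ds4"], ["ds1", "ds2", "ds3"])
ratio = relative_performance([1.2, 1.0, 1.4], [2.0, 2.4, 2.2])
```

Using the median rather than the mean keeps a single slow run, for example one caused by a cold cache, from dominating the comparison.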

2nd Controlled Trial

The second controlled pilot trial will include the AgMIP user community and its specific evaluation criteria, and will focus on the functions and additional (AgMIP) data added to the Trees4Future-AgMIP demonstrator for the 2nd deployment. Specification of the criteria concentrates on evaluating specific AgMIP queries on metadata and data-oriented queries against current (manual or semi-automated) procedures. The 2nd controlled pilot trial will be performed as an off-line experiment with a selected group of (5-10) users from the AgMIP and the Trees4Future communities.

Table 5-5: SemaGrow Trees4Future-AgMIP 2nd Pilot Trial

| Evaluation Component | Objects of evaluation | Evaluation methods |
|---|---|---|
| Metadata queries | Predefined queries on both AgMIP and Trees4Future metadata | Correctness / Completeness: Objective / quantitative assessment of returned datasets against the pre-assessed expected output datasets |
| Metadata queries | Predefined queries on both AgMIP and Trees4Future metadata | Performance: Quantitative and qualitative assessment of SemaGrow end user query performance. Quantitative assessment by comparison of required time against time required for (semi)manual data processing. User questionnaire, assessing performance experience by AgMIP users |
| Metadata queries | Predefined queries on both AgMIP and Trees4Future metadata | User experience: User questionnaire, assessing general and query-specific opinions regarding demonstrator functionality and behaviour |
| Data queries | 5-10 predefined queries on AgMIP and Trees4Future data | Correctness / Completeness: Objective / quantitative assessment of returned data against the pre-calculated expected output data. Evaluation of correctness of format(s) of returned datasets by opening / processing with one or more selected tools |
| Data queries | 5-10 predefined queries on AgMIP and Trees4Future data | User experience: User questionnaire, assessing general and query-specific opinions regarding demonstrator functionality and behaviour |

The offline experiment will consist of:
- Evaluating a set of pre-defined queries on AgMIP and Trees4Future metadata. A list of predefined queries covering thematic, spatial, temporal and combined queries will be evaluated. The Trees4Future related queries will be evaluated against both the Trees4Future application and the SemaGrow demonstrator. AgMIP queries will be qualitatively evaluated through a user questionnaire.
- Evaluating a limited pre-defined set of queries on the data content of both AgMIP and Trees4Future. The evaluation will again focus on performance and correctness of the results, but will specifically also evaluate the usability of the returned dataset by importing and testing the delivered format(s) against selected tools.

5.3 AGRIS

5.3.1 Development plan

The implementation of the AGRIS demonstrators will require the following phases:

Phase 1: Development of the Web crawler environment, which includes the automatic tagging tool (AgroTagger):
1.1. Customization and deployment of a Web Crawler (e.g. Apache Nutch);
1.2. Adaptation of the AgroTagger to work with the Web Crawler output;

Phase 2: Development of a LOD environment, which includes the AGRIS core database and the output of the process described in Phase 1:
2.1. Execution of the Web Crawler + AgroTagger to generate a big set of triples (the crawler database);
2.2. Storage of the AGRIS core database and the crawler database in a triplestore provided by SemaGrow (the triplestore and its physical location have still to be defined);

Phase 3: Discovering meaningful combinations between the AGRIS core database and the crawler database (and other SemaGrow databases, like DLO):
3.1. A domain expert will define possible combinations of databases in natural language. This domain expert has not yet been identified;
3.2. SemaGrow partners, with the coordination and direct contribution of UAH, will provide processing tools, which include the translation of the queries identified in point 3.1 into SPARQL or other machine languages and the generation of results;
3.3. Resulting triples will be stored in a triplestore;

Phase 4: Extend the AGRIS Web portal with the output of Phase 3, so that users can find relevant data and take better decisions related to the agricultural domain and food security:

4.1. A new widget, based on the output of Phase 3, will be added to the AGRIS Web portal, so that users will find meaningful resources related to the information provided by the AGRIS Web portal.

The deployment of the first version of the demonstrator and the execution of the associated pilot trial will include the full implementation work of Phase 1 and Phase 2. Phase 2 requires additional work for SemaGrow partners to share a physical or virtual server on which to set up a triplestore and store the output of the crawler, as well as a copy of the AGRIS core database.

The second deployment and pilot trial will be performed on the full demonstrator, including the implementation work from Phases 3 and 4. To complete Phase 3, it is still necessary to identify a domain expert who can define useful combinations between databases; moreover, technical work to compute the combinations is needed.

The time planning of the implementation of the AGRIS demonstrator and its pilot trials is given in Table 5-6.

Table 5-6: Implementation planning for the AGRIS demonstrator

Phase / Task | Delivery Month | Deployment | Partners
1.1 Customization and deployment of a Web Crawler | M21 | 1st | FAO
1.2 Adaptation of the AgroTagger to work with the Web Crawler output | M21 | 1st | FAO
2.1 Execution of the Web Crawler + AgroTagger to generate a big set of triples (the crawler database) | M22 | 1st | FAO
2.2 Storage of the AGRIS core database and the crawler database in a triplestore provided by SemaGrow (the triplestore and its physical location have still to be defined) | M24 | 1st, 2nd | UAH, NCSR-D, IPB (not yet completely defined)
3.1 Domain expert to define combinations | M25 | 1st, 2nd | UAH, NCSR-D, AK (not yet completely defined)
3.2 Technical processing to discover combinations | M27 | 2nd | UAH, NCSR-D, FAO
3.3 Generate triples | M27 | 2nd | UAH, NCSR-D, IPB (not yet completely defined)
4.1 Widget in the AGRIS portal | M30 | 2nd | FAO

5.3.2 Evaluation plan

Two aspects of the demonstrator will have to be evaluated: on the one hand, the efficiency of the data enhancement phase; on the other hand, the improved experience of the end user, which calls for a more user-oriented evaluation.

1st Controlled Trial

The first controlled pilot trial focuses on the AGRIS data technology and, in particular, on the performance and usability of the SPARQL endpoint that results as output of the crawler. The criteria concentrate on evaluating the speed and performance of the triplestore instance and the usability of the data contained in it. Evaluation methods for this trial are usability testing, observation, and experiments. This means:
- Run SPARQL queries against the SPARQL endpoint to evaluate performance;
- Run combinations of SPARQL queries to combine this triplestore with the AGRIS core database and comment on performance;
- Observe the behaviour of the infrastructure over the short term (for instance, logging any downtime of the system).

2nd Controlled Trial

The front-end AGRIS demonstrator will be evaluated by end users, such as in-house agricultural information officers and external users of AGRIS. This involves the following evaluation stakeholders: researchers, domain specialists, librarians. What needs to be evaluated in this phase is the final widget that will be available in the AGRIS portal: for each AGRIS record, a widget will show related information discovered by meaningful combinations between the AGRIS core database, the crawler database, and other SemaGrow databases like DLO. We plan on using both interviews and surveys, administered face to face and remotely (surveys reachable from the AGRIS website).
Surveys and interviews will focus on user satisfaction with respect to the amount of data accessed by the demonstrators, their relevance to the user's information needs, and their relevance to each AGRIS record to which they are interlinked.

5.4 Agricultural Discovery Space

The main goal of the evaluation approach for the Agricultural Data Discovery use case will be to evaluate both the data platform layer and the front-end discovery applications. We will follow the lean methodology principles to validate the new version of the data platform and the discovery applications. The validation focuses on both the problem and the solution. The main steps of this methodology are depicted in the following figure.


Figure 5-1: Steps of the overall validation approach

More specifically, an iterative process will be followed that includes the following steps:
- problem understanding, which can be conducted using interviews with the real users;
- solution definition, which will be validated with real users, e.g. a new SemaGrow-powered component in a discovery app;
- qualitative validation of the new solution with interviews;
- quantitative validation using metrics that will be based on logs and analytics.

For the quantitative evaluation the following tools will be used:
- Actionable metrics rather than simple metrics. These are metrics that tie specific and repeatable actions to observed results, for instance the number of queries in a discovery application that led to views of specific resources.
- Funnel reports, e.g. to check how many visits to the APIs or the discovery app are converted into usage of the Search API.
- Cohort analysis to study the long-term effects of the improvements. In the case of API usage, the evolution of the metrics can be studied across the different hackathon events that will be held in the context of SemaGrow. An example of a cohort analysis diagram is presented in Figure 5-2.

Development plan

In the case of the ADS demonstrator, there will be two separate development phases:

Phase 1: Develop the enhanced data platform, which will include the following components:
- an analytics tool in the API page to see how interested developers are in the API and which of them are using it. It is important to understand at which point they drop off and to ask them why this happens, e.g. incomplete documentation, low number of parameters in the API, architecture of the implemented API;
- a component that will wrap the existing data platform Search API so that it can be included in the SemaGrow end point. This component will be developed by the NCSR-D team;
- a component in the Search API that will identify developers, e.g. using an API key, so that they can be tracked;
- a component in the Search API that will track queries and combine them with timestamps and users.
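The query-tracking component and the funnel reports mentioned above can be illustrated with a small sketch. It assumes that each tracked event carries an API key, a timestamp, and an action label; the field names, the action labels, and the sample events are hypothetical, and the sketch does not describe the actual data platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    api_key: str    # identifies the developer (hypothetical field name)
    timestamp: str  # ISO 8601 string; stored but not used by this sketch
    action: str     # e.g. "visit_docs", "search", "view_resource"


def funnel_report(events, steps):
    """For each step, count the developers who performed it and all earlier steps.

    The order of events in time is ignored: a developer counts for a step
    as soon as the action appears anywhere in their log.
    """
    actions_by_key = defaultdict(set)
    for event in events:
        actions_by_key[event.api_key].add(event.action)
    remaining = set(actions_by_key)
    report = []
    for step in steps:
        remaining = {key for key in remaining if step in actions_by_key[key]}
        report.append((step, len(remaining)))
    return report


# Hypothetical events, for illustration only.
events = [
    LogEvent("key-a", "2014-06-01T10:00:00", "visit_docs"),
    LogEvent("key-a", "2014-06-01T10:05:00", "search"),
    LogEvent("key-a", "2014-06-01T10:06:00", "view_resource"),
    LogEvent("key-b", "2014-06-01T11:00:00", "visit_docs"),
    LogEvent("key-b", "2014-06-01T11:10:00", "search"),
    LogEvent("key-c", "2014-06-02T09:00:00", "visit_docs"),
]
report = funnel_report(events, ["visit_docs", "search", "view_resource"])
```

The drop between consecutive steps of the report indicates where developers stop, which is the point at which follow-up interviews would be most informative.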


Figure 5-2: Example of cohort analysis for a specific period

Phase 2: Develop the front-end discovery applications, which will include the following components:
- a component at the front-end discovery application that will consume the new SemaGrow-powered Search end point;
- components that will implement the new functionalities powered by the SemaGrow end point. Such functionalities will be the display of related resources at the view item page, the federated search over
- a component that will identify the users in the discovery application, so that user activities can be tracked in the analysis of metrics, e.g. whether users who performed a more complex query stayed longer in the discovery application;
- a component that will implement the A/B testing for the discovery interface. This script will randomly select either the SemaGrow-powered Search API or the GLN API. The search terms, API used, time, and user session should be stored in a NoSQL database, e.g. MongoDB. These logs will be used to estimate the metrics and will also be combined with simple metrics from Google Analytics, e.g. search depth.

It should be pointed out that the new components of the discovery application will be implemented after the validation of the problems with real users. The problem validation is planned to take place during June and July. The new version of the discovery applications will be iteratively developed and tested until the end of

Evaluation plan

Approach to evaluate the data platform's APIs

As regards the data platform, the Search end point (SPARQL) powered by SemaGrow will be evaluated by developers and data scientists during events such as hackathons. Such evaluation events will be used to:
- identify the main problems that developers face when they use a Search API to build a discovery application;
- evaluate the different versions of the SemaGrow-powered Search end point that will be integrated in AK's Data Platform.
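The A/B testing component listed under Phase 2 of the development plan could look roughly as follows. This is only a sketch under stated assumptions: the back-end labels and record fields are illustrative, the assignment is made per search with equal probability, and an in-memory list stands in for the NoSQL collection (e.g. MongoDB) in which the logs would be stored.

```python
import random
from datetime import datetime, timezone

BACKENDS = ("semagrow", "gln")  # illustrative labels for the two search APIs


def choose_backend(rng: random.Random) -> str:
    """Pick one of the two search back ends with equal probability."""
    return rng.choice(BACKENDS)


def log_search(store: list, session_id: str, terms: str, backend: str) -> dict:
    """Append one log record; the list stands in for the NoSQL collection."""
    record = {
        "session": session_id,
        "terms": terms,
        "api": backend,
        "time": datetime.now(timezone.utc).isoformat(),
    }
    store.append(record)
    return record


rng = random.Random(42)  # seeded only to make the sketch repeatable
logs = []
# Hypothetical search terms, for illustration only.
for i, terms in enumerate(["maize yield", "soil erosion", "forest biomass"]):
    log_search(logs, f"session-{i}", terms, choose_backend(rng))
```

Keeping the assignment and the logging in separate functions means the stored records can later be grouped by the "api" field to compare the two back ends on the same metrics.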
The main tool for the identification of the problems and the evaluation will be interviews with the real users. For the quantitative validation, the log files and analytics will be collected and analysed in
