META-SHARE : the open exchange platform Overview-Current State-Towards v3.0

Similar documents
META-SHARE: An Open Resource Exchange Infrastructure for Stimulating Research and Innovation

On the way to Language Resources sharing: principles, challenges, solutions

clarin:el an infrastructure for documenting, sharing and processing language data

META-SHARE metadata: Overview of the schema & Interoperability with other schemas

Language Resources. Khalid Choukri ELRA/ELDA 55 Rue Brillat-Savarin, F Paris, France Tel Fax.

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Open Language Resources & Meta-Resources: a Treasure and a Challenge for Linked Data

Introduction

EUDAT- Towards a Global Collaborative Data Infrastructure

Interoperability for electro-mobility (emobility)

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector

Data Management Plans. Sarah Jones Digital Curation Centre, Glasgow

Guide to depositing research in 02, the UOC s institutional repository SENDING PUBLICATIONS TO THE O2 REPOSITORY FROM GIR

EUDAT B2FIND A Cross-Discipline Metadata Service and Discovery Portal

Platform UI Specification (20)

Future Core Ground Segment Scenarios

Implementation of the Data Seal of Approval

European Open Science Cloud Implementation roadmap: translating the vision into practice. September 2018

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh

Platform UI Specification

Developing a Research Data Policy

NRF Open Access Statement

How can CLARIN archive and curate my resources?

Federation Operator Practice: Metadata Registration Practice Statement

Guide for depositing Final Bachelor's Degree Project Final Master's Degree Project- Practicum in O2, the UOC's institutional repository

INSPIRE status report

DATA MANAGEMENT PLANS Requirements and Recommendations for H2020 Projects. Matthias Razum April 20, 2018

Basic Requirements for Research Infrastructures in Europe

The Multilingual Language Library

Federation Operator Practice: Metadata Registration Practice Statement

Trust and Certification: the case for Trustworthy Digital Repositories. RDA Europe webinar, 14 February 2017 Ingrid Dillo, DANS, The Netherlands

Digital Preservation: How to Plan

National Data Sharing and Accessibility Policy-2012 (NDSAP-2012)

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication

PROVIDING COMMUNITY AND COLLABORATION SERVICES TO MMOG PLAYERS *

Machine Translation Research in META-NET

Open Access to Publications in H2020

DRIVER Step One towards a Pan-European Digital Repository Infrastructure

From Open Data to Data- Intensive Science through CERIF

OpenAIRE. Fostering the social and technical links that enable Open Science in Europe and beyond

Deliverable 6.4. Initial Data Management Plan. RINGO (GA no ) PUBLIC; R. Readiness of ICOS for Necessities of integrated Global Observations

BT Assure Cloud Identity Annex to the General Service Schedule

OpenAIRE From Pilot to Service The Open Knowledge Infrastructure for Europe

Checklist and guidance for a Data Management Plan, v1.0

Deliverable D10.1 Launch and management of dedicated website and social media

Workpackage WP 33: Deliverable D33.6: Documentation of the New DBE Web Presence

Federation Operator Practice: Metadata Registration Practice Statement

Data Management Checklist

Deliverable Initial Data Management Plan

The DART-Europe E-theses Portal

Company Overview SYSTRAN Applications Customization for Quality Translations

Pascal Gilles H-EOP-GT. Meeting ESA-FFG-Austrian Actors ESRIN, 24 th May 2016

Research Data Management

DURAFILE. D1.2 DURAFILE Technical Requirements

Nuno Freire National Library of Portugal Lisbon, Portugal

Rationale for the Evolution of the EUPL v1.1 (towards the EUPL v 1.2)

Building a Digital Repository on a Shoestring Budget

Promoting semantic interoperability between public administrations in Europe

MPEG-21 Current Work Plan

Collection Policy. Policy Number: PP1 April 2015

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Horizon 2020 Open Research Data Pilot: What is required? Sarah Jones Digital Curation Centre

Making Open Data work for Europe

Legal Issues in Data Management: A Practical Approach

IPR Issues (2/2) Standardisation initiatives around Digital Rights Management. IPR Issues. Multimedia content. Representation: Metadata

OpenAIRE From Pilot to Service

Implementation of the Data Seal of Approval

CORLI. a linguistic consortium for corpus, language and interaction

Helix Nebula The Science Cloud

Open Education Resource Repository (OERR) Rubric Developed by the BCOER Librarians Working Group

REACH-IT Stakeholder Workshop. REACH-IT Architecture

Research Infrastructures and Horizon 2020

End User Licence. PUBLIC 31 January 2017 Version: T +44 (0) E ukdataservice.ac.uk

Data is the new Oil (Ann Winblad)

Coordination with and support of LIDER

The challenge of collecting and evaluating LRs for commercial use

User s Guide Brother Software Licence Management Tool

VI-SEEM Data Repository. Presented by: Panayiotis Charalambous

INSPIRE & Environment Data in the EU

The Wikipedia XML Corpus

Session Two: OAIS Model & Digital Curation Lifecycle Model

Energy Data Innovation Network GA N

SHARING YOUR RESEARCH DATA VIA

Privacy-Enabled NFTs: User-Mintable, Non-Fungible Tokens With Private Off-Chain Data

Digital Platforms for 'Interoperable and smart homes and grids'

Service Description MA-CUG. Solutions. For SWIFT for Corporates

PROJECT FINAL REPORT. Tel: Fax:

D5.2 FOODstars website WP5 Dissemination and networking

McAfee Security Management Center

Best practices in the design, creation and dissemination of speech corpora at The Language Archive

Metal Recovery from Low Grade Ores and Wastes Plus

CLARIN s central infrastructure. Dieter Van Uytvanck CLARIN-PLUS Tools & Services Workshop 2 June 2016 Vienna

EC Horizon 2020 Pilot on Open Research Data applicants

Bringing Europeana and CLARIN together: Dissemination and exploitation of cultural heritage data in a research infrastructure

Erasmus+ Linguistic Support: Licence Management System for Beneficiaries User Guide 1.0

CEF e-invoicing. Presentation to the European Multi- Stakeholder Forum on e-invoicing. DIGIT Directorate-General for Informatics.

Towards a joint service catalogue for e-infrastructure services

Big Data, exploiter de grands volumes de données

Paving the Rocky Road Toward Open and FAIR in the Field Sciences

Transcription:

META-SHARE : the open exchange platform Overview-Current State-Towards v3.0 Stelios Piperidis Athena RC, Greece spip@ilsp.gr A Strategy for Multilingual Europe Brussels, Belgium, June 20/21, 2012 Co-funded by the 7th Framework Programme of the European Commission through the contract T4ME, grant agreement no.: 249119.

Introduction Data is the crude oil of today s research technology development On all language technology related lists (corpora-list / linguist-list / mt-list,..., requests from English-X parallel corpora to syntactically annotated corpora whether they are aligned, validated,... language identification in twitter streams Similar requests (domain-independent needs) articulated during all Vision Group meetings on the way to the SRA Data, need data, more data, big data, open linked data,... http://www.meta-net.eu 2

Introduction However, only a portion of language resources is known / announced / shared / traded /... But...data collection, cleaning, annotation, curation and maintenance is a very costly business As evidence from other domains (e.g. biotechnology, geodata, earth sciences) shows data and tools become valuable through opening and sharing Both for research and technology development Evaluation Supporting innovative applications http://www.meta-net.eu 3

META-SHARE rationale Language resources (data and tools) are dynamic living entities they evolve over time in various dimensions (quantity, annotation levels, converted to a new format, addition of new languages) they are usually the product of collaborative work they may come with varying restrictions,... Need solutions that enable every language resource provider, at any granularity level (individual/lab/organisation), to Create his own store of LRs Describe, document and update it Link to a network of other providers Keep track of the use of his LRs, trade LRs Need solutions that enable every language resource consumer to Discover what LRs suitable for his purposes exist Get information about, download or acquire them http://www.meta-net.eu 4

META-SHARE: what it is META-SHARE tries to match LR providers and consumers needs and expectations by enhancing visibility, documentation, identification, availability, preservation of language data and (basic language processing) tools It is an open, non-monolithic, expandable, secure exchange infrastructure for language data and tools for the Human Language Technologies domain A rather long-term multidimensional endeavour by which language resources can boost research, technology and innovation through wide availability, pooling, openness and sharing http://www.meta-net.eu 5

META-SHARE architecture META-SHARE portal User oriented and support services Registration authentication - authorisation Search / browse licence download statistics E xt er n al mappings META-SHARE inventory reporting recommenders META-SHARE inventory Billing / payment Resources provision services META-SHARE inventory re p os metadata harvesting Inventory Inventory Inventory Inventory LR repo LR repo LR repo LR repo http://www.meta-net.eu 6

META-SHARE provider side META-SHARE is a network of distributed repositories of LRs Local (organisation-based), and central repositories Facilities for documenting, updating descriptions, storing/ linking LRs Provider support services (forum, knowledge base) Each repository maintains an inventory with all LRs MD, exports MD for harvesting Harvested MD are stored in synchronised central servers http://www.meta-net.eu 7

Metadata-based descriptions of LRs http://www.meta-net.eu 8

Aim to support META-SHARE users (LRs providers and consumers) in all services provided LR description (creation, storage and editing) search and retrieval, browse, uploading & downloading resources metadata harvesting/updating, monitoring of LRs and related objects, etc. http://www.meta-net.eu 9

Metadata schema ontology entities described core entity: the LR satellite entities: related objects, e.g. - actor: persons and organisations involved, such as creators of resources, funders, distributors, etc. - document: reference documents, such as papers describing the resource, reports, tagset manuals, guidelines for LR production, etc. - project: projects that have funded the creation of an LR, or where an LR has been used, etc. - licence: used for the distribution of the LR http://www.meta-net.eu 10

Ontology excerpt http://www.meta-net.eu 11

LRs typology (1) mediatype: text, audio, image, video written corpora spoken corpora images (multimedia) videos (multimedia) 12

LRs typology (2) search for text written corpora spoken corpora images (multimedia) videos (multimedia) 13

META-SHARE user side Users (LR consumers) can search the central inventory browse using multiple facets access the actual resources by visiting the respective repositories to get legally interoperable licence (s) to download and use them get support through an online user forum and helpdesks dedicated to technical, metedata and legal issues access a knowledge base http://www.meta-net.eu 14

META-SHARE user support services Versions 2.0, 2.1 come with an online forum The purpose of the Forum is to : better manage the user support services monitor questions asked answered - pending avoid repetition (both in asking and answering) help crystallise correct/mostly acceptable answers to user questions transform them into a useful wiki/knowledge base http://www.meta-net.eu 15

Organisational and Legal Framework http://www.meta-net.eu 16

Charter MoU Consortium Agreement Constituent Documents Core service terms/ network structure Depositor s Agreement Repositories obligations Licences Terms of use of LRs

Legal provisions Language Resources Sharing Charter high level principles Memorandum of Understanding aka membership agreement Licensing templates and deposition agreements Inclusive mix of open and openness inspired models Creative Commons licences (starting with Creative Commons Zero (CC-0) and all possible combinations along the CC differentiation of rights of use) META-SHARE Commons licences, fully developed CC-based licensing tool that allows META-SHARE members to make their resources available inside the network only META-SHARE No Redistribution licences, allowing use and exploitation of the Resources while permitting the LR Owner to have full control over the Resource distribution. Software tools and web services are either provided though one of the standard Open Source licenses or under a custom commercial license. http://www.meta-net.eu 18

META-SHARE legal features Rights based on type of use rather than type of user Differentiation along the following axes Attribution or No Attribution Open share with everybody or within the network only Redistribution vs No Redistribution Commercial non commercial Derivative vs Non-Derivative Share alike Re-deposition of derivatives, as a soft norm in the membership agreement, to act as a driver for collaborative LR building http://www.meta-net.eu 19

...share inside META-SHARE? BY Commercial Derivatives SA MSC BY Y Y Y N MSC BYSA Y Y Y Y MSC BY NC Y N Y N MSC BY NC SA Y N Y Y MSC BY ND Y Y N N MSC BY NC ND Y N N N

...protect the original? Commercial Redistribution Derivatives Fee MS Commercial NoRed FF Y N Y Y MS Commercial NoRed Y N Y N MS Commercial NoRed NoDer FF Y N N Y MS Commercial NoRed NoDer Y N N N MS NonCommercial NoRed NoDer FF N N N Y MS NonCommercial NoRed NoDer N N N N MS NonCommercial NoRed FF N N Y Y MS NonCommercial NoRed N N Y N

Some remarks clear before sharing or opening use standard licences don t limit unless you need use METASHARE services

Network structure Associate members Third Party Consumers Depositing-only Members non-local repositories User services providing nodes local repositories

To sum up... META-SHARE software, open source, under a permissive licence (BSD), to set up a language resource repository Legal instruments catering for a range of uses Mapping services to big resource inventories Software-based services for both LR providers and LR consumers User support services User Forum helpdesks http://www.meta-net.eu 24

META-SHARE Repos v2.0 Resource types 29 4 corpus Media Type 18 1 586 650 lexicalconceptualres ource toolservice 504 789 text audio video image languagedescrip<on 30 36 38 27 28 25 21 Distribu3on per Language 20 293 503 English Spanish 40 41 76 202 204 391 French German Italian Chinese 99 Dutch http://www.meta-net.eu 25

In the month(s) to come Improved user and access rights management, catering for different roles in the resources lifecycle; production, maintenance, procurement Search engine optimisations Improved single resource view Recommendation services Fix usability bugs Data migration tools http://www.meta-net.eu 26

In the month(s) to come More META-SHARE nodes and respective language resources will be integrated Integration of ELRA supported initiatives, LRE Map, Language Library Adoption of the META-SHARE platform and framework by ELRA Mappings to other resource inventories, enhancing the information provision dimension Full deployment of the services of ELRA and members within the META-SHARE network from software availability and maintenance to language resources storage and preservation as well as legal support http://www.meta-net.eu 27

Q/A Thank you very much! office@meta-net.eu http://www.meta-net.eu http://www.facebook.com/meta.alliance http://www.meta-net.eu 28