ALIADA, an Open Source Solution to Easily Publish Linked Data of Libraries and Museums

Similar documents
Linked Data Evolving the Web into a Global Data Space

Linked Open Data in Aggregation Scenarios: The Case of The European Library Nuno Freire The European Library

Links, languages and semantics: linked data approaches in The European Library and Europeana. Valentine Charles, Nuno Freire & Antoine Isaac

Contribution of OCLC, LC and IFLA

datos.bne.es: a Library Linked Data Dataset

Linking library data: contributions and role of subject data. Nuno Freire The European Library

DBpedia Data Processing and Integration Tasks in UnifiedViews

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Europeana Creative. EDM Endpoint. Custom Views.

Semantic challenges in sharing dataset metadata and creating federated dataset catalogs

Enrichment, Reconciliation and Publication of Linked Data with the BIBFRAME model. Tiziana Possemato Casalini Libri

Linked Data and cultural heritage data: an overview of the approaches from Europeana and The European Library

When Semantics support Multilingual Access to Cultural Heritage The Europeana Case. Valentine Charles and Juliane Stiller

Europeana Creative. EDM Endpoint. Custom Views

The Local Amsterdam Cultural Heritage Linked Open Data Network

Mapping from Flat or Hierarchical Metadata Schemas to a Semantic Web Ontology. Justyna Walkowska, Marcin Werla

NOTSL Fall Meeting, October 30, 2015 Cuyahoga County Public Library Parma, OH by

Building a Linked Open Data Knowledge Graph Henning Schoenenberger Michele Pasin. Frankfurt Book Fair 2017 October 11, 2017

Europeana update: aspects of the data

Reducing Consumer Uncertainty

From strings to things

Fusing Corporate Thesaurus Management with Linked Data using PoolParty

Enhancing discovery with entity reconciliation: Use cases from the Linked Data for Libraries (LD4L) project

Linked data implementations who, what, why?

Mapping Relational data to RDF

Linked Data: What Now? Maine Library Association 2017

Programming Technologies for Web Resource Mining

COMP9321 Web Application Engineering

Utilizing Open Data for interactive knowledge transfer

Future Trends of ILS

Introduction to Metadata for digital resources (2D/3D)

RDA: Where We Are and

Linked Data Applications: There is no One-Size-Fits-All Formula

Semantiska webben DFS/Gbg

Publishing Statistical Data and Geospatial Data as Linked Data Creating a Semantic Data Platform

Transforming Our Data, Transforming Ourselves RDA as a First Step in the Future of Cataloging

RDF and RDB 2 D2RQ. Mapping Relational data to RDF D2RQ. D2RQ Features. Suppose we have data in a relational database that we want to export as RDF

A Semantic Web-Based Approach for Harvesting Multilingual Textual. definitions from Wikipedia to support ICD-11 revision

Semantic Integration with Apache Jena and Apache Stanbol

COMP9321 Web Application Engineering

A FRAMEWORK FOR MULTILINGUAL AND SEMANTIC ENRICHMENT OF DIGITAL CONTENT (NEW L10N BUSINESS OPPORTUNITIES) FREME WEBINAR HELD FOR GALA, 28 APRIL 2016

OXLOD Pilot Oxford Linked Data. 4 October OeRC

Global standard formats for opening NLK data. Adding to the Global Information Ecosystem

Why You Should Care About Linked Data and Open Data Linked Open Data (LOD) in Libraries

Cataloguing is riding the waves of change Renate Beilharz Teacher Library and Information Studies Box Hill Institute

An overview of RDB2RDF techniques and tools

(Geo)DCAT-AP Status, Usage, Implementation Guidelines, Extensions

Proposal for Implementing Linked Open Data on Libraries Catalogue

Using Linked Data Concepts to Blend and Analyze Geospatial and Statistical Data Creating a Semantic Data Platform

Oshiba Tadahiko National Diet Library Tokyo, Japan

Setting up a CIDOC CRM Adoption and Use Strategy CIDOC CRM: Success Stories, Challenges and New Perspective

The Creation of a Linked Data-based Application Service at the National Library of Korea

The UC Davis BIBFLOW Project

Semantic Web Fundamentals

Markus Kaindl Senior Manager Semantic Data Business Owner SN SciGraph

Enrichment of Sensor Descriptions and Measurements Using Semantic Technologies. Student: Alexandra Moraru Mentor: Prof. Dr.

RDF VISUALIZER. 3/4/ th CRM-SIG meeting M.Doerr, K. Doerr, K.Petrakis, L.Harami, N.Minadakis

Library of Congress BIBFRAME Pilot. NOTSL Fall Meeting October 30, 2015

How We Build RIC-O: Principles

Joining the BRICKS Network - A Piece of Cake

verapdf Industry supported PDF/A validation

Linked data for manuscripts in the Semantic Web

Blink Project: Linked Open Data for Countway Library Final report for Phase 1 (June-Nov. 2012) Prepared by Sophia Cheng

ITARC Stockholm Olle Olsson World Wide Web Consortium (W3C) Swedish Institute of Computer Science (SICS)

ITARC Stockholm Olle Olsson World Wide Web Consortium (W3C) Swedish Institute of Computer Science (SICS)

State of Bio2RDF. Marc-Alexandre Nolin François Belleau Peter Ansell Other Bio2RDF collaborators

Linked Data. Department of Software Enginnering Faculty of Information Technology Czech Technical University in Prague Ivo Lašek, 2011

Coordinating TMS Cataloging with data harvesting requirements. David Parsell Yale Center for British Art

Corso di Biblioteche Digitali

Role of Social Media and Semantic WEB in Libraries

THE GETTY VOCABULARIES TECHNICAL UPDATE

Building a federated science web one link at a time

Building the Multilingual Web of Data. Integrating NLP with Linked Data and RDF using the NLP Interchange Format

For each use case, the business need, usage scenario and derived requirements are stated. 1.1 USE CASE 1: EXPLORE AND SEARCH FOR SEMANTIC ASSESTS

Linked Data and RDF. COMP60421 Sean Bechhofer

KARMA. Pedro Szekely and Craig A. Knoblock. University of Southern California, Information Sciences Institute

DBpedia-An Advancement Towards Content Extraction From Wikipedia

Semantic Web Fundamentals

Re-using Cool URIs: Entity Reconciliation Against LOD Hubs

Database of historical places, persons, and lemmas

Building Consensus: An Overview of Metadata Standards Development

a paradigm for the Semantic Web Linked Data Angelica Lo Duca IIT-CNR Linked Open Data:

Report on the Linked Logainm project. Rebecca Grant Nuno Lopes Catherine Ryan

Architecture and Applications

EUROMUSE: A web-based system for the

W3C Workshop on RDF Access to Relational Databases October, 2007 Boston, MA, USA D2RQ. Lessons Learned

Sharing Archival Metadata MODULE 20. Aaron Rubinstein

PAD: A Semantic Social Network

Chinese Geo-Names Calculator A Linked Data Approach

Data integration perspectives from the LTB project

Corso di Biblioteche Digitali

Organizing Existing Metadata Terms and Structural Constraints to Support Metadata Schema Creation

Towards Open Innovation with Open Data Service Platform

Natural Language Processing with PoolParty

Multi-agent and Semantic Web Systems: Linked Open Data

COLLABORATIVE EUROPEAN DIGITAL ARCHIVE INFRASTRUCTURE

Interactive Knowledge Stack A Software Architecture for Semantic Content Management Systems

The CoBiS Linked Open Data Project and Portal

4 th Linked Data on the Web Workshop (LDOW 2011)

Semantic Interoperability of Basic Data in the Italian Public Sector Giorgia Lodi

Transcription:

Automatic Publication under Linked Data Paradigm of Library Data ALIADA, an Open Source Solution to Easily Publish Linked Data of Libraries and Museums ALIADA Project Consortium SWIB15, November 23-25, 2015 Hamburg

The challenge: why LOD in LM A global pool of shared data that can be re-used to describe resources will avoid the redundant effort of the current cataloging processes. The use of the Web and Web-based identifiers will make up-to-date resource descriptions directly citable by catalogers. Linked Data is more durable and robust than metadata formats that depend on a particular data structure. Developers will also no longer have to work with library-specific data formats (MARC, LIDO). With Linked Open Data, libraries can increase their presence on the Web, where most information seekers may be found. http://www.w3.org/2005/incubator/lld/wiki/benefits 2

The challenge: how to start Cataloguing data according international conceptual models and standards (FRBR, BIBFRAME, CIDOC-CRM, ) Exporting records to standard metadata schemes (MARC, LIDO or Dublin Core) and formats (XML) Selecting an ontology Converting MARC/LIDO/DC metadata to RDF statements Linking their own dataset to other datasets (one domain or multidomain) Publishing data as 5-star Linked Open Data 3

The challenge: how to start Librarians and curators are experts in cataloguing and making accessible their resources, but the don t know about Linked Data technology, so they need an ally 4

ALIADA, the ally to publish LODLM Open source Java application to automatically publish as Linked Data the metadata created by a library or museum management system Supported metadata types (types of datasets): bibliographic records, authority records, descriptions of museum objects and other information resources Compliant with MARC XML, LIDO XML and Dublin Core formats Conversion to RDF triples (mapping) according to the ALIADA ontology, mainly based on FRBRoo, SKOS, WGS84 and FoaF ontologies Linking to other datasets, such as Europeana, British National Bibliography, Spanish National Library, Freebase Visual Art, DBpedia, Hungarian National Library, Library of Congress Subject Headings, Lobid, MARC codes list, VIAF Virtual International Authority File or Open Library Automatic publication of dumps (URIs) and SPARQL Endpoint on DataHub 5

ALIADA, the ally to publish LODLM EU FP7- ENV-2012 Collaborative project 2013-2015 Partners: Art museums, libraries, ILS vendors, researchers on Semantic Web technology Final release: October 2015 Open source community expected ALIADA is free software, you can redistribute it and/or modify it under the terms of the GNU GPL v3 (License) 6

ALIADA, the ally to publish LODLM 1 st ws ALIADA 1.0 2 nd ws 1 st deployment ALIADA 2.0 Final ws 2 nd deployment M11 September 2014: 1 st Usability workshop M12 October2014: 1 st Prototype + opening to the community M15 January 2015: 1 st deployment M16 February 2015: 2 nd Usability workshop M23 September 2015: Final Usability workshop M24 October 2015: 2 nd Prototype & 2 nd deployment + opening to the community ALIADA 2.1 7

ALIADA, the ally to publish LODLM 1 st prototype (2014) User interface in Spanish and English. Validation of imported records (MARC Bibliographic and LIDO) Mapping templates (FRBRoo) RDF-izer (ALIADA ontology) Linking to some datasets: Europeana, BNB, BNE, Freebase, Dbpedia, NSZL, Geonames and MARC Code Lists (SPARQL endpoint) Linked data server creation + SPARQL endpoint + URIs dereferencing. Linked dataset validation: through a number of SPARQL queries. 8

ALIADA, the ally to publish LODLM 2nd prototype (2015) Translation of DublinCore XML and MARC XML Authorities to ALIADA ontology. Validation of RDF dataset consistency. NER (Named Entity Recognition) for some text free elements Ad-hoc linking to some of the listed external datasets, which do not provide a SPARQL endpoint such as VIAF, LOBID, Open Library and Library of Congress Subject Headings. Links disambiguation: the system offers to the user a set of possible ambiguous links: so the user can decide which links are correct and which ones should be rejected. Advanced URI de-referencing and creation of a web page for the generated dataset. Publication in CKAN of the created linked dataset + DataHub LOD validator. Translation of the user interface to Italian and Hungarian. ALIADA offers REST services in order to be integrated with other systems: ILS and CMS. 9

Apache Tomcat ALIADA, LOD main features Dublin Core Validation of Input Data Validation RDF dataset Links Disambiguation USER U s e r I n t e r f a c e MARCXML2RDF LIDO2RDF DublinCore2RDF translation RDF Consistency validation RDFizer URL rewrite rules Linked Data server Silk Discovery Framework Links Discovery Creation CKAN DataHub page Modules Configuration MySQL DB endpoint ALIADA ontology Graph1 endpoint Virtuoso RDF Store 10

ALIADA, LOD main features ALIADA RDFizer Scalability (RESTful application, Apache Camel asynchronous channels, JEE web application) Modularity (Conversion templates are configurable and extensible) Reusability (Standalone installation of the RDFizer) Easy to use/maintain (Conversion job is controlled by the socalled conversion templates, which are runtime-interpreted scripts very easy to maintains) Easy to extend (The library is free to create their conversion template, producing an arbitrary output format, with another ontology) Validation before the conversion (Jena OWL Micro Reasoner) 11

ALIADA, LOD main features NER of free text fields. A dedicated component for doing NLP (Natural Language Processing). Recognizes sequences of words in a text which are the names of things, such as person, company names and places. In LIDO it is applied to <lido:descriptivenotevalue xml:lang="en" lido:type ="physical-description">sculpture of Mozart</lido:descriptiveNoteValue > In MARC it is applied to 522 a : Geographic Coverage Note. 525 a : Supplement Note. 520 a : Summary, etc. 520 b : Expansion of summary note. The NER results are stored as RDF triples that enrich the owning records 12

ALIADA, LOD main features Linking to Datasets that provide SPARQL endpoint ALIADA dataset E39_Actor E21_Person F10_Person BNB dc:title F3_Manifestation_Product_Type Title E18_Physical_Thing Appellation E73_Information_Object Title E53_Place F9_Place wgs84:lat wgs84:long Place_Appellation Actor_Appellation dc:title edm:providedcho people.person book.book Freebase visual_art.artwork film.film Europeana Work:label Agent:name Place:name bibo:document NSZL foaf:person ifla-frbr:c1001 ifla-frbr:p3001 / ifla-frad:p4033 BNE ifla-frbr:c1005 ifla-frbr:p3039 / ifla-frad:p4031 DBpedia MARC_Country MARC_GeographicArea MARC:Language MARCCode Lists E56_Language geo:name GeoNames wgs84:lat wgs84:long 13

ALIADA, LOD main features Linking to Datasets that do not provide SPARQL endpoint E39_Actor E21_Person F10_Person http://www.viaf.org/viaf/autosuggest?query=stringtosearchfor Actor_Appellation Actor_Appellation ecrm:p3_has_note StringToSearchFor VIAF ALIADA dataset F3_Manifestation_Product_Type - Title Title ecrm:p3_has_note StringToSearchFor http://api.lobid.org/resource?name=stringtosearchfor&form at=ids E39_Actor - Actor_Appellation Actor_Appellation ecrm:p3_has_note StringToSearchFor https://openlibrary.org/search?title=stringtosearchfor Open Library LOBID: Bib. resources http://api.lobid.org/organisation?name=stringtosearchfor&format=ids LOBID: Libraries & rel. organisations E89_PropositionalObject ecrm:p129_is_about skos:concept skos:concept skos:preflabel StringToSearchFor http://id.loc.gov/search/?q=string ToSearchFor&q=cs:http://id.loc.g ov/authorities/subjects&format=x ml E1_CRM_Entity P137 exemplifies skos:concept skos:concept skos:preflabel StringToSearchFor Library of Congress Subject Headings 14

ALIADA, LOD main features Links disambiguation 15

ALIADA, LOD main features Advanced URI de-referencing URI regulation : http://www.w3.org/tr/cooluris/ Be on the web Machines and people should be able to retrieve a description about the resource identified by the URI from the Web. Machines should get RDF data and humans should get a readable representation, such as HTML Cool URIs Simplisity: Short and mnemonic Stability: Does not change as long as possible Managabilty: Keep all URIs in a dedicated subdomain 16

ALIADA, LOD main features URI name convention of ALIADA: URI structure {domain}/{type}/{concept}/{class}/{reference}.{format} data.szepmuveszeti.hu/id/collections/museum/e18_physical_thing/sz epmuveszeti.hu_object_29 17

ALIADA, LOD main features URI name convention of ALIADA: URI structure Extension Media type Common name.html text/html HTML.json application/rdf+json JSON.jsonld application/ld+json JSON-LD.nt text/plain N-Triples.opac OPAC of the institution (if restful) OPAC.rdf application/rdf+xml RDF/XML.ttl text/rdf+n3 N3/Turtle http://data.szepmuveszeti.hu/doc/collections/museum/e18_physical_t hing/szepmuveszeti.hu_object_29.nt http://data.szepmuveszeti.hu/doc/collections/museum/e18_physical_t hing/szepmuveszeti.hu_object_29.html 18

ALIADA, LOD main features Dataset Default Web Page created by ALIADA 19

ALIADA, LOD main features Publication in CKAN Datahub 20

ALIADA, LOD main features Data Hub LOD Validator A handy record validator is provided at http://validator.lodcloud.net/validate.php to check that at least the minimum required information is present. After applying such validator over our dataset published by ALIADA, a completeness level of 3 is reached. The following step to reach the top completeness level of 4 is to e-mail to the contacts indicated at http://lodcloud.net/ page 21

ALIADA, LOD main features Integration of ALIADA with the LMS (RESTful API) 22

ALIADA, LOD main features Integration of ALIADA with the LMS (RESTful API) 23

ALIADA demo DEMO SITE Demo video: https://vimeo.com/aliadaproject ARTIUM s datasets http://datos.artium.org/ MFAB s dataset http://data.szepmuveszeti.hu/ ARTIUM s RDF data store (queries) http://datos.artium.org/sparql MFAB s RDF data store (queries) http://data.szepmuveszeti.hu/sparql 24

ALIADA open source community ALIADA site has been updated with the second release, the version resulting from the usability and performance tests as well as from the second deployment in public institutions http://www.aliada-project.eu/ 25

ALIADA open source community The source code of the second release has been uploaded to the GitHub s ALIADA repository and the technical documentation, the user manual and the wiki have also been updated. https://github.com/aliada/aliada-tool/ 26

info@aliada-project.eu 27