A FRAMEWORK FOR ONTOLOGY-BASED LIBRARY DATA GENERATION, ACCESS AND EXPLOITATION

Similar documents
A framework for ontology-based library data generation, access and exploitation

NeOn Methodology for Building Ontology Networks: a Scenario-based Methodology

datos.bne.es: a Library Linked Data Dataset

Libraries, classifications and the network: bridging past and future. Maria Inês Cordeiro

ISA Action 1.17: A Reusable INSPIRE Reference Platform (ARE3NA)

Methodological Guidelines for Publishing Linked Data

Contribution of OCLC, LC and IFLA

Linked Data Applications: There is no One-Size-Fits-All Formula

BIB-R: a Benchmark for the Interpretation of Bibliographic Records

Representing Data about Key Performance Indicators

Datos abiertos de Interés Lingüístico

Studying conceptual models for publishing library data to the Semantic Web

Common Pitfalls in Ontology Development

Syrtis: New Perspectives for Semantic Web Adoption

Evaluation of RDF(S) and DAML+OIL Import/Export Services within Ontology Platforms

Linking FRBR Entities to LOD through Semantic Matching

Improving access and facilitating research: The music collections in the new catalogues of the French National Library (BnF)

Reducing Consumer Uncertainty

Web Ontology Editor: architecture and applications

Semantics-Based Integration of Embedded Systems Models

A REVIEW ON RDB TO RDF MAPPING FOR SEMANTIC WEB

Methodologies, Tools and Languages. Where is the Meeting Point?

DBpedia-An Advancement Towards Content Extraction From Wikipedia

Maximising (Re)Usability of Language Resources using Linguistic Linked Data

Proposal for Implementing Linked Open Data on Libraries Catalogue

The Linked Data Value Chain Model: A Methodology for Information Integration and Orchestration

A Survey of FRBRization Techniques

R2RML by Assertion: A Semi-Automatic Tool for Generating Customised R2RML Mappings

The Linking Open Data Project Bootstrapping the Web of Data

KNOWLEDGE MANAGEMENT AND ONTOLOGY

Supporting FRBRization of Web Product Descriptions

Interoperability of Protégé using RDF(S) as Interchange Language

An overview of RDB2RDF techniques and tools

Library of Congress BIBFRAME Pilot. NOTSL Fall Meeting October 30, 2015

A Semantic Portal for Fund Finding in the EU: Semantic Upgrade, Integration and Publication of Heterogeneous Legacy Data

Building a missing item in INSPIRE: The Re3gistry

StdTrip+K: Design Rationale in the RDB-to-RDF process

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Alexander Haffner. RDA and the Semantic Web

Just in time and relevant knowledge thanks to recommender systems and Semantic Web.

SABiO: Systematic Approach for Building Ontologies

COLLABORATIVE EUROPEAN DIGITAL ARCHIVE INFRASTRUCTURE

Supporting i*-based Context Models Construction through the DHARMA Ontology

Linked Data Evolving the Web into a Global Data Space

BIBFRAME Update Why, What, When. Sally McCallum Library of Congress NCTPG 10 February 2015

Envisioning Semantic Web Technology Solutions for the Arts

TABLE OF CONTENTS CHAPTER NO. TITLE PAGENO. LIST OF TABLES LIST OF FIGURES LIST OF ABRIVATION

Linked data for manuscripts in the Semantic Web

Building a Linked Open Data Knowledge Graph Henning Schoenenberger Michele Pasin. Frankfurt Book Fair 2017 October 11, 2017

Intelligent Information Management

Linking Distributed Data across the Web

Towards an Ontology Visualization Tool for Indexing DICOM Structured Reporting Documents

Data Harvesting, Curation and Fusion Model to Support Public Service Recommendations for e-governments

Interactive Knowledge Stack A Software Architecture for Semantic Content Management Systems

A Novel Vision for Navigation and Enrichment in Cultural Heritage Collections

Ontology Refinement and Evaluation based on is-a Hierarchy Similarity

Development of an Ontology-Based Portal for Digital Archive Services

Linked Open Europeana: Semantic Leveraging of European Cultural Heritage

Linking library data: contributions and role of subject data. Nuno Freire The European Library

Knowledge Sharing and Reuse: Ontologies and Applications

data elements (Delsey, 2003) and by providing empirical data on the actual use of the elements in the entire OCLC WorldCat database.

Evolva: A Comprehensive Approach to Ontology Evolution

A Tutorial of Viewing and Querying the Ontology of Soil Properties and Processes

Blink Project: Linked Open Data for Countway Library Final report for Phase 1 (June-Nov. 2012) Prepared by Sophia Cheng

M3 Framework: User s guide & tutorial

Linked Open Europeana: Semantics for the Digital Humanities

Linked Data Querying through FCA-based Schema Indexing

Integrating Heterogeneous Data Sources in the Web of Data

Prof. Dr. Christian Bizer

Ontologies and Database Schema: What s the Difference? Michael Uschold, PhD Semantic Arts.

The Implementation of Semantic Web Technology in Traditional Plant Medicine

For each use case, the business need, usage scenario and derived requirements are stated. 1.1 USE CASE 1: EXPLORE AND SEARCH FOR SEMANTIC ASSESTS

Automation of Semantic Web based Digital Library using Unified Modeling Language Minal Bhise 1 1

VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems

Financial Dataspaces: Challenges, Approaches and Trends

Study and guidelines on Geospatial Linked Data as part of ISA Action 1.17 Resource Description Framework

Semi-Automatic Conceptual Data Modeling Using Entity and Relationship Instance Repositories

Webinar Annotate data in the EUDAT CDI

A service based on Linked Data to classify Web resources using a Knowledge Organisation System

PROJECT PERIODIC REPORT

Annotation for the Semantic Web During Website Development

Introduction to the Semantic Web Tutorial

Incremental Export of Relational Database Contents into RDF Graphs

Graph Exploration: Taking the User into the Loop

Semantic Web: Core Concepts and Mechanisms. MMI ORR Ontology Registry and Repository

The Dacura Data Curation System

Access rights and collaborative ontology integration for reuse across security domains

A service based on Linked Data to classify Web resources using a Knowledge Organisation System

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Linked Open Data: a short introduction

The role of vocabularies for estimating carbon footprint for food recipies using Linked Open Data

Collaborative Framework for Testing Web Application Vulnerabilities Using STOWS

A Framework for Developing Manufacturing Service Capability Information Model

ODESeW. Automatic Generation of Knowledge Portals for Intranets and Extranets

Collaborative Ontology Evolution and Data Quality - An Empirical Analysis

> Semantic Web Use Cases and Case Studies

Joining the BRICKS Network - A Piece of Cake

W3C Provenance Incubator Group: An Overview. Thanks to Contributing Group Members

Management of Complex Product Ontologies Using a Web-Based Natural Language Processing Interface

Smart Gas Grids. Manuel Sánchez, Ph.D. Team Leader Smart Grids Directorate General for Energy European Commission

Transcription:

A FRAMEWORK FOR ONTOLOGY-BASED LIBRARY DATA GENERATION, ACCESS AND EXPLOITATION Daniel Vila Suero Advisors: Prof. Dr. Asunción Gómez-Pérez and Dr. Jorge Gracia del Río PhD in Artificial Intelligence, Defense, Madrid, 27 th of July 2016

CONTEXT Classify Catalogue! Information resources! Libraries! experts! Digital Catalogues (Limited semantics)! records describing persons, concepts, works, and other entities. MARC 21 international standard describing millions of records. (e.g., European +80 million records) 2

CONTEXT CATALOGUE RESOURCES ARE DISCONNECTED FROM THE WEB Online catalogues! WWW! Web forms HTML! Specialized users Applications Search engines 3

CONTEXT CATALOGUE RESOURCES ARE DISCONNECTED FROM THE WEB?! Online catalogues! WWW! Web forms HTML! Specialized users Applications Search engines creator of! topic! semantics! reusability! same as! discoverability! Semantic web and Linked Data technologies (Ontologies, RDF, HTTP URIs): Creating and delivering library data while providing a natural extension to the collaborative sharing models historically employed by libraries. My thesis! 4

OUTLINE Problem and state of the art Research methodology and hypotheses Contributions Mapping Ontology development applications Conclusion and future directions 5

INTRODUCTION RESEARCH PROBLEM Catalogues! ONTOLOGIES RDF HTTP URIS! PRINCIPLED TRANSFORMATION OF LIBRARY CATALOGUES INTO ONTOLOGY-BASED DATA AND THEIR PUBLICATION AND CONSUMPTION ON THE WEB.! WWW! 6

INTRODUCTION THE PROBLEM: GLOBAL-AS-VIEW ONTOLOGY-BASED DATA INTEGRATION SOURCE SCHEMA! GLOBAL SCHEMA! creator of! MAPPING! same as! topic! Data sources (MARC 21)! CHALLENGES 1. records present a nested structure (fields, subfields, etc.) 2. data sources lack schema and query mechanisms 3. standards and ontologies are highly specialized Target (RDF Linked Data)! 7

INTRODUCTION RESEARCH PROBLEM AND AREAS Catalogues! ONTOLOGIES RDF HTTP URIS! PRINCIPLED TRANSFORMATION OF LIBRARY CATALOGUES INTO ONTOLOGY-BASED DATA AND THEIR PUBLICATION AND CONSUMPTION ON THE WEB.! TRANSFORMATION & MAPPING LANGUAGES! ONTOLOGY DEVELOPMENT! WWW! LIBRARY APPLICATIONS! 8

INTRODUCTION STATE OF THE ART FRBR model! Experiments FRBR! Applications of RDF! Linked Data! TRANSFORMATION METHODS! 1998! [Hickey2002] [Bennet2003] [Hegna2003]! [Aalberg2006] [Manguinhas2010] 2012 [Takhirov2012] [Simon2013] [marc2bibframe2015] 9

INTRODUCTION STATE OF THE ART FRBR model! Experiments FRBR! Applications of RDF! Linked Data! TRANSFORMATION METHODS! 1998! 2012 MAPPING LANGUAGES! Relational databases into RDF! 2004! [Barrasa04] [Bizer04] R2RML W3C standard! 2014 Extensions of R2RML for non-rdb data sources! RML [Dimou2014] xr2rml [Michel2015] KR2RML [Slepicka2015] 10

INTRODUCTION STATE OF THE ART FRBR model! Experiments FRBR! Applications of RDF! Linked Data! TRANSFORMATION METHODS! 1998! 2012 MAPPING LANGUAGES! Relational databases into RDF! 2004! R2RML W3C standard! 2014 Extensions of R2RML for non-rdb data sources! ONTOLOGY DEVELOPMENT! Initial methodologies! 1995! [Grüninger1995] [Fernández-López1997] Ontology lifecycle models! [Staab2001] [Pinto2004] Neon methodology [Suarez-Figueroa2015] Agile approaches! [Auer2006] [Presutti2012] 2015 11

LIMITATION 1: MAPPING LANGUAGE AND METHODS INTRODUCTION TRANSFORMATION MAPPING LANGUAGES ONTOLOGY DEVELOPMENT! LIBRARY LINKED DATA APPLICATIONS! Lack of interoperability and reuse of mapping rules.! 12

LIMITATION 1: MAPPING LANGUAGE AND METHODS INTRODUCTION TRANSFORMATION MAPPING LANGUAGES ONTOLOGY DEVELOPMENT! LIBRARY LINKED DATA APPLICATIONS! Lack of interoperability and reuse of mapping rules.! Format-dependent (e.g., JSONPath) [RML] [xr2rml]!! Lack of queries over nested data [KR2RML]! 13

LIMITATION 2: FEEDBACK OF LIBRARY EXPERTS INTRODUCTION TRANSFORMATION MAPPING LANGUAGES ONTOLOGY DEVELOPMENT! LIBRARY LINKED DATA APPLICATIONS! Lack of interoperability and reuse of mapping rules.! Format-dependent (e.g., JSONPath)! Feedback of library experts not integrated within the transformation process. [Takhirov2012] Lack of queries over nested data! 14

LIMITATION 2: FEEDBACK OF LIBRARY EXPERTS INTRODUCTION TRANSFORMATION MAPPING LANGUAGES ONTOLOGY DEVELOPMENT! LIBRARY LINKED DATA APPLICATIONS! Lack of interoperability and reuse of mapping rules.! Format-dependent (e.g., JSONPath)! Limited technical support for the active participation of domain experts. [NeOn]! Feedback of library experts not integrated within the transformation process. [Takhirov2012] Lack of queries over nested data! Limited methods for understanding similarities among overlapping ontologies. [Vandenbussche2014] 15

INTRODUCTION LIMITATION 3: EXPERIMENTAL RESULTS TRANSFORMATION MAPPING LANGUAGES ONTOLOGY DEVELOPMENT! LIBRARY LINKED DATA APPLICATIONS! Lack of interoperability and reuse of mapping rules.! Format-dependent (e.g., JSONPath)! Limited technical support for the active participation of domain experts.! Lack of evaluation and experimental results of impact in end-user library applications [Simon2013] Feedback of library experts not integrated within the transformation process. Lack of queries over nested data! Limited methods for understanding similarities among overlapping ontologies. 16

HYPOTHESES Research methodology, and hypotheses 17

HYPOTHESES DESIGN-SCIENCE RESEARCH METHODOLOGY RESEARCH PROBLEMS HYPOTHESES EVALUATION CONTRIBUTIONS! P1 Mapping FORMULATION! EVALUATE HYPOTHESES! P2 Ontology development process P3 H1 H2 H3 H4 H5 applications ARTIFACT: MARIMBA FRAMEWORK! REFINEMENT METHODS MODELS INSTANTIATIONS! 18

HYPOTHESES MARIMBA FRAMEWORK: PROBLEMS AND CONTRIBUTIONS marimba-mapping marimba-topicmodel marimba-rml BNE ontology Empirical studies marimba-datamodel P1 Mapping data sources! marimba-modelling P2 Ontology development process Ontology-based data (RDF)! datos.bne.es P3 applications online applications! 19

P1: DEFINITION OF INTEROPERABLE MAPPING RULES HYPOTHESES marimba-topicmodel marimba-mapping marimba-rml marimba-datamodel P1 Mapping data sources! BNE ontology H1 H2 marimba-modelling Ontology-based data (RDF)! Empirical studies The structure of MARC 21 records can be fully represented using a nested relational model. datos.bne.es P2 Minimal modifications P3 to the W3C R2RML Ontology development language to define mapping rules between process MARC 21 data sources applications into RDF.! online applications! RESTRICTIONS: No query optimization, only operational semantics of mapping language, authority and bibliographic formats, MARC 21 standard 20

HYPOTHESES P2: LIBRARY ONTOLOGY DEVELOPMENT ng model es! BNE ontology marimba-topicmodel marimba-modelling P2 Ontology development process Ontology-based data (RDF)! H3marimba-topicmodel H4 Analytical data and the feedback of library experts can be used to develop a library ontology with sufficient quality BNE ontology marimba-modelling Ontology-based data (RDF)! Empirical studies datos.bne.es Probabilistic topic modelling techniques can P2 P3 Ontology produce coherent descriptions of development ontologies and can perform better than process applications existing methods used in ontology search. online applications! ASSUMPTIONS: Participation of library experts, available library ontologies and records 21

HYPOTHESES P3: APPLICATION OF SEMANTIC TECHNOLOGIES IN LIBRARY APPLICATIONS model ling ed RDF)! Empirical studies datos.bne.es P3 applications online applications! The application of semantic technologies to end-user library applications can increase user satisfaction and efficiency for finding information H5marimba-topicmodel BNE ontology marimba-modelling P2 Ontology development process Ontology-based data (RDF)! Empirical studies datos.bne.es P3 applications online applications! RESTRICTIONS: Transformation in a batch process 22

OUTLINE marimba-mapping marimba-rml marimba-datamodel P1 Mapping data sources! marimba-topicmodel BNE ontology marimba-modelling P2 Ontology development process Ontology-based data (RDF)! Empirical studies datos.bne.es P3 applications online applications! 23

marimba-datamodel MAPPING marimba-mapping marimba-rml?! P1 Mapping SOURCE SCHEMA! GLOBAL SCHEMA! creator of! same as! topic! Data source (MARC 21)! Target source (RDF)! 24

MAPPING marimba-mapping marimba-rml Schema extraction marimba-datamodel P1 Mapping SOURCE SCHEMA! GLOBAL SCHEMA! creator of! same as! topic! Data source (MARC 21)! Target source (RDF)! 25

MAPPING marimba-datamodel marimba-rml marimba-mapping Study of MARC 21 standard in Nested Relational Model (Makinouchi [1977]). Attributes: Atomic: e.g., Field 001 Relation-valued: e.g., Field 100, 008 Complete coverage of MARC 21 standard elements MARC 21 Format Leader Control Field (fixed-length) Control Field (single-valued) Data Field Subfield Indicators marimba-datamodel (NRM) Relation scheme R(S) relation-valued attribute R leader (S leader ) 2 S with S leader =(p00,p01,p02,...,p23) relation-valued attribute R cf (S cf ) 2 S with S cf =(p00,p01,p02,...,pn) for N positions Atomic attribute cf 2 S relation-valued attribute R df (S df ) 2 S with S cf =(i1,i2,sf1,...,sfn) with i1 and i2 indicators and sf1..sfn the N subfields of the data field Atomic attribute sf in a relation scheme of a data field 2 S df Atomic attributes i1 and i2 in a relation scheme of a data field 2 S df H1 26

MAPPING marimba-datamodel marimba-rml marimba-mapping Schema extraction Novel algorithm for schema extraction of MARC 21 records. Support the validation of queries to MARC 21 data. Support the generation of mapping templates for library experts. Complete representation of patterns in existing MARC 21 data sources. SCHEMA TABLE bibliographic ITEM f001 UNIQUE NOT NULL /* C.field Unique ID */ ITEM TABLE leader /* Leader Info about language, etc.*/ ITEM p01 ITEM p23 ITEM TABLE f100 /*Data field Author information */ ITEM a /* Subfield */ ITEM i0 /* Indicator */ 27

MAPPING marimba-mapping marimba-rml Schema extraction marimba-datamodel P1 Mapping SOURCE SCHEMA! GLOBAL SCHEMA! R2RML MAPPING!?! topic! creator of! same as! Data source (MARC 21)! Target source (RDF)! 28

MAPPING marimba-mapping Schema extraction marimba-datamodel P1 Mapping SOURCE SCHEMA! GLOBAL SCHEMA! R2RML MAPPING! marimba-sql topic! creator of! same as! Data source (MARC 21)! marimba-rml Target source (RDF)! 29

MAPPING marimba-datamodel marimba-rml marimba-mapping marimba-sql Minimal query language based on SQL/NF (Roth et al. [1987]): - Conciseness - Orthogonality of expressions! SFW nested expressions. /* Selects bibliographic records of type Drawing*/ bibliographic / * SELECT FROM * / WHERE EXISTS / * NESTED EXPRESSION* / (leader WHERE p6 = k" AND p7!= "s") AND / * NESTED EXPRESSION * / EXISTS (f007 WHERE p2 = d ) *BNF grammar provided in Annex I! 30