Connecting SMW to RDF Databases: Why, What, and How?


Connecting SMW to RDF Databases: Why, What, and How?
Markus Krötzsch, Department of Computer Science, University of Oxford
SMWCon 2011 Fall, Berlin

* Talk given during the 2011 papal visit to Berlin

The Semantic Web Programme

Goal: Intelligent automated processing of information on the Web without direct human intervention
- search, aggregation, question answering, ...

Approach:
- Agree on Web-compatible standard data formats and protocols
- Develop software tools for data gathering, storing, analysing, querying, ...
- Establish best practices for data publishing and get a lot of data published

Status as of 2011: The Web of Data
- Established standards and best practices: IRI (addressing), RDF (data), SPARQL (query), OWL (model), vocabularies and ontologies
- Significant tool infrastructure: data stores/databases, query engines, inference engines/reasoners, libraries, crawlers, data browsers
- Large data sources and consumers: sites publishing data feeds, search engines crawling for data, browsing/visualisation interfaces and mash-ups

The Role of SMW
- SMW wikis as miniature Semantic Webs
- Data model similar to RDF
- Query model similar to OWL (queries = classes)
- Data feeds (RDF, JSON, RSS, ...)
- Limited mapping/integration features to other data collections

But how does SMW fit into the Web of Data?

RDF and SMW: Aligning Data Models

RDF Data Model

RDF: data as directed graph, labelled with IRIs:

    http://semanticweb.org/id/smw
      -- http://xmlns.com/foaf/0.1/homepage --> http://semantic-mediawiki.org
      -- http://semanticweb.org/id/Property-3AHas_author --> http://semanticweb.org/id/Jeroen_De_Dauw
    http://semanticweb.org/id/Jeroen_De_Dauw
      -- http://xmlns.com/foaf/0.1/homepage --> http://www.bn2vs.com/

RDF Data Model

RDF: datatypes and literals

    http://semanticweb.org/id/smw
      -- http://semanticweb.org/id/Property-3ARelease_date --> "2011-07-30"^^xsd:date
      -- http://semanticweb.org/id/Property-3AHas_author --> http://semanticweb.org/id/Jeroen_De_Dauw
    http://semanticweb.org/id/Jeroen_De_Dauw
      -- http://xmlns.com/foaf/0.1/name --> "Jeroen De Dauw"

RDF Data Model

RDF: graphs consist of triples:

    Subject:   http://semanticweb.org/id/smw
    Predicate: http://semanticweb.org/id/Property-3ARelease_date
    Object:    "2011-07-30"^^xsd:date

    Subject:   http://semanticweb.org/id/Jeroen_De_Dauw
    Predicate: http://xmlns.com/foaf/0.1/homepage
    Object:    http://www.bn2vs.com/
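The triple view of the example graph can be sketched in Python with plain tuples. This is only an illustration, not SMW code: the constant names `SMW_ID`, `FOAF`, `XSD_DATE` and the helper `objects_of` are assumptions introduced here, as is the choice to model literals as (lexical form, datatype) pairs. The IRIs and values are the ones shown on the slides.

```python
# Illustrative only: an RDF graph as a set of (subject, predicate, object) tuples.
SMW_ID = "http://semanticweb.org/id/"            # IRI prefix used on the slides
FOAF = "http://xmlns.com/foaf/0.1/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# IRIs are plain strings; literals are (lexical form, datatype IRI or None) pairs.
graph = {
    (SMW_ID + "smw", FOAF + "homepage", "http://semantic-mediawiki.org"),
    (SMW_ID + "smw", SMW_ID + "Property-3ARelease_date", ("2011-07-30", XSD_DATE)),
    (SMW_ID + "smw", SMW_ID + "Property-3AHas_author", SMW_ID + "Jeroen_De_Dauw"),
    (SMW_ID + "Jeroen_De_Dauw", FOAF + "homepage", "http://www.bn2vs.com/"),
    (SMW_ID + "Jeroen_De_Dauw", FOAF + "name", ("Jeroen De Dauw", None)),
}


def objects_of(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return sorted((o for s, p, o in graph if s == subject and p == predicate), key=str)


authors = objects_of(graph, SMW_ID + "smw", SMW_ID + "Property-3AHas_author")
print(authors)
```

Following an edge in the graph is then just a filter over the set of triples, which is essentially what a single SPARQL triple pattern does.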

Data model: RDF vs. SMW
- Subject-Predicate-Object <-> Page-Property-Value
  - Difference: SMW properties have fixed datatypes
  - Structure of data more constrained in SMW
- IRIs and literals <-> SMW dataitems
  - Similarity: subjects must be IRIs/wiki pages
- RDF datatypes <-> SMW datatypes
  - Different, incomparable type set
  - Many-to-many relationship
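The Page-Property-Value to Subject-Predicate-Object correspondence can be sketched as a small mapping function. The IRIs on the slides (e.g. `Property-3AHas_author`) suggest that spaces become underscores and `:` becomes `-3A`; the sketch below assumes only that much, and the real SMW exporter handles more characters. The function names `page_to_iri` and `fact_to_triple` are hypothetical.

```python
BASE = "http://semanticweb.org/id/"  # wiki IRI prefix as shown on the slides


def page_to_iri(title, base=BASE):
    """Encode a wiki page title as an IRI (simplified: only ' ' and ':' are handled)."""
    return base + title.replace(" ", "_").replace(":", "-3A")


def fact_to_triple(page, prop, value):
    """Map an SMW fact (page, property, value) to an RDF-style triple."""
    return (page_to_iri(page), page_to_iri("Property:" + prop), value)


triple = fact_to_triple("smw", "Has author", page_to_iri("Jeroen De Dauw"))
print(triple)
```

Note that the subject is always a page IRI, matching the similarity stated above, while the value may be either another page IRI or a literal.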

Other Data Models
- SQL
  - Relational model: data in tables
  - Strong emphasis on schema
  - No real data exchange/publication/syndication
- JSON
  - Object model
  - Very flexible, schema-less, little datatype support
  - Exchange syntax, no standard for object identifiers

RDF and SMW: Data Access and Querying

Getting Access to Data
- Main task of SMW: query answering
- Internal query language: ASK
- Different data models provide different access paths:
  - Relational model: SQL
  - RDF: SPARQL
  - JSON: no standard; some NoSQL approaches

Layers of an ASK Query

    {{#ask: [[Category:City]] [[Population::>3000000]]
     |?Population
     |?Location of
     |format=broadtable
     |sort=population
    }}

1) Which pages are query results?
2) What other information should be included?
3) How should the result be presented?

Layers of an ASK Query
1) Which pages are query results?
2) What other information should be included?
3) How should the result be presented?

- SQL and SPARQL cover (1)
- Presentation (3) is achieved by higher-level libraries, e.g. SPARK for SPARQL
- (2) is not directly supported
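The three layers can be made concrete by splitting the body of an `#ask` call at its `|` separators. The following is a rough sketch, not SMW's parser: the function name `split_ask` is hypothetical, and the naive split would break on pipes that occur inside conditions or parameter values.

```python
def split_ask(body):
    """Split the body of an #ask call into (conditions, printouts, parameters)."""
    parts = [part.strip() for part in body.split("|")]
    conditions = parts[0]                      # layer 1: which pages are results
    printouts = []                             # layer 2: what else to include
    parameters = {}                            # layer 3: how to present the result
    for part in parts[1:]:
        if part.startswith("?"):
            printouts.append(part[1:].strip())
        elif "=" in part:
            key, value = part.split("=", 1)
            parameters[key.strip()] = value.strip()
    return conditions, printouts, parameters


body = ("[[Category:City]] [[Population::>3000000]] "
        "|?Population |?Location of |format=broadtable |sort=population")
conditions, printouts, parameters = split_ask(body)
print(conditions)
print(printouts)
print(parameters)
```

Only the first component needs to be translated into a page-selection query for SQL or SPARQL; the other two are handled by SMW itself.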

SMW vs. SPARQL/SQL Result Tables

    {{#ask: [[Category:City]] [[Population::>3000000]] |?Population |?Twin city |?Location of }}

Result in SMW: 3-dimensional table (a cell can hold a list of values)

              Population   Twin city                                 Location of
    Berlin    3471756      Paris, London, Tokyo, Buenos Aires, ...   SMWCon, Berlin Marathon, ...
    ...       ...          ...                                       ...

SMW vs. SPARQL/SQL Result Tables

    {{#ask: [[Category:City]] [[Population::>3000000]] |?Population |?Twin city |?Location of }}

Result in SQL/SPARQL: 2-dimensional table

              Population   Twin city   Location of
    Berlin    3471756      Paris       SMWCon
    Berlin    3471756      London      SMWCon
    ...       ...          ...         ...
    Berlin    3471756      Paris       Berlin Marathon
    ...       ...          ...         ...
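The difference between the two result shapes can be sketched by collapsing flat rows into one row per page. This is an illustrative sketch with a hypothetical helper `nest`; the sample rows follow the slide, with the London/Berlin Marathon combination added as an assumption to complete the cross product.

```python
# One row per combination of values, as a SQL/SPARQL engine would return them.
flat_rows = [
    ("Berlin", 3471756, "Paris", "SMWCon"),
    ("Berlin", 3471756, "London", "SMWCon"),
    ("Berlin", 3471756, "Paris", "Berlin Marathon"),
    ("Berlin", 3471756, "London", "Berlin Marathon"),  # assumed, not on the slide
]


def nest(rows):
    """Collapse flat rows into one entry per page, with a duplicate-free
    list of values for every column (the SMW-style result)."""
    nested = {}
    for page, *values in rows:
        columns = nested.setdefault(page, [[] for _ in values])
        for column, value in zip(columns, values):
            if value not in column:
                column.append(value)
    return nested


result = nest(flat_rows)
print(result)
```

With two twin cities and two events, the flat result already needs four rows for a single page, which is why printouts (layer 2) fit poorly into a single SQL or SPARQL query.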

Comparing Page Selection Queries (1)

    [[Category:City]] [[Population::>3000000]]

SQL:

    SELECT DISTINCT t5.smw_title AS t, t5.smw_namespace AS ns
    FROM `smw_ids` AS t5
      INNER JOIN `smw_inst2` AS t1 ON t5.smw_id=t1.s_id
      INNER JOIN `smw_atts2` AS t3 ON t1.s_id=t3.s_id
    WHERE ( t1.o_id='738' AND
            ((t3.value_num>='3000000') AND t3.p_id='363') );

Comparing Page Selection Queries (1)

    [[Category:City]] [[Population::>3000000]]

SPARQL:

    SELECT DISTINCT ?result WHERE {
      ?result rdf:type wiki:Category-3ACity .
      ?result property:Population ?v1 .
      FILTER( ?v1 >= "3000000"^^xsd:double )
    }
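Because the SPARQL form follows the ASK conditions so closely, the translation for this particular query shape can be sketched in a few lines. This is a hedged illustration only: `ask_to_sparql` is a hypothetical name, it covers just one category condition plus one numeric lower bound, it reuses the simplified IRI encoding visible on the slides, and it omits the PREFIX declarations exactly as the slide does.

```python
def ask_to_sparql(category, prop, minimum):
    """Build a page-selection query for
    [[Category:<category>]] [[<prop>::><minimum>]] (illustrative only)."""

    def enc(name):
        # Simplified encoding: spaces -> '_', ':' -> '-3A'.
        return name.replace(" ", "_").replace(":", "-3A")

    return (
        "SELECT DISTINCT ?result WHERE {\n"
        f"  ?result rdf:type wiki:{enc('Category:' + category)} .\n"
        f"  ?result property:{enc(prop)} ?v1 .\n"
        f'  FILTER( ?v1 >= "{minimum}"^^xsd:double )\n'
        "}"
    )


query = ask_to_sparql("City", "Population", 3000000)
print(query)
```

Each ASK condition becomes one triple pattern, plus a FILTER for the value comparison, which is the sense in which many SMW queries are typical for SPARQL.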

Comparing Page Selection Queries (1)

    [[Category:City]] [[Population::>3000000]]

OWL:

    ObjectIntersectionOf(
      wiki:Category-3ACity
      DataSomeValuesFrom(
        property:Population
        DatatypeRestriction( xsd:int xsd:minInclusive "3000000"^^xsd:int )
      )
    )

Possible Query Languages for SMW
- SQL
  - Many stable, large-scale implementations
  - Very different from ASK
  - Resulting SQL queries are unusual for RDBMS
- SPARQL
  - Numerous mostly stable, reasonably large-scale implementations
  - Quite similar to ASK
  - Many SMW queries quite typical for SPARQL
- OWL
  - Numerous stable, highly optimised implementations
  - Most similar to ASK, but limited on FILTER features
  - Implementations not optimised for big, changing data

SMW & SPARQL: An almost perfect match
- Not all SMW data is naturally triples
  - Example: page namespaces are not stored like this
- Different capabilities of SQL and SPARQL
  - Examples: random order, pattern matching, support for geographic coordinates
- Data formats in SQL and SPARQL are not the same
  - Example: Type:String and Type:Text
- Aggregation (count) in SPARQL tools still shaky

SMW & SPARQL: An almost perfect match

Some query structures are hard to express in SPARQL:

    {{#ask: [[population::>80,000,000]] OR [[Paris]] }}

    SELECT DISTINCT ?result WHERE {
      ?result swivt:page ?url .
      OPTIONAL {
        ?v2 property:Population ?v1 .
        FILTER( ?v1 >= "80000000"^^xsd:double )
      }
      FILTER( ?result = wiki:Paris || ?result = ?v2 )
    }

RDF and SMW: Support in SMW 1.6.0

Using RDF Databases with SMW
- RDF data model very close to SMW
- SPARQL query model close to ASK
- Use RDF/SPARQL database to manage SMW data
  - Supported by new SPARQL 1.1 Update
  - Available since SMW 1.6.0
- Benefits:
  - Better query performance (hopefully)
  - Additional features (e.g. SPARQL endpoint)
  - More flexible integration with other local data stores

Popular RDF Databases (September 2011)

Various advanced platforms to choose from:
- 4Store (free OSS)
- 5Store (commercial)
- AllegroGraph (commercial)
- OWLIM (commercial, some free downloads)
- Virtuoso (free OSS, commercial support)
- Oracle (commercial)
- Jena
- Sesame

SMW integration was done first for 4Store. It should work with most SPARQL 1.1 stores, though adjustments may be needed; feedback welcome.

Installation and Configuration

(1) Install RDF store
(2) Configure store to run in server mode, offering SPARQL services for read and write
(3) Configure SMW to use these services by giving URLs:

    $smwgDefaultStore = 'SMWSparqlStore';
    $smwgSparqlDatabase = 'SMWSparqlDatabase4Store'; // for 4Store
    $smwgSparqlQueryEndpoint = 'http://localhost:8080/sparql/';
    $smwgSparqlUpdateEndpoint = 'http://localhost:8080/update/';
    $smwgSparqlDataEndpoint = 'http://localhost:8080/data/';
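To check that the store's read and write services respond before pointing SMW at them, one could talk to the two endpoints directly. The sketch below only builds the HTTP requests and sends nothing. It assumes the store follows the SPARQL 1.1 Protocol conventions (a `query` parameter on GET, a form-encoded `update` parameter on POST); the helper names and the example.org IRIs are hypothetical, and the endpoint URLs are the ones from the configuration above.

```python
from urllib.parse import urlencode
from urllib.request import Request

QUERY_ENDPOINT = "http://localhost:8080/sparql/"
UPDATE_ENDPOINT = "http://localhost:8080/update/"


def query_request(sparql):
    """Build a GET request that carries a SPARQL query in the 'query' parameter."""
    url = QUERY_ENDPOINT + "?" + urlencode({"query": sparql})
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


def update_request(sparql_update):
    """Build a POST request that carries a SPARQL Update in the 'update' form field."""
    data = urlencode({"update": sparql_update}).encode("utf-8")
    return Request(
        UPDATE_ENDPOINT,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


read = query_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
write = update_request(
    "INSERT DATA { <http://example.org/s> <http://example.org/p> 1 }"
)
print(read.get_method(), read.full_url)
print(write.get_method(), write.full_url)
```

Passing either request to `urllib.request.urlopen` would execute it against a running store; whether a given store accepts exactly these parameters should be checked against its documentation.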

Current Limitations
- RDF store only for #ask page selection
  - Basic lookups still done in SQL
  - All semantic data mirrored in SQL and RDF
- No support for Concepts
- No SMW emulation for inferencing as in SQL (subclass, subproperty), though the RDF store might have it
  - Equality (redirect) resolution always supported

RDF and SMW: Possible Futures

SPARQL in and SPARQL out
- SPARQL service support
  - Native (hard) or brokered from RDF store (easy)
  - Currently available directly from RDF store only
- Inline SPARQL
  - Use SPARQL queries like #ask on pages
  - Pull in data from external sources
  - Formatting like in SMW
  - Partial solutions: SPARQL SMW extensions, Spark

Wikitext-independent Data
- Allow data that does not come from wiki pages to co-exist in the store
  - Visible in queries
  - Possibly editable via interface
- SMW as Web frontend to a semantic data store?

Mapping
- Data integration across data sources unsolved
- Current vocabulary import does not work with RDF store support (no inverse lookup)
- How should resources be mapped?
- Dynamic discovery of related sources?
- Prototype project: Shortipedia

Connecting SMW to RDF Databases: Why, What and How?

Connecting SMW to RDF Databases: Why, What, How and Who?