The necessity of hypermedia RDF and an approach to achieve it

Similar documents
FedX: A Federation Layer for Distributed Query Processing on Linked Open Data

The Data Web and Linked Data.

REST. And now for something completely different. Mike amundsen.com

HTTP, REST Web Services

Revisiting Blank Nodes in RDF to Avoid the Semantic Mismatch with SPARQL

Preserving Linked Data on the Semantic Web by the application of Link Integrity techniques from Hypermedia

Publishing data for maximized reuse

RESTful Services. Distributed Enabling Platform

Programming the Semantic Web

Interlinking Multimedia Principles and Requirements

FedX: Optimization Techniques for Federated Query Processing on Linked Data. ISWC 2011 October 26 th. Presented by: Ziv Dayan

Digital Public Space: Publishing Datasets

> Semantic Web Use Cases and Case Studies

Web Ontology for Software Package Management

Multi-agent and Semantic Web Systems: Linked Open Data

The Emerging Web of Linked Data

A model-driven approach for REST compliant services

Web Information System Design. Tatsuya Hagino

Linked Data Evolving the Web into a Global Data Space

SPARQL เอกสารหล ก ใน มคอ.3

Multi-agent and Semantic Web Systems: RDF Data Structures

Multi-agent Semantic Web Systems: RDF Models

Semantic Web Fundamentals

A General Approach to Query the Web of Data

The Semantic Web & Ontologies

SRI International, Artificial Intelligence Center Menlo Park, USA, 24 July 2009

Semantic Web Test

SWSE: Objects before documents!

ReST 2000 Roy Fielding W3C

Future Challenges for Linked APIs

Unlocking the full potential of location-based services: Linked Data driven Web APIs

Just in time and relevant knowledge thanks to recommender systems and Semantic Web.

LinDA: A Service Infrastructure for Linked Data Analysis and Provision of Data Statistics

The Politics of Vocabulary Control

Orchestrating Music Queries via the Semantic Web

a paradigm for the Introduction to Semantic Web Semantic Web Angelica Lo Duca IIT-CNR Linked Open Data:

Linked Open Data. University of Rome "Tor Vergata" ART: Artificial Intelligence Tor Vergata. "The Semantic Web done right" Tim Berners-Lee

Interacting with Linked Data Part I: General Introduction

Linking Distributed Data across the Web

Knowledge Representation in Social Context. CS227 Spring 2011

Hyperdata: Update APIs for RDF Data Sources (Vision Paper)

Open Research Online The Open University s repository of research publications and other research outputs

The Point of View Axis: Varying the Levels of Explanation Within a Generic RDF Data Browsing Environment

A Brief Introduction to REST

THE GETTY VOCABULARIES TECHNICAL UPDATE

R2RML by Assertion: A Semi-Automatic Tool for Generating Customised R2RML Mappings

ISA Action 1.17: A Reusable INSPIRE Reference Platform (ARE3NA)

COMP9321 Web Application Engineering

Linked Open Data: a short introduction

COMP9321 Web Application Engineering

How Interlinks Influence Federated over SPARQL Endpoints

A Developer s Guide to the Semantic Web

Why You Should Care About Linked Data and Open Data Linked Open Data (LOD) in Libraries

Architecture and Applications

Enabling suggestions in tabular data cleaning and RDF mapping validation

Combining Government and Linked Open Data in Emergency Management

Enabling Semantic Web Programming by Integrating RDF and Common Lisp

Automated Classification. Lars Marius Garshol Topic Maps

Practical Semantic Applications Master Title for Oil and Gas Asset Reporting. Information Integration David Price, TopQuadrant

The RMap Project: Linking the Products of Research and Scholarly Communication Tim DiLauro

Federated Query Processing: Challenges and Opportunities

Publishing and Consuming Provenance Metadata on the Web of Linked Data

Design & Manage Persistent URIs

Mapping Relational data to RDF

Linked Data Semantic Web Technologies 1 (2010/2011)

Sindice Widgets: Lightweight embedding of Semantic Web capabilities into existing user applications.

How to Publish Linked Data on the Web - Proposal for a Half-day Tutorial at ISWC2008

REST API s in a CA Plex context. API Design and Integration into CA Plex landscape

INF3580/4580 Semantic Technologies Spring 2015

DBpedia-An Advancement Towards Content Extraction From Wikipedia

Semantic Document Architecture for Desktop Data Integration and Management

Fusion of Event Stream and Background Knowledge for Semantic-Enabled Complex Event Processing

LODmilla: shared visualization of Linked Open Data

Web Services Week 10

XML Perspectives on RDF Querying: Towards integrated Access to Data and Metadata on the Web

Proposal for Implementing Linked Open Data on Libraries Catalogue

BUILDING THE SEMANTIC WEB

Semantic Web Fundamentals

An Holistic Evaluation of Federated SPARQL Query Engine Nur Aini Rakhmawati

Develop Mobile Front Ends Using Mobile Application Framework A - 2

Extended Identity for Social Networks

Semantic Web Information Management

Library of Congress BIBFRAME Pilot. NOTSL Fall Meeting October 30, 2015

RKB, sameas and dotac

On Distributed Querying of Linked Data

Representing Linked Data as Virtual File Systems

Advances in Data Management - Web Data Integration A.Poulovassilis

Twelve Patterns for Hypermedia Service Architecture

RDF and RDB 2 D2RQ. Mapping Relational data to RDF D2RQ. D2RQ Features. Suppose we have data in a relational database that we want to export as RDF

Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD

An Approach to Enhancing Workflows Provenance by Leveraging Web 2.0 to Increase Information Sharing, Collaboration and Reuse

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction

WWW, REST, and Web Services

Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic

A Privacy Preference Ontology (PPO) for Linked Data

Linked data from your pocket

Accessing information about Linked Data vocabularies with vocab.cc

Semantic Web and Linked Data

Linking and Finding Earth Observation (EO) Data on the Web

What is REST? ; Erik Wilde ; UC Berkeley School of Information

Transcription:

The necessity of hypermedia RDF and an approach to achieve it Kjetil Kjernsmo 1 Department of Informatics, Postboks 1080 Blindern, 0316 Oslo, Norway kjekje@ifi.uio.no Abstract. This paper will give an overview of the practical implications of the HATEOAS constraint of the REST architectural style, and in that light argue why hypermedia RDF is a practical necessity. We will then sketch a vocabulary for hypermedia RDF using Mike Amundsen s H Factor classification as motivator. Finally, we will briefly argue that SPARQL is important when making non-trivial traversals of Linked Data graphs, and see how a bridge between Linked Data and SPARQL may be created with hypermedia RDF. 1 Introduction Mike Amundsen defines hypermedia types [3] as Hypermedia Types are MIME media types that contain native hyperlinking semantics that induce application flow. For example, HTML is a hypermedia type; XML is not. Furthermore, he defines a classification scheme called H Factor as a measurement of the level of hypermedia support and sophistication of a media-type. The REST & WOA Wiki defines the Hypermedia Scale [5], where the categorization is based on capabilities to do CRUD (Create, Read, Update, Delete) operations. Considered on their own, RDF serializations are hypermedia types, but only at the LO (outbound links) and CL (control data for links) levels (see [3] for details on this notation), and just an R Type (read) on the Hypermedia Scale. We aim at improving this situation. Amundsen argues in a blog post [2] that the Semantic Web community should not pursue an API for RDF, but rather make RDF serializations more powerful hypermedia types. He argues that a key factor in the success of the Web is that messages not only contain data but also application control information, and that this is needed for the Web to scale. Semantic Web services are likely to be an important source of data for future applications, but for foreseeable future they are just one of many. A key promise of the Semantic Web is the ability to integrate many data sources easily. There are two key issues, first read-write operations must be similar to how it is done with other hypermedia types, to make it possible to use generic code to minimize the effort required to support RDF. Importantly, if interacting with Linked Data

requires extensive out-of-band information, it will be harder to use than media types that do not. Secondly, it requires relevant links and that the resulting graph can be traversed by reasonable means. While links are abundant across the Linked Open Data (LOD) cloud, this paper will argue that key links are missing to make it possible to traverse the resulting graph and to enable read-write operations. Moreover, we will argue that this shortcoming is due to that the community does not adequately take into account the constraint known as Hypermedia as the Engine of Application State (abbreviated HATEOAS) from the REST architectural style, see [6] Chapter 5. In a read-only situation, the HATEOAS constraint requires that the application can navigate from one resources to another using the hypermedia links in the present resource only, they should not require any out-of-band information. Since RDF is built on URIs, many of which can be dereferenced to obtain further links, it will therefore almost always satisfy the HATEOAS constraint in the read-only case. Contrast the read-write situation: It is very common that specifications that advertise themselves as RESTful has an API that is specified in terms of URI structure as well as HTTP verbs. This is not a violation of the HATEOAS constraint per se, but it would usually be superfluous as the same information must be available in a hypermedia message in order to satisfy the HATEOAS constraint. The constraint does not require this information to be available in all messages, and so, it is not a very strict constraint. It is debatable whether a protocol can be said to be RESTful if the links don t enable any useful interactions. Even in the read-only case, we must carefully add links that enable the desired interactions to take place, e.g. linking a SPARQL endpoint. Importantly, if there is to be such a thing as a RESTful read-write Semantic Web protocol, there must be something in the RDF itself that can be used by practical applications to write data. Therefore, whether the HATEOAS constraint is satisfied must be judged on the basis of the practical applications the protocol enables. 2 Required links We note that the Atom Publishing Protocol [7] has a single service endpoint. From there, you can navigate to what you need. This motivates the first discussion topic of this paper: Question 1. Is a single service endpoint enough for Linked Data? We note that the distributed graph structure of the LOD Cloud makes this awkward: For every new data source encountered in a graph traversal, a new service endpoint must be queried. Moreover, it would be awkward if fine-grained access controls for writing are being used to record permissions for all resources in a single service description. It seems likely that a few triples attached to each information resource are better suited in most cases.

3 Defining hypermedia RDF We noted that for a read-write RESTful protocol, we need to say in the RDF message itself what kind of operations can be made. We also note that only information resources can be manipulated, and we argued that it should be possible to add the required triples to every resource. We should be able to define hypermedia RDF in terms of a minimal vocabulary, but like the H Factor web page, we will find it instructive to use examples. In the following, we use Turtle syntax, with prefixes omitted for brevity. Also, we note that in many cases, we are making statements about the current resource, which given a reasonable base URI can be written as <>. We have not found the following factors to be relevant to RDF: LE (embedded links), CR, CU (control data for read and update requests respectively) and LT (templated queries); and LO and CL are trivially supported. The remaining are: LN Support for non-idempotent updates (HTTP POST) <> hm:canbe hm:mergedinto ; hm:createsimilarat <../>. The first triple says that the current resource can be merged with another resource. In a typical HTTP case, this would be achieved by POSTing to the current resource with an RDF payload, and the server would perform a RDF merge of the payload with the current resource. This term is motivated by that POST operations in [10] are specified to result in an RDF Merge of the payload into the named graph. Amongst other possibilities is a union of graphs. The second triple exists to make it possible to POST an RDF payload to some URI and the server will itself assign a URI to the posted RDF payload. This may for example be a named graph, as defined in [10] or a follow the protocol recently suggested in [9] Section 5.4. LI Support for idempotent updates (HTTP PUT, DELETE) <> hm:canbe hm:replaced, hm:deleted. The first triple says that the current resource may be replaced by PUTting an RDF payload to the resource URI. The second triple says that the resource may be deleted, typically by a HTTP DELETE on the resource URI. CM Support for indicating the interface method for requests (e.g. HTTP GET, POST, PUT, DELETE methods). For example (with similar triples for other HTTP methods): hm:replaced hm:httpmethod "PUT". We see that with a simple vocabulary, RDF serializations can become a very powerful hypermedia types. We also note that this vocabulary enables agents to do the same operations as the SPARQL 1.1 Graph Store HTTP Protocol [10], while being fully RESTful. The Protocol specification is currently not RESTful

per the discussion above as it requires extensive out-of-band information, even though it was a key design goal. It should also be possible to use this on the default (nameless) graph by assigning it a name in the service description rather than defining it in an out-of-band specification like is currently being done. Question 2. Should the SPARQL 1.1 Graph Store HTTP Protocol be replaced by a vocabulary like the above? 3.1 New H Factors Mike Amundsen solicits feedback on the completeness of his classification scheme. The following are suggested as discussion items: 1. Support for self-description. The self-describing nature of RDF is an important characteristic of the model and arguably an important characteristic of RDF as hypermedia as agents may be able to infer possible interactions based on vocabulary definitions. 2. Support for supplying control data relevant to subsequent requests. It may be useful for agents to know e.g. what formats they can send to a given resource before the request is made. This may require a new control data factor. An example of this use may be e.g.: <> hm:acceptsformat <http://www.w3.org/ns/formats/turtle>. 3. A factor encompassing RaUL [8] templates. RaUL goes beyond HTTP GET queries, which is what the LT factor is about. The interaction RaUL enables are quite different from those discussed in this paper and are not captured by any of the current factors. Question 3. Can we create a better vocabulary for hypermedia RDF than the sketch above? 4 Bridging LOD and SPARQL In many cases, it is sufficient to get a small number of resources to collect the data needed for a given usage, but slightly more involved queries (e.g. what kind of connections exists between Kate Bush, Roy Harper and bands that have sold more than 200 million albums? ) would likely cause thousands of resources to be downloaded and examined. For this, more advanced graph pattern matching, as well as more advanced mechanisms for source selection, is required. [11] provides some techniques that should prove very valuable in this respect and so it becomes important that SPARQL can be used with Linked Data in a RESTful manner. In [4] Tim Berners-Lee notes To make the data be effectively linked, someone who only has the URI of something must be able to find their way the SPARQL endpoint., but this is generally not possible today without requiring out-of-band information. The triples needed to do this are already defined in VoID (see e.g. [1]) and are in use: <> void:indataset [ void:sparqlendpoint </sparql>. ].

5 Conclusion We have shown how RDF serializations can become very powerful hypermedia types by using a simple vocabulary. As a consequence, developers can use Linked Data in a read-write scenario without specialized knowledge about RDF APIs or other out-of-band information. We have argued briefly that some triples should be added to every information resource, but note that most triples are only relevant to write-operations and should only appear in that case. We also noted that SPARQL is important for non-trivial traversal of Linked Data. To sum up, the addition of the following triples would make life easier for developers of applications that use Semantic Web data: <> hm:canbe hm:mergedinto, hm:replaced, hm:deleted ; hm:createsimilarat <../> ; hm:acceptsformat <http://www.w3.org/ns/formats/turtle> ; void:indataset [ void:sparqlendpoint </sparql>. ]. References 1. K. Alexander and M. Hausenblas. Describing linked datasets - on the design and usage of void, the vocabulary of interlinked datasets. In In Linked Data on the Web Workshop (LDOW 09), in conjunction with 18th International World Wide Web Conference (WWW 09), 2009. 2. M. Amundsen. API for RDF? don t do it! http://amundsen.com/blog/archives/1083, 2010. 3. M. Amundsen. Hypermedia Types. http://amundsen.com/hypermedia/, 2010. 4. T. Berners-Lee. Linked Data Design Issues. http://www.w3.org/designissues/linkeddata, 2009. 5. S. Bjorg. The Hypermedia Scale. http://restpatterns.org/articles/the Hypermedia Scale, 2009. 6. R. T. Fielding. Architectural Styles and the Design of Network-based Software Architectures. PhD thesis, University of California, Irvine, Irvine, California, 2000. 7. J. Gregorio and B. de hora. The Atom Publishing Protocol. http://www.ietf.org/rfc/rfc5023, 2007. 8. M. Hausenblas, J. Umbrich, and A. Haller. RAUL Vocabulary. http://vocab.deri.ie/raul, 2011. 9. M. Nally, S. Speicher, J. Arwe, and A. L. Hors. Linked Data Basic Profile 1.0. http://www.w3.org/submission/2012/subm-ldbp-20120326/, 2012. 10. C. Ogbuji. SPARQL 1.1 Graph Store HTTP Protocol. http://www.w3.org/tr/2011/wd-sparql11-http-rdf-update-20110512/, 2011. 11. A. Schwarte, P. Haase, K. Hose, R. Schenkel, and M. Schmidt. FedX: Optimization techniques for federated query processing on linked data. In The Semantic Web ISWC 2011, LNCS 7031:601-616.