Processing Heterogeneous RDF Event Streams with Standing SPARQL Update

Processing Heterogeneous RDF Event Streams with Standing SPARQL Update
Mikko Rinne, Haris Abdullah, Seppo Törmä, Esko Nuutila
http://cse.aalto.fi/instans/
11.9.2012
Department of Computer Science and Engineering, Distributed Systems Group

Smart Cities Need Interoperability
- Smart environments of the future interconnect billions of sensors
  - Platforms from multiple vendors
  - Operated by different companies, public authorities or individuals
- Highly distributed, loosely coupled solutions based on common standards are required
  - Challenge to proprietary platforms
- Semantic web standards RDF, SPARQL and OWL offer a good base for interoperability
  - How would they work for event processing?

Solution Components
1. Method: Multiple collaborating SPARQL queries and update rules processing heterogeneous events expressed in RDF
2. Implementation (INSTANS): Incremental continuous query engine based on the Rete algorithm

An Event = Anything that happens or is contemplated as happening *)
- Simple events: "Seppo came in", "Mikko came in", "Esko came in", "It is 9 a.m."
- Composite, synthesized or complex events: "Seppo, Mikko and Esko are in", "Meeting started in time"
- A complex event summarizes, represents, or denotes a set of other events *)
*) Luckham, D., Schulte, R.: Event processing glossary version 2.0 (Jul 2011)

Heterogeneous Representations
- Variable event structures in an open environment
  - Different sensors may support different parameters
  - Queries can match the data of interest and disregard the rest
- Semantic web standard RDF has flexible support for heterogeneous event structures
- Alternative approaches typically cover data stream processing on individual time-annotated triples
[Figure: Example Location Update. RDF graph around event :e1 with edges rdf:type, event:agent (:p3), event:time (a tl:Instant with tl:at 2011-10-03T08:17:11) and event:place (geo:lat 60.158776, geo:long 24.881490, geo:alt)]
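The point that a query can match the data of interest and disregard the rest can be illustrated with a minimal Python sketch. It is not part of the original slides: the triples are simplified (the place node is flattened onto the event), and the `ex:battery` property, the second event and the `geo:alt` value are hypothetical.

```python
# Events as plain (subject, predicate, object) triples.
# Simplification: geo properties hang directly off the event instead of a place node.
EVENTS = {
    ("e1", "rdf:type", "event:Event"),
    ("e1", "event:agent", "p3"),
    ("e1", "geo:lat", 60.158776),
    ("e1", "geo:long", 24.881490),
    ("e1", "geo:alt", 12.0),        # hypothetical value
    ("e2", "rdf:type", "event:Event"),
    ("e2", "event:agent", "p4"),    # hypothetical second sensor
    ("e2", "geo:lat", 60.17),
    ("e2", "geo:long", 24.94),
    ("e2", "ex:battery", 0.8),      # hypothetical extra parameter
}


def positions(triples):
    """Return {event: (lat, long)} for events that carry both coordinates.

    Any other property (altitude, battery, ...) is simply not matched.
    """
    props = {}
    for s, p, o in triples:
        props.setdefault(s, {})[p] = o
    return {
        s: (d["geo:lat"], d["geo:long"])
        for s, d in props.items()
        if "geo:lat" in d and "geo:long" in d
    }
```

Both events are matched although they expose different parameter sets, which is the behaviour the slide attributes to RDF-based event structures.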

SPARQL Query + Update
- SPARQL is tailor-made to query RDF data
- SPARQL 1.1 Update supports INSERT operations, enabling
  - Memory
  - Communication between SPARQL queries
  - Stepwise processing of data
- Applications can be constructed entirely of SPARQL queries

Close Friends Example Service
- Mobile clients emit location updates
- Service produces a nearby notification if two friends come geographically close to each other
[Figure: service architecture with numbered components: 1. Static input (RDF Store) and Configuration; 2. Producer (RDF Stream); 3. Channel; 4. Processing Agent Network; 5. Consumer; Mobile Client]

Approach 1: Single Query

    CONSTRUCT { ?person1 :nearby ?person2 }
    WHERE {
      # Part 1: Bind event data for pairs of persons who know each other
      GRAPH <http://externalgraphstore.org/socialnetwork> { ?person1 foaf:knows ?person2 }
      <bind events for p1+p2>
      # Part 2: Remove events, if a newer event can be found
      # (find the latest location for each person)
      FILTER NOT EXISTS {
        ?event3 rdf:type event:Event ;
                event:agent ?person1 ;
                event:time [tl:at ?dttm3] .
        ?event4 rdf:type event:Event ;
                event:agent ?person2 ;
                event:time [tl:at ?dttm4] .
        FILTER ((?dttm1 < ?dttm3) || (?dttm2 < ?dttm4))
      }
      # Part 3: Check if the latest registrations were close in space and time
      FILTER ( (abs(?lat2-?lat1)<0.01) && (abs(?long2-?long1)<0.01) &&
               (abs(hours(?dttm2)*60+minutes(?dttm2)-hours(?dttm1)*60-minutes(?dttm1))<10) )
    }

- Finds friends whose latest registrations are close in space and time
- Doesn't do anything for buffer management or re-execution of the query
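The Part 3 proximity filter can be read as a small predicate. The following Python sketch mirrors the thresholds of the query (0.01 degrees, 10 minutes); the function and parameter names are illustrative and not taken from the slides.

```python
from datetime import datetime


def minute_of_day(t):
    """Equivalent of hours(?dttm)*60 + minutes(?dttm) in the query."""
    return t.hour * 60 + t.minute


def is_nearby(lat1, long1, t1, lat2, long2, t2, max_deg=0.01, max_minutes=10):
    """True if two registrations are close in space and in time of day."""
    return (
        abs(lat2 - lat1) < max_deg
        and abs(long2 - long1) < max_deg
        and abs(minute_of_day(t2) - minute_of_day(t1)) < max_minutes
    )
```

As written, the time test compares only hours and minutes, so it ignores the date and does not wrap around midnight: 23:58 and 00:03 count as far apart, while the same clock time on two different days counts as close. The sketch reproduces that behaviour rather than correcting it.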

Approach 2: Window-Based Streaming SPARQL

    REGISTER QUERY CloseFriends COMPUTED EVERY 2m AS
    SELECT ?person1 ?person2
    FROM STREAM <http://myexample.org/personlocationupdates> [RANGE 10m STEP 2m]
    FROM <http://streams.org/socialnetwork.rdf>
    WHERE {
      # Part 1: Bind event data for all friends
      ?person1 foaf:knows ?person2 .
      <bind events for p1+p2>
      FILTER ( ( ((?lat2-?lat1)*(?lat2-?lat1)) < 0.01*0.01) )
      FILTER ( ( ((?long2-?long1)*(?long2-?long1)) < 0.01*0.01) )
    }
    ORDER BY ?dttm1 ?dttm2

- [RANGE 10m STEP 2m] sets the window range and repetition rate
- C-SPARQL environment handles windowing, removal of old events and repetition of query
- Duplicate removal has to be handled by external means
- Notification delay and duplicates lead to compromises
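Why the window parameters force a compromise can be seen from a toy timing model. The Python sketch below is an assumption-laden simplification, not C-SPARQL's actual semantics: evaluations are assumed to run at multiples of the step starting from time 0, and each evaluation at time t is assumed to see the window (t - range, t].

```python
import math


def next_evaluation(arrival_s, step_s):
    """Time of the first periodic evaluation at or after the arrival."""
    return math.ceil(arrival_s / step_s) * step_s


def notification_delay(arrival_s, step_s):
    """Waiting time until the event can first appear in a result."""
    return next_evaluation(arrival_s, step_s) - arrival_s


def evaluations_seeing(arrival_s, range_s, step_s):
    """Evaluation times whose window (t - range, t] contains the arrival."""
    times = []
    t = next_evaluation(arrival_s, step_s)
    while t - range_s < arrival_s:
        times.append(t)
        t += step_s
    return times
```

With RANGE 10m STEP 2m (600 s and 120 s), an event arriving at 130 s waits 110 s for the next evaluation and then stays visible in five consecutive evaluations, each of which can report the same match again. Shortening the step lowers the delay but repeats the query more often; shortening the range reduces duplicates but risks missing matches near window borders.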

Approach 3: Collaborative SPARQL Update Rules
- Query 1: Maintain only the latest registration in the workspace
- Query 2: Insert a nearby detection marker
- Query 3: Emit notifications
- Query 4: Delete nearby status
- No duplicate detections
- Buffer management handled by SPARQL

Rete-Algorithm in INSTANS
- Translation of SPARQL queries into an incremental processor
- Each input triple propagates according to the queries and resulting states are saved within the structure
- When a complete query is matched, results are immediately available
- This sample query selects events between 10 and 11 o'clock

Query:

    SELECT ?event WHERE {
      ?event a event:Event ;
             event:time ?time .
      ?time tl:at ?daytime .
      FILTER ( hours(?daytime) = 10 )
    }

Process flow:
1. Each condition corresponds to an α-node. α1 matches with sample input :e1 a event:Event.
2. :e1 propagates to β2 and is stored there.
3. α2 matches with :e1 event:time _:b1, where _:b1 is a blank node. Input from β2 matches with ?event in Y2.
4. :e1 and _:b1 propagate until β3.
5. α3 matches with input _:b1 tl:at "2011-10-03T10:05:00"^^xsd:dateTime.
6. In Y3 _:b1 is equal in both incoming branches and can be eliminated.
7. :e1 and "2011-10-03T10:05:00"^^xsd:dateTime reach filter1. The condition hour = 10 is true.
8. :e1 is selected as a result.

[Figure: Rete network for the sample query with α-nodes α1 (?event a event:Event), α2 (?event event:time ?time) and α3 (?time tl:at ?daytime), memories β1 to β3, join nodes Y1 to Y3, filter1 and select1; sample tokens :e1, _:b1 and 10:05 are shown along the edges, and _:b1 is dropped after Y3]
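The process flow above can be imitated with a hand-wired incremental matcher. This Python sketch is only an illustration of the idea (store partial matches, join as triples arrive, emit as soon as a match completes); it is hard-coded for the sample query, the class and attribute names are invented, and it is not how INSTANS is implemented.

```python
from datetime import datetime


class TinyRete:
    """Incremental matcher for the sample query (illustrative only)."""

    def __init__(self):
        self.seen = set()        # all triples received so far
        self.events = set()      # matches of: ?event a event:Event
        self.event_time = set()  # matches of: ?event event:time ?time
        self.time_at = {}        # matches of: ?time tl:at ?daytime
        self.results = []        # selected ?event values, in arrival order

    def add(self, s, p, o):
        """Feed one triple; complete matches are emitted immediately."""
        if (s, p, o) in self.seen:
            return
        self.seen.add((s, p, o))
        if p == "rdf:type" and o == "event:Event":
            self.events.add(s)
            for e, t in self.event_time:
                if e == s:
                    for d in self.time_at.get(t, ()):
                        self._filter_and_select(e, d)
        elif p == "event:time":
            self.event_time.add((s, o))
            if s in self.events:
                for d in self.time_at.get(o, ()):
                    self._filter_and_select(s, d)
        elif p == "tl:at":
            self.time_at.setdefault(s, set()).add(o)
            for e, t in self.event_time:
                if t == s and e in self.events:
                    self._filter_and_select(e, o)

    def _filter_and_select(self, event, daytime):
        # FILTER ( hours(?daytime) = 10 ), then SELECT ?event
        if daytime.hour == 10:
            self.results.append(event)
```

Feeding the three sample triples in the order of the slide yields :e1 as soon as the tl:at triple arrives; because the partial matches are stored, the same result is produced if the triples arrive in a different order.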

Comparison of Approaches

                              Single Query            C-SPARQL                 INSTANS
Correctness of notifications  Yes                     Yes if windows overlap   Yes
Duplication elimination       Only within one query   Only inside window       Yes
Timeliness of notifications   Query triggered         Periodically triggered   Event triggered
Scalability wrt #events       No                      Yes                      Yes

Notification Delay Results
- 5 simulated friends moving on a map
- C-SPARQL query processing delay varied 12-253 ms for 5-60 events, respectively
- Window repetition rate is the dominant component of the notification delay
- With 1 event per second inter-arrival time, C-SPARQL notification delay measured at 1.34-25.90 seconds
- INSTANS: 12 ms independent of window length
[Figure: notification delay [s] (axis 0.0 to 30.0) versus C-SPARQL window length (5 s, 10 s, 20 s, 30 s, 40 s, 50 s, 60 s) for C-SPARQL and INSTANS]

Summary
- Event processing based on RDF-encoded heterogeneous events
  - Event format can evolve independently of event processing application
  - Built-in support for disjoint vocabularies
- SPARQL Query + Update
  - Application can be built entirely out of collaborating SPARQL queries
  - Access to linked open data, future possibilities for inference
  - No proprietary extensions needed so far
  - Promise of good interoperability in multi-vendor multi-actor environments
- Continuous incremental matching using the Rete algorithm
  - No repeating windows (processing repetition, duplicate matches, missed detections on window borders)
- Application areas in smart spaces, context-aware mobile systems, internet-of-things, the real-time web etc.
*) ACM Special Interest Group for Applied Computing

Conclusions
- Collaborative SPARQL queries are a promising method for event processing using semantic web technologies
- A platform capable of continuous event-driven evaluation of parallel SPARQL queries supporting SPARQL 1.1 Update (INSERT) is needed
- INSTANS outperforms the comparison approaches
  - Single SPARQL query lacks buffer management and repetition
  - Window-based streaming SPARQL suffers from contradictory requirements in setting operation parameters
  - INSTANS gives correct notifications without duplicates in a fraction of the notification time of Streaming SPARQL

Background Material

Queries in Approach 3

Query 1) Window-query:

    DELETE { <bind event to variables> }
    WHERE {
      <bind event to variables>
      FILTER EXISTS {
        ?event2 event:agent ?person ;
                event:time [tl:at ?dttm2] .
        FILTER (?dttm < ?dttm2)
      }
    }

Query 2) Nearby detection:

    INSERT { ?person1 :nearby ?person2 }
    WHERE {
      ?person1 foaf:knows ?person2 .
      <bind events for p1+p2>
      # Check proximity in space and time
      FILTER ( (abs(?lat2-?lat1)<0.01) && (abs(?long2-?long1)<0.01) &&
               (abs(hours(?dttm2)*60+minutes(?dttm2)-hours(?dttm1)*60-minutes(?dttm1))<10) )
      # Don't insert, if the relation already exists
      FILTER NOT EXISTS { ?person1 :nearby ?person2 }
    }

Query 3) Notification:

    SELECT ?person1 ?person2
    WHERE { ?person1 :nearby ?person2 }

Query 4) Removal of "nearby" status:

    DELETE { ?person1 :nearby ?person2 }
    WHERE {
      ?person1 foaf:knows ?person2 .
      <bind events for p1+p2>
      FILTER ( (abs(?lat2-?lat1)>0.02) || (abs(?long2-?long1)>0.02) )
      FILTER EXISTS { ?person1 :nearby ?person2 }
    }
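How the four rules cooperate through a shared workspace can be sketched in Python. This is a sequential approximation under stated assumptions: the class and attribute names are invented, the workspace is modelled with a dictionary and a set instead of RDF triples, and the rules are evaluated in a fixed order on each incoming event, whereas the slides describe them as continuously evaluated queries.

```python
from datetime import datetime


class CloseFriends:
    """Toy model of Queries 1-4 sharing one workspace (illustrative only)."""

    def __init__(self, knows):
        self.knows = knows        # static input: set of (person1, person2)
        self.latest = {}          # person -> (lat, long, datetime)
        self.nearby = set()       # :nearby markers in the workspace
        self.notifications = []   # output of Query 3

    def on_event(self, person, lat, long, when):
        # Query 1: keep only the latest registration per person.
        old = self.latest.get(person)
        if old is None or old[2] < when:
            self.latest[person] = (lat, long, when)
        for p1, p2 in self.knows:
            if person not in (p1, p2):
                continue
            if p1 not in self.latest or p2 not in self.latest:
                continue
            lat1, long1, t1 = self.latest[p1]
            lat2, long2, t2 = self.latest[p2]
            dlat, dlong = abs(lat2 - lat1), abs(long2 - long1)
            dmin = abs((t2.hour * 60 + t2.minute) - (t1.hour * 60 + t1.minute))
            pair = (p1, p2)
            if dlat < 0.01 and dlong < 0.01 and dmin < 10 and pair not in self.nearby:
                self.nearby.add(pair)            # Query 2: insert marker once
                self.notifications.append(pair)  # Query 3: emit notification
            elif (dlat > 0.02 or dlong > 0.02) and pair in self.nearby:
                self.nearby.discard(pair)        # Query 4: delete nearby status
```

A short usage example with hypothetical persons and coordinates:

```python
svc = CloseFriends({("alice", "bob")})
svc.on_event("alice", 60.000, 24.000, datetime(2011, 10, 3, 10, 0))
svc.on_event("bob", 60.005, 24.005, datetime(2011, 10, 3, 10, 2))  # notification
svc.on_event("bob", 60.006, 24.005, datetime(2011, 10, 3, 10, 3))  # no duplicate
svc.on_event("bob", 60.050, 24.005, datetime(2011, 10, 3, 10, 4))  # marker removed
```

The marker is inserted below 0.01 degrees but removed only above 0.02 degrees, so a pair hovering around the detection threshold does not trigger repeated notifications.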