Supporting Complex Thematic, Spatial and Temporal Queries over Semantic Web Data

Similar documents
Sensor Data Management

A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data

Matthew S. Perry.

A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data

SPARQL-ST: Extending SPARQL to Support Spatiotemporal Queries

Semantic Sensor Web. CORE Scholar. Wright State University. Amit P. Sheth Wright State University - Main Campus,

Scaling the Semantic Wall with AllegroGraph and TopBraid Composer. A Joint Webinar by TopQuadrant and Franz

CORE Scholar. Wright State University. Amit P. Sheth Wright State University - Main Campus,

Extending SPARQL to Support Spatially and Temporally Related Information

Publishing Statistical Data and Geospatial Data as Linked Data Creating a Semantic Data Platform

Using Linked Data Concepts to Blend and Analyze Geospatial and Statistical Data Creating a Semantic Data Platform

AllegroGraph for Flexibility in the Enterprise and on the Web. Jans Aasman Franz Inc

GeoSPARQL Support and Other Cool Features in Oracle 12c Spatial and Graph Linked Data Seminar Culture, Base Registries & Visualisations

Cultural and historical digital libraries dynamically mined from news archives Papyrus Query Processing Technical Report

Representing and Querying Linked Geospatial Data

A General Approach to Query the Web of Data

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Title. Prolog, Rules, Reasoning and SPARQLing Magic in the real world. Franz Inc

Querying Data with Transact SQL

Developing markup metaschemas to support interoperation among resources with different markup schemas

ON HANDLING GEOGRAPHIC DATA OF PRINT AND DIGITAL FORMS IN ACADEMIC LIBRARIES: THE ROLE OF ONTOLOGIES

Knowledge Representation for the Semantic Web

Building Virtual Earth Observatories Using Scientific Database, Semantic Web and Linked Geospatial Data Technologies

Transforming Data from into DataPile RDF Structure into RDF

FOUNDATIONS OF SEMANTIC WEB TECHNOLOGIES

Flexible Tools for the Semantic Web

FOUNDATIONS OF SEMANTIC WEB TECHNOLOGIES

Contents. G52IWS: The Semantic Web. The Semantic Web. Semantic web elements. Semantic Web technologies. Semantic Web Services

An overview of RDB2RDF techniques and tools

RDF AND SPARQL. Part III: Semantics of RDF(S) Dresden, August Sebastian Rudolph ICCL Summer School

Oracle Spatial and Graph: Benchmarking a Trillion Edges RDF Graph ORACLE WHITE PAPER NOVEMBER 2016

Making BioPAX SPARQL

Today: RDF syntax. + conjunctive queries for OWL. KR4SW Winter 2010 Pascal Hitzler 3

SPARQL Query Re-Writing for Spatial Datasets Using Partonomy Based Transformation Rules

A Deductive System for Annotated RDFS

Incremental Export of Relational Database Contents into RDF Graphs

2. RDF Semantic Web Basics Semantic Web

Transformative characteristics and research agenda for the SDI-SKI step change:

Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:

Keyword Search in RDF Databases

Building Blocks of Linked Data

Deep integration of Python with Semantic Web technologies

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

Semantic Web and Linked Data

SEXTANT 1. Purpose of the Application

COMPUTER AND INFORMATION SCIENCE JENA DB. Group Abhishek Kumar Harshvardhan Singh Abhisek Mohanty Suhas Tumkur Chandrashekhara

<?xml version='1.0' encoding='iso '?> <!DOCTYPE rdf:rdf [ <!ENTITY rdf ' <!

Mustafa Jarrar: Lecture Notes on RDF Schema Birzeit University, Version 3. RDFS RDF Schema. Mustafa Jarrar. Birzeit University

Open And Linked Data Oracle proposition Subtitle

A Relaxed Approach to RDF Querying

Transformative characteristics and research agenda for the SDI-SKI step change: A Cadastral Case Study

Querying Data with Transact-SQL

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Event Stores (I) [Source: DB-Engines.com, accessed on August 28, 2016]

Presented by Kit Na Goh

Semantic Web. MPRI : Web Data Management. Antoine Amarilli Friday, January 11th 1/29

[AVNICF-MCSASQL2012]: NICF - Microsoft Certified Solutions Associate (MCSA): SQL Server 2012

Formalising the Semantic Web. (These slides have been written by Axel Polleres, WU Vienna)

Semantic Web In Depth: Resource Description Framework. Dr Nicholas Gibbins 32/4037

Logic and Reasoning in the Semantic Web (part I RDF/RDFS)

Semantic Processing of Sensor Event Stream by Using External Knowledge Bases

Introducing Linked Data

Querying Data with Transact SQL Microsoft Official Curriculum (MOC 20761)

FedX: Optimization Techniques for Federated Query Processing on Linked Data. ISWC 2011 October 26 th. Presented by: Ziv Dayan

Grid Resources Search Engine based on Ontology

Peer-to-Peer Discovery of Semantic Associations

Enterprise Information Integration using Semantic Web Technologies:

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

RDF Data Management: Reasoning on Web Data

May 22, 2013 Ronald Reagan Building and International Trade Center Washington, DC USA

Developing GeoSPARQL Applications with Oracle Spatial and Graph

INF3580/4580 Semantic Technologies Spring 2017

FAGI-gis: A tool for fusing geospatial RDF data

Semantic Web Technologies: Theory & Practice. Axel Polleres Siemens AG Österreich

Forward Chaining Reasoning Tool for Rya

Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Perceptions

Representation and Querying of Valid Time of Triples in Linked Geospatial Data

Mapping Relational Data to RDF with Virtuoso's RDF Views

Web Ontology Editor: architecture and applications

An Introduction to the Semantic Web. Jeff Heflin Lehigh University

Outline RDF. RDF Schema (RDFS) RDF Storing. Semantic Web and Metadata What is RDF and what is not? Why use RDF? RDF Elements

ORACLE DATABASE 12C INTRODUCTION

Strabon. Semantic support for EO Data Access in TELEIOS. Presenter: George Garbis

Novel System Architectures for Semantic Based Sensor Networks Integraion

Querying Data with Transact-SQL

Handling time in RDF

II B.Sc(IT) [ BATCH] IV SEMESTER CORE: RELATIONAL DATABASE MANAGEMENT SYSTEM - 412A Multiple Choice Questions.

Motivation and basic concepts Storage Principle Query Principle Index Principle Implementation and Results Conclusion

MySQL for Developers. Duration: 5 Days

M. Andrea Rodríguez-Tastets. I Semester 2008

APPLYING KNOWLEDGE BASED AI TO MODERN DATA MANAGEMENT. Mani Keeran, CFA Gi Kim, CFA Preeti Sharma

Ontological Modeling: Part 2

Linked data and its role in the semantic web. Dave Reynolds, Epimorphics

Peer-to-Peer Discovery of Semantic Associations

ISSUES IN SPATIAL DATABASES AND GEOGRAPHICAL INFORMATION SYSTEMS (GIS) HANAN SAMET

Hybrid Acquisition of Temporal Scopes for RDF Data

Linked Open Data: a short introduction

CREATING VIRTUAL SEMANTIC GRAPHS ON TOP OF BIG DATA FROM SPACE. Konstantina Bereta and Manolis Koubarakis

Spatial Data Management

New Trends in Database Systems

Transcription:

Supporting Complex Thematic, Spatial and Temporal Queries over Semantic Web Data Matthew Perry 1, Amit P. Sheth 1, Farshad Hakimpour 2, Prateek Jain 1 2 nd International Conference on Geospatial Semantics Mexico City, MX, November 29-30, 2007 1. Kno.e.sis Center, CSE Dept., Wright State University, Dayton, OH, USA 2. LSDIS Lab, CS Dept., The University of Georgia, Athens, GA, USA

Outline Background and Motivation Contributions Modeling Approach Querying Approach Evaluation Conclusions

Background

From.. Semantic Discovery * Finding things To.. Finding out about things Relationships! * http://knoesis.wright.edu/projects/semdis/

Semantic Associations ρ-path association ρ-iso association

SemDis Project author_of E1:Reviewer E2:Paper E5:Person author_of author_of E3:Paper author_of author_of E7:Submission friend_of E4:Person friend_of E6:Person author_of How is entity1 (Reviewer) related to entity7 (Submission)? Semantic Analytics Searching, Exploring and Visualizing semantically meaningful relationships Aggregated RDF Instance Base Ontology Schemas XML HTML RDBMS TEXT

Motivation

Motivation In many analytical applications spatial and temporal data is critical national security, regulatory compliance, etc. Current semantic analytics research has focused on thematic relationships GIS and Spatial Databases Traditionally model thematic aspects as directly attached attributes of geospatial objects Thematic entities/events and relationships should be represented as first-class objects to allow semantic analytics

Motivating Example Scenario (Biochemical Threat Detection): Analysts must examine soldiers symptoms to detect possible biochemical attack Find all soldiers with symptoms indicative of exposure to chemical X that fought in battles within 2 miles of sightings of members of enemy group Y Query specifies (1) a relationship between a soldier, a chemical agent and a battle location (2) a relationship between members of an enemy organization and their known locations (3) a spatial filtering condition based on the proximity of the soldier and the enemy group in this context

Contributions Storage and indexing scheme for spatial and temporal SW data (i.e. RDF) Efficient treatment of temporal inferencing Definition and implementation of 4 spatial and temporal query operators Performance study using large dataset

Modeling Approach

Representing ontologies and instance data W3C standards Resource Description Framework (RDF) Language for representing information about resources Resources are identified by Uniform Resource Identifiers (URIs) globally-unique Common framework for expressing information allows exchange and reuse without loss of meaning Graph-based data model Relationships are first class objects

RDF Model rdfs:class Directed Labeled Graph lsdis:person rdfs:literal rdfs:range rdf:property lsdis:speech rdfs:domain lsdis:politician Statement lsdis:nam (triple): <lsdis:politician_123> e <lsdis:gives> <lsdis:speech_456>. Subject lsdis:give Predicate rdfs:range Object rdfs:domain s Statement (triple): <lsdis:politician_123> <lsdis:name> Hillary Clinton. Defining Subject Defining Properties: Classes: Predicate Object <lsdis:gives> <lsdis:person> <rdf:type> <rdf:type> <rdf:property> <rdfs:class>.. lsdis:gives lsdis:politician_12 Subject Subject Predicate Predicate Object Object lsdis:speech_456 3 Defining Defining Class/Property Properties Hierarchies: (domain and range): <lsdis:politician> name <lsdis:gives> <rdfs:subclassof> <rdfs:domain> <lsdis:politician> <lsdis:person>.. Subject <lsdis:gives> <rdfs:range> Predicate <lsdis:politician> Object. Hillary Clinton Subject Predicate rdf:type Object rdfs:subclassof statement

Incorporation of Temporal Information Use Temporal RDF Graphs defined by Gutiérrez, et al 1 Considers time as a discrete, linearly-ordered domain Associate time intervals with statements which represent the valid-time of the statement Essentially a quad instead of a triple 1. Claudio Gutiérrez, Carlos A. Hurtado, Alejandro A. Vaisman. Temporal RDF. ESWC 2005: 93-107

Example Temporal Graph: Platoon Membership assigned_to [5, 15] E4:Soldier E1:Soldier assigned_to [1, 10] E2:Platoon assigned_to [11, 20] E3:Platoon assigned_to [5, 15] E5:Soldier

Temporal RDF Serialization C Statement CalendarClockInterval object begins Instant incalendarclockdatatype XMLSchema Datetime B A predicate subject temporal ends Instant incalendarclockdatatype XMLSchema Datetime RDF Reification OWLTime NEW

Example Domain Ontology 1. Matthew Perry, Farshad Hakimpour, Amit Sheth. "Analyzing Theme, Space and Time: An Ontology-based Approach", 14th Intl Symp. on Advances in Geographic Information Systems (ACM-GIS '06), Arlington, VA, November 10-11, 2006

Thematic Contexts Linking Non-Spatial Entities to Spatial Entities E2:Soldier E4:Address located_at located_at lives_at lives_at E6:Address E1:Soldier assigned_to Georeferenced Coordinate Space (Spatial Regions) occurred_at E7:Battle participates_in E8:Military_Unit E8:Military_Unit participates_in assigned_to E5:Battle occurred_at Residency Battle Participation E3:Soldier Named Places Spatial Occurrents Dynamic Entities

Querying Approach

Graph Patterns 20

Query Operators Two spatial operators spatial_extent() Find all soldiers participating in military events that take place within an input bounding box spatial_eval() Find all soldiers with symptoms indicative of exposure to chemical X that fought in battles within 2 miles of sightings of enemy group Y Two temporal operators temporal_extent() Return all soldiers exhibiting a given symptom during a specific time period temporal_eval() Find all soldiers who exhibited symptoms after participating in a given military event

Implementation SQL-based Approach Created user-defined functions in Oracle DBMS Create procedures for spatial and temporal indexing Existing Oracle Technologies Semantic Technology Component Storage Structures for RDF(S) Data RDFS Inference Procedures SQL-based Querying Spatial Component Spatial Types SDO_GEOMETRY Implementation of Spatial_Region Spatial Indexing (R-Tree) Spatial Operators

SQL-based Querying Approach SQL Table Functions SELECT X, Y FROM TABLE (Table_Func( )); X Y Z a b c d e f

Spatial Operators (spatial_extent) spatial_extent (graphpattern VARCHAR, spatialvar VARCHAR, ontology RDFModels, <geom SDO_GEOMETRY>, <spatialrelation VARCHAR>) returns AnyDataSet; Select all soldiers on the crew of a vehicle used in a military event that occurred within 45 miles of a given point SELECT x FROM TABLE (spatial_extent( '(?x <knoesis:on_crew_of>?y)(?y <knoesis:used_in>?z) (?z <knoesis:occurred_at>?l)', 'l', SDO_RDF_Models('military'), SDO_GEOMETRY(2001, 8265, SDO_POINT_TYPE( -71.796531, 44.304772, NULL), NULL, NULL), 'GEO_DISTANCE(distance=45 unit=mile)')));

Spatial Operators (spatial_eval) spatial_eval (graphpattern VARCHAR, spatialvar VARCHAR, graphpattern2 VARCHAR, spatialvar2 VARCHAR, spatialrelation VARCHAR, ontology RDFModels) return AnyDataSet; Select all platoons with a training area that overlaps the training area of Platoon_12996 SELECT b FROM TABLE (spatial_eval( '(<knoesis:platoon_12996> <knoesis:trains_at>?z) (?z <knoesis:located_at>?l)', 'l', '(?b <knoesis:trains_at>?c) (?c <knoesis:located_at>?d)', 'd', 'GEO_RELATE(mask=overlap)', SDO_RDF_Models('military')));

Temporal Properties of Context Instances Temporal Intersection [6, 12] Soldier#123 assigned_to:[3, 12] assigned_to:[6, 20] Platoon#456 Soldier#789 Temporal Range [3, 20]

Temporal Operators (temporal_extent) temporal_extent (graphpattern VARCHAR, intervaltype VARCHAR, ontology RDFModels, <start DATE>, <end DATE>, <temporalrel VARCHAR>) return AnyDataSet; Select all soldiers (and their platoons) on the crew of a vehicle actively used during the time period [10/04/1942, 9/21/1944] SELECT x, a FROM TABLE (temporal_extent( '(?x <knoesis:on_crew_of>?y) (?y <knoesis:used_in>?z) (?x <knoesis:assigned_to>?a)', 'INTERSECT' SDO_RDF_Models('military'), to_date('1942-10-04 00:26:01', 'yyyy-mm-dd hh24:mi:ss'), to_date('1944-09-21 08:22:00', 'yyyy-mm-dd hh24:mi:ss'), 'DURING'));

Temporal Operators (temporal_eval) temporal_eval (graphpattern VARCHAR, intervaltype VARCHAR, graphpattern2 VARCHAR, intervaltype2 VARCHAR, temporalrel VARCHAR, ontology RDFModels) return AnyDataSet; Find all pairs of soldiers such that one soldier was leader of a platoon in Division_2186 and the other soldier was leader of a platoon in Division_2191 during the same time period SELECT x, a FROM TABLE (temporal_eval( '(?x <knoesis:leader_of>?y) (?y <knoesis:platoon_of>?z) (?z <knoesis:battalion_of> <knoesis:division_2186>)', 'INTERSECT', '(?a <knoesis:leader_of>?b) (?b <knoesis:platoon_of>?c) (?c <knoesis:battalion_of> <knoesis:division_2191>)', 'INTERSECT', 'ANYINTERACT', SDO_RDF_Models('military')));

Storage Scheme Temporal Indexing Procedure Load RDF Data with Oracle Spatial Indexing Procedure

Temporal Inferencing RDFS Inferencing Rules 1. (x, rdf:type, y) AND (y, rdfs:subclassof, z) (x, rdf:type, z) 2. (x, p, y) AND (p, rdfs:domain, a), (p, rdfs:range, b) (x, rdf:type, a), (y, rdf:type, b) 3. (x, p, y) AND (p, rdfs:subpropertyof, z) (x, z, y) Example: (x, participates_in, e):[2, 5] (y, participates_in, e):[3, 7] (z, participates_in, e):[6, 9] By rule 2: (e, rdf:type, event):[2, 9]

Temporal Inferencing Algorithm 1. create table asserted_temporal_triples (subj, prop, obj, start_date, end_date) 2. perform schema-level inferencing 3. perform instance-level inferencing 4. sort redundant_triples by (subj_id, prop_id, obj_id, start) 5. make a single pass and merge overlapping intervals for same statement 6. insert updated triples and intervals into final temporal_triples table asserted_temporal_triples (x, participates_in, e):[2, 5] (y, participates_in, e):[3, 7] (z, participates_in, e):[6, 9] redundant_triples (e, rdf:type, event):[2, 5] (e, rdf:type, event):[3, 7] (e, rdf:type, event):[6, 9] temporal_triples (e, rdf:type, event):[2, 9]

Function Implementation Oracle Extensibility Framework ODCITable Interface Start() Fetch() Close()

Spatial Function Implementation 1. Start() 1. Translate graph pattern into multi-way join over temporal_triples table 2. Add sdo_relate or sdo_within_distance predicate to reflect spatial relation parameter 2. Fetch() 1. Retrieve rows until all are returned 3. Close() 1. Close cursor for query For spatial_eval: Nested Loop Join Strategy outer spatial_extent() inner filtered spatial_extent()

Temporal Function Implementation 1. Start() 1. Translate graph pattern into multi-way join over temporal_triples table 2. Push temporal filtering down as much as possible e.g., RANGE during [x, y] all statements start >= x AND end <= y 2. Fetch() 1. Retrieve an intermediate result row 2. Construct INT/RANGE interval for result row 3. Apply temporal filtering condition and return matching rows 3. Close() 1. Close cursor for query For temporal_eval: Nested Loop Join Strategy outer temporal_extent() inner filtered temporal_extent()

Evaluation

Evaluation Environment Oracle 10g R2 on RedHat EL Dual Xeon 3.0 GHz, 2GB RAM 512 MB buffer cache Objective Test scalability w.r.t (1) dataset size, (2) query complexity Dataset Thematic: Synthetic RDF Graph (Military Schema) Generated with TOntoGen 1 100k triples 1.6 M triples 15 M triples Spatial: US Census Block Group boundary polygons 873 9,352 83,236 Temporal: start and end times selected with uniform probability from 2 overlapping date ranges 1. Matthew Perry "TOntoGen: A Synthetic Data Set Generator for Semantic Web Applications", AIS SIGSEMIS Bulletin Volume 2 Issue 2 (April - June) 2005, pp. 46-48

Scalability w.r.t Dataset Size All times reported are average of 15 trials using a warm cache

Scalability w.r.t. Graph Pattern Size Times reported for 15 million triple dataset

Conclusions and Future Work Conclusions Approach for realizing spatial and temporal queries over SW data Motivated by lack of support in current semantic analytics tools Good scalability for large synthetic dataset Future Work SPARQL query language extensions