AllegroGraph for Flexibility in the Enterprise and on the Web. Jans Aasman Franz Inc

AllegroGraph for Flexibility in the Enterprise and on the Web Jans Aasman Franz Inc ja@franz.com

What is a triple store?

(1 (2 3) (4 5) (6 7) (8 9) (10 11) (12 13) (14 15)(16 17) (18 19 20 21 22 23 24 27 28) (29 30))

How is it different and why is it more flexible?
- No schema: say whatever you want to say, although ontologies may constrain what you put in the triple store.
- No link tables: one-to-many relationships can be expressed directly.
- No indexing choices: add new data attributes (predicates) on the fly and they are available for querying in real time.
- Takes anything you give it: it is trivial to consume rows and columns from an RDB, XML, RDF(S), OWL, text and extracted entities.
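As a minimal sketch of the points above, the following toy in-memory store shows one-to-many relationships without a link table and a new predicate that is queryable as soon as it is added. The class `TinyTripleStore`, its methods and the sample names are hypothetical illustrations, not AllegroGraph's API.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Hypothetical toy store for illustration; not AllegroGraph's API."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        # No schema to declare: any subject/predicate/object is accepted.
        self.triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Set[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        return {
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        }


store = TinyTripleStore()
# One-to-many without a link table: simply add more triples.
store.add("jans", "knows", "alice")
store.add("jans", "knows", "bob")
# A brand-new predicate needs no schema change and can be queried at once.
store.add("jans", "worksFor", "franz")

friends = sorted(o for _, _, o in store.match(s="jans", p="knows"))
employer = store.match(p="worksFor")
```

A real store would index the triples rather than scan them; the linear scan here only keeps the sketch short.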

And how is it the same?
AllegroGraph has all the enterprise features of a relational database:
- 24/7, robust, real-time processing
- ACID, transactional
- Replication, online backup, point-in-time recovery
- High availability
- Many kinds of full-text indexing
- REST protocol

Why triple stores are here to stay
There is now a standard for:
- metadata or graph data representation: RDF/RDFS
- logic and inferencing: RDFS/OWL
- querying: SPARQL
- rules: RuleML, SWRL
With vertical compression we can get the data smaller than in an RDB.
Comparable to RDB and NoSQL with respect to speed and scalability.
And there is an enormous impetus to share data via RDF.

Oct 2007

Latest LOD cloud: Sept 22, 2010

Just look at this

But what about NoSQL? By that we mean Cassandra, HBase, CouchDB, etc.

Classes of applications and their natural database:
- Regular enterprise applications: relational databases
- Web-scale shallow objects: Hadoop, HBase, Cassandra, etc.
- Complex metadata applications: triple stores

Examples of things that work better in triple stores


A financial application: find a cyclic path between two companies that are engaged in revenue boosting.

Q1: A reasonably hard query for horizontally scaling stores and RDBs, a straightforward query for vertical/parallel stores.

    Select ?a ?b ?c ?d ?e where {
      Franz send-money ?a
      ?a send-money ?b
      ?b send-money ?c
      ?c send-money Cray
      Cray send-money ?d
      Not (?d = ?c)
      ?d send-money ?e
      Not (?e = ?b)
      ?e send-money Franz
    }
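The Q1 pattern can be read as a chain of edge lookups with two inequality filters. The sketch below evaluates it over a small in-memory edge list; the function `q1`, the company names other than Franz and Cray, and the sample edges are assumptions made for illustration, and a real store would use its indices rather than nested Python loops.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (sender, receiver) of a send-money statement


def q1(edges: Iterable[Edge], src: str = "Franz",
       dst: str = "Cray") -> List[Tuple[str, str, str, str, str]]:
    """Find (a, b, c, d, e) with src->a->b->c->dst->d->e->src, d != c, e != b."""
    out: Dict[str, Set[str]] = {}
    for sender, receiver in edges:
        out.setdefault(sender, set()).add(receiver)

    results = []
    for a in out.get(src, ()):
        for b in out.get(a, ()):
            for c in out.get(b, ()):
                if dst not in out.get(c, ()):
                    continue  # ?c send-money Cray must hold
                for d in out.get(dst, ()):
                    if d == c:
                        continue  # Not (?d = ?c)
                    for e in out.get(d, ()):
                        if e == b:
                            continue  # Not (?e = ?b)
                        if src in out.get(e, ()):
                            results.append((a, b, c, d, e))
    return sorted(results)


# Hypothetical sample data: one genuine cycle plus a decoy edge Cray -> A3.
sample_edges = [
    ("Franz", "A1"), ("A1", "A2"), ("A2", "A3"), ("A3", "Cray"),
    ("Cray", "B1"), ("B1", "B2"), ("B2", "Franz"),
    ("Cray", "A3"),
]
cycles = q1(sample_edges)
```

Each nesting level corresponds to one self-join on the send-money relation, which is why the query grows awkward in SQL as the path gets longer.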

Q2: A very hard query for horizontally scaling stores and RDBs, a straightforward query for vertical/parallel stores.

Find a money trail from Franz to Cray that is two or more steps long, then find another money trail between Franz and Cray that is also two or more steps long, where the two trails are completely different.

    (select (?path1 ?path2)
      (path Franz Cray <send-money> >= 2 ?path1)
      (path Cray Franz <send-money> >= 2 ?path2)
      (empty (intersection ?path1 ?path2)))
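A rough sketch of Q2 follows. It mirrors the query as written: the first trail runs from Franz to Cray and the second from Cray back to Franz. Several choices here are assumptions rather than the behavior of the `path` primitive: trails are simple paths, "completely different" is taken to mean no shared intermediate company, and a hypothetical `max_steps` bound keeps the search finite.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Edge = Tuple[str, str]


def simple_paths(out: Dict[str, Set[str]], src: str, dst: str,
                 min_steps: int = 2, max_steps: int = 6) -> Iterator[List[str]]:
    """Yield simple paths from src to dst with min_steps..max_steps edges."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        for nxt in sorted(out.get(node, ())):
            if nxt == dst:
                # Appending dst gives a path with len(path) edges.
                if len(path) >= min_steps:
                    yield path + [dst]
            elif nxt not in path and len(path) < max_steps:
                stack.append((nxt, path + [nxt]))


def disjoint_trails(edges: Iterable[Edge], a: str, b: str,
                    min_steps: int = 2) -> List[Tuple[List[str], List[str]]]:
    """Pair a->b trails with b->a trails that share no intermediate node."""
    out: Dict[str, Set[str]] = {}
    for sender, receiver in edges:
        out.setdefault(sender, set()).add(receiver)
    forward = list(simple_paths(out, a, b, min_steps))
    back = list(simple_paths(out, b, a, min_steps))
    return [
        (p1, p2) for p1 in forward for p2 in back
        if not set(p1[1:-1]) & set(p2[1:-1])  # empty intersection
    ]


# Hypothetical data: a direct edge (too short), one forward trail via X,
# and two return trails, one of which reuses X and must be rejected.
sample_edges = [
    ("Franz", "Cray"),
    ("Franz", "X"), ("X", "Cray"),
    ("Cray", "Y"), ("Y", "Franz"),
    ("Cray", "X"), ("X", "Franz"),
]
trails = disjoint_trails(sample_edges, "Franz", "Cray")
```

Because each candidate trail can be explored independently, this kind of search parallelizes naturally, which matches the claim made for vertical/parallel stores.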

Why is this hard in SQL? Relational databases are very good at straight joins but less optimal for self-joins of unpredictable length. Try writing this as a SQL query.

Why is this hard in distributed key/value stores? Databases like Cassandra are extremely good at retrieving nested objects in a sea of billions of objects but are less optimal for joins. It is relatively hard to write these queries as MapReduce expressions. Every query has to be expressed as a program, so ad hoc querying is discouraged.

Why is this easier in vertical scaling triple stores?
- These queries can be easily parallelized.
- Graph databases are made to do graph search.
- Ad hoc queries are easy to write in SPARQL or Prolog.

Summarizing

                            RDB    Cassandra, HBase    AllegroGraph 4.0
Schemaless                  -      +/-                 +
Transactions                +      +/-                 +
ACID                        +      +/-                 +
Concurrent & Dynamic        +      +                   +
Random Access               +      +                   +
High Availability           +      +                   +
Complex graph search        -      -                   +
Structured + Unstructured   -      +/-                 +
Scalability                 +      ++                  +
General Flexibility         -      +/-                 +

Is it also used in the enterprise?

Why do they say they use triple stores?
- Flexibility in changing your data schemas on the fly and doing real-time processing with (meta)data: add new predicates and use them in queries the next microsecond.
- Flexibility in writing ad hoc queries, in finding complex patterns and in working with rules.
- Flexibility in integrating various data sources without complicated master data management.
And there is a standard.

An event example

A Simple Event Ontology
- A type (meeting, communication event, financial transaction, visit, attack/truce, insurance claim, purchase order): RDFS++ reasoning
- A list of actors: social network analysis
- A place: geospatial reasoning
- A start time and possibly an end time: temporal reasoning
- Anything else that describes the event, such as goods that changed hands

Social Network Analysis answers 4 questions:
- How far is P1 from P2 (and how strong is the relation)?
- To what groups does this person belong (ego groups, cliques)?
- How important is this person in the group?
- Does this group have a leader, and how cohesive is it?
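The first three questions can be sketched with a breadth-first search over a "knows" graph. The functions `hops_from`, `distance`, `ego_group` and `degree_centrality` below are hypothetical helpers, degree centrality is only one of several possible importance measures, and the sample names apart from Jans are made up.

```python
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]  # undirected "knows" relation as adjacency sets


def hops_from(graph: Graph, start: str,
              max_depth: Optional[int] = None) -> Dict[str, int]:
    """Breadth-first search: hop count from start to every reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if max_depth is not None and dist[node] >= max_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def distance(graph: Graph, p1: str, p2: str) -> Optional[int]:
    """How far is p1 from p2? None if they are not connected."""
    return hops_from(graph, p1).get(p2)


def ego_group(graph: Graph, person: str, depth: int) -> Set[str]:
    """The person plus everyone within `depth` hops (assumed definition)."""
    return set(hops_from(graph, person, depth))


def degree_centrality(graph: Graph, group: Set[str]) -> Dict[str, int]:
    """Importance as the number of direct contacts inside the group."""
    return {p: len(graph.get(p, set()) & group) for p in group}


knows: Graph = {
    "jans": {"alice", "bob"},
    "alice": {"jans", "bob", "carol"},
    "bob": {"jans", "alice"},
    "carol": {"alice", "dave"},
    "dave": {"carol"},
}
group = ego_group(knows, "jans", 2)
scores = degree_centrality(knows, group)
most_important = max(sorted(scores), key=scores.get)
```

Here the ego group of depth 2 covers friends and friends of friends, and the highest-scoring member plays the role of the most important person in the group.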

GeoSpatial: make the following super efficient
- Where did something happen?
- How far was event1 from event2?
- Find all the events that occurred in a bounding box or within a radius of M miles.
- Do these two shapes overlap?
- Find all the objects in the intersection of two shapes.
All of this on a very large scale, when things don't fit in memory: millions of events and polygons.
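The distance, radius and bounding-box questions can be illustrated with the haversine formula. This is only a sketch: it scans every event linearly, whereas the point of the slide is that a geospatial index avoids such scans. The helper names, the event names and the approximate coordinates are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_MILES = 3958.8   # mean Earth radius


def haversine_miles(p: LatLon, q: LatLon) -> float:
    """Great-circle distance between two points in miles."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def in_bounding_box(p: LatLon, south_west: LatLon, north_east: LatLon) -> bool:
    """True if p lies inside the box (boxes crossing the date line not handled)."""
    return (south_west[0] <= p[0] <= north_east[0]
            and south_west[1] <= p[1] <= north_east[1])


def events_within(events: Dict[str, LatLon], center: LatLon,
                  miles: float) -> List[str]:
    """Names of all events within `miles` of center (linear scan, no index)."""
    return sorted(name for name, loc in events.items()
                  if haversine_miles(center, loc) <= miles)


BERKELEY: LatLon = (37.8716, -122.2727)  # approximate coordinates
events: Dict[str, LatLon] = {
    "meeting-oakland": (37.8044, -122.2712),   # approximate
    "meeting-san-jose": (37.3382, -121.8863),  # approximate
}
nearby = events_within(events, BERKELEY, 5.0)
```

A bounding-box test is often used as a cheap first filter before the exact distance is computed.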

Temporal Reasoning
- Adhere to our convention for encoding start times and end times and enjoy efficient temporal primitives.
- Implementation of Allen's interval logic primitives.
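Allen's interval logic distinguishes thirteen possible relations between two intervals. The sketch below classifies a pair of intervals given as (start, end) tuples; the function name `allen_relation` and the tuple encoding are assumptions for illustration and do not reflect how AllegroGraph encodes start and end times. It assumes each interval has a start strictly before its end.

```python
from datetime import date
from typing import Tuple

Interval = Tuple[date, date]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the Allen relation that holds from interval a to interval b."""
    a0, a1 = a
    b0, b1 = b
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only partial overlaps remain at this point.
    return "overlaps" if a0 < b0 else "overlapped-by"


window = (date(2008, 11, 1), date(2008, 11, 6))
meeting = (date(2008, 11, 3), date(2008, 11, 4))
relation = allen_relation(meeting, window)
```

With the sample dates, the meeting lies strictly inside the window, so the relation is "during".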

Activity Recognition
Our customers use AllegroGraph as an event database with social network analysis and geospatial and temporal reasoning.

Find all meetings that happened in November within 5 miles of Berkeley that were attended by the most important person among Jans's friends and friends of friends.

    (select (?x)
      (ego-group person:jans knows ?group 2)            ; SNA
      (actor-centrality-members ?group knows ?x ?num)   ; SNA
      (q ?event fr:actor ?x)                            ; DB lookup
      (qs ?event rdf:type fr:meeting)                   ; RDFS
      (interval-during ?event 2008-11-01 2008-11-06)    ; Temporal
      (geo-box-around geoname:berkeley ?event 5 miles)  ; Spatial
      !)

Thank you. Questions?