E6885 Network Science Lecture 10: Graph Database (II)

E6885 Topics in Signal Processing: Network Science
Ching-Yung Lin, Dept. of Electrical Engineering, Columbia University
November 18th, 2013

Course Structure

Class Date  Lecture  Topics Covered
09/09/13    1        Overview of Network Science
09/16/13    2        Network Representation and Feature Extraction
09/23/13    3        Network Partitioning, Clustering and Visualization
09/30/13    4        Network Analysis Use Case
10/07/13    5        Network Sampling, Estimation, and Modeling
10/14/13    6        Network Topology Inference
10/21/13    7        Network Information Flow
10/28/13    8        Dynamic & Probabilistic Networks and Graph Database
11/11/13    9        Final Project Proposal Presentation
11/18/13    10       Graph Databases II
11/25/13    11       Information Diffusion in Networks
12/02/13    12       Large-Scale Network Processing System
12/09/13    13       Final Project Presentation I

RDF and SPARQL

Resource Description Framework (RDF)
A W3C standard since 1999.
Triples
Example: if a company has nine of part p1234 in stock, a simplified triple representing this might be {p1234 instock 9}, that is: instance identifier, property name, property value.
A proper RDF version of this triple is more formal: its identifiers must be uniform resource identifiers (URIs).
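The triple above can be sketched in Python. This is a minimal illustration, not part of the lecture: the namespace http://example.org/inventory# and the helper names Triple and to_ntriples are assumptions made for the example, since the slides do not give the real URIs.

```python
from typing import NamedTuple, Union

# Hypothetical namespace, chosen only for illustration.
EX = "http://example.org/inventory#"


class Triple(NamedTuple):
    subject: str            # instance identifier (a URI)
    predicate: str          # property name (a URI)
    obj: Union[str, int]    # property value (a URI string or an integer literal)


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples-style line.

    Assumption of this sketch: integer objects are literals and
    string objects are URIs.
    """
    if isinstance(t.obj, int):
        value = f'"{t.obj}"^^<http://www.w3.org/2001/XMLSchema#integer>'
    else:
        value = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {value} ."


# The informal form from the slide, and the same statement with URIs.
simple = ("p1234", "instock", 9)
formal = Triple(EX + "p1234", EX + "instock", 9)

print(to_ntriples(formal))
```

Each statement becomes one self-contained line, which is what lets RDF tools read a file of such lines without any schema.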

An example of a complete description

Advantages of RDF
Virtually any RDF software can parse the lines shown above as a self-contained, working data file.
You can declare properties if you want. The RDF Schema standard lets you declare classes and relationships between properties and classes.
The flexibility that comes from not depending on schemas is the first key to RDF's value.
Triples can be split across several lines without affecting their collective meaning, which makes sharding of data collections easy.
Multiple datasets can be combined into a usable whole with simple concatenation.
Because the inventory dataset's property names are URIs, datasets that share a vocabulary are easy to aggregate.
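The concatenation and shared-vocabulary points can be illustrated with a small Python sketch. The two warehouse datasets, the namespace, and the helper values_by_subject are invented for this example; only the idea (combining by union, aggregating on a shared property URI) comes from the slide.

```python
from collections import defaultdict

# Hypothetical shared vocabulary, assumed for illustration only.
EX = "http://example.org/inventory#"

# Two independently produced datasets (invented data).
warehouse_a = {
    (EX + "p1234", EX + "instock", 9),
    (EX + "p5678", EX + "instock", 2),
}
warehouse_b = {
    (EX + "p9999", EX + "instock", 4),
    (EX + "p5678", EX + "instock", 2),  # same statement as in warehouse_a
}

# A set of triples has no row order and no schema to reconcile,
# so combining two datasets is a plain union.
merged = warehouse_a | warehouse_b


def values_by_subject(triples, predicate):
    """Collect the values of one property, grouped by subject."""
    out = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            out[s].append(o)
    return dict(out)


# Both datasets use the same property URI, so one lookup covers both.
stock = values_by_subject(merged, EX + "instock")
print(len(merged), "triples;", len(stock), "parts with stock levels")
```

The duplicate statement collapses in the union, so repeating a triple in two files does not change what the combined data says.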

SPARQL Query Language for RDF
The SPARQL query on this slide asks for all property names and values associated with the fbd:s9483 resource.

The SPARQL Query Result from the previous example
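The query and its result are not reproduced in this transcript. As a rough sketch of what such a query does, the Python below matches a triple pattern with a fixed subject and variable property and value, which is roughly the shape of `SELECT ?p ?o WHERE { fbd:s9483 ?p ?o }`. The expansion of the fbd: prefix, the sample triples, and the helper names match and properties_of are all assumptions made for this example.

```python
# Hypothetical expansion of the fbd: prefix; the real one is not in the slides.
FBD = "http://example.org/fbd#"

# Invented sample data.
triples = [
    (FBD + "s9483", FBD + "name", "sample item"),
    (FBD + "s9483", FBD + "instock", 9),
    (FBD + "s0001", FBD + "instock", 3),
]


def match(data, s=None, p=None, o=None):
    """Yield triples matching a pattern; None plays the role of a variable."""
    for ts, tp, to in data:
        if s is not None and ts != s:
            continue
        if p is not None and tp != p:
            continue
        if o is not None and to != o:
            continue
        yield ts, tp, to


def properties_of(data, resource):
    """Return all (property name, value) pairs for one resource."""
    return [(p, o) for _, p, o in match(data, s=resource)]


for prop, value in properties_of(triples, FBD + "s9483"):
    print(prop, value)
```

A real SPARQL engine such as Apache Jena does the same kind of pattern matching, with indexing and query planning that this sketch leaves out.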

Another SPARQL Example
What is this query for?
Data

Open Source Software: Apache Jena

Property Graphs

Reference

A usual example

Query Example I

Query Examples II & III: computationally intensive

Graph Database Example

Execution Time in the example of finding extended friends (by Neo4j)

Modeling Order History as a Graph

A query language on Property Graph: Cypher

Cypher Example

Other Cypher Clauses

Property Graph Example: Shakespeare

Creating the Shakespeare Graph

Query on the Shakespeare Graph

Another Query on the Shakespeare Graph

Chaining on the Query

Example: Email Interaction Graph
What's this query for?

Building Application Example: Collaborative Filtering

How to make a graph database fast?

Use relationships, not indexes, for fast traversal
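One way to read this slide is that each node keeps direct references to its neighbours, so a traversal follows those references instead of consulting a global index at every hop. The Python sketch below contrasts the two styles. The class Node and the functions connect, friends_of_friends, and friends_of_friends_indexed are hypothetical names for this illustration and do not describe Neo4j's actual storage code.

```python
class Node:
    """A node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.out = []  # outgoing relationships as Node references


def connect(a, b):
    a.out.append(b)


def friends_of_friends(start):
    """Two hops by following references; no lookup structure is consulted."""
    direct = set(start.out)
    found = set()
    for friend in start.out:
        for fof in friend.out:
            if fof is not start and fof not in direct:
                found.add(fof.name)
    return found


def friends_of_friends_indexed(edge_index, start_name):
    """Same traversal, but every hop goes through a lookup keyed by name.

    The dict stands in for a database index; in a real store such a
    lookup typically costs more than following a stored reference.
    """
    direct = set(edge_index.get(start_name, []))
    found = set()
    for friend in direct:
        for fof in edge_index.get(friend, []):
            if fof != start_name and fof not in direct:
                found.add(fof)
    return found


# Invented sample graph.
alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
connect(alice, bob)
connect(alice, dave)
connect(bob, carol)
connect(bob, alice)
connect(dave, carol)

edge_index = {"alice": ["bob", "dave"], "bob": ["carol", "alice"], "dave": ["carol"]}

print(friends_of_friends(alice))
print(friends_of_friends_indexed(edge_index, "alice"))
```

Both functions return the same answer; the difference is in how each hop is resolved, which is what matters as the graph and the traversal depth grow.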

Storage Structure Example

Nodes and Relationships in the Object Cache

IBM System G

What is System G?
A Complete Set of Visualizations, Analytical Algorithms, Middleware and Data Stores Designed to Support Graph Applications

Rich Graph Algorithm/Function Primitives: Centralities, Communities, Graph Sampling, Network Info Flow, Shortest Paths, Ego Net Features, Graph Matching, Graph Query, Graph Search, Bayesian Networks, Latent Net Inference, Markov Networks, and more.

Multi Graph Type Support:
Few, very large graphs (e.g. social, Internet of things)
Many, many small graphs (e.g. protein, healthcare)
Large semantic graph (Semantic web, RDF, Graph search, Graph recommendation)
Large probabilistic graphical models: Bayesian networks, Markovian networks, HMMs, etc.

Other components: Graph Visualizations; Graph Databases; Graph Middleware for Hardware Platform Optimization; Graph Data Interface and Processing Interface; Graph-Embedded Industry Solutions.

Based on ~$21M research funding ==> 65+ research innovations/papers including 7 best paper awards.
New: BigData 2013 Best Paper Award (http://www.ieeebigdata.org)

[Figure: layers of graph analysis. Labels on the slide: Graphs; Graph Database (RDF / Property Graph); Attributes; Contextual Analysis; Topological Analytics; Collective Graph; Macro Collective Analysis; Graphical Models; Activity Graph; Micro; Reasoning; Cognitive Understanding]

Preliminary comparison for Recommendation & Visualization
IBM KnowledgeView 1-year access log: 72.3K users, 82.1K docs, and 1.74 million downloads.
Recommendation ==> 2-hop traversal & ranking
Visualization ==> 4-hop traversal & rankings

Query time (sec) by application type:

System                 Collaborative Filtering   Centroid Graph Extraction
                       for Recommendation*       for Visualization
DB2 via SQL            0.24                      52.0 (cold), 50.6 (cache)
Oracle via SQL         0.35                      201.0 (cold), 42.0 (cache)
DB2RDF via SPARQL      TBD                       TBD
Neo4j                  0.068                     4.8 (cold), 1.2 (cache)
Titan (Berk. DB)       0.281                     17.3 (cold), 6.8 (cache)
Titan (HBase)          0.414                     24.2 (cold), 5.7 (cache)
GBase (HBase)          0.201                     27.0 (cold), 2.4 (cache)
System G Native Store  0.015                     4.2 (cold), 0.07 (cache)

Note: All numbers are preliminary.
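The "2-hop traversal & ranking" behind the recommendation column can be sketched as follows. This is one plausible reading, not the benchmark's actual implementation: start from a document, hop to the users who downloaded it, hop again to the other documents those users downloaded, and rank by how often each document is reached. The download log and the function recommend are invented for the example.

```python
from collections import Counter, defaultdict

# Invented download log: (user, document) pairs.
downloads = [
    ("u1", "d1"), ("u1", "d2"),
    ("u2", "d1"), ("u2", "d2"), ("u2", "d3"),
    ("u3", "d1"), ("u3", "d2"),
    ("u4", "d4"),
]

# Adjacency in both directions of the bipartite user-document graph.
docs_by_user = defaultdict(set)
users_by_doc = defaultdict(set)
for user, doc in downloads:
    docs_by_user[user].add(doc)
    users_by_doc[doc].add(user)


def recommend(doc, k=3):
    """Rank documents reachable in two hops: doc -> users -> other docs."""
    scores = Counter()
    for user in users_by_doc.get(doc, ()):        # hop 1
        for other in docs_by_user.get(user, ()):  # hop 2
            if other != doc:
                scores[other] += 1
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


print(recommend("d1"))
```

On a graph store both hops are neighbour lookups, whereas the SQL versions in the table would express the same traversal as self-joins on the download log.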

An Emerging Benchmark Test Set
Data generator for full social media activity simulation of any number of users.
Next Bi-Annual Meeting: November 19

Questions?