Peter T. Wood. Third Alberto Mendelzon International Workshop on Foundations of Data Management


School of Computer Science and Information Systems, Birkbeck, University of London
ptw@dcs.bbk.ac.uk
Third Alberto Mendelzon International Workshop on Foundations of Data Management


Motivation
Graphs are widely used for representing data:
- transportation and other networks
- geographical information
- semistructured data
- (hyper)document structure
- semantic associations in criminal investigations
- bibliographic citation analysis
- pathways in biological processes
- knowledge representation (e.g. semantic web)
- program analysis
- workflow systems
- data provenance
- ...

Example
A graph of cities and flight durations:
[Figure: nodes LHR, JFK, MAD, CDG, LIM and SCL; edges labelled with flight durations (1, 2, 4, 7, 8, 12, 14, 14); successive builds highlight the nodes and then the edges]

Types of edges
- Undirected
- Directed

Types of labels
- Node labels (A, B, C, D in the figure)
- Edge labels (1, 2, 3, 5, 6 in the figure)

Cyclic graphs
- Undirected
- Directed
[Figure: an undirected and a directed cyclic graph, each on nodes A, B, C, D]

Acyclic graphs
- Tree
- DAG
- Tree
[Figure: three acyclic graphs, each on nodes A, B, C, D]

Formal graph definition
For our purposes, a database comprises a single labelled (multi-)graph G:
- (finite) set of nodes N with identifiers drawn from an infinite vocabulary V
- (finite) set of (directed) edges E
- incidence function φ : E → N × N (allows multi-edges)
- edge labelling function λ : E → Σ
- Σ is a finite alphabet
So G = (N, E, V, Σ, φ, λ)
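The definition above can be mirrored directly as a small data structure. The following is a minimal sketch in Python; the class name `LabelledGraph`, the edge identifiers and the sample durations are hypothetical and not taken from the talk. The vocabulary V is left implicit (any string may serve as a node identifier).

```python
from dataclasses import dataclass, field


@dataclass
class LabelledGraph:
    """G = (N, E, V, Sigma, phi, lam), with V left implicit."""
    nodes: set = field(default_factory=set)   # N
    phi: dict = field(default_factory=dict)   # edge id -> (source, target)
    lam: dict = field(default_factory=dict)   # edge id -> label in Sigma

    def add_edge(self, edge_id, source, target, label):
        # Edges carry their own identity, so parallel edges are allowed.
        if edge_id in self.phi:
            raise ValueError(f"duplicate edge id: {edge_id}")
        self.nodes.update((source, target))
        self.phi[edge_id] = (source, target)
        self.lam[edge_id] = label

    @property
    def edges(self):
        """E, the set of edge identifiers."""
        return set(self.phi)

    @property
    def alphabet(self):
        """The part of Sigma that is actually used as a label."""
        return set(self.lam.values())


g = LabelledGraph()
g.add_edge("e1", "LHR", "JFK", "7")  # hypothetical durations
g.add_edge("e2", "LHR", "JFK", "8")  # a second, parallel edge
g.add_edge("e3", "LHR", "CDG", "1")
```

Keeping φ as a map from edge identifiers to node pairs, rather than storing a set of node pairs, is what lets two distinct edges connect the same pair of nodes.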


Many years ago...
- PhD on Queries on Graphs (1988), supervised by Alberto Mendelzon
More recently
- querying RDF (allowing for query relaxation and approximation)
- approximate answers to semantic web queries
- investigating operators for finding/manipulating paths
- ... with Pablo Barcelo and Carlos Hurtado (Chile) and Alex Poulovassilis (Birkbeck)


Transportation and other networks
- airline, train, bus... networks
- communication networks
- planning networks: single source and sink, acyclic
Typical queries:
- reachability: can I get from a to b?
- shortest path: find the quickest/shortest route from a to b
- reliability/capacity of paths
- critical path

Knowledge representation
- semantic networks
- conceptual graphs
- RDF/S, OWL
- ontologies
- taxonomies
- ...
Typical queries:
- instance and subclass relationships
- finding connections between entities
- ...

Program/workflow analysis
- nodes are program points or agents/products
- edges are program or workflow steps
- often single source and sink nodes
- also data provenance applications
Typical queries:
- reachability of code
- variables used before defined
- deadlock/livelock
- what agents/processes/products were involved in producing something

Biological applications
- metabolic pathways
- gene regulatory networks
- protein interaction networks
- ...
Typical queries include:
- path existence
- subgraph isomorphism
- k-shortest paths
- neighbourhood queries
- approximate matching
- ...
(see https://hpcrd.lbl.gov/staff/olken/graphdm/graphdm.htm)


Languages from the 1980s
- [Cruz, Mendelzon and Wood, 1987], [Cruz, Mendelzon and Wood, 1988], [Consens and Mendelzon, 1989]
- developed at University of Toronto
- data model is a labelled, directed graph
- in G and G+, a query is a set of pairs of pattern graphs and summary graphs
  - pattern graph nodes are labelled with variables or constants
  - pattern graph edges are labelled with regular expressions over edge labels and variables
- Graphlog adds edge inversion, negation, distinguished edge and different semantics

G, G+ example
- given a graph with
  - nodes representing people
  - edges labelled with m (for motherof) and f (for fatherof)
- the following query finds parents, followed by pairs of people who have a common ancestor
[Figure: pattern and summary graphs over node variables x, y and z, with edges labelled m, f, p and a]


Languages from the 1990s
- [Abiteboul et al., 1997]
- developed at Stanford
- Lore: Lightweight Object Repository
- Lorel: Lore query language
- for semistructured data
  - no predefined schema
  - may be heterogeneous
- uses Object Exchange Model (OEM)
- can be viewed as an extension of the ODMG model/OQL

Lore model
- data model is a graph with two types of nodes
  - complex objects
  - atomic objects (values) with no outgoing edges
- each node has a unique oid
- each edge is labelled with a string
- graph has a number of named nodes (entry points)
- every node must be reachable from a named node

Lorel example
- graph representing a restaurant guide
- find addresses of restaurants with a given zipcode:
    select Guide.restaurant.address
    where Guide.restaurant.address.zipcode = 92310
- Guide is a named node
- restaurant, address and zipcode are edge labels


Languages from the 2000s
- [Weikum et al., 2009]
- developed at Max Planck Institute for Informatics
- YAGO: Yet Another Great Ontology
- NAGA: Not Another Google Answer
- semantic search engine for web-derived knowledge
- combines DB and IR
- 26 relationships between entities derived using information extraction, e.g., isa, borninyear, haswonprize, locatedin, ...

NAGA model
- data model is a directed, weighted multigraph
  - nodes represent entities
  - edges represent relationships
  - weights represent confidence of extracted facts
- query is a connected, directed graph
  - each edge is labelled with a regular expression over edge labels, or a variable, or the connect keyword
- answers are ranked by
  - informativeness
  - confidence
  - compactness

NAGA examples
- graph representing information on people and films
- in which films did a governor act?
    X isa governor
    X actedin Y
    Y isa film
  - X and Y are node variables
  - isa and actedin are relationships (edge labels)
- what do Albert Einstein and Niels Bohr have in common?
    Albert_Einstein connect Niels_Bohr
  - Albert_Einstein and Niels_Bohr are node labels
  - query asks for ranked paths connecting the nodes


Other graph data models and query languages
- Functional Data Model
- Logical Data Model
- O2
- GOOD, GDM
- Strudel and StruQL
- G-BASE, Gram, GraphDB, GRAS
- hypergraphs, hypernode model, hygraphs
- RDF/S and SPARQL
See Survey of Graph Database Models [Angles and Gutierrez, 2008]


Query functionality
- graph pattern matching
- path finding
- edge label variables
- negation
- path variables
- aggregation
- approximate matching and (disjunction)


Example graph
A graph of authors, prizes they have won, and countries where they were born:
[Figure: prize nodes Nobel and Booker; author nodes Neruda, Gordimer, Coetzee and Carey; country nodes Chile, SouthAfrica and Australia; edges labelled haswon and bornin]

Example query
Which authors born in South Africa have won both the Nobel Prize in Literature and the Man Booker prize?
[Figure: query graph with edges (X, haswon, Booker), (X, haswon, Nobel) and (X, bornin, SouthAfrica)]
X is a variable

Matching subgraphs
Two matching subgraphs
[Figure: the example graph of prizes, authors and countries, with the two subgraphs that match the query highlighted in successive builds]

Query answers
Depending on the query language and whether the database is a set of graphs or a single graph, answers might be
- the set of graphs in which a match is found (e.g. biological applications)
- the set of matching subgraphs (NAGA)
- the set of variable bindings for each variable (most others)

Forms of query expression
- similar to SQL/OQL (Lorel, RQL):
    select X
    from X.hasWon Y, X.hasWon Z, X.bornIn W
    where Y = Nobel and Z = Booker and W = SouthAfrica
  W, X, Y and Z are variables
- conjunctive query (similar to NAGA and others):
    (X) ← (X, haswon, Nobel), (X, haswon, Booker), (X, bornin, SouthAfrica)
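To make the conjunctive form concrete, the sketch below evaluates it with a naive nested-loop join over a set of triples, written in Python. The edge set is an assumption reconstructed for illustration (the slide's figure is not recoverable from the text), and the `?X` convention for variables and the helper names `match` and `evaluate` are hypothetical.

```python
# Assumed edge set for the example graph of authors, prizes and countries.
TRIPLES = {
    ("Neruda", "haswon", "Nobel"), ("Neruda", "bornin", "Chile"),
    ("Gordimer", "haswon", "Nobel"), ("Gordimer", "haswon", "Booker"),
    ("Gordimer", "bornin", "SouthAfrica"),
    ("Coetzee", "haswon", "Nobel"), ("Coetzee", "haswon", "Booker"),
    ("Coetzee", "bornin", "SouthAfrica"),
    ("Carey", "haswon", "Booker"), ("Carey", "bornin", "Australia"),
}


def is_var(term):
    # Convention for this sketch only: variables start with '?'.
    return term.startswith("?")


def match(atom, triple, binding):
    """Extend binding so that atom matches triple, or return None."""
    new = dict(binding)
    for term, value in zip(atom, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None      # constant does not match
    return new


def evaluate(atoms, triples):
    """Return every variable binding that satisfies all atoms."""
    bindings = [{}]
    for atom in atoms:
        extended = []
        for binding in bindings:
            for triple in triples:
                new = match(atom, triple, binding)
                if new is not None:
                    extended.append(new)
        bindings = extended
    return bindings


query = [("?X", "haswon", "Nobel"), ("?X", "haswon", "Booker"),
         ("?X", "bornin", "SouthAfrica")]
answers = sorted(b["?X"] for b in evaluate(query, TRIPLES))
```

Under the assumed edge set, `answers` contains the two authors whose subgraphs match the query.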

Query evaluation problem
Given a query expression Q and a graph (database) G, is Q(G) non-empty?
- Combined complexity: both Q and G are part of the input
- Query complexity: input is Q while G is fixed
- Data complexity: input is G while Q is fixed
Data complexity is often the measure considered, since graphs are assumed to be large while query expressions are assumed to be short

Complexity of query evaluation
For graph pattern matching, the complexity is the same as for
- relational conjunctive queries
- subgraph isomorphism
namely
- NP-complete in terms of query and combined complexity
- PTIME in terms of data complexity
Query and combined complexity are in PTIME if the variables in the query satisfy an acyclicity condition [Yannakakis, 1981]
But evaluation can still be exponential if all variable bindings are output


More flexible matching
[Figure: query graph with edges (X, haswon, Booker), (X, haswon, Nobel) and (X, r, SouthAfrica), where r = citizenof | ((bornin | livesin) · locatedin*)]
South African if a citizen, or born or living in a place located there

Regular expressions
Regular expression over alphabet Σ of edge labels:
- ε (empty string) is a regular expression
- any label in Σ is a regular expression
- if r1 and r2 are regular expressions, then so are
  - (r1 | r2) (alternation)
  - (r1 · r2) (concatenation)
- if r is a regular expression, then so is r* (closure)
- may also use a⁻ to mean traversal of an edge labelled a in the reverse direction
- r+ is shorthand for (r · r*)
- r? is shorthand for (r | ε)
- Σ is shorthand for (a1 | ... | an) if Σ = {a1, ..., an}

Regular languages
The language L(r) (set of sequences of labels) denoted by r is given by:
- ε denotes {ε}
- a ∈ Σ denotes {a}
- (r1 | r2) denotes L(r1) ∪ L(r2)
- (r1 · r2) denotes L(r1) · L(r2), the concatenation of the two languages
- r* denotes L(r)*

Paths satisfying regular expressions
Given a graph G = (N, E, V, Σ, φ, λ)
- a path p is a sequence of edges (e1, e2, ..., en) such that, for each 1 ≤ i < n, if φ(ei) = (x, y), then φ(ei+1) = (y, z) for some x, y, z ∈ N
- the path label of p is given by λ(e1) · λ(e2) ··· λ(en) and is denoted λ(p)
- path p satisfies regular expression r if λ(p) ∈ L(r)
Regular path query: given r and G, find all pairs of nodes (x, y) in G such that there is a path from x to y which satisfies r

Examples of regular path queries
- X citizenof | ((bornin | livesin) · locatedin*) Y
- X restaurant · (address)? · zipcode Y
- X instanceof · (subclass)* Y
- X (selling bidder)+

Complexity of regular path query evaluation
REGULAR PATH PROBLEM
Given graph G, pair of nodes x and y and regular expression r, is there a path from x to y satisfying r?
Algorithm:
- construct a nondeterministic finite automaton (NFA) M accepting L(r)
- assume M has initial state s0 and final state sf
- consider G as an NFA with initial state x and final state y
- form the intersection I of M and G
- check if there is a path from (s0, x) to (sf, y)
Each step can be done in PTIME, so REGULAR PATH PROBLEM has PTIME combined complexity
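The intersection step can be sketched as a breadth-first search over pairs of graph node and NFA state, building the product lazily instead of materialising it. This is a minimal Python illustration, not the talk's implementation: the NFA for citizenof | ((bornin | livesin) · locatedin*) is written by hand, the graph edges are hypothetical, and the empty string is used here to mark ε-transitions.

```python
from collections import deque

# Hand-written NFA: (state, label) -> set of next states; "" marks epsilon.
NFA = {
    ("s0", "citizenof"): {"sf"},
    ("s0", "bornin"): {"s1"},
    ("s0", "livesin"): {"s1"},
    ("s1", "locatedin"): {"s1"},
    ("s1", ""): {"sf"},
}
START, FINAL = "s0", "sf"

# Hypothetical graph edges as (source, label, target).
EDGES = [
    ("a", "citizenof", "SA"),
    ("b", "bornin", "CT"),
    ("CT", "locatedin", "SA"),
    ("c", "livesin", "CT"),
    ("c", "bornin", "UK"),
]


def regular_path_exists(edges, nfa, start, final, x, y):
    """Search the product of graph and NFA from (x, start) to (y, final)."""
    out = {}
    for src, label, dst in edges:
        out.setdefault(src, []).append((label, dst))
    seen = {(x, start)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        if node == y and state == final:
            return True
        # Epsilon moves change the NFA state but stay on the same node.
        successors = [(node, s) for s in nfa.get((state, ""), ())]
        # Labelled moves advance the graph and the NFA together.
        for label, dst in out.get(node, ()):
            successors += [(dst, s) for s in nfa.get((state, label), ())]
        for pair in successors:
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return False
```

Each pair is visited at most once, so the search is bounded by the number of graph nodes times the number of NFA states, which is consistent with the PTIME bound stated above.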

Regular path query evaluation
NFA M for r = citizenof | ((bornin | livesin) · locatedin*)
[Figure: NFA with start state s0, state s1 and final state sf; transitions labelled citizenof, bornin, livesin, locatedin and ε]

Regular path query evaluation
Graph G:
[Figure: graph with nodes a, b, c, SA, CT and UK; edges labelled citizenof, locatedin, bornin and livesin]

Regular path query evaluation
Intersection of G and M
[Figure: product graph over the pairs (a, s0), (b, s0), (c, s0), (SA, s1), (CT, s1), (UK, s1), (SA, sf), (CT, sf) and (UK, sf); edges labelled citizenof, bornin, livesin, locatedin and ε; successive builds highlight the paths found]

Regular path query evaluation
Alternatively, can translate citizenof | ((bornin | livesin) · locatedin*) to Datalog (as done by Graphlog, e.g.)
    assoc(X, Y) ← bornin(X, Y)
    assoc(X, Y) ← livesin(X, Y)
    partof(X, Y) ← locatedin(X, Y)
    partof(X, Y) ← locatedin(X, Z), partof(Z, Y)
    answer(X, Y) ← citizenof(X, Y)
    answer(X, Y) ← assoc(X, Y)
    answer(X, Y) ← assoc(X, Z), partof(Z, Y)
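The Datalog program above can be evaluated bottom-up: the recursive partof rules are iterated to a fixpoint and the remaining rules are plain unions and joins. The following Python sketch shows one naive way to do this; the function names and the sample facts are hypothetical, and a real Datalog engine would use semi-naive evaluation rather than recomputing every join.

```python
def transitive_rule(base):
    """partof(X,Y) <- locatedin(X,Y); partof(X,Y) <- locatedin(X,Z), partof(Z,Y)."""
    partof = set(base)
    while True:
        derived = {(x, y) for (x, z) in base for (z2, y) in partof if z == z2}
        if derived <= partof:  # fixpoint reached: nothing new was derived
            return partof
        partof |= derived


def answer(citizenof, bornin, livesin, locatedin):
    """Evaluate the three answer rules over sets of (X, Y) facts."""
    assoc = set(bornin) | set(livesin)
    partof = transitive_rule(locatedin)
    joined = {(x, y) for (x, z) in assoc for (z2, y) in partof if z == z2}
    return set(citizenof) | assoc | joined


# Hypothetical facts, chosen only to exercise every rule.
result = answer(
    citizenof={("a", "SA")},
    bornin={("b", "CT"), ("c", "UK")},
    livesin={("c", "CT")},
    locatedin={("CT", "SA")},
)
```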

Regular simple path queries
- path p is simple if no node is repeated on p
REGULAR SIMPLE PATH PROBLEM
Given graph G, pair of nodes x and y and regular expression r, is there a simple path from x to y satisfying r?
- REGULAR SIMPLE PATH PROBLEM is NP-complete, even for fixed expressions [Mendelzon and Wood, 1989]
- there can be a path from x to y satisfying r but no simple path satisfying r, e.g., r = (c d) d
[Figure: small graph with nodes a and b and an edge labelled c]


Edge label variables
- what relationship(s) exist between Coetzee and SouthAfrica?
    X ← (Coetzee, X, SouthAfrica)
  - a schema-level query
  - answers might be: {bornin, livesin, citizenof}
- find people X and things Y such that X is related to Y in the same way as Coetzee is related to Y
    (X, Y) ← (Coetzee, Z, Y), (Y, Z⁻, X)
  - the superscript ⁻ indicates traversal in the reverse direction

Edge label variables
- program analysis example: database transaction graph
  - nodes represent points in a program
  - special nodes start and end
  - edges represent operations, e.g., lock(b) and unlock(b) of some data item b
- is it the case that a transaction tries to lock the same item more than once (not two-phase)?
    (start, (Σ* · lock(X) · Σ* · lock(X) · Σ*), end)
  - Σ* matches any sequence of edge labels
- sometimes called parameterised regular expressions

Negation
- program analysis: def and use of program variables
- to find program points that immediately follow a use of an uninitialized variable
    Y ← (start, (¬def(X))* · use(X), Y)
- to find only the first use of each uninitialized variable along each path
    Y, Z ← (start, ((¬(def(X) | use(X)))*), Y), (Y, use(X), Z)

Path variables
- may want to know path(s) connecting two nodes:
  - linked data on the web (DBPedia, Freebase)
  - link analysis in criminal networks
  - data provenance
- given regular expression r and variable X, use (r)%X to bind a path matching r to X
- edge label variable X is a special case where the first occurrence is equivalent to (Σ)%X
- paths connecting Coetzee and Gordimer are given by
    X ← (Coetzee, ((Σ | Σ⁻)*)%X, Gordimer)
  - answers: bornin · bornin⁻ and haswon · haswon⁻
- Lorel uses @, not %; NAGA uses the connect keyword

Path variables
- find entities X and Y such that Coetzee is connected to Y in the same way as X is connected to Y
    (X, Y) ← (Coetzee, (Σ*)%Z, Y), (X, Z, Y)
- similar to regular expressions with backreferencing, e.g., in egrep (Unix) and in Perl
- membership problem is NP-complete [Aho, 1980]; data complexity is PTIME
- in general, can denote non-context-free languages, e.g., {ww | w ∈ Σ*} as above
- can also do local binding: ((Σ%X) X)
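The connection to backreferencing can be tried out directly with a regular expression engine that supports it. The sketch below uses Python's `re` module as a stand-in for egrep or Perl, and plain character strings as a stand-in for sequences of edge labels; the helper name `is_square` is hypothetical.

```python
import re

# (.*)\1 matches strings of the form ww: the group captures w and the
# backreference \1 requires the same text to appear again.
SQUARE = re.compile(r"(.*)\1")


def is_square(word):
    """Return True if word can be split into two identical halves."""
    return SQUARE.fullmatch(word) is not None
```

For instance, `is_square("abab")` holds while `is_square("aba")` does not. The language recognised here is {ww | w ∈ Σ*}, which is not context-free, so the pattern goes beyond what a regular expression without backreferences can express.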


Motivation
To be able to answer traditional graph queries like
- degree of a node
- distance between pairs of nodes
- eccentricity of a node
- diameter, radius and centre of a graph
and applications like
- shortest path
- most reliable path
- critical path
- bill of materials
- ...
need operators such as count, min, max, sum

Aggregation in Graphlog
- aggregate terms are allowed in the label of the distinguished edge or a distinguished node
- the following query computes, for each directory D, the total file space used by all contained files and sub-directories, other than those residing on disk 1
[Figure: Graphlog query with nodes D, F, S and disk 1; edges labelled contains+, size and resideson; distinguished edge diskutil labelled with sum(S)]

From Graphlog to Datalog
[Figure: the Graphlog query with nodes D, F, S and disk 1; edges labelled contains+, size and resideson; distinguished edge diskutil labelled with sum(S)]
    containsplus(X, Y) ← contains(X, Y)
    containsplus(X, Y) ← contains(X, Z), containsplus(Z, Y)
    diskutil(D, sum(S)) ← containsplus(D, F), size(F, S), ¬resideson(F, disk1)
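To see what the aggregate rule computes, the following Python sketch evaluates it over a toy file system. Everything concrete here is an assumption made for illustration: the facts, the dictionary encoding of size and resideson, and the reading of the last body atom as a negated condition ("other than those residing on disk 1").

```python
def contains_plus(contains):
    """Transitive closure of contains, computed as a naive fixpoint."""
    closure = set(contains)
    while True:
        derived = {(x, y) for (x, z) in contains
                   for (z2, y) in closure if z == z2}
        if derived <= closure:
            return closure
        closure |= derived


def diskutil(contains, size, resideson, excluded_disk):
    """For each directory, sum the sizes of everything below it that
    does not reside on excluded_disk."""
    totals = {}
    for d, f in contains_plus(contains):
        if f in size and resideson.get(f) != excluded_disk:
            totals[d] = totals.get(d, 0) + size[f]
    return totals


# Hypothetical facts.
contains = {("root", "docs"), ("docs", "a.txt"), ("root", "b.bin")}
size = {"docs": 1, "a.txt": 10, "b.bin": 5}
resideson = {"docs": "disk2", "a.txt": "disk2", "b.bin": "disk1"}
usage = diskutil(contains, size, resideson, "disk1")
```

Because the closure is a set of pairs, each contained item contributes to a directory's total once, however many paths lead to it.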

Graphlog example
- we might also want to summarise values along a path and then aggregate
- the following query computes the length of the shortest path between each pair of nodes
[Figure: Graphlog query from X to Y along an edge labelled with the closure of distance, collecting variable D, and a distinguished edge shortestpath(min(sum(D)))]
- D is called a collecting variable
- sum is used to summarise distances along a path
- min is used to aggregate the summarised distances
- query evaluation is in PTIME if the summarisation and aggregation operators form a closed semiring [Consens and Mendelzon, 1990]
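With sum as the summarisation operator and min as the aggregation operator, the query corresponds to the familiar min-plus computation of all-pairs shortest paths. The sketch below uses the Floyd-Warshall scheme in Python as one possible evaluation strategy; it is not claimed to be how Graphlog evaluates the query, and the node names and distances are hypothetical. Distances are assumed to be non-negative.

```python
import math


def shortest_paths(nodes, distance):
    """All-pairs shortest path lengths over paths of one or more edges."""
    best = {(x, y): math.inf for x in nodes for y in nodes}
    for (x, y), d in distance.items():
        best[x, y] = min(best[x, y], d)          # min aggregates alternatives
    for k in nodes:
        for x in nodes:
            for y in nodes:
                via_k = best[x, k] + best[k, y]  # sum summarises along a path
                if via_k < best[x, y]:
                    best[x, y] = via_k
    return best


# Hypothetical flight durations.
nodes = ["LHR", "MAD", "LIM", "SCL"]
distance = {("LHR", "MAD"): 2, ("MAD", "LIM"): 12,
            ("LIM", "SCL"): 4, ("LHR", "LIM"): 15}
best = shortest_paths(nodes, distance)
```

Swapping (min, +) for another pair of operators with the same algebraic properties, such as (max, ×) over probabilities for the most reliable path, leaves the loop structure unchanged.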


Motivation
- users may not be familiar with graph structure/constraints
- may formulate queries which return no answers or too few answers, e.g.
  - expression course · student when the correct path is student · course
  - expression restaurant · zipcode when address is required between them
- can perform approximate matching of paths
- rank results in terms of closeness to the original query

Approximate matching
- can modify the user's original query (regular expression r)
- one way is to apply edit operations to L(r)
  - insertions
  - deletions
  - substitutions
  - transpositions
  - inversions
- each operation may have a different cost
- somewhat related to user preferences
  - prepared to substitute train by bus, but at cost 2
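A small Python sketch of the cost model: it computes the cheapest sequence of insertions, deletions and substitutions that turns one sequence of edge labels into another, using the standard dynamic programme for weighted edit distance. Transpositions and inversions are deliberately left out to keep the sketch short, and the function name, parameter names and default costs are hypothetical.

```python
def approx_cost(query, path, insert_cost=1, delete_cost=1,
                subst_cost=None, default_subst=1):
    """Cheapest way to edit the label sequence `query` into `path`."""
    subst_cost = subst_cost or {}
    rows, cols = len(query) + 1, len(path) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete_cost   # delete every remaining query label
    for j in range(1, cols):
        cost[0][j] = j * insert_cost   # insert every remaining path label
    for i in range(1, rows):
        for j in range(1, cols):
            q, p = query[i - 1], path[j - 1]
            change = 0 if q == p else subst_cost.get((q, p), default_subst)
            cost[i][j] = min(cost[i - 1][j] + delete_cost,
                             cost[i][j - 1] + insert_cost,
                             cost[i - 1][j - 1] + change)
    return cost[rows - 1][cols - 1]
```

For example, `approx_cost(["restaurant", "zipcode"], ["restaurant", "address", "zipcode"])` is 1, the cost of inserting address. Ranking answers by this cost puts exact matches (cost 0) first.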

Approximate matching algorithm
- for conjunctive regular path queries
- can use algorithms from approximate string matching
- incrementally build an approximate NFA
- perform incremental joins for conjuncts
- PTIME combined complexity if conjuncts are acyclic and there is a fixed number of head variables
- in general, can transform the NFA using a regular transducer
- see [Hurtado, Poulovassilis and Wood, 2009]

Conclusions
- motivated that graph-based data is widely used and available
- a brief high-level overview of some query languages for graph databases
- focussed on query language functionality
- some discussion of query evaluation algorithms
- some complexity results mentioned

Issues not covered
Many issues not covered:
- other query evaluation strategies, e.g., using indexes
- graphs with schemas
- query optimisation, e.g., containment
- ...

References
S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The LOREL query language for semistructured data. Int. J. on Digital Libraries, 1(1):68-88, April 1997.
A. V. Aho. Pattern matching in strings. In R. V. Book, editor, Formal Language Theory: Perspectives and Open Problems, pages 325-347. Academic Press, 1980.
R. Angles and C. Gutierrez. Survey of graph database models. ACM Comput. Surv., 40(1):1-39, 2008.
M. P. Consens and A. O. Mendelzon. Expressing structural hypertext queries in GraphLog. In Proc. Second ACM Conf. on Hypertext, pages 269-292, 1989.
M. P. Consens and A. O. Mendelzon. Low complexity aggregation in GraphLog and Datalog. In Proc. 3rd Int. Conf. on Database Theory, pages 379-394, 1990.
I. F. Cruz, A. O. Mendelzon, and P. T. Wood. A graphical query language supporting recursion. In ACM SIGMOD Int. Conf. on Management of Data, pages 323-330, 1987.
I. F. Cruz, A. O. Mendelzon, and P. T. Wood. G+: Recursive queries without recursion. In Proc. 2nd Int. Conf. on Expert Database Systems, pages 355-368, 1988.
C. A. Hurtado, A. Poulovassilis, and P. T. Wood. Ranking Approximate Answers to Semantic Web Queries. In Proc. 6th European Semantic Web Conference, pages 263-277, 2009.
A. O. Mendelzon and P. T. Wood. Finding regular simple paths in graph databases. In Proc. 15th Int. Conf. on Very Large Data Bases, pages 185-193, 1989.
G. Weikum, G. Kasneci, M. Ramanath, and F. Suchanek. Database and information-retrieval methods for knowledge discovery. Commun. ACM, 52(4):56-64, 2009.
M. Yannakakis. Algorithms for acyclic database schemes. In Proc. 7th Int. Conf. on Very Large Data Bases, pages 82-94, 1981.