Datalog Evaluation
Serge Abiteboul, INRIA
5 May 2009



Datalog

- Read Chapter 13.
- Lots of research in the late 1980s:
  - top-down or bottom-up evaluation
  - direct evaluation vs. compilation into a more efficient program
- No product.
- Some influence on logic programming.

Covered here:
1. Semi-naive bottom-up evaluation
2. Top-down: QSQ
3. Bottom-up: Magic

Reverse-Same-Generation

$rsg(x,y) \leftarrow flat(x,y)$
$rsg(x,y) \leftarrow up(x,x1), rsg(y1,x1), down(y1,y)$

Instance:

up      flat    down
a e     g f     l f
a f     m n     m f
f m     m o     g b
g n     p m     h c
h n             i d
i o             p k
j o

[Figure: the instance drawn as a graph over the nodes a to p, with edges labeled u (up), d (down), and f (flat).]

Naive algorithm

level 0: $\emptyset$
level 1: {(g,f), (m,n), (m,o), (p,m)}
level 2: {level 1} $\cup$ {(a,b), (h,f), (i,f), (j,f), (f,k)}
level 3: {level 2} $\cup$ {(a,c), (a,d)}
level 4: {level 3}

$rsg := \emptyset$
repeat
  $rsg := rsg \cup flat \cup \pi_{1,6}(\sigma_{2=4}(\sigma_{3=5}(up \times rsg \times down)))$
until fixpoint

$rsg^{i+1} = rsg^{i} \cup flat \cup \pi_{1,6}(\sigma_{2=4}(\sigma_{3=5}(up \times rsg^{i} \times down)))$

- Redundant computation: each layer recomputes all elements of the previous layer.
- Monotonicity.
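As a rough illustration (not part of the slides), the naive loop can be run in Python on the up/flat/down instance of the Reverse-Same-Generation slide. The function names `rsg_step` and `naive_rsg` are assumptions made for this sketch, and nested loops stand in for the relational algebra expression.

```python
# Instance from the Reverse-Same-Generation slide.
UP = {("a", "e"), ("a", "f"), ("f", "m"), ("g", "n"),
      ("h", "n"), ("i", "o"), ("j", "o")}
FLAT = {("g", "f"), ("m", "n"), ("m", "o"), ("p", "m")}
DOWN = {("l", "f"), ("m", "f"), ("g", "b"), ("h", "c"),
        ("i", "d"), ("p", "k")}


def rsg_step(rsg):
    """Apply both rsg rules once to the current value of rsg."""
    derived = set(FLAT)                      # rsg(x,y) <- flat(x,y)
    for (x, x1) in UP:                       # rsg(x,y) <- up, rsg, down
        for (y1, x1b) in rsg:
            if x1b != x1:
                continue
            for (y1b, y) in DOWN:
                if y1b == y1:
                    derived.add((x, y))
    return derived


def naive_rsg():
    """Iterate to the fixpoint; levels[i] is the value of rsg at level i."""
    rsg, levels = set(), [set()]
    while True:
        nxt = rsg | rsg_step(rsg)
        if nxt == rsg:                       # fixpoint reached
            return rsg, levels
        rsg = nxt
        levels.append(set(rsg))


if __name__ == "__main__":
    result, levels = naive_rsg()
    for i, level in enumerate(levels):
        print(i, sorted(level))
```

Each call to `rsg_step` rederives every fact of the previous level, which is the redundancy the slide points out.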

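Semi-naive

Focus on the new facts generated at each level.

RSG':
$\Delta^{1}_{rsg}(x,y) \leftarrow flat(x,y)$
$\Delta^{i+1}_{rsg}(x,y) \leftarrow up(x,x1), \Delta^{i}_{rsg}(y1,x1), down(y1,y)$

- Not recursive.
- Not a datalog program (it has one rule for every $i$).
- For each input $I$, $RSG'(I)(rsg) = \bigcup_{1 \le i} \delta^{i}_{rsg}$, where $\delta^{i}_{rsg}$ is the value of $\Delta^{i}_{rsg}$ on $I$.
- Less redundancy: $rsg^{i+1} - rsg^{i} \subseteq \delta^{i+1}_{rsg} \subseteq rsg^{i+1}$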
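A minimal Python sketch of RSG' on the same instance follows; it is an illustration, not the slides' own code. The names `recursive_rule` and `seminaive_rsg` are assumptions, and so is the stopping test (stop once a round contributes no fact that is not already known), which is safe here because the recursive rule is linear.

```python
# Instance from the Reverse-Same-Generation slide.
UP = {("a", "e"), ("a", "f"), ("f", "m"), ("g", "n"),
      ("h", "n"), ("i", "o"), ("j", "o")}
FLAT = {("g", "f"), ("m", "n"), ("m", "o"), ("p", "m")}
DOWN = {("l", "f"), ("m", "f"), ("g", "b"), ("h", "c"),
        ("i", "d"), ("p", "k")}


def recursive_rule(delta):
    """Delta^{i+1}(x,y) <- up(x,x1), Delta^i(y1,x1), down(y1,y)."""
    return {(x, y)
            for (x, x1) in UP
            for (y1, x1b) in delta if x1b == x1
            for (y1b, y) in DOWN if y1b == y1}


def seminaive_rsg():
    """Return rsg and the list [delta^1, delta^2, ...]."""
    rsg, deltas = set(), []
    delta = set(FLAT)                 # Delta^1(x,y) <- flat(x,y)
    while not delta <= rsg:           # assumed stopping test: nothing new
        deltas.append(delta)
        rsg |= delta
        delta = recursive_rule(delta)  # only the latest facts are joined
    return rsg, deltas


if __name__ == "__main__":
    result, deltas = seminaive_rsg()
    for i, d in enumerate(deltas, start=1):
        print("delta", i, sorted(d))
```

On this instance the second delta still contains (g,f), a fact already derived at level 1, so some redundancy remains.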

An improvement

- In general $\delta^{i+1}_{rsg} \neq rsg^{i+1} - rsg^{i}$; e.g., $(g,f) \in \delta^{2}_{rsg}$, but it is not in $rsg^{2} - rsg^{1}$.
- Use $rsg^{i} - rsg^{i-1}$ in place of $\Delta^{i}_{rsg}$ in the second rule of RSG'.

Non-linear rules, e.g., ancestor:
$anc(x,y) \leftarrow par(x,y)$
$anc(x,y) \leftarrow anc(x,z), anc(z,y)$

Semi-naive evaluation:
$\Delta^{1}_{anc}(x,y) \leftarrow par(x,y)$
$\Delta^{i+1}_{anc}(x,y) \leftarrow \Delta^{i}_{anc}(x,z), anc^{i}(z,y)$
$\Delta^{i+1}_{anc}(x,y) \leftarrow anc^{i}(x,z), \Delta^{i}_{anc}(z,y)$

Still some redundancy; use instead:
$temp^{i+1}(x,y) \leftarrow \Delta^{i}_{anc}(x,z), anc^{i-1}(z,y)$
$temp^{i+1}(x,y) \leftarrow anc^{i}(x,z), \Delta^{i}_{anc}(z,y)$
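The two temp rules can be sketched in Python as follows. This is an illustrative reading, not the slides' code: the names `join` and `seminaive_anc` are assumptions, and the delta is taken to be the set difference $anc^{i} - anc^{i-1}$, as suggested for rsg.

```python
def join(r, s):
    """Compose two binary relations: {(x, y) | r(x, z) and s(z, y)}."""
    return {(x, y) for (x, z) in r for (z2, y) in s if z == z2}


def seminaive_anc(par):
    """Semi-naive evaluation of the non-linear ancestor program."""
    prev = set()           # anc^{i-1}
    anc = set(par)         # anc^{i}, starting with anc^1 = par
    delta = set(par)       # anc^{i} - anc^{i-1}
    while delta:
        # temp^{i+1}(x,y) <- Delta^i(x,z), anc^{i-1}(z,y)
        # temp^{i+1}(x,y) <- anc^i(x,z),   Delta^i(z,y)
        temp = join(delta, prev) | join(anc, delta)
        prev = anc
        anc = anc | temp
        delta = anc - prev  # keep only the genuinely new facts
    return anc


if __name__ == "__main__":
    chain = {(1, 2), (2, 3), (3, 4), (4, 5)}   # hypothetical par relation
    print(sorted(seminaive_anc(chain)))
```

Using $anc^{i-1}$ in the first rule avoids joining the delta with itself twice; the second rule already covers that case.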

Top-down technique: Query-Sub-Query

Program + query:
$rsg(x,y) \leftarrow flat(x,y)$
$rsg(x,y) \leftarrow up(x,x1), rsg(y1,x1), down(y1,y)$
$query(y) \leftarrow rsg(a,y)$

- Focus on relevant data; avoid deriving unnecessary tuples.
- A fact is relevant if there is a proof tree for the query in which the fact occurs.

We will do that (not perfectly) using four ingredients:
(a) SLD-resolution, but set-at-a-time, using relational algebra
(b) Start with the query constants and push constants from goals to subgoals (in the style of pushing selections into joins)
(c) Use sideways information passing to pass constant binding information from one atom to the next in subgoals
(d) Use an efficient global flow-of-control strategy

Technique

- adornment and adorned rules
- supplementary relations and QSQ templates
- the kernel of the technique
- global control strategies

Adornment

- In $rsg(a,y)$, we have a binding for the first argument of rsg; we denote it $rsg^{bf}$.
- Later on, we need $rsg^{fb}$.
- Evaluation is based on subqueries, e.g., $(rsg^{fb}, \{\langle e \rangle, \langle f \rangle\})$.
- In general, a subquery is a pair $(R^{\gamma}, J)$ where:
  1. $R$ is a predicate
  2. $\gamma$ is an adornment
  3. $J$ provides bindings
  4. completion for $t$ in $J$
  5. answer of the subquery: set of completions

Sideways information passing

From relational optimization: evaluation of the join $P(a,v) \bowtie Q(b,w,x) \bowtie R(v,w,y)$
1. first evaluate $P(a,v)$: obtains some v's
2. then evaluate $R(v,w,y)$ restricted by the v's: obtains some w's
3. finally evaluate $Q(b,w,x)$ restricted by the w's

Adorned rule

$R(x,y,z) \leftarrow R_1(x,u,v), R_2(u,w,w,z), R_3(v,w,y,a)$

A subquery involving $R^{bfb}$, with left-to-right evaluation (reorder if necessary):

$R^{bfb}(x,y,z) \leftarrow R_1^{bff}(x,u,v), R_2^{bffb}(u,w,w,z), R_3^{bbfb}(v,w,y,a)$

Given a head adornment and an ordering of the body, there is an algorithm for adorning the rule.
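One way to sketch such an adorning algorithm in Python is shown below. It is an illustration under stated assumptions rather than the algorithm from the lecture: atoms are (predicate, argument tuple) pairs, constants are listed explicitly, and an argument counts as bound if it is a constant, is bound in the head, or occurs in an earlier body atom. The name `adorn_rule` is hypothetical.

```python
def adorn_rule(head_args, head_adornment, body, constants=frozenset()):
    """Adorn the body atoms of a rule, left to right.

    head_args:      tuple of head arguments, e.g. ("x", "y", "z")
    head_adornment: string over {"b", "f"}, one letter per head argument
    body:           list of (predicate, args) in evaluation order
    constants:      the argument names that are constants (assumed input)
    Returns a list of (predicate, adornment, args).
    """
    bound = {t for t, a in zip(head_args, head_adornment) if a == "b"}
    adorned = []
    for pred, args in body:
        adornment = "".join(
            "b" if (t in constants or t in bound) else "f" for t in args
        )
        adorned.append((pred, adornment, args))
        # Variables of this atom are bound for all later atoms.
        bound.update(t for t in args if t not in constants)
    return adorned


if __name__ == "__main__":
    body = [("R1", ("x", "u", "v")),
            ("R2", ("u", "w", "w", "z")),
            ("R3", ("v", "w", "y", "a"))]
    for atom in adorn_rule(("x", "y", "z"), "bfb", body, constants={"a"}):
        print(atom)
```

Applied to the second rsg rule with head adornment bf and body order up, rsg, down, the sketch adorns the rsg atom with fb; with head adornment fb and body order down, rsg, up, it adorns it with bf.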

Supplementary relations and QSQ templates

- Data structure to store information during evaluation.
- $n + 1$ supplementary relations for a rule with $n$ atoms.

$R^{bfb}(x,y,z) \leftarrow R_1^{bff}(x,u,v), R_2^{bffb}(u,w,w,z), R_3^{bbfb}(v,w,y,a)$

$sup_0[x,z]$
$sup_1[x,z,u,v]$
$sup_2[x,z,v,w]$
$sup_3[x,y,z]$

- Variables serve as attribute names in the supplementary relations.
- QSQ template for the adorned rule: $sup_0, sup_1, sup_2, sup_3$
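The attribute lists above can be reproduced with a short Python sketch. It assumes the usual convention that $sup_j$ keeps the variables bound after the first $j$ atoms that are still needed, either in the head or in a later atom, and that the last supplementary relation carries the head variables. The function name and the input encoding are hypothetical.

```python
def supplementary_attributes(head_args, head_adornment, body,
                             constants=frozenset()):
    """Return the attribute lists of sup_0 .. sup_n for one adorned rule.

    body is a list of (predicate, args) in evaluation order; constants
    names the arguments that are constants (an assumption of this sketch).
    """
    # sup_0: the bound variables of the head.
    bound = [t for t, a in zip(head_args, head_adornment)
             if a == "b" and t not in constants]
    sups = [list(bound)]
    for j, (_, args) in enumerate(body):
        for t in args:                       # the atom binds its variables
            if t not in constants and t not in bound:
                bound.append(t)
        if j == len(body) - 1:
            # Last relation: exactly the head variables, in head order.
            sups.append([t for t in head_args if t not in constants])
        else:
            needed = set(head_args)          # used in the head ...
            for _, later_args in body[j + 1:]:
                needed.update(later_args)    # ... or in a later atom
            sups.append([t for t in bound if t in needed])
    return sups


if __name__ == "__main__":
    body = [("R1", ("x", "u", "v")),
            ("R2", ("u", "w", "w", "z")),
            ("R3", ("v", "w", "y", "a"))]
    for j, attrs in enumerate(
            supplementary_attributes(("x", "y", "z"), "bfb", body, {"a"})):
        print("sup", j, attrs)
```

In the example, `u` is dropped from the third list because no later atom and no head position uses it.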

Kernel of QSQ evaluation

Input: program + query.

Construct adorned rules for the query:
(1) $rsg^{bf}(x,y) \leftarrow flat(x,y)$
(2) $rsg^{fb}(x,y) \leftarrow flat(x,y)$
(3) $rsg^{bf}(x,y) \leftarrow up(x,x1), rsg^{fb}(y1,x1), down(y1,y)$
(4) $rsg^{fb}(x,y) \leftarrow down(y1,y), rsg^{bf}(y1,x1), up(x,x1)$
(Note the ordering to pass bindings in (4).)

Construct a QSQ template for each adorned rule $i$, with the supplementary relation variables $sup^{i}_{j}$.

Construct also, for each idb $R$ and adornment $\gamma$:
1. the variable $ans\_R^{\gamma}$, of arity $arity(R)$
2. the variable $input\_R^{\gamma}$, of arity $bound(R,\gamma)$

Initialization: use the query.

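[Figure: QSQ templates for the four adorned rules, with the contents of the relations at one stage of the evaluation.]

Rule 1: $rsg^{bf}(x,y) \leftarrow flat(x,y)$
  $sup^{1}_{0}[x]$: a
  $sup^{1}_{1}[x,y]$: (empty)

Rule 2: $rsg^{fb}(x,y) \leftarrow flat(x,y)$
  $sup^{2}_{0}[y]$: e, f
  $sup^{2}_{1}[x,y]$: (g,f)

Rule 3: $rsg^{bf}(x,y) \leftarrow up(x,x1), rsg^{fb}(y1,x1), down(y1,y)$
  $sup^{3}_{0}[x]$: a
  $sup^{3}_{1}[x,x1]$: (a,e), (a,f)
  $sup^{3}_{2}[x,y1]$: (a,g)
  $sup^{3}_{3}[x,y]$: (a,b)

Rule 4: $rsg^{fb}(x,y) \leftarrow down(y1,y), rsg^{bf}(y1,x1), up(x,x1)$
  $sup^{4}_{0}[y]$: e, f
  $sup^{4}_{1}[y,y1]$: (f,l), (f,m)
  $sup^{4}_{2}[y,x1]$: (empty)
  $sup^{4}_{3}[x,y]$: (empty)

$input\_rsg^{bf}$: a
$input\_rsg^{fb}$: e, f
$ans\_rsg^{bf}$: (a,b)
$ans\_rsg^{fb}$: (g,f)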

Kinds of steps

A: from $input\_R^{\gamma}$ to $sup^{i}_{0}$
   Move new tuples in $input\_R^{\gamma}$ to the 0th supplementary relations.
B: from $sup^{i}_{j}$ to $sup^{i}_{j+1}$
   Pass new tuples from one supplementary relation to the next.
C: from $ans\_R^{\gamma}$ to $sup^{i}_{j}$
   Use new idb tuples (from $ans\_R^{\gamma}$) to generate new supplementary relation tuples.
D: from $sup^{i}_{n}$ to $ans\_R^{\gamma}$
   Process tuples in the final supplementary relation of a rule.

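Example

- $\langle a \rangle$ into $input\_rsg^{bf}$ (init)
- $\langle a \rangle$ into $sup^{1}_{0}$, $sup^{3}_{0}$ (A)
- $\langle a,e \rangle$, $\langle a,f \rangle$ into $sup^{3}_{1}$ (B)
- $\langle e \rangle$, $\langle f \rangle$ into $input\_rsg^{fb}$ (B)
- $\langle g,f \rangle$ into $ans\_rsg^{fb}$ (B, D and 2nd rule)
- $\langle a,b \rangle$ into $ans\_rsg^{bf}$ (C, B, D and 3rd rule)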

Global control strategies

- Basic one: apply steps (A) through (D) until a fixpoint is reached.
- QSQR (query-subquery-recursive), in step (B):
  - pass from supplementary relation $sup^{i}_{j-1}$ across an idb predicate $R^{\gamma}$ to supplementary relation $sup^{i}_{j}$
  - introduce new tuples into $sup^{i}_{j}$ and into $input\_R^{\gamma}$
  - perform a recursive call to determine the $R^{\gamma}$-values corresponding to the new tuples added to $input\_R^{\gamma}$, before considering the new tuples placed into $sup^{i}_{j}$
- See the algorithm in the book.
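The basic strategy can be sketched in Python for the four adorned rsg rules and the query rsg(a,y). This is a simplified illustration, not the algorithm from the book: the rules are hard-coded, every supplementary relation is recomputed in each round instead of receiving only new tuples, and the function name `qsq_rsg` is an assumption.

```python
# Instance from the Reverse-Same-Generation slide.
UP = {("a", "e"), ("a", "f"), ("f", "m"), ("g", "n"),
      ("h", "n"), ("i", "o"), ("j", "o")}
FLAT = {("g", "f"), ("m", "n"), ("m", "o"), ("p", "m")}
DOWN = {("l", "f"), ("m", "f"), ("g", "b"), ("h", "c"),
        ("i", "d"), ("p", "k")}


def qsq_rsg(a):
    """Answer rsg(a, y) by applying steps A to D until nothing changes."""
    input_bf, input_fb = {a}, set()          # initialization from the query
    ans_bf, ans_fb = set(), set()
    while True:
        before = (len(input_bf), len(input_fb), len(ans_bf), len(ans_fb))

        # Rule 1: rsg^bf(x,y) <- flat(x,y)
        sup1_0 = set(input_bf)                                   # step A
        sup1_1 = {(x, y) for (x, y) in FLAT if x in sup1_0}      # step B
        ans_bf |= sup1_1                                         # step D

        # Rule 2: rsg^fb(x,y) <- flat(x,y)
        sup2_0 = set(input_fb)
        sup2_1 = {(x, y) for (x, y) in FLAT if y in sup2_0}
        ans_fb |= sup2_1

        # Rule 3: rsg^bf(x,y) <- up(x,x1), rsg^fb(y1,x1), down(y1,y)
        sup3_0 = set(input_bf)
        sup3_1 = {(x, x1) for (x, x1) in UP if x in sup3_0}
        input_fb |= {x1 for (_, x1) in sup3_1}   # bindings for rsg^fb
        sup3_2 = {(x, y1) for (x, x1) in sup3_1                  # step C
                  for (y1, x1b) in ans_fb if x1b == x1}
        sup3_3 = {(x, y) for (x, y1) in sup3_2
                  for (y1b, y) in DOWN if y1b == y1}
        ans_bf |= sup3_3

        # Rule 4: rsg^fb(x,y) <- down(y1,y), rsg^bf(y1,x1), up(x,x1)
        sup4_0 = set(input_fb)
        sup4_1 = {(y, y1) for (y1, y) in DOWN if y in sup4_0}
        input_bf |= {y1 for (_, y1) in sup4_1}   # bindings for rsg^bf
        sup4_2 = {(y, x1) for (y, y1) in sup4_1
                  for (y1b, x1) in ans_bf if y1b == y1}
        sup4_3 = {(x, y) for (y, x1) in sup4_2
                  for (x, x1b) in UP if x1b == x1}
        ans_fb |= sup4_3

        after = (len(input_bf), len(input_fb), len(ans_bf), len(ans_fb))
        if after == before:                      # fixpoint
            break
    answers = {y for (x, y) in ans_bf if x == a}
    return answers, ans_bf, ans_fb, input_bf, input_fb


if __name__ == "__main__":
    print(sorted(qsq_rsg("a")[0]))
```

Only facts reachable from the binding `a` are ever derived; QSQR differs in that it would recurse on each new input tuple before continuing with the current rule.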

Thank you