The Relational Model

Similar documents
THE RELATIONAL MODEL. University of Waterloo

The Relational Model

Multisets and Duplicates. SQL: Duplicate Semantics and NULL Values. How does this impact Queries?

Safe Stratified Datalog With Integer Order Does not Have Syntax

Lecture 1: Conjunctive Queries

Towards a Logical Reconstruction of Relational Database Theory

We ve studied the main models and concepts of the theory of computation:

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler

SQL Part 3: Where Subqueries and other Syntactic Sugar Part 4: Unknown Values and NULLs

Schema Mappings and Data Exchange

Introduction to Finite Model Theory. Jan Van den Bussche Universiteit Hasselt

Logic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research - Almaden


CSC Discrete Math I, Spring Sets

A Retrospective on Datalog 1.0

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler

Summary of Course Coverage

Denotational Semantics. Domain Theory

LTCS Report. Concept Descriptions with Set Constraints and Cardinality Constraints. Franz Baader. LTCS-Report 17-02

Finite Model Generation for Isabelle/HOL Using a SAT Solver

CS558 Programming Languages

Programming Languages Third Edition

Semantics via Syntax. f (4) = if define f (x) =2 x + 55.

CSCI.6962/4962 Software Verification Fundamental Proof Methods in Computer Science (Arkoudas and Musser) Chapter p. 1/27

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and

Discrete Mathematics Lecture 4. Harper Langston New York University

1.1 - Introduction to Sets

Lecture 5: Predicate Calculus. ffl Predicate Logic ffl The Language ffl Semantics: Structures

OWL 2 Profiles. An Introduction to Lightweight Ontology Languages. Markus Krötzsch University of Oxford. Reasoning Web 2012

9/19/12. Why Study Discrete Math? What is discrete? Sets (Rosen, Chapter 2) can be described by discrete math TOPICS

Exercises Computational Complexity

CSC 501 Semantics of Programming Languages

Conceptual modeling of entities and relationships using Alloy

The Big Picture. Chapter 3

COMP4418 Knowledge Representation and Reasoning

Unification in Maude. Steven Eker

Computation Club: Gödel s theorem

Uncertain Data Models

CPSC 121 Some Sample Questions for the Final Exam Tuesday, April 15, 2014, 8:30AM

Relational Databases

Database Theory VU , SS Codd s Theorem. Reinhard Pichler

Review of Sets. Review. Philippe B. Laval. Current Semester. Kennesaw State University. Philippe B. Laval (KSU) Sets Current Semester 1 / 16

NP-Completeness. Algorithms

On the Hardness of Counting the Solutions of SPARQL Queries

Note that in this definition, n + m denotes the syntactic expression with three symbols n, +, and m, not to the number that is the sum of n and m.

1. [5 points each] True or False. If the question is currently open, write O or Open.

Conjunctive queries. Many computational problems are much easier for conjunctive queries than for general first-order queries.

Formal Semantics of Programming Languages

University of Nevada, Las Vegas Computer Science 456/656 Fall 2016

Structural characterizations of schema mapping languages

Introductory logic and sets for Computer scientists

Data Integration: Logic Query Languages

Formal Semantics of Programming Languages

A set with only one member is called a SINGLETON. A set with no members is called the EMPTY SET or 2 N

Discrete Mathematics

NP-Hardness. We start by defining types of problem, and then move on to defining the polynomial-time reductions.

8.1 Polynomial-Time Reductions

CS 151 Complexity Theory Spring Final Solutions. L i NL i NC 2i P.

Reductions and Satisfiability

A. Arratia Program Schemes 1. Program Schemes. Reunión MOISES, 3 Septiembre Argimiro Arratia. Universidad de Valladolid

Taibah University College of Computer Science & Engineering Course Title: Discrete Mathematics Code: CS 103. Chapter 2. Sets

NP and computational intractability. Kleinberg and Tardos, chapter 8

Overview. CS389L: Automated Logical Reasoning. Lecture 6: First Order Logic Syntax and Semantics. Constants in First-Order Logic.

Deductive Methods, Bounded Model Checking

4. Logical Values. Our Goal. Boolean Values in Mathematics. The Type bool in C++

Declarative programming. Logic programming is a declarative style of programming.

Posets, graphs and algebras: a case study for the fine-grained complexity of CSP s

arxiv: v1 [cs.db] 23 May 2016

! Greed. O(n log n) interval scheduling. ! Divide-and-conquer. O(n log n) FFT. ! Dynamic programming. O(n 2 ) edit distance.

ALGORITHMIC DECIDABILITY OF COMPUTER PROGRAM-FUNCTIONS LANGUAGE PROPERTIES. Nikolay Kosovskiy

Introduction II. Sets. Terminology III. Definition. Definition. Definition. Example

Constraint Satisfaction Problems

1 Elementary number theory

Automated Reasoning. Natural Deduction in First-Order Logic

Semantic Errors in Database Queries

Foundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution

Foundations of Databases

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS522 Programming Language Semantics

DATABASE THEORY. Lecture 11: Introduction to Datalog. TU Dresden, 12th June Markus Krötzsch Knowledge-Based Systems

Relational Algebra and Calculus

Simply-Typed Lambda Calculus

LOGIC AND DISCRETE MATHEMATICS

Automata Theory for Reasoning about Actions

Th(N, +) is decidable

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS422 Programming Language Semantics

Chapter 8. NP and Computational Intractability. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Chapter 10 Part 1: Reduction

CS 580: Algorithm Design and Analysis

Foundations of Databases

COMPUTER SCIENCE TRIPOS

3/7/2018. CS 580: Algorithm Design and Analysis. 8.1 Polynomial-Time Reductions. Chapter 8. NP and Computational Intractability

NP versus PSPACE. Frank Vega. To cite this version: HAL Id: hal

Lectures 20, 21: Axiomatic Semantics

Lost in translation. Leonardo de Moura Microsoft Research. how easy problems become hard due to bad encodings. Vampire Workshop 2015

Ontology and Database Systems: Foundations of Database Systems

Decision Procedures in First Order Logic

On the Computational Complexity of Minimal-Change Integrity Maintenance in Relational Databases

The Logic of the Semantic Web. Enrico Franconi Free University of Bozen-Bolzano, Italy

λ-calculus Lecture 1 Venanzio Capretta MGS Nottingham

3. Logical Values. Our Goal. Boolean Values in Mathematics. The Type bool in C++

Transcription:

The Relational Model David Toman School of Computer Science University of Waterloo Introduction to Databases CS348 David Toman (University of Waterloo) The Relational Model 1 / 28

The Relational Model Idea All information is organized in (a finite number of) relations. Features: simple and clean data model powerful and declarative query/update languages semantic integrity constraints data independence David Toman (University of Waterloo) The Relational Model 2 / 28

The Relational Model Idea All information is organized in (a finite number of) relations. Features: simple and clean data model powerful and declarative query/update languages semantic integrity constraints data independence David Toman (University of Waterloo) The Relational Model 2 / 28

Relational Structures/Databases Components: Universe Relation Database a set of values D with equality (=) schema: name R, arity k (number of attributes) instance: a relation R D k. schema: finite set of relation schemes instance: a relation R i for each R i Notation Signature: = (R 1 ; : : : ; R k ) Instance: D = (D; =; R 1 ; : : : ; R k ) David Toman (University of Waterloo) The Relational Model 3 / 28

Examples of Relational Structures the integer numbers with addition and multiplication: (Z; =; plus; times) a Bibliography Database: (D; =; author; wrote; publication; : : :) for D the set of strings and ints an (abstraction of) OS Kernel: (D; =; process; file; waits-for; : : :) for D the set of strings, ints, and addresses... David Toman (University of Waterloo) The Relational Model 4 / 28

Examples of Relational Structures the integer numbers with addition and multiplication: (Z; =; plus; times) a Bibliography Database: (D; =; author; wrote; publication; : : :) for D the set of strings and ints an (abstraction of) OS Kernel: (D; =; process; file; waits-for; : : :) for D the set of strings, ints, and addresses... David Toman (University of Waterloo) The Relational Model 4 / 28

Examples of Relational Structures the integer numbers with addition and multiplication: (Z; =; plus; times) a Bibliography Database: (D; =; author; wrote; publication; : : :) for D the set of strings and ints an (abstraction of) OS Kernel: (D; =; process; file; waits-for; : : :) for D the set of strings, ints, and addresses... David Toman (University of Waterloo) The Relational Model 4 / 28

Example: Bibliography Relations (in the cs348 database) used in examples: author(aid, name) wrote(author, publication) publication(pubid, title) book(pubid, publisher, year) journal(pubid, volume, no, year) proceedings(pubid, year) article(pubid, crossref, startpage, endpage) ) names of attributes will be important later (for SQL) David Toman (University of Waterloo) The Relational Model 5 / 28

Example (sample instance) author = f (1; John); (2; Sue) g wrote = f (1; 1); (1; 4); (2; 3) g publication = f (1; Mathematical Logic); (3; Trans. Databases); (2; Principles of DB Syst.); (4; Query Languages) g book = f (1; AMS; 1990) g journal = f (3; 35; 1; 1990) g proceedings = f (2; 1995) g article = f (4; 2; 30; 41) g David Toman (University of Waterloo) The Relational Model 6 / 28

Example (tabular form) author aid name 1 John 2 Sue wrote author publication 1 1 1 4 2 3 publication pubid title 1 Mathematical Logic 3 Trans. Databases 2 Principles of DB Syst. 4 Query Languages ) that s why relations are often called tables. David Toman (University of Waterloo) The Relational Model 7 / 28

Example: OS Kernel schema: process(pid, ppid, uid, stime, time, cmd) file(fd, pid, name, dentry) waits-for(pid, wpid) instance depends on what s going on in the kernel: UID PID PPID STIME TIME CMD root 1 0 Sep16 00:00:00 init [2] root 2 1 Sep16 00:00:00 [ksoftirqd/0] root 3 1 Sep16 00:00:00 [watchdog/0] root 4 1 Sep16 00:00:00 [events/0] root 5 1 Sep16 00:00:00 [khelper] root 6 1 Sep16 00:00:00 [kthread] root 9 6 Sep16 00:00:00 [kblockd/0]... david 28382 5790 10:10 00:00:00 bash david 29903 28382 10:10 00:00:00 ps -ef David Toman (University of Waterloo) The Relational Model 8 / 28

Example: OS Kernel schema: process(pid, ppid, uid, stime, time, cmd) file(fd, pid, name, dentry) waits-for(pid, wpid) instance depends on what s going on in the kernel: UID PID PPID STIME TIME CMD root 1 0 Sep16 00:00:00 init [2] root 2 1 Sep16 00:00:00 [ksoftirqd/0] root 3 1 Sep16 00:00:00 [watchdog/0] root 4 1 Sep16 00:00:00 [events/0] root 5 1 Sep16 00:00:00 [khelper] root 6 1 Sep16 00:00:00 [kthread] root 9 6 Sep16 00:00:00 [kblockd/0]... david 28382 5790 10:10 00:00:00 bash david 29903 28382 10:10 00:00:00 ps -ef David Toman (University of Waterloo) The Relational Model 8 / 28

Atomic Queries Idea Relationships between objects (tuples) that are present in an instance are true, relationships absent are false. In the sample Bibliography database instance John is an author with id 1 : (1; John) 2 author; Mathematical Logic is a publication: (1; Mathematical Logic) 2 publication; Moreover it is a book published by AMS in 1990 : (1; AMS; 1990) 2 book; John wrote Mathematical Logic : John has NOT written Trans. Databases : etc. (1; 1) 2 wrote; (1; 3) 62 wrote; David Toman (University of Waterloo) The Relational Model 9 / 28

Atomic Queries Idea Relationships between objects (tuples) that are present in an instance are true, relationships absent are false. In the sample Bibliography database instance John is an author with id 1 : (1; John) 2 author; Mathematical Logic is a publication: (1; Mathematical Logic) 2 publication; Moreover it is a book published by AMS in 1990 : (1; AMS; 1990) 2 book; John wrote Mathematical Logic : John has NOT written Trans. Databases : etc. (1; 1) 2 wrote; (1; 3) 62 wrote; David Toman (University of Waterloo) The Relational Model 9 / 28

Relational Calculus: Syntax Idea Complex statements about truth can be formulated using the language of first-order logic. Definition (Syntax) Given a database schema = (R 1 ; : : : ; R k ) and a set of variable names fx 1 ; x 2 ; : : :g, formulas are defined by ' ::= R i (x i 1 ; : : : ; x i k ) j x i = x j j ' ^ ' j 9x i :' {z } conjunctive formulas j ' _ ' {z } positive formulas j :' {z } rst-order formulas David Toman (University of Waterloo) The Relational Model 10 / 28

First-order Variables and Valuations How do we interpret variables? Definition (Valuation) A valuation is a function : fx 1 ; x 2 ; : : :g! D that maps variable names to values in the universe. Idea Answers to queries, valuations to free variables that make the formula true with respect to a database. David Toman (University of Waterloo) The Relational Model 11 / 28

First-order Variables and Valuations How do we interpret variables? Definition (Valuation) A valuation is a function : fx 1 ; x 2 ; : : :g! D that maps variable names to values in the universe. Idea Answers to queries, valuations to free variables that make the formula true with respect to a database. David Toman (University of Waterloo) The Relational Model 11 / 28

Complete Semantics Definition The truth of formulas is defined with respect to 1 a database instance D = (D; =; R; S; : : :), and 2 a valuation : fx 1 ; x 2 ; : : :g! D as follows: D ; j= R(x i 1 ; : : : ; x i k ) if R 2 ; ((x i 1 ); : : : ; (x i k )) 2 R D ; j= x i = x j if (x i ) = (x j ) D ; j= ' ^ if D ; j= ' and D ; j= D ; j= :' if not D ; j= ' D ; j= 9x i :' if D ; [x i 7! v ] j= ' for some v 2 D Definition An answer to a query ' over a database D is a relation: '(D) = f jfv(') : D ; j= 'g David Toman (University of Waterloo) The Relational Model 12 / 28

Complete Semantics Definition The truth of formulas is defined with respect to 1 a database instance D = (D; =; R; S; : : :), and 2 a valuation : fx 1 ; x 2 ; : : :g! D as follows: D ; j= R(x i 1 ; : : : ; x i k ) if R 2 ; ((x i 1 ); : : : ; (x i k )) 2 R D ; j= x i = x j if (x i ) = (x j ) D ; j= ' ^ if D ; j= ' and D ; j= D ; j= :' if not D ; j= ' D ; j= 9x i :' if D ; [x i 7! v ] j= ' for some v 2 D Definition An answer to a query ' over a database D is a relation: '(D) = f jfv(') : D ; j= 'g David Toman (University of Waterloo) The Relational Model 12 / 28

Examples of Queries over integers: list all pairs of numbers that add to 10 list all composite numbers list all prime numbers over the bibliography database: list all publications list titles of all publications list titles of all books list all publications without authors list (pairs of) coauthor names list titles of publications written by a single author w.r.t. the OS kernel: list pairs of PIDs of processes that share an open file list PIDs of processes do not have any open file list pairs of PIDs of processes that wait for each other David Toman (University of Waterloo) The Relational Model 13 / 28

Examples of Queries over integers: list all pairs of numbers that add to 10 list all composite numbers list all prime numbers over the bibliography database: list all publications list titles of all publications list titles of all books list all publications without authors list (pairs of) coauthor names list titles of publications written by a single author w.r.t. the OS kernel: list pairs of PIDs of processes that share an open file list PIDs of processes do not have any open file list pairs of PIDs of processes that wait for each other David Toman (University of Waterloo) The Relational Model 13 / 28

Examples of Queries over integers: list all pairs of numbers that add to 10 list all composite numbers list all prime numbers over the bibliography database: list all publications list titles of all publications list titles of all books list all publications without authors list (pairs of) coauthor names list titles of publications written by a single author w.r.t. the OS kernel: list pairs of PIDs of processes that share an open file list PIDs of processes do not have any open file list pairs of PIDs of processes that wait for each other David Toman (University of Waterloo) The Relational Model 13 / 28

Equivalences and Syntactic Sugar Boolean Equivalences First-order Equivalences :(:') ' ' _ :(:' ^ : ) '! :' _ ' $ ('! ) ^ (! ')... 8x :' :9x ::' David Toman (University of Waterloo) The Relational Model 14 / 28

Integrity Constraints Relational signature captures only the structure of relations. Idea Valid database instances satisfy additional integrity constraints. values of a particular attribute belong to a prescribed data type. values of attributes are unique among tuples in a relation (keys). values appearing in one relation must also appear in another relation (referential integrity). values cannot appear simultaneously in certain relations (disjointness). values in certain relation must appear in at least one of another set of relations (coverage).... David Toman (University of Waterloo) The Relational Model 15 / 28

Example Revisited (Integers) What must be always true for the integers? plus is commutative plus is a (relation representing a) binary function plus and times obey the distributive law Idea Integrity constraints ) closed queries true for every valid database instance. David Toman (University of Waterloo) The Relational Model 16 / 28

Example Revisited (Integers) What must be always true for the integers? plus is commutative plus is a (relation representing a) binary function plus and times obey the distributive law Idea Integrity constraints ) closed queries true for every valid database instance. David Toman (University of Waterloo) The Relational Model 16 / 28

Example Revisited (Bibliography) Typing constraints Uniqueness of values/keys Author id s are integers. Author names are strings. Author id s are unique and determine author names. Publication id s are unique as well. Articles are identified by their id and the id of a collection they have appeared in. Referential Integrity/Foreign Keys books, journals, proceedings, and articles are publications. The components of a wrote tuple must be an author and a publication. David Toman (University of Waterloo) The Relational Model 17 / 28

Example Revisited (cont.) Disjointness Coverage books are different from journals. books are different from proceedings. Every publication is a book or a journal or a proceedings or an article. Every article appears in a book or in a journal or in proceedings. David Toman (University of Waterloo) The Relational Model 18 / 28

Views and Integrity Constraints Idea Answers to queries can be used to define derived relations (views) ) extension of a DB schema subtraction, complement,... collection-style publication, editor,... In general, a view is an integrity constraint of the form 8x 1 ; : : : ; x k :R(x 1 ; : : : ; x k ) () ' where R is a fresh relation name and x 1 ; : : : ; x k are free variables of '. ) why not simply an arbitrary formula containing R? David Toman (University of Waterloo) The Relational Model 19 / 28

Views and Integrity Constraints Idea Answers to queries can be used to define derived relations (views) ) extension of a DB schema subtraction, complement,... collection-style publication, editor,... In general, a view is an integrity constraint of the form 8x 1 ; : : : ; x k :R(x 1 ; : : : ; x k ) () ' where R is a fresh relation name and x 1 ; : : : ; x k are free variables of '. ) why not simply an arbitrary formula containing R? David Toman (University of Waterloo) The Relational Model 19 / 28

Views and Integrity Constraints Idea Answers to queries can be used to define derived relations (views) ) extension of a DB schema subtraction, complement,... collection-style publication, editor,... In general, a view is an integrity constraint of the form 8x 1 ; : : : ; x k :R(x 1 ; : : : ; x k ) () ' where R is a fresh relation name and x 1 ; : : : ; x k are free variables of '. ) why not simply an arbitrary formula containing R? David Toman (University of Waterloo) The Relational Model 19 / 28

Views and Integrity Constraints Idea Answers to queries can be used to define derived relations (views) ) extension of a DB schema subtraction, complement,... collection-style publication, editor,... In general, a view is an integrity constraint of the form 8x 1 ; : : : ; x k :R(x 1 ; : : : ; x k ) () ' where R is a fresh relation name and x 1 ; : : : ; x k are free variables of '. ) why not simply an arbitrary formula containing R? David Toman (University of Waterloo) The Relational Model 19 / 28

Story so far... 1 databases, relational structures 2 queries, formulas in First-Order logic 3 integrity constraints, closed formulas in FO logic... so is there anything new here? )YES: database instances must be finite David Toman (University of Waterloo) The Relational Model 20 / 28

Story so far... 1 databases, relational structures 2 queries, formulas in First-Order logic 3 integrity constraints, closed formulas in FO logic... so is there anything new here? )YES: database instances must be finite David Toman (University of Waterloo) The Relational Model 20 / 28

Story so far... 1 databases, relational structures 2 queries, formulas in First-Order logic 3 integrity constraints, closed formulas in FO logic... so is there anything new here? )YES: database instances must be finite David Toman (University of Waterloo) The Relational Model 20 / 28

Unsafe Queries :9x :author(x ; y) book(x ; y ; z ) _ proceedings(x ; y) x = y ) we want only queries with finite answers (over finite databases). Definition (Domain-independent Query) A query is domain-independent if its answer depends only in the database instance (but not on the domain D). Theorem Answers to domain-independent queries contain only values that exist in the database instance (or as a constant in the query). Domain-independent + finite database ) safe David Toman (University of Waterloo) The Relational Model 21 / 28

Unsafe Queries :9x :author(x ; y) book(x ; y ; z ) _ proceedings(x ; y) x = y ) we want only queries with finite answers (over finite databases). Definition (Domain-independent Query) A query is domain-independent if its answer depends only in the database instance (but not on the domain D). Theorem Answers to domain-independent queries contain only values that exist in the database instance (or as a constant in the query). Domain-independent + finite database ) safe David Toman (University of Waterloo) The Relational Model 21 / 28

Unsafe Queries :9x :author(x ; y) book(x ; y ; z ) _ proceedings(x ; y) x = y ) we want only queries with finite answers (over finite databases). Definition (Domain-independent Query) A query is domain-independent if its answer depends only in the database instance (but not on the domain D). Theorem Answers to domain-independent queries contain only values that exist in the database instance (or as a constant in the query). Domain-independent + finite database ) safe David Toman (University of Waterloo) The Relational Model 21 / 28

Unsafe Queries :9x :author(x ; y) book(x ; y ; z ) _ proceedings(x ; y) x = y ) we want only queries with finite answers (over finite databases). Definition (Domain-independent Query) A query is domain-independent if its answer depends only in the database instance (but not on the domain D). Theorem Answers to domain-independent queries contain only values that exist in the database instance (or as a constant in the query). Domain-independent + finite database ) safe David Toman (University of Waterloo) The Relational Model 21 / 28

Unsafe Queries :9x :author(x ; y) book(x ; y ; z ) _ proceedings(x ; y) x = y ) we want only queries with finite answers (over finite databases). Definition (Domain-independent Query) A query is domain-independent if its answer depends only in the database instance (but not on the domain D). Theorem Answers to domain-independent queries contain only values that exist in the database instance (or as a constant in the query). Domain-independent + finite database ) safe David Toman (University of Waterloo) The Relational Model 21 / 28

Safety and Query Satisfiability Theorem Satisfiability a of first-order formulas is undecidable; co-r.e. in general r.e for finite databases a Is there a database for which the answer is non-empty? Proof. Reduction from PCP (see Abiteboul et. al. book, p.122-126). Theorem Domain-independence of first-order queries is undecidable. Proof. ' is satisfiable iff x = y ^ ' is not domain-independent. David Toman (University of Waterloo) The Relational Model 22 / 28

Safety and Query Satisfiability Theorem Satisfiability a of first-order formulas is undecidable; co-r.e. in general r.e for finite databases a Is there a database for which the answer is non-empty? Proof. Reduction from PCP (see Abiteboul et. al. book, p.122-126). Theorem Domain-independence of first-order queries is undecidable. Proof. ' is satisfiable iff x = y ^ ' is not domain-independent. David Toman (University of Waterloo) The Relational Model 22 / 28

Range-restricted Queries Definition (Range restricted queries) Q ::= R(x i 1 ; : : : ; x i k ) Q ^ Q Q ^ x i = x j 9x i :Q Q 1 _ Q 2 Q 1 ^ :Q 2 ) ) x i ; x j 2 FV (Q) FV (Q 1 ) = FV (Q 2 ) Theorem Range-restricted ) Domain-independent. David Toman (University of Waterloo) The Relational Model 23 / 28

Range-restricted Queries Definition (Range restricted queries) Q ::= R(x i 1 ; : : : ; x i k ) Q ^ Q Q ^ x i = x j 9x i :Q Q 1 _ Q 2 Q 1 ^ :Q 2 ) ) x i ; x j 2 FV (Q) FV (Q 1 ) = FV (Q 2 ) Theorem Range-restricted ) Domain-independent. David Toman (University of Waterloo) The Relational Model 23 / 28

Safe v.s. Range-restricted Do we lose expressiveness by using to Range-restricted queries? Theorem Every domain-independent query can be written equivalently as a Range restricted query. Proof. 1 restrict every variable in ' to active domain (constants present in the database instance), 2 express the active domain using a unary query over the database instance. David Toman (University of Waterloo) The Relational Model 24 / 28

Computational Properties Evaluation of every query terminates ) relational calculus is not Turing complete Data Complexity in the size of the database, for a fixed query. ) in PTIME ) in LOGSPACE ) AC 0 (constant time on polynomially many CPUs in parallel) Combined complexity ) in PSPACE ) can express NP-hard problems (encode SAT) David Toman (University of Waterloo) The Relational Model 25 / 28

Query Evaluation vs. Theorem Proving Query Evaluation Given a query ' and a finite database instance D find substitutions such that D ; j= '. Query Satisfiability Given a query ' find whether there is a (finite) database instance D and a substitution such that D ; j= '. much harder (undecidable) problem can be solved for fragments (conjunctive/positive) queries David Toman (University of Waterloo) The Relational Model 26 / 28

Query Equivalence and Schema Do we ever need the power of theorem proving? Definition (Query Subsumption) ' subsumes (with respect to a schema ) if '(D) (D) for every database D such that D j=. necessary for query simplification equivalent to proving ^ i 2 i! (8x 1 ; : : : x k :'! ) undecidable in general; decidable for fragments of RC David Toman (University of Waterloo) The Relational Model 27 / 28

Query Equivalence and Schema Do we ever need the power of theorem proving? Definition (Query Subsumption) ' subsumes (with respect to a schema ) if '(D) (D) for every database D such that D j=. necessary for query simplification equivalent to proving ^ i 2 i! (8x 1 ; : : : x k :'! ) undecidable in general; decidable for fragments of RC David Toman (University of Waterloo) The Relational Model 27 / 28

What queries cannot be expressed in RC? Note RC is not Turing-complete ) there must be computable queries that cannot be written in RC. Built-in Operations Counting/Aggregation ordering, arithmetic, string operations, etc. cardinality of sets (parity) Reachability/Connectivity/Transitive closure paths in a graph (binary relation)? Model extensions: Incompleteness/Inconsistency tuples with unknown (but existing) values incomplete relations and open world assumption conflicting information (e.g., from different sources) David Toman (University of Waterloo) The Relational Model 28 / 28

What queries cannot be expressed in RC? Note RC is not Turing-complete ) there must be computable queries that cannot be written in RC. Built-in Operations Counting/Aggregation ordering, arithmetic, string operations, etc. cardinality of sets (parity) Reachability/Connectivity/Transitive closure paths in a graph (binary relation)? Model extensions: Incompleteness/Inconsistency tuples with unknown (but existing) values incomplete relations and open world assumption conflicting information (e.g., from different sources) David Toman (University of Waterloo) The Relational Model 28 / 28

What queries cannot be expressed in RC? Note RC is not Turing-complete ) there must be computable queries that cannot be written in RC. Built-in Operations Counting/Aggregation ordering, arithmetic, string operations, etc. cardinality of sets (parity) Reachability/Connectivity/Transitive closure paths in a graph (binary relation)? Model extensions: Incompleteness/Inconsistency tuples with unknown (but existing) values incomplete relations and open world assumption conflicting information (e.g., from different sources) David Toman (University of Waterloo) The Relational Model 28 / 28

What queries cannot be expressed in RC? Note RC is not Turing-complete ) there must be computable queries that cannot be written in RC. Built-in Operations Counting/Aggregation ordering, arithmetic, string operations, etc. cardinality of sets (parity) Reachability/Connectivity/Transitive closure paths in a graph (binary relation)? Model extensions: Incompleteness/Inconsistency tuples with unknown (but existing) values incomplete relations and open world assumption conflicting information (e.g., from different sources) David Toman (University of Waterloo) The Relational Model 28 / 28