Towards a Logical Reconstruction of Relational Database Theory


Raymond Reiter. On Conceptual Modelling, Lecture Notes in Computer Science, 1984.
Summary by C. Rey, November 27, 2008

Outline: Foreword. DB: 2 points of view. Contributions.

Why read this paper?
- Consolidate my database background
- Better understand the link between logic and databases
- Make precise some basic database notions:
    - the closed world assumption
    - the domain closure assumption
    - the unique name assumption
    - the notion of safe query
    - the notion of incomplete information (disjunctive information and null values)
- Begin some theoretical database course
- Focus on some points that are not necessarily the main contributions of the paper
- Formalization, informal explanations

Databases: 2 points of view
DB theoretician point of view = logical point of view:
- a DB is a model (now we would say a "finite" model)
- a model of some integrity constraints
- a query = a formula to be evaluated with respect to this model
DB logician point of view = proof theoretic point of view:
- a DB = a set of FO formulae
- a query = a formula to prove, given the DB as premises
- integrity constraint satisfaction = consistency checking
- efforts go into proof finding algorithms

Paper contributions
Best paradigm? Model theory? Proof theory?
Aim of the paper: give an answer.
- the two paradigms are reconcilable
- proof theory is a little bit more fruitful

Main contribution of the paper
Take a relational database without null values. It can be transformed into a set of FO axioms such that this theory characterizes, in proof theory:
- query evaluation
- integrity constraints
These theories can be generalized:
- to incorporate more real world knowledge
- to characterize incomplete information
- to characterize different semantics for null values
- to define precisely the notion of an answer to a query in the context of the previous points
- to define precisely the notion of integrity constraint and the constraint satisfaction problem

Databases and Logic: The Model Theoretic Perspective
Outline: FO languages. Relational languages. Types. Semantics. Interpretation. Assignment. Truth value. Model. Relational interpretation. Vocabulary. Remarks. FO query language. No safety. Problems.

First order languages
A first order language is a pair F = (A, W). A is an alphabet of symbols containing:
- a set of constants (possibly empty, possibly infinite): a, b, Paris, 34, Part10, ...
- an infinite set of variables: x, y, z, z_1, x_5, ...
- a set of predicates (at least 1, possibly infinitely many): P, Q, SUPPLIES, ... Each has an arity n.
- no function symbols
- punctuation signs: ( ) ,
- logical connectors: ∧, ∨, ¬, →, ↔
A term is a variable or a constant of A.

First order languages
A first order language is a pair F = (A, W). W is the set of well formed formulae, i.e. the smallest set containing:
- all atomic formulae, that is, expressions of the form P(t_1, ..., t_n) where P is a predicate name and each t_i is a term [1]
- all expressions built from one or two well formed formulae φ and ψ: φ ∧ ψ, φ ∨ ψ, φ → ψ, ¬ψ, φ ↔ ψ
- all expressions built using a quantifier among {∀, ∃}, one variable x and one well formed formula ψ: ∀x ψ, ∃x ψ
[1] If all t_i are constants, then it is called a ground atomic formula.

Relational languages
A FO language F = (A, W) is a relational language iff A is such that:
- the number of constants is finite, and at least 1
- the number of predicates is finite
- among the predicates, = is the equality predicate
- among the predicates, some unary ones are "simple types", whose boolean combinations model the notion of the domain of a relation

Types of a relational language
Let R = (A, W) be a relational language. The set of types of R is the smallest set containing:
- all simple types of A
- all boolean combinations of types τ_1 and τ_2: τ_1 ∧ τ_2, τ_1 ∨ τ_2, ¬τ_1
Type-restricted quantifiers, with τ a type:
- (∀x ∈ τ) φ means ∀x (τ(x) → φ)
- (∃x ∈ τ) φ means ∃x (τ(x) ∧ φ)

Semantics of FO languages
Defining the semantics is classically done in three steps:
- defining the notion of interpretation (semantics of constants and predicates)
- defining the notion of assignment (semantics of variables)
- defining the truth value of formulae (semantics of formulae)

Interpretation
An interpretation I of the relational language F = (A, W) is a pair (D^I, ·^I) where:
- D^I is a non-empty set called the domain (or the universe)
- ·^I is the interpretation function, defined as follows:
    - for each constant c: c^I ∈ D^I
    - for each predicate name P of arity n: P^I ⊆ (D^I)^n. P^I is called the extension of P in I.

Assignment
An assignment is a mapping v from the variables of A to D^I: for every variable x of A, v(x) ∈ D^I. It is sometimes written {x_1/a_1, ..., x_n/a_n}.
If x is a new variable, then v ∪ {x/c} is identical to v and in addition maps x to c.
All assignments that we will use are implicitly extended to constants as the identity function: for every constant c of A, v(c) := c.

Truth value of formula
Let R = (A, W) be a relational language, I an interpretation of R, and v an assignment. The ⊨ relation is defined as follows:
1. I ⊨ P(t_1, ..., t_n)[v] iff (v(t_1), ..., v(t_n)) ∈ P^I, for each atomic formula P(t_1, ..., t_n) in W.
2. I ⊨ (t_1 = t_2)[v] iff v(t_1) = v(t_2)
3. I ⊨ (φ ∧ ψ)[v] iff I ⊨ φ[v] and I ⊨ ψ[v]
4. I ⊨ (φ ∨ ψ)[v] iff I ⊨ φ[v] or I ⊨ ψ[v]
5. I ⊨ (¬φ)[v] iff not I ⊨ φ[v]
6. I ⊨ (φ → ψ)[v] iff I ⊨ (¬φ ∨ ψ)[v]
7. I ⊨ (φ ↔ ψ)[v] iff I ⊨ ((φ → ψ) ∧ (ψ → φ))[v]
8. I ⊨ (∀x φ)[v] iff for all c ∈ D^I, I ⊨ φ[v ∪ {x/c}]
9. I ⊨ (∃x φ)[v] iff there exists c ∈ D^I such that I ⊨ φ[v ∪ {x/c}]
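The nine clauses above can be read as a recursive procedure when the domain is finite. The following is only a minimal sketch under assumptions of ours: formulae are encoded as nested tuples, the function name `holds` and the tags ("atom", "eq", "forall", ...) are hypothetical, and constants denote themselves as in the convention v(c) := c.

```python
def holds(formula, domain, ext, v=None):
    """True iff the finite interpretation (domain, ext) satisfies formula under v."""
    v = v or {}
    op, *args = formula
    if op == "atom":                      # ("atom", "P", (t1, ..., tn))
        pred, terms = args
        # A variable is looked up in v; a constant denotes itself.
        return tuple(v.get(t, t) for t in terms) in ext[pred]
    if op == "eq":                        # ("eq", t1, t2)
        s, t = args
        return v.get(s, s) == v.get(t, t)
    if op == "not":
        return not holds(args[0], domain, ext, v)
    if op == "and":
        return holds(args[0], domain, ext, v) and holds(args[1], domain, ext, v)
    if op == "or":
        return holds(args[0], domain, ext, v) or holds(args[1], domain, ext, v)
    if op == "implies":                   # clause 6: same as (not phi) or psi
        return (not holds(args[0], domain, ext, v)) or holds(args[1], domain, ext, v)
    if op == "iff":                       # clause 7: both sides agree
        return holds(args[0], domain, ext, v) == holds(args[1], domain, ext, v)
    if op in ("forall", "exists"):        # clauses 8 and 9: range over the domain
        x, body = args
        results = (holds(body, domain, ext, {**v, x: c}) for c in domain)
        return all(results) if op == "forall" else any(results)
    raise ValueError(f"unknown connective: {op}")


# A toy interpretation (illustrative data, not the example of the slides).
domain = {"A", "B", "CS100"}
ext = {"teacher": {("A",), ("B",)}, "course": {("CS100",)}, "teach": {("A", "CS100")}}

teach_xy = ("atom", "teach", ("x", "y"))
typing = ("forall", "x", ("forall", "y", ("implies", teach_xy,
          ("and", ("atom", "teacher", ("x",)), ("atom", "course", ("y",))))))
every_teacher_teaches = ("forall", "x", ("implies", ("atom", "teacher", ("x",)),
                         ("exists", "y", ("and", ("atom", "course", ("y",)), teach_xy))))

typing_ok = holds(typing, domain, ext)                   # teach is well typed
all_teach = holds(every_teacher_teaches, domain, ext)    # B teaches nothing here
```

Because quantifiers simply loop over the domain, this evaluation only terminates for finite interpretations, which is the setting of relational interpretations.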

Model of one (or many) formula(e)
Let R = (A, W) be a relational language, I an interpretation of R, and v an assignment. We say:
- I satisfies φ under v iff I ⊨ φ[v]
- φ is false in I iff there is no assignment v such that I ⊨ φ[v]
- φ is true in I iff I ⊨ φ[v] for all assignments v. In this case, I is said to be a model of φ. We write I ⊨ φ.
- I is a model of a set S of well formed formulae iff I ⊨ φ for each φ ∈ S.

Relational interpretation
Let R = (A, W) be a relational language (recall: finite number of constants and predicates). An interpretation I is a relational interpretation for R iff:
- the restriction of ·^I to constants is bijective
- equality is defined by extension: =^I = {(d, d) | d ∈ D^I}
Important consequence: D^I is finite.

Relational database
A relational database is a triple (R, I, IC):
- R is a relational language
- I is a relational interpretation of R
- IC is a set of wffs of R called integrity constraints, such that for each predicate P of arity n (except = and simple types), IC must contain:
  ∀x_1 ... ∀x_n (P(x_1, ..., x_n) → τ_1(x_1) ∧ ... ∧ τ_n(x_n)), where the τ_i are types [2]
[2] τ_1, ..., τ_n are called the domains of P.

Vocabulary
- For each predicate P (except types), P^I is called the relation (of) P.
- The integrity constraints IC of (R, I, IC) are satisfied iff I is a model of IC.

Example
Let R = (A, W) be a relational language.
- Predicates (with arity): teacher (1), course (1), student (1), teach (2), enrolled (2), = (2)
- Simple types: teacher (1), course (1), student (1)
- Constants: A, B, C, a, b, c, d, CS100, CS200, P100, P200
- Interpretation domain D^I: {A, B, C, a, b, c, d, CS100, CS200, P100, P200}
Remark: this looks like a Herbrand interpretation, but it is not presented as one: it is only a way to give a name to the elements of the domain.

Example (cont.)
Predicate interpretations (as tables, one tuple per entry):
- teacher: A, B, C
- student: a, b, c, d
- course: CS100, CS200, P100, P200
- enrolled: (a, CS100), (a, P100), (b, CS100), (c, P100), (d, CS200), (d, P200)
- teach: (A, CS100), (A, CS200), (B, P100), (C, P200)
- =: (A, A), (B, B), (C, C), (a, a), (b, b), (c, c), (d, d), (CS100, CS100), (CS200, CS200), (P100, P100), (P200, P200)

Example (cont.)
The previous interpretation is a model of:
(2.1) ∀x ∀y (teach(x, y) → teacher(x) ∧ course(y))
(2.2) ∀x ∀y (enrolled(x, y) → student(x) ∧ course(y))
(2.3) (∀x ∈ course)(∃y ∈ teacher) teach(y, x)
(2.4) (∀x ∈ teacher)(∃y ∈ course) teach(x, y)
Thus:
- (R, I, IC = {(2.1), (2.2), (2.3), (2.4)}) is a relational database
- (2.1) and (2.2) define the domains of teach and enrolled
- the relational database satisfies its integrity constraints
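As an illustration, the four constraints can be checked directly against the tables of the example. This is a sketch under our own encoding (Python sets of tuples, hypothetical variable names such as `ic_2_1`); type-restricted quantifiers become loops over the corresponding type.

```python
# Extensions of the example interpretation, copied from the tables above.
teacher = {"A", "B", "C"}
student = {"a", "b", "c", "d"}
course = {"CS100", "CS200", "P100", "P200"}
teach = {("A", "CS100"), ("A", "CS200"), ("B", "P100"), ("C", "P200")}
enrolled = {("a", "CS100"), ("a", "P100"), ("b", "CS100"),
            ("c", "P100"), ("d", "CS200"), ("d", "P200")}

# (2.1) every teach tuple relates a teacher to a course
ic_2_1 = all(x in teacher and y in course for x, y in teach)
# (2.2) every enrolled tuple relates a student to a course
ic_2_2 = all(x in student and y in course for x, y in enrolled)
# (2.3) every course is taught by some teacher
ic_2_3 = all(any((y, x) in teach for y in teacher) for x in course)
# (2.4) every teacher teaches some course
ic_2_4 = all(any((x, y) in teach for y in course) for x in teacher)

constraints_satisfied = ic_2_1 and ic_2_2 and ic_2_3 and ic_2_4
```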

Remarks
Fundamental assumptions here:
- Unique name assumption: distinctly named elements of the domain are in fact distinct.
- A finite domain
- Closed world assumption (completion axioms)
These allow:
- an extensional definition of =: =(d, d') is false for all distinct elements d and d' of the domain (closed world assumption for =)
- an extensional definition of other predicates (<, >, ...) is also possible
- determining the truth of a sentence reduces to purely propositional truth table evaluations
Integrity constraints studied here:
- correspond to static integrity constraints, or state laws: to be satisfied by any state of the database
- do not correspond to dynamic integrity constraints, or transition laws

A first order query language
Let R = (A, W) be a relational language. A query for R is any expression of the form:
  {x_1, ..., x_n | (⋀_{i=1}^{n} τ_i(x_i)) ∧ W(x_1, ..., x_n)}
where:
- the x_i are variables of A
- the τ_i are types composed of simple types of A
- W(x_1, ..., x_n) ∈ W
- the only free variables of W(x_1, ..., x_n) are in {x_1, ..., x_n}
- all quantifiers are type restricted

Answer to a query
Let DB = (R = (A, W), I, IC) be a relational database. A query for R is said to be applicable to DB.
Let Q be the following query, applicable to DB:
  {x_1, ..., x_n | (⋀_{i=1}^{n} τ_i(x_i)) ∧ W(x_1, ..., x_n)}
Then a tuple (c_1, ..., c_n) ∈ A^n is an answer to Q with respect to DB iff (by definition) I is a model of (⋀_{i=1}^{n} τ_i(c_i)) ∧ W(c_1, ..., c_n).

Example
Queries applicable to the previous database:
- Who teaches P100?
  {x | teacher(x) ∧ teach(x, P100)}
- Who are all of A's students?
  {x | student(x) ∧ ((∃y ∈ course) teach(A, y) ∧ enrolled(x, y))}
- What courses does a take and who teaches them?
  {(x, y) | course(x) ∧ teacher(y) ∧ (enrolled(a, x) ∧ teach(y, x))}
- Who teaches all of the students?
  {x | teacher(x) ∧ ((∀y ∈ student)(∃z ∈ course) teach(x, z) ∧ enrolled(y, z))}
Queries not applicable (meaningless):
- {x | teacher(x) ∧ teach(x, MATH100)}
- {x | supplier(x) ∧ ((∃y ∈ part) supplies(x, y))}
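The four applicable queries can be evaluated over the example tables with set comprehensions. This is a sketch under our own encoding (the variable names `q1` to `q4` are hypothetical); each comprehension mirrors the type restrictions and the body W of the corresponding query.

```python
teacher = {"A", "B", "C"}
student = {"a", "b", "c", "d"}
course = {"CS100", "CS200", "P100", "P200"}
teach = {("A", "CS100"), ("A", "CS200"), ("B", "P100"), ("C", "P200")}
enrolled = {("a", "CS100"), ("a", "P100"), ("b", "CS100"),
            ("c", "P100"), ("d", "CS200"), ("d", "P200")}

# Who teaches P100?
q1 = {x for x in teacher if (x, "P100") in teach}

# Who are all of A's students?
q2 = {x for x in student
      if any(("A", y) in teach and (x, y) in enrolled for y in course)}

# What courses does a take and who teaches them?
q3 = {(x, y) for x in course for y in teacher
      if ("a", x) in enrolled and (y, x) in teach}

# Who teaches all of the students?
q4 = {x for x in teacher
      if all(any((x, z) in teach and (y, z) in enrolled for z in course)
             for y in student)}
```

On this data no teacher reaches every student, so the last query has an empty answer set.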

No need for safety
In (domain) relational calculus, the domain of a relation:
- is the totality of all individuals of a certain kind
- may be finite or infinite (thus the whole domain may be infinite as well)
Thus the unrestricted complement of a (finite) relation may be infinite, hence the need for the notion of safe queries.
But, with a finite domain:
- this (infinite) totality is not clear: it is never explicitly represented in databases
- it may be more natural that queries are about things the DB knows about
- hence no need for safety
Comments (C. Rey), same consideration:
- active domain semantics
- domain independent queries (and many syntactical conditions on queries)
- constraint databases, to handle infinite relations in a finite way

Problems with the model theoretic approach
Databases with incomplete information.
Disjunctive information: manage many interpretations
- Represent a fact such as: a is enrolled in CS100 or in P100
- Split the interpretation into 3: one in which it is CS100, one for P100 and one for both
- Constants are answers if they are answers in all 3 interpretations
Null values (as values at present not known):
- values not known but to be taken from a finite set of known values: amounts to the previous case of disjunctive information
- values not known but to be taken from an infinite set of not always known values: inserting "null" in relations, defining a third truth value "unknown"; not clear how to extend the relational model...
- other kinds of null values...
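The "split into 3 interpretations" idea can be sketched as follows. The base facts and the names `worlds` and `certain` are illustrative assumptions of ours; an answer is kept only if it is an answer in every one of the three interpretations.

```python
# Enrolment facts that are known for sure (illustrative subset).
base = {("b", "CS100"), ("c", "P100")}

# "a is enrolled in CS100 or in P100" gives three candidate interpretations.
worlds = [
    base | {("a", "CS100")},
    base | {("a", "P100")},
    base | {("a", "CS100"), ("a", "P100")},
]

def certain(query):
    """Answers that hold in every interpretation."""
    return set.intersection(*(query(w) for w in worlds))

# Who is enrolled in at least one course?
enrolled_somewhere = certain(lambda w: {s for s, _ in w})
# Who is enrolled in CS100?
in_cs100 = certain(lambda w: {s for s, c in w if c == "CS100"})
```

Student a is certainly enrolled somewhere, but is not a certain answer for CS100, since one of the three interpretations places a only in P100.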

Problems (cont.)
Extending the relational model to incorporate more real world knowledge:
- to express general facts about the world: "All men are mortal"
- to manage events (sequencing, times of occurrence)
- generalization hierarchies with property inheritance
Some knowledge can be represented within integrity constraints (such as the transitivity of a relation), but much other knowledge cannot.
What about in the context of proof theory?

Databases and Logic: The Proof Theoretic Perspective
Outline: Purpose. Intuition. Domain closure axiom. Unique name axioms. Completion axioms. Equality axioms. Relational theory. Equivalence. IC satisfaction.

Purpose of this section
From model theory to proof theory:
- define relational theories (a class of FO theories)
- show the equivalence between relational theories and interpretations
- define a relational database as (R, T, IC) where T is a relational theory
- truth in relational interpretations ⟺ provability in relational theories
Comment (C. Rey): in proof theory, all assumptions are made explicit in axioms.

From interpretation to theory
Let DB = (R = (A, W), I, IC) be a relational database:
- all domain elements are named using constants
- predicate interpretations can be written as ground atomic formulae:
  {teacher(A), teacher(B), teacher(C), ..., enrolled(d, P200), ..., =(P200, P200)}
Main idea:
- consider the previous set of ground atomic formulae as a FO theory T
- add to this theory other sentences to express all aspects of a relational database

From interpretation to theory
Various formulae true in I can be proven with T as premises:
- T ⊢ enrolled(c, P100)
- T ⊢ enrolled(a, P100) ∧ teach(B, P100)
- T ⊢ (∃y ∈ course) teach(A, y) ∧ enrolled(a, y)
Formulae that are not provable:
- ∀x (teacher(x) ∨ course(x) ∨ student(x)), because T does not say that A, B, ..., P100, P200 are the only domain elements: needs the domain closure axiom
- ¬=(A, B), ¬=(A, C), ...: need the unique name axioms
- ¬teacher(a): needs the completion axioms
Moreover, the ground formulae for equality can be replaced by the equality axioms: reflexivity, commutativity, transitivity, substitution.

Domain closure axiom
Let I be a relational interpretation with a finite domain D^I = {c_1, ..., c_n}. The domain closure axiom for I is:
  ∀x (=(x, c_1) ∨ ... ∨ =(x, c_n))

Unique name axioms
Let I be a relational interpretation with a finite domain D^I = {c_1, ..., c_n}. The unique name axioms for I are: for each pair of constants c_i and c_j such that c_i and c_j are syntactically different, there is the axiom:
  ¬=(c_i, c_j)

Completion axioms
Let I be a relational interpretation with a finite domain D^I = {c_1, ..., c_n}. Let Δ ⊆ W be the set of ground atomic formulae (not containing equality atoms) that are true in I. For each predicate P of arity m (different from =), we define C_P = {c⃗ | P(c⃗) ∈ Δ}.
- Either C_P = {(c^1_1, ..., c^1_m), ..., (c^r_1, ..., c^r_m)}. Then the completion axiom for P is:
  ∀x_1, ..., x_m (P(x_1, ..., x_m) → ⋁_{j=1}^{r} (⋀_{i=1}^{m} =(x_i, c^j_i)))
- Or C_P = ∅. Then the completion axiom for P is:
  ∀x_1, ..., x_m (¬P(x_1, ..., x_m))

Equality axioms
Let I be a relational interpretation with a finite domain D^I = {c_1, ..., c_n}.
- Reflexivity: ∀x =(x, x)
- Commutativity: ∀x ∀y (=(x, y) → =(y, x))
- Transitivity: ∀x ∀y ∀z ((=(x, y) ∧ =(y, z)) → =(x, z))
- Substitution: for each predicate P of arity m,
  ∀x_1, ..., x_m, y_1, ..., y_m ((P(x_1, ..., x_m) ∧ (⋀_{i=1}^{m} =(x_i, y_i))) → P(y_1, ..., y_m))

Example
The previous example yields the theory T:
- Ground atomic formulae (except equality atoms): teacher(A), teacher(B), teacher(C), ..., enrolled(d, P200)
- Domain closure axiom: ∀x (=(x, A) ∨ =(x, B) ∨ ... ∨ =(x, P200))
- Unique name axioms: ¬=(A, B), ¬=(A, C), ¬=(A, a), ...
- Equality axioms
- Completion axiom for each predicate:
  ∀x (teacher(x) → (=(x, A) ∨ =(x, B) ∨ =(x, C)))
  ∀x ∀y (enrolled(x, y) → ((=(x, a) ∧ =(y, CS100)) ∨ ... ∨ (=(x, d) ∧ =(y, P200))))
  ...
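The axioms of T are mechanical to write down from the tables, which the sketch below illustrates. The function names and the ASCII rendering (`forall`, `->`, `|`, `&`, `~` standing for ∀, →, ∨, ∧, ¬) are our own choices, not notation from the paper.

```python
from itertools import combinations

def domain_closure(constants):
    """forall x (=(x, c1) | ... | =(x, cn))"""
    return "forall x (" + " | ".join(f"=(x, {c})" for c in constants) + ")"

def unique_names(constants):
    """One inequality per pair of syntactically different constants."""
    return [f"~=({c}, {d})" for c, d in combinations(constants, 2)]

def completion(pred, arity, tuples):
    """Completion axiom for pred, built from the tuples of its extension."""
    xs = [f"x{i + 1}" for i in range(arity)]
    prefix = " ".join(f"forall {x}" for x in xs)
    head = f"{pred}({', '.join(xs)})"
    if not tuples:                        # empty extension: pred is false everywhere
        return f"{prefix} ~{head}"
    disjuncts = [" & ".join(f"=({x}, {c})" for x, c in zip(xs, t))
                 for t in sorted(tuples)]
    body = " | ".join(f"({d})" for d in disjuncts)
    return f"{prefix} ({head} -> {body})"

constants = ["A", "B", "C", "a", "b", "c", "d", "CS100", "CS200", "P100", "P200"]
closure_axiom = domain_closure(constants)
name_axioms = unique_names(constants)
teacher_axiom = completion("teacher", 1, {("A",), ("B",), ("C",)})
```

With 11 constants the unique name axioms already number 55, which hints at why these theories are better seen as a specification than as something to feed verbatim to a prover.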

Relational theory
Let R = (A, W) be a relational language. A first order theory T ⊆ W is a relational theory of R iff it satisfies the following properties:
1. If c_1, ..., c_n are all the constants of A, then T contains the domain closure axiom and the unique name axioms.
2. T contains all equality axioms.
3. There is a set Δ ⊆ W of ground atomic formulae (not containing equality atoms) such that Δ ⊆ T. Moreover, T contains the completion axioms built from Δ.
4. The only wffs of T are those required by conditions 1, 2 and 3.

Equivalence: theories ⟺ interpretations
Theorem. Suppose R = (A, W) is a relational language. Then:
1. If T is a relational theory of R, then T has a unique model I, which is a relational interpretation for R.
2. If I is a relational interpretation for R, then there is a relational theory T of R such that I is the only model of T.

Equivalence: truth ⟺ provability
Corollary. Suppose R = (A, W) is a relational language, T is a relational theory of R and I is a model of T. Then, for any wff ϕ of R:
  ϕ is true in I ⟺ T ⊢ ϕ
Proof. I is the unique model of T, so the result follows from the completeness theorem for FOL.

Relational database
A relational database is a triple (R, T, IC):
- R is a relational language
- T is a relational theory of R
- IC is a set of wffs of R called integrity constraints, such that for each predicate P of arity n (except = and simple types), IC must contain:
  ∀x_1 ... ∀x_n (P(x_1, ..., x_n) → τ_1(x_1) ∧ ... ∧ τ_n(x_n)), where the τ_i are types [3]
[3] τ_1, ..., τ_n are called the domains of P.

Satisfaction of integrity constraints
Let (R, T, IC) be a relational database. The integrity constraints IC are said to be satisfied iff:
  ∀ϕ ∈ IC, T ⊢ ϕ

Answer to a query
Let DB = (R = (A, W), T, IC) be a relational database. Let Q be the following query, applicable to DB:
  {x_1, ..., x_n | (⋀_{i=1}^{n} τ_i(x_i)) ∧ W(x_1, ..., x_n)}
Then a tuple (c_1, ..., c_n) ∈ A^n is an answer to Q with respect to DB iff (by definition):
1. T ⊢ τ_i(c_i) for all i ∈ {1, ..., n}, and
2. T ⊢ W(c_1, ..., c_n)

Generalizing the proof theoretic perspective: DB that contain incomplete information
Outline: Disjunctive information. Disjunctive information: problem. Disjunctive information: solution. Generalized completion axioms. Generalized relational theory. Generalized relational database. Adding null values. Generalized relational theory with null values. Decidability.

Adding disjunctive information
Problem: represent disjunctive facts of the form "P is the case or Q is, or ... but I don't know which".
Example. Ground atomic formulae as tables (one tuple per entry):
- part: p_1, p_2, p_3
- supplier: Acme, Foo
- supplies: (Acme, p_1), (Foo, p_2)
- subpart: (p_1, p_2)

Adding disjunctive information: problem
Other axioms:
- Domain closure axiom: ∀x (=(x, p_1) ∨ =(x, p_2) ∨ =(x, p_3) ∨ =(x, Acme) ∨ =(x, Foo))
- Unique name axioms: ¬=(p_1, p_2), ¬=(p_1, p_3), ...
- Equality axioms: as usual
- Completion axioms:
  1. ∀x (part(x) → (=(x, p_1) ∨ =(x, p_2) ∨ =(x, p_3)))
  2. ∀x (supplier(x) → (=(x, Acme) ∨ =(x, Foo)))
  3. ∀x ∀y (supplies(x, y) → ((=(x, Acme) ∧ =(y, p_1)) ∨ (=(x, Foo) ∧ =(y, p_2))))
  4. ∀x ∀y (subpart(x, y) → (=(x, p_1) ∧ =(y, p_2)))
Problem: we cannot add supplies(Foo, p_1) ∨ supplies(Foo, p_3), since the (completion) axioms prove ¬supplies(Foo, p_1) ∧ ¬supplies(Foo, p_3).

Adding disjunctive information: solution
Replace completion axiom 3 by:
  3. ∀x ∀y (supplies(x, y) → ((=(x, Acme) ∧ =(y, p_1)) ∨ (=(x, Foo) ∧ =(y, p_2)) ∨ (=(x, Foo) ∧ =(y, p_1)) ∨ (=(x, Foo) ∧ =(y, p_3))))
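The effect of the two versions of the completion axiom for supplies can be checked by brute force. The sketch below is a model-level illustration under our own assumptions: only the extension of supplies varies, and the function name `models` is hypothetical. It enumerates the extensions that contain the known facts, stay within the tuples allowed by the completion axiom, and satisfy the disjunction.

```python
from itertools import combinations

facts = {("Acme", "p1"), ("Foo", "p2")}
disjunction = [("Foo", "p1"), ("Foo", "p3")]   # supplies(Foo, p1) or supplies(Foo, p3)

def models(allowed):
    """Extensions of supplies compatible with the facts, the completion
    axiom (tuples must lie in `allowed`) and the disjunctive fact."""
    optional = sorted(allowed - facts)
    found = []
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            ext = facts | set(extra)
            if any(t in ext for t in disjunction):
                found.append(ext)
    return found

# Original completion axiom: only the two known tuples are allowed.
old_models = models(set(facts))
# Generalized completion axiom: the tuples of the disjunction are allowed too.
new_models = models(facts | set(disjunction))
```

With the original axiom no extension satisfies the disjunction, which is the inconsistency described in the problem; with the replaced axiom exactly three extensions remain, one per way of making the disjunction true.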

Generalized completion axioms
Let R = (A, W) be a relational language. A wff of W is a positive ground clause iff it has the form A_1 ∨ ... ∨ A_m where each A_i is a ground non-equality atomic formula.
Let Δ ⊆ W be a set of positive ground clauses of R. For each predicate P of arity m (different from =), we define:
  C_P = {c⃗ | there exist a clause A_1 ∨ ... ∨ A_k ∈ Δ and i ∈ {1, ..., k} such that A_i is P(c⃗)}
- Either C_P = {(c^1_1, ..., c^1_m), ..., (c^r_1, ..., c^r_m)}. Then the completion axiom for P is:
  ∀x_1, ..., x_m (P(x_1, ..., x_m) → ⋁_{j=1}^{r} (⋀_{i=1}^{m} =(x_i, c^j_i)))
- Or C_P = ∅. Then the completion axiom for P is:
  ∀x_1, ..., x_m (¬P(x_1, ..., x_m))
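Computing C_P from a set of positive ground clauses amounts to collecting every tuple that appears under P in some clause. A minimal sketch, assuming clauses are encoded as lists of (predicate, tuple) atoms and using a hypothetical helper name `candidate_tuples`:

```python
def candidate_tuples(pred, clauses):
    """C_P: all argument tuples of pred occurring in some clause of the set."""
    return {args for clause in clauses for (p, args) in clause if p == pred}

# The supplier example: unit clauses for known facts, one genuine disjunction.
delta = [
    [("supplies", ("Acme", "p1"))],
    [("supplies", ("Foo", "p2"))],
    [("supplies", ("Foo", "p1")), ("supplies", ("Foo", "p3"))],
    [("subpart", ("p1", "p2"))],
]

c_supplies = candidate_tuples("supplies", delta)
c_subpart = candidate_tuples("subpart", delta)
c_unused = candidate_tuples("ships", delta)   # a predicate with no atom in delta
```

The four tuples obtained for supplies are exactly the four disjuncts of the replaced completion axiom in the supplier example, and a predicate with an empty C_P gets the axiom stating that it is false everywhere.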

Generalized relational theory
Let R = (A, W) be a relational language. A first order theory T ⊆ W is a generalized relational theory of R iff it satisfies the following properties:
1. As for a relational theory.
2. As for a relational theory.
3. There is a set Δ ⊆ W of clauses such that Δ ⊆ T. Moreover, T contains the generalized completion axioms built from Δ.
4. As for a relational theory.

Generalized relational database
A generalized relational database is a triple $(R, T, IC)$ where:
- $R$ is a relational language
- $T$ is a generalized relational theory of $R$
- $IC$ is a set of wffs of $R$ called integrity constraints
Satisfaction of an integrity constraint and the answer to a query are defined as before.
Theorem: Every generalized relational theory is consistent.

Adding null values
Problem: represent the following kind of null value: "value at present unknown, and not necessarily one of some finite set of known possible values".
Example: represent "Some supplier supplies part $p_3$, but I don't know who it is. Moreover, this supplier may or may not be one of the known suppliers Acme or Foo."
Solution: $\exists x\,(\mathit{supplier}(x) \land \mathit{supplies}(x, p_3))$, or $\mathit{supplier}(\omega) \land \mathit{supplies}(\omega, p_3)$ with a Skolem constant $\omega$.
How can such new constants be integrated?

Adding null values
Unique name axioms: $\omega$ may denote the same individual as another constant, so there are no unique name axioms for $\omega$.
Domain closure axiom: $\forall x\,({=}(x, p_1) \lor {=}(x, p_2) \lor {=}(x, p_3) \lor {=}(x, \mathit{Acme}) \lor {=}(x, \mathit{Foo}) \lor {=}(x, \omega))$
Completion axioms:
1. $\forall x\,(\mathit{supplier}(x) \rightarrow ({=}(x, \mathit{Acme}) \lor {=}(x, \mathit{Foo}) \lor {=}(x, \omega)))$
2. $\forall x \forall y\,(\mathit{supplies}(x, y) \rightarrow (({=}(x, \mathit{Acme}) \land {=}(y, p_1)) \lor ({=}(x, \mathit{Foo}) \land {=}(y, p_2)) \lor ({=}(x, \omega) \land {=}(y, p_3))))$

Adding null values
Modify the theory:
- add Skolem constants
- modify the domain closure axiom
- modify the completion axioms
- possibly add some inequality axioms (between Skolem constants and constants, or between Skolem constants)
- DO NOT change the unique name axioms

Generalized relational theory with null values
As before:
- precise definition of all axioms
- definition of a generalized relational database with null values
- the definitions of satisfaction of an integrity constraint and of an answer to a query remain the same
- consistency of every generalized relational database with null values
Any generalized relational theory with null values is decidable (because of the finite domain).

Decidability
Any generalized relational theory with null values is decidable (because of the finite domain).
Testing the truth of a formula:
- in all possible models: very expensive
- by a theorem-proving approach: better
- by a generalization of relational algebra: still better, but is it possible?
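The "all possible models" test can be made concrete on the supplier example with the null value ω. The following Python sketch is illustrative only: it assumes no extra inequality axioms have been added for ω, it considers only the `supplies` predicate, and the names `models`, `status`, `CONSTANTS` and `NULL` are hypothetical. Because the domain closure and unique name axioms fix the ordinary constants, a model is determined (up to isomorphism) by what ω denotes: one of the known constants or a further individual.

```python
CONSTANTS = ["p1", "p2", "p3", "Acme", "Foo"]
NULL = "w"  # stands for the Skolem constant omega

# Tuples listed in the completion axiom for supplies (all are facts here).
SUPPLIES_FACTS = [("Acme", "p1"), ("Foo", "p2"), (NULL, "p3")]

def models():
    """Yield the extension of supplies for each possible denotation of the null."""
    for denotation in CONSTANTS + [NULL]:
        def denote(c, d=denotation):
            # Ordinary constants denote themselves; the null denotes d.
            return d if c == NULL else c
        yield {(denote(x), denote(y)) for x, y in SUPPLIES_FACTS}

def status(query):
    """Classify a query as 'true', 'false' or 'unknown' over all models."""
    results = {query(m) for m in models()}
    if results == {True}:
        return "true"
    if results == {False}:
        return "false"
    return "unknown"

# Does Acme supply p3? Holds only in the model where omega denotes Acme.
print(status(lambda s: ("Acme", "p3") in s))
# Does somebody supply p3? Holds in every model.
print(status(lambda s: any(y == "p3" for _, y in s)))
# Does Foo supply p1? Holds in no model.
print(status(lambda s: ("Foo", "p1") in s))
```

Even in this toy setting the number of models grows with the number of constants and nulls, which is why the slide calls the approach very expensive.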

Generalizing the proof theoretic perspective: conceptual modelling

Purpose of incorporating more knowledge
The relational model should be able to express more real-world knowledge:
- events
- hierarchies and property inheritance
- aggregations
There are many existing extensions, but:
- their semantics is not clear
- they are difficult to compare
Logic can be a solution (proposals have been made to adapt relational theories).

Discussion
General principle for adding more knowledge:
- new FO formulae are added as integrity constraints
- the other existing axioms must stay in the theory (to ensure inference)
General properties: there must be
- a domain closure axiom
- unique name axioms (unnecessary for null values)
- a representation of the closed world assumption

Closed world assumption
This is a characterization of negation in databases.
It is an inference rule: $\dfrac{T \nvdash R(\vec{c})}{\neg R(\vec{c})}$
Problems:
- it treats null values incorrectly
- with disjunctive information, it leads to inconsistency
- it is not FO definable (it is a meta notion)
In relational theories, the completion axioms express the CWA. What about more general settings?
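The inconsistency with disjunctive information can be checked mechanically on a tiny case. The sketch below is a propositional simplification, not the first-order setting of the slides: it treats the two ground atoms of the earlier disjunction as propositional variables and decides entailment by truth tables. The names `models_of`, `entails` and `cwa_closure` are hypothetical.

```python
from itertools import product

ATOMS = ["supplies(Foo,p1)", "supplies(Foo,p3)"]

def models_of(theory):
    """All truth assignments over ATOMS that satisfy every formula in theory."""
    found = []
    for values in product([False, True], repeat=len(ATOMS)):
        assignment = dict(zip(ATOMS, values))
        if all(formula(assignment) for formula in theory):
            found.append(assignment)
    return found

def entails(theory, formula):
    """True iff formula holds in every model of theory."""
    return all(formula(assignment) for assignment in models_of(theory))

def cwa_closure(theory):
    """Apply the CWA rule: add 'not A' for every atom A the theory does not entail."""
    negations = [
        (lambda assignment, a=atom: not assignment[a])
        for atom in ATOMS
        if not entails(theory, lambda assignment, a=atom: assignment[a])
    ]
    return theory + negations

# T contains only the disjunction supplies(Foo,p1) v supplies(Foo,p3).
T = [lambda assignment: assignment[ATOMS[0]] or assignment[ATOMS[1]]]
CLOSED = cwa_closure(T)

print(len(models_of(T)))       # number of models of T
print(len(models_of(CLOSED)))  # number of models after applying the CWA
```

Neither atom is entailed by the disjunction alone, so the rule adds both negations, and the resulting set of formulas has no model.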

Benefits

(Model and) proof theoretic approach(es)
They are based on a semantics, so concepts have precise definitions:
- unique name assumption
- domain closure assumption
- closed world assumption
- null values
- disjunctive information
- answer to a query
- integrity constraints
They allow conceptual uniformity:
- representational (all in the same FO language): queries, integrity constraints, facts
- operational (all solved by proof theory): query evaluation, satisfaction of constraints

(Model and) proof theoretic approach(es)
They have practical advantages (for conceptual modelling):
- non-logical models are given a precise semantics
- different data models can be compared
- non-proof-theoretic query evaluation algorithms can be proved correct with respect to the logical semantics of queries
- integrity constraint maintenance algorithms can be proved correct with respect to the proof theoretic definition of constraint satisfaction