Ontology and Database Systems: Foundations of Database Systems


Werner Nutt
Faculty of Computer Science
Master of Science in Computer Science
A.Y. 2014/2015

Incomplete Information

Schema
  Person(fname, surname, city, street)
  City(cname, population)

We know
- Mair lives in Bozen (but we don't know first name and street)
- Carlo Rossi lives in Bozen (but we don't know the street)
- Mair and Carlo Rossi live in the same street (but we don't know which)
- Maria Pichler lives in Brixen
- Bozen has a population of 100,500
- Brixen has a population < 100,000 (but we don't know the number)

Queries
1. Return first name and surname of people living in Bozen!
2. Return the surnames of people living in Bozen!
3. Who (surname) is living in the same street as Mair?
4. Which people are living in a city with less than 100,000 inhabitants?

W. Nutt ODBS-FDBs 2014/2015 (1/73)

Incomplete Information: Questions

- How can we represent this info?
- What does the representation look like?
- What is its meaning?
- What answers should the queries return?
- How can we define the semantics of queries?

Modeling Incomplete Information: SQL Nulls

Person
  fname   surname   city     street
  NULL    Mair      Bozen    NULL
  Carlo   Rossi     Bozen    NULL
  Maria   Pichler   Brixen   NULL

City
  cname    population
  Bozen    100,500
  Brixen   NULL

Intuitive meaning of NULL: one of
(i) does not exist
(ii) exists, but is unknown
(iii) unknown whether (i) or (ii)

SQL Nulls: Formal Semantics

- dom (or, equivalently, every type) is extended by a new value: NULL
- built-in predicates are evaluated according to a 3-valued logic with truth values false < unknown < true
- atoms with NULL evaluate to unknown
- Boolean operations:
  - AND/OR correspond to min/max on truth values
  - NOT extends the classical definition by NOT(unknown) = unknown
- additional operation ISNULL( ) with ISNULL(v) = true iff v is NULL
- a query returns those tuples for which the query conditions evaluate to true
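The rules above can be sketched in a few lines of Python. This is an illustrative model only, not how any SQL engine is implemented: the names TV, tv_and, tv_or, tv_not, eq, lt and is_null are hypothetical helpers, Python's None stands in for NULL, and the four example queries are hand-coded as comprehensions over the Person and City tables from the SQL Nulls slide.

```python
from enum import IntEnum

class TV(IntEnum):
    """Truth values ordered as false < unknown < true."""
    FALSE = 0
    UNKNOWN = 1
    TRUE = 2

NULL = None  # assumption: None plays the role of SQL NULL

def tv_and(a, b):
    return TV(min(a, b))      # AND is min on truth values

def tv_or(a, b):
    return TV(max(a, b))      # OR is max on truth values

def tv_not(a):
    return TV(2 - a)          # swaps true/false, keeps unknown

def eq(x, y):
    if x is NULL or y is NULL:
        return TV.UNKNOWN     # atoms with NULL evaluate to unknown
    return TV.TRUE if x == y else TV.FALSE

def lt(x, y):
    if x is NULL or y is NULL:
        return TV.UNKNOWN
    return TV.TRUE if x < y else TV.FALSE

def is_null(v):
    return TV.TRUE if v is NULL else TV.FALSE

person = [
    (NULL, "Mair", "Bozen", NULL),
    ("Carlo", "Rossi", "Bozen", NULL),
    ("Maria", "Pichler", "Brixen", NULL),
]
city = [("Bozen", 100500), ("Brixen", NULL)]

# A tuple is returned only if the condition evaluates to TRUE.
q1 = [(f, s) for (f, s, c, _) in person if eq(c, "Bozen") == TV.TRUE]
q2 = [(s,) for (_, s, c, _) in person if eq(c, "Bozen") == TV.TRUE]
q3 = [(p[1],) for p in person for m in person
      if tv_and(eq(m[1], "Mair"), eq(p[3], m[3])) == TV.TRUE]
q4 = [(p[0], p[1]) for p in person for c in city
      if tv_and(eq(p[2], c[0]), lt(c[1], 100000)) == TV.TRUE]
```

Queries 3 and 4 come out empty because every comparison that touches a NULL street or a NULL population evaluates to unknown, never to true.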

SQL Nulls: Example Queries

- Query 1 returns (NULL, Mair) and (Carlo, Rossi)
- Query 2 returns Mair and Rossi
- Queries 3 and 4 return nothing

Representation Systems [Imieliński/Lipski, 1984]

Distinguish between
- semantic instances, which are the ones we know
- syntactic instances, which contain tuples with variables (written $\bot_1, \bot_2, \dots$)

A syntactic instance represents many semantic instances.

Syntactic instances are called multi-tables (i.e., several tables). There are three kinds of (multi-)tables:
- Codd Tables: a variable occurs no more than once
- Naive or Variable Tables: a variable can occur several times
- Conditional Tables: variable table where each tuple $t$ is tagged with a boolean combination $\mathit{cond}(t)$ of built-in atoms

Short names: table, v-table, c-table

Semantics of Tables

Let $T$ be a multi-table with variables $\mathit{var}(T)$. For an assignment $\alpha\colon \mathit{var}(T) \to \mathit{dom}$ we define

$\alpha T = \{\alpha t \mid t \in T,\ \alpha \models \mathit{cond}(t)\}$

Then $T$ represents the infinite sets of instances

$\mathit{rep}(T) = \{\alpha T \mid \alpha\colon \mathit{var}(T) \to \mathit{dom}\}$
$\mathit{Rep}(T) = \{J \mid I \subseteq J \text{ for some } I \in \mathit{rep}(T)\}$

where
- $\mathit{rep}(T)$ is the closed-world interpretation of $T$
- $\mathit{Rep}(T)$ is the open-world interpretation of $T$

(Many results hold for both the closed-world and the open-world interpretation. We assume the open-world interpretation if not said otherwise.)
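As a rough illustration of the closed-world definition, the sketch below enumerates rep(T) for a small c-table. Since dom is infinite, the code restricts assignments to a finite list of values passed in by the caller, which is an assumption of this sketch and not part of the definition. The class Var and the functions variables, apply_assignment and rep are hypothetical names; conditions are modelled as Python functions from an assignment to a boolean.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Var:
    """A variable (marked null) occurring in a table."""
    name: str

def variables(table):
    """Collect var(T): all variables occurring in the rows of the table."""
    found = {v for row, _ in table for v in row if isinstance(v, Var)}
    return sorted(found, key=lambda v: v.name)

def apply_assignment(alpha, row):
    """Replace every variable in the row by its value under alpha."""
    return tuple(alpha[v] if isinstance(v, Var) else v for v in row)

def rep(table, dom):
    """Closed-world instances alpha(T), for all alpha into the finite list dom."""
    vs = variables(table)
    worlds = set()
    for values in product(dom, repeat=len(vs)):
        alpha = dict(zip(vs, values))
        # keep alpha(t) only for tuples whose condition holds under alpha
        world = frozenset(apply_assignment(alpha, row)
                          for row, cond in table if cond(alpha))
        worlds.add(world)
    return worlds

# City as a c-table: the Brixen tuple carries the condition x5 < 100,000.
x5 = Var("x5")
city = [
    (("Bozen", 100500), lambda a: True),
    (("Brixen", x5), lambda a: a[x5] < 100000),
]
city_worlds = rep(city, dom=[50000, 99999, 120000])
```

With this three-element domain the table has three represented instances: two that contain a Brixen tuple and one, for the value 120000, in which the condition fails and only the Bozen tuple remains. The open-world set Rep(T) additionally contains every superset of these instances and is not enumerated here.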

Certain and Possible Answers

Given $T$ and a query $Q$, the tuple $c$ is
- a certain answer (for $Q$ over $T$) if $c$ is returned by $Q$ over all instances represented by $T$
- a possible answer if $c$ is returned by $Q$ over some instance represented by $T$

We denote the set of all certain answers as $\mathit{cert}_T(Q)$. We have

$\mathit{cert}_T(Q) = \bigcap_{J \in \mathit{Rep}(T)} Q(J)$
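The two definitions amount to an intersection and a union over the represented instances. The following sketch makes that concrete under two simplifying assumptions that are not in the slides: nulls range over a small finite domain, and only the closed-world instances are enumerated. All names (worlds, certain, possible, q1, q2) are hypothetical, and nulls are encoded as the strings "_1", "_2", "_3".

```python
from itertools import product

def worlds(rows, nulls, dom):
    """Yield one instance per assignment of the nulls to values in dom."""
    for values in product(dom, repeat=len(nulls)):
        alpha = dict(zip(nulls, values))
        yield frozenset(tuple(alpha.get(v, v) for v in row) for row in rows)

def certain(query, instances):
    """Tuples returned by the query over all instances."""
    results = [query(inst) for inst in instances]
    return set.intersection(*results) if results else set()

def possible(query, instances):
    """Tuples returned by the query over some instance."""
    return set().union(*(query(inst) for inst in instances))

# The two Bozen rows of the Person Codd table.
rows = [("_1", "Mair", "Bozen", "_2"), ("Carlo", "Rossi", "Bozen", "_3")]
insts = list(worlds(rows, nulls=["_1", "_2", "_3"], dom=["a", "b"]))

def q1(inst):
    """Query 1: first name and surname of people living in Bozen."""
    return {(f, s) for (f, s, c, _) in inst if c == "Bozen"}

def q2(inst):
    """Query 2: surnames of people living in Bozen."""
    return {(s,) for (_, s, c, _) in inst if c == "Bozen"}

cert_q1 = certain(q1, insts)
cert_q2 = certain(q2, insts)
poss_q1 = possible(q1, insts)
```

Here (Carlo, Rossi) is the only certain answer to Query 1, because the first name of Mair differs between instances, while both surnames are certain answers to Query 2. The possible answers to Query 1 also include one (first name, Mair) pair per domain value.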

Modeling Incomplete Information: Codd-Tables

Person
  fname     surname   city     street
  $\bot_1$  Mair      Bozen    $\bot_2$
  Carlo     Rossi     Bozen    $\bot_3$
  Maria     Pichler   Brixen   $\bot_4$

City
  cname    population
  Bozen    100,500
  Brixen   $\bot_5$

Certain answers for our example queries:
- Query 1 returns (Carlo, Rossi)
- Query 2 returns Mair and Rossi
- Query 3 returns Mair
- Query 4 returns nothing

Modeling Incomplete Information: v-tables

Person
  fname     surname   city     street
  $\bot_1$  Mair      Bozen    $\bot_2$
  Carlo     Rossi     Bozen    $\bot_2$
  Maria     Pichler   Brixen   $\bot_4$

City
  cname    population
  Bozen    100,500
  Brixen   $\bot_5$

Certain answers for our example queries:
- Query 1 returns (Carlo, Rossi)
- Query 2 returns Mair and Rossi
- Query 3 returns Mair and Rossi
- Query 4 returns nothing

Modeling Incomplete Information: c-tables

Person
  fname     surname   city     street
  $\bot_1$  Mair      Bozen    $\bot_2$
  Carlo     Rossi     Bozen    $\bot_2$
  Maria     Pichler   Brixen   $\bot_4$

City
  cname    population   cond
  Bozen    100,500      true
  Brixen   $\bot_5$     $\bot_5 < 100{,}000$

Certain answers for our example queries:
- Query 1 returns (Carlo, Rossi)
- Query 2 returns Mair and Rossi
- Query 3 returns Mair and Rossi
- Query 4 returns Pichler

Strong Representation Systems

Definition
Let $Q$ be a query and $T$ be a table. Then

$Q(T) := \{Q(I) \mid I \in \mathit{rep}(T)\}$

That is, $Q(T)$ contains the relation instances obtained by applying $Q$ individually to each instance represented by $T$.

Note: $Q(T)$ is a set of sets of tuples, not a set of tuples!

Strong Representation Systems (cont)

Theorem (Imieliński/Lipski)
For every relational algebra query $Q$ and every c-table $T$ one can compute a c-table $T'$ such that

$\mathit{rep}(T') = Q(T)$

That is,
- $T'$ can be considered the answer of $Q$ over $T$
- the result of querying a c-table can be represented by a c-table
- c-tables are a strong representation system

The downside: handling of c-tables is intractable:
- the membership problem "$I \in \mathit{rep}(T)$?" is NP-hard
- the c-tables $T'$ may be very large

Weak Representation Systems: Motivation

Let $T_v$ be our example v-table and consider

$Q_0 = \pi_{\mathit{fname},\mathit{surname}}(\sigma_{\mathit{city}=\text{'Bozen'}}(\mathit{Person}))$,
$Q_1 = \pi_{\mathit{surname}}(\sigma_{\mathit{city}=\text{'Bozen'}}(\mathit{Person}))$

Then:

$\mathit{cert}_{T_v}(Q_0) = \{(\text{Carlo}, \text{Rossi})\}$ and $\mathit{cert}_{T_v}(Q_1) = \{(\text{Mair}), (\text{Rossi})\}$

Observation: $Q_1 = \pi_{\mathit{surname}}(Q_0)$, but $\mathit{cert}_{T_v}(Q_1)$ cannot be computed from $\mathit{cert}_{T_v}(Q_0)$: projecting $\mathit{cert}_{T_v}(Q_0)$ on surname yields only Rossi.

Compositionality is violated! Better keep the nulls!

Idea: Try to keep enough information for representing certain answers!
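The loss of compositionality can be checked mechanically. The sketch below recomputes the certain answers of the two queries over the example v-table and compares the certain answers of the surname query with the surname projection of the certain answers of the (fname, surname) query. It assumes a two-value finite domain for the nulls and closed-world instances, which is enough to expose the effect; ROWS, NULLS, DOM and the function names are hypothetical.

```python
from itertools import product

# Example v-table for Person; "_2" occurs twice (same street).
ROWS = [
    ("_1", "Mair", "Bozen", "_2"),
    ("Carlo", "Rossi", "Bozen", "_2"),
    ("Maria", "Pichler", "Brixen", "_4"),
]
NULLS = ["_1", "_2", "_4"]
DOM = ["x", "y"]  # assumption: a tiny finite slice of dom

def instances():
    """Yield the instance obtained from each assignment of the nulls."""
    for values in product(DOM, repeat=len(NULLS)):
        alpha = dict(zip(NULLS, values))
        yield {tuple(alpha.get(v, v) for v in row) for row in ROWS}

def q0(inst):
    """First name and surname of people living in Bozen."""
    return {(f, s) for (f, s, c, _) in inst if c == "Bozen"}

def q1(inst):
    """Surname of people living in Bozen, written as a projection of q0."""
    return {(s,) for (_, s) in q0(inst)}

def cert(query):
    """Intersect the query results over all enumerated instances."""
    return set.intersection(*(query(inst) for inst in instances()))

cert_q0 = cert(q0)
cert_q1 = cert(q1)
projected = {(s,) for (_, s) in cert_q0}  # surname projection of cert(q0)
```

Under these assumptions cert_q1 contains both Mair and Rossi, while projected contains only Rossi: the Mair tuple was discarded from cert_q0 because its first name is a null, and the projection cannot bring it back.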

Incomplete Databases: Definition

Definition (Incomplete Database)
An incomplete database is a set of instances (notation: $\mathcal{I}$, $\mathcal{J}$).

For a query $Q$ and an incomplete db $\mathcal{I}$, the set of certain answers for $Q$ over $\mathcal{I}$ is

$\mathit{cert}_{\mathcal{I}}(Q) := \bigcap_{I \in \mathcal{I}} Q(I)$

Weak Representation Systems

Let $L$ be a query language (e.g., conjunctive queries, positive queries, positive relational algebra).

Definition ($L$-Equivalence)
Two incomplete databases $\mathcal{I}$, $\mathcal{J}$ are $L$-equivalent, denoted $\mathcal{I} \equiv_L \mathcal{J}$, if for each $Q \in L$ we have

$\mathit{cert}_{\mathcal{I}}(Q) = \mathit{cert}_{\mathcal{J}}(Q)$

That is, $L$-equivalent incomplete dbs give rise to the same certain answers for all queries in $L$.

Goal: For $Q$ and $T$, find a $T'$ such that $\mathit{Rep}(T')$ is $L$-equivalent to $Q(\mathit{Rep}(T))$, for a suitable $L$

Weak Representation Systems (cntd)

$L^+_{\mathrm{calc}}$: language of positive relational calculus queries

Theorem (Imieliński/Lipski)
For every positive query $Q$ and v-table $T$, one can compute a v-table $T'$ such that

$\mathit{Rep}(T') \equiv_{L^+_{\mathrm{calc}}} Q(\mathit{Rep}(T))$

Proof.
Apply $Q$ to $T$, treating variables like constants.

That is, $T'$
- contains enough information to compute certain answers to positive queries on $Q(\mathit{Rep}(T))$
- can be considered the answer of $Q$ over $T$, in the context of positive queries
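The proof idea, often called naive evaluation, is easy to sketch: run the query on the v-table as if each variable were an ordinary constant that is equal only to itself. The code below does this for hand-coded positive queries over the example v-table. The class Null and the functions q0, project_surname, same_street_as_mair and null_free are hypothetical names. The final step, reading off certain answers as the null-free tuples of the naive result, is the standard companion fact for positive queries and is stated here as background, not taken from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Null:
    """A variable of a v-table; two nulls are equal iff their labels agree."""
    label: int

person = {
    (Null(1), "Mair", "Bozen", Null(2)),
    ("Carlo", "Rossi", "Bozen", Null(2)),
    ("Maria", "Pichler", "Brixen", Null(4)),
}

def q0(rel):
    """First name and surname of people living in Bozen, nulls as constants."""
    return {(f, s) for (f, s, c, _) in rel if c == "Bozen"}

def project_surname(rel):
    """Projection of a (fname, surname) relation on surname."""
    return {(s,) for (_, s) in rel}

def same_street_as_mair(rel):
    """Query 3 as a self-join on street; Null(2) == Null(2) holds."""
    return {(p[1],) for p in rel for m in rel
            if m[1] == "Mair" and p[3] == m[3]}

def null_free(rel):
    """Keep only the tuples that contain no variable."""
    return {t for t in rel if not any(isinstance(v, Null) for v in t)}

t0 = q0(person)                    # a v-table: still contains Null(1)
cert_q0 = null_free(t0)
cert_q1 = null_free(project_surname(t0))
cert_q3 = null_free(same_street_as_mair(person))
```

Because t0 keeps the tuple with the unknown first name of Mair, projecting it on surname recovers both Mair and Rossi, so the composed query can be answered from the intermediate v-table. This is exactly what is lost when only the certain answers of the inner query are kept.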