C. Batini & M. Scannapieco Data and Information Quality Book Figures. Chapter 10: Data Quality Issues in Data Integration Systems

Size: px
Start display at page:

Download "C. Batini & M. Scannapieco Data and Information Quality Book Figures. Chapter 10: Data Quality Issues in Data Integration Systems"

Transcription

1 C. Batini & M. Scannapieco Data and Information Quality Book Figures Chapter 10: Data Quality Issues in Data Integration Systems 1

2 Data Fusion Strategies conflict handling strategies conflict ignorance conflict avoidance conflict resolution instance based metadata based instance based metadata based deciding mediating deciding mediating 2

3 Mediator-wrapper architecture USER MEDIATOR WRAPPER WRAPPER WRAPPER DB1 DB2 DB3 3

4 Phases of the QP-alg approach Input User query Sources with QCAs IQ Scores Source Selection Best Sources Plan Creation All Correct Plans Output Best Plans Plan Selection 1. QCA Quality Evaluation 2. Plan Quality Evaluation 3. Plan Ranking 4

5 Example of Price computation for the Plan P i Price QS1+QS2+QS3 Price QS1+QS2 Price QS3 Price QS1 Price QS2 Plan P i 5

6 R 1 t1,1 t1,2 t1,3 t1,n1 Query result construction In DaQuinCIS t 1 1=t1,1={ <f1,1, q1,1> <f1,w, q1,w>} t 1 2=t2,n2 ={<f2,1, q2,1> <f2,w, q2,w>} t 1 3= tk,2 = {<f3,1, q3,1> <f3,w, q3,w>}... C 1 R t 1 R 2 t2,1 t2,2 t2,3 t2,n2 C 2 t 2 1=t1,2= {<f1,1, q1,1> <f1,w, q1,w>} t 2 2= t2,2= {<f2,1, q2,1> <f2,w, q2,w>} t 2 3=tk,nk = {<f3,1, q3,1> <f3,w, q3,w>}... t 2 R k tk,1 tk,2 tz 1=t1,n 1 ={<f1,1, q1,1> <f1,w, q1,w>} t C z z 2=t2,3 ={<f2,1, q2,1> <f2,w, q2,w>} t z 3= tk,1= {<f3,1, q3,1> <f3,w, q3,w>}... t z tk,3 tk,nk 6

7 Example of query fragment construction from contributing views Q Q C 1 C 2 QF 1 QF 2 null 7

8 Comparison of quality-driven processing techniques Techniques Quality Metadata Granularity of Quality Characterization QP-alg YES Source, Query Correspondences Assertions, User Queries DaQuinCIS Query Processing FusionPlex Query Processing YES Each data element of a semistructured data model Type of Mapping GLAV GAV Quality Algebra Support Preliminary YES Source GLAV Preliminary No 8

9 An example of key- and attribute-level conflicts EmployeeID Name Surname Salary arpa78 John Smith 2000 eugi98 Edward Monroe 1500 ghjk09 Anthony Wite 1250 treg23 Marianne Collins 1150 Attribute Conflicts EmployeeS1 Key Conflicts EmployeeID Name Surname Salary arpa78 John Smith 2600 eugi98 Edward Monroe 1500 ghjk09 Anthony White 1250 dref43 Mariann e Collins 1150 collins@abc.it EmployeeS2 9

10 Resolution functions as proposed in [462] Function Attribute Type Description COUNT any Counts number of conflicting values MIN any Minimum value MAX any Maximum value RANDOM any Random non null value CHOOSE(Source) any Chooses most reliable source for the particular attribute MAXIQ any Value of highest information quality GROUP any Groups all conflicting values SUM numerical Sums all values MEDIAN numerical Median value, namely having the same number of higher and lower values AVG numerical Arithmetic mean of all values VAR numerical Variance of values STDDEV numerical Standard Deviation of values SHORTEST non-numerical Minimum length value, ignoring spaces LONGEST non-numerical Maximum length value, ignoring spaces CONCAT non-numerical Concatenation of values ANNCONCAT non-numerical Annotated concatenation of values, whose purpose is to specify the source, before the actual returned value 10

11 Instance of the global relation Employee TupleID EmployeeID Name Surname Salary t 1 arpa78 John Smith 2000 smith@abc.it t 2 eugi98 Edward Monroe 1500 monroe@abc.it t 3 ghjk09 Anthony Wite 1250 white@abc.it t 4 treg23 Marianne Collins 1150 collins@abc.it t 5 arpa78 John Smith 2600 smith@abc.it t 6 eugi98 Edward Monroe 1500 monroe@abc.it t 7 ghjk09 Anthony White 1250 white@abc.it t 8 dref43 Marianne Collins 1150 collins@abc.it 11

12 Resolution of attribute conflicts TupleID EmployeeID Name Surname Salary t 1 arpa78 John Smith 2000 smith@abc.it t 2 eugi98 Edward Monroe 1500 monroe@abc.it t 3 ghjk09 Anthony White 1250 white@abc.it t 4 treg23 Marianne Collins 1150 collins@abc.it RAC(Employee,Salary(MIN), Surname(Longest), EmployeeID(Any)) 12

13 Resolution of tuple conflicts TupleID EmployeeID Name Surname Salary t 1 arpa78 John Smith 2600 smith@abc.it t 2 eugi98 Edward Monroe 1500 monroe@abc.it t 3 ghjk09 Anthony Wite 1250 white@abc.it t 4 dref43 Marianne Collins 1150 collins@abc.it RTC(Employee,ANY) 13

14 Result of the context-aware query as applied to the table EmployeeS1 EmployeeID Salary arpa EmployeeID Salary eugi EmployeeID Salary ghjk treg

15 Conflict resolution techniques Techniques Tolerance Strategies Query Model SQL-Based Conflict Resolution Aurora FusionPlex NO High Confidence, RandomEvidence, PossibleAtAll No resolution strategy, selective attribute resolution SQL Ad-hoc Conflict Tolerant Query Model Extended SQL DaQuinCIS NO Extended XML FraQL-based Conflict Resolution OO RA NO Thresholds for tolerable and intolerable conflicts Ad-hoc FraQL Ad hoc Object Oriented Extension (OO RA ) 15

Data Fusion. References

Data Fusion. References Data Fusion Helena Galhardas DEI/IST References Data Fusion, Jens Bleiholder and Felix Naumann, ACM Computing Surveys, Vol. 41, N.1, 2008. Slides VLDB 2009 tutorial on Data Fusion, Luna Dong and Felix

More information

DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY

DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY Reham I. Abdel Monem 1, Ali H. El-Bastawissy 2 and Mohamed M. Elwakil 3 1 Information Systems Department, Faculty of computers and information,

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Relational DB Languages Relational Algebra, Calculus, SQL Lecture 05 zain 1 Introduction Relational algebra & relational calculus are formal languages associated with the relational

More information

Databases - 5. Problems with the relational model Functions and sub-queries

Databases - 5. Problems with the relational model Functions and sub-queries Databases - 5 Problems with the relational model Functions and sub-queries Problems (1) To store information about real life entities, we often have to cut them up into separate tables Problems (1) To

More information

Database Management Systems

Database Management Systems Sample Questions 1 Write SQL query to create a table for describing a book. The table should have Book Title, Author, Publisher, Year published, and ISBN fields. Your table should have a primary key. For

More information

Chapter 6 - Part II The Relational Algebra and Calculus

Chapter 6 - Part II The Relational Algebra and Calculus Chapter 6 - Part II The Relational Algebra and Calculus Copyright 2004 Ramez Elmasri and Shamkant Navathe Division operation DIVISION Operation The division operation is applied to two relations R(Z) S(X),

More information

SQL Data Query Language

SQL Data Query Language SQL Data Query Language André Restivo 1 / 68 Index Introduction Selecting Data Choosing Columns Filtering Rows Set Operators Joining Tables Aggregating Data Sorting Rows Limiting Data Text Operators Nested

More information

What Are Group Functions? Reporting Aggregated Data Using the Group Functions. Objectives. Types of Group Functions

What Are Group Functions? Reporting Aggregated Data Using the Group Functions. Objectives. Types of Group Functions What Are Group Functions? Group functions operate on sets of rows to give one result per group. Reporting Aggregated Data Using the Group Functions Maximum salary in table Copyright 2004, Oracle. All rights

More information

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL In This Lecture Yet More SQL Database Systems Lecture 9 Natasha Alechina Yet more SQL ORDER BY Aggregate functions and HAVING etc. For more information Connoly and Begg Chapter 5 Ullman and Widom Chapter

More information

Outline A Survey of Approaches to Automatic Schema Matching. Outline. What is Schema Matching? An Example. Another Example

Outline A Survey of Approaches to Automatic Schema Matching. Outline. What is Schema Matching? An Example. Another Example A Survey of Approaches to Automatic Schema Matching Mihai Virtosu CS7965 Advanced Database Systems Spring 2006 April 10th, 2006 2 What is Schema Matching? A basic problem found in many database application

More information

The Relational Model Constraints and SQL DDL

The Relational Model Constraints and SQL DDL The Relational Model Constraints and SQL DDL Week 2-3 Weeks 2-3 MIE253-Consens 1 Schedule Week Date Lecture Topic 1 Jan 9 Introduction to Data Management 2 Jan 16 The Relational Model 3 Jan. 23 Constraints

More information

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc. Chapter 3 Descriptive Measures Slide 3-2 Section 3.1 Measures of Center Slide 3-3 Definition 3.1 Mean of a Data Set The mean of a data set is the sum of the observations divided by the number of observations.

More information

Advanced SQL GROUP BY Clause and Aggregate Functions Pg 1

Advanced SQL GROUP BY Clause and Aggregate Functions Pg 1 Advanced SQL Clause and Functions Pg 1 Clause and Functions Ray Lockwood Points: s (such as COUNT( ) work on groups of Instead of returning every row read from a table, we can aggregate rows together using

More information

Database Management Systems,

Database Management Systems, Database Management Systems SQL Query Language (3) 1 Topics Aggregate Functions in Queries count sum max min avg Group by queries Set Operations in SQL Queries Views 2 Aggregate Functions Tables are collections

More information

Data Manipulation Language (DML)

Data Manipulation Language (DML) In the name of Allah Islamic University of Gaza Faculty of Engineering Computer Engineering Department ECOM 4113 DataBase Lab Lab # 3 Data Manipulation Language (DML) El-masry 2013 Objective To be familiar

More information

EECS 647: Introduction to Database Systems

EECS 647: Introduction to Database Systems EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009 Stating Points A database A database management system A miniworld A data model Conceptual model Relational model 2/24/2009

More information

Semistructured Data and Mediation

Semistructured Data and Mediation Semistructured Data and Mediation Prof. Letizia Tanca Dipartimento di Elettronica e Informazione Politecnico di Milano SEMISTRUCTURED DATA FOR THESE DATA THERE IS SOME FORM OF STRUCTURE, BUT IT IS NOT

More information

Information Quality in Integrated Information Systems

Information Quality in Integrated Information Systems Information Quality in Integrated Information Systems Institute for Infocomm Research, Singapore 15.3.2005 Felix Naumann Humboldt-Universität zu Berlin 15.3.2005 Felix Naumann - Humboldt-Universität zu

More information

Data Integration and Data Warehousing Database Integration Overview

Data Integration and Data Warehousing Database Integration Overview Data Integration and Data Warehousing Database Integration Overview Sergey Stupnikov Institute of Informatics Problems, RAS ssa@ipi.ac.ru Outline Information Integration Problem Heterogeneous Information

More information

Slicing and Dicing Data in CF and SQL: Part 1

Slicing and Dicing Data in CF and SQL: Part 1 Slicing and Dicing Data in CF and SQL: Part 1 Charlie Arehart Founder/CTO Systemanage carehart@systemanage.com SysteManage: Agenda Slicing and Dicing Data in Many Ways Handling Distinct Column Values Manipulating

More information

Data Structure: Relational Model

Data Structure: Relational Model Applications View of a Relational Database Management (RDBMS) System Why applications use DBs? Persistent data structure Independent from processes using the data Also, large volume of data High-level

More information

GIFT Department of Computing Science. CS-217: Database Systems. Lab-4 Manual. Reporting Aggregated Data using Group Functions

GIFT Department of Computing Science. CS-217: Database Systems. Lab-4 Manual. Reporting Aggregated Data using Group Functions GIFT Department of Computing Science CS-217: Database Systems Lab-4 Manual Reporting Aggregated Data using Group Functions V3.0 4/28/2016 Introduction to Lab-4 This lab further addresses functions. It

More information

SQL Functions (Single-Row, Aggregate)

SQL Functions (Single-Row, Aggregate) Islamic University Of Gaza Faculty of Engineering Computer Engineering Department Database Lab (ECOM 4113) Lab 4 SQL Functions (Single-Row, Aggregate) Eng. Ibraheem Lubbad Part one: Single-Row Functions:

More information

SQL and Incomp?ete Data

SQL and Incomp?ete Data SQL and Incomp?ete Data A not so happy marriage Dr Paolo Guagliardo Applied Databases, Guest Lecture 31 March 2016 SQL is efficient, correct and reliable 1 / 25 SQL is efficient, correct and reliable...

More information

KORA. RDBMS Concepts II

KORA. RDBMS Concepts II RDBMS Concepts II Outline Querying Data Source With SQL Star & Snowflake Schemas Reporting Aggregated Data Using the Group Functions What Are Group Functions? Group functions operate on sets of rows to

More information

Test: Mid Term Exam Semester 2 Part 1 Review your answers, feedback, and question scores below. An asterisk (*) indica tes a correct answer.

Test: Mid Term Exam Semester 2 Part 1 Review your answers, feedback, and question scores below. An asterisk (*) indica tes a correct answer. Test: Mid Term Exam Semester 2 Part 1 Review your answers, feedback, and question scores below. An asterisk (*) indica tes a correct answer. Section 1 (Answer all questions in this section) 1. Which comparison

More information

Relational Model, Relational Algebra, and SQL

Relational Model, Relational Algebra, and SQL Relational Model, Relational Algebra, and SQL August 29, 2007 1 Relational Model Data model. constraints. Set of conceptual tools for describing of data, data semantics, data relationships, and data integrity

More information

Monitoring and Resolving Lock Conflicts. Copyright 2004, Oracle. All rights reserved.

Monitoring and Resolving Lock Conflicts. Copyright 2004, Oracle. All rights reserved. Monitoring and Resolving Lock Conflicts Objectives After completing this lesson you should be able to do the following: Detect and resolve lock conflicts Manage deadlocks Locks Prevent multiple sessions

More information

Part V. Relational XQuery-Processing. Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2007/08 297

Part V. Relational XQuery-Processing. Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2007/08 297 Part V Relational XQuery-Processing Marc H Scholl (DBIS, Uni KN) XML and Databases Winter 2007/08 297 Outline of this part (I) 12 Mapping Relational Databases to XML Introduction Wrapping Tables into XML

More information

Schedule. Today: Feb. 21 (TH) Feb. 28 (TH) Feb. 26 (T) Mar. 5 (T) Read Sections , Project Part 6 due.

Schedule. Today: Feb. 21 (TH) Feb. 28 (TH) Feb. 26 (T) Mar. 5 (T) Read Sections , Project Part 6 due. Schedule Today: Feb. 21 (TH) Transactions, Authorization. Read Sections 8.6-8.7. Project Part 5 due. Feb. 26 (T) Datalog. Read Sections 10.1-10.2. Assignment 6 due. Feb. 28 (TH) Datalog and SQL Recursion,

More information

SECOND SEMESTER BCA : Syllabus Copy

SECOND SEMESTER BCA : Syllabus Copy BCA203T: DATA STRUCTURES SECOND SEMESTER BCA : Syllabus Copy Unit-I Introduction and Overview: Definition, Elementary data organization, Data Structures, data structures operations, Abstract data types,

More information

Chapter 2 Introduction to Relational Models

Chapter 2 Introduction to Relational Models CMSC 461, Database Management Systems Spring 2018 Chapter 2 Introduction to Relational Models These slides are based on Database System Concepts book and slides, 6th edition, and the 2009 CMSC 461 slides

More information

Some Basic Aggregate Functions FUNCTION OUTPUT The number of rows containing non-null values The maximum attribute value encountered in a given column

Some Basic Aggregate Functions FUNCTION OUTPUT The number of rows containing non-null values The maximum attribute value encountered in a given column SQL Functions Aggregate Functions Some Basic Aggregate Functions OUTPUT COUNT() The number of rows containing non-null values MIN() The minimum attribute value encountered in a given column MAX() The maximum

More information

INDEX. 1 Basic SQL Statements. 2 Restricting and Sorting Data. 3 Single Row Functions. 4 Displaying data from multiple tables

INDEX. 1 Basic SQL Statements. 2 Restricting and Sorting Data. 3 Single Row Functions. 4 Displaying data from multiple tables INDEX Exercise No Title 1 Basic SQL Statements 2 Restricting and Sorting Data 3 Single Row Functions 4 Displaying data from multiple tables 5 Creating and Managing Tables 6 Including Constraints 7 Manipulating

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Relational Algebra. Relational Algebra. 7/4/2017 Md. Golam Moazzam, Dept. of CSE, JU

Relational Algebra. Relational Algebra. 7/4/2017 Md. Golam Moazzam, Dept. of CSE, JU Relational Algebra 1 Structure of Relational Databases A relational database consists of a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among

More information

Introduction to SQL Server 2005/2008 and Transact SQL

Introduction to SQL Server 2005/2008 and Transact SQL Introduction to SQL Server 2005/2008 and Transact SQL Week 2 TRANSACT SQL CRUD Create, Read, Update, and Delete Steve Stedman - Instructor Steve@SteveStedman.com Homework Review Review of homework from

More information

Microsoft Access - Using Relational Database Data Queries (Stored Procedures) Paul A. Harris, Ph.D. Director, GCRC Informatics.

Microsoft Access - Using Relational Database Data Queries (Stored Procedures) Paul A. Harris, Ph.D. Director, GCRC Informatics. Microsoft Access - Using Relational Database Data Queries (Stored Procedures) Paul A. Harris, Ph.D. Director, GCRC Informatics October 01, 2004 What is Microsoft Access? Microsoft Access is a relational

More information

Relational Algebra and SQL

Relational Algebra and SQL Relational Algebra and SQL Relational Algebra. This algebra is an important form of query language for the relational model. The operators of the relational algebra: divided into the following classes:

More information

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking

More information

CSC 261/461 Database Systems Lecture 19

CSC 261/461 Database Systems Lecture 19 CSC 261/461 Database Systems Lecture 19 Fall 2017 Announcements CIRC: CIRC is down!!! MongoDB and Spark (mini) projects are at stake. L Project 1 Milestone 4 is out Due date: Last date of class We will

More information

The DaQuinCIS Broker: Querying Data and Their Quality in Cooperative Information Systems

The DaQuinCIS Broker: Querying Data and Their Quality in Cooperative Information Systems The DaQuinCIS Broker: Querying Data and Their Quality in Cooperative Information Systems Massimo Mecella 1, Monica Scannapieco 1,2, Antonino Virgillito 1, Roberto Baldoni 1, Tiziana Catarci 1, and Carlo

More information

Integrity Constraints (Reminder)

Integrity Constraints (Reminder) Constraints, Triggers and Active Databases Chapter 9 1 Integrity Constraints (Reminder) Key (relation): a list of attributes Primary key (Sql: PRIMARY KEY) Secondary key (UNIQUE) Foreign key (inter-relation):

More information

Why Relational Databases? Relational databases allow for the storage and analysis of large amounts of data.

Why Relational Databases? Relational databases allow for the storage and analysis of large amounts of data. DATA 301 Introduction to Data Analytics Relational Databases Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca DATA 301: Data Analytics (2) Why Relational Databases? Relational

More information

OPEN SOURCE DB SYSTEMS TYPES OF DBMS

OPEN SOURCE DB SYSTEMS TYPES OF DBMS OPEN SOURCE DB SYSTEMS Anna Topol 1 TYPES OF DBMS Relational Key-Value Document-oriented Graph 2 DBMS SELECTION Multi-platform or platform-agnostic Offers persistent storage Fairly well known Actively

More information

SIT772 Database and Information Retrieval WEEK 6. RELATIONAL ALGEBRAS. The foundation of good database design

SIT772 Database and Information Retrieval WEEK 6. RELATIONAL ALGEBRAS. The foundation of good database design SIT772 Database and Information Retrieval WEEK 6. RELATIONAL ALGEBRAS The foundation of good database design Outline 1. Relational Algebra 2. Join 3. Updating/ Copy Table or Parts of Rows 4. Views (Virtual

More information

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 10: INTRODUCTION TO SQL FULL RELATIONAL OPERATIONS MODIFICATION LANGUAGE Union, Intersection, Differences (select

More information

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 WHAT IS A DATA CUBE? The Data Cube or Cube operator produces N-dimensional answers

More information

1) Introduction to SQL

1) Introduction to SQL 1) Introduction to SQL a) Database language enables users to: i) Create the database and relation structure; ii) Perform insertion, modification and deletion of data from the relationship; and iii) Perform

More information

XPath Lecture 34. Robb T. Koether. Hampden-Sydney College. Wed, Apr 11, 2012

XPath Lecture 34. Robb T. Koether. Hampden-Sydney College. Wed, Apr 11, 2012 XPath Lecture 34 Robb T. Koether Hampden-Sydney College Wed, Apr 11, 2012 Robb T. Koether (Hampden-Sydney College) XPathLecture 34 Wed, Apr 11, 2012 1 / 20 1 XPath Functions 2 Predicates 3 Axes Robb T.

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Exam I Computer Science 420 Dr. St. John Lehman College City University of New York 12 March 2002

Exam I Computer Science 420 Dr. St. John Lehman College City University of New York 12 March 2002 Exam I Computer Science 420 Dr. St. John Lehman College City University of New York 12 March 2002 NAME (Printed) NAME (Signed) E-mail Exam Rules Show all your work. Your grade will be based on the work

More information

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke Parallel DBMS Prof. Yanlei Diao University of Massachusetts Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke I. Parallel Databases 101 Rise of parallel databases: late 80 s Architecture: shared-nothing

More information

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016 Applied Databases Lecture 5 ER Model, normal forms Sebastian Maneth University of Edinburgh - January 25 th, 2016 Outline 2 1. Entity Relationship Model 2. Normal Forms Keys and Superkeys 3 Superkey =

More information

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018 Introduction to SQL ECE 650 Systems Programming & Engineering Duke University, Spring 2018 SQL Structured Query Language Major reason for commercial success of relational DBs Became a standard for relational

More information

3. Relational Data Model 3.5 The Tuple Relational Calculus

3. Relational Data Model 3.5 The Tuple Relational Calculus 3. Relational Data Model 3.5 The Tuple Relational Calculus forall quantification Syntax: t R(P(t)) semantics: for all tuples t in relation R, P(t) has to be fulfilled example query: Determine all students

More information

Exam code: Exam name: Database Fundamentals. Version 16.0

Exam code: Exam name: Database Fundamentals. Version 16.0 98-364 Number: 98-364 Passing Score: 800 Time Limit: 120 min File Version: 16.0 Exam code: 98-364 Exam name: Database Fundamentals Version 16.0 98-364 QUESTION 1 You have a table that contains the following

More information

Reporting Aggregated Data Using the Group Functions. Copyright 2004, Oracle. All rights reserved.

Reporting Aggregated Data Using the Group Functions. Copyright 2004, Oracle. All rights reserved. Reporting Aggregated Data Using the Group Functions Copyright 2004, Oracle. All rights reserved. Objectives After completing this lesson, you should be able to do the following: Identify the available

More information

CSE 344 Midterm. Wednesday, February 19, 2014, 14:30-15:20. Question Points Score Total: 100

CSE 344 Midterm. Wednesday, February 19, 2014, 14:30-15:20. Question Points Score Total: 100 CSE 344 Midterm Wednesday, February 19, 2014, 14:30-15:20 Name: Question Points Score 1 30 2 50 3 12 4 8 Total: 100 This exam is open book and open notes but NO laptops or other portable devices. You have

More information

CS

CS CS 1555 www.cs.pitt.edu/~nlf4/cs1555/ Relational Algebra Relational algebra Relational algebra is a formal system for manipulating relations Operands are relations Operations are: Set operations Union,

More information

2) SQL includes a data definition language, a data manipulation language, and SQL/Persistent stored modules. Answer: TRUE Diff: 2 Page Ref: 36

2) SQL includes a data definition language, a data manipulation language, and SQL/Persistent stored modules. Answer: TRUE Diff: 2 Page Ref: 36 Database Processing, 12e (Kroenke/Auer) Chapter 2: Introduction to Structured Query Language (SQL) 1) SQL stands for Standard Query Language. Diff: 1 Page Ref: 32 2) SQL includes a data definition language,

More information

Relational Model History. COSC 416 NoSQL Databases. Relational Model (Review) Relation Example. Relational Model Definitions. Relational Integrity

Relational Model History. COSC 416 NoSQL Databases. Relational Model (Review) Relation Example. Relational Model Definitions. Relational Integrity COSC 416 NoSQL Databases Relational Model (Review) Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Relational Model History The relational model was proposed by E. F. Codd

More information

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL Chapter 3: Introduction to SQL Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 3: Introduction to SQL Overview of The SQL Query Language Data Definition Basic Query

More information

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao Chapter 5 Nguyen Thi Ai Thao thaonguyen@cse.hcmut.edu.vn Spring- 2016 Contents 1 Unary Relational Operations 2 Operations from Set Theory 3 Binary Relational Operations 4 Additional Relational Operations

More information

Preferentially Annotated Regular Path Queries

Preferentially Annotated Regular Path Queries Preferentially Annotated Regular Path Queries Gosta Grahne Concordia U. Canada Alex Thomo U. of Victoria Canada Bill Wadge U. of Victoria Canada Regular Path Queries (RPQ s) Essentially regular expressions

More information

UNIT 2. DATA PREPROCESSING AND ASSOCIATION RULES

UNIT 2. DATA PREPROCESSING AND ASSOCIATION RULES UNIT 2. DATA PREPROCESSING AND ASSOCIATION RULES Data Pre-processing-Data Cleaning, Integration, Transformation, Reduction, Discretization Concept Hierarchies-Concept Description: Data Generalization And

More information

Intermediate SQL: Aggregated Data, Joins and Set Operators

Intermediate SQL: Aggregated Data, Joins and Set Operators Intermediate SQL: Aggregated Data, Joins and Set Operators Aggregated Data and Sorting Objectives After completing this lesson, you should be able to do the following: Identify the available group functions

More information

Chapter 2: Intro to Relational Model

Chapter 2: Intro to Relational Model Non è possibile visualizzare l'immagine. Chapter 2: Intro to Relational Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Example of a Relation attributes (or columns)

More information

CS145 Final Examination

CS145 Final Examination CS145 Final Examination Spring 2003, Prof. Widom ffl Please read all instructions (including these) carefully. ffl There are 11 problems on the exam, with a varying number of points for each problem and

More information

Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE

Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE COMP 62421 Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Querying Data on the Web Date: Wednesday 24th January 2018 Time: 14:00-16:00 Please answer all FIVE Questions provided. They amount

More information

Fundamentals of Database Systems

Fundamentals of Database Systems Fundamentals of Database Systems Assignment: 2 Due Date: 18th August, 2017 Instructions This question paper contains 10 questions in 6 pages. Q1: Consider the following schema for an office payroll system,

More information

Keys are fields in a table which participate in below activities in RDBMS systems:

Keys are fields in a table which participate in below activities in RDBMS systems: Keys are fields in a table which participate in below activities in RDBMS systems: 1. To create relationships between two tables. 2. To maintain uniqueness in a table. 3. To keep consistent and valid data

More information

An introduction to Big Data Integration. Letizia Tanca Politecnico di Milano

An introduction to Big Data Integration. Letizia Tanca Politecnico di Milano An introduction to Big Data Integration Letizia Tanca Politecnico di Milano Trends in Data Integration ü Pre-history: focused on challenges that occur within enterprises ü The Web era: scaling to a much

More information

PRODUCT DATA. PULSE Data Manager Type Uses and Features

PRODUCT DATA. PULSE Data Manager Type Uses and Features PRODUCT DATA PULSE Data Manager Type 7767 PULSE Data Manager (PDM) is a family of data management solutions that enables measurements from PULSE LabShop or any of its applications to be labelled with metadata

More information

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS INTRODUCTION TO RELATIONAL DATABASE SYSTEMS DATENBANKSYSTEME 1 (INF 3131) Torsten Grust Universität Tübingen Winter 2015/16 1 SQL: GROUPING AND AGGREGATION In relational database design, a single table

More information

SQL: A COMMERCIAL DATABASE LANGUAGE. Complex Constraints

SQL: A COMMERCIAL DATABASE LANGUAGE. Complex Constraints SQL: A COMMERCIAL DATABASE LANGUAGE Complex Constraints Outline 1. Introduction 2. Data Definition, Basic Constraints, and Schema Changes 3. Basic Queries 4. More complex Queries 5. Aggregate Functions

More information

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11} Entering Data: Basic Commands Consider the data set: {15, 22, 32, 31, 52, 41, 11} Data is stored in Lists on the calculator. Locate and press the STAT button on the calculator. Choose EDIT. The calculator

More information

CS W Introduction to Databases Spring Computer Science Department Columbia University

CS W Introduction to Databases Spring Computer Science Department Columbia University CS W4111.001 Introduction to Databases Spring 2018 Computer Science Department Columbia University 1 in SQL 1. Key constraints (PRIMARY KEY and UNIQUE) 2. Referential integrity constraints (FOREIGN KEY

More information

WHAT IS SQL. Database query language, which can also: Define structure of data Modify data Specify security constraints

WHAT IS SQL. Database query language, which can also: Define structure of data Modify data Specify security constraints SQL KEREM GURBEY WHAT IS SQL Database query language, which can also: Define structure of data Modify data Specify security constraints DATA DEFINITION Data-definition language (DDL) provides commands

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems Lecture 3 Relational Calculus and Algebra Part-2 September 10, 2017 Sam Siewert RDBMS Fundamental Theory http://dilbert.com/strips/comic/2008-05-07/ Relational Algebra and

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Database 2: Slicing and Dicing Data in CF and SQL

Database 2: Slicing and Dicing Data in CF and SQL Database 2: Slicing and Dicing Data in CF and SQL Charlie Arehart Founder/CTO Systemanage carehart@systemanage.com SysteManage: Agenda Slicing and Dicing Data in Many Ways Handling Distinct Column Values

More information

Provenance Management in Databases under Schema Evolution

Provenance Management in Databases under Schema Evolution Provenance Management in Databases under Schema Evolution Shi Gao, Carlo Zaniolo Department of Computer Science University of California, Los Angeles 1 Provenance under Schema Evolution Modern information

More information

Outline. Review questions. Carnegie Mellon Univ. Dept. of Computer Science Database Applications

Outline. Review questions. Carnegie Mellon Univ. Dept. of Computer Science Database Applications arnegie Mellon Univ. Dept. of omputer Science 15-415 - Database pplications Lecture #22: oncurrency ontrol Part 2 (R&G ch. 17) aloutsos SS 15-415 #1 Outline conflict/view serializability Two-phase locking

More information

Relational Databases

Relational Databases Relational Databases Lecture 2 Chapter 3 Robb T. Koether Hampden-Sydney College Fri, Jan 18, 2013 Robb T. Koether (Hampden-Sydney College) Relational Databases Fri, Jan 18, 2013 1 / 26 1 Types of Databases

More information

Relational Algebra. Procedural language Six basic operators

Relational Algebra. Procedural language Six basic operators Relational algebra Relational Algebra Procedural language Six basic operators select: σ project: union: set difference: Cartesian product: x rename: ρ The operators take one or two relations as inputs

More information

Data Integration Systems

Data Integration Systems Data Integration Systems Haas et al. 98 Garcia-Molina et al. 97 Levy et al. 96 Chandrasekaran et al. 2003 Zachary G. Ives University of Pennsylvania January 13, 2003 CIS 650 Data Sharing and the Web Administrivia

More information

ECE 650 Systems Programming & Engineering. Spring 2018

ECE 650 Systems Programming & Engineering. Spring 2018 ECE 650 Systems Programming & Engineering Spring 2018 Introduction to SQL Tyler Bletsch Duke University Slides are adapted from Brian Rogers (Duke) Structured Query Language SQL Major reason for commercial

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Restricting and Sorting Data. Copyright 2004, Oracle. All rights reserved.

Restricting and Sorting Data. Copyright 2004, Oracle. All rights reserved. Restricting and Sorting Data Objectives After completing this lesson, you should be able to do the following: Limit the rows that are retrieved by a query Sort the rows that are retrieved by a query Use

More information

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML)

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML) 4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML) example: Which lectures are required for the immediate predecessors? select predecessor from is_precondition_of where

More information

CS2 Current Technologies Lecture 2: SQL Programming Basics

CS2 Current Technologies Lecture 2: SQL Programming Basics T E H U N I V E R S I T Y O H F R G E D I N B U CS2 Current Technologies Lecture 2: SQL Programming Basics Dr Chris Walton (cdw@dcs.ed.ac.uk) 4 February 2002 The SQL Language 1 Structured Query Language

More information

Set theory is a branch of mathematics that studies sets. Sets are a collection of objects.

Set theory is a branch of mathematics that studies sets. Sets are a collection of objects. Set Theory Set theory is a branch of mathematics that studies sets. Sets are a collection of objects. Often, all members of a set have similar properties, such as odd numbers less than 10 or students in

More information

Full file at

Full file at David Kroenke's Database Processing: Fundamentals, Design and Implementation (10 th Edition) CHAPTER TWO INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) True-False Questions 1. SQL stands for Standard

More information

Practical Performance Tuning using Digested SQL Logs. Bob Burgess Salesforce Marketing Cloud

Practical Performance Tuning using Digested SQL Logs. Bob Burgess Salesforce Marketing Cloud Practical Performance Tuning using Digested SQL Logs Bob Burgess Salesforce Marketing Cloud Who?! Database Architect! Salesforce Marketing Cloud (Radian6 & Buddy Media stack) Why?! I can t be the only

More information

Retrieving Data Using the SQL SELECT Statement. Copyright 2004, Oracle. All rights reserved.

Retrieving Data Using the SQL SELECT Statement. Copyright 2004, Oracle. All rights reserved. Retrieving Data Using the SQL SELECT Statement Copyright 2004, Oracle. All rights reserved. Objectives After completing this lesson, you should be able to do the following: List the capabilities of SQL

More information

Relational Query Languages

Relational Query Languages Relational Query Languages Query languages: Allow manipulation and retrieval of data from a database. Relational model supports simple, powerful, declarative QLs with precise semantics: Strong formal foundation

More information

Spatiotemporal Aggregate Computation: A Survey

Spatiotemporal Aggregate Computation: A Survey Spatiotemporal Aggregate Computation: A Survey Inés Fernando Vega López Richard T. Snodgrass Bongki Moon Department of Computer Science Department of Computer Science Autonomous University of Sinaloa,

More information

Workbooks (File) and Worksheet Handling

Workbooks (File) and Worksheet Handling Workbooks (File) and Worksheet Handling Excel Limitation Excel shortcut use and benefits Excel setting and custom list creation Excel Template and File location system Advanced Paste Special Calculation

More information

Relational Algebra BASIC OPERATIONS DATABASE SYSTEMS AND CONCEPTS, CSCI 3030U, UOIT, COURSE INSTRUCTOR: JAREK SZLICHTA

Relational Algebra BASIC OPERATIONS DATABASE SYSTEMS AND CONCEPTS, CSCI 3030U, UOIT, COURSE INSTRUCTOR: JAREK SZLICHTA Relational Algebra BASIC OPERATIONS 1 What is an Algebra Mathematical system consisting of: Operands -- values from which new values can be constructed. Operators -- symbols denoting procedures that construct

More information