CPS 216 Spring 2003 Homework #1 Assigned: Wednesday, January 22 Due: Monday, February 10

Note: This is a long homework. Start early!

If you have already taken CPS 196.3 or an equivalent undergraduate course in databases, you may skip Problems 1-5 and complete Problems 6-8. Otherwise, you may complete Problems 1-5 and skip Problems 6-8.

Problem 1.

Consider a database containing information about military aircraft used in World War II. This database consists of the following four relations:

Model (model, type, country, payload, range, speed)
Plane (id, model, year)
Battle (name, year)
Outcome (plane, battle, result)

All aircraft are categorized into models. The relation Model records the name of the model (e.g., "B-52", "F4F"), its type (e.g., "fighter", "bomber", "reconnaissance plane"), the country that made the model, the payload of the model, its maximum range (in miles), and its speed (in MPH). Relation Plane records the unique identification string of each plane, its model, and the year in which it entered service. Relation Battle gives the name and the year of each battle involving planes. Relation Outcome records the result ("downed", "damaged", "ok") for each plane in each battle. The key attributes are underlined.

Please write relational algebra expressions to answer the following queries.

(a) List model name and country for models with range over 1,500 miles.
(b) Find ids of planes with speed no less than 300 MPH.
(c) List ids and models of planes that were downed in the battle of Pearl Harbor.
(d) For each plane engaged in the battle of Midway, list its id, payload, speed, and the year in which it entered service.
(e) Find those countries that have both bombers and reconnaissance planes.
(f) Find those planes that lived to fight another day, i.e., they were damaged in one battle but later fought in another.
(g) Find the model with the highest payload.
(h) Find ids of those planes that fought only in the battles that plane 007 fought in.
(i) Find ids of those planes that fought in every battle that plane 007 fought in.

Problem 2.

As discussed in class, the core operators in relational algebra are selection (σ_p), projection (π_L), cross product (×), union (∪), and difference (−).

(a) Show that the projection operator is necessary; that is, some queries that use the projection operator cannot be expressed using any combination of the other operators.

(b) Show that the union operator is necessary; that is, some queries that use the union operator cannot be expressed using any combination of the other operators.
(c) Show that the selection operator is necessary; that is, some queries that use the selection operator cannot be expressed using any combination of the other operators.

Problem 3.

Consider a relation R (A, B, C, D) with FDs AB → C, C → D, and D → B.

(a) Show that {A, B} is a key of R (remember that a key has to be minimal).
(b) What are the other keys of R? (Hint: A must be in every key of R; why?)
(c) D → B is a BCNF violation. Using this violation, we decompose R into R1 (B, D) and R2 (A, C, D). What are the keys of R1?
(d) What are the FDs that hold in R1? Do not list them all; instead, give a set of FDs from which all other FDs in R1 follow. This set of FDs is called a basis. When checking for BCNF violations, it suffices to check just the basis.
(e) Is R1 in BCNF? Briefly explain why.
(f) What are the keys of R2? (Hint: There is more than one.)
(g) What are the FDs that hold in R2? Again, do not list them all; instead, give a basis.
(h) Is R2 in BCNF? If yes, briefly explain why. Otherwise, decompose further until all decomposed relations are in BCNF, and then show your final results.

Problem 4.

Consider the following schema. The catalog lists prices charged for parts by suppliers.

Supplier (sid, sname, address)
Part (pid, pname, color)
Catalog (sid, pid, price)

Log into rack40.cs.duke.edu and run ~cps216/examples/db-supply/setup.sh to set up a database with some sample data. For the SQL database schema, please refer to the file create.sql in the same directory.

Write SQL statements to answer the following queries. Make sure that the result contains no duplicates. Use DISTINCT only when necessary. Write all your queries in a file named hw1-4.sql. When you are done, run db2 -tvf hw1-4.sql > hw1-4.out (you may need to run db2 connect to cps216 before that and db2 disconnect all afterwards).
Then, print out the file hw1-4.out and turn it in together with the rest of the homework. You may find it useful to consult the online document titled "DB2 SQL Programming Notes", found under the Programming Notes section of the course Web page (http://www.cs.duke.edu/courses/spring03/cps216/).

(a) Find the names of parts for which there is some supplier.
(b) Find the names of suppliers who supply only red parts.
(c) Find the ids of suppliers who supply at least one red part and at least one green part.

(d) Find the names of parts supplied by Lockheed Martin and by no one else.
(e) Find the average price of red parts.
(f) Find the ids of suppliers who charge more for some part than the average price of that part.
(g) For each part, find the names of suppliers who charge the most for that part.

Problem 5.

You are asked to design a relational database for a store. After brainstorming with the store managers, you come up with the following specification:

The store has multiple departments, identified by their names. Each department may have many employees, but only one of them is the department manager.

Employees are identified by their names. We also need to record their salaries. Each employee may work in only one department. Managers are employees as well, and they manage the departments they work in.

The store sells various items identified by item ids. Each item is carried by exactly one department, while each department may carry many items. For each item, we also need to keep a short description and its quantity in stock.

The store deals with a number of suppliers identified by their names. We need to record their addresses. Each supplier supplies an item at a particular price. A supplier may supply any number of items, and the same item could be supplied by different suppliers (probably at different prices).

The store receives orders identified by order ids. Each order has a date and a shipping address, and may include different quantities of multiple items.

You are encouraged (but not required) to use the E/R model to help you design the schema. Make sure that you apply relational database design theory to check for redundancies in your design, and remove them if necessary.

Keep all SQL statements you write for this problem in a file named hw1-5.sql. When you are done, run db2 -tvf hw1-5.sql > hw1-5.out. Then, print out the file hw1-5.out and turn it in together with the rest of this homework.
(a) Create the final relational schema using SQL CREATE TABLE statements on the DB2 server. Choose appropriate data types for your columns, and remember to declare any keys, foreign keys, NOT NULL, and CHECK constraints when appropriate.
(b) Make up some sample data and write SQL INSERT statements to populate the database. Each table should contain at least five rows.
(c) For each table you have created and populated, run the following query: SELECT * FROM table_name FETCH FIRST 10 ROWS ONLY;
(d) Write a query to find, for each department, the name of its manager, the number of employees in this department, the number of orders for items carried by this department, and the number of suppliers that supply items carried by this department. (Each order should be counted at most once even if it contains multiple items carried by this department. Similarly, each supplier should be counted at most once even if it supplies multiple items carried by this department.)

Problem 6.

The following questions are based on the paper "A Relational Model of Data for Large Shared Data Banks", by Codd.

(a) Suppose that an instance of R (A, B) is joinable with an instance of S (B, C). What functional dependencies must hold in order for the join of R with S to be unique (as discussed by Codd in Section 2.1.3)?
(b) Why didn't Codd include the selection operator (σ_p) in the list of relational operators he discussed in Section 2.1?

Problem 7.

The following questions are based on the paper "A History and Evaluation of System R", by Chamberlin et al.

(a) In Section 3, under the subsection titled "The Optimizer", the authors described two join methods that were believed to be optimal or nearly optimal under most circumstances. One notably efficient join method is missing, however. What is this missing join method?
(b) In Section 4, under the subsection titled "The SQL Language", the authors introduced the notion of outer-joins. Subsequently, syntax for outer-join was added to the SQL standard. Strictly speaking, however, the new syntax is not necessary (except perhaps from a user-friendliness point of view). Show that you can write the outer-join between tables R (A, B) and S (B, C) in SQL without using the outer-join syntax.

Problem 8.

For this problem, you may work either independently or in groups of two. Your task is to develop a relational algebra query interface for DB2. Your interface should accept relational algebra query strings as input, translate them into SQL, send the SQL queries to DB2, and print out the results received from DB2. In developing this interface, you may need to consider the following issues:

Relational algebra syntax and parsing.
First, you will need to define the syntax of relational algebra queries and write a parser to convert arbitrarily complex query strings into an internal representation (perhaps in the form of an expression tree). Since parsing is not our focus, you may choose a syntax that is easy to parse. For example, you can use Reverse Polish Notation to avoid the use of parentheses to define the nesting of operators. A query π_{R.A, S.D} (σ_{B = 'Bart' ∨ C = 5} R ⋈_{C ≤ D} S) might be encoded as R \select{B = 'Bart' or C = 5} S \join{C <= D} \project{R.A, S.D}, with the understanding that join is a binary operator while select and project are unary operators.
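As a rough illustration of the Reverse Polish Notation idea, the Python sketch below walks such a string with a stack and emits one flat SQL statement. It is only one possible approach, not a reference solution. The function name rpn_to_sql, the token pattern, and the choice to quote the string literal as 'Bart' are all assumptions made for this example. The sketch handles only select, join, and project, assumes the projection (if any) comes last, and silently skips characters the token pattern does not match.

```python
import re

# Hypothetical token format: either \op{argument} or a bare table name.
TOKEN = re.compile(r"\\(\w+)\{([^}]*)\}|(\w+)")


def rpn_to_sql(query):
    """Translate a select/join/project RPN string into one SQL statement.

    Each stack entry is (select_list, tables, conditions), i.e. a flat
    SELECT ... FROM ... WHERE ... that has not been printed yet.
    """
    stack = []
    for op, arg, table in TOKEN.findall(query):
        if table:                      # operand: push a base table
            stack.append(("*", [table], []))
        elif op == "select":           # unary: add a filter condition
            cols, tables, conds = stack.pop()
            stack.append((cols, tables, conds + [f"({arg})"]))
        elif op == "join":             # binary: merge left and right inputs
            _, right_tables, right_conds = stack.pop()
            _, left_tables, left_conds = stack.pop()
            stack.append(("*",
                          left_tables + right_tables,
                          left_conds + right_conds + [f"({arg})"]))
        elif op == "project":          # unary: choose the output columns
            _, tables, conds = stack.pop()
            stack.append((arg, tables, conds))
        else:
            raise ValueError(f"unknown operator: {op}")
    if len(stack) != 1:
        raise ValueError("malformed query: operands left on the stack")
    cols, tables, conds = stack[0]
    # DISTINCT keeps the result duplicate-free, as relational algebra expects.
    sql = f"SELECT DISTINCT {cols} FROM {', '.join(tables)}"
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return sql


if __name__ == "__main__":
    print(rpn_to_sql(r"R \select{B = 'Bart' or C = 5} S \join{C <= D} \project{R.A, S.D}"))
```

Flattening everything into a single FROM list keeps qualified names such as R.A valid, which is why this sketch avoids nested subqueries. A fuller interface would build an expression tree instead, so that operators such as union, difference, and table rename can be translated as well.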

Using schema information. For some relational algebra queries, knowledge about the schema is required for correct translation. For example, to translate the natural join query, R ⋈ S, we need to know the names of columns in R and S in order to generate the correct SELECT and WHERE clauses. (SQL actually supports a NATURAL INNER JOIN construct officially, but unfortunately it is not implemented in most commercial systems.) For another example, to translate the full table/column rename operator, e.g., ρ_{S(A, B, C)} R, we need to know the names of columns in R in order to rename them in the SELECT clause. Thus, to do a complete job in translation, your interface needs to query DB2 for schema information. Another use of schema information is to detect semantic errors, e.g., references to nonexistent columns, type mismatches, incompatible schemas, etc. To make your life easier for this problem, you may choose not to support natural join and column rename, but you must support regular join and table rename (e.g., ρ_S R). You may also delegate detection of semantic errors to DB2, but you must pass on any error messages returned by DB2 to the users of your interface.

JDBC/ODBC application or db2 wrapper? One approach is to implement the interface as a JDBC/ODBC database application. This approach allows the interface to query DB2 for schema information and use it in translation and error checking. The downside of this approach is that the interface must also handle the presentation of query results and error messages. An alternative approach is to implement the interface on top of the db2 command-line processor. The interface simply needs to translate the query into SQL, execute the SQL using db2, and pass whatever output it produces back to the user. This approach is simpler to implement and easier to debug, but accessing schema information is more difficult.

Difference between set and bag semantics. Remember that relational algebra is duplicate-free, while SQL permits duplicates.
You may need to use DISTINCT when necessary.

Additional bells and whistles. This problem is open-ended. If you have time and enjoy programming databases, you might consider adding the following features:
o A graphical user interface to help users explore the database schema and contents and construct relational algebra queries (perhaps as expression trees).
o More informative error messages.
o Support for natural join, column rename, outerjoin, semijoin, etc.
o The ability to assign names to relational algebra expressions and use them in subsequent queries (just like SQL views).

For this problem, turn in a pointer to the directory containing your documentation, source code, and executable. Please document the relational algebra syntax you support, known bugs/limitations, nifty features you have implemented, and instructions to compile and run your program. The best interface will be used in teaching future database courses at Duke.