Homework 6: FDs, NFs and XML (due April 13th, 2016, 4:00pm, hard-copy in-class please)

Virginia Tech, Computer Science
CS 4604 Introduction to DBMS, Spring 2016, Prakash

Reminders:
a. Out of 100 points. Contains 5 pages.
b. Rough time estimate: ~5-7 hours.
c. Please type your answers. Illegible handwriting may get no points, at the discretion of the grader. Only drawings may be hand-drawn, as long as they are neat and legible.
d. There could be more than one correct answer. We shall accept them all.
e. Whenever you make an assumption, please state it clearly.
f. Important: Unless otherwise mentioned, assume that you need to show your work. For example, if the question asks "What are R's keys?", we do not just want a list; we want a step-by-step explanation as well.
g. Lead TA for this homework: Sorour Amiri.

Q1. XML [7 points] (Courtesy Widom and Ullman)

This question is based on the XML "Vehicles" document below.

<Vehicles>
  <Car manf="hyundai">
    <Model>Azera</Model>
  </Car>
  <Car manf="toyota">
    <Model>Camry</Model>
  </Car>
  <Truck manf="toyota">
    <Model>Tundra</Model>
  </Truck>
  <Car manf="hyundai">
    <Model>Elantra</Model>
    <HorsePower>120</HorsePower>
  </Car>
  <Car manf="toyota">
    <Model>Prius</Model>
    <HorsePower>120</HorsePower>
  </Car>
</Vehicles>

Q1.1 (3 points) Which of the following XPath expressions, when applied to the "Vehicles" document, does NOT produce a sequence of exactly three items?
a) /Vehicles/Car[@manf="Toyota"]/HorsePower
b) /Vehicles/*[HorsePower>200]/Model
c) //*[@manf="toyota"]/@manf
Justify your answer briefly (i.e. give the output of each query that does not produce a sequence of three items).

Q1.2 (4 points) In a DTD that the "Vehicles" document satisfies, which of the following element declarations would you definitely NOT find?
a) <!ELEMENT Vehicles (Car*, Truck+, Car*)>
b) <!ELEMENT Vehicles (Car+, Truck*, Car)>
c) <!ELEMENT Vehicles ((Car | Truck)*)>
d) <!ELEMENT Vehicles (Car*, Truck*, Car*, Truck*, Car*)>
Again, justify your choice(s) briefly (i.e. explain why you would not find your selected element declaration(s) in a DTD for Vehicles).

Q2. FDs Definition [8 points]

Suppose that we have the following four tuples in an instance of relation schema R with five attributes ABCDE:

A  B  C  D  E
1  2  4  3  8
1  2  5  7  9
3  6  4  7  0
1  6  5  7  9

Q2.1 (4 points) Which of the following dependencies can you infer do not hold over R? List only the ones that do not hold, giving the reason in one line for each.
(a) E → D
(b) D → E
(c) DE → B
(d) E → AB
(e) BC → ED
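An instance can refute an FD but can never prove that it holds over the schema: X → Y is refuted as soon as two tuples agree on X and differ on Y. The following is a minimal sketch of that check, not a reference solution. The helper name `violates` and the toy instance over attributes X, Y, Z are assumptions made for illustration and are deliberately not the instance from Q2.

```python
from itertools import combinations

def violates(rows, lhs, rhs):
    """Return a pair of rows that agree on lhs but differ on rhs, or None.

    lhs and rhs are strings of single-letter attribute names (an assumption
    of this sketch), e.g. "XY" means the attribute set {X, Y}.
    """
    for r1, r2 in combinations(rows, 2):
        agree_on_lhs = all(r1[a] == r2[a] for a in lhs)
        differ_on_rhs = any(r1[a] != r2[a] for a in rhs)
        if agree_on_lhs and differ_on_rhs:
            return r1, r2
    return None

# Hypothetical toy instance, not the one from Q2.
rows = [
    {"X": 1, "Y": 10, "Z": "a"},
    {"X": 1, "Y": 10, "Z": "b"},
    {"X": 2, "Y": 20, "Z": "a"},
]

print(violates(rows, "X", "Y"))  # None: this instance does not refute X -> Y
print(violates(rows, "X", "Z"))  # the first two rows agree on X, differ on Z
```

A returned pair of rows is exactly the kind of one-line justification Q2.1 asks for: it names the two tuples that witness the violation.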

Q2.2 (4 points) If ABE → ACE and ABE → ADE hold in R, which of the following dependencies can be deduced? (No explanation is needed; just mark the correct FDs.)
(a) ABE → ACDE
(b) B → C
(c) ABCE → ADE
(d) ABE → AE

Q3. Inferring FDs [27 points]

Consider the relations below and the sets of functional dependencies that hold in those relations, and answer the following questions. For each sub-part, it is enough to list only completely non-trivial FDs with a single attribute on the right-hand side. Note that "candidate key" means just "key" (i.e. the two terms are interchangeable). A candidate key (or simply key) should imply the entire relation and should be minimal. A superkey, on the other hand, is any superset of a candidate key.

Q3.1 R1(A, B, C, D) with FDs AB → C, C → D, and D → A.
Q3.1.1 (3 points) What are all the non-trivial FDs that follow from the given FDs?
Q3.1.2 (3 points) What are all the keys (i.e. candidate keys) of R1?
Q3.1.3 (3 points) What are all the superkeys for R1 that are not keys?

Q3.2 R2(A, B, C, D) with FDs A → B, B → C, and B → D.
Q3.2.1 (3 points) What are all the non-trivial FDs that follow from the given FDs?
Q3.2.2 (3 points) What are all the keys (i.e. candidate keys) of R2?
Q3.2.3 (3 points) What are all the superkeys for R2 that are not keys?

Q3.3 R3(A, B, C, D) with FDs ABD → C, BC → D, CD → A, and AD → B.
Q3.3.1 (3 points) What are all the non-trivial FDs that follow from the given FDs?
Q3.3.2 (3 points) What are all the keys (i.e. candidate keys) of R3?
Q3.3.3 (3 points) What are all the superkeys for R3 that are not keys?

Q4. Projection of FDs [26 points]

Consider the relation R = {A, B, C, D, E, F} and the following set of FDs on R:

F = {AB → C, AC → B, AD → E, B → D, BC → A, E → F}

Consider also the relations R1(A, B, C), R2(A, B, C, E, F), and R3(A, B, C, D).

Q4.1 (2x3=6 points) Compute the projection of the FD-set F on each of the three relations R1, R2 and R3. Only write down a minimal cover.

Q4.2 (2x3=6 points) What are all the candidate keys (i.e. the keys) for each of the three relations R1, R2 and R3?
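Questions about keys, superkeys, and inferred FDs all reduce to computing attribute closures. The sketch below shows one common way to do that and to enumerate candidate keys by brute force. The function names and the toy schema WXYZ with FDs W → X and XY → Z are assumptions chosen for illustration; they are not taken from Q3 or Q4.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    # Enumerate by increasing size so every key found so far is minimal.
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            is_superkey = closure(candidate, fds) == set(schema)
            if is_superkey and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys

# Hypothetical toy schema and FDs, not those from the homework.
schema = "WXYZ"
fds = [("W", "X"), ("XY", "Z")]

print(sorted(closure("W", fds)))     # ['W', 'X']
print(candidate_keys(schema, fds))   # a single key: the set {W, Y}
```

The same `closure` helper supports projecting FDs onto a sub-schema S: for every subset X of S, the FD X → (X⁺ ∩ S) holds on S. The enumeration is exponential in the number of attributes, which is acceptable for relations of the size used here.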

Q4.3 (2x2=4 points) For each of the three relations, indicate if it is in (A) BCNF and (B) 3NF. Explain your answer.

Q4.4 (6 points) Is it possible to decompose each of R2 and R3 into new schemas such that they are in BCNF and the decomposition is lossless and dependency preserving? Show your work.

Q4.5 (4 points) Decompose R into multiple relations so that they are in 3NF and the decomposition is lossless and dependency preserving. Use the standard 3NF synthesis algorithm. Compare your result with Q4.4.

Q5. Lossy decompositions [8 points]

Consider a relation R(A, B, C, D, E). Assume R satisfies the following functional dependencies: C → E and A → C. Suppose we decompose R into these two relations: R1(A, B, C) and R2(C, D, E). So you can think of R1 = Π_{A,B,C}(r) and R2 = Π_{C,D,E}(r).

Q5.1 (5 points) Show that this decomposition is not a lossless-join decomposition by giving an example. That is, give an instance r of R (i.e., give a set of tuples for R) such that both dependencies hold on r but r ≠ R1 ⋈ R2. Give the simplest instance you can find. Also show your work (i.e. compute R1 and R2 and the join).

Q5.2 (3 points) Can you show that the decomposition is lossy by using the theorem discussed in class? Again, show your work.

Q6. FDs in English language [24 points]

Consider the following relation, which stores information from a Facebook dataset.

FacebookDB (u_id, u_name, u_location, u_friends_count, u_posts_count, u_post_importance, u_age, u_influence, p_id, p_text, p_location, p_like_count, p_reshare_count, p_parent_id, c_id, c_text, c_reply_count, c_like_count)

Now consider the following constraints in English:

Statement 1: Every user has a unique user id (u_id). Other information for users such as name (u_name), location (u_location), number of friends (u_friends_count) and posts (u_posts_count), importance of posts (u_post_importance), age, and the influence value are also kept.

Statement 2: Each user can have multiple friends and posts.

Statement 3: For every post we have p_id (which is unique), the text, the location, the number of likes, and the number of re-shares. If the post itself is a re-share of another post, we have the parent post id.

Statement 4: There might be several comments for each post. The comments have a unique id (c_id), text, number of replies (c_reply_count), and number of likes (c_like_count).

Statement 5: The importance of a user's posts (u_post_importance) is computed based on the number of friends and posts (u_friends_count, u_posts_count) of the user.

Statement 6: The user's influence (u_influence) is defined using the age and the importance of the user's posts (u_age, u_post_importance).

Clearly this is not a good relational design, as it contains redundancies and update, insertion and deletion anomalies. Hence we want to decompose it into a good schema.

Q6.1 (6 points) List the functional dependencies for this relation that you can construct using the English description, and specifically mention the corresponding statement number(s) you used to formulate each FD. Note: some statement(s) might not result in an FD.

Q6.2 (3 points) Rigorously prove (using Armstrong's axioms) that you can derive the FD u_friends_count, u_posts_count, u_age → u_influence from the set of FDs you found in Q6.1. Show your steps.

Q6.3 (10 points) Is the above relation in BCNF using the FDs you found in Q6.1? If not, decompose it into a set of BCNF relations (using our standard algorithm). Is your resulting decomposition dependency-preserving too?

Q6.4 (5 points) Decompose the original relation into a set of 3NF relations instead (using the standard synthesis algorithm). Show your steps. Is your resulting decomposition in BCNF as well? Explain your answer.
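A BCNF check asks, for each non-trivial FD X → Y, whether X is a superkey. A violating FD drives one split of the usual decomposition: the relation is replaced by X⁺ and by X together with everything outside X⁺. The sketch below illustrates that single step under stated assumptions: the function names and the toy schema PQR with FDs P → QR and Q → R are hypothetical, and the course's "standard algorithm" may differ in detail.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Given FDs that are non-trivial and whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)            # skip trivial FDs
        and closure(lhs, fds) != set(schema)   # left side is not a superkey
    ]

def split_on(schema, fd, fds):
    """One decomposition step on a violating FD: (X+, X union (R - X+))."""
    lhs, _ = fd
    lhs_closure = closure(lhs, fds)
    return lhs_closure, set(lhs) | (set(schema) - lhs_closure)

# Hypothetical toy schema and FDs, not those from Q6.
schema = "PQR"
fds = [("P", "QR"), ("Q", "R")]

bad = bcnf_violations(schema, fds)
print(bad)                             # [('Q', 'R')]
print(split_on(schema, bad[0], fds))   # the two sub-schemas {Q, R} and {P, Q}
```

Checking only the given FDs is enough for the original relation. For each sub-relation produced by a split, the FDs must first be projected onto it before the check is repeated, which this sketch leaves out.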