CIS 550 Fall 2013 Final Examination, December 13, 2013


CIS 550 Fall 2013 Final Examination
December 13, 2013

Name:
Penn ID:
Email:

My signature below certifies that I have complied with the University of Pennsylvania's Code of Academic Integrity in completing this examination. (Exams without signatures will not be graded.)

Signature:                    Date:

Instructions:

This is an open-book, open-notes, no-device exam: you may not make use of any electronic devices, not even calculators. Your mobile phones and MP3 players must be turned off and stored away.

You have 110 minutes to answer all of the questions. The entire exam is worth 120 points, giving you a guideline of spending approximately 1 minute per point per problem. Partial credit will be given. Do not spend disproportionate time on any one question.

All correct answers are short. Meandering answers or brain dumps will not be given full points.

Write your answers in the spaces provided: you must turn in this printed exam. The back side of each page may be used as a scratch pad.

Good luck!

Score

  1-15: 30pts
  16:   20pts
  17:   15pts
  18:   15pts
  19:   15pts
  20:   15pts
  21:   10pts
  Total Score:

[2 pts each] Please answer Questions 1-15 on the ScanTron sheet, NOT this sheet of paper. Use a Number 2 pencil, and be careful to fully erase if you change your answer.

1. XML solves the most difficult issues in data interchange:

2. XML Schema enables key and foreign key constraints to be specified:

3. JSON can be parsed without knowing anything about the tags:

4. XML is in 3NF:

5. SQL is converted to relational algebra, which is then converted into relational calculus to be executed in the database query execution engine:

6. Local-as-view refers to schema mappings that are defined as queries over the mediated schema:

7. What is true about the results of evaluating an XPath?
   a. XPath returns an unordered set of nodes
   b. XPath returns an unordered multiset of nodes
   c. XPath returns an ordered set of nodes
   d. XPath returns an ordered multiset of nodes

8. Which of the following can reduce the possibility of SQL injection attacks:
   a. Prepared statements
   b. Dynamic SQL
   c. Views
   d. None of the above

9. NoSQL databases:
   a. Never use SQL, hence the name
   b. Always perform better than SQL databases
   c. Are especially appropriate for transactions
   d. None of the above

10. Virtual data integration or enterprise information integration means:
   a. The cloud is used to integrate data
   b. A central database is used to integrate data
   c. Extract-transform-load (ETL) scripts are used to integrate the data
   d. A virtual mediated schema is used
   e. None of the above

11. The hash join algorithm needs:
   a. A join condition (theta) that is a range condition
   b. Input relations that are sorted on their primary keys
   c. Input relations that are sorted on the join key
   d. Input relations that are unsorted

12. Sorting and hashing can be used to:
   a. Reduce the amount of data that needs to be considered in a multiple-pass algorithm
   b. Reduce the size of query results
   c. Project data early
   d. Convert data using MapReduce

13. An SQL query optimizer, such as that in Oracle or DB2, optimizes:
   a. The number of tuples according to cardinality estimates
   b. The number of requests according to workload
   c. The number of users according to predictions
   d. The estimated cost of the query according to a cost model

14. Which of the following is an ACID property:
   a. Concurrency
   b. Atomicity
   c. Idempotence
   d. Delivery

15. Cloud storage systems typically do not provide full ACID semantics because of:
   a. The number of clients being handled by the cloud
   b. The latency of communications across multiple servers
   c. The sizes of the databases
   d. The number of CEOs and CIOs who associate ACID with drugs and get a negative impression of the capability

16. [20pts] Given the document:

  <items>
    <item>
      <type>book</type>
      <isbn>978-0385349949</isbn>
      <author key="121">Sheryl Sandberg</author>
      <title>Lean In</title>
    </item>
    <item>
      <type>book</type>
      <isbn>978-0385537858</isbn>
      <author key="123">Dan Brown</author>
      <title>Inferno</title>
    </item>
    <item>
      <type>movie</type>
      <star key="149">Steve Carell</star>
      <star key="300">Kristin Wiig</star>
      <director key="3">Chris Renaud</director>
      <director key="99">Pierre Coffin</director>
      <title>Despicable Me 2</title>
    </item>
  </items>

a. Write an XPath to return all book titles by Dan Brown.

  /items/item[type="book"][author="Dan Brown"]/title

or

  /items/item[type="book"][author="Dan Brown"]/title/text()

b. Write an XQuery to convert the books and authors into a relation-like form with foreign keys. (The author key contains the value, and recall that the @ prefix in XPath can be used to query for an attribute.) Your code should be generic, but its output, over the sample data above, should look like:

  <books>
    <author key="121">
      <name>Sheryl Sandberg</name>
    </author>
    <author key="123">
      <name>Dan Brown</name>
    </author>
    <book>
      <isbn>978-0385349949</isbn>
      <title>Lean In</title>
      <author-id>121</author-id>
    </book>
    <book>
      <isbn>978-0385537858</isbn>
      <title>Inferno</title>
      <author-id>123</author-id>
    </book>
  </books>

Answer:

  <books> {
    for $k in distinct-values(doc("input.xml")/items/item/author/@key)
    let $a := (doc("input.xml")/items/item/author[@key = $k])[1]
    return <author key="{ $k }"><name>{ $a/text() }</name></author>,
    for $b in doc("input.xml")/items/item[type = "book"]
    return
      <book>
        { $b/isbn, $b/title }
        { for $a in $b/author
          return <author-id>{ string($a/@key) }</author-id> }
      </book>
  } </books>
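Both parts of Question 16 can be checked against the sample data with a short script. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree in place of an XQuery engine. ElementTree implements only a subset of XPath, so the paths are written relative to the root <items> element. The names SAMPLE, dan_brown_titles, and to_relations are illustrative and not part of the exam.

```python
import xml.etree.ElementTree as ET

# The sample document from Question 16, with well-formed tags and quoted attributes.
SAMPLE = """<items>
  <item>
    <type>book</type>
    <isbn>978-0385349949</isbn>
    <author key="121">Sheryl Sandberg</author>
    <title>Lean In</title>
  </item>
  <item>
    <type>book</type>
    <isbn>978-0385537858</isbn>
    <author key="123">Dan Brown</author>
    <title>Inferno</title>
  </item>
  <item>
    <type>movie</type>
    <star key="149">Steve Carell</star>
    <star key="300">Kristin Wiig</star>
    <director key="3">Chris Renaud</director>
    <director key="99">Pierre Coffin</director>
    <title>Despicable Me 2</title>
  </item>
</items>"""


def dan_brown_titles(root):
    """Part (a): titles of books whose author is Dan Brown."""
    path = "./item[type='book'][author='Dan Brown']/title"
    return [title.text for title in root.findall(path)]


def to_relations(root):
    """Part (b): split books into an author table and a book table.

    Returns (authors, books), where authors maps key -> name and each
    book is an (isbn, title, author_key) tuple; author_key acts as the
    foreign key into authors.
    """
    authors = {}
    books = []
    for item in root.findall("./item[type='book']"):
        for author in item.findall("author"):
            authors[author.get("key")] = author.text  # dict deduplicates by key
            books.append(
                (item.findtext("isbn"), item.findtext("title"), author.get("key"))
            )
    return authors, books


root = ET.fromstring(SAMPLE)
titles = dan_brown_titles(root)
authors, books = to_relations(root)
print(titles)
print(authors)
print(books)
```

The dictionary plays the role of distinct-values in the XQuery answer: an author who wrote several books appears once in the author table, while each book row carries the author's key.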

17. [15pts] Given the schema:

  Users(login, first, last, address, email)
  Friends(login1, login2)   where login1, login2 reference Users

Convert the following query into a relational algebra expression:

  SELECT user, F2.login2 AS recommendation
  FROM Users U, Friends F1, Friends F2
  WHERE U.email = 'me@me.com'
    AND U.login = F1.login1
    AND F1.login2 = F2.login1

18. [15pts] Explain briefly what a clustered index means with respect to what data is stored in intermediate nodes, leaf nodes, and so on.

In a clustered index, data records are stored in the same order as the key of the index. Typically this means the intermediate nodes in the index hold pivot values on the key, and the leaf nodes contain the records. In unusual cases we can have a secondary index that's clustered, in that the leaf nodes contain pointers to data records that show up in the same exact order. (Consider a clustered primary index on (lastname, firstname) and a secondary clustered index on lastname.)
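The transcript gives no answer for Question 17. One possible reading, assuming the unqualified column "user" in the SELECT list means U.login, is:

  PROJECT[U.login, F2.login2 AS recommendation] (
    SELECT[email = 'me@me.com'] (Users AS U)
      JOIN[U.login = F1.login1] (Friends AS F1)
      JOIN[F1.login2 = F2.login1] (Friends AS F2) )

The sketch below evaluates that plan step by step in Python over made-up rows. USERS, FRIENDS, and recommend are hypothetical names, and the result keeps duplicates, as SQL would without DISTINCT.

```python
# Toy instances of the two relations; all rows are invented for illustration.
USERS = [
    {"login": "ann", "first": "Ann", "last": "Lee",
     "address": "1 Main St", "email": "me@me.com"},
    {"login": "bob", "first": "Bob", "last": "Ray",
     "address": "2 Oak Ave", "email": "bob@example.com"},
]
FRIENDS = [  # (login1, login2)
    ("ann", "bob"),
    ("bob", "cat"),
    ("bob", "dan"),
    ("cat", "eve"),
]


def recommend(users, friends, email):
    """Friends-of-friends for the user with the given email address."""
    # Selection: keep only the user whose email matches.
    selected = [u for u in users if u["email"] == email]
    # First join: U.login = F1.login1.
    hop1 = [(u, f1) for u in selected for f1 in friends if u["login"] == f1[0]]
    # Second join: F1.login2 = F2.login1.
    hop2 = [(u, f1, f2) for (u, f1) in hop1 for f2 in friends if f1[1] == f2[0]]
    # Projection: (U.login, F2.login2 AS recommendation).
    return [(u["login"], f2[1]) for (u, _f1, f2) in hop2]


RECOMMENDATIONS = recommend(USERS, FRIENDS, "me@me.com")
print(RECOMMENDATIONS)
```

Applying the selection before the joins mirrors the usual push-down heuristic: only one Users row takes part in the joins.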

19. [15pts] Given the B+ Tree:

  R: 10 | 20 | 30 | 81          (children: A, B, C, D, I)
    D: 36 | 42 | 51             (children: E, F, G, H)
      E: 30* 31*
      F: 36* 38*
      G: 42* 43*
      H: 51* 52* 56* 60*
    I: 94 | 98                  (children: J, K, L)
      J: 81* 82*
      K: 94* 95* 96* 97*
      L: 98* 99* 100* 105*

Show the effects of inserting 71* and 93*. You may draw over the figure, or redraw the labeled nodes (R, D-L) below.

  D:  36, 42, 51, 56
  H:  51*, 52*
  H': 56*, 60*, 71*
  J:  81*, 82*, 93*

(H is full, so it splits into H and H' and 56 is copied up into D; J has room for 93*.)
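The leaf-level step in Question 19 can be sketched in a few lines of Python. This is an illustration only: the capacity of four entries per node is inferred from the figure (no node holds more than four), and the split convention (lower half stays, upper half moves to the new leaf, first key of the new leaf is copied up) is chosen to match the answer above. The function name insert_into_leaf is hypothetical.

```python
LEAF_CAPACITY = 4  # assumption: inferred from the figure, not stated in the exam


def insert_into_leaf(leaf, key, capacity=LEAF_CAPACITY):
    """Insert key into a sorted leaf.

    Returns (left, right, separator). If the leaf does not overflow,
    right and separator are None. On overflow the leaf splits and the
    first key of the new right leaf is copied up as the separator.
    """
    entries = sorted(leaf + [key])
    if len(entries) <= capacity:
        return entries, None, None
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right, right[0]


# Insert 71* into H, which is full, so it splits.
H, H_PRIME, SEPARATOR = insert_into_leaf([51, 52, 56, 60], 71)

# The separator is added to parent D, which still has room (three keys).
D = sorted([36, 42, 51] + [SEPARATOR])

# Insert 93* into J, which has room, so nothing splits.
J, J_PRIME, J_SEPARATOR = insert_into_leaf([81, 82], 93)

print(D, H, H_PRIME, J)
```

Because D has only three keys before the insert, the copied-up separator fits and the split does not propagate to the root R.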

20. [15pts] Given the following costs:

  Page random access time = 5 msec
  Page sequential read time = 0.05 msec
  R tuples/page = 20
  Cardinality of R = 2000 tuples
  S tuples/page = 10
  Cardinality of S = 1000 tuples
  R and S are sorted on the join key

Which join algorithm (nested loops or merge) should we choose, and why? Assume every page is filled to capacity and that the buffer pool (cache) is 2 pages.

Merge join: because both inputs are already sorted on the join key, it needs only one pass through each of the tables.

21. [10pts] Explain briefly where key-value stores offer advantages over relational database systems.

Key-value stores (KVSs) offer benefits when concurrent updates are unlikely to touch the same key and transactions don't touch multiple keys; when the data is not naturally tabular; and when queries are not content-related.
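The answer to Question 20 can be backed by a rough cost estimate. The sketch below is one possible model, not the exam's official derivation. It assumes a page-at-a-time nested loops join with R as the outer relation, charges a random access whenever the reader switches between the two files, and assumes the merge never has to back up over duplicate join keys. Times are kept in integer microseconds to avoid floating-point rounding.

```python
RANDOM_US = 5000     # page random access time: 5 msec
SEQUENTIAL_US = 50   # page sequential read time: 0.05 msec


def pages(cardinality, tuples_per_page):
    """Number of pages, given that every page is filled to capacity."""
    return cardinality // tuples_per_page


def merge_join_cost(pages_r, pages_s):
    """(page reads, microseconds) for one pass over each sorted input.

    Pessimistic assumption: with a 2-page buffer the reads alternate
    between R and S, so every page read is charged as a random access.
    """
    reads = pages_r + pages_s
    return reads, reads * RANDOM_US


def nested_loops_cost(pages_r, pages_s):
    """(page reads, microseconds) for page-at-a-time nested loops.

    For each page of R: one random access for the R page, one random
    access to return to the start of S, then the rest of S sequentially.
    """
    reads = pages_r + pages_r * pages_s
    per_outer_page = RANDOM_US + RANDOM_US + (pages_s - 1) * SEQUENTIAL_US
    return reads, pages_r * per_outer_page


PAGES_R = pages(2000, 20)
PAGES_S = pages(1000, 10)
MERGE = merge_join_cost(PAGES_R, PAGES_S)
NESTED = nested_loops_cost(PAGES_R, PAGES_S)
print("merge join:  ", MERGE)
print("nested loops:", NESTED)
```

Under these assumptions merge join reads 200 pages against 10,100 for nested loops, and it stays cheaper even when every one of its reads is charged as a random access.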