Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE

Similar documents
Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE

Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. Date: Thursday 16th January 2014 Time: 09:45-11:45. Please answer BOTH Questions

Relational Model: History

CS 564 Final Exam Fall 2015 Answers

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Query Processing & Optimization

Relational Algebra. Procedural language Six basic operators

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

McGill April 2009 Final Examination Database Systems COMP 421

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

Today s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls

Database System Concepts

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems

Advanced Database Systems

Hash-Based Indexing 165

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Chapter 13: Query Processing

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION

CSE Midterm - Spring 2017 Solutions

Chapter 12: Query Processing

2.2.2.Relational Database concept

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

Chapter 13: Query Processing Basic Steps in Query Processing

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Principles of Data Management. Lecture #13 (Query Optimization II)

Fundamentals of Database Systems

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.

Mahathma Gandhi University

CSE 344 APRIL 20 TH RDBMS INTERNALS

Basant Group of Institution

CSE 344 FEBRUARY 14 TH INDEXING

7. Query Processing and Optimization

Homework 1: RA, SQL and B+-Trees (due September 24 th, 2014, 2:30pm, in class hard-copy please)

Chapter 12: Query Processing

CSIT5300: Advanced Database Systems

Relational Query Optimization. Highlights of System R Optimizer

Query Languages for XML

Lecture 20: Query Optimization (2)

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. M.Sc. in Computational Science & Engineering

2.3 Algorithms Using Map-Reduce

Chapter 3. Algorithms for Query Processing and Optimization

FDB: A Query Engine for Factorised Relational Databases

Parser: SQL parse tree

Relational Algebra and SQL. Basic Operations Algebra of Bags

Chapter 9. Cardinality Estimation. How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11

Chapter 12: Query Processing. Chapter 12: Query Processing

Reminders. Query Optimizer Overview. The Three Parts of an Optimizer. Dynamic Programming. Search Algorithm. CSE 444: Database Internals

CMP-3440 Database Systems

COMP9311 Week 10 Lecture. DBMS Architecture. DBMS Architecture and Implementation. Database Application Performance

Semantic Optimization of Preference Queries

EXTENDED RELATIONAL ALGEBRA OUTERJOINS, GROUPING/AGGREGATION INSERT/DELETE/UPDATE

Chapter 6: Formal Relational Query Languages

Homework 1: RA, SQL and B+-Trees (due Feb 7, 2017, 9:30am, in class hard-copy please)

Chapter 2: Intro to Relational Model

Database Tuning and Physical Design: Basics of Query Execution

Two hours. Please note that an OMR Sheet is attached for use with Section A: full instructions for their use are given in Section A

Web Science & Technologies University of Koblenz Landau, Germany. Relational Data Model

CMSC 424 Database design Lecture 18 Query optimization. Mihai Pop

RELATIONAL DATA MODEL

CS 245 Midterm Exam Winter 2014

CS145 Introduction. About CS145 Relational Model, Schemas, SQL Semistructured Model, XML

CS 582 Database Management Systems II

Learn Relational Database from Scratch. Dan Li, Ph.D. Associate Professor Computer Science Eastern Washington University

DBMS Query evaluation

Relational Query Optimization

3. Relational Data Model 3.5 The Tuple Relational Calculus

Database Systems SQL SL03

Ch 5 : Query Processing & Optimization

More SQL. Extended Relational Algebra Outerjoins, Grouping/Aggregation Insert/Delete/Update

From SQL-query to result Have a look under the hood

CSE 544 Principles of Database Management Systems

Data about data is database Select correct option: True False Partially True None of the Above

Principles of Database Management Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 7 - Query execution

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Ian Kenny. November 28, 2017

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Question Score Points Out Of 25

Database Management Systems Written Examination

QUIZ 1 REVIEW SESSION DATABASE MANAGEMENT SYSTEMS

Ian Kenny. December 1, 2017

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst March 8 and 13, 2007

Database Systems SQL SL03

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

Introduction to Database Systems CSE 444

CompSci 516 Data Intensive Computing Systems

CS 245 Midterm Exam Solution Winter 2015

Grouping Operator. Applying γ L (R) Recall: Outerjoin. Example: Grouping/Aggregation. γ A,B,AVG(C)->X (R) =??

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15

Part XII. Mapping XML to Databases. Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321

CS122 Lecture 4 Winter Term,

Chapter 19 Query Optimization

Relational Algebra BASIC OPERATIONS DATABASE SYSTEMS AND CONCEPTS, CSCI 3030U, UOIT, COURSE INSTRUCTOR: JAREK SZLICHTA

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

University of Waterloo Midterm Examination Sample Solution

Transcription:

COMP 62421 Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Querying Data on the Web Date: Wednesday 24th January 2018 Time: 14:00-16:00 Please answer all FIVE Questions provided. They amount to a total of 50 marks. This is a CLOSED book examination The use of electronic calculators is permitted provided they are not programmable and do not store text [PTO]

1. a) Given two relations R(a : int, b : str) and S(c : int, d : str), let R > S. For each of the algebraic expressions below, write down an expression that characterizes its minimum cardinality and another that characterizes its maximum cardinality, stating any assumptions you have made in your answer. i) R S ii) π a R b) Let BBD be a database schema on beers (made by some brewer) that are served in bars (at some address and for a certain price), and on drinkers (at some address) that frequent bars a certain number of times per week, where the beers that drinkers like are also recorded, as follows: Beer(name,brewer) Serves(bar[fk Bar.name],beer[fk Beer.name],price) Bar(name,address) Drinker(name,address) Frequents(drinker[fk Drinker.name],bar[fk Bar.name],time_a_week) Likes(drinker[fk Drinker.name],beer[fk Beer.name]) i) Code a solution to the following problem as a TRC expression over the BBD database schema: Return a unary relation with column regulars where regulars contains the name of a drinker that frequents at least one bar more than once a week. ii) Code a solution to the following problem as a SQL query over the BBD database schema: Return a ternary relation with columns barname, baraddress and bestvalue, where barname contains the name of the bar, baraddress contains the address of the bar, and bestvalue contains the price of the cheapest beer that the bar serves, provided that that price is less than 3. Note that the result will have one row per bar (assuming each bar only exists in one address). Page 2 of 6

2. a) Assume that a query execution plan contains the subtree R S over intermediate results R and S. Further assume that the DBMS has been configured to use B bytes of buffer space. Briefly explain how large B must be in terms of the size in bytes of R and S for it to make sense for the optimizer to select a hash join as the physical algorithm with which to evaluate R S. b) Assume that a query execution plan contains the subtree σ a>3 (R c π b (S)) over intermediate results R and S, where a schema(r) and b schema(s) and c schema(r) schema(s). Further assume that execution is pipelined and that the optimizer selects a hash join to evaluate the join in this query plan fragment. Briefly explain how this choice impacts on operator-level multithreading of the right child of the join (i.e., the projection) and, likewise, of the parent of the join (i.e., the selection). c) Consider the following query plan fragment: R S T U. Assume that the average width of a tuple is the same for all the leaf nodes (i.e., R,S,T,U) and that that width is 10. Further assume that the cardinalities of the leaf nodes are as follows: R = 20, S = 10, T = 12, U = 15. Finally, assume the worst-case cardinality for the result of all the joins in the plan fragment except when they involve S or T, in which case empirical evidence shows that the resulting size in bytes is, respectively, 10% and 20% smaller than the worst case would have been. State which join order would be selected by the greedy algorithm taught in this course unit indicating the outcome of each pass in the algorithm s execution on the given inputs and stating any assumptions you have made in your answer. (6 marks) Page 3 of 6 [ Next page ]

3. a) In the context of XQuery Core, briefly describe what is meant by a static semantics. (1 mark) b) Consider the following FLWOR expression F: for $i in (<item>c</item>, <item>a</item>, <item>b</item>) return <creator>{data($i)}</creator> Now, consider the following trace of the evaluation of F using the sequence, right unit, let and data equivalence laws, where E1, E2, E3 and E4 act as place-holders. Using your knowledge of those equivalence laws, write down the XQuery expressions that instantiate E1, E2, E3 and E4. for $i in (<item>c</item>, <item>a</item>, <item>b</item>) return <creator>{data($i)}</creator> = (sequence) E1 = (right unit) E2 = (let) E3 = (data) E4 c) Write out the expression that results from mapping the following XPath expression into XQuery Core: $root/x/b/y[@z > 0] (5 marks) Page 4 of 6

4. a) State any two of the various serialization syntaxes for storing and exchanging RDF. b) Use the triple binary tables strategy to map the following RDF triples into relational tables. You only need to show the s and p tables. (8 marks) (uk, capital, london) (uk, area, 242495) (usa, capital, washington) (usa, area, n) Page 5 of 6 [ Next page ]

5. a) Briefly explain why map-reduce computations so directly express queries of the form SELECT λ,γ(α) FROM Φ WHERE θ GROUP BY λ where Φ is a relation with schema λ {α}, Γ is an element of {COUNT, SUM, MAX, MIN, AVG}, and θ is a predicate over the schema of Φ. b) Consider, in scientific computing, the volume of data generated by the Large Hadron Collider (LHC) at CERN, i.e., some 30 petabytes or so, annually. Given the four characteristics of so-called big data known as the four Vs, briefly comment on how each one applies or not in the case of the LHC at CERN. c) Consider the following RA algebraic expression over the database in Q1b above: Bar Frequents Sketch the pseudocode of the mapper and reducer functions that would compute the correct value for the given algebraic expression. Page 6 of 6 END OF EXAMINATION