Question 1. SQL and Relational Algebra [25 marks] Question 2. Enhanced Entity Relationship Data Model [25 marks]

Similar documents
Question 1. Relational Data Model and Relational Algebra [26 marks] Question 2. SQL and [26 marks]

COMP302. Database Systems

A7-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

McGill April 2009 Final Examination Database Systems COMP 421

DATABASE MANAGEMENT SYSTEMS

Schema And Draw The Dependency Diagram

Database Management

Examination paper for TDT4145 Data Modelling and Database Systems

CS348: INTRODUCTION TO DATABASE MANAGEMENT (Winter, 2011) FINAL EXAMINATION

Sankalchand Patel College of Engineering, Visnagar B.E. Semester III (CE/IT) Database Management System Question Bank / Assignment

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

Data about data is database Select correct option: True False Partially True None of the Above

CMSC 461 Final Exam Study Guide

GUJARAT TECHNOLOGICAL UNIVERSITY

The appendix contains information about the Classic Models database. Place your answers on the examination paper and any additional paper used.

Babu Banarasi Das National Institute of Technology and Management

THE UNIVERSITY OF BRITISH COLUMBIA CPSC 304: MIDTERM EXAMINATION SET A OCTOBER 2016

Course Outline Faculty of Computing and Information Technology

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10

Database Management Systems Paper Solution

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam

CSE344 Final Exam Winter 2017

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

VIEW OTHER QUESTION PAPERS

MIDTERM EXAMINATION Spring 2010 CS403- Database Management Systems (Session - 4) Ref No: Time: 60 min Marks: 38

D.K.M COLLEGE FOR WOMEN(AUTONOMOUS),VELLORE DATABASE MANAGEMENT SYSTEM QUESTION BANK

CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein. Student ID: UCSC

Bachelor in Information Technology (BIT) O Term-End Examination

CSE 444, Winter 2011, Midterm Examination 9 February 2011

UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

BIRKBECK (University of London)

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

CS 564 Final Exam Fall 2015 Answers

Essay Question: Explain 4 different means by which constrains are represented in the Conceptual Data Model (CDM).

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science

IMPORTANT: Circle the last two letters of your class account:

Assignment Session : July-March

BCS THE CHARTERED INSTITUTE FOR IT BCS HIGHER EDUCATION QUALIFICATIONS BCS Level 5 Diploma in IT

Database Management Systems (CS 601) Assignments

Lecturer 2: Spatial Concepts and Data Models

THE COPPERBELT UNIVERSITY

Example Examination. Allocated Time: 100 minutes Maximum Points: 250

Name :. Roll No. :... Invigilator s Signature : DATABASE MANAGEMENT SYSTEM

Tutorial 2: Relational Modelling

Name Class Account UNIVERISTY OF CALIFORNIA, BERKELEY College of Engineering Department of EECS, Computer Science Division J.

Mahathma Gandhi University

Birkbeck. (University of London) BSc/FD EXAMINATION. Department of Computer Science and Information Systems. Database Management (COIY028H6)

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

CS/B.Tech/CSE/New/SEM-6/CS-601/2013 DATABASE MANAGEMENENT SYSTEM. Time Allotted : 3 Hours Full Marks : 70

Database Systems Management

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

1 (10) 2 (8) 3 (12) 4 (14) 5 (6) Total (50)

EXAMINATIONS 2013 MID-YEAR SWEN 432 ADVANCED DATABASE DESIGN AND IMPLEMENTATION

CS 461: Database Systems. Final Review. Julia Stoyanovich

Computer Science 597A Fall 2008 First Take-home Exam Out: 4:20PM Monday November 10, 2008 Due: 3:00PM SHARP Wednesday, November 12, 2008

3 February 2011 CSE-3421M Test #1 p. 1 of 14. CSE-3421M Test #1. Design

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz I

Exam. Question: Total Points: Score:

Database Management Systems

Midterm 2: CS186, Spring 2015

(DMTCS 01) Answer Question No.1 is compulsory (15) Answer One question from each unit (4 15=60) 1) a) State whether the following is True/False:

University of Waterloo Midterm Examination Sample Solution

CSCE 4523 Introduction to Database Management Systems Final Exam Fall I have neither given, nor received,unauthorized assistance on this exam.

CSCE 4523 Introduction to Database Management Systems Final Exam Spring I have neither given, nor received,unauthorized assistance on this exam.

CS425 Midterm Exam Summer C 2012

Relational Model History. COSC 416 NoSQL Databases. Relational Model (Review) Relation Example. Relational Model Definitions. Relational Integrity

6 February 2014 CSE-3421M Test #1 w/ answers p. 1 of 14. CSE-3421M Test #1. Design

Basant Group of Institution

Introduction to Database Systems. Announcements CSE 444. Review: Closure, Key, Superkey. Decomposition: Schema Design using FD

Birkbeck. (University of London) BSc/FD EXAMINATION. Department of Computer Science and Information Systems. Database Management (COIY028H6)

Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. Date: Thursday 16th January 2014 Time: 09:45-11:45. Please answer BOTH Questions

Database Management System 15

SAMPLE FINAL EXAM SPRING/2H SESSION 2017

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science

NAPIER UNIVERSITY SCHOOL OF COMPUTING FIRST DIET (SEMESTER TWO) EXAMINATION SESSION CS 71012: DATABASE SYSTEMS TIME ALLOWED: 2 HOURS

Homework 6: FDs, NFs and XML (due April 13 th, 2016, 4:00pm, hard-copy in-class please)

SYED AMMAL ENGINEERING COLLEGE

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

Review The Big Picture

IMPORTANT: Circle the last two letters of your class account:

CS 474, Spring 2016 Midterm Exam #2

Introduction to Databases

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2015 Quiz I

You write standard JDBC API application and plug in the appropriate JDBC driver for the database the you want to use. Java applet, app or servlets

II B.Sc(IT) [ BATCH] IV SEMESTER CORE: RELATIONAL DATABASE MANAGEMENT SYSTEM - 412A Multiple Choice Questions.

IMPORTANT: Circle the last two letters of your class account:

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados

1. Considering functional dependency, one in which removal from some attributes must affect dependency is called

CS211 Lecture: Database Design

8) A top-to-bottom relationship among the items in a database is established by a

Draw A Relational Schema And Diagram The Functional Dependencies In The Relation >>>CLICK HERE<<<

Chapter 12: Query Processing

normalization are being violated o Apply the rule of Third Normal Form to resolve a violation in the model

Logical Database Design. ICT285 Databases: Topic 06

6.830 Problem Set 3 Assigned: 10/28 Due: 11/30

IMPORTANT: Circle the last two letters of your class account:

Northern India Engineering College, New Delhi Question Bank Database Management System. B. Tech. Mechanical & Automation Engineering V Semester

Delhi Noida Bhopal Hyderabad Jaipur Lucknow Indore Pune Bhubaneswar Kolkata Patna Web: Ph:

Unit Assessment Guide

Transcription:

EXAMINATIONS 2003 MID-YEAR COMP 302 Database Systems Time allowed: Instructions: 3 Hours Answer all questions. Make sure that your answers are clear and to the point. Calculators and foreign language dictionaries are allowed. No reference material is allowed. There are 180 marks on the exam. CONTENTS: Question 1. SQL and Relational Algebra [25 marks] Question 2. Enhanced Entity Relationship Data Model [25 marks] Question 3. Mapping ER to Relational Data Model [30 marks] Question 4. Functional Dependencies and Normalization [40 marks] Question 5. Query Optimisation [40 marks] Question 6. Concurrency Control [20 marks] Last Page: Formulae for Computing Query Cost Estimate 1

Question 1. SQL and Relational Algebra [25 marks] Consider the following part of a relational database schema for recording data from two linguistics experiments on some subjects using a collection of paragraphs and a collection of sentences. One experiment measures the reading time of the subjects on each paragraph, and the other measures the number of words that the subjects can recall from each sentence. Subject: {{SubjId, Gender, Age},{SubjId}} ParaExp: {{SubjId, ParagraphNum, Time}, {SubjId+ParagraphNum}} SentExp: {{SubjId, SentenceNum, WordCount}, {SubjId+SentenceNum}} Although there are some other relation schemes in the database schema, only these three are needed for the intended queries. Details of the ParaExp relation are given in the table. Attribute Data Type Max. Length Null SubjId Int 4 No ParagraphNum Int 2 No Time Bigint 8 No (a) Write an SQL statement that would define the ParaExp table. Include any referential integrity constraints. Deleting a subject from the Subject table should delete all their experimental results, and changing the SubjId of a subject should not be allowed if there are any experimental results for that subject. [3 Marks] (b) Write SQL queries for each of the following: (i) Which subjects took more than 150 seconds to read any paragraph? [3 Marks] 2

(ii) Which female subjects read any paragraph in less than 100 seconds and remembered at least 15 words from any sentence? [3 Marks] (iii) List, in increasing order, the average reading time of the paragraphs. [5 Marks] (iv) List, in increasing order, the reading times on the paragraph experiment of the subject(s) who recalled the most words from sentence 8? [8 Marks] (c) Write a Relational Algebra expression for a query that retrieves the paragraph numbers of the paragraphs read by female subjects in less than 100 seconds [3 Marks] 3

Question 2. Enhanced Entity Relationship Data Model [25 marks] A large legal company wants to construct a database to record all the information about the activities of the company. You are to construct an EER diagram that models the part of the company s activities described below. Show all the participation and cardinality constraints (using one of the notations presented in the course). If there are any constraints that cannot be expressed in the EER diagram, write them in English below the diagram. If the description does not specify information you need to know, make a reasonable assumption and state the assumption you have made. The staff of the company consists of a number of lawyers and a number of legal assistants. The number of staff is small enough that they can identify the staff by name. Each staff member has a role (lawyer or assistant) and a rate ($ per hour). The lawyers each have a specialty, and some of the lawyers are partners, while others are not. The primary activity of the company is legal cases for clients. Each active case is identified by a CaseID number, and involves a particular client. At least one lawyer, and possibly more, will work on each case, and each lawyer may work on several cases (or on none). Every day, they record how much time they have worked on each case. Exactly one lawyer is assigned to be in charge of each case. Not all lawyers are in charge of a case. The legal assistants are assigned to cases, but, unlike the lawyers, each assistant is assigned to at most one case. A case may have more than one legal assistant assigned to it, and some cases have no legal assistants at all. There are several different courts in the country. They are identified by their level, (magistrate, superior, appeal), and their district. The court rooms where trials take place are identified by which court they belong to and a number, for example, Wellington magistrates court, #3. Each court room also has an address. Judges are identified by their names. Each judge has a reputation for how they conduct trials. Each case will be tried by one court, though each court may be trying many cases, and the company may or may not have cases with all the different courts. The trial of a case must involve at least one, and usually many appearances. Each appearance involves a judge at a particular courtroom at a particular time (specified by the day and the hour). The different appearances of a case may have different judges and different courtrooms, as long as they are different courtrooms of the same court. There can only be one appearance in any courtroom at any given day and time. Courtrooms and judges are not engaged in an appearance at every time, and some judges and courtrooms are not involved in any of the cases of this company. 4

5

Question 3. Mapping EER to relational data model [30 marks] The figure below shows an EER model for an database for an earthquake monitoring service. The database records the various base-stations owned by the service. Each base-station is located in a town that is threatened by some known fault lines, and each town has at most one base-station. A base-station has a collection of seismometers (always identified by A, B, C, ) located near the base-station that report their measurements every 10 seconds to the base-station by radio. The base station forwards the measurements by phone line to the central database. There are two kinds of seismometers (transverse and reflective) that make different measurements. Reflective seismometers are placed to monitor a particular fault at a certain depth. (a) Map the EER diagram below into a relational database schema: [18 Marks] For each relation schema, list the attributes of the relation and a primary key. You do not need to specify the domains of the attributes. List a set of non-redundant referential integrity constraints. State the highest normal form that your relation database schema is in, and explain why. Note that the partial key of the BaseStation type is empty. PhoneNo SeisNam Latitude BaseStation 1 1 control N Seismometer Type Location Longitude in d Type 1 Town N threat Level TName Region N ReflectvSeism TransvsSeism M FaultLine 1 monitor Measures Sensitvty Direction Measures Length FaultID Depth Time Xdist Ydist Time Data 6

7

(b) The monitor and control relationships in the diagram require total participation of one or both of the entity types involved. Explain how your relational database schema enforces these constraints, if it does. Otherwise, suggest how the constraints could be enforced. [6 Marks] (c) If your relational database schema enforces the constraint that the specialization of Seismometer is total, explain how it enforces the constraint. If not, give two different examples of how adding or deleting tuples from the database could leave the database in an inconsistent state. [6 Marks] 8

SPARE PAGE FOR EXTRA S Cross out rough working that you do not want marked. Specify the question number for work you do want marked. 9

Question 4. Functional Dependencies and Normalization (a) Consider the following set of functional dependencies (fd s): F 1 = {A BC, CDEG DEG, G G }. [40 marks] Apply the decomposition inference rule on the set F 1 to produce a new set of fd s F 1, where each fd has only one attribute on its right hand side. [3 marks] (b) Consider the following set of functional dependencies: F 2 = {AB C, DEG H, A C, DG J, J H }. Transform the set F 2 into a new set of fd s F 2, in which each fd is left reduced (ie has no redundant attributes on its left hand side). [6 marks] (c) Consider the following set of left reduced functional dependencies: F 3 = {AB C, C D, C B, D B, AB D, AD C }. Transform the set F3 into a new, non redundant set of fd s F 3. [8 marks] 10

(d) Consider the following set of left reduced and non redundant functional dependencies: F 4 = {AB C, AB D, AB E, B L, B G, B H, A J }. Partition the set F 4 into subsets G i, i = 1, 2,, where each subset contains only those functional dependencies from F 4 that contain the same left hand side. [2 marks] Build a set S of relation schemas by transforming each subset G i into a relation schema of the form N i (R, K), where N i is the relation schema name, R is the set of attributes, and K is the set of keys. [2 marks] (e) What is the highest normal form that the set of relation schema S of question 4(d) is in? [2 marks] 11

(f) Suppose (U, F ) is a universal relation schema, where U = {A, B, C, D, E, G } and F = {AB C, AC E, AC G, C D, C B, D B }. Suppose that starting from (U, F ) the following set S = {N 1 ({A, B, C }, {AB}), N 2 ({C, D }, {C}), N 3 ({D, B }, {D }), N 4 ({A, C, E, G }, {AC})} of 3NF relation schemas has been produced. If you consider that S is a lossless join decomposition of (U, F ) explain why it is. If you consider that S is not a lossless join decomposition of (U, F ), explain why it is not, and transform it into a new set of relation schemas S that will be at least a 3NF and lossless join decomposition of (U, F ). [6 marks] (g) Questions (a) to (f) represent the steps of a normalization algorithm. What is the name of that algorithm? [1 mark] 12

(h) Consider the following set of relation schemas (the same as in question (f)) S = {N 1 ({A, B, C }, {AB}), N 2 ({C, D }, {C}), N 3 ({D, B }, {D }), N 4 ({A, C, E, G }, {AC})} Suppose all functional dependencies in S are a consequence of relation scheme keys, transform S into a set of relation schemes S by applying the following algorithm: [5 marks] Input: S, F = {X A ( N i (R i, K i ) S)(X K i AX R i } Output: S S := S //Initialization while ( N i (R i, K i ), N j (R j, K j ) S )(i j (X i ) + F = (X j ) + F ) { // Merge N i (R i, K i ) and N j (R j, K j ) into N i (R i, K i ) R i := R i R j K i := K i K j // Replace N i (R i, K i ) and N j (R j, K j ) with N i (R i, K i ) in S S := ((S \ { N i (R i, K i )}) \ { N j (R j, K j )}) { N i (R i, K i )} } (i) What is the highest normal form that the set of relation schema S of question (h) is in? Justify your answer. [5 marks] 13

Question 5. Query Optimisation [40 marks] Consider the following description of a part of the LINGUISTIC_EXPERIMENT database schema, given in Table 1, and the corresponding parameters of the physical database structure, given in Table 2. In Table 2, L is the tuple length (size), f is the blocking factor, B is the block size, r is the number of tuples, and x is either the primary, or secondary key index height. The values of all the Table 2 parameters pertain to the base relations. Note that the same part of the database is also considered in Question 1. Attribute Data Type Max. Length Null Default Relation name: Subject, Primary Key: SubjId SubjId Int 4 N Gender Char 1 N Age Int 2 N Relation name: ParaExp, Primary Key: SubjIdId + ParagraphNum SubjId Int 4 N ParagraphNum Int 2 N Time Bigint 8 N Relation name: SentExp, Primary Key: SubjId+ SentenceNum SubjId Int 4 N SentenceNum Int 2 N WordCount Int 2 N Table 1 Relation schema L f B r x (primary) x (secondary) Subject 7 71 497 13000 2 1 ParaExp 14 35 490 65000 3 - SentExp 8 62 496 130000 3 - For parts (b) to (e), suppose: Table 2 Each subject may have an age between 10 and 69 years (60 different values), and there are an approximately equal number of subjects of every age, Each subject was asked to read 10 sentences, The intermediate results of the query evaluation are materialized, The final result of the query is materialized, The size of each intermediate result block should not exceed 500 bytes, There is a buffer pool of 6000 bytes provided for query processing in the main memory, There are primary indexes on Subject.SubjId, ParaExp.(SubjId + ParagraphNum), and SentExp.(SubjId + SentenceNum) available, and There is a secondary index on Subject.Age available NOTE: Some of the formulae you may need when computing the estimated query costs are given at the end of this exam paper. 14

(a) Draw the heuristic optimization tree that corresponds to the SQL query: SELECT s.subjid, Age, ParagraphNum, Time FROM (Subject s NATURAL JOIN ParaExp p) WHERE Age >= 55; [8 marks] (b) Calculate the lowest execution cost of the query SELECT * FROM Subject WHERE Age >= 55; [5 marks] 15

SPARE PAGE FOR EXTRA S Cross out rough working that you do not want marked. Specify the question number for work you do want marked. 16

(c) Calculate the lowest execution cost of the query SELECT * FROM Subject ORDER BY SubjId; [5 marks] (d) Calculate the lowest execution cost of the query SELECT * FROM Subject s, ParaExp p WHERE s.subjid = p.subjid; [10 marks] 17

SPARE PAGE FOR EXTRA S Cross out rough working that you do not want marked. Specify the question number for work you do want marked. 18

(e) Draw a heuristic optimization tree and calculate the lowest execution cost of the query SELECT SubjId, SUM(WordNum) FROM SentExp GROUP BY SubjId; [12 marks] 19

Question 6. Concurrency Control [20 marks] Suppose a DBMS supports an explicit lock table command with the syntax LOCK [TABLE] <table_name> IN <lock_mode> MODE where lock_mode is one of the following: SHARE ROW EXCLUSIVE, EXCLUSIVE, or SHARE INDEX EXLUSIVE. SHARE ROW EXCLUSIVE mode locks exclusively those rows that are selected by a subsequent SELECT statement. It conflicts with EXCLUSIVE, SHARE INDEX EXCLUSIVE, and other SHARE ROW EXCLUSIVE locks on the rows selected, but does not prevent reading. EXCLUSIVE mode locks exclusively the whole table. It conflicts with SHARE ROW EXCLUSIVE, SHARE INDEX EXCLUSIVE, and other EXCLUSIVE locks, and prevents any reading of the locked table. SHARE INDEX EXCLUSIVE mode locks exclusively those entries in the index that are selected by a subsequent SELECT statement. It conflicts with SHARE ROW EXCLUSIVE, EXCLUSIVE, and other SHARE INDEX EXCLUSIVE locks, but does not prevent reading. (a) Consider the code fragments of the Java JDBC program on the facing page. Note that after a comment is in place of the code that implements the action of the comment. i) Which commands determine transaction boundaries? [3 marks] ii) Which command performs actual locking in the book table and what is locked? [3 marks] iii) What will be the contents of the res object if the command res.next(); returns true? [3 marks] iv) Why is a try catch pair of blocks needed inside the catch block at the end of the code?[3 marks] 20

... Connection con; int j_isbn; try { // Establish a connection with the database... con.setautocommit(false); Statement lockbook=con.createstatement(); lockbook.executeupdate("lock book IN SHARE ROW EXCLUSIVE MODE"); PreparedStatement retbook = con.preparestatement("select * FROM book WHERE isbn=?"); // Obtain data for the variable j_isbn... retbook.setint(1, j_isbn); ResultSet res = retbook.executequery(); if (res.next()) { // Do some processing... // Do some database updating... } con.commit(); con.setautocommit(true); } catch (SQLException ex) { try { con.rollback(); con.setautocommit(true); } catch (SQLException e) { } return "\nsqlexception: " + ex.getmessage(); } // Close connection and exit... 21

(b) Consider the transaction programs T 1 and T 2 on the facing page. The BEGIN COMMIT pair of statements defines the transaction boundaries. Suppose the isolation level is READ COMMITED. Transaction T 2 is normally used to enter assignment marks for all students and the transaction T 1 is run afterwards only if scaling up is needed. However, this time the transactions are run concurrently. i) Concurrent execution of these two transactions leads to a well-known concurrency anomaly. What is the name of this anomaly? [1 mark] ii) The outcome of running T 2 and T 1 as on the facing page will disadvantage student 9090, since his/her marks will appear as if they had been scaled up although they were not. On the other hand, if the Assignment 4 marks for student 9090 had not been published on the web page, the student would have come to ask about his/her marks, which would result in his/her marks being scaled up. Make adjustment(s) to the program T 1 that will result in the most efficient behaviour in a mutiprogramming environment. Make your correction(s) to the program clearly visible. Briefly explain what you have done and why. [7 marks] 22

T 1 T 2 t i m e // Assignment({StudId, AssigNo, Marks}, {StudId, AssigNo}) BEGIN; LOCK Assignment IN SHARE ROW EXCLUSIVE MODE; SELECT * FROM Assignment WHERE AssigNo = 4; StudId AssigNo Marks 3030 4 25.0 1010 4 28.0 (2 rows) // Assignment 4 was too hard; scale up all marks by 20% UPDATE Assignment SET Marks = 1.2*Marks WHERE AssigNo = 4; // Read new marks and publish them on the course web page SELECT * FROM Assignment WHERE AssigNo = 4; StudId AssigNo Marks 3030 4 30.0 1010 4 33.6 9090 4 15.0 (3 rows) COMMIT; // Insert marks of an approved late submission into the database BEGIN; INSERT INTO Assignments VALUES (9090, 4, 15); COMMIT; 23

* * * * * * * * * * * * * * * * * * * SPARE PAGE FOR EXTRA S Cross out rough working that you do not want marked. Specify the question number for work you do want marked. 24 END

Attachment to COMP 302 Degree Examination 2003 FORMULAE FOR COMPUTING QUERY COST ESTIMATE Blocking factor: f = B / L Number of blocks: b = r / f Selection cardinality of the attribute A : s (A) = r / d (A), where d (A) is the number of different A values Number of buffers n = K / B, where K is the size of the buffer pool C (project) = b 1 + b 2 C(project_distinct) = b 1 + b 2 + 2b 2 (1 + log m b 2 ) + b 2 + b 3, m = n - 1 C (select_linear) = b 1 + s / f C (select_sec_key) = x + s + s / f Costs of join algorithms (o stands for the outer loop relation, and i stands for the inner loop relation) C (nested_join) = b o + b i b o / (n - 2) + js * r o * r i / f C (single_join) = b o + r o * f (index i ) + js * r o * r i / f (minimum of 4 buffers required) C (sort_join) = b 1 (3 + 2 log m b 1 ) + b 2 (3 + 2 log m b 2 ) + js * r 1 * r 2 / f, m = n - 1 C(partition_join) = 3 (b 1 + b 2 ) + js * r 1 * r 2 / f, n > (3 + (1 + 4b 1 ) 1/2 )/2, b 1 < b 2 C (sort) = 2b(1 + log m b ) f (index ): primary index: f (index ) = x + 1 secondary index: f (index ): x + s Approximate formulae for choosing the most efficient join algorithm Mandatory condition: b o << b i (o stands for the outer loop relation, and i stands for the inner loop relation) C(single) < C(nested) r o * f (index i) < b i b o /(n 2) // Providing r o * f (index i ) >> b o C(single) < C(sort) r o * f (index i) < b i (3 + 2 log m b i ) // Providing r o * f (index i ) >> b o C(single) < C(partition-hash) r o * f (index i) < 3b i // Providing r o * f (index i ) >> b o C(nested) < C(sort-merge) b o < (n - 2)(3 + 2 log m b i ) C(nested) < C(partition-hash) b o / (n 2 ) < 3 C(sort-merge) < C(partition-hash) 25 END