Type of course: Elective
SUBJECT NAME: Distributed DBMS
SUBJECT CODE: 21714
B.E. 7th SEMESTER

Prerequisite: Database Management Systems & Networking

Rationale: Students are familiar with centralized DBMS. This subject introduces Distributed DBMS and its associated problems. Students will be able to understand various algorithms and techniques for managing a distributed database.

Teaching and Examination Scheme:
Teaching Scheme: L 3, T 0, P 2
Credits (C): 5
Theory Marks: ESE (E) 70; PA (M): PA 20, ALA 10
Practical Marks: ESE (V): ESE 20, OEP 10; PA (I) 20
Total Marks: 150

Content:
Sr. No. / Content / Total Hrs / % Weightage
1 Introduction: Distributed Data Processing, Distributed Database Systems, Promises of DDBSs, Complicating factors, Problem areas. Total Hrs: 03; % Weightage: 7
2 Overview of RDBMS: Concepts, Integrity, Normalization. Total Hrs: 02; % Weightage: 5
3 Distributed DBMS Architecture: Models: Autonomy, Distribution, Heterogeneity; DDBMS Architecture: Client/Server, Peer to Peer, MDBS. Total Hrs: 03; % Weightage: 7
4 Data Distribution Alternatives: Design Alternatives: localized data, distributed data; Fragmentation: vertical, horizontal (primary & derived), hybrid, general guidelines, correctness rules; Distribution transparency: location, fragmentation, replication; Impact of distribution on user queries: no Global Data Dictionary (GDD), GDD containing location information; Example on fragmentation. Total Hrs: 05; % Weightage: 15
5 Semantic Data Control: View Management; Authentication: database authentication, OS authentication; Access Rights; Semantic Integrity Control: Centralized & Distributed; Cost of enforcing semantic integrity. Total Hrs: 03; % Weightage: 10
6 Query Processing: Query Processing Problem; Layers of Query Processing; Query Processing in Centralized Systems: Parsing & Translation, Optimization, Code Generation, Example; Query Processing in Distributed Systems: Mapping global query to local, Optimization. Total Hrs: 04; % Weightage: 10
7 Optimization of Distributed Queries: Query Optimization, Centralized Query Optimization, Join Ordering, Distributed Query Optimization Algorithms. Total Hrs: 06; % Weightage: 10
8 Distributed Transaction Management & Concurrency Control: Transaction concept, ACID property, Objectives of transaction management, Types of transactions, Objectives of Distributed Concurrency Control, Concurrency Control anomalies, Methods of concurrency control, Serializability and recoverability, Distributed Serializability, Enhanced lock based and timestamp based protocols, Multiple granularity, Multi version schemes, Optimistic Concurrency Control techniques. Total Hrs: 08; % Weightage: 18
9 Distributed Deadlock & Recovery: Deadlock concept, Deadlock in Centralized Systems, Deadlock in Distributed Systems: Detection, Prevention, Avoidance, Wait-Die algorithm, Wound-Wait algorithm; Recovery in DBMS: Types of Failure, Methods to control failure, Different techniques of recoverability, Write-Ahead Logging Protocol; Advanced recovery techniques: Shadow Paging, Fuzzy checkpoint, ARIES; RAID levels; Two Phase and Three Phase commit protocols. Total Hrs: 08; % Weightage: 18

Suggested Specification table with Marks (Theory):
Distribution of Theory Marks
R Level: 10; U Level: 15; A Level: 15; N Level: 15; E Level: 10; C Level: 5
Legends: R: Remembrance; U: Understanding; A: Application; N: Analyze; E: Evaluate; C: Create and above levels (Revised Bloom's Taxonomy)
Note: This specification table shall be treated as a general guideline for students and teachers. The actual distribution of marks in the question paper may vary slightly from the above table.

Reference Books:
1. Principles of Distributed Database Systems, Ozsu, Pearson Publication
2. Distributed Database Management Systems, Rahimi & Haug, Wiley
3. Distributed Database Systems, Chanda Ray, Pearson Publication
4. Distributed Databases, Sachin Deshpande, Dreamtech
Course Outcome:
After learning the course the students should be able to:
Understand what a Distributed DBMS is
Understand various architectures of DDBMS
Apply various fragmentation techniques given a problem
Understand and calculate the cost of enforcing semantic integrity control
Understand the steps of query processing
Understand how optimization techniques are applied to distributed databases
Learn and understand various query optimization algorithms
Understand transaction management and compare various approaches to concurrency control in distributed databases
Understand various algorithms and techniques for deadlock and recovery in distributed databases

List of Experiments:
[1] Create two databases on a single DBMS, design the database so that it is fragmented across both, share the fragments from both databases, and write a single query that creates a view.
[2] Create two databases on two different computer systems and create a database view to generate a single DDB.
[3] Create various views using any one example database and design various constraints.
[4] Write and implement an algorithm for query processing using any example in C/C++/Java/.NET.
[5] Using any example, write various transaction statements and show the concurrency control information [i.e. the various locks from the dictionary] by executing multiple updates and queries.
[6] Using transaction commit/rollback, show the ACID properties of a transaction.
[7] Write a Java JDBC program and use JTA to show various isolation levels in a transaction.
[8] Implement the Two Phase Commit Protocol.
[9] Case study on NoSQL
[10] Case study on Hadoop

Design based Problems (DP)/Open Ended Problem:
1. A countrywide drug supplier chain operates from five different cities in the country and maintains the following database:
Shop(ds-id, ds-city, ds-contactno)
Medicine(med-id, med-name, manuf-id)
Manufacturer(manuf-id, manuf-name, manuf-city)
Order(med-id, ds-id, qty)
Suggest a fragmentation and allocation schema considering the following frequent queries:
(a) List the names of manufacturers located in the same city as a drug shop that has placed an order.
(b) How many orders are generated from a given city, say Ahmedabad?
Justify your design and clearly mention assumptions, if any.
2. Consider relations EMP(eno, ename, title) and ASG(eno, pno, resp, dur). Write suitable queries in SQL-like syntax and in relational algebra for finding the names of employees who are managers of any project. Is the query optimized? If not, optimize it.

Major Equipment: Networking of computers, RDBMS

List of Open Source Software/learning website:
1. https://docs.oracle.com/cd/b10501_01/server.920/a96521/ds_concepts.htm
2. https://cs.uwaterloo.ca/~tozsu/ddbook/presentation-slides.php
3. http://www.tutorialspoint.com/distributed_dbms/distributed_dbms_databases.htm

ACTIVE LEARNING ASSIGNMENTS: Preparation of power-point slides, which include videos, animations, pictures and graphics for better understanding of theory and practical work. The faculty will allocate chapters/parts of chapters to groups of students so that the entire syllabus is covered. The power-point slides should be put up on the web-site of the College/Institute, along with the names of the students of the group, the name of the faculty, Department and College on the first slide. The best three works should be submitted to GTU.
BE - SEMESTER VII (NEW) - EXAMINATION SUMMER 2017
Subject Code: 21714
Date: 11/05/2017
Subject Name: Distributed DBMS (Departmental Elective - II)
Time: 02.30 PM to 05.00 PM
Total Marks: 70

Q.1 (a) Define Distributed Database System. What is transparency in DDBMS? Explain the layers of transparency in DDBMS.
(b) Describe the functional layers of a relational DBMS.
Q.2 (a) Elaborate the Peer to Peer distributed architecture.
(b) Describe the Bond Energy Algorithm (BEA) in vertical fragmentation.
OR
(b) Explain different types of failures in DDBMS.
Q.3 (a) What do you mean by distributed semantic integrity control? Explain with an example.
(b) Draw and explain query processing in a centralized system.
OR
Q.3 (a) What is authorization control? How is authorization control applied in centralized and distributed environments?
(b) Write a short note on: Phases of Distributed Query Processing.
Q.4 (a) Discuss query optimization in brief.
(b) Define a transaction. Discuss the ACID properties of a transaction.
OR
Q.4 (a) Discuss join ordering in query optimization.
(b) Compare locking-based and timestamp-based concurrency control protocols with the help of examples.
Q.5 (a) Explain serializable schedules with the help of an example.
(b) Discuss the wait-die and wound-wait deadlock avoidance algorithms for distributed systems.
OR
Q.5 (a) Explain the Two Phase Commit Protocol (2PC).
(b) What is allocation? List and explain the information requirements during allocation.
BE - SEMESTER VII EXAMINATION WINTER 2014
Subject Code: 171602
Date: 27-11-2014
Subject Name: Distributed Database Application and System
Time: 10:30 am - 01:00 pm
Total Marks: 70

Q.1 (a) Explain mixed fragmentation with the help of an example.
(b) What are distributed systems? List two advantages and two disadvantages of a distributed system over a centralized one. Also explain parallelism and transparency.
Q.2 (a) Explain the Top Down Design Process for Distributed Database Design.
(b) Explain the following in the context of relational algebra: 1. Selection 2. Natural Join 3. Intersection
OR
(b) Explain the terms transparency and concurrency with respect to DDBMS. Also explain the layers of transparency.
Q.3 (a) Write a note on the components of DDBMS.
(b) Explain loosely coupled systems and tightly coupled systems.
OR
Q.3 (a) What are the various types of networks? Explain in brief.
(b) Explain Transaction Management in DDBMS.
Q.4 (a) What do you mean by two phase locking? How is it different from strict two phase locking? Explain in brief.
(b) What is replication? Explain different types of replication techniques.
OR
Q.4 (a) Discuss the problems of query optimization.
(b) Write a note on concurrency control.
Q.5 (a) Given a global relation EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM), write the mixed fragmentation definition and fragmentation tree of relation EMP.
(b) Write a short note on: Reliability in Distributed DBMS
OR
Q.5 (a) List the steps of Query Decomposition. Explain any one of them.
(b) Explain the client-server architecture for Distributed DBMS with a figure.
BE - SEMESTER VII (OLD) - EXAMINATION SUMMER 2017
Subject Code: 171602
Date: 06/05/2017
Subject Name: Distributed Database Application & System
Time: 02:30 PM to 05:00 PM
Total Marks: 70

Q.1 (a) What is a distributed database system? Explain shared memory, shared disk and shared nothing multiprocessor systems with neat sketches.
(b) Explain different relational algebra operations with proper example(s).
Q.2 (a) What do you mean by normalization? Enlist and explain various normal forms with appropriate example(s).
(b) Discuss in detail the problem areas in the DDBS environment.
OR
(b) What are the various types of computer networks? Explain in detail.
Q.3 (a) Explain layers of transparency.
(b) Describe architectural models for distributed DBMSs.
OR
Q.3 (a) Elaborate MDBS architecture without GCS.
(b) Explain distribution design issues in detail.
Q.4 (a) Discuss the top-down design process in detail.
(b) Explain views in centralized DBMSs with example(s).
OR
Q.4 (a) Explain the client/server reference architecture.
(b) Write a note on: allocation model
Q.5 (a) Describe horizontal fragmentation with its various types.
(b) Elaborate layers of query processing.
OR
Q.5 (a) Discuss the query processing system with example(s).
(b) Explain centralized semantic integrity control with examples.
BE - SEMESTER VII EXAMINATION SUMMER 2014
Subject Code: 171602
Date: 29-05-2014
Subject Name: Distributed Database Application and System
Time: 02:30 pm - 05:00 pm
Total Marks: 70

Q.1 (a) Explain the potential problems with DDBMS.
(b) Explain the Top Down Design Process for Distributed Database Design.
Q.2 (a) Explain the following in the context of relational algebra: 1. Selection 2. Natural Join 3. Intersection
(b) What are the objectives of query processing?
OR
(b) Explain layers of query processing.
Q.3 (a) Consider the following relations:
employee(person-name, street, city)
works(person-name, company-name, salary)
company(company-name, city)
manages(person-name, manager-name)
Write the following queries in relational algebra form.
(1) Find the names of all employees who work for HDFC.
(2) Find the names of all employees who live in the same city as the company for which they work.
(3) Find the names and cities of residence of all employees who do not work for HDFC and earn more than Rs 10 lac per year.
(b) What is fragmentation? Why is it needed? What are the correctness rules for fragmentation?
OR
Q.3 (a) What do you mean by distributed semantic integrity control? Explain with an example.
(b) Describe the BEA algorithm used in vertical fragmentation.
Q.4 (a) What is Query Optimization? List the components of Query Optimizer software and explain any one.
(b) Explain the first phase of query processing, which transforms a relational calculus query into a relational algebra query.
OR
Q.4 (a) What is allocation? List and explain the information requirements during allocation.
(b) Explain the distributed cost model with a suitable example and determine total time as well as response time.
Q.5 (a) List various Transaction Models. Explain any two in detail.
(b) Discuss fundamental issues in distributed database design. Give at least 3 differences between replicated and partitioned databases.
OR
Q.5 (a) Write a short note on: MDBS architecture
(b) Write a short note on: Two Phase Commit Protocol for Distributed Transactions
BE - SEMESTER VII EXAMINATION SUMMER 2015
Subject Code: 171602
Date: 04/05/2015
Subject Name: Distributed Database Application & System
Time: 02:30 to 05:00
Total Marks: 70
BE - SEMESTER VII EXAMINATION WINTER 2013
Subject Code: 171602
Date: 28-11-2013
Subject Name: Distributed Database Application and System
Time: 10:30 to 01:00
Total Marks: 70

Q.1 (a) Explain various levels of data and process distribution in a distributed environment.
(b) Explain the Top Down Design Process for Distributed Database Design.
Q.2 (a) Explain the following in the context of relational algebra: 1. Selection 2. Θ Join 3. Intersection
(b) Explain database interoperability.
OR
(b) Explain layers of query processing.
Q.3 (a) Consider the following relations:
employee(person-name, street, city)
works(person-name, company-name, salary)
company(company-name, city)
manages(person-name, manager-name)
Write the following queries in relational algebra form.
(1) Find the names of all employees who work for HDFC.
(2) Find the names of all employees who do not live in the same city as the company for which they work.
(3) Find the names and cities of residence of all employees who work for HDFC and earn more than Rs 10 lac per year.
(b) Draw and explain the DDBS design process to be used when designing a DDBS from scratch.
OR
Q.3 (a) Describe the BEA algorithm used in vertical fragmentation.
(b) Explain view management. How is it carried out in centralized and distributed DBMSs?
Q.4 (a) List the steps of Query Decomposition. Explain any one of them.
(b) What is Query Optimization? List the components of Query Optimizer software and explain any one.
OR
Q.4 (a) What is allocation? List and explain the information requirements during allocation.
(b) Explain the architecture of Distributed DBMS with the necessary diagram.
Q.5 (a) Explain Transaction Management in DDBMS.
(b) Discuss fundamental issues in distributed database design. Give at least 3 differences between replicated and partitioned databases.
OR
Q.5 (a) Write a short note on: Reliability in Distributed DBMS
(b) Write a short note on: MDBS architecture
BE - SEMESTER VII (OLD) EXAMINATION WINTER 2016
Subject Code: 171602
Date: 25/11/2016
Subject Name: Distributed Database Application & System
Time: 10:30 AM to 01:00 PM
Total Marks: 70

Q.1 (a) Consider a stock market with many brokers and customers trading. Design a distributed database for it, with a fragmentation and allocation strategy.
(b) List and briefly explain the types of transparency in a DDBS environment.
Q.2 (a) Explain MDBMS architecture for distributed DBMS.
(b) Explain the top-down design process in detail.
OR
(b) Explain the Peer to Peer distributed system.
Q.3 (a) Explain vertical fragmentation with an example.
(b) Explain views in centralized DBMSs with examples.
OR
Q.3 (a) Explain the functional schematics of an integrated distributed DBMS.
(b) Explain centralized semantic integrity control with an example.
Q.4 (a) Explain layers of query processing.
(b) Describe the relative advantages and disadvantages of distributed DBMS.
OR
Q.4 (a) Explain centralized query optimization.
(b) What do you mean by normalization? Explain any two normal forms with examples.
Q.5 (a) Explain the distributed cost model with an example.
(b) Briefly describe the characterization of query processors.
OR
Q.5 (a) Discuss how deadlock management is performed in a distributed DBMS.
(b) Define a transaction. Discuss the ACID properties of a transaction in the context of DDBs.
BE - SEMESTER VII EXAMINATION SUMMER 2016
Subject Code: 171602
Date: 18/05/2016
Subject Name: Distributed Database Application & System
Time: 02:30 PM to 05:00 PM
Total Marks: 70

Q.1 (a) Discuss the DDBS environment. How does it differ from a centralized environment?
(b) State the need for normalization in a database management system. Write and explain any two normal forms with appropriate examples. Assume suitable data.
Q.2 (a) Discuss various components of a distributed DBMS.
(b) (1) Explain shared disk and shared memory multiprocessor systems with neat sketches. [04]
(2) Let R be an original relation with appropriate primary key constraint(s). R is decomposed into R1 and R2 using horizontal fragmentation. R1 is further divided into R11 & R12, and R2 into R21, R22 & R23, with the help of vertical fragmentation. Which operation should be performed, join or union, to get the original R2 from R21, R22 and R23 without any loss of data? Justify your answer. [03]
OR
(b) Write a detailed note on horizontal fragmentation.
Q.3 (a) Explain MDBS architecture without a GCS.
(b) (1) Discuss the selection operation and join operation in the context of relational algebra. [04]
(2) Give a brief discussion of network transparency. [03]
OR
Q.3 (a) Describe the allocation problem in detail.
(b) (1) Briefly explain (A0, D0, H0) and (A0, D0, H1) with respect to architectural alternatives of distributed DBMSs. [04]
(2) Briefly explain network topology. [03]
Q.4 (a) Explain views in centralized DBMSs with example(s).
(b) Elaborate the Client/Server reference architecture.
OR
Q.4 (a) Discuss the following terms in detail with examples: Attribute Usage Matrix, Attribute Affinity Matrix
(b) Elaborate centralized and distributed authorization control.
Q.5 (a) Explain layers of query processing.
(b) Write a note on parallel database systems.
OR
Q.5 (a) List the steps of query decomposition and discuss them in brief.
(b) Write a note on distributed concurrency control.
BE - SEMESTER VII EXAMINATION WINTER 2015
Subject Code: 171602
Date: 16/12/2015
Subject Name: Distributed Database Application and System
Time: 10:30 am to 1:00 pm
Total Marks: 70

Q.1 (a) Consider an international multiplex movie house with branches all over the world. Discuss the design of its distributed database, with the type of fragmentation and allocation strategy.
(b) List and briefly explain the advantages of a distributed database compared with the centralized approach.
Q.2 (a) Compare all architectures for distributed DBMS, with their applications.
(b) Explain the top-down design process in detail.
OR
(b) Explain the client-server architecture for Distributed DBMS.
Q.3 (a) Explain minterm predicates and the COM_MIN and PHORIZONTAL algorithms in the context of horizontal fragmentation.
(b) Explain views in centralized DBMSs with examples.
OR
Q.3 (a) Explain the AA matrix, CA matrix and BEA algorithm in the context of vertical fragmentation.
(b) What is a distributed integrity assertion? Explain any one assertion with an example.
Q.4 (a) Explain layers of query processing.
(b) Explain elimination of redundancy and rewriting in the context of query decomposition.
OR
Q.4 (a) Explain centralized query optimization.
(b) What are the various types of networks? Explain in detail.
Q.5 (a) Explain the distributed cost model with an example.
(b) Briefly describe the characterization of query processors.
OR
Q.5 (a) Differentiate: 1) Data Warehousing vs Distributed Database 2) Distributed Processing vs Cooperative Processing
(b) Explain reduction for primary horizontal fragmentation in the localization of distributed data.