Database Management Systems

Similar documents
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25-1

Distributed KIDS Labs 1

Distributed DBMS. Concepts. Concepts. Distributed DBMS. Concepts. Concepts 9/8/2014

Chapter 19: Distributed Databases

Chapter 19: Distributed Databases

It also performs many parallelization operations like, data loading and query processing.

DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

Chapter 18: Parallel Databases Chapter 19: Distributed Databases ETC.

Chapter 18: Parallel Databases

Distributed Databases

Multi-CPU and distributed systems. Database system architectures. Lecture topics: monolithic systems client server systems parallel database servers

Database Architectures

Distributed Databases

Distributed Databases Systems

Distributed Databases

Database Architectures

management systems Elena Baralis, Silvia Chiusano Politecnico di Torino Pag. 1 Distributed architectures Distributed Database Management Systems

Mobile and Heterogeneous databases Distributed Database System Transaction Management. A.R. Hurson Computer Science Missouri Science & Technology

Database Management Systems

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

Course Content. Parallel & Distributed Databases. Objectives of Lecture 12 Parallel and Distributed Databases

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Distributed Database Management Systems. Data and computation are distributed over different machines Different levels of complexity

Fundamental Research of Distributed Database

The Replication Technology in E-learning Systems

Database Administration. Database Administration CSCU9Q5. The Data Dictionary. 31Q5/IT31 Database P&A November 7, Overview:

CMSC 461 Final Exam Study Guide

Topics in Reliable Distributed Systems

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

CMP-3440 Database Systems

Security Mechanisms I. Key Slide. Key Slide. Security Mechanisms III. Security Mechanisms II

Distributed Databases. Distributed Database Systems. CISC437/637, Lecture #21 Ben Cartere?e

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions

Specific Objectives Contents Teaching Hours 4 the basic concepts 1.1 Concepts of Relational Databases

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

UNIVERSITY OF BOLTON CREATIVE TECHNOLOGIES COMPUTING PATHWAY SEMESTER TWO EXAMINATION 2014/2015 ADVANCED DATABASE SYSTEMS MODULE NO: CPU6007

CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein. Student ID: UCSC

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Babu Banarasi Das National Institute of Technology and Management

Scaling Database Systems. COSC 404 Database System Implementation Scaling Databases Distribution, Parallelism, Virtualization

Help student appreciate the DBMS scope of function

Distributed Databases

Introduction to the Course

GUJARAT TECHNOLOGICAL UNIVERSITY

Questions about the contents of the final section of the course of Advanced Databases. Version 0.3 of 28/05/2018.

Chapter 9: Concurrency Control

DISTRIBUTED DATABASES

UNIVERSITY OF BOLTON WESTERN INTERNATIONAL COLLEGE FZE BSC (HONS) COMPUTING SEMESTER 1 EXAMINATIONS 2016/2017 ADVANCED DATABASE SYSTEMS

Distributed Transaction Management

DATABASE MANAGEMENT SYSTEMS

Goals for Today. CS 133: Databases. Final Exam: Logistics. Why Use a DBMS? Brief overview of course. Course evaluations

Chapter 11 - Data Replication Middleware

CSE 444: Database Internals. Section 9: 2-Phase Commit and Replication

Advances in Data Management Distributed and Heterogeneous Databases - 2

Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Chapter 18: Database System Architectures.! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems!

CS/B.Tech/CSE/New/SEM-6/CS-601/2013 DATABASE MANAGEMENENT SYSTEM. Time Allotted : 3 Hours Full Marks : 70

Distributed System. Gang Wu. Spring,2018

Consistency in Distributed Systems

Dr. Awad Khalil. Computer Science & Engineering department

Concurrency Control - Two-Phase Locking

Consistency and Scalability

DATABASE DEVELOPMENT (H4)

Lecture 21. Lecture 21: Concurrency & Locking

CSE 544, Winter 2009, Final Examination 11 March 2009

Distributed Meta-data Servers: Architecture and Design. Sarah Sharafkandi David H.C. Du DISC

Distributed Transaction Management. Distributed Database System

Course Book Academic Year

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

CSE 444: Database Internals. Lecture 25 Replication

Chapter 3. Database Architecture and the Web

VII. Database System Architecture

Lecture 12. Lecture 12: The IO Model & External Sorting

Overview of Transaction Management

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

XI. Transactions CS Computer App in Business: Databases. Lecture Topics

Outline. Non-Standard Database Systems. Distributed Database System. Outline. Distributed Databases

Distributed Databases Systems

Distributed Databases

Chapter 20: Database System Architectures

CS 347 Parallel and Distributed Data Processing

Problems Caused by Failures

Advanced Databases: Parallel Databases A.Poulovassilis

Mobile and Heterogeneous databases

Distributed Systems 24. Fault Tolerance

Monitoring and Resolving Lock Conflicts. Copyright 2004, Oracle. All rights reserved.

Distributed Systems. 19. Fault Tolerance Paul Krzyzanowski. Rutgers University. Fall 2013

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM

Bsc(Hons) Web Technologies. Examinations for / Semester 2

MIS Database Systems.

Department of Computer Science Final Exam, CS 4411a Databases II

BIS Database Management Systems.

CPS352 Lecture: Distributed Database Systems last revised 4/20/2017. Materials: Projectable showing horizontal and vertical partitioning

Distributed Database Systems

CS 347: Distributed Databases and Transaction Processing Notes07: Reliable Distributed Database Management

Distributed Database Systems

Module 15: Managing Transactions and Locks

Transcription:

Database Management Systems Distributed Databases Doug Shook

What does it mean to be distributed? Multiple nodes connected by a network Data on the nodes is logically related The nodes do not need to be homogeneous The network matters a lot: Topology Proximity 2

Transparency Do our users need to be aware of the fact that a database is distributed? Data organization transparency Location transparency Naming transparency Replication transparency Fragmentation transparency Horizontal Vertical 3

Autonomy Can nodes operate independently? Design autonomy Communication autonomy Execution autonomy 4

Reliability and Availability Reliability: probability that a system is running Availability: probability that the system is continuously available How do we construct reliable systems? Fault tolerance 5

Advantages Flexibility for large operations Increased reliability and availability Improved performance Data localization Easier Expansion 6

Additional Functions Manage data distribution Distributed query processing Replication management Distributed transaction management Recovery Security 7

Distributed Database Properties Degree of homogeneity Degree of local autonomy 8

Architecture 9

Schema Architecture 10

Component Architecture 11

Data Fragmentation Where should the data go? Break the data down into logical units What pieces of our database make sense as logical units? 12

Types of Fragmentation Horizontal Fragmentation Break a relation down into subsets of tuples Specify a set of conditions for this purpose What about relationships? Vertical Fragmentation Break a relation down into subsets of columns How do we reconstruct the original tuples? We can use relational operations to specify fragmentation Which ones? Which operations will reconstruct fragmented tuples? 13

Replication and Allocation Replication improves availability How? Should we replicate everything? Pros? Cons? 14

Example Three sites Site 1 is HQ, accesses all data Site 2 is the home of department 5 Needs access to employee and project info Site 3 is the home of department 4 Needs access to employee and project info Which part of the DB should be at each site? What kind of fragmentation is required? What should we do if an employee from department 5 works on a project owned by department 4? 15

Query Processing Processed in stages Mapping Localization Global Optimization What are we optimizing here? Local Optimization 16

Data Transfer Costs 17

Data Costs We submit our query at a third site Assume each record in the result is 40 bytes long Three options Transfer both relations to third site, perform the join Transfer EMPLOYEE to site 2, perform the join there, send the results Transfer department to site 1, perform the join there, send the results Which option results in the least amount of data being sent? 18

Practice Problem Redo the following example using the following query: In the previous examples we assumed that the queries were being submitted from a third site. What if the query is being submitted from site 2? Compute the data costs. 19

Decomposition If multiple copies exist, which one should we read from? What if we need to update that information? Where do we go to find out how many copies of a piece of information there are and where they are stored? 20

Transaction Management Since data will be accessed from many sites, we need a way to coordinate Global transaction manager is in charge of each transaction Unfortunately, two phase commits have some problems in this setup... Like what? 21

Three Phase Commit Break the commit into two phases: Prepare to commit Commit Prepare-to-commit All participants vote on whether the commit should happen If yes, then proceed to commit as normal What do we do on a crash? 22

Concurrency Control Problems: Multiple copies of the same data Failure and recovery of individual sites Failure of communication Distributed Commits Distributed Deadlock 23

Distinguished Copy How do we track locks for distributed items? Idea: make a distinguished copy All locking requests are sent to this copy Where does this distinguished copy live? Called the coordinator site 24

Coordinator Sites Primary Site One site has all the distinguished copies Disadvantages? Primary with backup Primary Copy What happens when a site goes down? 25

Voting Method No distinguished copy Lock requests are sent to all sites with the item Must receive a majority of locks Advantages? Disadvantages? 26

Distributed Catalogs How do we track catalog information with multiple sites? One copy? Many copies? Centralized vs. Replicated 27

Practice Problem Which of the following schedules is conflict serializable? Determine an equivalent serial schedule 28

Practice Problem Sketch the conflict graph for each schedule below Come up with an equivalent serial schedule, if possible. 29

Practice Problem For the following transactions, determine what locks would need to be acquired, then come up with a schedule that would satisfy two phase locking. 30

Practice Problem Could the following schedule suffer from deadlock? Explain how you know. If deadlock is a problem, explain how it could be resolved. 31

Practice Problem Could the following schedule suffer from deadlock? Explain how you know. If deadlock is a problem, explain how it could be resolved. 32

Practice Problem We wish to distribute our course tracking database so that each department is considered a separate site. Explain what kinds of partitioning would need to take place to accomplish this. Consider a query that reports all of the courses that a student has taken across all departments What should the data transmission pattern be? Will distribution improve or degrade the performance of this query? 33