COURSE 12. Parallel DBMS


Parallel DBMS. Historically, much DB research focused on specialized hardware: CCD Memory: non-volatile memory similar to, but slower than, flash memory. Bubble Memory: non-volatile memory similar to, but slower than, flash memory. Head-per-track Disks: hard disks with one head per data track. Optical Disks: similar to, but higher capacity than, CD-RWs.

Parallel DBMS (cont). In the past, database machines were custom-built to be highly parallel: Processor-per-track: one processor is associated with each disk track, so searching and scanning can be done in one disk rotation. Processor-per-head: one processor is associated with each disk read/write head, so searching and scanning can be done as fast as the disk can be read. Off-the-disk: multiple processors are attached to a large cache into which records are read and then processed.

Parallel DBMS (cont). Hardware did not fulfill its promises, so supporting these three parallel machine designs didn't seem feasible in the future. Also, disk I/O bandwidth was not increasing fast enough relative to processor speed to keep these machines working at the higher speeds. The prediction was that off-the-shelf components would soon be used to implement database servers, and that parallelism would become a thing of the past as higher processing speeds made it unnecessary.

Parallel DBMS (cont). The prediction was wrong! The focus is now on using off-the-shelf components. CPUs double in speed every 18 months, while disks double in speed much more slowly; I/O is still a barrier, but is now much better. Parallel processing is still a very useful way to speed up operations, and we can implement parallelism on mass-produced, modular database machines instead of custom-designed ones.

Parallel DBMS (cont). Why parallel database systems? They are ideally suited to relational data, which consists of uniform operations on data streams. Output from one operation can be streamed as input to the next operation, and data can be worked on by multiple processors/operations at once. Both allow higher throughput. This is good! Both require high-speed interconnects between processors/operations. This could be bad!

Definitions. Pipeline parallelism: many machines each doing one step in a multi-step process. Two operators working in series on a data set: output from one operator is piped into another operator to speed up processing. Example: a Sequential Scan fed into a Projection fed into a Select, etc.

Definitions. Partition parallelism: many machines doing the same thing to different pieces of data. Many operators working together on one operation; each operator does some of the work on the input data and produces output. Example: a table is partitioned over several machines, and one scan operator runs on each machine to feed input to a Sort.
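The contrast between the two kinds of parallelism can be sketched in a few lines of Python. This is a single-machine illustration with made-up table data, not an excerpt from any real system: the generator chain mimics the dataflow of a pipeline, and the process pool mimics the same scan operator running over each partition.

```python
# Hypothetical data and operator names; a sketch of the two kinds of parallelism.
from multiprocessing import Pool

# --- Pipeline parallelism: operators in series, tuples stream between them ---
def scan(table):                       # sequential scan operator
    for row in table:
        yield row

def select(rows, pred):                # selection operator fed by the scan
    return (r for r in rows if pred(r))

def project(rows, cols):               # projection operator fed by the selection
    return (tuple(r[c] for c in cols) for r in rows)

# --- Partition parallelism: the same operator runs on each data partition ---
def scan_partition(partition):         # one scan per partition/"machine"
    return [r for r in partition if r[1] == 23]    # e.g. "age = 23"

if __name__ == "__main__":
    table = [("alice", 23), ("bob", 41), ("carol", 23), ("dave", 35)]

    # Pipeline: scan -> select -> project, output of one piped into the next.
    pipeline = project(select(scan(table), lambda r: r[1] == 23), cols=[0])
    print(list(pipeline))

    # Partition: split the table across "machines" and run the scan on each.
    partitions = [table[0::2], table[1::2]]        # round-robin split
    with Pool(2) as pool:
        results = pool.map(scan_partition, partitions)
    print([r for part in results for r in part])
```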

DBMS: The Parallel Success Story. DBMSs are the most (only?) successful application of parallelism: Teradata and Tandem vs. Thinking Machines, KSR, etc. Every major DBMS vendor has some parallel server, and workstation manufacturers now depend on parallel DB server sales. Reasons for success: bulk processing (= partition parallelism), natural pipelining, inexpensive hardware can do the trick, and users/app programmers don't need to think in parallel.

Mainframes vs. Multiprocessor Machines. Mainframes are unable to provide the required high-speed processing and interconnects, and are unable to scale once built; thus, their total power is defined when they are first constructed. Multiprocessor (parallel) systems are based on inexpensive off-the-shelf components and can grow incrementally through the addition of more memory, disks, or processors when scale needs to be increased. Mainframes are shared-disk or shared-memory systems; multiprocessor systems are shared-nothing systems.

Architecture Issue: Shared What? Shared Memory (SMP): all processors have direct access to a common global memory and to all hard disks. Example: a single workstation. Characteristics: easy to program, expensive to build, difficult to scale up. DBMSs: Sequent, SGI, Sun.

Architecture Issue: Shared What? (cont). Shared Disk: each processor has its own private memory, but all processors have access to all hard disks. Example: a network of servers accessing a SAN (Storage Area Network) for storage. DBMSs: VMScluster, Sysplex.

Architecture Issue: Shared What? (cont). Shared Nothing (network): each processor has its own private memory and disk, and acts as a server for the data stored on that disk. Example: a network of FTP mirrors. Characteristics: hard to program, cheap to build, easy to scale up. DBMSs: Tandem, Teradata, SP2.

What Systems Work This Way. Shared Nothing: Teradata (400 nodes), Tandem (110 nodes), IBM SP2/DB2 (128 nodes), Informix/SP2 (48 nodes), AT&T & Sybase (? nodes). Shared Disk: Oracle (170 nodes), DEC Rdb (24 nodes). Shared Memory: Informix (9 nodes), RedBrick (? nodes).

Architecture Issue. The trend in parallel database systems is towards shared-nothing designs: minimal interference from minimal resource sharing; they don't require an overly powerful interconnect; they don't move large amounts of data through the interconnect, only questions and answers; traffic is minimized by reducing data at one processor before sending it over the interconnect; and they can scale to hundreds or thousands of processors without increasing interference (though network performance may slow).

Some Terminology. Speed-Up: more resources means proportionally less time for a given amount of data. Scale-Up: if resources are increased in proportion to the increase in data size, the time stays constant. (The accompanying plots show transactions/sec (throughput) and sec/transaction (response time) against the degree of parallelism, with the ideal curves marked.) Ideally, a parallel system would experience linear speed-up and linear scale-up.

Speed-Up. Speed-Up = (small system processing time) / (large system processing time). Speed-up is linear if an N-times larger system executes a task N times faster than the smaller system running the same task (i.e. the above ratio evaluates to N). Hold the problem size constant and grow the system. Example: Speed-up = 100/50 = 2, and the system size increased by a factor of 2, so this is linear speed-up.

Scale-Up. Scale-Up = (small system processing time on the small problem) / (large system processing time on the big problem). Scale-up is linear if an N-times larger system executes a task N times larger in the same time as the smaller system running the smaller task (i.e. the above ratio evaluates to 1). Scale-up grows both the problem and the system size. Example: Scale-up = 100/100 = 1; the system and problem size both increased by a factor of 2 and the scale-up is 1, so we have linear scale-up.
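The two definitions are easy to mix up, so here is a minimal sketch using the example numbers from the slides above (100 s on the small system; 50 s for the same task and 100 s for the doubled task on the doubled system):

```python
# Minimal helpers matching the slide definitions; the numbers are the slide examples.
def speedup(small_time, large_time):
    # N-times larger system, same problem: linear if the result equals N
    return small_time / large_time

def scaleup(small_time_small_problem, large_time_big_problem):
    # N-times larger system, N-times larger problem: linear if the result equals 1.0
    return small_time_small_problem / large_time_big_problem

print(speedup(100, 50))    # 2.0 -> linear speed-up on a 2x larger system
print(scaleup(100, 100))   # 1.0 -> linear scale-up
```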

Barriers to Linear Speed-up/Scale-up. Startup: the time needed to start up the operation (i.e. launch the processes). Interference: slowdowns imposed on processes through access to shared resources. Skew: as the number of parallel operations increases, the variance in operation size increases too (the job is only as fast as the slowest operation!).

Architectures and Speed-up/Scale-up. Shared-nothing architectures experience near-linear speed-ups and scale-ups; some time must still be spent coordinating the processes and sending requests and replies across the network. Shared-memory systems do not scale well: operations interfere with each other when accessing memory or disks. Shared-disk systems do not scale well: they must deal with the overhead of locking, concurrent updates, and cache consistency.

Architectures and Speed-up/Scale-up. The shortcomings of shared-memory and shared-disk systems make them unattractive next to shared-nothing systems. Why are shared-nothing systems only now becoming popular? Modular components have only recently become available, and most database software is written for uniprocessor systems and must be converted to receive any speed boost.

Remember Parallelism? Factors limiting pipelined parallelism: relational pipelines are rarely very long (chains past length 10 are rare); some operations do not provide output until they have consumed all of their input (e.g. sorts and aggregation); and the execution cost of some operations is greater than that of others (i.e. skew). Thus, pipelined parallelism often will not provide a large speed-up!

Remember Parallelism? Partitioned execution offers better speed-up and scale-up. Data can be partitioned so that inputs and outputs can be consumed and produced in a divide-and-conquer manner. Data partitioning is done by placing data fragments on several different disks/sites; the disks are read and written in parallel, which provides high I/O bandwidth without any specialized hardware.

Automatic Data Partitioning. Round-robin Partitioning: data is fragmented in a round-robin fashion (i.e. the first tuple goes to site A, then site B, then site C, then site A again, etc.). Ideal if a table is to be sequentially scanned (spreads the load). Poor performance if a query wants to access data associatively (e.g. find all students of age 23).

Automatic Data Partitioning (cont). Hash Partitioning: data is fragmented by applying a hash function to an attribute of each row; the function specifies the placement of the row on a specific disk. Ideal if a table is to be sequentially or associatively accessed (e.g. all students of age 23 land on the same disk, which saves the overhead of starting a scan at each disk).

Automatic Data Partitioning (cont). Range Partitioning: tuples with nearby values of the partitioning attribute are clustered together on disk. Ideal for sequential and associative access, and also ideal for clustered access (again, it saves the overhead of beginning scans on other disks, and it does not require a sort!). Risk of data skew (all data ends up in one partition) and execution skew (all the work is done by one disk); a good partitioning scheme is needed to avoid this.
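A minimal sketch of the three placement strategies, assuming a hypothetical five-site layout with the A..E / F..J / K..N / O..S / T..Z ranges pictured on the slides; the names and bounds are illustrative only:

```python
# Illustrative placement functions for a hypothetical 5-site setup.
from bisect import bisect_right

N_SITES = 5

def round_robin_site(tuple_index):
    # tuple i goes to site i mod N: spreads load for sequential scans
    return tuple_index % N_SITES

def hash_site(key):
    # hash on the partitioning attribute: good for associative access
    # (Python's built-in hash is used here only as a stand-in)
    return hash(key) % N_SITES

RANGE_BOUNDS = ["F", "K", "O", "T"]   # A..E | F..J | K..N | O..S | T..Z

def range_site(key):
    # clustered placement: good for range and associative access, risks skew
    return bisect_right(RANGE_BOUNDS, key[0].upper())

for name in ["Adams", "Garcia", "Lopez", "Quinn", "Walker"]:
    print(name, "-> site", range_site(name))
```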

Partitioning Conclusions. Partitioning raises several new design issues: databases must now have a fragmentation strategy and a partitioning strategy. Partitioning helps us in several ways: increased partitioning decreases response time (usually, and up to a certain point) and increases throughput; sequential scans are faster as more processors and disks work on the problem in parallel; associative scans are faster as fewer tuples are stored at each node, meaning a smaller index is searched.

Partitioning Conclusions (cont). Partitioning can actually increase response time. This occurs when starting a scan at a site becomes a major part of the actual execution time. This is bad! Data partitioning is only half the story: we've seen how partitioning data can speed up processing, but how about partitioning relational operators?

Different Types of DBMS Parallelism. Intra-operator parallelism: get all machines working to compute a given operation (scan, sort, join). Inter-operator parallelism: each operator may run concurrently on a different site (exploits pipelining). Inter-query parallelism: different queries run on different sites. We'll focus on intra-operator parallelism.

Parallelizing Relational Operators. Objective: use existing operator implementations without modification. Only three mechanisms are needed: operator replication, a merge operator, and a split operator. The result is a parallel DBMS capable of providing linear speed-up and scale-up!

Intra-operator parallelism. If we want to parallelize operators, there are two options: 1) Rewrite the operator so that it can make use of multiple processors and disks: this could take a long time and a large amount of work, and could require system-specific programming. 2) Use multiple copies of unmodified operators with partitioned data: very little code needs to be modified, and the result should be movable to any new hardware setup.
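As a sketch of option 2, the glue between unmodified sequential operators is a pair of simple dataflow operators: split routes each output tuple to one of several consumers, and merge combines several input streams into one. The function names and toy data below are illustrative, not taken from any specific system:

```python
# Split/merge glue around an unmodified sequential operator (local_sort).
import heapq

def split(rows, route, n_outputs):
    """Route each tuple from one producer to one of n consumer streams."""
    outputs = [[] for _ in range(n_outputs)]
    for r in rows:
        outputs[route(r)].append(r)
    return outputs

def merge(sorted_streams, key):
    """Combine several sorted input streams into one sorted output."""
    return heapq.merge(*sorted_streams, key=key)

def local_sort(rows):
    # the unmodified "sequential" operator, reused on every partition
    return sorted(rows, key=lambda r: r[0])

rows = [("m", 1), ("a", 2), ("t", 3), ("c", 4), ("p", 5)]
parts = split(rows, route=lambda r: 0 if r[0] < "n" else 1, n_outputs=2)
print(list(merge([local_sort(p) for p in parts], key=lambda r: r[0])))
```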

Parallel Scans. Scan in parallel, and merge. Selection may not require all sites for range or hash partitioning. Indexes can be built at each partition. Question: how do indexes differ in the different schemes? Think about both lookups and inserts! What about unique indexes?
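For example, with range partitioning a selection only needs to be dispatched to the sites whose ranges overlap the predicate. A small sketch, reusing the hypothetical A..E / F..J / K..N / O..S / T..Z layout from above:

```python
# Which sites does a range predicate on the partitioning attribute touch?
SITE_RANGES = [("A", "E"), ("F", "J"), ("K", "N"), ("O", "S"), ("T", "Z")]

def sites_for_predicate(lo, hi):
    # a site is needed only if its range overlaps [lo, hi]
    return [i for i, (a, b) in enumerate(SITE_RANGES) if not (hi < a or lo > b)]

print(sites_for_predicate("K", "N"))   # [2] -> only one site is scanned
print(sites_for_predicate("A", "Z"))   # all sites -> full parallel scan
```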

Parallel Sorting. Current records: 8.5 Gb/minute, shared-nothing; Datamation benchmark in 2.41 secs (http://now.cs.berkeley.edu/nowsort/). Idea: scan in parallel, and range-partition as you go; as tuples come in, begin local sorting on each site. The resulting data is sorted, and range-partitioned. Problem: skew! Solution: sample the data at the start to determine the partition points.
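A sketch of the sample-then-range-partition idea. It runs in one process for clarity; in a real system each partition would be sorted on its own site. The sample size and data here are arbitrary:

```python
# Sample the input to pick partition points (avoids skew), range-partition,
# then sort each partition locally; concatenating the partitions is sorted.
import random
from bisect import bisect_right

def parallel_sort(rows, n_parts, sample_size=64):
    sample = sorted(random.sample(rows, min(sample_size, len(rows))))
    # choose n_parts - 1 partition points from the sample
    cuts = [sample[i * len(sample) // n_parts] for i in range(1, n_parts)]
    parts = [[] for _ in range(n_parts)]
    for r in rows:
        parts[bisect_right(cuts, r)].append(r)   # range-partition as you go
    # each partition would be sorted on its own site; here it is just a loop
    return [x for p in parts for x in sorted(p)]

data = random.sample(range(10_000), 1_000)
assert parallel_sort(data, n_parts=4) == sorted(data)
```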

Parallel Aggregates. For each aggregate function, we need a decomposition: count(S) = Σ count(S(i)), and similarly for sum(); avg(S) = (Σ sum(S(i))) / (Σ count(S(i))); and so on. For groups: sub-aggregate groups close to the source, then pass each sub-aggregate to its group's site, chosen via a hash function.
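A sketch of decomposable aggregates over hypothetical partitions: each site computes a local sub-aggregate, and a coordinator combines them; the grouped version merges per-group counts the same way.

```python
# Each "site" holds one partition; sub-aggregates are combined globally.
from collections import Counter

partitions = [[3, 5, 8], [1, 9], [4, 4, 4, 6]]          # hypothetical data

# local (per-site) sub-aggregates: (count_i, sum_i)
subs = [(len(p), sum(p)) for p in partitions]

# global combine: count = sum of counts, sum = sum of sums, avg = sum / count
total_count = sum(c for c, _ in subs)
total_sum = sum(s for _, s in subs)
print(total_count, total_sum, total_sum / total_count)

# grouped version: sub-aggregate close to the source, then merge per-group
# counts (in a real system each group is routed to its site via a hash fn)
local_counts = [Counter(p) for p in partitions]
global_counts = sum(local_counts, Counter())
print(global_counts)
```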

Intra-operator parallelism (cont). Suppose we have two range-partitioned data sets, A and B, partitioned as follows: A1, B1 = A-H tuples; A2, B2 = I-Q tuples; A3, B3 = R-Z tuples. Suppose we want to join relations A and B. How can we do this?

Intra-operator parallelism (cont). Option 1: start 6 scan operations and feed them, in partitioned order, into one central join operation.

Intra-operator parallelism (cont). Option 2: start 6 scan operations and feed them into three join operations.

Intra-operator parallelism (cont). Option 1 exploits the partitioned data but gives only partial parallel execution: the scan operations are done in parallel, but the single central join is a bottleneck. Option 2 exploits the partitioned data and gives full parallel execution: the scan and join operations are all done in parallel. What if we were joining this data and replicating the result table across several sites? Can we further leverage parallel processing?
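A sketch of Option 2: because A and B are partitioned on the same ranges of the join key, each of the three joins only needs its local pair of partitions and no tuples move between sites. The data is hypothetical, and the per-partition joins run sequentially here for clarity (one per site in a real system).

```python
# Co-partitioned join: one local hash join per (A_i, B_i) partition pair.
def local_hash_join(a_part, b_part):
    index = {}
    for key, payload in a_part:                  # build on the A partition
        index.setdefault(key, []).append(payload)
    return [(key, pa, pb) for key, pb in b_part for pa in index.get(key, [])]

A_parts = [[("adams", 1), ("cole", 2)], [("jones", 3)], [("smith", 4)]]
B_parts = [[("adams", "x")], [("jones", "y"), ("moss", "z")], [("smith", "w")]]

# one join per partition pair; each would run on its own site in parallel
result = [t for a, b in zip(A_parts, B_parts) for t in local_hash_join(a, b)]
print(result)
```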

Intra-operator parallelism (cont). Option 1: after the join operation, merge the data into one table at one site and transmit the data to the other replication sites.

Intra-operator parallelism (cont). Option 2: after the join operation, split the output and send each tuple to every replication site.

Intra-operator parallelism (cont). Option 1 exploits parallel processing, but we have a bottleneck at the merge site; it could also incur high network overhead to transfer all the tuples to one site and then transfer them again to the replication sites. Option 2 exploits parallel processing and splits the output: each site is given each tuple as it is produced.

Intra-operator parallelism (cont). If each process runs on an independent processor with an independent disk, we will see very little interference. What happens if a producing operator gets too far ahead of the consuming operator (e.g. the scan works faster than the join)? Operators make use of independent buffers and flow-control mechanisms to make sure that sites aren't overwhelmed by each other. Obviously, some operations will be better suited to parallel execution than others.
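The flow-control idea can be sketched with a bounded buffer: the producing operator blocks as soon as it gets a fixed number of tuples ahead of the consumer. This is an illustrative single-machine sketch, not how any particular DBMS implements it.

```python
# Bounded buffer between a fast producer (scan) and a slower consumer (join).
import queue
import threading

buf = queue.Queue(maxsize=8)           # independent buffer with a fixed size

def producer():                         # e.g. a fast scan operator
    for i in range(100):
        buf.put(i)                      # blocks when the buffer is full
    buf.put(None)                       # end-of-stream marker

def consumer():                         # e.g. a slower join operator
    while (item := buf.get()) is not None:
        pass                            # process the tuple here

threading.Thread(target=producer).start()
consumer()
```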

Grosch's Law. Grosch's law (1960) stipulates that more expensive computers are more cost-effective than less expensive ones. Mainframes cost $25,000/MIPS (million instructions per second) and $1,000/MB of RAM. Now, commodity parts cost $250/MIPS and $100/MB of RAM (remember those days?!). Grosch's law has been broken: combining several hundred or thousand of these shared-nothing systems can create a much more powerful machine than even the most expensive mainframe!

Future Research. Parallel query optimization: take network traffic and processor load balancing into account. Application program parallelism: programs need to be written to take advantage of parallel operators. Physical database design: design tools must be created to help database administrators decide on the table indexing and data partitioning schemes to be used. Data reorganization utilities: must be written to take into account parallelism and the factors that affect parallel performance.

Conclusions. Database systems want cheap, fast hardware; this means commodity parts, not custom-built systems. A shared-nothing architecture has been shown to be ideal for performing parallel operations cheaply and quickly, and such systems experience good speed-up and scale-up.