Module 10: Parallel Query Processing

Module Outline (we are here...)

10.1 Objectives in Parallelizing DBMSs
10.2 Speed-Up and Scale-Up
10.3 Opportunities for Parallelization in RDBMSs
10.4 Examples of parallel query execution plans
10.1 Objectives in Parallelizing DBMSs

Thus far, we have (implicitly or explicitly) been considering a DBMS in a client-server architecture: one server operates on the data stored on its local disks, on behalf of any of the numerous clients issuing requests to the server over a (local or global) network. All the data-intensive work, as well as, e.g., transaction management, is done on the (single) server.

The following considerations may lead us to parallel or distributed architectures:

High performance. A single server, implemented on a sequential single-processor machine, may not be able to provide the necessary performance (response time, throughput).

High availability. A single server represents a single point of failure. Hardware or software problems, as well as network disconnection, bring all database operations down.

Extensibility. Accommodating increasing demands in terms of database size and/or performance will eventually hit hard limits in a single-server architecture.
Architectures for Parallel DBMSs

Typical parallel architectures include:

- Shared Memory: all processors access a global shared memory (and the disks).
- Shared Disk: each processor has its own local memory, but every disk is accessible from every processor.
- Shared Nothing: each processor has its own local memory and its own disks; nodes communicate only over the network.

[Figure: the three architectures, showing local memories per node vs. a global shared memory]

Each of these has its own advantages and potential problems. For example, shared memory is easiest to program, while shared nothing scales best.
10.2 Speed-Up and Scale-Up

Speed-Up. Given a constant problem size (e.g., database size and transaction load), how does the performance (e.g., response time) increase with increased hardware resources (e.g., number of processors and/or disks)?

Scale-Up. When the problem size increases, can we achieve the same performance with hardware resources increased correspondingly?

Overview of metrics for parallel processing:

                       Problem size
  Resources            constant       variable
  constant             Utilization    Size-Up
  variable             Speed-Up       Scale-Up
10.2.1 Problems with speed-up

Considering the response-time performance indicator, speed-up (w.r.t. the number of processors used) is defined as

  rt-speed-up(n) = (response time with 1 processor) / (response time with n processors)

Similarly, we can use the throughput indicator (number of transactions per second) and define

  tput-speed-up(n) = (throughput with n processors) / (throughput with 1 processor)

Problem: we cannot achieve linear speed-up in practice; the real curve falls short of the ideal (linear) one.

[Figure: speed-up curves — ideal, optimal, real]

Amdahl's Law:

  Speed-Up(n) <= 1 / (seq. part + par. part / n)

Causes include, e.g., start-up and synchronization overhead, sub-optimal load balancing, ...
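Amdahl's law from above can be made concrete in a few lines of Python (a minimal sketch; the function name and the 5% sequential share are chosen for illustration):

```python
# Amdahl's law as a function: the best possible speed-up with n processors
# when a fraction `seq` of the work is inherently sequential.
def amdahl_speedup(n, seq):
    par = 1.0 - seq
    return 1.0 / (seq + par / n)

# Even a small sequential part caps the speed-up: with seq = 0.05 the
# curve flattens towards 1/seq = 20, no matter how many processors we add.
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(n, 0.05), 2))
```

Note how the speed-up for n = 1000 is barely better than for n = 100: the sequential part dominates long before the hardware runs out.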
10.3 Opportunities for Parallelization in RDBMSs

Relational DBMSs offer a large potential to exploit parallelism:

Data parallelism. Queries operate on large data sets. Data sets can be partitioned, and each partition can be handled by a separate, parallel thread. Challenge: avoid skew in partitioning the data.

Pipelined parallelism. Queries consist of (pipelined) sequences of operators. Each operator can be executed by a separate, pipelined thread. See our earlier discussion of pipelining.

Operator parallelism. For many operators, the internal execution algorithms can be parallelized into several threads, for example, parallel join algorithms.
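Data parallelism can be sketched in a few lines of Python (all names, the hash-partitioning scheme, and the predicate are made up for illustration): partition the rows by a hash of the key, then let one thread scan each partition.

```python
# Sketch of data parallelism: hash-partition a table, scan partitions in parallel.
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n, key=lambda r: r[0]):
    parts = [[] for _ in range(n)]
    for r in rows:
        parts[hash(key(r)) % n].append(r)   # hash partitioning; skew would show up here
    return parts

def scan_count(part, pred):
    # the per-partition work: a selection with a count aggregate
    return sum(1 for r in part if pred(r))

rows = [(i, 2 * i) for i in range(1000)]
parts = partition(rows, 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(lambda p: scan_count(p, lambda r: r[1] > 100), parts))
total = sum(counts)   # same result as a sequential scan of `rows`
```

Merging the per-partition counts with a final sum is the sequential part that Amdahl's law charges us for.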
Different kinds of parallelism

Depending on what is performed in parallel, the following classification has been developed:

- Inter-transaction parallelism: several transactions are run in parallel. (This is the standard in all DBMSs.)
- Intra-transaction parallelism:
  - Inter-query parallelism: several queries within a transaction are run in parallel (needs an asynchronous SQL interface).
  - Intra-query parallelism: within one SQL call, multiple tasks are run in parallel.
    - Inter-operator parallelism: the operators constituting a query are run in parallel.
    - Intra-operator parallelism: a single operator is implemented via a parallel algorithm.

[Figure: transactions (BOT ... Select ... Insert ... EOT) illustrating inter-query vs. intra-query/intra-operator parallelism]
10.4 Examples of parallel query execution plans

10.4.1 Parallel join algorithms

There are a number of parallel join algorithms. The simplest one is the parallel nested-loops join (a.k.a. broadcast join):

1. Partitioning phase: broadcast the records of the outer relation to the nodes holding the inner.
2. Join phase: locally compute the (partial) joins on the nodes holding the inner.

[Figure: outer relation R broadcast to the nodes holding the partitions of S]

This algorithm can be used for non-equi joins, too.
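A minimal Python sketch of the broadcast join, with lists standing in for nodes (names are illustrative; the loop over partitions models work that would run on different nodes in parallel). The arbitrary predicate `theta` shows why non-equi joins work:

```python
# Sketch of the parallel nested-loops (broadcast) join.
def broadcast_join(R, S_partitions, theta):
    result = []
    for S_part in S_partitions:      # each iteration = one node, conceptually parallel
        for r in R:                  # every node received the whole outer relation R
            for s in S_part:
                if theta(r, s):
                    result.append((r, s))
    return result

R = [1, 2, 3]
S_parts = [[1, 2], [3, 4]]           # inner relation S stored in two partitions
pairs = broadcast_join(R, S_parts, lambda r, s: r < s)   # a non-equi join
```

Since every node sees all of R, no partitioning assumption about S is needed; the price is shipping R to every node.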
Parallel associative join

If the inner relation (S) is stored in partitions (according to the join attributes) and the join is an equi-join, then we can

1. distribute the outer tuples to the matching partition of the inner,
2. compute the (partial) joins locally on the nodes storing the inner partitions.

[Figure: outer tuples of R routed to the matching partitions of S]
Parallel (simple) hash join

1. Partition the outer (R) using some hash function h; send records to the join node indicated by the hash value.
2. Partition the inner (S) using the same hash function h; send records to the join node indicated by the hash value.
3. Locally compute the (partial) joins on all join nodes.

[Figure: R and S partitioned by h onto the join nodes]

N.B.: a node can be scan node and join node at the same time.
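The three steps above can be sketched in Python (illustrative names throughout; lists stand in for nodes, and the final loop models the per-node joins that would run in parallel):

```python
# Sketch of the parallel simple hash join: both relations are partitioned
# with the same hash function, so matching tuples meet on the same node.
def route(rows, key, n):
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)   # the same h for R and S
    return parts

def local_hash_join(R_part, S_part, rkey, skey):
    table = {}
    for s in S_part:                 # build a hash table on the local S partition
        table.setdefault(skey(s), []).append(s)
    return [(r, s) for r in R_part for s in table.get(rkey(r), [])]

R = [(1, 'a'), (2, 'b'), (3, 'c')]
S = [(1, 'x'), (3, 'y'), (4, 'z')]
n = 2
R_parts = route(R, lambda r: r[0], n)
S_parts = route(S, lambda s: s[0], n)
joined = [t for i in range(n)        # one local join per node
          for t in local_hash_join(R_parts[i], S_parts[i],
                                   lambda r: r[0], lambda s: s[0])]
```

Routing with the same hash function guarantees that R-tuples and S-tuples with equal join keys land on the same node, which is why this only works for equi-joins.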
Parallel asymmetric hash join

(See the earlier discussion of hash joins.)

1. Building phase: scan and distribute the outer according to some hash function h.
2. Probing phase: combine the scan/distribution of the inner with locally computing the join.

[Figure: build on R, probe with S on the join nodes]

N.B.: again, a node can be scan node and join node at the same time.
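The build/probe split on one join node can be sketched as follows (names are illustrative); a generator makes it explicit that probing pipelines with the scan of the inner relation:

```python
# Sketch of one join node in the asymmetric hash join: build a hash table
# from the (already distributed) outer, then stream inner tuples past it.
def build(outer, key):
    table = {}
    for r in outer:
        table.setdefault(key(r), []).append(r)
    return table

def probe(table, inner, key):
    for s in inner:                      # non-blocking: results flow as S is scanned
        for r in table.get(key(s), []):
            yield (r, s)

table = build([(1, 'a'), (2, 'b')], key=lambda r: r[0])
matches = list(probe(table, [(2, 'x'), (3, 'y')], key=lambda s: s[0]))
```

Only the build phase blocks; once the table exists, each probed tuple can be emitted immediately, which is what makes the probing phases of several joins pipelinable.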
Parallel hybrid hash join

(See the earlier discussion of hybrid hash joins.)

1. Building phase: scan and distribute the outer according to some hash function h; keep the first bucket in memory.
2. Probing phase: combine the scan/distribution of the inner with locally computing the join for the first bucket.

[Figure: as before, but with the first bucket of each build kept in memory]
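The hybrid idea on a single node can be sketched like this (an illustrative simplification, not the full parallel algorithm; all names are made up): bucket 0 of the outer stays resident as a hash table, so inner tuples hashing to bucket 0 are joined immediately, while the rest are set aside for later passes.

```python
# Sketch of the hybrid hash join's first pass on one node.
def hybrid_partition_outer(R, key, n_buckets):
    in_memory = {}                               # bucket 0 stays resident
    spilled = [[] for _ in range(n_buckets)]     # other buckets go to disk
    for r in R:
        b = hash(key(r)) % n_buckets
        if b == 0:
            in_memory.setdefault(key(r), []).append(r)
        else:
            spilled[b].append(r)
    return in_memory, spilled

def probe_inner(S, key, in_memory, n_buckets):
    early, deferred = [], [[] for _ in range(n_buckets)]
    for s in S:
        b = hash(key(s)) % n_buckets
        if b == 0:                               # joined without touching disk
            early += [(r, s) for r in in_memory.get(key(s), [])]
        else:
            deferred[b].append(s)
    return early, deferred

R = [(0, 'a'), (1, 'b'), (2, 'c')]
S = [(2, 'x'), (1, 'y')]
in_memory, spilled = hybrid_partition_outer(R, key=lambda r: r[0], n_buckets=2)
early, deferred = probe_inner(S, key=lambda s: s[0], in_memory=in_memory, n_buckets=2)
```

The tuples in `early` never incur a second pass; the more memory is available for bucket 0, the larger this fraction becomes.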
10.4.2 Parallelizing join trees

Left-deep join trees can be fully pipelined. As such, they offer good potential for (pipelining) inter-operator parallelism. When we consider parallel hash joins, though, we observe that each building phase falls within its own, sequential execution phase.

Example: consider the join of four relations R1, R2, R3, R4 and the left-deep join tree shown below; each join is implemented as an asymmetric hash join.

[Figure: left-deep join tree of joins J1, J2, J3 over scans S1, ..., S4; each join Ji is implemented by a build Bi and a probe Pi. S = scan, J = join, B = build, P = probe]

The execution of the query proceeds in 4 sequential steps (the tasks within each step are executed in parallel):

1. {S1, B1}
2. {S2, P1, B2}
3. {S3, P2, B3}
4. {S4, P3}
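The step sequence above can be written down as data and checked mechanically (a small sketch; the task names follow the figure's Si/Bi/Pi convention): each probe Pi depends on its build Bi, and each build Bi+1 depends on Pi, forcing the four steps into sequence.

```python
# The four sequential steps of the left-deep plan; tasks inside a set
# run in parallel, the sets themselves must run one after another.
steps = [
    {"S1", "B1"},
    {"S2", "P1", "B2"},
    {"S3", "P2", "B3"},
    {"S4", "P3"},
]

def step_of(task):
    return next(i for i, s in enumerate(steps) if task in s)

# Every probe comes strictly after its own build ...
assert all(step_of(f"P{j}") > step_of(f"B{j}") for j in (1, 2, 3))
# ... which caps the degree of parallelism at 3 concurrent tasks.
max_parallelism = max(len(s) for s in steps)
```

This is the "rather limited degree of parallelism" criticized on the next slide: no step ever runs more than three tasks at once.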
Analysis of parallelizing left-deep join trees

PROs:
- no more than 2 hash tables have to be kept in memory at the same time
- the probing relation is always a base table

CONs:
- rather limited degree of parallelism
- the size of the hash tables (build phase) depends on the join selectivity (difficult to estimate accurately)
Parallelizing right-deep join trees

Example: consider the same join of four relations R1, R2, R3, R4 as before, but now look at the right-deep join tree shown below; each join is again implemented as an asymmetric hash join.

[Figure: right-deep join tree of joins J1, J2, J3 over scans S1, ..., S4; each join Ji is implemented by a build Bi and a probe Pi]

Now, the execution of the query can be split into only 2 sequential steps (parallelizing the tasks within each step):

1. {S2, B1, S3, B2, S4, B3}
2. {S1, P1, P2, P3}

- more parallelism (parallel scans, all probing phases in a single pipeline)
- all build relations are base tables, hence better size estimates
- but: much higher memory requirements (all build tables must be resident at once)