Heap-Filter Merge Join: A new algorithm for joining medium-size relations

Similar documents
Relational division : four algorithms and their performance

The one-to-one match operator of the Volcano query Processing System

Chapter 12: Query Processing. Chapter 12: Query Processing

A Case for Merge Joins in Mediator Systems

Advanced Database Systems

Chapter 13: Query Processing

Chapter 12: Query Processing

Database System Concepts

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

Chapter 13: Query Processing Basic Steps in Query Processing

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Parallel external sorting in volcano

6.830 Lecture 8 10/2/2017. Lab 2 -- Due Weds. Project meeting sign ups. Recap column stores & paper. Join Processing:

Query optimization in object-oriented database systems : the REVELATION project

Query Processing & Optimization

Chapter 12: Query Processing

Advanced Databases: Parallel Databases A.Poulovassilis

University of Waterloo Midterm Examination Sample Solution

QUERY PROCESSING IN A RELATIONAL DATABASE MANAGEMENT SYSTEM

Tradeoffs in Processing Multi-Way Join Queries via Hashing in Multiprocessor Database Machines

New Bucket Join Algorithm for Faster Join Query Results

University of Waterloo Midterm Examination Solution

Join algorithm costs revisited

Optimization of Queries in Distributed Database Management System

Join Processing for Flash SSDs: Remembering Past Lessons

Evaluation of Relational Operations

A Multi Join Algorithm Utilizing Double Indices

Hash-Based Indexing 165

Implementing Relational Operators: Selection, Projection, Join. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

CSE 530A. B+ Trees. Washington University Fall 2013

THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER

Encapsulation of parallelism in the volcano query processing system

Query Processing. Solutions to Practice Exercises Query:

Architecture and Implementation of Database Systems Query Processing

Lecture 15: The Details of Joins

Chapter 8. Implementing the Relational Algebra. Architecture and Implementation of Database Systems Winter 2008/09

A NEW SORTING ALGORITHM FOR ACCELERATING JOIN-BASED QUERIES

Query Optimization in Distributed Databases. Dilşat ABDULLAH

Query Processing. Introduction to Databases CompSci 316 Fall 2017

Chapter 8. Implementing the Relational Algebra. Architecture and Implementation of Database Systems Winter 2010/11

Chapter 8. Implementing the Relational Algebra. Architecture and Implementation of Database Systems Summer 2014

Evaluation of Relational Operations: Other Techniques

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Database Applications (15-415)

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Question 1. Part (a) [2 marks] what are different types of data independence? explain each in one line. Part (b) CSC 443 Term Test Solutions Fall 2017

Implementation of Relational Operations

TID Hash Joins. Robert Marek University of Kaiserslautern, GERMANY

Implementation of Relational Operations: Other Operations

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Evaluation of Relational Operations. Relational Operations

Evaluation of relational operations

Overview of Implementing Relational Operators and Query Evaluation

Query Processing Strategies and Optimization

Evaluation of Relational Operations: Other Techniques

ATYPICAL RELATIONAL QUERY OPTIMIZER

Introduction to Database Systems CSE 444, Winter 2011

Early Grouping Gets the Skew

Lecture 14. Lecture 14: Joins!

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

CMSC424: Database Design. Instructor: Amol Deshpande

CS122 Lecture 15 Winter Term,

CSIT5300: Advanced Database Systems

15-415/615 Faloutsos 1

Review. Support for data retrieval at the physical level:

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

CSE 544 Principles of Database Management Systems

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Roadmap DB Sys. Design & Impl. Overview. Reference. What is Join Processing? Example #1. Join Processing

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2009 Quiz I Solutions

Query Evaluation Overview, cont.

TotalCost = 3 (1, , 000) = 6, 000

Architecture and Implementation of Database Systems (Summer 2018)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Database Systems CSE 414

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large

Chapter 20: Parallel Databases

Chapter 20: Parallel Databases. Introduction

Term Project Symmetric Hash Join 1

External Sorting Implementing Relational Operators

Database Applications (15-415)

Query processing and optimization

EXTERNAL SORTING. Sorting

Implementation of Relational Operations in Omega Parallel Database System *

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

CS122 Lecture 8 Winter Term,

Query Evaluation Overview, cont.

Problem Set 2 Solutions

SUMMARY OF DATABASE STORAGE AND QUERYING

Indexing and Hashing

Module 10: Parallel Query Processing

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Parallel Query Optimisation

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction

Transcription:

Oregon Health & Science University OHSU Digital Commons CSETech January 1989 Heap-Filter Merge Join: A new algorithm for joining medium-size relations Goetz Graefe Follow this and additional works at: http://digitalcommons.ohsu.edu/csetech Recommended Citation Graefe, Goetz, "Heap-Filter Merge Join: A new algorithm for joining medium-size relations" (1989). CSETech. 197. http://digitalcommons.ohsu.edu/csetech/197 This Article is brought to you for free and open access by OHSU Digital Commons. It has been accepted for inclusion in CSETech by an authorized administrator of OHSU Digital Commons. For more information, please contact champieu@ohsu.edu.

Heap-Filter Merge Join: A New Algorithm for Joining Medium-Size Relations Goetz Graeje Oregon Graduate Center Department of Computer Science and Engineering 19600 N.W. von Neumann Drive Beaverton, OR 97006-1999 USA Technical Report No. CS/E 89-012

Heap-Filter Merge Join: A New Algorithm for Joining Medium-Size Relations Goetz Graefe Oregon Graduate Center Beaverton, Oregon 910061999 graefe@cse.ogc.edu Abstract We present a new algorithm for relational equi-join. The algorithm is a modification of merge join but promises superior performance for medium-size relations. In many cases, it even compares favorably with hybrid hash join. 1. Intrbduction While there seems to be an abundance of relational join algorithms, some cases still allow improvements. In this technical note, we present a new algorithm which is a variant of the well-known merge join algorithm. In essence, it avoids merging sorted runs of the outer relation and merges them directly with the inner relation. This method avoids an substantial amount of I/O for sorting, at the expense of increased 1/0 for the inner relation. In the following section, we discuss known algorithms, in particular merge join and hybrid hash join, and derive their cost formulas. The new algorithm, called heap-filter merge join, is described in Section 3. In Section 4, we provide an analytical performance comparison of merge join, two variants of heapfilter merge join, and hybrid hash join. Section 5 contains a summary and our conclusions from this effort. 2. Competitive Join Algorithms In this section, we discuss two known join algorithms and their cost formulas. We have chosen merge join because heapfilter merge join is a derivative of it, and hybrid hash join because it was shown to be a very efficient join algorithm [I].

Please note that in all our cost formulas, we omit the cost of creating unsorted streams of input tuples and any cost associated with storing the output tuples, since these costs are common to and equal for all the algorithms. 2.1. Merge Join Merge join has been one of the first join algorithms to be published and analyzed, e.g. in [2], and is the algorithm of choice for large relations in almost all commercial database systems. The idea is quite simple and well-known: Sort both relations on the join attribute', and merge them advancing a scan pointer in each relation. If both relations contain duplicate join attribute values, the scan pointer in the inner relation sometimes has to be backed up. We leave the exact scan logic to the reader as exercise or to look it up in a database text. The major cost is for the sort step. For simplicity, we only concern ourselves with 1/0 costs, even though we realize that the CPU cost is non-trivial. The 1/0 during sorting consists of sequential write operations while writing initial runs, and random read operations while reading and merging these runs. We assume that the memory size is a reasonable fraction of the input relation sizes; therefore we calculate the cost for only one merge level. For sorting the inner relation, using the cost parameters in Table 1 the cost is aorti := (scg + rnd) inner. If we assume that the output of both sort operations is immediately passed to the merge join operator, i.e., without additional I/O, the total 1/0 cost for the merge join is inner size of inner relation in pages outer size of outer relation in pages memory memory size in pages Seq rnd sequential I/O, loms random I/O, 30ms Table 1. Cost components. We assume without loss of generality that there is only one join attribute Our discussion is equally valid for multi-attribute equi-joins.

We will come back to this formula in Section 4. 2.2. Hybrid Hash Join (seq + m d) outer + sorti Hashing is a very fast method to find equality matches and a number of hash-based join algorithms have been proposed, e.g., 131. A memory-resident hash table is built with the first input, called the build input, and then probed with the other input, called the probe input. This algorithm is simple and fast if the build input fits into main memory. A number of strategies have been proposed to deal with the case when the build input is larger than memory [4, 5, 61. In a recent comparison of several algorithms, hybrid hash join was found to provide superior performance over a wide range of parameters [I]. Hybrid hash join is an optimistic hash join algorithm; we call it optimistic since it starts out with the assumption that the hash table overflow will fit into memory. When hash table overflow will occurs, a portion of the hash buckets are dumped from main memory to a build ouerjlow file on disk. Further tuples from the build input are first checked whether they belong to a hash bucket still in memory or to one on disk; in the latter case, they are not kept in memory but immediately added to the overflow file. If the remaining hash buckets overflow again, another portion is dumped, etc. Thus multiple overflow files can be created and added to while building the hash table. Notice that if multiple overflows are dealt with more efficiently if multiple overflow files are used. After the build input is exhausted, the probe input is consumed. If a probe input tuple matches with a hash bucket in memory, the join is performed immediately. Otherwise, it is added to a probe overflow file. It makes good sense to build multiple probe overflow files using the same partitioning rule used for the build overflow files. After both inputs are consumed, the overflow files are joined using the same algorithm. For our analysis, we assume that a sufficient number of overflow files has been built such that no further overflow occurs, i.e., both inputs have been partitioned into small enough disjoint subsets.

The 1/0 cost for the overflow files depends on what fraction of these files has to be written to disk; we use the formula F := (inner - memory) / inner Writing to overflow files is sequential if there is only one such file, otherwise it uses random writes. Reading overflow files always uses sequential I/O. Thus, if inner > 2 memory, the 1/0 cost is otherwise, it is (rnd + seq) F (inner + outer), (2 seq) F (inner + outer) 3. Heap-Filter Merge Join Merge join uses two sorted inputs and computes their (equi-) join by maintaining scan pointers in each, advancing them based on comparisons of join attributes and resetting them sometimes in the presence of duplicate join attribute values. The cost of the actual merge join algorithm is relatively small compared to the cost of sorting the input relations. Our effort was inspired by the desire to reduce the sorting costs. We assume in the sequel that both relations are originally unsorted. We assume that the outer relation is the larger of the two relations. Almost all alp rithms perform very well if the inner relation fits in main memory; therefore we will not concern ourselves with this case. Let us assume that the inner relation's size is a moderate multiple of the memory size, say twice to ten times the size of memory, and that the outer relation is quite large. The new algorithm, which we call heap-filter merge join, avoids sorting the outer relation completely (which is different than completely avoiding the sort!) and instead joins the initial sorted runs immediately with the inner relation. Thus, such runs do not need to be written to disk or read for merging. The 1/0 savings compared to merge join are substantial - the outer relation is never written to temporary files and therefore does not incur any 1/0 costs.

These saving, however, do not come without a cost, namely scanning the sorted inner relation repeatedly. If a heap is used for run generation, i.e., if all tuples from the outer relations have to travel through a sorting heap which gives this join algorithm its name, the number of runs can be expected to be the size of the outer relation divided by twice the memory size 171. The inner relation must be joined with each of these runs. Using the assumption that the inner relation is larger than memory, the inner relation must be retrieved repeatedly from disk, once for each run of the outer relation. While this algorithm seems reminiscent of nested loops join and therefore very expensive, it does warrant a closer examination. Let us develop this algorithm's cost formula. First, we need to sort the inner relation, at cost sorti developed above. Second, we need to scan the sorted inner relation once for each run from the outer relation. The number of these runs is equal to R := outer / (2 memory) The cost for each scan over the inner relation is eeq inner Thus, the total 1/0 cost for heapfilter merge join is R inner seq + sorti 3.1. Alternating HeapFilter Merge Join It would be desirable to leverage at least some of the 1/0 performed during a scan over the inner relation for the next scan. Fortunately, this can easily be done by creating alternating runs from the outer relation. This means that the first run is ascending, the next one descending, the third one ascending again, etc. For these runs, the sorted inner relation can be scanned forward, backward, forward, backward, etc., using the last page of one scan as the first of the next, without I/O. Careful analysis will show that only one page should be used; using more memory pages will decrease the size of runs from the outer relation but increase the number of scans over the inner. For this alternating heap-filter merge join, the 1/0 cost is

3.2. Complex Queries (R (inner - 1) + 1) seq + sorti We would like to point out that the heapfilter merge join algorithms not only have good performance (as we will see in the next section), they also allow dataflow between operators in a complex query. For example, as soon as the first tuple from the outer input has traveled through the heap, it can be joined with the inner relation and an output tuple produced. Note that all algorithms discussed here consume one relation completely before starting to consume the other, and therefore before producing results. 4. Analytical Performance Comparison In this section, we will compare merge join, heapfilter merge join, alternating heap-filter I merge join, and hybrid hash join. As mentioned before, we assume that both input relations ori- ginally are not sorted. We omit the cost of reading the unsorted relations and writing the out- put to disk since these costs are equal for all algorithms. We only show the cost for the alter- nating variant of heapfilter merge join. The difference between heapfilter merge join and alternating heap-filter merge join is minimal since the crucial factor in the formula is 249 vs. 250, respectively. Figure 1 shows the join cost for the four algorithms discussed here for a memory size of 100 pages and inner relation size of 250 pages. The outer relation sizes varies from 0 to 10,000 pages. All four algorithms hawe linear cost functions because we assumed a single level merge for all sort operations. It is obvious that merge join is inferior to hybrid hash join. However, the difference between heapfilter merge join and hybrid hash is probably surprising for most readers. The reason, as pointed out above, is that no part of the larger, outer relation is ever written to temporary disk files. We have to admit that we selected the inner-tememory ratio carefully for this graph. If the ratio is below two, the cost of hybrid hash join is much less because only sequential 1/0 is necessary to write the overflow file. If the ratio is too high, the cost of repetitive scans becomes

Memory Size 100, Inner Size 250, Outer Size &10,000 Time in Seconds Hybrid A 200 Merge Alt. Heap-F x I 5000 Outer Size Figure 1. Join Costs depending on Outer Relation Size. dominating. In fact, if this ratio is above four, the number of 110s for merge join is less than for heapfilter merge join. Considering that we estimate triple the cost of sequential 1/0 for random I/O, the break-even point between these two algorithms is characterized by an inner relations eight times the size of memory. To get a more realistic view of heapfilter merge join, we fixed the outer relation and memory sizes and varied the inner relation size. Figure 2 shows the join costs for the four algo- rithms depending on the inner relation size. Merge join is most expensive over the entire range shown, dominated by the sort cost for the large outer relation. The break-even point between merge join and heapfilter merge join is around 800 pages for the inner relation, eight times the memory size. The cost curve for hybrid hash join is most interesting: This algorithm is superior if there is little overflow, but the cost is substantial if the large outer relation must be partitioned into multiple overflow files using random I/O. Only when the inner relation size is a large multiple of the memory size, hybrid hash join becomes superior to heapfilter merge join. Clearly, the

Memory Size 100, Inner Size 0-800, Outer Size 10,000 Hybrid A Merge 0 Alt. ~ e i p - ~ x I 04 I I 1 I I 0 200 400 600 800 Inner Size Figure 2. Join Cost depending on Inner Relation Size. asymptotic cost of hybrid hash join is superior to any of the other algorithms, but there seems to be a substantial window in which heapfilter merge join outperforms hybrid hash. To illustrate this window, we fixed the memory size and varied both inner and outer rela- tion sizes. In Figure 3, we shaded the area in which alternating heapfilter merge join outper- forms hybrid hash join. As can be seen from the figure, this area is substantial for medium-size relations, i.e., where the inner relation size is a small multiple of the outer relation size. 5. Summary and Conclusions We have described a new equi-join algorithm based on the well-known merge join alge rithm, which we call heap-filter merge-join. For moderately large relations, it outperforms merge join by a significant margin. When compared to hybrid hash join, commonly regarded as a very efficient algorithm, heapfilter merge join is superior in some parameter ranges, namely if the inner relation size is a small multiple of the memory size. The results of this study have surprised us; we expected to heapfilter merge join to be inferior for all relation sizes, to be marginally superior is a very narrow range. While we believe

Memory Size 100, Inner Size CL800, Outer Size (F10,000 800 600 Inner Size 400 200 0 5000 Outer Size Figure 3. Region where Alternating Heap-Filter dominates Hybrid we used reasonably realistic cost functions, we intend to verify this study by implementing and comparing these algorithms in the framework of the Volcano query processing system [8]. References 1 D. DeWitt and D. Schneider, "A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment," Proceedings of the ACM SIGMOD Confercncc, p. 110 (May-June 1989). 2. M. Blasgen and K. Eswaran, "Storage and Access in Relational Databases," IBM Syatema Journal 16(4)(1977). 3. K. Bratbergsengen, "Hashing Methods and Relational Algebra Operations," Proeccdings of the Conference on Very Larpe Data Bases, pp. 323-333 (August 1984). 4. D.J. DeWitt, R. Katz, F. Olken, L. Shapiro, M. Stonebraker, and D. Wood, "Implementation Techniques for Main Memory Database Systems," Proceeding8 of the ACM SIGMOD Conference, pp. 1-8 (June 1984). 5. LD. Shapiro, "Join Processing in Database Systems with Large Main Memories," ACM Transactions on Database Systems ll(3) pp. 239264 (September 1986).

6. S. Fushimi. M. Kitsuregawa. and H. Tanaka. "An Overview of The System Software of A Parallel Relational Database Machine GRACE. " Proceeding of the Conference on Very Large Data Bases. pp. 209-219 (August 1986). 7. D. Knuth. The Art of Computer Programming. Addison-Wesley. Reading. MA. (1973). 8. G. Graefe. "Volcano: An Extensible and Parallel Dataflow Query Processing System. " Oregon Graduate Center. Computer Science Technical Report. (89-006)(June 1989). Abstract... 1. Introduction... 2. Competitive Join Algorithms... 2.1. Merge Join... 2.2. Hybrid Hash Join... 3. HeapFilter Merge Join... 3.1. Alternating HeapFilter Merge Join... 3.2. Complex Queries... 4. Analytical Performance Comparison... 5. Summary and Conclusions...