Lecture 6: External Interval Tree (Part II)


Yufei Tao
Division of Web Science and Technology
Korea Advanced Institute of Science and Technology

3 Making the external interval tree dynamic

Remember that it is not our purpose to design a static structure to solve the stabbing problem; in fact, a persistent B-tree already fulfills that purpose. Our goal is to have a fully dynamic structure. In the sequel, we will discuss how the external interval tree can be updated efficiently. For this purpose, we will make the tall cache assumption that $M \ge B^2$, which allows us to avoid some complicated details, so that we can focus on learning several major techniques to dynamize an external memory structure.

3.1 Dynamizing an underflow structure

Recall that each node of the external interval tree has an underflow structure, which is a persistent B-tree. As mentioned in the previous lecture, in general we cannot efficiently insert or delete an arbitrary interval in a persistent B-tree. However, an underflow structure has a special property: it indexes at most $B^2$ intervals. Next, we will utilize this property to make an underflow structure dynamic.

Consider, in general, a data structure $T$ that manages at most $N$ elements. Assume that $T$ (i) occupies $\mathrm{space}(N)$ blocks, (ii) can be constructed in $\mathrm{build}(N)$ I/Os, and (iii) supports a query in $\mathrm{query}(N) + O(K/B)$ I/Os, where $K$ is the number of elements reported. Then:

Lemma 1. $T$ can be converted into a fully dynamic structure that has size $\mathrm{space}(N)$, answers a query in $O(\mathrm{query}(N) + K/B)$ I/Os, and supports an insertion or a deletion in $\frac{1}{B} \cdot \mathrm{build}(N)$ I/Os amortized.

To achieve the above purpose, it suffices to associate $T$ with an additional (disk) block to buffer all the incoming updates (insertions and deletions). In other words, each update is simply placed in the buffer block without actually modifying $T$. Obviously, the space complexity of $T$ remains $\mathrm{space}(N)$.
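The buffering idea behind Lemma 1 can be illustrated with a small in-memory sketch. It is not the persistent B-tree itself: the names `BufferedStructure`, `_build` and `_static_query` are hypothetical, the "static structure" is just a sorted list standing in for $T$, and the way queries consult the buffer and the rebuild-when-full rule follow the description given in the remainder of this section.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]


class BufferedStructure:
    """A static structure plus a one-block update buffer (sketch of Lemma 1)."""

    def __init__(self, items: Iterable[Interval], block_size: int):
        self.B = block_size
        self.buffer: List[Tuple[str, Interval]] = []  # the extra disk block
        self.rebuilds = 0                             # rebuilds caused by updates
        self._build(set(items))

    def _build(self, items: Set[Interval]) -> None:
        # Assumption: stand-in for build(N); any static construction fits here.
        self.static = sorted(items)

    def _static_query(self, q: int) -> Set[Interval]:
        # Assumption: stand-in for the static query; a real T would not scan.
        return {s for s in self.static if s[0] <= q <= s[1]}

    def _apply(self, op: str, s: Interval) -> None:
        self.buffer.append((op, s))          # T itself is left untouched
        if len(self.buffer) == self.B:       # buffer full: rebuild T once
            items = set(self.static)
            for o, x in self.buffer:         # replay updates in arrival order
                if o == "ins":
                    items.add(x)
                else:
                    items.discard(x)
            self._build(items)
            self.buffer.clear()
            self.rebuilds += 1

    def insert(self, s: Interval) -> None:
        self._apply("ins", s)

    def delete(self, s: Interval) -> None:
        self._apply("del", s)

    def stab(self, q: int) -> List[Interval]:
        result = self._static_query(q)
        for op, s in self.buffer:            # one extra I/O: inspect the buffer
            if op == "del":
                result.discard(s)
            elif s[0] <= q <= s[1]:
                result.add(s)
        return sorted(result)
```

With `block_size = B`, a rebuild happens once per `B` updates, which is where the $\mathrm{build}(N)/B$ amortized bound comes from; the counter `rebuilds` is only there to make that visible.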
To answer a query, we first retrieve from $T$ the set $S$ of qualifying elements in $\mathrm{query}(N) + O(|S|/B)$ I/Os. Remember, however, that some elements in $S$ may no longer belong to the dataset due to the deletions in the buffer block. Conversely, some new elements to be added to the dataset by the insertions in the buffer may also need to be reported. To account for these changes, it suffices to spend an extra I/O to inspect the buffer block. In any case, $|S|$ cannot differ from $K$ by more than $B$. Hence, the total query cost is $\mathrm{query}(N) + 1 + O((K+B)/B) = O(\mathrm{query}(N) + K/B)$.

How do we incorporate the buffered updates into $T$? Nothing needs to be done until the buffer gets full, i.e., $B$ updates have been accumulated. At this time, we simply rebuild the entire $T$ in $\mathrm{build}(N)$ I/Os, and then clear the buffer. Since this is done only once every $B$ updates, on average, each update bears only $\mathrm{build}(N)/B$ I/Os.

Lemma 1 is particularly useful when $N = B^2$ and $\mathrm{build}(B^2) = O(B)$, i.e., $T$ is a $B^2$-structure that can be built in linear time. In this case, each update can be performed in $O(1)$ I/Os amortized. This is true for an underflow structure in the external interval tree. Notice that the tall cache assumption permits us to simply read all the at most $B^2$ elements into memory and construct the persistent B-tree there. The only cost incurred is that of reading the elements and writing the structure back to the disk$^1$, i.e., $O(B)$. We therefore have:

Corollary 1. Each underflow structure consumes linear space, answers a stabbing query in $O(1 + K/B)$ I/Os, and supports an insertion or a deletion in constant I/Os amortized.

3.2 Modifying the external interval tree

We need to slightly modify the static external interval tree as described in the previous lecture to make it dynamic. First, the base tree $T$ is a weight-balanced B-tree (instead of a normal B-tree), where each leaf node has a capacity $B$ and each internal node has at most $\sqrt{B}$ child nodes (implying that the branching parameter is $\sqrt{B}/4$).

Now consider an internal node $u$ in $T$ with child nodes $u_1, \ldots, u_f$. Before, each $L_u(i)$ ($R_u(i)$) was implemented as a linked list, but now we implement it as a B-tree indexing the left (right) endpoints. The purpose is to insert/remove an interval in $L_u(i)$ ($R_u(i)$) using a number of I/Os logarithmic in the size of the B-tree. Similarly, we implement each $M_u[i,j]$ as a B-tree (indexing, e.g., the left endpoints of its intervals).

The last change concerns what it means for a multi-slab $\sigma_u[i,j]$ to underflow. Before, this was defined as $\sigma_u[i,j]$ having fewer than $B$ (middle) intervals. Now, we extend the definition: if $M_u[i,j]$ is non-empty, $\sigma_u[i,j]$ underflows if it has fewer than $B/2$ intervals. Otherwise (i.e., all the intervals belonging to $\sigma_u[i,j]$ are indexed by the underflow structure $U_u$), $\sigma_u[i,j]$ underflows as long as it has fewer than $B$ intervals. We stick to the invariant that if $\sigma_u[i,j]$ underflows, its intervals are managed by $U_u$; otherwise, they are indexed by $M_u[i,j]$. Notice that the modified underflow definition creates a leeway of $B/2$ before the intervals of $\sigma_u[i,j]$ are moved between $M_u[i,j]$ and $U_u$. In any case, $U_u$ still manages at most $B^2$ intervals.
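The two-threshold underflow rule can be made concrete with a small sketch that tracks only the number of intervals in one multi-slab and where they currently live. The class name `MultiSlab` and the fields `home` and `moves` are hypothetical and used purely for illustration; the actual structures $U_u$ and $M_u[i,j]$ are not modeled.

```python
class MultiSlab:
    """Tracks where the intervals of one multi-slab are stored (sketch)."""

    def __init__(self, B: int):
        self.B = B
        self.count = 0      # number of intervals in this multi-slab
        self.home = "U"     # "U": underflow structure, "M": its own B-tree
        self.moves = 0      # how often the intervals changed home

    def underflows(self) -> bool:
        # Threshold is B/2 while the B-tree M is in use, and B otherwise.
        threshold = self.B / 2 if self.home == "M" else self.B
        return self.count < threshold

    def _restore_invariant(self) -> None:
        # Invariant: underflowing multi-slabs live in U, the others in M.
        wanted = "U" if self.underflows() else "M"
        if wanted != self.home:
            self.home = wanted
            self.moves += 1

    def insert(self) -> None:
        self.count += 1
        self._restore_invariant()

    def delete(self) -> None:
        if self.count == 0:
            raise ValueError("multi-slab is already empty")
        self.count -= 1
        self._restore_invariant()
```

After a move in either direction, at least $B/2$ further updates are needed before the next move, which is the leeway that later pays for the $O(B)$ cost of moving the intervals.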
The above changes do not affect the space consumption of the overall structure, nor do they affect the query algorithm and its cost. We are now ready to clarify the update algorithms.

3.3 Performing an insertion

Let $s$ be the interval being inserted. We first insert the left and right endpoints of $s$ in $T$ (without handling overflows yet even if they occur) by traversing at most two root-to-leaf paths. In doing so, we have also identified the node $u$ whose stabbing set $S_u$ we should add $s$ to. Assume that $u$ has $f$ child nodes, and that the left (right) endpoint of $s$ falls in $\sigma(u_i)$ ($\sigma(u_j)$) for some $i, j$. Cut $s$ into a left interval $s_l$, a middle interval $s_m$, and a right interval $s_r$. Insert $s_l$ ($s_r$) into the left (right) structure of $u$, more specifically, $L_u(i)$ ($R_u(j)$).

If $s_m \neq \emptyset$, we check whether the intervals of $\sigma_u[i+1, j-1]$ are being indexed by $M_u[i+1, j-1]$. If yes, $s_m$ is inserted there. Otherwise, we add $s_m$ to $U_u$. Now $\sigma_u[i+1, j-1]$ may have $B$ intervals so that it no longer underflows. In this case, we find them in $O(1)$ I/Os (by performing a stabbing query on $U_u$), delete all of them from $U_u$ in $O(B)$ amortized I/Os (see Corollary 1), initialize an empty B-tree $M_u[i+1, j-1]$, and insert those $B$ intervals into $M_u[i+1, j-1]$ using $O(B)$ I/Os. We can charge this cost over the at least $B/2$ elements added to $U_u$ since the previous underflow of $\sigma_u[i+1, j-1]$. Therefore, on average, each insertion bears only $O(B)/(B/2) = O(1)$ I/Os for the movement of intervals from $U_u$ to $M_u[i+1, j-1]$. The cost so far is $O(\log_B N)$ amortized.

$^1$ Strictly speaking, the situation is a bit more complex because besides the elements, the construction algorithm of the persistent B-tree also needs to store additional data, which would make the total amount of memory consumption exceed $B^2$ words. However, one can show that the algorithm requires $O(B^2)$ words at any moment. As a result, we can eliminate this issue by constraining an underflow structure to contain at most $B^2/c$ elements, for some proper constant $c$.

Now it remains to handle overflows, which may have happened to the nodes on the at most two root-to-leaf paths we followed at the beginning. We treat the overflows in a bottom-up manner, namely, first handling the at most two leaf nodes, then their parents, and so on.

In general, let $v$ be a node that overflows, and $\hat{v}$ be its parent. Split the elements of $v$ into $v_1$ and $v_2$ following the standard algorithm in the weight-balanced B-tree. Let $\ell$ be the splitting value, i.e., all the elements in $v_1$ ($v_2$) are smaller than (at least) $\ell$. Note that $\ell$ becomes a new slab boundary at $\hat{v}$.

We proceed to fix the secondary structures of $v_1$, $v_2$ and $\hat{v}$. Note that the intervals in $S_v$ (the stabbing set of $v$) can now be divided into three groups: (i) those completely to the left of $\ell$, (ii) those completely to the right of $\ell$, and (iii) those crossing $\ell$. The first group becomes $S_{v_1}$, the second becomes $S_{v_2}$, while the intervals of the third group, denoted as $S_{up}$, should be inserted into $S_{\hat{v}}$. Clearly, $S_{v_1}$, $S_{v_2}$, $S_{up}$ can be obtained in $O(|S_v|/B)$ I/Os by scanning $S_v$ once. In fact, with this cost, we can obtain two sorted lists for $S_{v_1}$, one sorted by the left endpoints of its intervals and the other by their right endpoints (this detail is left to you). Refer to the first (second) list as the left (right) copy of $S_{v_1}$. The same is true for $S_{v_2}$ and $S_{up}$. Before proceeding we prove:

Lemma 2. Consider a node $u$ and its stabbing set $S_u$. Given the left and right copies of $S_u$, all the secondary structures of $u$ can be built in $O(\sqrt{B} + |S_u|/B)$ I/Os.

Proof. Assume that $u$ has $f \le \sqrt{B}$ child nodes.
By scanning the left copy of $S_u$ once, we can generate the intervals indexed by $L_u(i)$ for each $i \in [1, f]$, after which $L_u(i)$ can be built in $O(1 + |L_u(i)|/B)$ I/Os. Hence, the left structure of $u$ can be constructed in $O(\sqrt{B} + |S_u|/B)$ I/Os in total. Similarly, its right structure can be constructed at the same cost.

As $M \ge B^2$, and there are fewer than $f^2 \le B$ multi-slabs, by scanning the left copy of $S_u$ once, we can obtain the intervals belonging to each multi-slab in $O(|S_u|/B)$ I/Os, such that if a multi-slab has at least $B$ intervals, all those intervals are stored in a file, sorted by their left endpoints; otherwise, the intervals of the (underflowing) multi-slab remain in memory. Build the underflow structure using the intervals in memory, and write the structure to the disk at a cost linear in the number of indexed intervals. Finally, for each non-underflowing multi-slab $\sigma_u[i,j]$, build $M_u[i,j]$ on those intervals at a cost linear in their number.

Therefore, given $S_{v_1}$ and $S_{v_2}$, the secondary structures of $v_1$ and $v_2$ can be constructed in $O(\sqrt{B} + |S_{v_1}|/B + |S_{v_2}|/B) = O(\sqrt{B} + |S_v|/B)$ I/Os. Now let us focus on $\hat{v}$. The new $S_{\hat{v}}$ is the union of the original $S_{\hat{v}}$ and $S_{up}$. From now on, we use $S_{\hat{v}}$ to refer to the new $S_{\hat{v}}$. Given the left and right copies of $S_{up}$, it is easy to generate the corresponding copies of $S_{\hat{v}}$ in $O(|S_{\hat{v}}|/B)$ I/Os, after which the secondary structures of $\hat{v}$ can be rebuilt in $O(\sqrt{B} + |S_{\hat{v}}|/B)$ I/Os.

Now it is time to use the fact that $T$ is a weight-balanced B-tree with leaf capacity $b = B$ and branching parameter $p = \sqrt{B}/4$. Let $w(v)$ and $w(\hat{v})$ be the weights of $v$ and $\hat{v}$, respectively. It thus follows that $w(\hat{v}) \le 4p \cdot w(v)$. Observe that $|S_v| \le w(v)$ (as each interval in $S_v$ has both endpoints in the subtree of $v$), and $|S_{\hat{v}}| \le w(\hat{v})$. In other words, the total cost of re-constructing the secondary structures of $v_1$, $v_2$ and $\hat{v}$ is

$$O\left(\sqrt{B} + \frac{|S_v|}{B} + \frac{|S_{\hat{v}}|}{B}\right) = O\left(\sqrt{B} + \frac{w(v)}{B} + \frac{w(\hat{v})}{B}\right) = O\left(\sqrt{B} + \frac{w(\hat{v})}{B}\right) = O\left(\sqrt{B} + \frac{4p \cdot w(v)}{B}\right) = O(w(v))$$

where the last step used the facts that $p < B$ and that $w(v) \ge B$.

Recall that, by the property of the weight-balanced B-tree, when $v$ overflows, $\Omega(w(v))$ updates have been performed under its subtree. Hence, we can amortize the $O(w(v))$ cost of handling the overflow over those updates so that each one of them accounts for only constant I/Os. As each update may need to bear such an amortized cost $O(\log_B N)$ times, it follows that each insertion can be performed in $O(\log_B N)$ I/Os amortized.

Remarks. There are two key ingredients in the above insertion algorithm that lead to the nice amortized insertion time of $O(\log_B N)$. The first one is the $B^2$-structures, each of which is space efficient, and can be updated and queried with constant overhead (see Corollary 1). The second ingredient is the usage of the weight-balanced B-tree, which allows us to pay a huge cost to handle the overflow of a node: as much as the number of data elements in the subtree of the node. This technique is known as partial rebuilding. It is the first time in this course that we see the necessity of the weight-balanced B-tree.

3.4 Performing a deletion

As expected, the major difficulty of a deletion is the handling of underflows. Interestingly, next we will see how to circumvent the difficulty altogether by using a technique called global rebuilding. Recall that a query algorithm reports intervals only from stabbing sets. Hence, as long as we can (i) keep the stabbing sets updated, and (ii) make sure that the weight-balanced B-tree $T$ still allows us to guide the query to the relevant stabbing sets, we can seek ways to save ourselves some trouble when it comes to removing elements from $T$ itself. With the above in mind, the deletion algorithm can be made surprisingly simple.
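As a rough illustration of global rebuilding, the sketch below keeps the live dataset exactly, never removes endpoints from the base tree on a deletion, and rebuilds everything after $N/2$ updates, where $N$ is the dataset size at the last rebuild (the schedule spelled out in the remainder of this section). The class `LazyDeleteTree`, its sorted list `endpoints` standing in for $T$, and the rounding `max(1, N // 2)` for tiny datasets are assumptions made for this example only.

```python
import bisect
from typing import Iterable, Tuple

Interval = Tuple[int, int]


class LazyDeleteTree:
    """Deletions leave endpoints in T; T is rebuilt periodically (sketch)."""

    def __init__(self, intervals: Iterable[Interval]):
        self.live = set(intervals)   # the dataset I, tracked exactly
        self.rebuilds = 0
        self._build()

    def _build(self) -> None:
        # Stand-in for re-inserting every live interval into a fresh T.
        self.endpoints = sorted(p for s in self.live for p in s)
        # Assumption: round N/2 down, but never below one update.
        self.threshold = max(1, len(self.live) // 2)
        self.updates = 0

    def _tick(self) -> None:
        self.updates += 1
        if self.updates >= self.threshold:   # N/2 updates since last build
            self._build()
            self.rebuilds += 1

    def insert(self, s: Interval) -> None:
        self.live.add(s)
        for p in s:
            bisect.insort(self.endpoints, p)
        self._tick()

    def delete(self, s: Interval) -> None:
        self.live.remove(s)   # endpoints of s deliberately stay in T
        self._tick()
```

Between two rebuilds the list `endpoints` may hold stale entries, but never more than a constant factor above the number of live endpoints, which is what keeps the height of the real tree logarithmic.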
To delete an interval $s$, we remove it from the secondary structures of the node whose stabbing set contains $s$. This can be done easily in $O(\log_B N)$ I/Os by reversing the corresponding steps in an insertion. We are done right here, without even removing the left or right endpoint of $s$ from $T$. It is easy to see that the correctness of the query algorithm can still be guaranteed. As no element is ever deleted from $T$, underflows can never happen.

There is, however, a minor drawback. Since we permit redundant endpoints to remain in $T$, over time the number of endpoints in $T$ can become so much larger than the current $N$ that the height of $T$ may eventually become $\omega(\log_B N)$. To avoid this, after $N/2$ updates since the initial construction of $T$ (where $N$ is the size of the dataset $I$ at the time of that construction), we simply rebuild the entire $T$ from scratch by incrementally inserting each interval currently in $I$ (of course, we need to keep track of $I$ exactly, but this can be easily done with another B-tree). Notice that now $I$ can have at most $3N/2$ elements, so $T$ can be re-constructed in $O(N \log_B N)$ I/Os, or merely $O(\log_B N)$ amortized I/Os per update. It is easy to verify that with this approach, the height of $T$ is $O(\log_B |I|)$ at all times.

Summarizing all the above discussion, we have:

Theorem 1. Under the tall-cache assumption $M \ge B^2$, there exists a structure on a set of $N$ intervals that consumes $O(N/B)$ space, supports a stabbing query in $O(\log_B N + K/B)$ I/Os, and can be updated in $O(\log_B N)$ amortized I/Os per insertion and deletion.

Bibliography

The interval tree in internal memory is due to Edelsbrunner [3]. Its external version was developed by Arge and Vitter [1]. They showed that Theorem 1 still holds even without the tall cache assumption, by giving a clever algorithm to construct an underflow structure in $O(B)$ I/Os using only 2 memory blocks (i.e., $M = 2B$). They also explained in [2] how to remove the amortization so that each insertion/deletion can be handled in $O(\log_B N)$ I/Os in the worst case. Finally, the partial and global rebuilding techniques we discussed were invented by Overmars [4].

References

[1] L. Arge and J. S. Vitter. Optimal dynamic interval management in external memory. In FOCS.
[2] L. Arge and J. S. Vitter. Optimal external memory interval management. SIAM J. of Comp., 32(6).
[3] H. Edelsbrunner. A new approach to rectangle intersections, part I. International Journal of Computer Mathematics, 13.
[4] M. H. Overmars. The Design of Dynamic Data Structures. Springer-Verlag.


More information

lecture notes September 2, How to sort?

lecture notes September 2, How to sort? .30 lecture notes September 2, 203 How to sort? Lecturer: Michel Goemans The task of sorting. Setup Suppose we have n objects that we need to sort according to some ordering. These could be integers or

More information

Algorithms and Data Structures: Lower Bounds for Sorting. ADS: lect 7 slide 1

Algorithms and Data Structures: Lower Bounds for Sorting. ADS: lect 7 slide 1 Algorithms and Data Structures: Lower Bounds for Sorting ADS: lect 7 slide 1 ADS: lect 7 slide 2 Comparison Based Sorting Algorithms Definition 1 A sorting algorithm is comparison based if comparisons

More information

Rank-Pairing Heaps. Bernard Haeupler Siddhartha Sen Robert E. Tarjan. SIAM JOURNAL ON COMPUTING Vol. 40, No. 6 (2011), pp.

Rank-Pairing Heaps. Bernard Haeupler Siddhartha Sen Robert E. Tarjan. SIAM JOURNAL ON COMPUTING Vol. 40, No. 6 (2011), pp. Rank-Pairing Heaps Bernard Haeupler Siddhartha Sen Robert E. Tarjan Presentation by Alexander Pokluda Cheriton School of Computer Science, University of Waterloo, Canada SIAM JOURNAL ON COMPUTING Vol.

More information

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 2004 Goodrich, Tamassia (2,4) Trees 1 Multi-Way Search Tree A multi-way search tree is an ordered tree such that Each internal node has at least two children and stores d -1 key-element

More information

7 Distributed Data Management II Caching

7 Distributed Data Management II Caching 7 Distributed Data Management II Caching In this section we will study the approach of using caching for the management of data in distributed systems. Caching always tries to keep data at the place where

More information

Lecture 15 Binary Search Trees

Lecture 15 Binary Search Trees Lecture 15 Binary Search Trees 15-122: Principles of Imperative Computation (Fall 2017) Frank Pfenning, André Platzer, Rob Simmons, Iliano Cervesato In this lecture, we will continue considering ways to

More information

BST Deletion. First, we need to find the value which is easy because we can just use the method we developed for BST_Search.

BST Deletion. First, we need to find the value which is easy because we can just use the method we developed for BST_Search. BST Deletion Deleting a value from a Binary Search Tree is a bit more complicated than inserting a value, but we will deal with the steps one at a time. First, we need to find the value which is easy because

More information

Lecture 3 February 20, 2007

Lecture 3 February 20, 2007 6.897: Advanced Data Structures Spring 2007 Prof. Erik Demaine Lecture 3 February 20, 2007 Scribe: Hui Tang 1 Overview In the last lecture we discussed Binary Search Trees and the many bounds which achieve

More information

ICS 691: Advanced Data Structures Spring Lecture 3

ICS 691: Advanced Data Structures Spring Lecture 3 ICS 691: Advanced Data Structures Spring 2016 Prof. Nodari Sitchinava Lecture 3 Scribe: Ben Karsin 1 Overview In the last lecture we started looking at self-adjusting data structures, specifically, move-to-front

More information

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES)

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Chapter 1 A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Piotr Berman Department of Computer Science & Engineering Pennsylvania

More information

Level-Balanced B-Trees

Level-Balanced B-Trees Gerth Stølting rodal RICS University of Aarhus Pankaj K. Agarwal Lars Arge Jeffrey S. Vitter Center for Geometric Computing Duke University January 1999 1 -Trees ayer, McCreight 1972 Level 2 Level 1 Leaves

More information

Balanced Binary Search Trees

Balanced Binary Search Trees Balanced Binary Search Trees Pedro Ribeiro DCC/FCUP 2017/2018 Pedro Ribeiro (DCC/FCUP) Balanced Binary Search Trees 2017/2018 1 / 48 Motivation Let S be a set of comparable objects/items: Let a and b be

More information

CS350: Data Structures B-Trees

CS350: Data Structures B-Trees B-Trees James Moscola Department of Engineering & Computer Science York College of Pennsylvania James Moscola Introduction All of the data structures that we ve looked at thus far have been memory-based

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Problem Set 5 Solutions

Problem Set 5 Solutions Introduction to Algorithms November 4, 2005 Massachusetts Institute of Technology 6.046J/18.410J Professors Erik D. Demaine and Charles E. Leiserson Handout 21 Problem Set 5 Solutions Problem 5-1. Skip

More information

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree Chapter 4 Trees 2 Introduction for large input, even access time may be prohibitive we need data structures that exhibit running times closer to O(log N) binary search tree 3 Terminology recursive definition

More information

Cuckoo Hashing for Undergraduates

Cuckoo Hashing for Undergraduates Cuckoo Hashing for Undergraduates Rasmus Pagh IT University of Copenhagen March 27, 2006 Abstract This lecture note presents and analyses two simple hashing algorithms: Hashing with Chaining, and Cuckoo

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Balanced Trees Part One

Balanced Trees Part One Balanced Trees Part One Balanced Trees Balanced search trees are among the most useful and versatile data structures. Many programming languages ship with a balanced tree library. C++: std::map / std::set

More information

CSE 530A. B+ Trees. Washington University Fall 2013

CSE 530A. B+ Trees. Washington University Fall 2013 CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key

More information

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive

More information

Lecture 15 Notes Binary Search Trees

Lecture 15 Notes Binary Search Trees Lecture 15 Notes Binary Search Trees 15-122: Principles of Imperative Computation (Spring 2016) Frank Pfenning, André Platzer, Rob Simmons 1 Introduction In this lecture, we will continue considering ways

More information

Notes on Binary Dumbbell Trees

Notes on Binary Dumbbell Trees Notes on Binary Dumbbell Trees Michiel Smid March 23, 2012 Abstract Dumbbell trees were introduced in [1]. A detailed description of non-binary dumbbell trees appears in Chapter 11 of [3]. These notes

More information

9 Distributed Data Management II Caching

9 Distributed Data Management II Caching 9 Distributed Data Management II Caching In this section we will study the approach of using caching for the management of data in distributed systems. Caching always tries to keep data at the place where

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 and External Memory 1 1 (2, 4) Trees: Generalization of BSTs Each internal node

More information

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17 01.433/33 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/2/1.1 Introduction In this lecture we ll talk about a useful abstraction, priority queues, which are

More information

CSE100. Advanced Data Structures. Lecture 8. (Based on Paul Kube course materials)

CSE100. Advanced Data Structures. Lecture 8. (Based on Paul Kube course materials) CSE100 Advanced Data Structures Lecture 8 (Based on Paul Kube course materials) CSE 100 Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs

More information

4. Suffix Trees and Arrays

4. Suffix Trees and Arrays 4. Suffix Trees and Arrays Let T = T [0..n) be the text. For i [0..n], let T i denote the suffix T [i..n). Furthermore, for any subset C [0..n], we write T C = {T i i C}. In particular, T [0..n] is the

More information

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation! Lecture 10: Multi-way Search Trees: intro to B-trees 2-3 trees 2-3-4 trees Multi-way Search Trees A node on an M-way search tree with M 1 distinct and ordered keys: k 1 < k 2 < k 3

More information

The strong chromatic number of a graph

The strong chromatic number of a graph The strong chromatic number of a graph Noga Alon Abstract It is shown that there is an absolute constant c with the following property: For any two graphs G 1 = (V, E 1 ) and G 2 = (V, E 2 ) on the same

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing 15-82 Advanced Topics in Database Systems Performance Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries 2 Indexing

More information

Orthogonal Range Search and its Relatives

Orthogonal Range Search and its Relatives Orthogonal Range Search and its Relatives Coordinate-wise dominance and minima Definition: dominates Say that point (x,y) dominates (x', y') if x

More information

A Distribution-Sensitive Dictionary with Low Space Overhead

A Distribution-Sensitive Dictionary with Low Space Overhead A Distribution-Sensitive Dictionary with Low Space Overhead Prosenjit Bose, John Howat, and Pat Morin School of Computer Science, Carleton University 1125 Colonel By Dr., Ottawa, Ontario, CANADA, K1S 5B6

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 B-Trees and External Memory 1 (2, 4) Trees: Generalization of BSTs Each internal

More information

DDS Dynamic Search Trees

DDS Dynamic Search Trees DDS Dynamic Search Trees 1 Data structures l A data structure models some abstract object. It implements a number of operations on this object, which usually can be classified into l creation and deletion

More information

38 Cache-Oblivious Data Structures

38 Cache-Oblivious Data Structures 38 Cache-Oblivious Data Structures Lars Arge Duke University Gerth Stølting Brodal University of Aarhus Rolf Fagerberg University of Southern Denmark 38.1 The Cache-Oblivious Model... 38-1 38.2 Fundamental

More information

A Fast Algorithm for Optimal Alignment between Similar Ordered Trees

A Fast Algorithm for Optimal Alignment between Similar Ordered Trees Fundamenta Informaticae 56 (2003) 105 120 105 IOS Press A Fast Algorithm for Optimal Alignment between Similar Ordered Trees Jesper Jansson Department of Computer Science Lund University, Box 118 SE-221

More information

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler Lecture 11: Multiway and (2,4) Trees 9 2 5 7 10 14 Courtesy to Goodrich, Tamassia and Olga Veksler Instructor: Yuzhen Xie Outline Multiway Seach Tree: a new type of search trees: for ordered d dictionary

More information

Orthogonal art galleries with holes: a coloring proof of Aggarwal s Theorem

Orthogonal art galleries with holes: a coloring proof of Aggarwal s Theorem Orthogonal art galleries with holes: a coloring proof of Aggarwal s Theorem Pawe l Żyliński Institute of Mathematics University of Gdańsk, 8095 Gdańsk, Poland pz@math.univ.gda.pl Submitted: Sep 9, 005;

More information

We have used both of the last two claims in previous algorithms and therefore their proof is omitted.

We have used both of the last two claims in previous algorithms and therefore their proof is omitted. Homework 3 Question 1 The algorithm will be based on the following logic: If an edge ( ) is deleted from the spanning tree, let and be the trees that were created, rooted at and respectively Adding any

More information

Lecture 15: The subspace topology, Closed sets

Lecture 15: The subspace topology, Closed sets Lecture 15: The subspace topology, Closed sets 1 The Subspace Topology Definition 1.1. Let (X, T) be a topological space with topology T. subset of X, the collection If Y is a T Y = {Y U U T} is a topology

More information

High Dimensional Indexing by Clustering

High Dimensional Indexing by Clustering Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should

More information