A Distribution-Sensitive Dictionary with Low Space Overhead
Prosenjit Bose, John Howat, and Pat Morin
School of Computer Science, Carleton University
1125 Colonel By Dr., Ottawa, Ontario, CANADA, K1S 5B6

Abstract. The time required for a sequence of operations on a data structure is usually measured in terms of the worst possible such sequence. This, however, is often an overestimate of the actual time required. Distribution-sensitive data structures attempt to take advantage of underlying patterns in a sequence of operations in order to reduce time complexity, since access patterns are non-random in many applications. Unfortunately, many of the distribution-sensitive structures in the literature require a great deal of space overhead in the form of pointers. We present a dictionary data structure that makes use of both randomization and existing space-efficient data structures to yield very low space overhead while maintaining distribution sensitivity in the expected sense.

1 Introduction

For the dictionary problem, we would like to efficiently support the operations Insert, Delete and Search over some totally ordered universe. There exist many such data structures: AVL trees [1], red-black trees [5] and splay trees [9], for instance. Splay trees are of particular interest because they are distribution-sensitive; that is, the time required for certain operations can be measured in terms of the distribution of those operations. In particular, splay trees have the working set property, which means that the time required to search for an element is logarithmic in the number of distinct accesses since that element was last searched for. Splay trees are not the only dictionary to provide the working set property; the working set structure [6] and the unified structure [6] also have it. Unfortunately, such dictionaries often require a significant amount of space overhead. Indeed, this is a problem with data structures in general.
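To make the working set property concrete, the following small Python function (our illustration, not from the paper) computes the working set number t(x) of a key with respect to an access sequence:

```python
def working_set_number(accesses, x, n):
    """t(x): the number of distinct keys accessed since the last
    access to x, or n if x has never been accessed."""
    try:
        # Index of the most recent access to x.
        last = len(accesses) - 1 - accesses[::-1].index(x)
    except ValueError:
        return n  # x has not been accessed yet
    return len(set(accesses[last + 1:]))

# A structure with the working set property answers a query for x in
# O(log t(x)) time, which can be far below O(log n) for recent keys.
print(working_set_number(["a", "b", "c", "b", "a"], "c", n=5))  # -> 2
```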
Space overhead often takes the form of pointers: a binary search tree, for instance, might have two pointers per node in the tree, one to each child. If this is the case, and we assume that pointers and keys have the same size, then it is easy to see that 2/3 of the storage used by the binary search tree consists of pointers. This seems wasteful, since we are really interested in the data itself and would rather not invest such a large fraction of space in overhead.

(This research was partially supported by NSERC and MRI.)

To remedy this situation,
there has been a great deal of research in the area of implicit data structures. An implicit data structure uses only the space required to hold the data itself, in addition to a constant number of words, each of size O(log n) bits. Implicit dictionaries are a particularly well-studied problem [8, 2, 4]. Our goal is to combine notions of distribution sensitivity with ideas from implicit dictionaries to yield a distribution-sensitive dictionary with low space overhead.

1.1 Our Results

We will present a dictionary data structure with worst-case insertion and deletion times O(log n) and expected search time O(log t(x)), where x is the key being searched for and t(x) is the number of distinct queries made since x was last searched for, or n if x has not yet been searched for. The space overhead required for this data structure is O(log log n), i.e., O(log log n) additional words of memory, each of size O(log n) bits, are required aside from the data itself. Current data structures that can match this query time include the splay tree [9] in the amortized sense and the working set structure [6] in the worst case, although these require 3n and 5n pointers respectively, assuming three pointers per node (one parent pointer and two child pointers). We also show how to modify this structure (and, by extension, the working set structure [6]) to support predecessor queries in time logarithmic in the working set number of the predecessor.

The rest of the paper is organized in the following way. Section 2 briefly summarizes the working set structure [6] and shows how to modify it to reduce its space overhead. Section 3 shows how to modify the new dictionary to support more useful queries with additional but sublinear overhead. These modifications are also applicable to the working set structure [6] and make both data structures considerably more useful. Section 4 concludes with possible directions for future research.
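As a quick sanity check on the O(log log n) space bound (our illustration, not the paper's), the doubly-exponential substructure sizes 2^(2^i) used below mean that very few levels are ever needed:

```python
def levels_needed(n):
    """Number of levels of capacities 2^(2^i) (i = 0, 1, 2, ...)
    needed to hold n elements; grows as O(log log n)."""
    k, total = 0, 0
    while total < n:
        total += 2 ** (2 ** k)
        k += 1
    return k

# Even a million elements fit in a handful of levels, so per-level
# bookkeeping of O(1) words costs only O(log log n) words in total.
print(levels_needed(10 ** 6))  # -> 6
```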
2 Modifying the Working Set Structure

In this section, we describe the data structure. We begin by briefly summarizing the working set structure [6]. We then show how to use randomization to remove the queues from the working set structure, and finally how to shrink the size of the trees in the working set structure.

2.1 The Working Set Structure

The working set structure [6] consists of k balanced binary search trees T_1, ..., T_k and k queues Q_1, ..., Q_k. Each queue contains precisely the same elements as its corresponding tree, and the size of T_i and Q_i is 2^(2^i), except for T_k and Q_k, which simply contain the remaining elements. Therefore, since there are n elements, k = O(log log n). The structure is manipulated with a shift operation in the
following manner. A shift from i to j is performed by dequeuing an element from Q_i and removing the corresponding element from T_i. The removed element is then inserted into the next tree and queue (where "next" refers to the tree closer to j), and the process is repeated until we reach T_j and Q_j. In this manner, the oldest elements are removed from the trees every time. The result of a shift is that the size of T_i and Q_i has decreased by one and the size of T_j and Q_j has increased by one.

Insertions are made by performing a usual dictionary insertion into T_1 and Q_1, and then shifting from the first index to the last index. Such a shift makes room for the newly inserted element in the first tree and queue by moving the oldest element in each tree and queue down one index. Deletions are accomplished by searching for the element, deleting it from the tree and queue it was found in, and then shifting from the last index to the index the element was found at. Such a shift fills in the gap created by the removed element by bringing elements up from further down the data structure. Finally, a search is performed by searching successively in T_1, T_2, ..., T_k until the element is found. This element is then removed from the corresponding tree and queue and inserted into T_1 and Q_1 in the manner described previously. By performing this shift, we ensure that elements searched for recently are towards the front of the data structure and will therefore be found quickly on subsequent searches.

The working set structure was shown by Iacono [6] to have insertion and deletion costs of O(log n) and a search cost of O(log t(x)), where x is the key being searched for and t(x) is the number of distinct queries made since x was last searched for, or n if x has not yet been searched for. To see that the search cost is O(log t(x)), consider that if x is found in T_i, it must have been dequeued from Q_{i-1} at some point.
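The operations described above can be sketched as follows. This is our simplified illustration, not the paper's implementation: sorted Python lists stand in for the balanced trees, and insertions trigger shifts via capacity checks rather than a full shift to the last index.

```python
import bisect
from collections import deque

class WorkingSetSketch:
    """Levels of (tree, queue) pairs; level i (0-based here) holds at
    most 2^(2^(i+1)) elements, and the queue orders them oldest-first."""
    def __init__(self, k):
        self.trees = [[] for _ in range(k)]        # stand-ins for BSTs
        self.queues = [deque() for _ in range(k)]  # oldest element on the left
        self.cap = [2 ** (2 ** (i + 1)) for i in range(k)]

    def _shift_down(self, i, j):
        """Shift from i to j (i < j): demote the oldest element of each
        level one level closer to j."""
        for lvl in range(i, j):
            oldest = self.queues[lvl].popleft()
            self.trees[lvl].remove(oldest)
            bisect.insort(self.trees[lvl + 1], oldest)
            self.queues[lvl + 1].append(oldest)

    def insert(self, x):
        bisect.insort(self.trees[0], x)
        self.queues[0].append(x)
        # Demote oldest elements until every level is within capacity
        # (the last level simply absorbs the overflow).
        for lvl in range(len(self.trees) - 1):
            if len(self.trees[lvl]) > self.cap[lvl]:
                self._shift_down(lvl, lvl + 1)

    def search(self, x):
        for lvl, tree in enumerate(self.trees):
            pos = bisect.bisect_left(tree, x)
            if pos < len(tree) and tree[pos] == x:
                # Move x to the front, then shift to fill the gap.
                tree.pop(pos)
                self.queues[lvl].remove(x)
                bisect.insort(self.trees[0], x)
                self.queues[0].append(x)
                self._shift_down(0, lvl)
                return True
        return False
```

The randomized variant of Section 2.2 would replace `self.queues[lvl].popleft()` with a uniformly random choice from the level, which is what lets the queues be discarded.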
If this is the case, then 2^(2^(i-1)) accesses to elements other than x have taken place since the last access to x, and therefore t(x) >= 2^(2^(i-1)). Since the search time for x is dominated by the search time in the tree it was found in, the cost is O(log 2^(2^i)) = O(2^i) = O(log t(x)).

2.2 Removing the Queues

Here we present a simple use of randomization to remove the queues from the working set structure. Rather than relying on the queue to inform the shifting procedure of the oldest element in the tree, we simply pick a random element in the tree and treat it exactly as we would the dequeued element. Lemma 1 shows that we still maintain the working set property in the expected sense.

Lemma 1. The expected search cost in the randomized working set structure is O(log t(x)).

Proof. Fix an element x and let t = t(x) denote the number of distinct accesses since x was last accessed. Suppose that x is in T_i and a sequence of accesses occurs during which t distinct accesses occur. Since T_i has size 2^(2^i), the probability that x is not removed from T_i during these accesses is at least
    Pr{x not removed from T_i after t accesses} >= (1 - 1/2^(2^i))^t >= 1 - t/2^(2^i).

Now, if x is in T_i, it must at some point have been selected for removal from T_{i-1}. The probability that x is in the i-th or a later tree is therefore at most

    Pr{x in T_j for some j >= i} <= 1 - (1 - 1/2^(2^(i-1)))^t <= t/2^(2^(i-1)).

An upper bound on the expectation E[S] of the search cost S is therefore

    E[S] =  sum_{i=1}^{k} O(log |T_i|) Pr{x in T_i}
         <= sum_{i=1}^{k} O(log 2^(2^i)) Pr{x in T_j for some j >= i}
         <= sum_{i=1}^{k} O(2^i) min{1, t/2^(2^(i-1))}.

The first log log t + 1 terms sum to O(2^(log log t + 1)) = O(log t). Shifting the index of the remaining terms by log log t, and using 2^(2^(i + log log t - 1)) = 2^(2^(i-1) log t) = t^(2^(i-1)), we obtain

    E[S] <= O(log t) + sum_{i >= 1} O(2^(i + log log t)) * t/2^(2^(i + log log t - 1))
         =  O(log t) + O(log t) sum_{i >= 1} O(2^i) * t/t^(2^(i-1))
         =  O(log t) + O(log t) sum_{i >= 1} O(2^i / t^(2^(i-1) - 1)).
It thus suffices to show that the remaining sum is O(1). We will assume that t >= 2, since otherwise x can be in at most the second tree and can therefore be found in O(1) time. Considering only the remaining sum

    sum_{i >= 1} O(2^i / t^(2^(i-1) - 1)),

the ratio of consecutive terms is

    (2^(i+1) / t^(2^i - 1)) / (2^i / t^(2^(i-1) - 1)) = 2 / t^(2^(i-1)) <= 2 / 2^(2^(i-1)),

which tends to 0 as i goes to infinity. Therefore, the ratio of consecutive terms is o(1), and the series is bounded by a decreasing geometric series and is therefore constant. An expected search time of O(log t(x)) follows.

At this point, we have seen how to eliminate the queues from the structure at the cost of making the search cost expected rather than worst-case. In the next section, we will show how to further reduce space overhead by shrinking the size of the trees.

2.3 Shrinking the Trees

Another source of space overhead in the working set structure is the trees themselves. As mentioned before, many pointers are required to support a binary search tree. Instead, we will borrow some ideas from the study of implicit data structures. Observe that there is nothing special about the trees used in the working set structure: they are simply dictionary data structures that support logarithmic-time query and update operations. In particular, we do not rely on the fact that they are trees. Therefore, we can replace these trees with one of the many implicit dictionary data structures in the literature (see, e.g., [8, 2, 4, 3]). The
dictionary of Franceschini and Grossi [3] provides a worst-case optimal implicit dictionary with access costs O(log n), and so we will employ these results. It is useful to note that any dictionary that offers polylogarithmic access times will yield the same kind of result: the access cost for each operation in our data structure will be the maximum of the access costs for the substructures, since the shifting operation consists of searches, insertions and deletions in the substructures.

Unfortunately, the resulting data structure is not implicit in the strict sense. Since each substructure can use O(1) words of size O(log n) bits and we have O(log log n) such substructures, the data structure as a whole could use as much as O(log log n) words of size O(log n) bits each. Nevertheless, this is a significant improvement over the O(n) additional words used by the traditional data structures. We have

Theorem 1. There exists a dictionary data structure that stores only the data required for its elements in addition to O(log log n) words of size O(log n) bits each. This dictionary supports insertions and deletions in worst-case O(log n) time and searches in expected O(log t(x)) time, where t(x) is the working set number of the query x.

3 Further Modifications

In this section, we describe a simple modification to the data structure outlined in Section 2 that makes searches more useful. This improvement comes at the cost of additional space overhead.

Until now, we have implicitly assumed that searches in our data structure are successful. If they are not, then we will end up searching in each substructure, at a total cost of O(log n), and returning nothing. Unfortunately, this is not very useful.^1 Typically, a dictionary will return the largest element in the dictionary that is smaller than the element searched for, or the smallest element larger than the element searched for.
Such predecessor and successor queries are a very important feature of comparison-based data structures: without them, one could simply use hashing to achieve O(1)-time operations. Predecessor queries are simple to implement in binary search trees, since we can simply examine where we "fell off" the tree. This trick will not work in our data structure, however, since we have many such substructures and we would have to know when to stop.

Our goal is thus the following. Given a search key x, we would, as before, like to return x in time O(log t(x)) if x is in the data structure. If x is not in the data structure, we would like to return pred(x) in time O(log t(pred(x))), where pred(x) denotes the predecessor of x.

To accomplish this, we will augment our data structure with some pointers. In particular, every item in the data structure will have a pointer to its successor. During an insertion, each substructure will be searched for the inserted element

^1 Note that this is also true of the original working set structure [6]. The modifications described here are also applicable to it.
for a total cost of O(log n), and the smallest successor found in each substructure will be recorded. The smallest such successor is clearly the new element's successor in the whole structure. Therefore, the cost of insertion remains O(log n). Similarly, during a deletion, only the predecessor of the deleted element needs to have its successor pointer updated, and thus the total cost of deletion remains O(log n).

During a search for x, we proceed as before. Consider searching in any particular substructure i. If the result of the search in substructure i is in fact x, then the analysis is exactly the same as before and we can return x in time O(log t(x)). Otherwise, we will not find x in substructure i. In this case, we search substructure i for the predecessor of x in that substructure.^2 Denote this pred_i(x). Since every element knows its successor in the structure as a whole, we can determine succ(pred_i(x)). If succ(pred_i(x)) = x, then we know that x is indeed in the structure, and thus the query can be completed as before. If succ(pred_i(x)) < x, then we know that there is still an element smaller than x but larger than what we have seen, and so we continue searching for x. Finally, if succ(pred_i(x)) > x, then we have reached the largest element less than or equal to x, and so our search stops. In any case, after we find x or pred(x), we shift the element we returned to the first substructure as usual.

The analysis of the time complexity of this search algorithm is exactly the same as before; we are essentially changing the element we are searching for during the search. The search time for the substructure we stop in dominates the cost of the search, and since we return x if we found it or pred(x) otherwise, the search time is O(log t(x)) if x is in the dictionary and O(log t(pred(x))) otherwise.

Of course, this augmentation incurs some additional space overhead. In particular, we now require n pointers, resulting in a space overhead of O(n).
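The stopping rule just described can be sketched as follows. This is our illustration, not the paper's code: sorted lists stand in for the substructures (ordered front to back), and `succ` is an explicit dictionary standing in for the successor pointers (None marks the maximum element).

```python
import bisect

def search_with_pred(substructures, succ, x):
    """Return x if present; otherwise return pred(x), the largest
    element smaller than x (None if no such element exists)."""
    best_pred = None
    for sub in substructures:
        pos = bisect.bisect_left(sub, x)
        if pos < len(sub) and sub[pos] == x:
            return x                 # found x itself
        if pos == 0:
            continue                 # this substructure has no element < x
        p = sub[pos - 1]             # pred_i(x) in this substructure
        if best_pred is None or p > best_pred:
            best_pred = p
        s = succ.get(p)
        if s == x:
            continue                 # x is in the structure; keep searching for it
        if s is None or s > x:
            return best_pred         # p's successor is past x, so p = pred(x)
        # s < x: an element between p and x lies further down; keep looking
    return best_pred
```

For example, with substructures `[[10, 30], [5, 20, 40]]` the search for 25 stops as soon as it sees that succ(20) = 30 > 25 and returns 20, without a full scan of later substructures.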
While the queues are now gone, we still have one pointer for each element in the dictionary. To fix this, observe that we can leave out the pointers for the last few trees and simply do a brute-force search, at the cost of slightly higher time complexity. Suppose we leave the pointers out of the last j trees: T_{k-j+1} to T_k, where k represents the index of the last tree, as before. Therefore, each element x in the trees T_1, ..., T_{k-j} has a pointer to succ(x). Now, suppose we are searching in the data structure and get to T_{k-j+1}. At this point, we may need to search all remaining trees, since if we do not find the key we are looking for, we have no way of knowing when we have found its predecessor. Consider the following lemma.

Lemma 2. Let 0 <= j <= k. A predecessor search for x takes expected time O(2^j log t(x)) if x is in the dictionary and O(2^j log t(pred(x))) otherwise.

Proof. Assume the search reaches T_{k-j+1}, since otherwise our previous analyses apply. We therefore have t(x) >= 2^(2^(k-j)), since at least 2^(2^(k-j)) operations have taken

^2 Here we are assuming that substructures support predecessor queries in the same time required for searching. This is not a strong assumption, since any comparison-based dictionary must compare x to succ(x) and pred(x) during an unsuccessful search for x.
place.^3 As before, the search time is bounded by the search time in T_k. Since T_k has size at most 2^(2^k), we have an expected search time of

    O(log 2^(2^k)) = O(2^j log 2^(2^(k-j))) = O(2^j log t(x)).

This analysis applies as well when x is not in the dictionary. In this case, the search can be accomplished in time O(2^j log t(pred(x))).

One further consideration is that once an element is found in the last j trees, we need to find its successor, so that it knows where it is once it is shifted to the front. However, this is straightforward, because we have already examined all substructures in the data structure and so we can make a second pass.

It remains to consider how much space we have saved using this scheme. Since each tree has size the square of the previous, by leaving out the last j trees, the total number of extra pointers used is O(n^(1/2^j)). We therefore have

Theorem 2. Let 0 <= j <= k. There exists a dictionary that stores only the data required for its elements in addition to O(n^(1/2^j)) words of size O(log n) bits each. This dictionary supports insertions and deletions in worst-case O(log n) time and searches in expected O(2^j log t(x)) time if x is found in the dictionary and expected O(2^j log t(pred(x))) time otherwise, in which case pred(x) is returned.

Observe that j <= k = O(log log n), and so while the dependence on j is exponential, it is still quite small relative to n. In particular, take j = 1 to get

Corollary 1. There exists a dictionary that stores only the data required for its elements in addition to O(sqrt(n)) words of size O(log n) bits each. This dictionary supports insertions and deletions in worst-case O(log n) time and searches in expected O(log t(x)) time if x is found in the dictionary. If x is not in the dictionary, pred(x) is returned in expected O(log t(pred(x))) time.

4 Conclusion

We have seen how to modify Iacono's working set structure [6] in several ways.
To become more space-efficient, we can remove the queues and use randomization to shift elements, while replacing the underlying binary search trees with implicit dictionaries. To support more useful search queries, we can sacrifice some space overhead to maintain information about some portion of the elements in the dictionary, in order to support returning the predecessor for any otherwise-unsuccessful search query. All such modifications maintain the working set property in an expected sense.

^3 This follows from Lemma 1, which is why the results here hold in the expected sense.
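As a rough numeric illustration of the space/time trade-off in Theorem 2 (our arithmetic, not the paper's), the pointer count n^(1/2^j) collapses very quickly as j grows, while the search slowdown is only a 2^j factor:

```python
# Successor pointers are kept only in the first k - j trees, costing a
# 2^j factor on predecessor searches but only O(n^(1/2^j)) pointers.
n = 2 ** 32
for j in range(4):
    print(f"j={j}: ~{round(n ** (1 / 2 ** j)):,} pointers, "
          f"search slowdown factor {2 ** j}")
```

For j = 1 this is Corollary 1: about sqrt(n) = 65,536 pointers for n = 2^32 elements, with no asymptotic slowdown.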
4.1 Future Work

The modifications described in this paper leave open a few directions for research. The idea of relying on the properties of the substructures (in this case, implicitness) proved fruitful. A natural question to ask, then, is what other substructure properties can carry over to the dictionary as a whole in a useful way? Other substructures could result in a combination of the working set property and some other useful properties.

In this paper, we concerned ourselves with the working set property. There are other types of distribution sensitivity, such as the dynamic finger property, which means that query time is logarithmic in the rank difference between successive queries. One could also consider a notion complementary to the idea of the working set property, namely the queueish property [7], wherein query time is logarithmic in the number of items not accessed since the query item was last accessed. Are there implicit dictionaries that provide either of these properties? Could we provide any of these properties, or some analogue of them, for other types of data structures? Finally, it would be of interest to see if a data structure that does not rely on randomization is possible, in order to guarantee a worst-case time complexity of O(log t(x)) instead of an expected one.

References

1. G.M. Adelson-Velskii and E.M. Landis. An algorithm for the organization of information. Soviet Math. Doklady, 3:1259-1263, 1962.
2. Gianni Franceschini and Roberto Grossi. Implicit dictionaries supporting searches and amortized updates in O(log n log log n) time. In SODA '03: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 670-678, 2003.
3. Gianni Franceschini and Roberto Grossi. Optimal worst-case operations for implicit cache-oblivious search trees. In WADS '03: Proceedings of the 8th International Workshop on Algorithms and Data Structures, pages 114-126, 2003.
4. Gianni Franceschini and J. Ian Munro. Implicit dictionaries with O(1) modifications per update and fast search.
In SODA '06: Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 404-413, 2006.
5. Leonidas J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In FOCS '78: Proceedings of the 19th Annual IEEE Symposium on Foundations of Computer Science, pages 8-21, 1978.
6. John Iacono. Alternatives to splay trees with O(log n) worst-case access times. In SODA '01: Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 516-522, 2001.
7. John Iacono and Stefan Langerman. Queaps. Algorithmica, 42(1):49-56, 2005.
8. J. Ian Munro. An implicit data structure supporting insertion, deletion, and search in O(log^2 n) time. J. Comput. Syst. Sci., 33(1):66-74, 1986.
9. Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. J. ACM, 32(3):652-686, 1985.
Worst-case running time for RANDOMIZED-SELECT is ), even to nd the minimum The algorithm has a linear expected running time, though, and because it is randomized, no particular input elicits the worst-case
More informationINSERTION SORT is O(n log n)
Michael A. Bender Department of Computer Science, SUNY Stony Brook bender@cs.sunysb.edu Martín Farach-Colton Department of Computer Science, Rutgers University farach@cs.rutgers.edu Miguel Mosteiro Department
More informationRandomized Splay Trees Final Project
Randomized Splay Trees 6.856 Final Project Robi Bhattacharjee robibhat@mit.edu Ben Eysenbach bce@mit.edu May 17, 2016 Abstract We propose a randomization scheme for splay trees with the same time complexity
More information1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1
Asymptotics, Recurrence and Basic Algorithms 1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 2. O(n) 2. [1 pt] What is the solution to the recurrence T(n) = T(n/2) + n, T(1)
More informationA TIGHT BOUND ON THE LENGTH OF ODD CYCLES IN THE INCOMPATIBILITY GRAPH OF A NON-C1P MATRIX
A TIGHT BOUND ON THE LENGTH OF ODD CYCLES IN THE INCOMPATIBILITY GRAPH OF A NON-C1P MATRIX MEHRNOUSH MALEKESMAEILI, CEDRIC CHAUVE, AND TAMON STEPHEN Abstract. A binary matrix has the consecutive ones property
More informationICS 691: Advanced Data Structures Spring Lecture 3
ICS 691: Advanced Data Structures Spring 2016 Prof. Nodari Sitchinava Lecture 3 Scribe: Ben Karsin 1 Overview In the last lecture we started looking at self-adjusting data structures, specifically, move-to-front
More informationAn Optimal Algorithm for the Euclidean Bottleneck Full Steiner Tree Problem
An Optimal Algorithm for the Euclidean Bottleneck Full Steiner Tree Problem Ahmad Biniaz Anil Maheshwari Michiel Smid September 30, 2013 Abstract Let P and S be two disjoint sets of n and m points in the
More informationThe Encoding Complexity of Network Coding
The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network
More informationProperties of red-black trees
Red-Black Trees Introduction We have seen that a binary search tree is a useful tool. I.e., if its height is h, then we can implement any basic operation on it in O(h) units of time. The problem: given
More informationDynamic Optimality and Multi-Splay Trees 1
Dynamic Optimality and Multi-Splay Trees 1 Daniel Dominic Sleator and Chengwen Chris Wang November 5, 2004 CMU-CS-04-171 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract
More informationLecture 6: External Interval Tree (Part II) 3 Making the external interval tree dynamic. 3.1 Dynamizing an underflow structure
Lecture 6: External Interval Tree (Part II) Yufei Tao Division of Web Science and Technology Korea Advanced Institute of Science and Technology taoyf@cse.cuhk.edu.hk 3 Making the external interval tree
More informationCOMP Analysis of Algorithms & Data Structures
COMP 3170 - Analysis of Algorithms & Data Structures Shahin Kamali Binary Search Trees CLRS 12.2, 12.3, 13.2, read problem 13-3 University of Manitoba COMP 3170 - Analysis of Algorithms & Data Structures
More informationThe Unified Segment Tree and its Application to the Rectangle Intersection Problem
CCCG 2013, Waterloo, Ontario, August 10, 2013 The Unified Segment Tree and its Application to the Rectangle Intersection Problem David P. Wagner Abstract In this paper we introduce a variation on the multidimensional
More informationRank-Pairing Heaps. Bernard Haeupler Siddhartha Sen Robert E. Tarjan. SIAM JOURNAL ON COMPUTING Vol. 40, No. 6 (2011), pp.
Rank-Pairing Heaps Bernard Haeupler Siddhartha Sen Robert E. Tarjan Presentation by Alexander Pokluda Cheriton School of Computer Science, University of Waterloo, Canada SIAM JOURNAL ON COMPUTING Vol.
More informationCombinatorial Problems on Strings with Applications to Protein Folding
Combinatorial Problems on Strings with Applications to Protein Folding Alantha Newman 1 and Matthias Ruhl 2 1 MIT Laboratory for Computer Science Cambridge, MA 02139 alantha@theory.lcs.mit.edu 2 IBM Almaden
More informationHeap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the
Heap-on-Top Priority Queues Boris V. Cherkassky Central Economics and Mathematics Institute Krasikova St. 32 117418, Moscow, Russia cher@cemi.msk.su Andrew V. Goldberg NEC Research Institute 4 Independence
More informationAre Fibonacci Heaps Optimal? Diab Abuaiadh and Jeffrey H. Kingston ABSTRACT
Are Fibonacci Heaps Optimal? Diab Abuaiadh and Jeffrey H. Kingston ABSTRACT In this paper we investigate the inherent complexity of the priority queue abstract data type. We show that, under reasonable
More informationSuccinct Geometric Indexes Supporting Point Location Queries
Succinct Geometric Indexes Supporting Point Location Queries PROSENJIT BOSE Carleton University ERIC Y. CHEN Google Inc. MENG HE University of Waterloo and ANIL MAHESHWARI and PAT MORIN Carleton University
More informationRed-Black, Splay and Huffman Trees
Red-Black, Splay and Huffman Trees Kuan-Yu Chen ( 陳冠宇 ) 2018/10/22 @ TR-212, NTUST AVL Trees Review Self-balancing binary search tree Balance Factor Every node has a balance factor of 1, 0, or 1 2 Red-Black
More informationCOMP3121/3821/9101/ s1 Assignment 1
Sample solutions to assignment 1 1. (a) Describe an O(n log n) algorithm (in the sense of the worst case performance) that, given an array S of n integers and another integer x, determines whether or not
More informationSTUDY OF CONFLUENT PERSISTENCE FOR DATA STRUCTURES IN THE POINTER MACHINE MODEL.
UNIVERSITY OF THESSALY MASTER S THESIS STUDY OF CONFLUENT PERSISTENCE FOR DATA STRUCTURES IN THE POINTER MACHINE MODEL. Panayiotis Vlastaridis Supervisors: Panayiotis Bozanis, Associate Professor Panagiota
More informationLecture Notes: External Interval Tree. 1 External Interval Tree The Static Version
Lecture Notes: External Interval Tree Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk This lecture discusses the stabbing problem. Let I be
More information/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17
601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 5.1 Introduction You should all know a few ways of sorting in O(n log n)
More informationConvex Hull of the Union of Convex Objects in the Plane: an Adaptive Analysis
CCCG 2008, Montréal, Québec, August 13 15, 2008 Convex Hull of the Union of Convex Objects in the Plane: an Adaptive Analysis Jérémy Barbay Eric Y. Chen Abstract We prove a tight asymptotic bound of Θ(δ
More informationarxiv: v1 [cs.ds] 15 Apr 2011
Confluent Persistence Revisited Sébastien Collette John Iacono Stefan Langerman arxiv:0.05v [cs.ds] 5 Apr 20 Abstract It is shown how to enhance any data structure in the pointer model to make it confluently
More informationDDS Dynamic Search Trees
DDS Dynamic Search Trees 1 Data structures l A data structure models some abstract object. It implements a number of operations on this object, which usually can be classified into l creation and deletion
More informationNotes on Binary Dumbbell Trees
Notes on Binary Dumbbell Trees Michiel Smid March 23, 2012 Abstract Dumbbell trees were introduced in [1]. A detailed description of non-binary dumbbell trees appears in Chapter 11 of [3]. These notes
More informationprinceton univ. F 15 cos 521: Advanced Algorithm Design Lecture 2: Karger s Min Cut Algorithm
princeton univ. F 5 cos 5: Advanced Algorithm Design Lecture : Karger s Min Cut Algorithm Lecturer: Pravesh Kothari Scribe:Pravesh (These notes are a slightly modified version of notes from previous offerings
More information1 Figure 1: Zig operation on x Figure 2: Zig-zig operation on x
www.alepho.com 1 1 Splay tree Motivation for this data structure is to have binary tree which performs rotations when nodes is accessed. Operations of interest are finding, inserting and deleting key,
More informationBinary Heaps in Dynamic Arrays
Yufei Tao ITEE University of Queensland We have already learned that the binary heap serves as an efficient implementation of a priority queue. Our previous discussion was based on pointers (for getting
More informationBIASED RANGE TREES. Vida Dujmović John Howat Pat Morin
BIASED RANGE TREES Vida Dujmović John Howat Pat Morin ABSTRACT. A data structure, called a biased range tree, is presented that preprocesses a set S of n points in R 2 and a query distribution D for 2-sided
More informationLecture 19 April 15, 2010
6.851: Advanced Data Structures Spring 2010 Prof. Erik Demaine Lecture 19 April 15, 2010 1 Overview This lecture is an interlude on time travel in data structures. We ll consider two kinds of time-travel
More information2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006
2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,
More informationAdvanced Algorithms. Class Notes for Thursday, September 18, 2014 Bernard Moret
Advanced Algorithms Class Notes for Thursday, September 18, 2014 Bernard Moret 1 Amortized Analysis (cont d) 1.1 Side note: regarding meldable heaps When we saw how to meld two leftist trees, we did not
More informationOnline Facility Location
Online Facility Location Adam Meyerson Abstract We consider the online variant of facility location, in which demand points arrive one at a time and we must maintain a set of facilities to service these
More informationLecture 5. Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs
Lecture 5 Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs Reading: Randomized Search Trees by Aragon & Seidel, Algorithmica 1996, http://sims.berkeley.edu/~aragon/pubs/rst96.pdf;
More informationCSE 502 Class 16. Jeremy Buhler Steve Cole. March A while back, we introduced the idea of collections to put hash tables in context.
CSE 502 Class 16 Jeremy Buhler Steve Cole March 17 2015 Onwards to trees! 1 Collection Types Revisited A while back, we introduced the idea of collections to put hash tables in context. abstract data types
More informationGeometric Unique Set Cover on Unit Disks and Unit Squares
CCCG 2016, Vancouver, British Columbia, August 3 5, 2016 Geometric Unique Set Cover on Unit Disks and Unit Squares Saeed Mehrabi Abstract We study the Unique Set Cover problem on unit disks and unit squares.
More informationBalanced Binary Search Trees
Balanced Binary Search Trees In the previous section we looked at building a binary search tree. As we learned, the performance of the binary search tree can degrade to O(n) for operations like getand
More informationLecture 01 Feb 7, 2012
6.851: Advanced Data Structures Spring 2012 Lecture 01 Feb 7, 2012 Prof. Erik Demaine Scribe: Oscar Moll (2012), Aston Motes (2007), Kevin Wang (2007) 1 Overview In this first lecture we cover results
More informationRepresenting Dynamic Binary Trees Succinctly
Representing Dynamic Binary Trees Succinctly J. Inn Munro * Venkatesh Raman t Adam J. Storm* Abstract We introduce a new updatable representation of binary trees. The structure requires the information
More informationComputer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees
Computer Science 10 Data Structures Siena College Fall 018 Topic Notes: Binary Search Trees Possibly the most common usage of a binary tree is to store data for quick retrieval. Definition: A binary tree
More information9/24/ Hash functions
11.3 Hash functions A good hash function satis es (approximately) the assumption of SUH: each key is equally likely to hash to any of the slots, independently of the other keys We typically have no way
More informationarxiv: v1 [math.co] 7 Dec 2018
SEQUENTIALLY EMBEDDABLE GRAPHS JACKSON AUTRY AND CHRISTOPHER O NEILL arxiv:1812.02904v1 [math.co] 7 Dec 2018 Abstract. We call a (not necessarily planar) embedding of a graph G in the plane sequential
More informationComparing the strength of query types in property testing: The case of testing k-colorability
Comparing the strength of query types in property testing: The case of testing k-colorability Ido Ben-Eliezer Tali Kaufman Michael Krivelevich Dana Ron Abstract We study the power of four query models
More informationHashing. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong
Department of Computer Science and Engineering Chinese University of Hong Kong In this lecture, we will revisit the dictionary search problem, where we want to locate an integer v in a set of size n or
More informationCache-Oblivious Persistence
Cache-Oblivious Persistence Pooya Davoodi 1,, Jeremy T. Fineman 2,, John Iacono 1,,andÖzgür Özkan 1 1 New York University 2 Georgetown University Abstract. Partial persistence is a general transformation
More informationThree Sorting Algorithms Using Priority Queues
Three Sorting Algorithms Using Priority Queues Amr Elmasry Computer Science Department Alexandria University Alexandria, Egypt. elmasry@cs.rutgers.edu Abstract. We establish a lower bound of B(n) = n log
More informationLecture 7 March 5, 2007
6.851: Advanced Data Structures Spring 2007 Prof. Erik Demaine Lecture 7 March 5, 2007 Scribes: Aston Motes and Kevin Wang 1 Overview In previous lectures, we ve discussed data structures that efficiently
More informationExpected Time in Linear Space
Optimizing Integer Sorting in O(n log log n) Expected Time in Linear Space Ajit Singh M. E. (Computer Science and Engineering) Department of computer Science and Engineering, Thapar University, Patiala
More informationEfficient pebbling for list traversal synopses
Efficient pebbling for list traversal synopses Yossi Matias Ely Porat Tel Aviv University Bar-Ilan University & Tel Aviv University Abstract 1 Introduction 1.1 Applications Consider a program P running
More informationThroughout this course, we use the terms vertex and node interchangeably.
Chapter Vertex Coloring. Introduction Vertex coloring is an infamous graph theory problem. It is also a useful toy example to see the style of this course already in the first lecture. Vertex coloring
More informationInterval Stabbing Problems in Small Integer Ranges
Interval Stabbing Problems in Small Integer Ranges Jens M. Schmidt Freie Universität Berlin, Germany Enhanced version of August 2, 2010 Abstract Given a set I of n intervals, a stabbing query consists
More informationOn the other hand, the main disadvantage of the amortized approach is that it cannot be applied in real-time programs, where the worst-case bound on t
Randomized Meldable Priority Queues Anna Gambin and Adam Malinowski Instytut Informatyki, Uniwersytet Warszawski, Banacha 2, Warszawa 02-097, Poland, faniag,amalg@mimuw.edu.pl Abstract. We present a practical
More informationMulti-Way Search Trees
Multi-Way Search Trees Manolis Koubarakis 1 Multi-Way Search Trees Multi-way trees are trees such that each internal node can have many children. Let us assume that the entries we store in a search tree
More informationOn k-dimensional Balanced Binary Trees*
journal of computer and system sciences 52, 328348 (1996) article no. 0025 On k-dimensional Balanced Binary Trees* Vijay K. Vaishnavi Department of Computer Information Systems, Georgia State University,
More informationAVL Trees. (AVL Trees) Data Structures and Programming Spring / 17
AVL Trees (AVL Trees) Data Structures and Programming Spring 2017 1 / 17 Balanced Binary Tree The disadvantage of a binary search tree is that its height can be as large as N-1 This means that the time
More information