Representing Dynamic Binary Trees Succinctly

Representing Dynamic Binary Trees Succinctly

J. Ian Munro*    Venkatesh Raman†    Adam J. Storm*

Abstract

We introduce a new updatable representation of binary trees. The structure requires the information theoretic minimum 2n + o(n) bits and supports basic navigational operations in constant time and subtree size in O(lg n) time. In contrast to the linear update costs of previously proposed succinct representations, our representation supports updates in O(lg² n) amortized time.

1 Introduction

Trees, particularly binary trees, are elementary structures in many aspects of computing. The standard representation of a tree, with a pointer or two per parent-child relationship, is easy to navigate and update. Furthermore, the structure can easily be augmented so that operations such as determining subtree size can also be supported in constant time. Unfortunately, this representation can be very costly, even prohibitive, in terms of space. This is particularly true in applications such as text indexing, where a node of a binary tree corresponds to an index point in a text file. Taking the reasonable point of view that a pointer used in representing an n node tree takes lg n bits, the usual representation of a binary tree requires 2n lg n bits (even without parent pointers). On the other hand, a binary tree can be represented in fewer than 2n bits, as there are only (2n choose n)/(n + 1), or about 2^{2n}/n^{3/2}, binary trees on n nodes. Indeed Jacobson[5], Munro and Raman[7], and others have proposed 2n + o(n) bit representations that permit fast navigation of a tree. These approaches, and the work presented here, all lead to a mapping from the n (internal) nodes of a tree onto the integers [1, n] and from the external nodes onto the integers [1, n + 1]. This leads to a way of associating auxiliary data with internal and external nodes. Most of these approaches, however, are inherently static. The focus of this paper is a succinct, quickly navigable and updatable representation of binary trees.
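The counting argument above is easy to check directly. A small Python sketch (illustrative only, not part of the paper's construction) computes the number of binary trees on n nodes and the resulting information-theoretic bound:

```python
from math import comb, log2

def num_binary_trees(n):
    # Catalan number: there are C(2n, n) / (n + 1) binary trees on n nodes
    return comb(2 * n, n) // (n + 1)

def min_bits(n):
    # Information-theoretic lower bound: lg of the number of trees,
    # which is 2n - Theta(lg n), i.e. just under 2 bits per node
    return log2(num_binary_trees(n))
```

For n = 1000 this gives roughly 1984 bits, comfortably below the roughly 2n lg n = 20000 bits of the explicit pointer representation.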
*Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada, {imunro, ajstorm}@uwaterloo.ca
†Institute of Mathematical Sciences, Chennai, India, vraman@imsc.ernet.in

Our representations deal with arbitrary binary trees on n (internal) nodes. Logically attaching an external node to each position in the tree without a child, we have n + 1 external nodes. Data may be associated with all internal and/or all external nodes. This data is taken to be either of constant size or a lg n bit reference¹. One choice may be made for the size of internal node data and another for external node data. Proofs, however, are given only for the case where data is associated with external nodes. The updates permitted are the natural insertion or deletion of a single node. We allow insertions to the tree along an edge or by inserting a new leaf. Conversely, a node with one child or a leaf may be deleted. We adopt a natural model of a random access machine under which a lg n bit word can be manipulated with the usual operations in unit time, i.e. the size of the tree roughly matches the word size. It was under this model that Jacobson[5] showed how to represent a tree using 2n + o(n) bits and be able to determine the parent or child of a node in lg n bit inspections. Munro and Raman[7] improved this to inspecting a constant number of lg n bit words, and added a number of operations including subtree size. Clark and Munro[3] gave a representation aimed at large trees to be kept on secondary storage. They broke the tree into pieces so that each piece could be stored on a page of memory using a 3n + o(n) bit representation. An update could be made by totally recomputing the page in question and modifying any other pages along the path from the root. In a disk based model, of course, this implies rewriting all pages along such a path. Nevertheless, their approach was effective in the practice of maintaining suffix trees.
Although the approach taken here is very different, their work is, in a very loose sense, the starting point for the work presented here. Our main results can be stated as follows:

Theorem 1.1 There exists a 2n + o(n) bit binary tree representation that can be created in linear time and facilitates navigation and subtree size queries in constant time. The structure also supports finding any extra (fixed size) data associated with nodes. Given the location at which an insertion or deletion is to be performed, updates to the tree can be made in poly-log time depending on the data associated with internal and/or external nodes. In particular:

- If no data is associated with nodes, update time is O(lg² n) worst case and O(lg lg n) amortized.
- If data of fixed constant size is associated with internal nodes and/or external nodes, update time is O(lg³ n) worst case and O(lg n) amortized.
- If data of O(lg n) bits (such as references to an arbitrary record) is associated with internal and/or external nodes, update time is O(lg⁴ n) worst case and O(lg² n) amortized.

In the next section we give a high level description of our structure. Subsequently, we provide a more detailed description of the structure and how it facilitates insertions and deletions.

¹We use lg n to denote ⌈lg₂ n + 1⌉.

2 Overview of the Structure

We first describe our data structure, giving the invariants that are later used by the search procedures and maintained by the update algorithms. The basic notion is to divide the tree into subtrees of size O(lg² n) (the root's subtree may be smaller). We call these O(lg² n) sized subtrees small trees and we store them in blocks. Each of these small trees is then subdivided into tiny trees of O(lg n) nodes. These are stored in sub-blocks. The limited size of these tiny trees enables us to maintain a table of the representations of all possible binary trees of size at most c lg n (for any constant c < 1/2). Such a table permits the representation of a tiny tree of size O(lg n) using an O(lg n) sized pointer to its representation. Moreover, since there are only O(lg n) tiny trees for each small tree, we can use explicit O(lg lg n) sized pointers between tiny trees. We now give a more detailed description of the blocking structure and how it responds to additions (the structure's response to deletions is analogous).

2.1 Blocks

The tree is divided into subtrees of between lg² n and 3 lg² n nodes. These small trees are stored in blocks.
(Note that the root's block is the only block that may be less than lg² n nodes in size.) This division can be done using a greedy algorithm which performs a postorder traversal of the tree in the following manner: At each node we determine the size of the "incomplete blocks" presently containing each of its children. Each external node is viewed as an incomplete block of size 0 and passed to its parent. At each internal node a new block is formed by taking the incomplete block from each child together with the node itself. If this new block contains at least lg² n nodes it is a "complete block" and a new, empty incomplete block is passed to its parent. Otherwise, the combined block remains incomplete and is passed to the parent. Finally, the block containing the root is viewed as complete regardless of its size. Clearly we could restrict the size of a (complete) block to being between lg² n and 2 lg² n nodes; however, relaxing the upper bound to 3 lg² n will be helpful in performing updates.

The matter of references between small trees is, however, of some concern. As each small tree will have one parent node in another small tree, only O(n/lg² n) pointers (= O(n/lg n) = o(n) bits) are required for references between parent and child small trees. However, an individual small tree could have O(lg² n) child small trees. Hence these inter-block child pointers are not stored in the blocks themselves, but in an auxiliary structure. As a consequence the size of a block will depend only on the number of nodes in the small tree it represents.

Block Organization

In allocating storage during updates, it is convenient to group together subtrees of roughly the same size. Hence we say that blocks with between lg² n + (i − 1) lg n and lg² n + i lg n nodes are in group i. Each block in the grouping is allocated the same amount of space: adequate for the largest but wasteful only by a factor of (1 + 1/lg n) for the smallest.
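The greedy decomposition just described can be sketched as follows (an illustrative Python sketch; the tree encoding and block bookkeeping are simplified assumptions, not the paper's actual bit-level layout):

```python
def greedy_blocks(children, root, min_size):
    """Partition a binary tree into 'small trees' of at least min_size nodes
    via a postorder traversal; only the root's block may end up smaller.
    children maps a node to its (left, right) pair, None marking no child."""
    blocks = []

    def visit(v):
        if v is None:
            return []                  # external node: incomplete block of size 0
        left, right = children[v]
        pending = visit(left) + visit(right) + [v]
        if len(pending) >= min_size:
            blocks.append(pending)     # complete block
            return []                  # pass a fresh incomplete block upward
        return pending                 # still incomplete: pass it to the parent

    leftover = visit(root)
    if leftover:
        blocks.append(leftover)        # root's block, complete regardless of size
    return blocks
```

On a right chain of 10 nodes with min_size = 4 this produces blocks of sizes 4, 4 and 2, the undersized block containing the root.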
Within a block grouping, blocks are stored contiguously so that no space is maintained between blocks. Block groupings are stored in an array, ordered by block size. The grouping with the smallest blocks is first in the array, and the grouping of the largest blocks is at the end. We use the optimally resizable arrays of Brodnik et al.[2], which permit accesses, extensions, and contractions of the array to be performed in constant time. The space overhead is proportional to the square root of the number of "words" in the array. Between each pair of block groupings is some empty space to facilitate growth and contraction. This will be at most 3 lg n words of lg n bits each (i.e. at most 3 lg² n bits). To maintain a traversable structure, blocks are connected to their children (and children to parents) using explicit pointers of size lg n. This gives a total of 2 pointers per block (one parent, one child) since each block can be referenced only once as a child. As a result, while one block may have O(lg² n) pointers, there will be no more than O(n/lg² n) pointers in total.

Inter-Block Pointers

All inter-block pointers for a given block are stored contiguously in a separate pointer block. The main difference between the storage technique used for blocks and that used for pointer blocks is that between pointer block groupings there is no unused space. This is because, as we shall see later, pointer blocks only need modification upon block splitting or merging. As a result, when pointer block sizes change, they do so dramatically, and so the mechanism employed in block rearranging is invalid. The technique used to maintain pointer blocks is somewhat involved. It can, however, be found in [8]. The inter-block pointers are arranged in a B-tree within the pointer block so that an external node can find its pointer in constant time. We discuss the details of this B-tree below.

2.2 Sub-Blocks

There are two main components of each sub-block: a pointer to the table representation, and leaf numbering information. We first discuss the crucial aspect of the pointer component, since it is by far the more important aspect, and then we go on to explain the external node numbering information, why it is necessary, and how it is used.

Pointer To Tree Representation

The blocks of size O(lg² n) are divided into sub-blocks of between ⅛ lg n and ¼ lg n nodes. As mentioned before, these sub-blocks are pointers into a table containing a representation of every possible binary tree with at most ¼ lg n nodes. Since there are roughly 2^{2r} binary trees on r (or up to r) nodes, there will be roughly √n entries in the table. As a consequence there is no problem with the space requirements of representing a copy of each possible subtree; therefore, we will ignore the actual table of √n trees for the present. The table will actually maintain additional information useful when performing insertions and deletions, so we shall return to the issue later. The references to subtrees in this table could be given by the parenthesis encoding of Jacobson[5] or of Munro and Raman[7].
We observe, however, that as there are about 4^r/(√π r^{3/2}) binary trees on r nodes, 2r − (3/2) lg r − O(1) bits suffice. Adding a lg r + 2 lg lg r bit prefix to indicate the value of r gives us a 2r − (1/2) lg r + o(lg r) bit designation for a subtree. Hence each sub-block of size r can use fewer than 2r bits to encode itself. Ultimately the space taken by our encoding is dominated by these virtually optimal encodings of the "tiny" trees. All other space used is to facilitate navigation, updates and interpretation. Within this extra space we require references to parent and child sub-blocks and to external data fields.

Lemma 2.1 (Table Size) Representing the Θ(lg n) tables can be achieved with O(n^c lg n) bits.

It is interesting that the tables do not contribute to the dominant space term of our structure.

External Node Numberings

In addition to a pointer to the explicit tree representation, each sub-block stores some information used to determine external node numbers within a given block. As described below, there are three types of external nodes in a sub-block: inter-sub-block pointers, inter-block pointers, and genuine external nodes of the tree (these may be implications of real data). Due to the way pointers are stored in our structure, we must be able to determine, in constant time, how many external nodes precede a given node, within the block, in a preorder traversal. To achieve this we store an array (called the external node numbering array) in each block, which has an entry for each sub-block. A given sub-block's entry in the array stores the number of external nodes that precede the first external node of the sub-block. Additionally, in the table of tree representations, for each node of a "tiny" tree we store the number of external nodes preceding the node within the sub-block.
With this information, and the number of external nodes before the root of the sub-block within the block, we can determine external node numberings for each node of the tree in constant time.

Sub-Block Pointers

We know that inter-sub-block pointers need only be of size lg lg n. Moreover, the number of inter-sub-block pointers in a given sub-block is one less than the number of sub-blocks in the block. To store the pointers we number the external nodes in postorder and, as previously mentioned, maintain a count of the number of external nodes before a sub-block. We store the list of a block's inter-sub-block pointers in a B-tree that is in fact a simplified version of a fusion tree[4] with the following properties:

1. Each node of the B-tree is of size ½ lg n bits.
2. The keys of the tree are of size lg lg n bits and represent the external node's number in the block.
3. The ½ lg n bit nodes each store from lg n/(4 lg lg n) to lg n/(2 lg lg n) of these lg lg n sized keys.
4. At all times the tree is of height 2.

We know that a B-tree with these properties will remain of height 2 with up to lg² n/(4 lg lg n)² > lg n keys.

Since we have O(lg n) external nodes that are inter-sub-block pointers in each block, we say that an external node is an inter-sub-block pointer if it is represented by a key in the tree. Assuming that the external node is found in the tree, we must then find the pointer associated with the desired position. To allow this to be done in constant time we store an additional B-tree node for each of the nodes on the second level of the B-tree which, instead of storing keys, stores the inter-sub-block pointers (we call this node the pointer node). This can be done since our keys and pointers are of the same size. When we find that a key is in a given second level node of the B-tree at a given position, we go to the same position in the pointer node, where we find the pointer associated with that external node. With nodes containing up to lg n/(2 lg lg n) keys, searching for a desired node could take O(lg lg n) time with binary search. Instead, we maintain a table that allows us to achieve a constant time bound. We store a table outside of the block structure which is indexed by a ½ lg n sized key. Since there exists an entry of the table for each of the ½ lg n sized keys, there are √n entries in total. Each entry stores, in sequential order, lg n/lg lg n records of size lg lg n. When we wish to know which branch to take on the top level of our B-tree, or whether a key exists on the second level, we index into the table based on the B-tree node's bit representation (note we choose the table's index to be the same size as each node). Having found the correct entry in the table, we use the lg lg n sized external node number to determine how many records into the entry we must skip (i.e. external node i in the block accesses the ith record in the list). The record to which we skip encodes the branch of the tree we must traverse or, if on the second level, where the corresponding key should reside.
This table allows us to index into the tree in constant time, and hence we are able to determine whether an external node represents an inter-sub-block pointer in constant time.

Lemma 2.2 (Inter-Sub-Block Pointers) O(n/lg lg n) bits suffice to represent all inter-sub-block pointers.

Proof. In each block we have O(lg n) inter-sub-block pointers, each of size lg lg n. The structure that holds the inter-sub-block pointers consists of one root node and lg n/lg lg n external nodes, each of size ½ lg n. Additionally, the actual lg lg n sized pointers are stored in a second external node level containing lg n/lg lg n leaf nodes, each of size ½ lg n. Therefore in total we have lg² n/lg lg n + ½ lg n bits used in storing a block's inter-sub-block pointers. Additionally, outside the block structure, there is a table of √n entries each of size O(lg n), giving us O(√n lg n) extra space. Since there are O(n/lg² n) blocks, there will be n/lg lg n + n/(2 lg n) + O(√n lg n), or more simply O(n/lg lg n), bits used in storing the structure's inter-sub-block pointers. □

Internal Sub-block Organization

The internal organization of blocks is crucial to achieving quick accesses and updates. The sub-blocks within a block are arranged similarly to how we store blocks. Sub-blocks are stored in sub-block groupings such that all sub-blocks in a sub-block grouping are of the same size, to within lg lg n bits. The sub-block groupings are arranged such that the grouping containing the sub-blocks of smallest size is first and the grouping of largest sub-block size is last. Between sub-block groupings we maintain a gap of at most lg n bits so that when sub-blocks grow upon insertions they can be rearranged easily. Since there are O(lg n/lg lg n) sub-block groupings and there are at most lg n bits between each pair of groupings, the total space used in inter-sub-block grouping gaps is lg² n/lg lg n.
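The table-driven B-tree search described above can be illustrated with toy parameters (a hypothetical Python sketch; the real node and key widths would be ½ lg n and lg lg n bits, and the table would be shared by all blocks):

```python
def build_branch_table(node_bits, key_bits):
    """Precompute, for every possible packed B-tree node and every query key,
    the rank of the key among the node's stored keys (i.e. the branch to take).
    A search then becomes a single table lookup instead of a binary search."""
    keys_per_node = node_bits // key_bits
    mask = (1 << key_bits) - 1
    table = {}
    for packed in range(1 << node_bits):       # every possible node bit pattern
        keys = [(packed >> (i * key_bits)) & mask for i in range(keys_per_node)]
        for q in range(1 << key_bits):         # every possible query key
            table[(packed, q)] = sum(1 for k in keys if k < q)
    return table
```

With node_bits = 4 and key_bits = 2, the packed node 0b0110 holds keys (2, 1); a query of 3 exceeds both keys, so the lookup returns branch 2 in constant time.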
2.3 Data Representation

The data representation technique used in our model is key to achieving the desired space constraints. Since pointers to data dominate the overall number of pointers, explicit pointers from the external nodes of the tree to the data set would require O(n lg n) bits of additional space, which would push us beyond the desired 2n space bound. As a result we implement a storage technique for the data that mirrors the structural storage technique. Here we consider the case in which data records are of constant size (the case where data records are lg n bit references is analogous). We store these fixed size records in blocks of size i lg n (i is at most 3 lg n). Like our structural blocks, the data blocks are stored in memory according to their size, with all blocks of the same size stored contiguously. Within these blocks we contiguously store our fixed size data records. Additionally, as with the previously defined structural blocks, we maintain gaps between groupings of same sized blocks so that shuffling of blocks can be done efficiently. In each structural block is a pointer to the data block that stores its data. Additionally, we know that if an external node of the subtree is not an inter-sub-block or inter-block pointer, then it is external to the entire tree and as such is associated with some data. After verifying that the leaf is external to the tree, we can find its data in the block's corresponding data block. To simplify searching for the data within the data block, we use the previously described external node numberings. To determine which record to access in the data block, we must know precisely the number of external pointers that precede us within the block and sub-block. These values can be found in the respective B-trees and, when subtracted from the external node numbering, give the data record that is to be accessed. It should be noted that if it is desired that the tree's internal nodes have associated data, then an analogous method can be used to represent the internal data.

3 Modifying the Structure

When an insertion is made into a block its size increases; however, provided it does not increase beyond the lg n block increment, the block remains valid in its current location. If the block size has increased beyond the lg n block increment, it must be relocated.

3.1 Relocating Blocks

When a block grows, we know that relocation will result in the block being placed in the next largest block grouping (if one is not available, the block must be split). To do this a copy of the block is made and then the last block of the block grouping is moved into the moving block's location (we know the last block will fit since all blocks in a block grouping are of the same size). This move increases the gap between the two block groupings by the old block size. After the gap grows, its size will be large enough that the moving block can be placed within it. As a result, the moving block becomes the first block of the following block grouping. Figure 1 shows how blocks are relocated.

Lemma 3.1 (Block Relocation) A block can be relocated in O(lg n) time.

3.2 Shrinking Gap Sizes

We can see that in the process of relocating blocks upon the addition of a new node, gap sizes decrease. From this we can deduce that there will be a situation where a given gap will close (i.e. be reduced to 0 bits in size) and the block relocation algorithm described above will fail.
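The relocation step just described, and the way it shrinks the gap between two groupings, can be modelled with a small sketch (Python; block sizes are modelled as plain integers and groupings as lists — a simplification of the actual bit-level layout):

```python
def relocate_block(groupings, gaps, g, i, growth):
    """Block i of grouping g has grown by `growth` words and no longer fits.
    The grouping's last block (same size) fills the vacated slot, and the
    grown block becomes the first block of the next grouping; the net effect
    is that the gap between the two groupings shrinks by `growth`."""
    grp = groupings[g]
    block = grp[i] + growth
    last = grp.pop()              # last block of the grouping closes the hole
    if i < len(grp):
        grp[i] = last
    gaps[g] -= growth             # gap grows by the old size, shrinks by the new
    groupings[g + 1].insert(0, block)
    return block
```

Repeating this on every insertion is what eventually drives a gap toward zero, motivating the monitoring described next.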
To avoid gap closure we monitor the gaps, and when a gap is reduced to lg n bits we perform gap resizing.

Gap Resizing

To resize the gap between block groupings a and b (where block size(a) < block size(b)), we make a copy of the first block in block grouping b (b₁) and delete b₁ so that its previous location is now a gap. Then we make a copy of the first block of block grouping b+1 ((b+1)₁), and place b₁ at the end of block grouping b. After this has been done, the gap between a and b is of size sizeof(b₁) (at least lg² n bits). We continue copying and moving blocks until the first block of the last block grouping has been moved to the end of the array, and the array is extended if necessary (see figure 2). This guarantees that gap sizes always remain between lg n and 3 lg² n bits. Finally we must update block pointers to maintain the tree's structure. A similar mechanism is implemented to prevent gap sizes from becoming too large upon deletions. The details of this complementary mechanism are similar to the gap resizing algorithm above and as such are left to the reader.

3.3 Splitting Blocks

Another implication of having block size constraints in the presence of insertions is that block sizes could exceed the maximum allowable size. In this case we must split the block so that the two resulting blocks are both of legal size. Splitting a block is thus performed in three steps: finding a node at which the block can be validly split, splitting the block, and placing the resulting two blocks in their proper places. We will assume for now that we can perform a block split in constant time by dereferencing one pointer to disconnect the two trees. Later we will show that in fact O(lg² n) work may be necessary if a sub-block is split as well. Since we have shown that block relocation takes O(lg n) time, to show that block splitting requires O(lg² n) time we must show that we can always find a valid splitting node in O(lg² n) time.
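One way to find such a splitting node is to walk from the root toward the heavier child until the subtree below holds at most 2/3 of the nodes; the standard argument shows the subtree reached then also holds at least about 1/3 of them. A Python sketch (assuming subtree sizes are directly available, whereas the paper reads them from its encoded blocks):

```python
def find_split_node(size, children, root):
    """Return a node whose subtree holds between roughly n/3 and 2n/3 of the
    n nodes; removing the edge to its parent splits the tree into two validly
    sized parts. size[v] is v's subtree size; children[v] lists v's children."""
    n = size[root]
    v = root
    while size[v] > 2 * n / 3:
        # The heavier child's subtree holds more than half of size[v] - 1,
        # so it stays above about n/3 while eventually dropping to <= 2n/3.
        v = max(children[v], key=lambda c: size[c])
    return v
```

On a chain of 9 nodes this walk stops at the node whose subtree has 6 of the 9 nodes, leaving parts of sizes 6 and 3.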
We can see that if we can split the block such that both parts have between 1/3 and 2/3 of the nodes, we will be left with two blocks, both of which are validly sized. It is known from [6] that it is possible to split a binary tree of n nodes through the removal of an edge so that each subtree has no more than 2n/3 vertices. This leads to the following claim:

Claim 3.1 A block split can be performed in O(lg² n) time.

We can see that splitting a block requires O(lg² n) time if and only if, at the sub-block level, all splitting operations can be completed in O(lg² n) time or less. In the next section we describe the sub-block structure and eventually show that the above claim is true. In the process, we prove Theorem 1.1: that insertions and deletions can be done in O(lg² n) time.

3.4 Inserting into Sub-blocks

When a new node is inserted into the structure, the actual insertion takes place at the sub-block level. After the correct block and sub-block are found, the sub-block is traversed to find the location where the new node will be placed. Once the location where the new node is to

be placed is found, the insertion can take place. An insertion requires us to change the sub-block pointer so that it points to the new representation of the tree. We know that this new representation will be in the table representing trees of size r + 1. Accordingly, after generating the new representation of the tree by modifying the encoding to account for the insertion, we simply perform a binary search of the size r + 1 table to find the offset of the correct encoding. Once we have found the correct encoding, we set that to be the new offset and we set r + 1 to be the new size. This takes O(lg n) time as the tables are of size O(√n). When we add a node to a sub-block, its size increases by two bits. As a result, we may have to move the sub-block to the next sub-block grouping (since sub-block sizes are in increments of lg lg n). This sub-block reorganization is similar to the block reorganization previously described. Additionally, since insertions may occur at the leaf level, the external node's data must be added to the tree structure. This is done by inserting the leaf's record into the external data structure defined above. Finally, the increase in leaves forces us to increment external node numberings (we omit these details).

Lemma 3.2 Modifying the tree structure upon insertion of a node requires O(lg n) time.

Figure 1: (Relocating Blocks) In (a) block 2.2 is the destination of a new node. With the new node, block 2.2 becomes too large and must move. In (b) the block is copied to a new location and the last block of the block grouping takes its place. Finally in (c), the now larger 2.2 is placed at the top of the next block grouping and the gap between the two block groupings decreases by lg n.

Splitting Sub-blocks

When an insertion is made into a sub-block of the largest allowable size, the sub-block must be split. This can be done using the three step method outlined for block splitting.
The actual splitting of a sub-block is a bit more complicated, since we do not have explicit pointers to simply reassign. To perform the split we first determine the node at which the subtree will be split. After the node has been found, we split the subtree and generate the encodings for the two new subtrees. To find the table representations of the two new trees we first determine their sizes and then search the corresponding tables for the representations' offsets. Following the search of the appropriate tables, we determine the external node numberings by splitting the original external node numberings. Since we now have a new sub-block, we must add one inter-sub-block pointer to the B-tree where the pointers are stored. This is the final stage in the splitting process.

Lemma 3.3 A sub-block can be split in O(lg n) worst case time and O(1) amortized time.

Lemma 3.4 A block split can be performed in O(lg² n) worst case time and O(1) amortized time.
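Both sub-block insertion and sub-block splitting re-derive a sub-block's table reference by searching the table for the new tree size. A sketch (Python; `tables[r]` is an assumed layout, a sorted list of the canonical encodings of all binary trees on r nodes):

```python
from bisect import bisect_left

def reencode(tables, r_new, encoding):
    """Look up a tiny tree's canonical encoding in the sorted table of all
    trees on r_new nodes; the sub-block then stores the pair (r_new, offset).
    Binary search over the O(sqrt(n)) entries takes O(lg n) time."""
    table = tables[r_new]
    offset = bisect_left(table, encoding)
    assert table[offset] == encoding   # every possible tiny tree is in the table
    return r_new, offset
```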

Figure 2: (Gap Resizing) In (a) block 1.2 grows and is relocated. This leaves the gap between block groupings 1 and 2 at lg n bits, and so the gap must be resized. Block 1.2 moves to the bottom of the second block grouping and block 3.0 moves to the bottom of the third block grouping. This continues until (c), where block (b−1).0 is moved and b.0 is moved to the bottom of the array of blocks. Notice that in (d) the array of blocks has grown by the size of the last block.

3.5 Modifying Data

When an insertion occurs, the data associated with the given node must be added to the data structure. To do this we first determine (by examining the external node numberings) the location within the data block where the data record belongs. Then we locate the data block through its pointer in the block and rewrite the block with the inserted record in place. When our records are of constant size, rewriting a data block takes O(lg n) amortized time. Conversely, when the data records are lg n bit references, we require O(lg² n) amortized time to rewrite blocks. Following the insertion, the block is larger and must be relocated. This relocation is performed as described previously for structural block relocation.

Lemma 3.5 Modifying data blocks takes O(lg n) amortized time for fixed size records and O(lg² n) amortized time for lg n sized references.

This is the dominant cost in modifying our structure and proves Theorem 1.1.

3.6 Changes in lg n

One difficulty with our model is its reliance on the value of lg n; this value has the potential to change over time. As a result, there are steps that must be taken when the value of lg n increases or decreases. We know that before lg n doubles or halves, n must be squared or have its square root taken. This means that we can amortize the cost of changing the structure over O(n²) operations.
3.7 Subtree Size

Since we divide our tree structure into small blocks, if we maintain at the root of each small block the block's subtree size, then in the worst case updating subtree size requires visiting each of these small blocks. Additionally, if we maintain within the small block the subtree size at the root of each tiny block, then when updating, these values must also be modified. While performing these updates would take o(n) time, we would like the ability to update subtree size in constant time. To achieve this we consider a certain class of accesses to the tree. When concerned with subtree size, we say that navigation through the tree begins at the root and may end at any point in the tree (although for the purposes of claims regarding worst case time for updating subtree sizes, we will assume navigation ends at the root). Each small block contains a count of its subtree

size beginning at its root and ending at its leaves. Additionally, each tiny block in the structure maintains a count of the number of nodes in its subtree starting at its root. Finally, in the table of tree representations, we store, for each node of a tree, the number of descendant nodes within the tiny block. When we wish to determine the subtree size at a given node of the tree, we take the value in the table and add to it the subtree sizes of each of the node's descendant tiny blocks. Then we add to that number the subtree sizes of each of its descendant small blocks. Since in the worst case there are O(lg² n) descendant small blocks and O(lg n) descendant tiny blocks, determining subtree size takes O(lg² n) time. When an insertion or deletion takes place, we must update the subtree size of each of the tiny blocks in the current small block. This update will potentially require visiting all the tiny blocks within the small block and accordingly will take O(lg n) time. After this operation the small block has correct subtree size information; however, all ancestor small blocks may have incorrect information. To correct this we must visit all the small blocks which we traversed to get to the node at which the insertion or deletion took place. Since we have already visited these small blocks, we can amortize the cost of updating the subtree sizes over the steps taken in traversing to the current node. This allows us to achieve amortized constant time updates to subtree size. It should be noted that we could compute the subtree size in constant time by maintaining prefix sums of the subtree sizes of descendant blocks (i.e. each block maintains the sum of all preceding blocks' subtree sizes). With these sums we could simply determine the first and last descendant blocks and, from these, determine the subtree size. The problem with this approach is that updating the sums would take O(lg² n) time.
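The query just described combines three counts. As a sketch (Python, with the per-level counts taken as given rather than read out of the actual encoded blocks):

```python
def subtree_size(table_count, tiny_block_sizes, small_block_sizes):
    """Subtree size at a node: descendants inside its own tiny block (read
    from the table of tree representations), plus the cached subtree sizes
    of its descendant tiny blocks, plus those of its descendant small blocks."""
    return table_count + sum(tiny_block_sizes) + sum(small_block_sizes)
```

Since a node can have O(lg n) descendant tiny blocks and O(lg² n) descendant small blocks, the two sums dominate and give the O(lg² n) query bound.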
Corollary 3.1 The results of Theorem 1.1 apply to a forest of binary trees, with the added result that two trees in the forest can be joined in O(lg^2 n) time.

Corollary 3.2 Updates can be performed on a binary tree in amortized constant time with the use of O(n lg lg n) space.

Corollary 3.2 follows by storing the small trees in a conventional manner with explicit parent-child pointers and explicit pointers to the data, though these pointers require only lg lg n bits each. A few other minor modifications are required, but we omit these details.

Conclusions

We have presented a binary tree representation that is within a lower order term of the information theoretically optimal number of bits. Additionally, we have shown how our structure, unlike those previously proposed, facilitates insertions and deletions in O(lg^2 n) time. It would be interesting to consider the problem of improving the time for an update. While our model can represent arbitrary k-degree ordinal trees, it does so by performing a trivial mapping which requires O(k) time to determine the kth child of any of the tree's nodes. The problem of succinctly representing, and efficiently updating, trees of higher degree so that navigation can be performed efficiently [1] remains open. It may well be amenable to our techniques.

References

[1] D. Benoit, E. D. Demaine, J. I. Munro, and V. Raman, "Representing Trees of Higher Degree", in Proceedings of the 6th International Workshop on Algorithms and Data Structures (WADS), volume 1663 of LNCS, Springer-Verlag, 1999.
[2] A. Brodnik, S. Carlsson, E. D. Demaine, J. I. Munro, and R. Sedgewick, "Resizable Arrays in Optimal Time and Space", in Proceedings of the 6th International Workshop on Algorithms and Data Structures (WADS), volume 1663 of LNCS, Springer-Verlag, 1999.
[3] D. R. Clark and J. I. Munro, "Efficient Suffix Trees on Secondary Storage", in Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1996.
[4] M. L. Fredman and D. E. Willard, "Surpassing the Information Theoretic Bound with Fusion Trees", Journal of Computer and System Sciences, 43 (1993).
[5] G. Jacobson, "Space-efficient Static Trees and Graphs", in Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), 1989.
[6] R. J. Lipton and R. E. Tarjan, "A Separator Theorem for Planar Graphs", SIAM Journal on Applied Mathematics, 36(2) (1979).
[7] J. I. Munro and V. Raman, "Succinct Representations of Balanced Parentheses, Static Trees and Planar Graphs", in Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS), 1997.
[8] A. J. Storm, Representing Dynamic Binary Trees Succinctly, MMath thesis, University of Waterloo.


More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 and External Memory 1 1 (2, 4) Trees: Generalization of BSTs Each internal node

More information

III Data Structures. Dynamic sets

III Data Structures. Dynamic sets III Data Structures Elementary Data Structures Hash Tables Binary Search Trees Red-Black Trees Dynamic sets Sets are fundamental to computer science Algorithms may require several different types of operations

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

11.9 Connectivity Connected Components. mcs 2015/5/18 1:43 page 419 #427

11.9 Connectivity Connected Components. mcs 2015/5/18 1:43 page 419 #427 mcs 2015/5/18 1:43 page 419 #427 11.9 Connectivity Definition 11.9.1. Two vertices are connected in a graph when there is a path that begins at one and ends at the other. By convention, every vertex is

More information

l So unlike the search trees, there are neither arbitrary find operations nor arbitrary delete operations possible.

l So unlike the search trees, there are neither arbitrary find operations nor arbitrary delete operations possible. DDS-Heaps 1 Heaps - basics l Heaps an abstract structure where each object has a key value (the priority), and the operations are: insert an object, find the object of minimum key (find min), and delete

More information

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES)

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Chapter 1 A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Piotr Berman Department of Computer Science & Engineering Pennsylvania

More information

Chapter 12: Indexing and Hashing (Cnt(

Chapter 12: Indexing and Hashing (Cnt( Chapter 12: Indexing and Hashing (Cnt( Cnt.) Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition

More information