CSE465, Fall 2009 February 25 1 Principles of B-trees Noes an binary search Anoe u has size size(u), keys k 1,..., k size(u) 1 chilren c 1,...,c size(u). Binary search property: for i = 1,..., size(u) 1 keys in the subtree of c i are smaller than k i keys in the subtree of c i+1 are larger than k i Size limitation, parameter t For root r, 2 size(r) 2t For every other noe u, t size(r) 2t Variant: upper boun can be 2t 1. Shape limitation, or balance All leaves have the same epth, i.e. the same istance from the root.
CSE465, Fall 2009 February 25 2 Nice properties: The number of levels if there are n keys: about log t n = 1 ln t ln n (about: can be smaller, orlarger by 1) The time neee to perform binary search: about log 2 t steps per level, hence, about log 2 n steps (but we may a 1 for each level, so a bit more). Insertion an eletion require to visit an restructure at most 2 noes on each level. With very small t, this gives O(log n) time for insertion an eletion. This is use for 2-3-4 trees (t = 2), 2-3 trees (t = 2, variation of the rules), reblack trees (groups of 1-2-3 noes obey the rules of 2-3-4 trees). In isk memory (or soli state isk etc.) we use large t, soeach noe occupies up to a full page (upper boun), an at least half a page (lower boun).
CSE465, Fall 2009 February 25 3 Insertion principles. 1 2 3 During insertion, a noe can increase its size by 1. Anoe with size above the upper boun is unstable, an splits into two parts, as equal in size as possible. The pointer to such a noe is replace with a trio: separating value an two pointers. The parent of a split chil increases in size: it has a trio where one pointer was before. If this makes it unstable, it splits. If the root splits, the trio that replaces the root pointer becomes the root noe. Pointer to the trio becomes the new root pointer. Remark. Unstable noes are goo for the esign an analysis, but they are not implemente. Basically, weescribe how toavoi using unstable noes. Deletion principles. ❶ ❷ ❸ When we elete in a noe, a sibling is a esignate helper. During eletion, the size can go own by 1. In the extreme case, the size rops to 1, an the number of keys to0. This basically means that there is nothing there, an the pointer to its only chil actually belongs to the gran-parent of the chil, or, inthe case of the root. the only chil of the root is the new root. If the noe becomes unstable, we first try to get a chil from the esignate helper. This changes the key that separates those two siblings. ❹ If getting the chil is impossible, the two siblings have sizes t an t 1, hence 2t 1 chilren, an they are legally fuse. The separating value (an one pointer) is remove from the parent. Extra principles for re-black trees ➀ ➁ ➂ Groups of 1-2-3 binary tree noes obey the rules of 2-3-4 tree. If a pointer in noe u points to a noe in the same group, it is re. Otherwise it is black. Agroup of 3 noes must have Λ shape: two re pointer in the root noe of the group.
CSE465, Fall 2009 February 25 4 B-trees an re-black trees: examples Inserting a, b, c, to an initially empty set. a a b a b c b a c a a a b b b a c c b b a c a c Worth to notice: after inserting c to the re-black tree we obtain a group of three noes of illegal shape, a rotation corrects this shape; after inserting to 2-3-4 tree the root noe becomes unstable an splits, which creates a new root; in re-black tree the root group is similarly split, an this split is obtaine by changing the colors of the pointers of the root noe: when they are re, the root is in the same group with the rest of the noe, when they are black, the root is a separate group, so the left an right subtrees are separate groups as well.
CSE465, Fall 2009 February 25 5 Inserting e, f, g to the set obtaine before. b b b a c e a c e f a c e f g b b b a c a a c e c e e f f b b b g a a a c e c e c f f e g Worth to notice: after inserting f we split a noe in 2-3-4 tree an a group in re-black tree; this brings the separating key,, to the parent noe or the parent group. The latter means that the pointer to the noe of becomes re.
CSE465, Fall 2009 February 25 6 Inserting h, i to the set obtaine before. b f b f a c e g h a c e g h i b a b f c f a c e g e g h h i b a b f c f a c e h e g g i h b f a c e g h
CSE465, Fall 2009 February 25 7 Inserting j to the set obtaine before. b f a c e g h i j b f h a c e g i j b f h a c e g i j b f b f a c e h a c e h g i g i j j b f a c e h g i j
CSE465, Fall 2009 February 25 8 Deleting a from the set obtaine before, in 2-3-4 tree. b f h c e g i j Empty noe is unstable, the sibling has no spare chilren, so they fuse into one noe: f h b c e g i j Again we have an unstable noe with one chil only, its sibling has a spare chil, so we transfer a chil: f h b c e g i j
CSE465, Fall 2009 February 25 9 Deleting a from the set obtaine before, in re-black tree. A group of noes that is too small is empty, its place is inicate by oubly-black pointer, which we inicate on the iagram with a black rectangle. When this group fuses with its sibling group, its parent group is reuce to zero noes an is now inicate by the oubly black pointer. b f b f c e h c e h g i g i j j Now this group receives extra chil, e, an extra (i.e. the only!) separating key,, while the helping group rops in size to one noe, h. f h b e g i c j
CSE465, Fall 2009 February 25 10 Tr eaps ranomness gives balance without rotations Binary search trees built by insertions alone, without eletions, in ranom orer have very goo epth on the average. However, the insertions o not have to come in ranom orer, an besies, we have eletions which are easy to implement but har to analyze: how the average epth changes if you perform both insertions an eletions. Luckily, avery simple trick allows to enforce the shape as if the the tree was built by insertions alone, in ranom orer. When we insert a new element we give it a ranom priority from some sufficiently large range. We will assume that this priority is a ranom real number from the range [0, 1]. The structures that form the tree now have 4 fiels: {key,pr,left,right. So when we insert a new element x we fin p{x,p,null,null. Then we connect this structure to the tree. The rules of that tree are the following Binary search orer of keys if s is a escenent of t->left then s->key < t->key if s is a escenent of t->right then s->key > t->key Heap orer of priorities if s is the parent of t then s->pr t->pr The heap orer enforces the shape as if the keys were inserte in the same orer as their priorities, the least priority is on the root, an in a subtree, with content etermine by the selection of the root an other ancestors, the smallest priority has uniformly ranom owner an this owner is at the root. We will see that the coe for enforcing that shape is very simple.
CSE465, Fall 2009 February 25 11 Insertion in treaps splitting In some applications an orere ictionary/tree can be split into two accoring to some key value x: one new ictionary will have keys smaller than x, the other will have the larger keys. Split is also useful uring insertion. We want to insert a new key x with priority p into tree t. s = {x,p,?,?; // magic allocates a new structure T = &t; // locate the subtree to moify while (t = *T, t!= NULL && t->pr < p) if (t->key < x) *T = t->left; else *T = t->right; if (t == NULL) { // insertion at the bottom *T = s; s->left = s->right = NULL; return; // replace noe *T with s, make chiren of s by splitting // ol *T into *L, *R, with keys smaller/larger than x L = &(s->left); R = &(s->right); *T = s; while (t!= NULL) if (t->key < x) *L = t, L = &(t->left), t = t->left; else *R = t, R = &(t->right), t = t->right; *L = *R = NULL;
CSE465, Fall 2009 February 25 12 Example how this works: s 24 2 10 0 10 0 10 0 A 36 1 T t 30 3 B A L 36 1 T s 24 2 R B t 30 3 A L 24 2 36 1 B 22 5 25 4 D C 22 5 25 4 D C 30 3 R t 25 4 C E 23 6 E 23 6 22 5 D F F E 23 6 10 0 10 0 F A 36 1 A 36 1 L 24 2 B 24 2 B 30 3 22 5 L t 30 3 25 4 R t 22 5 D C E F 23 6 L R t 25 4 D C E R t 23 6 F
CSE465, Fall 2009 February 25 13 Deletion in treaps meling In some applications two orere ictionaries/trees such that all keys from the first ictionary precee keys from the secon can be combine into one. This operation is calle mel. We want to elete key x. Welocate the structure with x an then we replace the pointer to that structure with the pointer to the mel of its chilren. voi remove(x,t) { while (t = *T) { // this breaks when t == NULL if (t->key < x)) *T = &(t->right); else if (t->key > x)) *T = &(t->left); else { *T = mel(t->left,t->right); free t; return; Mel is easier to implement recursively: tree mel(s,t) // assume keys in x smaller than in t { if (s == NULL) return t; else if (t == NULL) return s; else if (s->pr > t->pr) return s->right = mel(s->right,t), s; else return t->left = mel(s,t->left), t;
CSE465, Fall 2009 February 25 14 Priority queue an heaps Orere ictionary orinarily uses pointers for every key it stores an uses trees that are not perfectly balance. We can o better if we use it in a restricte way. In a typical application, we store objects with priorities an in some applications, the priorities can change over time. Set Operations: Initialize(Q), Insert(x, p, Q), DeleteMin(Q) Other upates: IncreasePriority(i, p, Q), DecreasePriority(i, p, Q), In the last operation, we ecrease the priority of an object at position i. We can know this position if we use cross-referencing. We implement priority queues using heaps. Heap is a tree in which parent noe has smaller (or equal) priority to the noe of the parent. This is calle heap orer or partial orer. Because we o not have other limitations, we are quite flexible how weplace the objects in the tree. The simplest heap with m objects is an array sector (H,1, m + 1) where position 1 is the root, an chilren of position i are: 2i an 2i + 1. Conversely, parent of i is i/2 (roune own). Example Example of a heap (H[i] has number i unerneath): 3 1 15 12 2 3 23 45 64 49 4 5 6 7 32 35 50 63 70 8 9 10 11 12
CSE465, Fall 2009 February 25 15 Restoring heap property We will efine all operations in a heap using two restore operations. Assume that <H,0,m> has heap property an that we want to change entry H[i] to x. Torestore the heap property we nee to rearrange the entries. We have two cases. The easier case is when x < H[i]. We will call this task ra(h, i, x), for restore after ecreasing. Wecan efine ra(h, i, x) recursively. rai(h,i,x) { if (i == 1 (p = i/2, x H[p])) { H[i] = x; return; H[i] = H[p]; rai(h,p,x); This formulation is recursive so wecan have an inuctive proof of correctness. Because the recursive call is the last statement in this function, we can eliminate this call using tail recursion principle. while (i > 1 && (p = i/2, x < H[p])) H[i] = H[p], i = p; H[i] = x; The worst case running time is c log 2 i.
CSE465, Fall 2009 February 25 16 Example: (H,1,13) forms a heap shown before an we execute ra(h,9,2). Location to be change by ra will be colore blue. 3 1 15 12 2 3 23 45 64 49 4 5 6 7 32 35 2 50 63 70 8 9 10 11 12 3 1 15 12 2 3 23 2 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 2 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 2 an over-write the root in the last step 1 3 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12
CSE465, Fall 2009 February 25 17 The more complicate case is when x > H[i]. We will call this task rai(h, i, m, x), for restore after increasing; here m is the number of keys store in the heap. We can efine rai(h, i, m, x) recursively. rai(h,i,m,x) { g = 2*i; // start computing the goo chil of i if (g > m) { // no chilren finish H[i] = x; return; if (g+1 m && H[g] > H[g+1]) g++; // goo chil has the smaller priority if (x H[g]) { H[i] = x; return; H[i] = H[g]; rai(h,g,m,x); Again, we can eliminate the tail recursion: while (g = 2*i, g < m) { if (g+1 m && H[g] > H[g+1]) g++; if (x H[g]) break; H[i] = H[g]; i = g; H[i] = x; The worst case running time is c log 2 m/i.
CSE465, Fall 2009 February 25 18 Example: (H, 1, 13) we execute rai(h, 0, 12, 39). Location to be change by rai are blue, the goo chil is yellow. 2 20 1 3 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 3 20 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 12 2 3 15 20 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 12 2 3 20 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12
CSE465, Fall 2009 February 25 19 Heapify, impose the heap property We can impose the heap property by pretening to restore it. Version 1: preten that all entries are an that you ecrease them to their actual values. If you o it in orer A[0], A[1],... then before you process A[i] the fragment < A, i, m > remains unchange. for (i = 2; i m; i++) ra(a,i,a[i]); The worst case running time is Σ log 2 i cm(log 2 m log 2 e) =Θ(m log m). c m 1 i=1 Version 2: preten that all entries are an that you increase them to their actual values. If you o it in orer A[m 1], A[m 2],... then before you process A[i] the fragment < A,0, i >remains unchange. for (i = m/2; i > 0; i--) rai(a,i,m,a[i]); The worst case running time is Σ log 2 m/i cm log 2 m cm(log 2 m log 2 e) = cm log 2 e =Θ(m). c m 1 i=1 Version 2 is much better!
CSE465, Fall 2009 February 25 20 Heap sort We want to sort (A,0, n). We create heap (H,1, n + 1) an then we remove the minimum from the heap an place it at the beginning of the resulting sequence. At any stage, (H,1, m + 1) = (A,0, m) contains the heap an (A, m, n) stores the resulting sequence. H = A-1; for (i = n/2; i > 0; i--) ra(h,i,n,h[i]); for (m = n; m > 1; m--) temp = H[m], H[m] = H[1], rai(h,1,m-1,temp); This sorts in the reverse orer. Tosort in increasing orer we nee to moify the heap so it has maximum at the root rather then the minimum.
CSE465, Fall 2009 February 25 21 Priority queue Abstract ata type: possible states: sets of elements of some comparable type; operations: MAKE_EMPTY, initialize empty set, INSERT, DELETE_MIN, remove an return the minimum. Implementation with boune capacity n, using array H[n] an m: MAKE_EMPTY() { m = 0; INSERT(x); { m++; ra(h,m,m,x) DELETE_MIN() { x = H[1]; m--; rai(h,1,m,h[m+1]); return x;