Principles of B-trees

Similar documents
Priority queue and heaps

CS350 - Exam 4 (100 Points)

Positions, Iterators, Tree Structures and Tree Traversals. Readings - Chapter 7.3, 7.4 and 8

Additional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication

Acknowledgement. Flow Graph Theory. Depth First Search. Speeding up DFA 8/16/2016

Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees

Lecture 5. Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs

CSE 530A. B+ Trees. Washington University Fall 2013

CS102 Binary Search Trees

Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees

Plane Sweep Algorithms

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

Tree: non-recursive definition. Trees, Binary Search Trees, and Heaps. Tree: recursive definition. Tree: example.

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help

Advanced Algorithms. Class Notes for Thursday, September 18, 2014 Bernard Moret

Hash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell

CSCI2100B Data Structures Heaps

Lecture 6: Analysis of Algorithms (CS )

6.854J / J Advanced Algorithms Fall 2008

Trees. A tree is a directed graph with the property

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

CSE100. Advanced Data Structures. Lecture 8. (Based on Paul Kube course materials)

Data Structures and Algorithms

Trees. (Trees) Data Structures and Programming Spring / 28

Recall: Properties of B-Trees

Chapter 22 Splay Trees

CMSC 341 Leftist Heaps

Priority Queues. 1 Introduction. 2 Naïve Implementations. CSci 335 Software Design and Analysis III Chapter 6 Priority Queues. Prof.

Priority Queues. T. M. Murali. January 23, T. M. Murali January 23, 2008 Priority Queues

Self-Balancing Search Trees. Chapter 11

Priority Queues. T. M. Murali. January 29, 2009

CMSC 341 Lecture 15 Leftist Heaps

DOWNLOAD PDF LINKED LIST PROGRAMS IN DATA STRUCTURE

Dictionaries. Priority Queues

Lec 17 April 8. Topics: binary Trees expression trees. (Chapter 5 of text)

A set of nodes (or vertices) with a single starting point

Chapter 6 Heapsort 1

Chapter 6 Heaps. Introduction. Heap Model. Heap Implementation

CMSC 341 Lecture 15 Leftist Heaps

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

Priority Queues. T. M. Murali. January 26, T. M. Murali January 26, 2016 Priority Queues

Lecture: Analysis of Algorithms (CS )

Binary Trees, Binary Search Trees

Pairwise alignment using shortest path algorithms, Gunnar Klau, November 29, 2005, 11:

Computer Science 210 Data Structures Siena College Fall Topic Notes: Priority Queues and Heaps

Implementations. Priority Queues. Heaps and Heap Order. The Insert Operation CS206 CS206

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

CS301 - Data Structures Glossary By

Trees. Eric McCreath

Binary heaps (chapters ) Leftist heaps

Operations on Heap Tree The major operations required to be performed on a heap tree are Insertion, Deletion, and Merging.

Module 2: Priority Queues

Multiway Search Trees. Multiway-Search Trees (cont d)

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

University of Illinois at Urbana-Champaign Department of Computer Science. Second Examination

Module 2: Priority Queues

Transform & Conquer. Presorting

Heap Model. specialized queue required heap (priority queue) provides at least

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Lecture Notes for Advanced Algorithms

Balanced Binary Search Trees. Victor Gao

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17

Physical Level of Databases: B+-Trees

Algorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs

Data structures. Organize your data to support various queries using little time and/or space

Definition of a Heap. Heaps. Priority Queues. Example. Implementation using a heap. Heap ADT

Binary Trees

CMSC 341 Lecture 14: Priority Queues, Heaps

DDS Dynamic Search Trees

Algorithms and Theory of Computation. Lecture 7: Priority Queue

PART 2. Organization Of An Operating System

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

COMP : Trees. COMP20012 Trees 219

Readings. Priority Queue ADT. FindMin Problem. Priority Queues & Binary Heaps. List implementation of a Priority Queue

BINARY HEAP cs2420 Introduction to Algorithms and Data Structures Spring 2015

403: Algorithms and Data Structures. Heaps. Fall 2016 UAlbany Computer Science. Some slides borrowed by David Luebke

CSE 5311 Notes 4a: Priority Queues

TREES. Trees - Introduction

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

Data Structures and Algorithms

Binary Search Trees. Analysis of Algorithms

How much space does this routine use in the worst case for a given n? public static void use_space(int n) { int b; int [] A;

Algorithms, Spring 2014, CSE, OSU Lecture 2: Sorting

BST Deletion. First, we need to find the value which is easy because we can just use the method we developed for BST_Search.

Online Appendix to: Generalizing Database Forensics

B-Trees and External Memory

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

1 Tree Sort LECTURE 4. OHSU/OGI (Winter 2009) ANALYSIS AND DESIGN OF ALGORITHMS

CMSC 341. Binary Heaps. Priority Queues

CS350: Data Structures B-Trees

Copyright 1998 by Addison-Wesley Publishing Company 147. Chapter 15. Stacks and Queues

B-Trees and External Memory

CIS265/ Trees Red-Black Trees. Some of the following material is from:

CMSC351 - Fall 2014, Homework #2

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

CS 240 Data Structures and Data Management. Module 2: Priority Queues

Fall, 2015 Prof. Jungkeun Park

Operations. Priority Queues & Heaps. Prio-Q as Array. Niave Solution: Prio-Q As Array. Prio-Q as Sorted Array. Prio-Q as Sorted Array

Transcription:

CSE465, Fall 2009 February 25 1 Principles of B-trees Noes an binary search Anoe u has size size(u), keys k 1,..., k size(u) 1 chilren c 1,...,c size(u). Binary search property: for i = 1,..., size(u) 1 keys in the subtree of c i are smaller than k i keys in the subtree of c i+1 are larger than k i Size limitation, parameter t For root r, 2 size(r) 2t For every other noe u, t size(r) 2t Variant: upper boun can be 2t 1. Shape limitation, or balance All leaves have the same epth, i.e. the same istance from the root.

CSE465, Fall 2009 February 25 2 Nice properties: The number of levels if there are n keys: about log t n = 1 ln t ln n (about: can be smaller, orlarger by 1) The time neee to perform binary search: about log 2 t steps per level, hence, about log 2 n steps (but we may a 1 for each level, so a bit more). Insertion an eletion require to visit an restructure at most 2 noes on each level. With very small t, this gives O(log n) time for insertion an eletion. This is use for 2-3-4 trees (t = 2), 2-3 trees (t = 2, variation of the rules), reblack trees (groups of 1-2-3 noes obey the rules of 2-3-4 trees). In isk memory (or soli state isk etc.) we use large t, soeach noe occupies up to a full page (upper boun), an at least half a page (lower boun).

CSE465, Fall 2009 February 25 3 Insertion principles. 1 2 3 During insertion, a noe can increase its size by 1. Anoe with size above the upper boun is unstable, an splits into two parts, as equal in size as possible. The pointer to such a noe is replace with a trio: separating value an two pointers. The parent of a split chil increases in size: it has a trio where one pointer was before. If this makes it unstable, it splits. If the root splits, the trio that replaces the root pointer becomes the root noe. Pointer to the trio becomes the new root pointer. Remark. Unstable noes are goo for the esign an analysis, but they are not implemente. Basically, weescribe how toavoi using unstable noes. Deletion principles. ❶ ❷ ❸ When we elete in a noe, a sibling is a esignate helper. During eletion, the size can go own by 1. In the extreme case, the size rops to 1, an the number of keys to0. This basically means that there is nothing there, an the pointer to its only chil actually belongs to the gran-parent of the chil, or, inthe case of the root. the only chil of the root is the new root. If the noe becomes unstable, we first try to get a chil from the esignate helper. This changes the key that separates those two siblings. ❹ If getting the chil is impossible, the two siblings have sizes t an t 1, hence 2t 1 chilren, an they are legally fuse. The separating value (an one pointer) is remove from the parent. Extra principles for re-black trees ➀ ➁ ➂ Groups of 1-2-3 binary tree noes obey the rules of 2-3-4 tree. If a pointer in noe u points to a noe in the same group, it is re. Otherwise it is black. Agroup of 3 noes must have Λ shape: two re pointer in the root noe of the group.

CSE465, Fall 2009 February 25 4 B-trees an re-black trees: examples Inserting a, b, c, to an initially empty set. a a b a b c b a c a a a b b b a c c b b a c a c Worth to notice: after inserting c to the re-black tree we obtain a group of three noes of illegal shape, a rotation corrects this shape; after inserting to 2-3-4 tree the root noe becomes unstable an splits, which creates a new root; in re-black tree the root group is similarly split, an this split is obtaine by changing the colors of the pointers of the root noe: when they are re, the root is in the same group with the rest of the noe, when they are black, the root is a separate group, so the left an right subtrees are separate groups as well.

CSE465, Fall 2009 February 25 5 Inserting e, f, g to the set obtaine before. b b b a c e a c e f a c e f g b b b a c a a c e c e e f f b b b g a a a c e c e c f f e g Worth to notice: after inserting f we split a noe in 2-3-4 tree an a group in re-black tree; this brings the separating key,, to the parent noe or the parent group. The latter means that the pointer to the noe of becomes re.

CSE465, Fall 2009 February 25 6 Inserting h, i to the set obtaine before. b f b f a c e g h a c e g h i b a b f c f a c e g e g h h i b a b f c f a c e h e g g i h b f a c e g h

CSE465, Fall 2009 February 25 7 Inserting j to the set obtaine before. b f a c e g h i j b f h a c e g i j b f h a c e g i j b f b f a c e h a c e h g i g i j j b f a c e h g i j

CSE465, Fall 2009 February 25 8 Deleting a from the set obtaine before, in 2-3-4 tree. b f h c e g i j Empty noe is unstable, the sibling has no spare chilren, so they fuse into one noe: f h b c e g i j Again we have an unstable noe with one chil only, its sibling has a spare chil, so we transfer a chil: f h b c e g i j

CSE465, Fall 2009 February 25 9 Deleting a from the set obtaine before, in re-black tree. A group of noes that is too small is empty, its place is inicate by oubly-black pointer, which we inicate on the iagram with a black rectangle. When this group fuses with its sibling group, its parent group is reuce to zero noes an is now inicate by the oubly black pointer. b f b f c e h c e h g i g i j j Now this group receives extra chil, e, an extra (i.e. the only!) separating key,, while the helping group rops in size to one noe, h. f h b e g i c j

CSE465, Fall 2009 February 25 10 Tr eaps ranomness gives balance without rotations Binary search trees built by insertions alone, without eletions, in ranom orer have very goo epth on the average. However, the insertions o not have to come in ranom orer, an besies, we have eletions which are easy to implement but har to analyze: how the average epth changes if you perform both insertions an eletions. Luckily, avery simple trick allows to enforce the shape as if the the tree was built by insertions alone, in ranom orer. When we insert a new element we give it a ranom priority from some sufficiently large range. We will assume that this priority is a ranom real number from the range [0, 1]. The structures that form the tree now have 4 fiels: {key,pr,left,right. So when we insert a new element x we fin p{x,p,null,null. Then we connect this structure to the tree. The rules of that tree are the following Binary search orer of keys if s is a escenent of t->left then s->key < t->key if s is a escenent of t->right then s->key > t->key Heap orer of priorities if s is the parent of t then s->pr t->pr The heap orer enforces the shape as if the keys were inserte in the same orer as their priorities, the least priority is on the root, an in a subtree, with content etermine by the selection of the root an other ancestors, the smallest priority has uniformly ranom owner an this owner is at the root. We will see that the coe for enforcing that shape is very simple.

CSE465, Fall 2009 February 25 11 Insertion in treaps splitting In some applications an orere ictionary/tree can be split into two accoring to some key value x: one new ictionary will have keys smaller than x, the other will have the larger keys. Split is also useful uring insertion. We want to insert a new key x with priority p into tree t. s = {x,p,?,?; // magic allocates a new structure T = &t; // locate the subtree to moify while (t = *T, t!= NULL && t->pr < p) if (t->key < x) *T = t->left; else *T = t->right; if (t == NULL) { // insertion at the bottom *T = s; s->left = s->right = NULL; return; // replace noe *T with s, make chiren of s by splitting // ol *T into *L, *R, with keys smaller/larger than x L = &(s->left); R = &(s->right); *T = s; while (t!= NULL) if (t->key < x) *L = t, L = &(t->left), t = t->left; else *R = t, R = &(t->right), t = t->right; *L = *R = NULL;

CSE465, Fall 2009 February 25 12 Example how this works: s 24 2 10 0 10 0 10 0 A 36 1 T t 30 3 B A L 36 1 T s 24 2 R B t 30 3 A L 24 2 36 1 B 22 5 25 4 D C 22 5 25 4 D C 30 3 R t 25 4 C E 23 6 E 23 6 22 5 D F F E 23 6 10 0 10 0 F A 36 1 A 36 1 L 24 2 B 24 2 B 30 3 22 5 L t 30 3 25 4 R t 22 5 D C E F 23 6 L R t 25 4 D C E R t 23 6 F

CSE465, Fall 2009 February 25 13 Deletion in treaps meling In some applications two orere ictionaries/trees such that all keys from the first ictionary precee keys from the secon can be combine into one. This operation is calle mel. We want to elete key x. Welocate the structure with x an then we replace the pointer to that structure with the pointer to the mel of its chilren. voi remove(x,t) { while (t = *T) { // this breaks when t == NULL if (t->key < x)) *T = &(t->right); else if (t->key > x)) *T = &(t->left); else { *T = mel(t->left,t->right); free t; return; Mel is easier to implement recursively: tree mel(s,t) // assume keys in x smaller than in t { if (s == NULL) return t; else if (t == NULL) return s; else if (s->pr > t->pr) return s->right = mel(s->right,t), s; else return t->left = mel(s,t->left), t;

CSE465, Fall 2009 February 25 14 Priority queue an heaps Orere ictionary orinarily uses pointers for every key it stores an uses trees that are not perfectly balance. We can o better if we use it in a restricte way. In a typical application, we store objects with priorities an in some applications, the priorities can change over time. Set Operations: Initialize(Q), Insert(x, p, Q), DeleteMin(Q) Other upates: IncreasePriority(i, p, Q), DecreasePriority(i, p, Q), In the last operation, we ecrease the priority of an object at position i. We can know this position if we use cross-referencing. We implement priority queues using heaps. Heap is a tree in which parent noe has smaller (or equal) priority to the noe of the parent. This is calle heap orer or partial orer. Because we o not have other limitations, we are quite flexible how weplace the objects in the tree. The simplest heap with m objects is an array sector (H,1, m + 1) where position 1 is the root, an chilren of position i are: 2i an 2i + 1. Conversely, parent of i is i/2 (roune own). Example Example of a heap (H[i] has number i unerneath): 3 1 15 12 2 3 23 45 64 49 4 5 6 7 32 35 50 63 70 8 9 10 11 12

CSE465, Fall 2009 February 25 15 Restoring heap property We will efine all operations in a heap using two restore operations. Assume that <H,0,m> has heap property an that we want to change entry H[i] to x. Torestore the heap property we nee to rearrange the entries. We have two cases. The easier case is when x < H[i]. We will call this task ra(h, i, x), for restore after ecreasing. Wecan efine ra(h, i, x) recursively. rai(h,i,x) { if (i == 1 (p = i/2, x H[p])) { H[i] = x; return; H[i] = H[p]; rai(h,p,x); This formulation is recursive so wecan have an inuctive proof of correctness. Because the recursive call is the last statement in this function, we can eliminate this call using tail recursion principle. while (i > 1 && (p = i/2, x < H[p])) H[i] = H[p], i = p; H[i] = x; The worst case running time is c log 2 i.

CSE465, Fall 2009 February 25 16 Example: (H,1,13) forms a heap shown before an we execute ra(h,9,2). Location to be change by ra will be colore blue. 3 1 15 12 2 3 23 45 64 49 4 5 6 7 32 35 2 50 63 70 8 9 10 11 12 3 1 15 12 2 3 23 2 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 2 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 2 an over-write the root in the last step 1 3 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12

CSE465, Fall 2009 February 25 17 The more complicate case is when x > H[i]. We will call this task rai(h, i, m, x), for restore after increasing; here m is the number of keys store in the heap. We can efine rai(h, i, m, x) recursively. rai(h,i,m,x) { g = 2*i; // start computing the goo chil of i if (g > m) { // no chilren finish H[i] = x; return; if (g+1 m && H[g] > H[g+1]) g++; // goo chil has the smaller priority if (x H[g]) { H[i] = x; return; H[i] = H[g]; rai(h,g,m,x); Again, we can eliminate the tail recursion: while (g = 2*i, g < m) { if (g+1 m && H[g] > H[g+1]) g++; if (x H[g]) break; H[i] = H[g]; i = g; H[i] = x; The worst case running time is c log 2 m/i.

CSE465, Fall 2009 February 25 18 Example: (H, 1, 13) we execute rai(h, 0, 12, 39). Location to be change by rai are blue, the goo chil is yellow. 2 20 1 3 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 3 20 12 2 3 15 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 12 2 3 15 20 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12 3 1 15 12 2 3 20 45 64 49 4 5 6 7 32 23 50 63 70 8 9 10 11 12

CSE465, Fall 2009 February 25 19 Heapify, impose the heap property We can impose the heap property by pretening to restore it. Version 1: preten that all entries are an that you ecrease them to their actual values. If you o it in orer A[0], A[1],... then before you process A[i] the fragment < A, i, m > remains unchange. for (i = 2; i m; i++) ra(a,i,a[i]); The worst case running time is Σ log 2 i cm(log 2 m log 2 e) =Θ(m log m). c m 1 i=1 Version 2: preten that all entries are an that you increase them to their actual values. If you o it in orer A[m 1], A[m 2],... then before you process A[i] the fragment < A,0, i >remains unchange. for (i = m/2; i > 0; i--) rai(a,i,m,a[i]); The worst case running time is Σ log 2 m/i cm log 2 m cm(log 2 m log 2 e) = cm log 2 e =Θ(m). c m 1 i=1 Version 2 is much better!

CSE465, Fall 2009 February 25 20 Heap sort We want to sort (A,0, n). We create heap (H,1, n + 1) an then we remove the minimum from the heap an place it at the beginning of the resulting sequence. At any stage, (H,1, m + 1) = (A,0, m) contains the heap an (A, m, n) stores the resulting sequence. H = A-1; for (i = n/2; i > 0; i--) ra(h,i,n,h[i]); for (m = n; m > 1; m--) temp = H[m], H[m] = H[1], rai(h,1,m-1,temp); This sorts in the reverse orer. Tosort in increasing orer we nee to moify the heap so it has maximum at the root rather then the minimum.

CSE465, Fall 2009 February 25 21 Priority queue Abstract ata type: possible states: sets of elements of some comparable type; operations: MAKE_EMPTY, initialize empty set, INSERT, DELETE_MIN, remove an return the minimum. Implementation with boune capacity n, using array H[n] an m: MAKE_EMPTY() { m = 0; INSERT(x); { m++; ra(h,m,m,x) DELETE_MIN() { x = H[1]; m--; rai(h,1,m,h[m+1]); return x;