Module 4: Dictionaries and Balanced Search Trees

Similar documents
COMP Analysis of Algorithms & Data Structures

Module 4: Priority Queues

COMP Analysis of Algorithms & Data Structures

COMP Analysis of Algorithms & Data Structures

(2,4) Trees. 2/22/2006 (2,4) Trees 1

Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo

Search Trees - 1 Venkatanatha Sarma Y

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

10/23/2013. AVL Trees. Height of an AVL Tree. Height of an AVL Tree. AVL Trees

Self-Balancing Search Trees. Chapter 11

Outline. Computer Science 331. Insertion: An Example. A Recursive Insertion Algorithm. Binary Search Trees Insertion and Deletion.

Algorithms. AVL Tree

CISC 235: Topic 4. Balanced Binary Search Trees

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Inf 2B: AVL Trees. Lecture 5 of ADS thread. Kyriakos Kalorkoti. School of Informatics University of Edinburgh

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

Data Structures and Algorithms

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

CS350: Data Structures Red-Black Trees

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

DATA STRUCTURES AND ALGORITHMS. Hierarchical data structures: AVL tree, Bayer tree, Heap

Algorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs

Advanced Set Representation Methods

CSC Design and Analysis of Algorithms. Lecture 7. Transform and Conquer I Algorithm Design Technique. Transform and Conquer

Augmenting Data Structures

Outline. Binary Tree

Multiway Search Trees. Multiway-Search Trees (cont d)

Lesson 21: AVL Trees. Rotation

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

Search Trees: BSTs and B-Trees

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

Balanced Search Trees. CS 3110 Fall 2010

Motivation for B-Trees

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Data Structures and Algorithms for Engineers

Recall: Properties of B-Trees

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

Some Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees

Transform & Conquer. Presorting

Practical session No. 6. AVL Tree

Lecture 6: Analysis of Algorithms (CS )

Algorithms. Deleting from Red-Black Trees B-Trees

Basic Data Structures (Version 7) Name:

Midterm solutions. n f 3 (n) = 3

Lecture 7. Transform-and-Conquer

CS350: Data Structures AVL Trees

Data Structures in Java

Module 8: Range-Searching in Dictionaries for Points

CSC Design and Analysis of Algorithms

Trees. Eric McCreath

CS350: Data Structures AA Trees

Lower Bound on Comparison-based Sorting

Design and Analysis of Algorithms Lecture- 9: B- Trees

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

(2,4) Trees Goodrich, Tamassia. (2,4) Trees 1

AVL Trees. See Section 19.4of the text, p

CS 234. Module 6. October 16, CS 234 Module 6 ADT Dictionary 1 / 33

Analysis of Algorithms

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

CS102 Binary Search Trees

Lecture 3: B-Trees. October Lecture 3: B-Trees

CSI33 Data Structures

COSC160: Data Structures Balanced Trees. Jeremy Bolton, PhD Assistant Teaching Professor

Disk Accesses. CS 361, Lecture 25. B-Tree Properties. Outline

CSCI Trees. Mark Redekopp David Kempe

Programming II (CS300)

CSE 530A. B+ Trees. Washington University Fall 2013

Binary search trees (BST) Binary search trees (BST)

CS 3343 Fall 2007 Red-black trees Carola Wenk

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

CS 234. Module 8. November 15, CS 234 Module 8 ADT Priority Queue 1 / 22

CS301 - Data Structures Glossary By

Lecture 23: Binary Search Trees

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

Balanced Binary Search Trees

CS 350 : Data Structures Binary Search Trees

CSCI 136 Data Structures & Advanced Programming. Lecture 25 Fall 2018 Instructor: B 2

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

Module 2: Priority Queues

Search Trees. COMPSCI 355 Fall 2016

CIS265/ Trees Red-Black Trees. Some of the following material is from:

Balanced Trees Part One

AVL Tree Definition. An example of an AVL tree where the heights are shown next to the nodes. Adelson-Velsky and Landis

TREES. Trees - Introduction

CMPS 2200 Fall 2017 Red-black trees Carola Wenk

Module 2: Priority Queues

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

CS 171: Introduction to Computer Science II. Binary Search Trees

Balanced Search Trees

Module 2: Priority Queues

CMPS 2200 Fall 2015 Red-black trees Carola Wenk

CS F-11 B-Trees 1

Data Structure - Advanced Topics in Tree -

Data Structure: Search Trees 2. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering

Hash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell

Transcription:

Module 4: Dictionaries and Balanced Search Trees CS 24 - Data Structures and Data Management Jason Hinek and Arne Storjohann Based on lecture notes by R. Dorrigiv and D. Roche David R. Cheriton School of Computer Science, University of Waterloo Winter 22 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 / 3

Dictionary ADT A dictionary is a collection of items, each of which contains a key and some data and is called a key-value pair (KVP). Keys can be compared and are (typically) unique. Operations: search(k) insert(k, v) delete(k) optional: join, isempty, size, etc. Examples: symbol table, license plate database Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 2 / 3

Elementary Implementations Common assumptions: Dictionary has n KVPs Each KVP uses constant space (if not, the value could be a pointer) Comparing keys takes constant time Unordered array or linked list search Θ(n) insert Θ() delete Θ(n) (need to search) Ordered array search Θ(log n) insert Θ(n) delete Θ(n) Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 3 / 3

Binary Search Trees (review) Structure A BST is either empty or contains a KVP, left child BST, and right child BST. Ordering Every key k in T.left is less than the root key. Every key k in T.right is greater than the root key. 5 6 25 23 29 8 4 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 4 / 3

BST Search and Insert search(k) Compare k to current node, stop if found, else recurse on subtree unless it s empty Example: search(24) 5 6 25 23 29 8 4 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

BST Search and Insert search(k) Compare k to current node, stop if found, else recurse on subtree unless it s empty Example: search(24) 5 6 25 23 29 8 4 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

BST Search and Insert search(k) Compare k to current node, stop if found, else recurse on subtree unless it s empty Example: search(24) 5 6 25 23 29 8 4 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

BST Search and Insert search(k) Compare k to current node, stop if found, else recurse on subtree unless it s empty Example: search(24) 5 6 25 23 29 8 4 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

BST Search and Insert search(k) Compare k to current node, stop if found, else recurse on subtree unless it s empty insert(k, v) Search for k, then insert (k, v) as new node Example: insert(24,...) 5 6 25 23 29 8 4 24 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

BST Delete If node is a leaf, just delete it. 5 6 25 23 29 8 4 24 27 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. 5 6 25 23 29 8 4 24 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. If node has one child, move child up 5 6 25 23 29 8 4 24 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. If node has one child, move child up 5 25 8 4 23 29 24 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. If node has one child, move child up Else, swap with successor or predecessor node and then delete 5 25 8 4 23 29 24 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. If node has one child, move child up Else, swap with successor or predecessor node and then delete 23 25 8 4 5 29 24 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

BST Delete If node is a leaf, just delete it. If node has one child, move child up Else, swap with successor or predecessor node and then delete 23 25 8 4 24 29 5 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

Height of a BST search, insert, delete all have cost Θ(h), where h = height of the tree = max. path length from root to leaf If n items are inserted one-at-a-time, how big is h? Worst-case: Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 7 / 3

Height of a BST search, insert, delete all have cost Θ(h), where h = height of the tree = max. path length from root to leaf If n items are inserted one-at-a-time, how big is h? Worst-case: n = Θ(n) Best-case: Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 7 / 3

Height of a BST search, insert, delete all have cost Θ(h), where h = height of the tree = max. path length from root to leaf If n items are inserted one-at-a-time, how big is h? Worst-case: n = Θ(n) Best-case: lg(n + ) = Θ(log n) Average-case: Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 7 / 3

Height of a BST search, insert, delete all have cost Θ(h), where h = height of the tree = max. path length from root to leaf If n items are inserted one-at-a-time, how big is h? Worst-case: n = Θ(n) Best-case: lg(n + ) = Θ(log n) Average-case: Θ(log n) (just like recursion depth in quick-sort) Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 7 / 3

AVL Trees Introduced by Adel son-vel skĭı and Landis in 962, an AVL Tree is a BST with an additional structural property: The heights of the left and right subtree differ by at most. (The height of an empty tree is defined to be.) At each non-empty node, we store height(r) height(l) {,, }: means the tree is left-heavy means the tree is balanced means the tree is right-heavy Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL Trees Introduced by Adel son-vel skĭı and Landis in 962, an AVL Tree is a BST with an additional structural property: The heights of the left and right subtree differ by at most. (The height of an empty tree is defined to be.) At each non-empty node, we store height(r) height(l) {,, }: means the tree is left-heavy means the tree is balanced means the tree is right-heavy We could store the actual height, but storing balances is simpler and more convenient. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL insertion To perform insert(t, k, v): First, insert (k, v) into T using usual BST insertion Then, move up the tree from the new leaf, updating balance factors. If the balance factor is,, or, then keep going. If the balance factor is ±2, then call the fix algorithm to rebalance at that node. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 9 / 3

How to fix an unbalanced AVL tree Goal: change the structure without changing the order A B C D Notice that if heights of A, B, C, D differ by at most, then the tree is a proper AVL tree. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 / 3

Right Rotation This is a right rotation on node z: z y y x z x D C A B C D A B Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 / 3

Right Rotation This is a right rotation on node z: z y y x z x D C A B C D A B Note: Only two edges need to be moved, and two balances updated. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 / 3

Left Rotation This is a left rotation on node z: z y y z x A x B A B C D C D Again, only two edges need to be moved and two balances updated. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 2 / 3

Pseudocode for rotations rotate-right(t ) T : AVL tree returns rotated AVL tree. newroot T.left 2. T.left newroot.right 3. newroot.right T 4. return newroot rotate-left(t ) T : AVL tree returns rotated AVL tree. newroot T.right 2. T.right newroot.left 3. newroot.left T 4. return newroot Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 3 / 3

Double Right Rotation This is a double right rotation on node z: z z y x x D y D A C B C A B First, a left rotation on the left subtree (y). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 4 / 3

Double Right Rotation This is a double right rotation on node z: z x y y z x D A A B C D B C First, a left rotation on the left subtree (y). Second, a right rotation on the whole tree (z). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 4 / 3

Double Left Rotation This is a double left rotation on node z: z x y z y A x D A B C D B C Right rotation on right subtree (y), followed by left rotation on the whole tree (z). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 5 / 3

Fixing a slightly-unbalanced AVL tree Idea: Identify one of the previous 4 situations, apply rotations fix(t ) T : AVL tree with T.balance = ±2 returns a balanced AVL tree. if T.balance = 2 then 2. if T.left.balance = then 3. T.left rotate-left(t.left) 4. return rotate-right(t ) 5. else if T.balance = 2 then 6. if T.right.balance = then 7. T.right rotate-right(t.right) 8. return rotate-left(t ) Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 6 / 3

AVL Tree Operations search: Just like in BSTs, costs Θ(height) insert: Shown already, total cost Θ(height) fix will be called at most once. delete: First search, then swap with successor (as with BSTs), then move up the tree and apply fix (as with insert). fix may be called Θ(height) times. Total cost is Θ(height). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 7 / 3

AVL tree examples Example: insert(8) 22-3 4 4 28 37 6 3 8-46 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: insert(8) 22-3 4 4 28 37 6 3 8-46 8 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: insert(8) 22-3 4 4 28 37 6 3 8-46 8 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: insert(8) 22-3 4 2 4 28 37 6 3 8-46 8 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: insert(8) 22-3 6 4 28 37 4 8 3 8-46 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: delete(22) 22-3 6 4 28 37 4 8 3 8-46 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: delete(22) 28-3 6 4 37 4 8 3 8-46 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: delete(22) 28-3 2 6 4 37 4 8 3 8-46 6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: delete(22) 28-2 37 6 4 3 46 4 8 3 8-6 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

AVL tree examples Example: delete(22) 4-28 6 3 8-37 4 8 6 3 46 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 8 / 3

Height of an AVL tree Define N(h) to be the least number of nodes in a height-h AVL tree. One subtree must have height at least h, the other at least h 2: + N(h ) + N(h 2), h N(h) =, h =, h = What sequence does this look like? Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 9 / 3

Height of an AVL tree Define N(h) to be the least number of nodes in a height-h AVL tree. One subtree must have height at least h, the other at least h 2: + N(h ) + N(h 2), h N(h) =, h =, h = What sequence does this look like? The Fibonacci sequence! N(h) = F h+3 = ϕ h+3 5, where ϕ = + 5 2 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 9 / 3

AVL Tree Analysis Easier lower bound on N(h): N(h) > 2N(h 2) > 4N(h 4) > 8N(h 6) > > 2 i N(h 2i) 2 h/2 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 2 / 3

AVL Tree Analysis Easier lower bound on N(h): N(h) > 2N(h 2) > 4N(h 4) > 8N(h 6) > > 2 i N(h 2i) 2 h/2 Since n > 2 h/2, h 2 lg n, and an AVL tree with n nodes has height O(log n). Also, n 2 h+, so the height is Θ(log n). search, insert, delete all cost Θ(log n). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 2 / 3

2-3 Trees A 2-3 Tree is like a BST with additional structual properties: Every node either contains one KVP and two children, or two KVPs and three children. All the leaves are at the same level. (A leaf is a node with empty children.) Searching through a -node is just like in a BST. For a 2-node, we must examine both keys and follow the appropriate path. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 2 / 3

Insertion in a 2-3 tree First, we search to find the leaf where the new key belongs. If the leaf has only KVP, just add the new one to make a 2-node. Otherwise, order the three keys as a < b < c. Split the leaf into two -nodes, containing a and c, and (recursively) insert b into the parent along with the new link. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 22 / 3

2-3 Tree Insertion Example: insert(9) 25 43 8 3 36 5 2 2 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(9) 25 43 8 3 36 5 2 2 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(9) 25 43 8 3 36 5 2 9 2 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(9) 25 43 8 2 3 36 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(4) 25 43 8 2 3 36 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(4) 25 43 8 2 3 36 5 2 9 24 28 33 39 4 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(4) 25 43 8 2 3 36 4 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(4) 25 36 43 8 2 3 4 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

2-3 Tree Insertion Example: insert(4) 36 25 43 8 2 3 4 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 23 / 3

Deletion from a 2-3 Tree As with BSTs and AVL trees, we first swap the KVP with its successor, so that we always delete from a leaf. Say we re deleting KVP x from a node V : If V is a 2-node, just delete x. ElseIf V has a 2-node immediate sibling U, perform a transfer: Put the intermediate KVP in the parent between V and U into V, and replace it with the adjacent KVP from U. Otherwise, we merge V and a -node sibling U: Remove V and (recursively) delete the intermediate KVP from the parent, adding it to U. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 24 / 3

2-3 Tree Deletion Example: delete(43) 36 25 43 8 2 3 4 5 2 9 24 28 33 39 42 48 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(43) 36 25 48 8 2 3 4 5 2 9 24 28 33 39 42 56 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(43) 36 25 48 8 2 3 4 56 2 9 24 28 33 39 42 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(9) 36 25 48 8 2 3 4 56 2 9 24 28 33 39 42 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(9) 36 25 48 8 3 4 56 2 2 24 28 33 39 42 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(9) 36 25 48 8 3 4 56 2 2 24 28 33 39 42 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(42) 36 25 48 8 3 4 56 2 2 24 28 33 39 42 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(42) 36 25 48 8 3 56 2 2 24 28 33 39 4 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(42) 36 25 8 3 48 56 2 2 24 28 33 39 4 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(42) 25 36 8 3 48 56 2 2 24 28 33 39 4 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

2-3 Tree Deletion Example: delete(42) 25 36 8 3 48 56 2 2 24 28 33 39 4 5 62 Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 25 / 3

B-Trees The 2-3 Tree is a specific type of B-tree: A B-tree of minsize d is a search tree satisfying: Each node contains at most 2d KVPs. Each non-root node contains at least d KVPs. All the leaves are at the same level. Some people call this a B-tree of order (2d + ), or a (d +, 2d + )-tree. A 2-3 tree has d =. search, insert, delete work just like for 2-3 trees. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 26 / 3

Height of a B-tree What is the least number of KVPs in a height-h B-tree? Level Nodes Node size KVPs 2 d 2d 2 2(d + ) d 2d(d + ) 3 2(d + ) 2 d 2d(d + ) 2 h 2(d + ) h d 2d(d + ) h h Total: + 2d(d + ) i = 2(d + ) h i= Therefore height of tree with n nodes is Θ ( (log n)/(log d) ). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 27 / 3

Analysis of B-tree operations Assume each node stores its KVPs and child-pointers in a dictionary that supports O(log d) search, insert, and delete. Then search, insert, and delete work just like for 2-3 trees, and each require Θ(height) node operations. Total cost is O ( log n log d ) (log d) = O(log n). Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 28 / 3

Dictionaries in external memory Tree-based data structures have poor memory locality: If an operation accesses m nodes, then it must access m spaced-out memory locations. Observation: Accessing a single location in external memory (e.g. hard disk) automatically loads a whole block (or page ). In an AVL tree or 2-3 tree, Θ(log n) pages are loaded in the worst case. If d is small enough so a 2d-node fits into a single page, then a B-tree of minsize d only loads Θ ( (log n)/(log d) ) pages. This can result in a huge savings: memory access is often the largest time cost in a computation. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 29 / 3

B-tree variations Max size 2d + : Permitting one additional KVP in each node allows insert and delete to avoid backtracking via pre-emptive splitting and pre-emptive merging. Red-black trees: Identical to a B-tree with minsize and maxsize 3, but each 2-node or 3-node is represented by 2 or 3 binary nodes, and each node holds a color value of red or black. B + -trees: All KVPs are stored at the leaves (interior nodes just have keys), and the leaves are linked sequentially. Hinek & Storjohann (CS, UW) CS24 - Module 4 Winter 22 3 / 3