Multi-way Search Trees: M-Way Search, M-Way Search Trees Representation


Lecture 10: Multi-way Search Trees

Outline: intro to B-trees, 2-3 trees, 2-3-4 trees

Multi-way Search Trees

A node of an M-way search tree with M − 1 distinct and ordered keys, $k_1 < k_2 < k_3 < \dots < k_{M-1}$, has M children $\{T_0, T_1, T_2, \dots, T_{M-1}\}$.
Every element in child $T_i$ has a value larger than $k_i$ and smaller than $k_{i+1}$ ($T_0$ has no lower bound and $T_{M-1}$ has no upper bound).
The number of valid keys doesn't have to be the same for every node of the tree.

[Figure: a node T with keys $k_1, k_2, k_3, \dots, k_{M-1}$ and subtrees $T_0, T_1, T_2, T_3, \dots, T_{M-1}$, next to an example tree.]

M-Way Search Trees Representation

A node of an M-way search tree can be represented as:

    #define M 3                      /* maximum number of children per node */

    struct mnode {
        int in_use;                  /* how many of the M-1 key slots are in use */
        Key keys[M-1];
        struct mnode *children[M];
    };

in_use: how many of the M − 1 keys of this node are currently in use.

[Figure: example node with in_use = 2 and keys 9, 15.]

M-Way Search

Search on an M-way search tree is similar to search on a BST, with more than one comparison per node (a BST is just an M-way search tree with M = 2).
If all nodes have M − 1 keys, then with linear search within the nodes it takes $O(M \log_M N)$ time to search an M-way search tree of N internal nodes ((M − 1)N keys).
With binary search within the nodes, it takes $O(\log_2 M \cdot \log_M N)$ time, which is the same as $O(\log_2 N)$.
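The search described above can be sketched in C as follows. This is a minimal illustration rather than the lecture's own code: it assumes `Key` is a plain `int`, fixes the branching factor with a compile-time macro `M`, and uses the linear scan within each node; the function name `mway_search` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define M 3                       /* branching factor, assumed fixed at compile time */
typedef int Key;                  /* assumption: keys are plain ints */

struct mnode {
    int in_use;                   /* number of keys currently stored, 0..M-1 */
    Key keys[M - 1];              /* keys[0] < keys[1] < ... */
    struct mnode *children[M];    /* children[i] lies between keys[i-1] and keys[i] */
};

/* Return the node that contains key, or NULL if the key is absent.
   Each node is scanned linearly, so the work per level is O(M). */
const struct mnode *mway_search(const struct mnode *t, Key key)
{
    while (t != NULL) {
        int i = 0;
        assert(t->in_use >= 0 && t->in_use <= M - 1);
        while (i < t->in_use && key > t->keys[i])
            i++;                  /* skip keys smaller than the target */
        if (i < t->in_use && key == t->keys[i])
            return t;             /* found in this node */
        t = t->children[i];       /* descend into the i-th subtree */
    }
    return NULL;
}
```

Replacing the inner `while` loop with a binary search over `keys[0..in_use-1]` gives the $O(\log_2 M \cdot \log_M N)$ variant.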

Advantage 1: External Search

An M-way search tree can be used as an index for external (disk) searching of large files and databases.

Characteristics of disk access:
- orders of magnitude slower than memory access
- for efficiency, data is usually transferred in blocks of 512 bytes to 8 KB

To speed up external search, put as much data as possible on each disk block, for example by making each node of an M-way search tree the size of a disk block.

B-Trees

Invented by Bayer and McCreight in 1972.
A B-tree is a balanced M-way search tree, M ≥ 2 (usually ~100).
Search takes $O(\log M \log N)$ time.
Insertion and removal each take $O(M \log N)$ time.
B-trees are covered in detail in EECS 484; here we look at the in-memory versions: 2-3 trees and 2-3-4 trees (a.k.a. 2-4 trees).

Advantage 2: Balanced Search Trees

AVL trees keep a BST balanced by limiting how unbalanced a tree can be.
Perfect binary trees are by definition balanced, but a perfect binary tree of height h has exactly $2^{h+1} - 1$ internal nodes, so only trees with 1, 3, 7, 15, 31, 63, ... internal nodes can be balanced in this sense.
Balanced M-way trees prevent a tree from becoming unbalanced by storing more than one key per node, so that the trees are always perfect (but not binary) trees.

[Figure: a BST and a 2-3 tree after inserting 39, 38, ..., 32.]
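To make the benefit of wide nodes concrete, here is a small worked calculation. It is an illustration under stated assumptions, not a figure from the lecture: every node is assumed full (M − 1 keys), the tree is assumed perfect, each node is assumed to occupy one disk block, and M = 100 is taken from the "usually ~100" remark above.

```latex
% Perfect M-way tree of height h, every node full:
\text{nodes} = \sum_{i=0}^{h} M^{i} = \frac{M^{h+1}-1}{M-1},
\qquad
\text{keys} = (M-1)\cdot\frac{M^{h+1}-1}{M-1} = M^{h+1}-1 .

% With M = 100 and h = 2 (three levels, so at most three block reads per search):
\text{keys} = 100^{3}-1 = 999{,}999 .

% A perfect binary tree (M = 2) holding at least as many keys needs
2^{h+1}-1 \ge 999{,}999
\;\Longrightarrow\; h+1 \ge 20
\quad (2^{19} = 524{,}288,\; 2^{20} = 1{,}048{,}576),
% that is, 20 levels instead of 3.
```

Setting M = 2 or M = 3 in the key-count formula gives $2^{h+1}-1$ and $3^{h+1}-1$, the perfect binary and all-3-node cases.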

2-3 Trees

Properties (balance condition) of 2-3 trees:
1. all leaf nodes are at the same level and contain 1 or 2 keys
2. an internal node has either 1 key and 2 children (a 2-node) or 2 keys and 3 children (a 3-node)
3. a key in an internal node is between the keys in its adjacent children

2-3 Tree Example

Demo: http://slady.net/java/bt/view.php?w=600&h=450

[Figure: a 2-node with key S has subtrees holding keys < S and keys > S; a 3-node with keys S (small) and L (large) has subtrees holding keys < S, S < keys < L, and keys > L.] [Carrano]

2-3 Trees Search Times

A 2-3 tree of height h has the least number of nodes when all internal nodes are 2-nodes (a BST). Since all leaves must be at the same level, the tree is a perfect tree, and the number of nodes (and therefore keys) is $n = 2^{h+1} - 1$, so $h = \lfloor \log_2 n \rfloor$.

A 2-3 tree of height h has the largest number of nodes when all internal nodes are 3-nodes:
- number of nodes: $N = \sum_{i=0}^{h} 3^i = (3^{h+1} - 1)/2$
- number of keys (each node has 2 keys): $n = 3^{h+1} - 1$, so $h = \lfloor \log_3 n \rfloor$

Search time on 2-3 trees: $O(\log n)$

2-3 Trees Insert

As with a BST, a new key is always inserted at a leaf node:
1. Search for the leaf where the key belongs; remember the search path.
2. If the leaf is a 2-node, add the key to the leaf.
3. If the leaf is a 3-node, adding the new key makes it an invalid node with 3 keys; split the invalid node into two 2-nodes holding the smallest and largest keys, and pass the middle key up to the parent.
4. If the parent is a 2-node, add the child's middle key together with the two new children; else split the parent by Step 3 above.
5. If there's no parent, create a new root (increasing the tree height by 1).

Observation: whereas a BST increases height by extending a single path, a 2-3 tree increases height globally by raising the root, hence it's always balanced.
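The insertion steps above can be sketched in C as shown below. This is one possible arrangement, not the lecture's implementation: the names `node23`, `insert23`, and `height23` are invented for the example, keys are assumed to be `int`, duplicates are silently ignored, and each node carries one spare key and child slot so that the transient "invalid node with 3 keys" from Step 3 can exist briefly before it is split. Recursion stands in for "remember the search path".

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node23 {
    int nkeys;                    /* 1 or 2 in a valid node; 3 only transiently */
    int keys[3];                  /* one spare slot for the transient overflow */
    struct node23 *child[4];      /* all NULL in a leaf */
} node23;

static node23 *new_node(int key, node23 *left, node23 *right)
{
    node23 *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->nkeys = 1;
    n->keys[0] = key;
    n->child[0] = left;
    n->child[1] = right;
    return n;
}

/* Insert key below n. If n overflows it is split (Step 3): n keeps the
   smallest key, *up receives the middle key, and the returned new node
   holds the largest key. Returns NULL when nothing propagates upward. */
static node23 *insert_rec(node23 *n, int key, int *up)
{
    int i = 0, j, mid;
    node23 *right = NULL;

    while (i < n->nkeys && key > n->keys[i]) i++;
    if (i < n->nkeys && key == n->keys[i]) return NULL;   /* duplicate */

    if (n->child[0] != NULL) {                 /* internal node: go down */
        right = insert_rec(n->child[i], key, &mid);
        if (right == NULL) return NULL;        /* child absorbed the key */
        key = mid;                             /* Step 4: take child's middle key */
    }
    for (j = n->nkeys; j > i; j--) {           /* open slot i */
        n->keys[j] = n->keys[j - 1];
        n->child[j + 1] = n->child[j];
    }
    n->keys[i] = key;
    n->child[i + 1] = right;
    n->nkeys++;
    if (n->nkeys < 3) return NULL;             /* Step 2: still a valid node */

    *up = n->keys[1];                          /* Step 3: split the 3-key node */
    right = new_node(n->keys[2], n->child[2], n->child[3]);
    n->nkeys = 1;
    n->child[2] = n->child[3] = NULL;
    return right;
}

/* Insert key and return the (possibly new) root. */
node23 *insert23(node23 *root, int key)
{
    int up;
    node23 *right;
    if (root == NULL) return new_node(key, NULL, NULL);
    right = insert_rec(root, key, &up);
    if (right != NULL)                         /* Step 5: raise the root */
        root = new_node(up, root, right);
    return root;
}

/* Height of the tree (a leaf has height 0), or -1 if leaves differ in level. */
int height23(const node23 *n)
{
    int i, h;
    if (n->child[0] == NULL) return 0;
    h = height23(n->child[0]);
    for (i = 1; i <= n->nkeys && h >= 0; i++)
        if (height23(n->child[i]) != h) h = -1;
    return h < 0 ? -1 : h + 1;
}
```

Inserting a descending run such as 39, 38, ..., 32 would degenerate a plain BST into a list, whereas `height23` on the result of `insert23` stays small because the only way the tree grows taller is the root split in `insert23`.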

[Figure: Splitting a Leaf Node] [Gordon-Ross]
[Figure: Splitting a Non-Leaf Node] [Gordon-Ross]
[Figure: Splitting and Raising the Root] [Gordon-Ross]

Insert 39: insert into a 2-node leaf, no splitting necessary. [Gordon-Ross]
Insert 38: insert into a 3-node leaf, splitting necessary (split leaf node). [Gordon-Ross]

Exercise: Insert 15
Exercise: Insert 75, then 85
[Figure: exercise tree; node labels in extraction order: 30 50 15 39 10 20 38 40]

2-3 Trees Removal

1. If the item to be removed is not in a leaf node, swap it with the minimum of the subtree to the key's right (the next bigger item, or in-order successor). Let n be the leaf that now holds the item.
2. If n is a 3-node, remove the item, done.
3. Else if n is a 2-node, remove the item, then:

    while (n has no item && n is not root) {
        let p be the parent of n;
        let q be the adjacent sibling of n (left or right);
        if (q is a 3-node) rotate items;
        else merge nodes;
        n = p;
    }
    if (n has no item)    /* n must be the root */
        the child of n becomes the new root, remove n;

[Figures: Remove 65 (swap, then remove item, done); Remove 70 (continued on the next slide).]

Removal: Leaf Node

Merging (sibling is not a 3-node): merge the nodes.
- move an item from the parent to the sibling
- merge to the left, to fill the node left to right (think: array)
- merging could leave the parent without any item
[Figure: two merge cases; legend S: Small, L: Large, P: Parent; n marks the empty node.] [Singh,Carrano]

Rotation (sibling is a 3-node): redistribute items between the siblings and the parent.
- take from right, to empty parent right to left (think: array implementation)
[Figure: Remove 100 by rotation; legend S: Small, L: Large, P: Parent; n marks the empty node.] [Singh,Carrano]

Removal: Non-Leaf Node

Rotation (sibling is a 3-node): redistribute items and adopt a child.
Merging (sibling is not a 3-node): merge the nodes; move an item from the parent to the sibling and adopt the child of n.
If n's parent ends up without an item, apply the process to the parent.
[Figure legend: S: Small, L: Large, P: Parent, PS: Parent of S, PL: Parent of L; n marks the empty node.] [Singh,Carrano]
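The swap in Step 1 of removal relies on finding the in-order successor, which is the leftmost key of the subtree immediately to the right of the key. A minimal C sketch follows; the node layout and the name `swap_with_successor` are assumptions made for this example, and the rotate and merge fix-ups are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node23 {
    int nkeys;                    /* 1 (2-node) or 2 (3-node) */
    int keys[2];
    struct node23 *child[3];      /* all NULL in a leaf */
} node23;

/* Swap keys[i] of the internal node n with its in-order successor and
   return the leaf that now holds the key to be removed (in slot 0). */
node23 *swap_with_successor(node23 *n, int i)
{
    node23 *leaf;
    int tmp;
    assert(n->child[0] != NULL && i >= 0 && i < n->nkeys);
    leaf = n->child[i + 1];                   /* subtree to the key's right */
    while (leaf->child[0] != NULL)
        leaf = leaf->child[0];                /* keep going left: the minimum */
    tmp = n->keys[i];
    n->keys[i] = leaf->keys[0];
    leaf->keys[0] = tmp;
    return leaf;
}
```

Because the key moved into the leaf is smaller than everything else in that leaf, the leaf stays sorted until the key is actually removed.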

Removal: Root

If the merging process reaches the root and the root is left without an item, remove the root.

Observation: whereas a BST pushes holes down to the leaves, a 2-3 tree percolates holes up and decreases height globally by lowering the root.

Remove 80: Percolated Merging

[Figures: removing 80 in steps: swap; can't redistribute, must merge; the merge violates the 2-3 tree property, so merge recursively up the tree; the root is now empty, so remove the root and set the tree to point to the new root.] [Singh,Carrano]

Remove 37, Then 70

1. Removal always begins at a leaf node.
2. If the item to be removed is not in a leaf node, swap it with its in-order successor.
3. If n is a 3-node, remove the item, done.
4. If n is a 2-node:

    while (n has no item && n is not root) {
        let p be the parent of n;
        let q be the adjacent sibling of n (left or right);
        if (q is a 3-node) rotate items;
        else merge nodes;
        n = p;
    }
    if (n has no item)    /* n must be the root */
        the child of n becomes the new root, remove n;

[Figures: worked removal of 37 and then 70, with steps labeled swap, rotate, and merge.]

[Figure: Remove 70, continued.]

2-3-4 Trees

Similar to 2-3 trees:
- also known as 2-4 trees
- demo: http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml
- 4-nodes can have 3 items and 4 children

[Figure: a 4-node.]

Why bother? Unlike with 2-3 trees, insertions and removals in 2-3-4 trees can be done in one pass. [Singh,Carrano]

[Figure: 2-3-4 Tree Example]

2-3-4 Trees Insert

Items are inserted at leaf nodes.
Since a 4-node cannot take on another item, 4-nodes are preemptively split up during the insertion process.
On the way from the root down to the leaf, split up all 4-nodes on the way:
- insertion can be done in one pass (in 2-3 trees, a reverse pass is likely necessary)
- no worrying about overflowing a node when we actually do the insertion
- the only kind of node that can overflow (a 4-node) has been made a 2- or 3-node
[Singh,Carrano]
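The one-pass insertion with preemptive splitting described above can be sketched in C as follows. The names (`node234`, `split_child`, `insert234`) and the fixed three-key node layout are assumptions made for illustration, keys are assumed to be `int`, and duplicates are ignored; this is one plausible arrangement rather than the lecture's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node234 {
    int nkeys;                     /* 1..3 (0 only for a root being built) */
    int keys[3];
    struct node234 *child[4];      /* all NULL in a leaf */
} node234;

static node234 *make_node(void)
{
    node234 *n = calloc(1, sizeof *n);
    assert(n != NULL);
    return n;
}

/* Split the 4-node p->child[i]. p is never a 4-node here, so it has room
   for the middle key that moves up. */
static void split_child(node234 *p, int i)
{
    node234 *c = p->child[i], *r = make_node();
    int j;
    assert(c->nkeys == 3 && p->nkeys < 3);
    r->nkeys = 1;                              /* right half: largest key */
    r->keys[0] = c->keys[2];
    r->child[0] = c->child[2];
    r->child[1] = c->child[3];
    for (j = p->nkeys; j > i; j--) {           /* open slot i in the parent */
        p->keys[j] = p->keys[j - 1];
        p->child[j + 1] = p->child[j];
    }
    p->keys[i] = c->keys[1];                   /* middle key moves up */
    p->child[i + 1] = r;
    p->nkeys++;
    c->nkeys = 1;                              /* left half: smallest key */
    c->child[2] = c->child[3] = NULL;
}

/* Insert key in a single top-down pass and return the (possibly new) root. */
node234 *insert234(node234 *root, int key)
{
    node234 *n;
    int i, j;
    if (root == NULL) {
        root = make_node();
        root->nkeys = 1;
        root->keys[0] = key;
        return root;
    }
    if (root->nkeys == 3) {                    /* 4-node root: raise the root */
        node234 *top = make_node();
        top->child[0] = root;
        split_child(top, 0);
        root = top;
    }
    n = root;
    for (;;) {
        i = 0;
        while (i < n->nkeys && key > n->keys[i]) i++;
        if (i < n->nkeys && key == n->keys[i]) return root;   /* duplicate */
        if (n->child[0] == NULL) break;        /* reached the leaf */
        if (n->child[i]->nkeys == 3) {         /* pre-split before descending */
            split_child(n, i);
            if (key == n->keys[i]) return root;
            if (key > n->keys[i]) i++;
        }
        n = n->child[i];
    }
    for (j = n->nkeys; j > i; j--)             /* the leaf is a 2- or 3-node */
        n->keys[j] = n->keys[j - 1];
    n->keys[i] = key;
    n->nkeys++;
    return root;
}
```

Note that the loop never walks back up: every node entered has already been made a 2- or 3-node, so the final leaf insertion cannot overflow.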

2-3-4 Trees Insert: Splitting a 4-node
[Figure; legend S: Small, M: Medium, L: Large.]

2-3-4 Trees Insert: Splitting a 4-node whose parent is a 2-node
[Figure; legend S: Small, M: Medium, L: Large, P: Parent.]

2-3-4 Trees Insert: Splitting a 4-node whose parent is a 3-node
[Figure; legend S: Small, M: Medium, L: Large, P: Parent, Q: Parent.]

2-3-4 Trees Insert Example

Inserting 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100
[Figures: the tree after each insertion, with steps labeled "pre-split" (before inserting 20, 70, 90, and 100) and "raise root".]

[Figure: 2-3-4 Trees Insert Example, inserting 100.]

2-3-4 Trees Removal

Removal always begins at a leaf node: swap the item of a non-leaf node with its in-order successor.
Whereas a 4-node can overflow during insertion, a 2-node can become empty during removal.
On the way from the root down to the leaf, turn 2-nodes (except the root) into 3-nodes:
- this prevents a 2-node from becoming an empty node
- deletion can be done in one pass (in 2-3 trees, a reverse pass is likely necessary)
[Singh,Carrano]

Removal: 2-Node → 3-Node

Rotate: if an adjacent sibling is a 3- or 4-node, redistribute items from the sibling and adopt a child.
[Figure: delete 35, rotate case.]

Merge: if the adjacent sibling is a 2-node, redistribute an item from the parent (the parent has at least 2 items, unless it is the root) and merge the nodes.
[Figure: delete 35, merge case.]
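The rotate case above can be sketched in C for the situation where the donor is the right sibling. The node layout and the name `rotate_from_right` are assumptions made for this example; the mirror-image case (borrowing from a left sibling) and the merge case are not shown.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node234 {
    int nkeys;                     /* 1..3 */
    int keys[3];
    struct node234 *child[4];      /* all NULL in a leaf */
} node234;

/* Turn the 2-node p->child[i] into a 3-node by borrowing through the
   parent from its right sibling, which must be a 3- or 4-node. */
void rotate_from_right(node234 *p, int i)
{
    node234 *n = p->child[i], *q = p->child[i + 1];
    int j;
    assert(n->nkeys == 1 && q->nkeys >= 2);
    n->keys[1] = p->keys[i];       /* separator in the parent moves down */
    n->child[2] = q->child[0];     /* adopt the sibling's leftmost child */
    n->nkeys = 2;
    p->keys[i] = q->keys[0];       /* sibling's smallest item moves up */
    for (j = 0; j + 1 < q->nkeys; j++)
        q->keys[j] = q->keys[j + 1];
    for (j = 0; j < q->nkeys; j++)
        q->child[j] = q->child[j + 1];
    q->child[q->nkeys] = NULL;
    q->nkeys--;
}
```

The parent keeps the same number of items, so a rotation never propagates further up the tree.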

Removal: 2-Node → 3-Node

Root merge: if the parent is the root and both the parent and the adjacent sibling are 2-nodes, merge with the parent and the sibling.
[Figure: delete 35, root-merge case.]

2-3-4 Trees Removal Summary

On the way from the root down to the leaf, turn 2-nodes (except the root) into 3-nodes:
- Rotate: if an adjacent sibling is a 3- or 4-node, redistribute items from the sibling (take from right) and adopt a child.
- Merge: if the adjacent sibling is a 2-node, redistribute an item from the parent (the parent has at least 2 items, unless it is the root) and merge the nodes (merge left).

Exercise: remove 32, 35, 40, 38, 39, 37, 60

Starting tree: root [37 50]; second level [30 35], [39], [70 90]; leaves [10 20], [32 33 34], [36], [38], [40], [60], [80], [100].
[Figures: worked solution, beginning with Remove 32 and Remove 35, with pre-rotate steps.]

[Figures: Remove 32, 35, 40, 38, 39, 37, 60, continued: Remove 40, Remove 38, Remove 39, and Remove 37, with pre-merge and pre-rotate steps.]

[Figures: Remove 32, 35, 40, 38, 39, 37, 60, continued: Remove 60, with the notes "root not pre-merged" and "pre-merge".]

Compared to 2-3 Trees

Insertion and deletion are easier for a 2-4 tree:
- one pass
- no need to percolate an over- or under-flowing node all the way back up to the root

But at the cost of:
- an extra comparison in each node
- wasted space in each node (a 2-node is actually a 4-node with two empty slots and 2 null pointers)
- pre-emptive splitting of 4-nodes pre-allocates space that may not be needed right away, further wasting space
- the number of NULL pointers in a tree with N internal nodes is 4N − (N − 1) = 3N + 1
[Rosenfeld,Brinton]

Implementation

While 2-3 trees and 2-4 trees are conceptually clean, their implementation is complicated because we need to maintain multiple node types and there are a lot of cases to consider, such as whether we are:
- redistributing from a left sibling or a right sibling
- merging with a 2-node or a 3-node
- merging with the small or the large item of the parent
- passing a node to a 2-node or to a 3-node parent
- filling the small, middle, or large item slot at the parent
- adopting a left child or a right child
- rotating left or right

It would be nice if we could simplify these cases and reduce the amount of wasted space by turning 2-3 and 2-4 trees into binary trees. [Rosenfeld]
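The NULL-pointer count quoted above, 4N − (N − 1) = 3N + 1, follows because every node carries 4 child slots and every node except the root is pointed to by exactly one of them. The short C sketch below checks this by direct counting; the uniform node layout and the name `tally` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node234 {
    int nkeys;                     /* 1..3 */
    int keys[3];
    struct node234 *child[4];      /* every node reserves 4 child slots */
} node234;

/* Walk the tree, counting nodes and NULL child slots. */
void tally(const node234 *n, int *nodes, int *nulls)
{
    int i;
    if (n == NULL) return;
    assert(n->nkeys >= 1 && n->nkeys <= 3);
    (*nodes)++;
    for (i = 0; i < 4; i++) {
        if (n->child[i] == NULL)
            (*nulls)++;            /* an unused slot */
        else
            tally(n->child[i], nodes, nulls);
    }
}
```

For a root 2-node with two leaf children, N = 3: the 12 slots hold 2 real pointers, leaving 10 = 3·3 + 1 NULL pointers.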