Red-black trees (19.5), B-trees (19.8), trees

Similar documents
CS350: Data Structures Red-Black Trees

Chapter 12 Advanced Data Structures

12 July, Red-Black Trees. Red-Black Trees

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Self-Balancing Search Trees. Chapter 11

Binary Search Trees. Analysis of Algorithms

Binary search trees (chapters )

ECE 242 Data Structures and Algorithms. Trees IV. Lecture 21. Prof.

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

Algorithms. AVL Tree

Binary search trees (chapters )

Balanced Search Trees. CS 3110 Fall 2010

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Red-Black trees are usually described as obeying the following rules :

B Tree. Also, every non leaf node must have at least two successors and all leaf nodes must be at the same level.

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

CISC 235: Topic 4. Balanced Binary Search Trees

Motivation for B-Trees

Binary heaps (chapters ) Leftist heaps

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

CSE 530A. B+ Trees. Washington University Fall 2013

CS350: Data Structures B-Trees

Trees. (Trees) Data Structures and Programming Spring / 28

Augmenting Data Structures

Search Trees - 2. Venkatanatha Sarma Y. Lecture delivered by: Assistant Professor MSRSAS-Bangalore. M.S Ramaiah School of Advanced Studies - Bangalore

BRONX COMMUNITY COLLEGE of the City University of New York DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Some Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees

Red-black tree. Background and terminology. Uses and advantages

Algorithms. Deleting from Red-Black Trees B-Trees

CS102 Binary Search Trees

AVL Trees (10.2) AVL Trees

Laboratory Module X B TREES

CS 350 : Data Structures B-Trees

CS 331 DATA STRUCTURES & ALGORITHMS BINARY TREES, THE SEARCH TREE ADT BINARY SEARCH TREES, RED BLACK TREES, THE TREE TRAVERSALS, B TREES WEEK - 7

Define the red- black tree properties Describe and implement rotations Implement red- black tree insertion

COMP171. AVL-Trees (Part 1)

CS350: Data Structures AA Trees

Balanced Search Trees

Data Structures in Java

AVL Trees Goodrich, Tamassia, Goldwasser AVL Trees 1

Comp 335 File Structures. B - Trees

Balanced Trees Part One

CS 206 Introduction to Computer Science II

Unit III - Tree TREES

Trees. Eric McCreath

AVL Tree Definition. An example of an AVL tree where the heights are shown next to the nodes. Adelson-Velsky and Landis

CIS265/ Trees Red-Black Trees. Some of the following material is from:

Trees. A tree is a directed graph with the property

Ch04 Balanced Search Trees

A red-black tree is a balanced binary search tree with the following properties:

Data Structures Lesson 7

Copyright 1998 by Addison-Wesley Publishing Company 147. Chapter 15. Stacks and Queues

Splay Trees. (Splay Trees) Data Structures and Programming Spring / 27

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

Friday Four Square! 4:15PM, Outside Gates

Lecture 21: Red-Black Trees

Red-Black-Trees and Heaps in Timestamp-Adjusting Sweepline Based Algorithms

BINARY SEARCH TREES cs2420 Introduction to Algorithms and Data Structures Spring 2015

AVL Trees / Slide 2. AVL Trees / Slide 4. Let N h be the minimum number of nodes in an AVL tree of height h. AVL Trees / Slide 6

UNIT III BALANCED SEARCH TREES AND INDEXING

CS 758/858: Algorithms

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

Balanced Binary Search Trees

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

Section 4 SOLUTION: AVL Trees & B-Trees

Dynamic Access Binary Search Trees

AVL Trees. (AVL Trees) Data Structures and Programming Spring / 17

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

Note that this is a rep invariant! The type system doesn t enforce this but you need it to be true. Should use repok to check in debug version.

Red-Black Trees. Based on materials by Dennis Frey, Yun Peng, Jian Chen, and Daniel Hood

Balanced Trees. Prof. Clarkson Fall Today s music: Get the Balance Right by Depeche Mode

We will show that the height of a RB tree on n vertices is approximately 2*log n. In class I presented a simple structural proof of this claim:

Data Structure - Advanced Topics in Tree -

Balanced Trees. Nate Foster Spring 2018

Properties of red-black trees

Search Trees. COMPSCI 355 Fall 2016

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Multi-Way Search Trees

Balanced Binary Search Trees

Balanced search trees. DS 2017/2018

Balanced Binary Search Trees. Victor Gao

CS350: Data Structures AVL Trees

13.4 Deletion in red-black trees

Multi-Way Search Trees

Search Trees. Chapter 11

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

Chapter 22 Splay Trees

13.4 Deletion in red-black trees

Data Structures and Algorithms

Lecture 5. Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs

Part 2: Balanced Trees

CmpSci 187: Programming with Data Structures Spring 2015

COMP : Trees. COMP20012 Trees 219

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Programming II (CS300)

CS Transform-and-Conquer

TREES. Trees - Introduction

ECE250: Algorithms and Data Structures AVL Trees (Part A)

Transcription:

Red-black trees (19.5), B-trees (19.8), 2-3-4 trees

Red-black trees A red-black tree is a balanced BST It has a more complicated invariant than an AVL tree: Each node is coloured red or black A red node cannot have a red child In any path from the root to a null, the number of black nodes is the same The root node is black Implicitly, a null is coloured black

A red-black tree 13 8 15 4 10 14 16 2 6 9 11 1 3 5 7 12

Not red-black trees why? 13 13 13 8 15 8 15 15 4 10 4 10 9 11 13 8 15

Red-black trees invariant A red node cannot have a red child If the shortest path has k nodes (all black)... In each path from the root to a leaf, the number of black nodes is the same...then the longest path can only have 2k nodes Maximum height 2 log n, where n is number of nodes

Maintaining the red-black invariant In AVL trees, we maintained the invariant by rotating parts of the tree In red-black trees, we use two operations: rotations recolouring: changing a red node to black or vice versa Recolouring is an administrative operation that doesn't change the structure or contents of the tree

AVL versus red-black trees To insert a value into an AVL tree: go down the tree to find the parent of the new node insert a new node as a child go up the three, rebalancing...so two passes of the tree (down and up) required in the worst case In a red-black tree: go down the tree to find the parent of the new node......but rebalance and recolour the tree as you go down after inserting, no need to go up the tree again

Red-black insertion First, add the new node as in a BST, making it red P P...... X If the new node's parent is black, everything's fine

Red-black insertion If the parent of the new node is red, we have broken the invariant. (How?) We need to repair it. We need to consider several cases. In all cases, since the parent node is red, the grandparent is black. (Why?) Let's take the case where the parent's sibling is black.

Left-left tree ( outside grandchild ) G X: Newly-inserted node, breaks invariant P S P: Parent of new node X C D E G: Grandparent of new node A B S: Sibling of parent

Left-left tree ( outside grandchild ) G Recolouring G P S P S X C D E X C D E A B Now the number of black nodes in each path has changed but right rotation will fix it A B

Left-left tree ( outside grandchild ) X < P < G < S G Right rotation and recolouring P P S X G X C D E A B C S A B Why does this now satisfy the invariant? D E

Left-right tree ( inside grandchild ) P < X < G < S G Left rotation of P G P S X S A B X D C E A P B C D E Now we have a left-left tree! We know how to fix that already.

Left-right tree ( inside grandchild ) G Right rotation and recolouring X X S P G P C D E A B C S A B D E

Summary so far Insert the new node as in a BST, make it red Problem: if the parent is red, the invariant is broken (red node with red child) To fix a red node with a red child: If the node has a black sibling, rotate and recolour If the node has a red sibling,? Two approaches, bottom-up (simpler) and top-down (more efficient)

Bottom-up insertion If a new node, its parent and its parent's sibling are all red: do a colour flip Make the parent and its sibling black, and the grandparent red G Colour flip G P S P S X C D E X C D E A B A B

Bottom-up insertion A colour flip almost restores the invariant......but if G has a red parent, we will have a red node with a red child So move up the tree to G and apply the same double-red repair process there as we did to X.

Bottom-up insertion Insert the new node as in a BST, make it red If the new node has a red parent P: If the parent's sibling S is black, use rotations and recolourings to fix it the rotations are the same as in an AVL tree If S is red, do a colour flip, which makes the grandparent G red so you need to do the same double-red repair to G if its parent is red Lastly: if you get to the root and the root is red, make it black

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black 1.The root is always black 2.A red node always has black children (a null reference is considered to refer to a black node) 3.The number of black nodes in any path from the root to a leaf is the same 19

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 4 20

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 4 Colour flip! 21

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 4 The problem has now shifted up the tree 22

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 4 Left-right tree: Rotate left about 2 23

Insättning, ett enkelt exempel 1 2 11 7 14 5 8 4 Left-left tree: swap colours of 7 and 11 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 24

Insättning, ett enkelt exempel 1 2 11 7 14 8 5 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same 4 Left-left tree: Rotate right around 11 to restore the balance 25

Insättning, ett enkelt exempel 1 7 2 11 5 8 4 14 Invariants: A node is either red or black 1.The root is always black 2.A red node always has black children (a null reference is considered to refer to a black node) 3.The number of black nodes in any path from the root to a leaf is the same 26

Top-down insertion In bottom-up insertion, we sometimes need to move up the tree rebalancing and recolouring it after we insert an element But this only happens if P and S are both red Idea: as we go down the tree looking for the insertion point, rebalance and recolour the tree so that either P or S is black that way we never need to move up the tree again after insertion!

Top-down insertion If on the way down we come across a node X with two red children, colour-flip it immediately! X Colour flip X L R L R A B C D A B C D But what if X's parent is also red? We break the invariant! Observation: X's parent's sibling must be black (or we would've colour-flipped them on the way down), so a single rotation + recolouring will fix the invariant!

Top-down insertion Go down the tree as before Whenever a node X has two red children, colour-flip; if X's parent P is red, use rotations and recolourings as before to fix it This is easy because P's sibling must be black Insert the new node as usual, making it red; if the parent P is also red, use rotations and recolourings to fix it Again, P's sibling is black so we avoid the colour flip case

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 We would've visited 5 next: remember it! Invariants: A node is either red or black 1.The root is always black 2.A red node always has black children (a null reference is considered to refer to a black node) 3.The number of black nodes in any path Inserting from 4, the root to a leaf is we the get same to a node with two red children: colour flip! 30

Insättning, ett enkelt exempel 1 11 2 14 7 5 8 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same Left-right tree: Rotate left about 2 31

Insättning, ett enkelt exempel 1 2 11 7 14 8 5 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same Left-left tree: swap colours of 7 and 11 32

Insättning, ett enkelt exempel 1 2 11 7 14 8 5 Invariants: A node is either red or black The root is always black A red node always has black children (a null reference is considered to refer to a black node) 1.The number of black nodes in any path from the root to a leaf is the same Left-left tree: Rotate right around 11 to restore the balance 33

Insättning, ett enkelt exempel 1 7 2 11 5 8 14 Invariants: A node is either red or black 1.The root is always black 2.A red node always has black children (a null reference is considered to refer to a black node) 3.The number of black nodes in any path from the root to a leaf is the same The colour flip is finished. Now we continue down and insert 4! 34

Insättning, ett enkelt exempel 1 7 2 11 5 8 4 14 Invariants: A node is either red or black 1.The root is always black 2.A red node always has black children (a null reference is considered to refer to a black node) 3.The number of black nodes in any path from the root to a leaf is the same No need to go up the tree afterwards 35

Red-black deletion Use the normal BST deletion algorithm, which will end up removing a leaf from the tree If the leaf is red, everything's fine If the leaf is black, the invariant is broken Idea: go down the tree, making sure that the current node is always red Lots of special cases! See book 19.5.4.

Red-black versus AVL trees Red-black trees have a weaker invariant than AVL trees (less balanced) but still O(log n) running time Advantage: less work to maintain the invariant (top-down insertion no need to go up tree afterwards), so insertion and deletion are cheaper Disadvantage: lookup will be slower if the tree is less balanced But in practice red-black trees are faster than AVL trees

2-3 trees In a binary tree, each node has two children In a 2-3 tree, each node has either 2 children (a 2-node) or 3 (a 3-node) A 2-node is a normal BST node: One data value x, which is greater than all values in the left subtree and less than all values in the right subtree A 3-node is different: Two data values x and y All the values in the left subtree are less than x All the values in the middle subtree are between x and y All the values in the right subtree are greater than y

2-3 trees An example of a 2-3 tree:

Why 2-3 trees? To get a balanced BST we had to find funny invariants and define our operations in odd ways With a 2-3 tree we have the invariant: The tree is always perfectly balanced and we can maintain it!

Insertion into a 2-3 tree Suppose we want to insert 4 First, find the right leaf node 7 2 11, 15 1 5 8 14 18, 20 4 should go here

Insertion into a 2-3 tree If it's a 2-node, turn it into a 3-node by adding the value! 7 2 11, 15 1 4, 5 8 14 18, 20

Insertion into a 2-3 tree Now suppose we want to insert 3. Find the right leaf node 7 2 11, 15 1 4, 5 8 14 18, 20 3 should go here

Insertion into a 2-3 tree We now have a 4-node not allowed! Split it into two 2-nodes and attach them to the parent: 7 2 11, 15 3, 1 8 14 4, 5 18, 20 But this is a 4-node!

Insertion into a 2-3 tree 4 goes here because it was the middle value before 7 2, 4 11, 15 1 3 5 8 14 18, 20

Insertion into a 2-3 tree Now suppose we want to add 19. Find the right leaf node and add it 7 19 should go here 2, 4 11, 15 1 3 5 8 14 18, 20

Insertion into a 2-3 tree Now suppose we want to add 19. Again, we have a 4-node split it 7 A 4-node 2, 4 11, 15 1 3 5 8 14 18, 19, 20

Insertion into a 2-3 tree But now we have a 4-node one level above! Split that. 7 A 4-node 2, 4 11, 15, 19 1 3 5 8 14 18 20

Insertion into a 2-3 tree Finally we have a 2-3 tree again. 7, 15 2, 4 11 19 1 3 5 8 14 18 20

2-3 trees, summary 2-3 trees do not use rotation, unlike balanced BSTs Instead, they keep the tree perfectly balanced and use splits when there is no room for a new node Complexity is O(log n), as tree is perfectly balanced Much simpler than e.g. red-black trees! But implementation is annoying :(

B-trees B-trees generalise 2-3 trees: In a B-tree of order k, a node can have k children Each non-root node must be at least half-full A 2-3 tree is a B-tree of order 3 10 22 30 40 13 15 18 20 32 35 38 55 77 88 26 27 42 46

Why B-trees B-trees are used for disk storage in databases: Hard drives read data in blocks of typically ~4KB For good performance, you want to minimise the number of blocks read This means you want: 1 tree node = 1 block B-trees with k about 1024 achieve this 10 22 30 40 13 15 18 20 32 35 38 55 77 88 26 27 42 46

2-3-4 trees A 2-3-4 tree is a B-tree of order 4 Example:

Red-black trees are 2-3-4 trees! Any red-black tree is equivalent to a 2-3-4 tree! A 2-node is a black node x

Red-black trees are 2-3-4 trees! Any red-black tree is equivalent to a 2-3-4 tree! A 3-node is a black node with one red child x y x y

Red-black trees are 2-3-4 trees! Any red-black tree is equivalent to a 2-3-4 tree! A 4-node is a black node with two red children x y z

Surprise! Red-black trees are just a fancy way of representing a 2-3-4 tree using a binary tree! 8 13 17 NIL 1 6 NIL 11 NIL NIL 15 NIL 22 25 27 NIL NIL NIL NIL NIL NIL

Red-black trees vs 2-3-4 trees Red-black trees 2-3-4 trees Black node with no red children Black node with one red child Black node with two red children Add a red child to a black node Add a red child to a red node with a black sibling and rotate Colour change + rotate 2-node 3-node 4-node Change a 2-node to a 3-node Change a 3-node to a 4-node Split a 4-node Exercise: check for yourself how the red-black tree operations correspond to 2-3-4 tree operations!

Summary Red-black trees normally faster than AVL trees because there is no need to go up the tree after inserting or deleting On the other hand, trickier to implement 2-3 trees: allow 2 or 3 children per node Possible to keep perfectly balanced Slightly annoying to implement B-trees: generalise 2-3 trees to k children If k is big, the height is very small useful for on-disk trees e.g. databases Red-black trees are 2-3-4 trees in disguise!