Lecture 6. Binary Search Trees and Red-Black Trees

Similar documents
Lecture 7. Binary Search Trees and Red-Black Trees

Algorithms. AVL Tree

Balanced search trees

Lecture 4 Median and Selection

CSE 373 OCTOBER 11 TH TRAVERSALS AND AVL

CS102 Binary Search Trees

Properties of red-black trees

Define the red- black tree properties Describe and implement rotations Implement red- black tree insertion

BST Deletion. First, we need to find the value which is easy because we can just use the method we developed for BST_Search.

Trees. (Trees) Data Structures and Programming Spring / 28

Lecture 4. The Substitution Method and Median and Selection

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Search Trees. Undirected graph Directed graph Tree Binary search tree

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

CS 380 ALGORITHM DESIGN AND ANALYSIS

We will show that the height of a RB tree on n vertices is approximately 2*log n. In class I presented a simple structural proof of this claim:

CSC148 Week 7. Larry Zhang

Self-Balancing Search Trees. Chapter 11

ECE 242 Data Structures and Algorithms. Trees IV. Lecture 21. Prof.

Balanced Trees Part Two

Note that this is a rep invariant! The type system doesn t enforce this but you need it to be true. Should use repok to check in debug version.

CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees. Kevin Quinn Fall 2015

Red-Black Trees. Based on materials by Dennis Frey, Yun Peng, Jian Chen, and Daniel Hood

Section 05: Solutions

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Balanced Search Trees Date: 9/20/18

Binary Search Trees. Analysis of Algorithms

Binary Heaps in Dynamic Arrays

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

Algorithms and Data Structures: Lower Bounds for Sorting. ADS: lect 7 slide 1

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

Binary Search Trees, etc.

Lecture 6: Analysis of Algorithms (CS )

CS 171: Introduction to Computer Science II. Binary Search Trees

13.4 Deletion in red-black trees

1 Leaffix Scan, Rootfix Scan, Tree Size, and Depth

Section 4 SOLUTION: AVL Trees & B-Trees

Friday Four Square! 4:15PM, Outside Gates

Lecture 10. Finding strongly connected components

Algorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs

COMP Analysis of Algorithms & Data Structures

AVL Trees / Slide 2. AVL Trees / Slide 4. Let N h be the minimum number of nodes in an AVL tree of height h. AVL Trees / Slide 6

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17

Comparison Based Sorting Algorithms. Algorithms and Data Structures: Lower Bounds for Sorting. Comparison Based Sorting Algorithms

Red-Black trees are usually described as obeying the following rules :

AVL Tree Definition. An example of an AVL tree where the heights are shown next to the nodes. Adelson-Velsky and Landis

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

1 Introduction. 2 InsertionSort. 2.1 Correctness of InsertionSort

Augmenting Data Structures

A dictionary interface.

COMP 250 Fall recurrences 2 Oct. 13, 2017

COMP Analysis of Algorithms & Data Structures

Lower Bound on Comparison-based Sorting

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Course Review for. Cpt S 223 Fall Cpt S 223. School of EECS, WSU

1 Lower bound on runtime of comparison sorts

Recall: Properties of B-Trees

Sorting. Dr. Baldassano Yu s Elite Education

Motivation for B-Trees

Announcements. Midterm exam 2, Thursday, May 18. Today s topic: Binary trees (Ch. 8) Next topic: Priority queues and heaps. Break around 11:45am

Last week: Breadth-First Search

CISC 235: Topic 4. Balanced Binary Search Trees

CS 206 Introduction to Computer Science II

CSCI 136 Data Structures & Advanced Programming. Lecture 25 Fall 2018 Instructor: B 2

(Refer Slide Time: 05:25)

CS 161 Problem Set 4

13.4 Deletion in red-black trees

A set of nodes (or vertices) with a single starting point

Hierarchical data structures. Announcements. Motivation for trees. Tree overview

Ch04 Balanced Search Trees

COMP171. AVL-Trees (Part 1)

CSE373: Data Structures & Algorithms Lecture 5: Dictionary ADTs; Binary Trees. Linda Shapiro Spring 2016

CS 361 Data Structures & Algs Lecture 9. Prof. Tom Hayes University of New Mexico

Binary Trees, Binary Search Trees

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

B-Trees. CS321 Spring 2014 Steve Cutchin

CSE332: Data Abstractions Lecture 7: B Trees. James Fogarty Winter 2012

CSE 373 APRIL 17 TH TREE BALANCE AND AVL

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees

CSE 326: Data Structures Splay Trees. James Fogarty Autumn 2007 Lecture 10

Hash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell

Exercise 1 : B-Trees [ =17pts]

Search Trees: BSTs and B-Trees

DATA STRUCTURES AND ALGORITHMS. Hierarchical data structures: AVL tree, Bayer tree, Heap

Algorithms. Red-Black Trees

CSC 172 Data Structures and Algorithms. Fall 2017 TuTh 3:25 pm 4:40 pm Aug 30- Dec 22 Hoyt Auditorium

Week 2. TA Lab Consulting - See schedule (cs400 home pages) Peer Mentoring available - Friday 8am-12pm, 12:15-1:30pm in 1289CS

Binary search trees (chapters )

CSE 326: Data Structures Lecture #8 Binary Search Trees

Red-black trees (19.5), B-trees (19.8), trees

3137 Data Structures and Algorithms in C++

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees

CSE 100 Advanced Data Structures

CS 331 DATA STRUCTURES & ALGORITHMS BINARY TREES, THE SEARCH TREE ADT BINARY SEARCH TREES, RED BLACK TREES, THE TREE TRAVERSALS, B TREES WEEK - 7

Binary Trees

Lecture 21: Red-Black Trees

COMP Analysis of Algorithms & Data Structures

Binary Search Trees Treesort

Transcription:

Lecture Binary Search Trees and Red-Black Trees

Sorting out the past couple of weeks: We ve seen a bunch of sorting algorithms InsertionSort - Θ n 2 MergeSort - Θ n log n QuickSort - Θ n log n expected; Θ n 2 worst case RadixSort - Θ n + r log r M And a couple of silly ones: BogoSort - Θ n! expected; worst case StickSort - Θ 1

They all looked sort of like this: Input 4 8 1 5 2 SortingAlgorithm Output 1 2 4 5 8

But what happens if your input doesn t arrive all at once? Input 4 8 1 5 2 SortingAlgorithm Output 1 2 4 5 8

And then some inputs also leave? Input 4 8 1 5 2 SortingAlgorithm Output 1 2 4 5 8

Actually, we already saw something similar but it was too slow (Θ n 2 ) InsertionSort(A): for i in [1:n] current A[i] j i-1 while j >= 0 and A[j] > current: A[j+1] A[j] j j-1 A[j+1] current n iterations of the outer loop In the worst case, about n iterations of this inner loop

What do we mean by sorted? Sorted arrays? 1 2 5 Sorted linked lists? HEAD 1 2 5 Sorted sticks?

Today Begin a brief foray into data structures! See CS 1 for more! Binary search trees You may remember these from CS 10B They are better when they re balanced. this will lead us to Self-Balancing Binary Search Trees Red-Black trees.

Why are we studying self-balancing BSTs? 1. The punchline is important: A data structure with O(log(n)) INSERT/DELETE/SEARCH 2. The idea behind Red-Black Trees is clever It s good to be exposed to clever ideas. Also it s just aesthetically pleasing.

Some data structures for storing objects like (Sorted) arrays: 5 (aka, nodes with keys) 1 2 4 5 8 (Sorted) linked lists: HEAD 1 2 4 5 8 Some basic operations: INSERT, DELETE, SEARCH

Sorted Arrays 1 2 4 5 8 O(log(n)) SEARCH: 1 2 4 5 8 eg, Binary search to see if is in A. O(n) INSERT/DELETE: 1 2 4 54.5 8

Sorted linked lists 1 2 4 5 8 O(1) INSERT/DELETE: (assuming we have a pointer to the location of the insert/delete) HEAD 1 2 4 5 8 O(n) SEARCH: HEAD 1 2 4 5 8

Motivation for Binary Search Trees Sorted Arrays Linked Lists Binary Search Trees Search O(log(n)) O(n) O(log(n)) Insert/Delete O(n) O(1) O(log(n))

For today all keys are distinct. Binary tree terminology Each node has two children 2 of is a descendant 5 The left child of is 2 The right child of is 4 This node is the root 5 This is a node. It has a key (). 2 4 8 Each node has a pointer to its left child, right child, and parent. Both children of are NIL 1 1 These nodes are leaves.

Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. Example of building a binary search tree: 4 5 8 1 2

Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. Example of building a binary search tree: 4 5 8 1 2

Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. Example of building a binary search tree: 5 1 2 4 8

Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. Example of building a binary search tree: 5 2 1 4 8

Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. Example of building a binary search tree: 5 2 4 8 Q: Is this the only binary search tree I could possibly build with these values? A: No. I made choices about which nodes to choose when. Any choices would have been fine. 1

Aside: this should look familiar kinda like QuickSort 4 5 8 1 2

Which of these is a BST? Binary Search Trees It s a binary tree so that: Every LEFT descendant of a node has key less than that node. Every RIGHT descendant of a node has key larger than that node. 5 5 2 4 8 2 4 8 1 Binary Search Tree 1 NOT a Binary Search Tree

Remember the goal Fast SEARCH/INSERT/DELETE Can we do these?

SEARCH in a Binary Search Tree definition by example 5 2 4 8 EXAMPLE: Search for 4. EXAMPLE: Search for 4.5 It turns out it will be convenient to return 4 in this case (that is, return the last node before we went off the tree) 1 How long does this take? O(length of longest path) = O(height) Write pseudocode (or actual code) to implement this! Ollie the over-achieving ostrich

INSERT in a Binary Search Tree 5 EXAMPLE: Insert 4.5 INSERT(key): x = SEARCH(key) Insert a new node with desired key at x 2 4 8 1 x = 4 4.5

INSERT in a Binary Search Tree This slide skipped in class here for reference 1 5 2 4 8 x = 4 4.5 EXAMPLE: Insert 4.5 INSERT(key): x = SEARCH(key) if key > x.key: Make a new node with the correct key, and put it as the right child of x. if key < x.key: Make a new node with the correct key, and put it as the left child of x. if x.key == key: return

DELETE in a Binary Search Tree EXAMPLE: Delete 2 5 DELETE(key): x = SEARCH(key) if x.key == key:.delete x. 1 2 4 8 x = 2

DELETE in a Binary Search Tree several cases (by example) say we want to delete 5 5 5 5 2 Case 1: if is a leaf, just delete it. 2 Siggi the Studious Stork Write pseudocode for all of these! This triangle is a cartoon for a subtree Case 2: if has just one child, move that up.

DELETE in a Binary Search Tree ctd. Case : if has two children, replace with it s immediate successor. (aka, next biggest thing after ) 2.1 5 4 2.1 5 4 Does this maintain the BST property? Yes. How do we find the immediate successor? SEARCH for in the subtree under.right How do we remove it when we find it? If [.1] has 0 or 1 children, do one of the previous cases. What if [.1] has two children? It doesn t.

How long do these operations take? 5 SEARCH/INSERT/DELETE 2 4 8 1 Think-Pair-Share!

How long do these operations take? SEARCH is the big one. Everything else just calls SEARCH and then does some small O(1)-time operation. 5 Time = O(height of tree) Trees have depth O(log(n)). Done! 2 4 8 How long does search take? Lucky the lackadaisical lemur.

Wait... 2 4 This is a valid binary search tree. The version with n nodes has depth n, not O(log(n)). 5 8 Could such a tree show up? In what order would I have to insert the nodes? Inserting in the order 2,,4,5,,,8 would do it. So this could happen.

What to do? How often is every so often in the worst case? It s actually pretty often! Goal: Fast SEARCH/INSERT/DELETE All these things take time O(height) And the height might be big!!! Ollie the over-achieving ostrich Idea 0: Keep track of how deep the tree is getting. If it gets too tall, re-do everything from scratch. At least Ω(n) every so often. Turns out that s not a great idea. Instead we turn to

Self-Balancing Binary Search Trees

Idea 1: Rotations No matter what lives underneath A,B,C, this takes time O(1). (Why?) Maintain Binary Search Tree (BST) property, while moving stuff around. YOINK! Y X C A Y B X A Y X Note: A, B, C, X, Y are variable names, not the contents of the nodes. A B C B fell down. B C CLAIM: this still has BST property.

This seems helpful YOINK! 5 2 5 4 2 4 8 8

Does this work? Whenever something seems unbalanced, do rotations until it s okay again. Even for me this is pretty vague. What do we mean by seems unbalanced? What s okay? Lucky the Lackadaisical Lemur

Idea 2: have some proxy for balance Maintaining perfect balance is too hard. Instead, come up with some proxy for balance: If the tree satisfies [SOME PROPERTY], then it s pretty balanced. We can maintain [SOME PROPERTY] using rotations. There are actually several ways to do this, but today we ll see

Red-Black Trees A Binary Search Tree that balances itself! No more time-consuming by-hand balancing! Be the envy of your friends and neighbors with the time-saving 5 Maintain balance by stipulating that black nodes are balanced, and that there aren t too many red nodes. It s just good sense! 2 4 8

Red-Black Trees these rules are the proxy for balance Every node is colored red or black. The root node is a black node. NIL children count as black nodes. Children of a red node are black nodes. For all nodes x: all paths from x to NIL s have the same number of black nodes on them. 5 2 4 8 I m not going to draw the NIL children in the future, but they are treated as black nodes. NIL NIL NIL NIL NIL NIL NIL NIL

Examples(?) Yes! Every node is colored red or black. The root node is a black node. NIL children count as black nodes. Children of a red node are black nodes. For all nodes x: all paths from x to NIL s have the same number of black nodes on them. No! No! No!

Why??????? This is pretty balanced. The black nodes are balanced The red nodes are spread out so they don t mess things up too much. We can maintain this property as we insert/delete nodes, by using rotations. 5 2 4 8 9 This is the really clever idea! This Red-Black structure is a proxy for balance. It s just a smidge weaker than perfect balance, but we can actually maintain it!

Let s build some intuition! This is pretty balanced To see why, intuitively, let s try to build a Red-Black Tree that s unbalanced. Lucky the lackadaisical lemur One path could be twice as long another if we pad it with red nodes. Conjecture: the height of a red-black tree is at most 2 log(n)

That turns out to be basically right. [proof sketch] Say there are b(x) black nodes in any path from x to NIL. (excluding x, including NIL). Claim: Then there are at least 2 b(x) 1 non-nil nodes in the subtree underneath x. (Including x). [Proof by induction on board if time] Then: n 2 b root 1 height 2 2 1 Rearranging: using the Claim NIL b(root) >= height/2 because of RBTree rules. height n + 1 2 2 height 2log(n + 1) y x z

Okay, so it s balanced but can we maintain it? Yes! For the rest of lecture: sketch of how we d do this. See CLRS for more details. (You are not responsible for the details for this class but you should understand the main ideas).

Many cases Suppose we want to insert here. eg, want to insert 0. And then there are 9 more cases for all of the various symmetries of these cases

Inserting into a Red-Black Tree Make a new red node. Insert it as you would normally. What if it looks like this? Example: insert 0 0

Many cases Suppose we want to insert here. eg, want to insert 0. And then there are 9 more cases for all of the various symmetries of these cases

Inserting into a Red-Black Tree Make a new red node. Insert it as you would normally. Fix things up if needed. Example: insert 0 What if it looks like this? No! 0

Inserting into a Red-Black Tree Make a new red node. Insert it as you would normally. Fix things up if needed. 0 Example: insert 0 What if it looks like this? Can t we just insert 0 as a black node? No!

We need a bit more context -1 What if it looks like this? Example: insert 0

We need a bit more context Add 0 as a red node. -1 What if it looks like this? Example: insert 0 0

We need a bit more context Add 0 as a red node. Claim: RB-Tree properties still hold. -1 What if it looks like this? Example: insert 0 Flip colors! 0

But what if that was red? -1 What if it looks like this? Example: insert 0 And we wanted to flip colors 0

More context - -1 What if it looks like this? Example: insert 0 0

More context - -1 What if it looks like this? Example: insert 0 Now we re basically inserting into some smaller tree. Recurse!

Many cases Suppose we want to insert here. eg, want to insert 0. And then there are 9 more cases for all of the various symmetries of these cases

Inserting into a Red-Black Tree Make a new red node. Insert it as you would normally. Fix things up if needed. 0 What if it looks like this? Example: Insert 0. Actually, this can t happen? - path has one black node -- has at least two It might happen that we just turned 0 red from the previous step. Or it could happen if is actually NIL.

Recall Rotations Maintain Binary Search Tree (BST) property, while moving stuff around. YOINK! X Y Y Y C A B X A X A B C B C CLAIM: this still has BST property.

Inserting into a Red-Black Tree Make a new red node. Insert it as you would normally. Fix things up if needed. What if it looks like this? YOINK! Need to argue that if RB-Tree property held before, it still does. 0 0

Many cases Suppose we want to insert here. eg, want to insert 0. And then there are 9 more cases for all of the various symmetries of these cases

Deleting from a Red-Black tree Fun exercise! Ollie the over-achieving ostrich

That s a lot of cases You are not responsible for the nitty-gritty details of Red-Black Trees. (For this class) Though implementing them is a great exercise! You should know: What are the properties of an RB tree? And (more important) why does that guarantee that they are balanced?

What was the point again? Red-Black Trees always have height at most 2log(n+1). As with general Binary Search Trees, all operations are O(height) So all operations are O(log(n)).

Conclusion: The best of both worlds Sorted Arrays Linked Lists Balanced Binary Search Trees Search O(log(n)) O(n) O(log(n)) Insert/Delete O(n) O(1) O(log(n))

Aside: many ways to balance trees! AVL trees: balance = height(left) height(right) 2 4 { 1,0,1} (2,) trees: * root-leaf distance same for all leaves * some nodes may have children 1, 5,9 11 0 2 4 8 10 12

Today Begin a brief foray into data structures! See CS 1 for more! Binary search trees You may remember these from CS 10B They are better when they re balanced. this will lead us to Self-Balancing Binary Search Trees Red-Black trees. Recap

Recap Balanced binary trees are the best of both worlds! But we need to keep them balanced. Red-Black Trees do that for us. We get O(log(n))-time INSERT/DELETE/SEARCH Clever idea: have a proxy for balance 5 2 4 8

Next time Midterm 1 in CEMEX Auditorium Before next time Review session: Sunday 10/14, 4- PM, Main Quad 420-040 Study ( work smart ) Get some rest the night before

Midterm 1 plan (This is just my guess!) 1. A few short questions (20 pts ~ 15 min) 2. Design+analyze an algorithm (40 pts ~ 0 min). Design+analyze another algorithm (40 pts ~ 0 min) 4. Bonus question (10 pts ~ only if you have time) Study resources: HW Review session Piazza Textbooks Classmates Office Hours HW, Sections, Lecture notes