Note that this is a rep invariant! The type system doesn t enforce this but you need it to be true. Should use repok to check in debug version.

Similar documents
Algorithms. AVL Tree

Red-Black Trees. Based on materials by Dennis Frey, Yun Peng, Jian Chen, and Daniel Hood

Balanced Trees. Nate Foster Spring 2018

Trees. Eric McCreath

Define the red- black tree properties Describe and implement rotations Implement red- black tree insertion

Balanced Search Trees. CS 3110 Fall 2010

CSE 373 OCTOBER 11 TH TRAVERSALS AND AVL

Search Trees. COMPSCI 355 Fall 2016

Balanced Trees. Prof. Clarkson Fall Today s music: Get the Balance Right by Depeche Mode

Lecture 5. Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs

12 July, Red-Black Trees. Red-Black Trees

Balanced Trees. Nate Foster Spring 2019

AVL Trees Goodrich, Tamassia, Goldwasser AVL Trees 1

CISC 235: Topic 4. Balanced Binary Search Trees

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

CSE100. Advanced Data Structures. Lecture 8. (Based on Paul Kube course materials)

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

A red-black tree is a balanced binary search tree with the following properties:

CS350: Data Structures AVL Trees

We will show that the height of a RB tree on n vertices is approximately 2*log n. In class I presented a simple structural proof of this claim:

AVL Tree Definition. An example of an AVL tree where the heights are shown next to the nodes. Adelson-Velsky and Landis

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

AVL Trees (10.2) AVL Trees

Self-Balancing Search Trees. Chapter 11

Red-Black trees are usually described as obeying the following rules :

CS 380 ALGORITHM DESIGN AND ANALYSIS

Lecture 21: Red-Black Trees

Binary Search Trees. Analysis of Algorithms

DATA STRUCTURES AND ALGORITHMS. Hierarchical data structures: AVL tree, Bayer tree, Heap

BRONX COMMUNITY COLLEGE of the City University of New York DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

ECE 242 Data Structures and Algorithms. Trees IV. Lecture 21. Prof.

COMP171. AVL-Trees (Part 1)

Balanced search trees

Programming Languages and Techniques (CIS120)

BST Deletion. First, we need to find the value which is easy because we can just use the method we developed for BST_Search.

AVL Trees. (AVL Trees) Data Structures and Programming Spring / 17

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

CSCI 136 Data Structures & Advanced Programming. Lecture 25 Fall 2018 Instructor: B 2

Properties of red-black trees

10/23/2013. AVL Trees. Height of an AVL Tree. Height of an AVL Tree. AVL Trees

Multiway Search Trees. Multiway-Search Trees (cont d)

Lecture 6: Analysis of Algorithms (CS )

Programming Languages and Techniques (CIS120)

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

Ch04 Balanced Search Trees

Some Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees

Multi-Way Search Trees

Trees. (Trees) Data Structures and Programming Spring / 28

CS350: Data Structures Red-Black Trees

Algorithms. Red-Black Trees

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

Search Trees - 2. Venkatanatha Sarma Y. Lecture delivered by: Assistant Professor MSRSAS-Bangalore. M.S Ramaiah School of Advanced Studies - Bangalore

Advanced Programming Handout 5. Purely Functional Data Structures: A Case Study in Functional Programming

Data Structures and Algorithms

CSE 373 APRIL 17 TH TREE BALANCE AND AVL

Binary search trees (chapters )

CSCI2100B Data Structures Trees

13.4 Deletion in red-black trees

Search Structures. Kyungran Kang

Advanced Programming Handout 6. Purely Functional Data Structures: A Case Study in Functional Programming

TREES. Trees - Introduction

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

Algorithms. Deleting from Red-Black Trees B-Trees

Fall, 2015 Prof. Jungkeun Park

Multi-Way Search Trees

Quiz 1 Solutions. (a) f(n) = n g(n) = log n Circle all that apply: f = O(g) f = Θ(g) f = Ω(g)

Binary search trees (chapters )

Dictionaries. Priority Queues

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

Data Structures in Java

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Search Trees - 1 Venkatanatha Sarma Y

Red-black trees (19.5), B-trees (19.8), trees

Transform & Conquer. Presorting

B Tree. Also, every non leaf node must have at least two successors and all leaf nodes must be at the same level.

Lecture 16 AVL Trees

Sample Exam 1 Questions

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

Final Examination CSE 100 UCSD (Practice)

Balanced Search Trees

Part 2: Balanced Trees

Balanced Binary Search Trees

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

CS102 Binary Search Trees

Lecture 16 Notes AVL Trees

Data Structures and Algorithms CMPSC 465

AVL Trees / Slide 2. AVL Trees / Slide 4. Let N h be the minimum number of nodes in an AVL tree of height h. AVL Trees / Slide 6

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees

Chapter 2: Basic Data Structures

CSE 100 Advanced Data Structures

Analysis of Algorithms

Binary Trees, Binary Search Trees

Binary search trees. We can define a node in a search tree using a Python class definition as follows: class SearchTree:

ECE250: Algorithms and Data Structures AVL Trees (Part A)

Analysis of Algorithms

Trees. A tree is a directed graph with the property

CSC 172 Data Structures and Algorithms. Fall 2017 TuTh 3:25 pm 4:40 pm Aug 30- Dec 22 Hoyt Auditorium

Data Structure: Search Trees 2. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering

Transcription:

Announcements: Prelim tonight! 7:30-9:00 in Thurston 203/205 o Handed back in section tomorrow o If you have a conflict you can take the exam at 5:45 but can t leave early. Please email me so we have a headcount. Show up in 4158 Upson at 5:30. 1

Binary search trees A binary tree is easy to define inductively in OCaml. We will use the following definition which represents a node as a triple of a value and two children, and which explicitly represents leaf nodes. type 'a tree = TNode of 'a * 'a tree * 'a tree TLeaf A binary search tree is a binary tree with the following representation invariant: o For any node n, every node in the left subtree of n has a value less than that of n, and every node in the right subtree of n has a value more than that of n. Note that this is a rep invariant! The type system doesn t enforce this but you need it to be true. Should use repok to check in debug version. Given such a tree, how do you perform a lookup operation? o Start from the root, and at every node, if the value of the node is what you are looking for, you are done; o Otherwise, recursively look up in the left or right subtree depending on the value stored at the node. In code: let rec contains x = function TLeaf -> false TNode (y, l, r) -> if x=y then true else if x < y then contains x l else contains x r 2

Note the use of the keyword function so that the variable used in the pattern matching need not be named. This is equivalent to (unnecessarily) naming a variable and then using match: let rec contains x t = match t with TLeaf -> false TNode (y, l, r) -> if x=y then true else if x < y then contains x l else contains x r Adding an element is similar: you perform a lookup until you find the empty node that should contain the value. This is a nondestructive update, so as the recursion completes, a new tree is constructed that is just like the old one except that it has a new node (if needed): let rec add x = function TLeaf -> TNode (x, TLeaf, TLeaf) (* When get to leaf, put new node there *) TNode (y, l, r) as t -> (* Recursively search for value *) if x=y then t else if x > y then TNode (y, l, add x r) else (* x < y *) TNode (y, add x l, r) 3

What is the running time of those operations? Since add is just a lookup with an extra constant-time node creation, we focus on the lookup operation. Clearly, the run time of lookup is O(h), where h is the height of the tree. What's the worst-case height of a tree? Clearly, a tree of n nodes all in a single long branch (imagine adding the numbers 1,2,3,4,5,6,7 in order into a binary search tree). So the worst-case running time of lookup is still O(n) (for n the number of nodes in the tree). What is a good shape for a tree that would allow for fast lookup? A perfect binary tree has the largest number of nodes n for a given height h: n = 2 h+1-1. Therefore h = lg(n+1)-1 = O(lg n). ^ 50 / \ 25 75 height=3 / \ / \ n=15 10 30 60 90 / \ / \ / \ / \ V 4 12 27 40 55 65 80 99 If a tree with n nodes is kept balanced, its height is O(lg n), which leads to a lookup operation running in time O(lg n). 4

How can we keep a tree balanced? It can become unbalanced during element addition or deletion. Most balanced tree schemes involve adding or deleting an element just like in a normal binary search tree, followed by some kind of tree surgery to rebalance the tree. Some examples of balanced binary search tree data structures include o AVL (or height-balanced) trees (1962) o 2-3 trees (1970's) o Red-black trees (1970's) In each of these, we ensure asymptotic complexity of O(lg n) o by enforcing a stronger invariant on the data structure than just the binary search tree invariant. 5

Red black trees: Recall that BST s work well when balanced, o But if you insert in the wrong order (sorted) you basically get a stupid representation of a list Solution: self-balancing trees o AVL trees, or red-black trees are most popular They are guaranteed to stay approximately balanced o Achieve this by rotations RB trees have the longest path from the root to any leaf is no more than twice the shortest path (this is kind of a non-operational rep invariant) Here are the new conditions we add to the binary search tree representation invariant: 1. There are no two adjacent red nodes along any path. 2. Every path from the root to a leaf has the same number of black nodes. This number is called the black height (BH) of the tree. If a tree satisfies these two conditions, it must also be the case that every subtree of the tree also satisfies the conditions. If a subtree violated either of the conditions, the whole tree would also. Additionally, we require that the root of the tree be colored black. This can always be enforced by simply setting its color to black; doing this does not cause any other invariants to be violated. 6

Notes: o You don t need red nodes. So the shortest path from root to leaf will be purely black. With these invariants, the longest possible path from the root to an empty node would alternately contain red and black nodes; o therefore it is at most twice as long as the shortest possible path, which only contains black nodes. o If n is the number of nodes in the tree, the longest path cannot have a length greater than twice the length of the paths in a perfect binary tree: 2 log n, which is O(log n). o Therefore, the tree has height O(log n) and the operations are all asymptotically logarithmic in the number of nodes. Another way to see this is to think about just the black nodes in the tree. Suppose we snip all the red nodes out of the trees and reconnect each black node to its closest black ancestor. Then we have a tree whose leaves are all at depth BH, and whose branching factor ranges between 2 and 4. Such a tree must contain at least Ω(2 BH ) nodes, and so must the whole tree when we add the red nodes back in. If n is Ω(2 BH ), then black height BH is O(log n). But invariant 1 says that the longest path is at most h = 2BH. So h is O(log n) too. 7

How do we check for membership in red-black trees? Exactly the same way as for general binary trees. type color = Red Black type 'a rbtree = Node of color * 'a * 'a rbtree * 'a rbtree Leaf let rec mem x = function Leaf -> false Node (_, y, left, right) -> x = y (x < y && mem x left) (x > y && mem x right) More interesting is the insert operation. As with standard binary trees, we add a node by replacing the leaf found by the search procedure. o We also color the new node red to ensure that invariant 2 is preserved. o However, this may destroy invariant 1 by producing two adjacent red nodes. In order to restore the invariant, we consider not only the new red node and its red parent, but also its (black) grandparent. The next figure shows the four possible cases that can arise. 8

Notice that in each of these trees, the values of the nodes in a, b, c, d must have the same relative ordering with respect to x, y, and z: a < x < b < y < c < z < d. Therefore, we can transform the tree to restore the invariant locally by replacing any of the above four cases. (This works in part because we did some clever renaming of variables.) 9

Be sure to compare the below with, e.g., the Wikipedia description of how RB trees work in an imperative programming language. let balance = function Black, z, Node (Red, y, Node (Red, x, a, b), c), d Black, z, Node (Red, x, a, Node (Red, y, b, c)), d Black, x, a, Node (Red, z, Node (Red, y, b, c), d) Black, x, a, Node (Red, y, b, Node (Red, z, c, d)) -> Node (Red, y, Node (Black, x, a, b), Node (Black, z, c, d)) a, b, c, d -> Node (a, b, c, d) let insert x s = let rec ins = function Leaf -> Node (Red, x, Leaf, Leaf) Node (color, y, a, b) as s -> if x < y then balance (color, y, ins a, b) else if x > y then balance (color, y, a, ins b) else s in match ins s with Node (_, y, a, b) -> Node (Black, y, a, b) Leaf -> (* guaranteed to be nonempty *) raise (Failure "RBT insert failed with ins returning leaf") 10