Balanced BST. Balanced BSTs guarantee O(logN) performance at all times

Similar documents
Algorithms. AVL Tree

CSI33 Data Structures

More Binary Search Trees AVL Trees. CS300 Data Structures (Fall 2013)

More BSTs & AVL Trees bstdelete

AVL Trees / Slide 2. AVL Trees / Slide 4. Let N h be the minimum number of nodes in an AVL tree of height h. AVL Trees / Slide 6

COMP171. AVL-Trees (Part 1)

Dynamic Access Binary Search Trees

Data Structures and Algorithms

Ch04 Balanced Search Trees

Dynamic Access Binary Search Trees

AVL Trees. (AVL Trees) Data Structures and Programming Spring / 17

Balanced Search Trees. CS 3110 Fall 2010

Part 2: Balanced Trees

ADVANCED DATA STRUCTURES STUDY NOTES. The left subtree of each node contains values that are smaller than the value in the given node.

CS350: Data Structures AVL Trees

10/23/2013. AVL Trees. Height of an AVL Tree. Height of an AVL Tree. AVL Trees

Balanced Binary Search Trees. Victor Gao

Fundamental Algorithms

DATA STRUCTURES AND ALGORITHMS. Hierarchical data structures: AVL tree, Bayer tree, Heap

Some Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees

Data Structures and Algorithms(12)

CS Transform-and-Conquer

CISC 235: Topic 4. Balanced Binary Search Trees

ECE250: Algorithms and Data Structures AVL Trees (Part A)

8.1. Optimal Binary Search Trees:

CMPE 160: Introduction to Object Oriented Programming

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

Data Structures in Java

Data Structures Lesson 7

AVL Trees. See Section 19.4of the text, p

Recall from Last Time: AVL Trees

AVL Tree Definition. An example of an AVL tree where the heights are shown next to the nodes. Adelson-Velsky and Landis

Trees. (Trees) Data Structures and Programming Spring / 28

CS 261 Data Structures. AVL Trees

BRONX COMMUNITY COLLEGE of the City University of New York DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

AVL Trees. Version of September 6, AVL Trees Version of September 6, / 22

Analysis of Algorithms

CS350: Data Structures Red-Black Trees

Binary Search Trees. Analysis of Algorithms

Balanced Binary Search Trees

AVL Trees Goodrich, Tamassia, Goldwasser AVL Trees 1

Search Structures. Kyungran Kang

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

COSC160: Data Structures Balanced Trees. Jeremy Bolton, PhD Assistant Teaching Professor

AVL Trees Heaps And Complexity

Data Structures Week #6. Special Trees

Binary Search Tree - Best Time. AVL Trees. Binary Search Tree - Worst Time. Balanced and unbalanced BST

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

Data Structure - Advanced Topics in Tree -

Data Structures and Algorithms Lecture 7 DCI FEEI TUKE. Balanced Trees

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees

Data Structures Week #6. Special Trees

Splay Trees. (Splay Trees) Data Structures and Programming Spring / 27

Binary search trees. We can define a node in a search tree using a Python class definition as follows: class SearchTree:

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Lecture 6: Analysis of Algorithms (CS )

COMP Analysis of Algorithms & Data Structures

COMP Analysis of Algorithms & Data Structures

Introduction to Programming: Lecture 13

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Lecture 16 Notes AVL Trees

Inf 2B: AVL Trees. Lecture 5 of ADS thread. Kyriakos Kalorkoti. School of Informatics University of Edinburgh

Week 2. TA Lab Consulting - See schedule (cs400 home pages) Peer Mentoring available - Friday 8am-12pm, 12:15-1:30pm in 1289CS

Data Structures Week #6. Special Trees

AVL trees and rotations

CE 221 Data Structures and Algorithms

CS 310: Tree Rotations and AVL Trees

CS 206 Introduction to Computer Science II

DATA STRUCTURES AND ALGORITHMS

CS 206 Introduction to Computer Science II

Lecture: Analysis of Algorithms (CS )

CSI33 Data Structures

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

Advanced Tree Data Structures

Algorithms. Deleting from Red-Black Trees B-Trees

CSCI 136 Data Structures & Advanced Programming. Lecture 25 Fall 2018 Instructor: B 2

3137 Data Structures and Algorithms in C++

CSC 172 Data Structures and Algorithms. Fall 2017 TuTh 3:25 pm 4:40 pm Aug 30- Dec 22 Hoyt Auditorium

Binary Search Tree: Balanced. Data Structures and Algorithms Emory University Jinho D. Choi

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

CS102 Binary Search Trees

COMP Analysis of Algorithms & Data Structures

Lesson 21: AVL Trees. Rotation

Transform & Conquer. Presorting

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

Efficient Access to Non-Sequential Elements of a Search Tree

AVL Trees. Reading: 9.2

MA/CSSE 473 Day 20. Recap: Josephus Problem

Data Structure. Chapter 10 Search Structures (Part II)

Section 4 SOLUTION: AVL Trees & B-Trees

Trees. CptS 223 Advanced Data Structures. Larry Holder School of Electrical Engineering and Computer Science Washington State University

Assignment 3 Solutions, COMP 251B

Lecture 23: Binary Search Trees

Multi-Way Search Trees

Graduate Algorithms CS F-07 Red/Black Trees

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees

Binary search trees (chapters )

Balanced Binary Search Trees

Balanced search trees

Transcription:

Balanced BST Balanced BSTs guarantee O(logN) performance at all times the height or left and right sub-trees are about the same simple BST are O(N) in the worst case Categories of BSTs AVL, SPLAY trees: dynamic environment optimal trees: static environment E.G.M. Petrakis Trees in Main Memory 1

AVL Trees AVL (Adelson, Lelskii, Landis): the height of the left and right subtrees differ by at most 1 the same for every subtree Number of comparisons for membership operations best case: completely balanced worst case: 1.44 log(n+2) expected case: logn +.25 <= log(n +1) E.G.M. Petrakis Trees in Main Memory 2

AVL Trees : completely balanced sub-tree / : the left sub-tree is 1 higher \ : the right sub-tree is 1 higher E.G.M. Petrakis Trees in Main Memory 3

AVL Trees E.G.M. Petrakis Trees in Main Memory 4

Non AVL Trees critical node // : the left sub-tree is 2 higher \\ : the right sub-tree is 2 higher E.G.M. Petrakis Trees in Main Memory 5

Single Right Rotations Insertions or deletions may result in non AVL trees => apply rotations to balance the tree Β Α Α T3 T1 Β T1 T2 T2 T3 E.G.M. Petrakis Trees in Main Memory 6

Single Left Rotations Α ΑB T1 Β Α T3 T2 T3 T1 T2 E.G.M. Petrakis Trees in Main Memory 7

4 4 2 2 insert 1 2 single right rotation 1 4 1 4 insert 9 4 single left rotation 4 9 E.G.M. Petrakis Trees in Main Memory 9

Double Left Rotation T4 T2 Composition of two single rotations (one right and one left rotation) ΑC ΑB A T3 T1 E.G.M. Petrakis Trees in Main Memory 9 or T4 ΑC Α Τ2 or Β Α T3 T1 T4 ΑC T2 ΑΒ or T3 A Τ1

Example of Double Left Rotation Critical node 4 4 \\ 2 7 2 = / = 6 / 7 = 9 = 7 4 2 6 9 6 insert 6 9 E.G.M. Petrakis Trees in Main Memory 10

Double Right Rotation Composition of two single rotations (one left and one right rotation) ΑB ΑC ΑΒ Α ΑB ΑA C T1 T2 ΑΒ T3 T4 T1 A T2 T3 T4 T1 T2 or T3 Τ4 E.G.M. Petrakis Trees in Main Memory 11 or or

Insertion (deletion) in AVL 1. Follow the search path to verify that the key is not already there 2. Insert (delete) the key 3. Retreat along the search path and check the balance factor 4. Rebalance if necessary (see next) E.G.M. Petrakis Trees in Main Memory 12

Rebalancing For every node reached coming up from its left sub-tree after insertion readjust balance factor = becomes / => no operation \ becomes = => no operation / becomes // => must be rebalanced!! The // node becomes a critical node Only the path from the critical node to the leaf has to be rebalanced!! Rotation is applied only at the critical node! E.G.M. Petrakis Trees in Main Memory 13

Rebalancing (cont.) The balance factor of the critical node determines what rotation is to take place single or double If the child and the grand child (inserted node) of the critical node are on the same direction (both / ) => single rotation Else => double rotation Rebalance similarly if coming up from the right sub-tree (opposite signs) E.G.M. Petrakis Trees in Main Memory 14

Performance Performance of membership operations on AVL trees: easy for the worst case! An AVL tree will never be more than 45% higher that its perfectly balanced counterpart (Adelson, Velskii, Landis): log(n+1) <= h b (N) <= l.4404log(n+2) 0.302 E.G.M. Petrakis Trees in Main Memory 15

Worst case AVL Sparse tree => each sub-tree has minimum number of nodes E.G.M. Petrakis Trees in Main Memory 16

Fibonacci Trees T h : tree of height h T h has two sub-trees, one with height h-1 and one with height h-2 else it wouldn t have minimum number of nodes T 0 is the empty sub-tree (also Fibonacci) T 1 is the sub-tree with 1 node (also Fibonacci) E.G.M. Petrakis Trees in Main Memory 17

E.G.M. Petrakis Trees in Main Memory 1 Fibonacci Trees (cont.) Average height N h number of nodes of T h N h = N h-1 + N h-2 + 1 N 0 = 1 N 1 = 2 From which h <= 1.44 log(n+1) 1 2 5 1 5 1 2 5 1 5 1 N 2 h 2 h h + = + +

More Examples 7 single rotation 7 7 9 insert 9 9 E.G.M. Petrakis Trees in Main Memory 19

Examples (cont.) 6 double rotation 6 7 6 insert 7 7 E.G.M. Petrakis Trees in Main Memory 20

Examples (cont.) double rotation 7 6 insert 7 6 7 6 E.G.M. Petrakis Trees in Main Memory 21

Examples (cont.) single rotation 7 7 6 7 9 9 delete 6 9 E.G.M. Petrakis Trees in Main Memory 22

Examples (cont.) 5 6 single rotation 6 6 9 7 9 7 9 7 delete 5 E.G.M. Petrakis Trees in Main Memory 23

Examples (cont.) double rotation 5 6 6 6 7 7 7 delete 5 E.G.M. Petrakis Trees in Main Memory 24

General Deletions 5 5 3 delete 4 2 delete 2 4 7 10 1 3 7 10 1 6 9 11 6 9 11 E.G.M. Petrakis Trees in Main Memory 25

General Deletions (cont.) 5 5 2 7 2 10 delete delete 6 delete 5 1 3 6 10 1 3 7 11 9 11 9 E.G.M. Petrakis Trees in Main Memory 26

General Deletions (cont.) 3 2 10 delete 5 1 7 11 9 E.G.M. Petrakis Trees in Main Memory 27

Self Organizing Search Splay Trees: adapt to query patterns move to root (front): whenever a node is accessed move it to the root using rotations equivalent to move-to-front in arrays current node insertions: inserted node deletions: father of deleted node or null if this is the root membership: the last accessed node E.G.M. Petrakis Trees in Main Memory 2

20 Search(10) 20 15 30 15 30 13 first rotation 10 second rotation 10 14 20 13 14 10 10 13 15 30 third rotation 13 20 15 14 30 14 E.G.M. Petrakis Trees in Main Memory 29

Splay Cases If the current node q has no grandfather but it has father p => only one single rotation two symmetric cases: L, R a p b q c E.G.M. Petrakis Trees in Main Memory 30 a p b q c

If p has also grandfather qp => 4 cases gp a p LL b q a c d gp a p q d b c RL E.G.M. Petrakis Trees in Main Memory 31 gp gp LL symmetric of RR RL symmetric of RL q p d c b q p a b c d

1 1 a q α q j j 2 i 7 RR 2 i 7 LL 6 h 6 h 3 g 5 g c 4 d 5 current node 3 4 e f e f c d E.G.M. Petrakis Trees in Main Memory 32

E.G.M. Petrakis Trees in Main Memory 33 1 a q j i 2 7 6 g 3 c 4 5 e f b LR 1 α q j i 2 7 6 h 3 c 4 5 e f d g RL a, b, c are sub-trees

5 a 2 1 q j b 3 4 f 6 7 i c d e g h E.G.M. Petrakis Trees in Main Memory 34

Splay Performance Splay trees adapt to unknown or changing probability distributions Splay trees do not guarantee logarithmic cost for each access AVL trees do! asymptotic cost close to the cost of the optimal BST for unknown probability distributions It can be shown that the cost of m operations on an initially empty splay tree, where n are insertions is O(mlogn) in the worst case E.G.M. Petrakis Trees in Main Memory 35

Optimal BST Static environment: no insertions or deletions Keys are accessed with various frequencies Have the most frequently accessed keys near the root Application: a symbol table in main memory E.G.M. Petrakis Trees in Main Memory 36

Searching Given symbols a 1 < a 2 <.< a n and their probabilities: p 1, p 2, p n minimize cost Successful search cost = Transform unsuccessful to successful consider new symbols E 1, E 2, E n - α1 α2 αi αi+1. αn αn+1 n i= 1 p level( i a i ) E0 E1 E2 Ei En Ei= (αi, αi+1 ) E0= (-, α1 ) En= (αn, ) E.G.M. Petrakis Trees in Main Memory 37

Unsuccessful Search a n a n-2 a n- 1 E i failure node unsuccessful search for all values in E i terminates on the same failure node (in fact, one node higher) a n-3 E.G.M. Petrakis Trees in Main Memory 3

(a1, a2, a3) = (do, if, read) p i = q i = 1/7 Example read if if do if do read if do cost = 15/7 cost = 13/7 Optimal BST E.G.M. Petrakis Trees in Main Memory 39 read cost = 15/7

read do do read if if cost = 15/7 cost = 15/7 E.G.M. Petrakis Trees in Main Memory 40

Search Cost If p i is the probability to search for a i and q i is the probability to search in E i then n n pi + qi = 1 i = 1 i = 1 n cost = pi level(ai) + qi {level (Ei) 1} i = 1 i 1 successful search E.G.M. Petrakis Trees in Main Memory 41 n unsuccessful search

Observation 1 In a BST, a subtree has nodes that are consecutive in a sorted sequence of keys (e.g. [5,26]) 20 5 10 25 13 24 26 12 14 E.G.M. Petrakis Trees in Main Memory 42

Observation 2 If T ij is a sub-tree of an optimal BST holding keys from i to j then T ij must be optimal among all possible BSTs that store the same keys optimality lemma: all sub-trees of an optimal BST are also optimal E.G.M. Petrakis Trees in Main Memory 43

Optimal BST Construction 1) Construct all possible trees: 1 2n NP-hard, there are trees!! n + 1 n 2) Dynamic programming solves the problem in polynomial time O(n 3 ) at each step, the algorithm finds and stores the optimal tree in each range of key values increases the size of the range at each step until the whole range is obtained E.G.M. Petrakis Trees in Main Memory 44

Example (successful search only): 1) BSTs with 1 node optimal keys 1 10 20 40 probabilities 0,3 0,2 0,1 0,4 range 1 10 20 40 cost= 0.3 0.2 0.1 0.4 2) BSTs with 2 nodes k=1-10 k=10-20 k=20-20 range 1-10 1 optimal range 10-20 10 20 cost=0.3 1+0.2 2=0.7 cost=0.2+0.1 2=0.4 10 20 1 10 cost=0.2 1+0,3.2=0. cost=0.1+0.2 2=0.5 10 optimal range 20-40 E.G.M. Petrakis Trees in Main Memory 45 20 40 cost=0.1+0.=0.9 40 20 cost=0.4+0.2=0.6

3) BSTs with 3 nodes k=1-20 range 1-20 20 1 10 cost=0.1+2 0.3+3 0.2=1.3 10 1 20 cost=0.2+2(0.3+0.1)=1 1 10 20 cost=0.3+2 0.2+3 0.1=1 optimal k=10-40 range 10-40 10 40 20 cost=0.2+2 0.4+3 0.1=1.3 20 10 40 cost=0.1+2(0.2+0.4)=1.3 10 40 20 cost=0.4+2 0.2+30.1=1.1 E.G.M. Petrakis Trees in Main Memory 46

4) BSTs with 4 nodes range 1-40 1 10 40 20 cost=0.3+2 0.4+3 0.2+4 0.1=2.1 10 1 40 cost=0.2+2(0.3+0.4)+3 0.1=1.9 1 20 20 10 40 OPTIMAL BST cost=0.1+2(0.3+0.4)+3 0.2=2.1 1 40 10 E.G.M. Petrakis Trees in Main Memory 47 20 cost=0.4+2 0.3+3 0.2+4 0.1=2

Complexity Compute all optimal BSTs for all C ij, i,j=1,2..n Let m=j-i: number of keys in range C ij n-m-1 C ij s must be computed The one with the minimum cost must be found, this takes O(m(n-m-1)) time For all C ij s it takes 2 3 (nm m ) = O(n ) 1 m There is a better O(n 2 n ) algorithm by Knuth There is also a O(n) greedy algorithm E.G.M. Petrakis Trees in Main Memory 4

Optimal BSTs High probability keys should be near the root But, the value of a key is also a factor It may not be desirable to put the smallest or largest key close to the root => this may result in skinny trees (e.g., lists) E.G.M. Petrakis Trees in Main Memory 49