Search Trees Chapter 6 4 8 9
Outline Binar Search Trees AVL Trees Spla Trees
Outline Binar Search Trees AVL Trees Spla Trees
Binar Search Trees A binar search tree is a proper binar tree storing ke-value entries at its internal nodes and satisfing the following propert: Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have ke(u) ke(v) ke(w) We will assume that eternal nodes are placeholders : the do not store entries (makes algorithms a little simpler) An inorder traversal of a binar search trees visits the kes in increasing order Binar search trees are ideal for maps or dictionaries with ordered kes. 6 9 4 8
Binar Search Tree All nodes in left subtree An node All nodes in right subtree 38 5 5 7 3 4 63 4 8 35 4 49 55 7
Maintain a sub-tree. Search: Loop Invariant If the ke is contained in the original tree, then the ke is contained in the sub-tree. ke 7 38 5 5 7 3 4 63 4 8 35 4 49 55 7
Search: Define Step Cut sub-tree in half. Determine which half the ke would be in. Keep that half. ke 7 38 5 5 7 3 4 63 4 8 35 4 49 55 7 If ke < root, then ke is in left half. If ke = root, then ke is found If ke > root, then ke is in right half.
Search: Algorithm To search for a ke k, we trace a downward path starting at the root The net node visited depends on the outcome of the comparison of k with the ke of the current node If we reach a leaf, the ke is not found and return of an eternal node signals this. Eample: find(4): Call TreeSearch(4,root) Algorithm TreeSearch(k, v) if T.isEternal (v) return v if k ke(v) return TreeSearch(k, T.left(v)) else if k ke(v) return v else { k ke(v) } return TreeSearch(k, T.right(v)) 6 4 8 9
Insertion To perform operation insert(k, o), we search for ke k (using TreeSearch) Suppose k is not alread in the tree, and let w be the leaf reached b the search We insert k at node w and epand w into an internal node Eample: insert 5 w 6 6 4 8 w 9 4 8 w 5 9
Insertion Suppose k is alread in the tree, at node v. We continue the downward search through v, and let w be the leaf reached b the search Note that it would be correct to go either left or right at v. We go left b convention. We insert k at node w and epand w into an internal node Eample: insert 6 4 8 w 6 9 w 4 8 w 6 6 9
Deletion To perform operation remove(k), we search for ke k Suppose ke k is in the tree, and let v be the node storing k If node v has an eternal leaf child w, we remove v and w from the tree with operation removeeternal(w), which removes w and its parent Eample: remove 4 6 6 4 v 8 w w 5 9 5 8 9
Deletion (cont.) Now consider the case where the ke k to be removed is stored at a node v whose children are both internal we find the internal node w that follows v in an inorder traversal we cop the entr stored at w into node v we remove node w and its left child (which must be a leaf) b means of operation removeeternal() Eample: remove 3 3 v 5 v 8 8 w 5 6 9 6 9
Performance Consider a dictionar with n items implemented b means of a linked binar search tree of height h the space used is O(n) methods find, insert and remove take O(h) time The height h is O(n) in the worst case and O(log n) in the best case It is thus worthwhile to balance the tree (net topic)!
Outline Binar Search Trees AVL Trees Spla Trees
AVL Trees The AVL tree is the first balanced binar search tree ever invented. It is named after its two inventors, G.M. Adelson-Velskii and E.M. Landis, who published it in their 96 paper "An algorithm for the organiation of information.
AVL Trees AVL trees are balanced. An AVL Tree is a binar search tree in which the heights of siblings can differ b at most. 4 44 7 78 3 5 48 6 3 88 height
Height of an AVL Tree Claim: The height of an AVL tree storing n kes is O(log n).
Height of an AVL Tree Proof: We compute a lower bound n(h) on the number of internal nodes of an AVL tree of height h. Observe that n() = and n() = For h >, a minimal AVL tree contains the root node, one minimal AVL subtree of height h - and another of height h -. That is, n(h) = + n(h - ) + n(h - ) Knowing n(h - ) > n(h - ), we get n(h) > n(h - ). So n(h) > n(h - ), n(h) > 4n(h - 4), n(h) > 8n(n - 6), > i n(h - i) If h is even, we let i = h/-, so that n(h) > h/- n() = h/ If h is odd, we let i = h/-/, so that n(h) > h/-/ n() = h/-/ In either case, n(h) > h/- Taking logarithms: h < log(n(h)) + Thus the height of an AVL tree is O(log n) n() 3 4 n()
Insertion height = 7 height = 3 7 4 Insert() 4 Problem!
Insertion Imbalance ma occur at an ancestor of the inserted node. height = 3 7 height = 4 7 4 8 Insert() 3 4 Problem! 8 3 5 3 5
Insertion: Rebalancing Strateg Step : Search Starting at the inserted node, traverse toward the root until an imbalance is discovered. height = 4 7 3 4 Problem! 8 3 5
Insertion: Rebalancing Strateg Step : Repair The repair strateg is called trinode restructuring. 3 nodes, and are distinguished: = the parent of the high sibling = the high sibling = the high child of the high sibling We can now think of the subtree 3 rooted at as consisting of these 3 nodes plus their 4 subtrees 3 height = 4 7 4 Problem! 8 5
Step : Repair Insertion: Rebalancing Strateg The idea is to rearrange these 3 nodes so that the middle value becomes the root and the other two becomes its children. Thus the grandparent parent child structure becomes a triangular parent two children structure. Note that must be either bigger than both and or smaller than both and. Thus either or is made the root of this subtree. height = h h- T Then the subtrees T are attached at the appropriate places. Since the heights of subtrees T differ b at most, the resulting tree is balanced. T T one is & one is h-4
Insertion: Trinode Restructuring Eample Note that is the middle value. height = h h- Restructure h- T T T T T T one is & one is h-4 one is & one is h-4
Insertion: Trinode Restructuring - 4 Cases There are 4 different possible relationships between the three nodes, and before restructuring: height = h height = h height = h height = h h- h- T h- h- T T T T T T T T T T T one is & one is h- 4 one is & one is h- 4 one is & one is h- 4 one is & one is h- 4
Insertion: Trinode Restructuring - 4 Cases This leads to 4 different solutions, all based on the same principle. height = h height = h height = h height = h h- h- T h- h- T T T T T T T T T T T one is & one is h- 4 one is & one is h- 4 one is & one is h- 4 one is & one is h- 4
Insertion: Trinode Restructuring - Case Note that is the middle value. height = h h- Restructure h- T T T T T T one is & one is h-4 one is & one is h-4
Insertion: Trinode Restructuring - Case height = h h- h- T Restructure T T T T T one is & one is h-4 one is & one is h-4
Insertion: Trinode Restructuring - Case 3 height = h h- h- Restructure T T T T T T one is & one is h-4 one is & one is h-4
Insertion: Trinode Restructuring - Case 4 height = h h- T h- Restructure T 3 T T T T T one is & one is h-4 one is & one is h-4
Insertion: Trinode Restructuring - The Whole Tree Do we have to repeat this process further up the tree? No! The tree was balanced before the insertion. Insertion raised the height of the subtree b. Rebalancing lowered the height of the subtree b. Thus the whole tree is still balanced. height = h h- Restructure h- T 3 T T T T T 3 T T one is & one is h-4 one is & one is h-4
Removal Imbalance ma occur at an ancestor of the removed node. height = 3 7 height = 3 7 4 8 Remove(8) 4 Problem! 3 5 3 5
Removal: Rebalancing Strateg Step : Search Let w be the node actuall removed (i.e., the node matching the ke if it has a leaf child, otherwise the node following in an in-order traversal. Starting at w, traverse toward the root until an imbalance is discovered. height = 3 7 4 Problem! 3 5
Removal: Rebalancing Strateg Step : Repair We again use trinode restructuring. 3 nodes, and are distinguished: height = 3 7 = the parent of the high sibling = the high sibling = the high child of the high sibling (if children are equall high, keep chain linear) 3 4 Problem! 5
Removal: Rebalancing Strateg Step : Repair The idea is to rearrange these 3 nodes so that the middle value becomes the root and the other two becomes its children. Thus the grandparent parent child structure becomes a triangular parent two children structure. Note that must be either bigger than both and or smaller than both and. Thus either or is made the root of this subtree, and is lowered b. Then the subtrees T are attached at the appropriate places. Although the subtrees T can differ in height b up to, after restructuring, sibling subtrees will differ b at most. T T or & h- 4 height = h h- or T
Removal: Trinode Restructuring - 4 Cases There are 4 different possible relationships between the three nodes, and before restructuring: height = h height = h height = h height = h h- h- T h- h- T or h- 3 T T or h- 3 T T T T T T T T or & h-4 or & h-4 or & h-4 or & h-4
Removal: Trinode Restructuring - Case Note that is the middle value. height = h h- Restructure h or h- h- or T T or T T T T or & h- 4 or or & h- 4
Removal: Trinode Restructuring - Case height = h h- T T or Restructure h- or h or h- T T T T or or & h- 4 or & h-4
Removal: Trinode Restructuring - Case 3 height = h h- h- Restructure T T T T T T or & h- 4 or & h-4
Removal: Trinode Restructuring - Case 4 height = h h- T h- Restructure T 3 T T T T T or & h- 4 or & h-4
Step : Repair Removal: Rebalancing Strateg Unfortunatel, trinode restructuring ma reduce the height of the subtree, causing another imbalance further up the tree. Thus this search and repair process must in the worst case be repeated until we reach the root.
Java Implementation of AVL Trees Please see tet
Running Times for AVL Trees a single restructure is O() using a linked-structure binar tree find is O(log n) height of tree is O(log n), no restructures needed insert is O(log n) initial find is O(log n) Restructuring is O() remove is O(log n) initial find is O(log n) Restructuring up the tree, maintaining heights is O(log n)
AVLTree Eample
Outline Binar Search Trees AVL Trees Spla Trees
Self-balancing BST Spla Trees Invented b Daniel Sleator and Bob Tarjan Allows quick access to recentl accessed elements D. Sleator Bad: worst-case O(n) Good: average (amortied) case O(log n) Often perform better than other BSTs in practice R. Tarjan
Splaing Splaing is an operation performed on a node that iterativel moves the node to the root of the tree. In spla trees, each BST operation (find, insert, remove) is augmented with a spla operation. In this wa, recentl searched and inserted elements are near the top of the tree, for quick access.
3 Tpes of Spla Steps Each spla operation on a node consists of a sequence of spla steps. Each spla step moves the node up toward the root b or levels. There are tpes of step: Zig-Zig Zig-Zag Zig These steps are iterated until the node is moved to the root.
Zig-Zig Performed when the node forms a linear chain with its parent and grandparent. i.e., right-right or left-left T 4 ig-ig T T T T T 4
Zig-Zag Performed when the node forms a non-linear chain with its parent and grandparent i.e., right-left or left-right ig-ag T T 4 T T T 4 T
Zig Performed when the node has no grandparent i.e., its parent is the root ig T 4 w w T T T 4 T T
Spla Trees & Ordered Dictionaries which nodes are splaed after each operation? method find(k) insert(k,v) spla node if ke found, use that node if ke not found, use parent of eternal node where search terminated use the new node containing the entr inserted remove(k) use the parent of the internal node w that was actuall removed from the tree (the parent of the node that the removed item was swapped with)
Recall BST Deletion Now consider the case where the ke k to be removed is stored at a node v whose children are both internal we find the internal node w that follows v in an inorder traversal we cop ke(w) into node v we remove node w and its left child (which must be a leaf) b means of operation removeeternal() Eample: remove 3 which node will be splaed? 3 v 5 v 8 8 w 5 6 9 6 9
Note on Deletion The tet (Goodrich, p. 463) uses a different convention for BST deletion in their splaing eample Instead of deleting the leftmost internal node of the right subtree, the delete the rightmost internal node of the left subtree. We will stick with the convention of deleting the leftmost internal node of the right subtree (the node immediatel following the element to be removed in an inorder traversal).
Spla Tree Eample
Worst-case is O(n) Eample: Performance Find all elements in sorted order This will make the tree a left linear chain of height n, with the smallest element at the bottom Subsequent search for the smallest element will be O(n)
Average-case is O(log n) Performance Proof uses amortied analsis We will not cover this Operations on more frequentl-accessed entries are faster. Given a sequence of m operations, the running time to access entr i is: O log m / f (i) where f(i) is the number of times entr i is accessed.
(, 4) Trees Other Forms of Search Trees These are multi-wa search trees (not binar trees) in which internal nodes have between and 4 children Have the propert that all eternal nodes have eactl the same depth. Worst-case O(log n) operations Somewhat complicated to implement Red-Black Trees Binar search trees Worst-case O(log n) operations Somewhat easier to implement Requires onl O() structural changes per update
Summar Binar Search Trees AVL Trees Spla Trees