Advanced Set Representation Methods AVL trees. 2-3(-4) Trees. Union-Find Set ADT DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 1
Advanced Set Representation. AVL Trees Problem with BSTs: worst case operation may take O(n) time. One solution: AVL tree: binary search tree with a balance condition: For every node in an AVL tree T, the height of the left (T L ) and right (T R ) subtrees can differ by at most 1: h L - h R 1 Balance condition must be easy to maintain, and it ensures that the tree depth is O(log n) Requiring that the left and right subtrees have the same height does not suffice (tree may not be shallow) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 2
Two BSTs Which BST is AVL? depth=0 depth=1 depth=2 depth=3 For every node in an AVL tree T, the height of the left (T L ) and right (T R ) subtrees can differ by at most 1: h L - h R 1 DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 3
AVL Tree Height Claim: height of an AVL tree storing n keys is O(log n) Proof Let n(h) denote the number of nodes of an AVL tree of height h. It is easy to see that n(1) = 1 and n(2) = 2 For n > 2, an AVL tree of height h contains: the root node, one AVL subtree of height h-1 and another of height h-2. That is, n(h) = 1 + n(h-1) + n(h-2) Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2). By induction n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(n-6),, n(h) > 2 i n(h-2i) Solving the base case we get: n(h) > 2 h/2-1 Taking logarithms: h < 2log n(h) +2 Thus the height of an AVL tree is O(log n) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 4
Insertion in an AVL Tree Insertion is as in a binary search tree Always done by attaching an external node. Example: After insertion the tree may need rebalancing DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 5
AVL tree rebalancing Two kinds of rotation: single rotation (left or right) double rotation left-right: left rotation around the left child of a node followed by a right rotation around the node itself right-left: right rotation around the left child of a node followed by a left rotation around the node itself Rotation features: Nodes not in the subtree of the node rotated are unaffected A rotation takes constant time Before and after the rotation tree is still BST Code for left rotation is symmetric to code for a right rotation DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 6
AVL Single Rotations Single Rotation k 1 < k 2 all elements in subtree A are smaller than k 1 all elements in subtree C are larger than k 2 all elements in subtree B are in between k 1 and k 2 DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 7
Single Rotation Right Example Note that subtrees B and C are empty DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 8
Single Rotation Left Example DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 9
AVL Implementation Detail typedef struct { ElementT element; AVLPtr left; AVLPtr right; int height; } AVLNode; typedef AVLNode *AVLPtr; DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 10
AVL Single Rotations void snglrotright( AVLPtr *k2 ) { AVLPtr k1 ; k1 = (*k2)->left ; (*k2)->left = k1->right ; k1->right = *k2 ; (*k2)->height = max( height( (*k2)->left ), height( (*k2)->right) ) +1 ; k1->height = max( height( k1- >left ), (*k2)->height ) + 1 ; *k2 = k1 ; /* assign new root */ } /* snglrotleft is symmetric */ DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 11
Single Rotation not enough DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 12
AVL Double Rotation Right-left k 2 < k 1, k 3 < k 1, k 3 < k 2 right rotation around the left child of a node followed by a left rotation around the node itself Rotate to make k 2 topmost node void dblrotleft( AVLPtr k3 ) { /* rotate between k1 and k2 */ snglrotright ( (*k3)->left ); /* rotate between k3 and k2 */ snglrotleft ( k3 ); } DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 13
AVL Double Rotation Left-right: k 1 < k 2, k 1 < k 3, k 2 < k 3 left rotation around the left child of a node followed by a right rotation around the node itself Rotate to make k 2 topmost node DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 14
AVL Right-left Double Rotation Example 1 Subtrees W, X, Y, Z are empty DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 15
AVL Right-left Double Rotation Example 2 Subtree X is empty DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 16
AVL Double Rotation example A symmetric case of double rotation Tree before insertion of 27 k 3 k 1 Tree after insertion of 27 and re-balancing k 2 27 k 2 k 1 k 3 DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 17
AVL Trees. When to use rotations Balance factor: height(t L ) height(t R ) Use Single rotation left: when a node is inserted in the right subtree of the right child (B) of the nearest ancestor (A) with balance factor -2 Single rotation right: when a node is inserted in the left subtree of the left child (B) of the nearest ancestor (A) with balance factor +2. Right-left double rotation: when a node is inserted in the left subtree of the right child (B) of the nearest ancestor (A) with balance factor -2. Left-right double rotation: when a node is inserted in the right subtree of the left child (B) of the nearest ancestor (A) with balance factor +2. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 18
AVL Trees. Deletions A node is deleted using the standard inorder successor (predecessor) logic for binary search trees Imbalance is fixed using rotations Identify the parent of the actual node that was deleted, then: If the left child was deleted, the balance factor at the parent decreased by 1. If the right child was deleted, the balance factor at the parent increased by 1. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 19
AVL Trees. Deletions. Rebalancing (I) Let A be the node where balance must be restored. If the deleted node was in A's right subtree, then let B be the root of A's left subtree. Then: B has balance factor 0 or +1 after deletion -- then perform a single right rotation B has balance factor -1 after deletion -- then perform a left-right rotation If the deleted node was in A's left subtree, then let B be the root of A's right subtree. Then: B has balance factor 0 or -1 after deletion -- then perform a single left rotation B has balance factor +1 after deletion -- then perform a right-left rotation DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 20
AVL Trees. Deletions. Rebalancing (II) Unlike insertion, one rotation may not be enough to restore balance to the tree. If this is the case, then locate the next node where the balance factor is "bad" (call this A): If A's balance factor is positive, then let B be A's left child If B's left subtree height is larger than B's right subtree height, then perform a single right rotation. If B's left subtree height is smaller than B's right subtree height, then perform a left-right rotation. If B's left subtree height is equal to B's right subtree height, then perform either rotation. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 21
AVL Trees. Deletions. Rebalancing (III) If A's balance factor is negative, then let B be A's right child If B's right subtree height is larger than B's left subtree height, then perform a single left rotation. If B's right subtree height is smaller than B's left subtree height, then perform a right-left rotation. If B's right subtree height is equal to B's left subtree height, then perform either rotation. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 22
AVL trees demo Demos from: http://webpages.ull.es/users/jriera/docencia/avl/a VL%20tree%20applet.htm http://www.site.uottawa.ca/~stan/csi2514/applets/a vl/bt.html Replacement for AVL trees: Red-Black trees (not discussed here) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 23
Running times for AVL tree operations a single restructure is O(1) using a linked-structure binary tree find is O(log n) height of tree is O(log n), no restructures needed insert is O(log n) initial find is O(log n) restructuring up the tree, maintaining heights is O(log n) remove is O(log n) initial find is O(log n) restructuring up the tree, maintaining heights is O(log n) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 24
2-3 Trees 2-3 tree properties: Each interior node has two or three children. Each path from the root to a leaf has the same length. A tree with zero or one node(s) is a special case of a 2-3 tree. Representing sets with 2-3 trees: Elements are placed at the leaves If element a is to the left of element b, then a < b must hold. Ordering of elements based on one field of a record: a key. At each interior node: key of the smallest descendant of the second child and, if there is a third child, key of the smallest descendant of third child. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 25
2-3 trees A 2-3 tree of k levels has between 2 k - 1 and 3 k - 1 leaves, i.e. representing a set of n elements requires at least 1+log 3 n levels and no more than 1+log 2 n levels. Thus, path lengths in the tree are O(log n). DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 26
2-3 trees. Insertion 2 children If parent node has only two children: insert in order, adjust keys DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 27
2-3 trees. Insertion 3 children If parent node node already has three children: We cannot have four nodes, and thus we have to split parent in two nodes node and node. The two smallest elements among the four children of node stay with node, The two larger will become the children of node Continue process up the tree Special case when splitting the root DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 28
2-3 trees. Insertion DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 29
2-3 trees. Deletion Leaf deletion could leave parent with only 1 child Parent=root: delete node and let its lone child be the new root Otherwise, let p be the parent of node if p has another child, and that child of p has three children, then we can transfer the proper one to node; done; if the children of p, have only two children, transfer the lone child of node to an adjacent sibling of node, and delete node; p has only one child, repeat all the above, recursively, with p in place of node. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 30
2-3 trees. Deletion DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 31
2-3-4 trees. 2-3-4 tree refer to how many links to child nodes can potentially be contained in a given node. For non-leaf nodes, three arrangements : A node with one data item always has two children A node with two data items always has three children A node with three data items always has four children In short, a non-leaf node must always have one more child than it has data items. Symbolically, if the number of child links is L and the number of data items is D, then L = D + 1 Empty nodes are not allowed. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 32
2-3-4 trees. Node kinds. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 33
2-3-4 trees. Properties and example. All children in the subtree rooted at child 0 have key values less than key 0. All children in the subtree rooted at child 1 have key values greater than key 0 but less than key 1. All children in the subtree rooted at child 2 have key values greater than key 1 but less than key 2. All children in the subtree rooted at child 3 have key values greater than key 2. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 34
2-3-4 trees. Example. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 35
2-3-4 Trees. Insertion Insertion procedure: similar to insertion in 2-3 trees items are inserted at the leafs since a 4-node cannot take another item, 4-nodes are split up during insertion process Strategy on the way from the root down to the leaf: split up all 4-nodes "on the way" insertion can be done in one pass (remember: in 2-3 trees, a reverse pass might be necessary) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 36
2-3-4 Trees. Deletion Deletion procedure: similar to deletion in 2-3 trees items are deleted at the leafs swap item of internal node with inorder successor note: a 2-node leaf creates a problem Strategy (different strategies possible) on the way from the root down to the leaf: turn 2-nodes (except root) into 3-nodes deletion can be done in one pass (remember: in 2-3 trees, a reverse pass might be necessary) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 37
Demo: 2-3-4 trees Demo from: http://www.cs.unm.edu/~rlpm/499/ttft.html Local: Demos\2-3-4trees\TTFT.jar DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 38
Disjoint Sets with the UNION and FIND Operations Applicable to problems where: start with a collection of objects, each in a set by itself; combine sets in some order, and from time to time ask which set a particular object is in Equivalence classes: If set S has an equivalence relation (reflexive, symmetrical, transitive) defined on it, then the set S can be partitioned into disjoint subsets S 1, S 2,,... S with Equivalence problem: given a set S and a sequence of statements of the form a b process the statements in order in such a way that at any time we are able to determine in which equivalence class a given element belongs U k S = k S DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 39
Union-find set ADT Example statements (see previous slide) Operations: union(a, B) takes the union of the components A and B and calls the result either A or B, arbitrarily. find(x), a function that returns the name for the component of which x is a member. initial(a, x) creates a component named A that contains only the element x. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 40
Union-find set ADT. Implementations List-based Each set is stored in a sequence represented with a linked-list Each node should store an object containing the element and a reference to the set name Tree based Each element is stored in a node, which contains a pointer to a set name A node v whose set pointer points back to v is also a set name Each set is a tree, rooted at a node with a selfreferencing set pointer DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 41
Examples List based Tree based DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 42
Union-find set. List based implementation const int n = /* appropriate value */ typedef int NameT; typedef int ElementT; typedef struct ufset { struct /* headers for set lists */ { int count:; int firstelement ; } setheaders[ n ] ; struct /* table giving set containing each member */ { NameT setname ; int nextelement ; } setnames[ n ] ; } UFSetT; DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 43
Union-find sets. Tree representation Maintain S as a collection of trees (a forest), one per partition Initially, there are n trees, each containing a single element find(i) returns the root of the tree containing i union(i, j) merges the trees containing i and j Typical traversals not needed, so no need for pointers to children, but we need pointers to parent Parent pointers can be stored in an array: parent[i] (=-1 for root) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 44
Union-find sets. Tree implementation DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 45
Union-find sets. Speeding up Union by size (rank): When performing a union, make the root of smaller tree point to the root of the larger Implies O(n log n) time for performing n unionfind operations: Each time we follow a pointer, we are going to a subtree of size at least double the size of the previous subtree Thus, we will follow at most O(log n) pointers for any find. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 46
Union-find sets. Speeding up Path compression: After performing a find, compress all the pointers on the path just traversed so that they all point to the root Implies O(n log* n) time for performing n union- find DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 47
Using union by rank and path compression Time bound of O(mα(n)) where α(n) is a very slowly growing function DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 48
Union-find ADT demo Demo from: http://www.cs.unm.edu/~rlpm/499/uf.html Local: Demos\UnionFind\UnionFind.jar DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 49
A very quickly growing function and its inverse For integers k 0 and j 1, define A k (j): A A j + 1, k ( j) = ( j+ 1) k 1 ( j), if k 1 where A k-10 (j)=j, A k-1 (i) (j)= A k-1 (A k-1 (i-1) (j)) for i 1. k is called the level of the function and i in the above is called iterations. A k (j) strictly increase with both j and k. Let us see how quick the increase is if k = 0 DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 50
A very quickly growing function and its inverse Let us see A k (1): for k=0,1,2,3,4. A 0 (1)=1+1=2 A 1 (1)=2 1+1=3 A 2 (1)=2 1+1 (1+1)-1=7 A 3 (1)=A 2 (1+1) (1)=A 2 (2) (1)=A 2 (A 2 (1))=A 2 (7)=2 7+1 (7+1) -1=2 8 8-1=2047 A 4 (1)=A 32 (1)=A 3 (A 3 (1)) =A 3 (2047)=A 2 (2048) (2047) >> A 2 (2047) =2 2048 2048-1 >2 2048 =(2 4 ) 512 =(16) 512 >>10 80. (estimated number of atoms in universe) DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 51
Inverse of A k (n):α(n) α(n)=min{k: A k (1) n} α(n)= 0 for 0 n 2 1 n =3 2 for 4 n 7 3 for 8 n 2047 4 for 2048 n A 4 (1). Extremely slow increasing function. α(n) 4 for all practical purpose. DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 52
Reading AHU, chapter 5, sections 5.4, 5.5 Preiss, chapter: Search Trees section AVL Search Trees CLR, chapter 22, sections 1-4 CLRS chapter 21 Notes DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 53