Search Structures Kyungran Kang (korykang@ajou.ac.kr) Ellis Horowitz, Sartaj Sahni and Susan Anderson-Freed, Fundamentals of Data Structures in C, 2nd Edition, Silicon Press, 2007.
Contents
  Binary Search Trees
  AVL Trees
  Red-Black Trees
  2-3 Trees
  2-3-4 Trees
  B-Trees
  Tries
Binary Search Tree
  A possible binary search tree with items (if, do, while)
    [Figure: one binary search tree on these three keys]
  A possible binary search tree with items (if, do, while, void, for)
    [Figure: two alternative binary search trees on these five keys]
  Are they optimal?
Evaluation of Binary Search Tree (1)
  Extended binary search tree
    Place a special square node at every null link
    External node: not part of the original tree
    Internal node: part of the original tree
    [Figure: extended binary search tree for the keys do, if, while]
  Cost of a binary search tree
    Assume the binary search tree contains the identifiers a_1, a_2, ..., a_n with a_1 < a_2 < ... < a_n
    When only successful searches are made, the cost of the binary search tree is
      \sum_{i=1}^{n} p_i \cdot level(a_i)
    where p_i is the probability of searching for a_i
Evaluation of Binary Search Tree (2)
  To take unsuccessful searches into consideration, we treat every external node as a failure node
  We partition the identifiers not in the binary search tree into n+1 classes E_i, 0 \le i \le n
    E_0 contains all identifiers x such that x < a_1
    E_i contains all identifiers x such that a_i < x < a_{i+1}, for 1 \le i < n
    E_n contains all identifiers x such that x > a_n
  Total cost of the binary search tree
    \sum_{i=1}^{n} p_i \cdot level(a_i) + \sum_{i=0}^{n} q_i \cdot (level(failure node i) - 1)
  where q_i is the probability of searching for an identifier in E_i, and
    \sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1
  Optimal binary search tree
    Minimizes the total cost over all possible binary search trees for a given set of identifiers
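The total cost can be evaluated with one inorder walk, since an inorder walk meets the internal nodes in the order a_1, ..., a_n and the null links (failure nodes) in the order E_0, ..., E_n. The following is a minimal sketch under these assumptions: the root is at level 1, and the names node, cost_rec, and bst_cost and the array layout of p and q are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical node type for this sketch; only the tree shape matters. */
typedef struct node {
    struct node *left, *right;
} node;

/*
 * Inorder walk that adds p_i * level(a_i) for each internal node and
 * q_j * (level(failure node j) - 1) for each null link.
 * *ki and *qi count the internal and failure nodes visited so far.
 */
static double cost_rec(const node *t, int level,
                       const double *p, const double *q,
                       int *ki, int *qi)
{
    if (t == NULL)                       /* failure node at this level */
        return q[(*qi)++] * (level - 1);
    double c = cost_rec(t->left, level + 1, p, q, ki, qi);
    c += p[(*ki)++] * level;             /* internal node */
    c += cost_rec(t->right, level + 1, p, q, ki, qi);
    return c;
}

/* p[0..n-1] holds p_1..p_n and q[0..n] holds q_0..q_n. */
double bst_cost(const node *root, const double *p, const double *q)
{
    int ki = 0, qi = 0;
    return cost_rec(root, 1, p, q, &ki, &qi);
}
```

With all seven weights equal to 1/7, the balanced tree on (do, if, while) evaluates to 13/7 and the right-skewed chain do, if, while evaluates to 15/7, so the balanced shape is cheaper under uniform probabilities.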
Performance of Binary Search Tree
  Time complexity of a binary search tree
    Average case: O(log_2 n)
    Worst case: O(n)
  If we maintain the binary search tree as a complete binary tree
    The average and maximum search times are minimized
    Average and worst case: O(log_2 n)
    But the time required to add a new element increases significantly, because the tree must be reconstructed
  Method of growing balanced binary trees
    Balanced binary trees
    Average and worst case: O(log_2 n)
AVL Trees (1)
  An AVL tree is a kind of balanced binary search tree, named after its inventors, Adelson-Velskii and Landis (1962)
  Height-balanced binary tree
    An empty binary tree is height balanced
    If T is a nonempty binary tree with T_L and T_R as its left and right subtrees, T is height balanced iff
      T_L and T_R are height balanced, and
      |h_L - h_R| \le 1, where h_L and h_R are the heights of T_L and T_R, respectively
  Balance factor, BF(T), of a node T in a binary tree
    BF(T) = h_L - h_R, where h_L and h_R are the heights of the left and right subtrees of T
    For any node T in an AVL tree, BF(T) = -1, 0, or 1
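The recursive definition of height balance translates almost line by line into code. The sketch below is illustrative only: the node type and the function names balanced_height and is_height_balanced are assumptions, and an empty tree is taken to have height 0.

```c
#include <stddef.h>

/* Hypothetical node type for this sketch; keys are irrelevant here. */
typedef struct node {
    struct node *left, *right;
} node;

/*
 * Returns the height of t (0 for an empty tree) if every node satisfies
 * |h_L - h_R| <= 1, or -1 as soon as some node violates the condition.
 */
int balanced_height(const node *t)
{
    if (t == NULL) return 0;          /* an empty tree is height balanced */
    int hl = balanced_height(t->left);
    if (hl < 0) return -1;
    int hr = balanced_height(t->right);
    if (hr < 0) return -1;
    int bf = hl - hr;                 /* balance factor BF(T) */
    if (bf < -1 || bf > 1) return -1;
    return 1 + (hl > hr ? hl : hr);
}

int is_height_balanced(const node *t)
{
    return balanced_height(t) >= 0;
}
```

Computing the height and the balance test in the same pass keeps the check at O(n), whereas calling a separate height function at every node would cost O(n^2) on a skewed tree.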
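AVL Trees (2)
  [Figure: two example trees with balance factors marked. The first is labelled "Yes, it's an AVL tree"; the second, which contains nodes with bf = +2, is labelled "No, it's not an AVL tree".]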
AVL Trees (3)
  A rotation is a local operation in a search tree that preserves the inorder ordering of the keys
  Rebalancing uses four kinds of rotations
    Y: the newly inserted node; A: the nearest ancestor of Y whose balance factor becomes ±2
    LL: Y is inserted in the left subtree of the left subtree of A
    LR: Y is inserted in the right subtree of the left subtree of A
    RR: Y is inserted in the right subtree of the right subtree of A
    RL: Y is inserted in the left subtree of the right subtree of A
  The heights of the subtrees that are not involved in the rotation remain unchanged
  Insertion into an AVL tree
    Time to insert a new identifier: O(h), where h is the height of the tree before insertion
AVL Tree Rebalancing
  [Figure: the four rebalancing cases on the keys H, I, J: LL rotation (bf = +2), RR rotation (bf = -2), LR rotation (bf = +2), and RL rotation (bf = -2).]
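Rebalancing in AVL Tree - LL Rotation
  [Figure: A has left child B (subtrees B_L, B_R) and right subtree A_R. Inserting a node into B_L makes bf(A) = +2. After the LL rotation, B is the subtree root with left subtree B_L and right child A, and A has subtrees B_R and A_R.]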
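The pointer changes of an LL rotation can be sketched in a few lines. This is a minimal illustration that ignores balance-factor bookkeeping; the node type and the name rotate_ll are hypothetical, and the caller is assumed to pass the address of the link that points to A.

```c
#include <stddef.h>

/* Hypothetical node type for this sketch. */
typedef struct node {
    int key;
    struct node *left, *right;
} node;

/*
 * LL rotation at *a: the left child B becomes the subtree root,
 * A becomes B's right child, and B's old right subtree B_R becomes
 * A's left subtree. Requires *a != NULL and (*a)->left != NULL.
 */
void rotate_ll(node **a)
{
    node *A = *a;
    node *B = A->left;
    A->left = B->right;   /* B_R moves under A */
    B->right = A;         /* A becomes the right child of B */
    *a = B;               /* B is the new subtree root */
}
```

Only three links change, and the inorder sequence B_L, B, B_R, A, A_R is the same before and after, which is why the rotation keeps the search tree property.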
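Rebalancing in AVL Tree - LR Rotation (1)
  [Figure: A has left child B and right subtree A_R; B has left subtree B_L and right child C (subtrees C_L, C_R). Inserting a node into C_L makes bf(A) = +2. After the LR rotation, C is the subtree root with left child B (subtrees B_L, C_L) and right child A (subtrees C_R, A_R).]
Rebalancing in AVL Tree - LR Rotation (2)
  [Figure: the same configuration with the node inserted into C_R. After the LR rotation, C is again the subtree root with left child B (subtrees B_L, C_L) and right child A (subtrees C_R, A_R).]
Rebalancing in AVL Tree - RR Rotation
  [Figure: A has left subtree A_L and right child B (subtrees B_L, B_R). Inserting a node into B_R makes bf(A) = -2. After the RR rotation, B is the subtree root with left child A (subtrees A_L, B_L) and right subtree B_R.]
Rebalancing in AVL Tree - RL Rotation (1)
  [Figure: A has left subtree A_L and right child C; C has left child B (subtrees B_L, B_R) and right subtree C_R. Inserting a node into B_L makes bf(A) = -2. After the RL rotation, B is the subtree root with left child A (subtrees A_L, B_L) and right child C (subtrees B_R, C_R).]
Rebalancing in AVL Tree - RL Rotation (2)
  [Figure: the same configuration with the node inserted into B_R. After the RL rotation, B is again the subtree root with left child A (subtrees A_L, B_L) and right child C (subtrees B_R, C_R).]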
AVL Tree Implementation in C (1)
  #include <stdio.h>
  #include <stdlib.h>
  #define FALSE 0
  #define TRUE 1
  typedef struct {
      int key;
  } element;
  typedef struct treenode *treepointer;
  struct treenode {
      treepointer leftchild;
      element data;
      short int bf;              /* balance factor: -1, 0, or 1 */
      treepointer rightchild;
  };
  int unbalanced = FALSE;        /* the tree is balanced */
  treepointer root = NULL;
AVL Tree Implementation in C (2)
  (Program 10.3, p. 503)
  void avlinsert(treepointer *parent, element x, int *unbalanced)
  {
      if (!*parent) { /* insert element into the tree */
          *unbalanced = TRUE;
          *parent = (treepointer)malloc(sizeof(**parent));
          (*parent)->leftchild = (*parent)->rightchild = NULL;
          (*parent)->bf = 0;
          (*parent)->data = x;
      }
      else if (x.key < (*parent)->data.key) {
          avlinsert(&(*parent)->leftchild, x, unbalanced);
          if (*unbalanced) /* left branch has grown higher */
              switch ((*parent)->bf) {
                  case -1: (*parent)->bf = 0; *unbalanced = FALSE; break;
                  case 0:  (*parent)->bf = 1; break;
                  case 1:  leftrotation(parent, unbalanced);
              }
      }
AVL Tree Implementation in C (3)
      else if (x.key > (*parent)->data.key) {
          avlinsert(&(*parent)->rightchild, x, unbalanced);
          if (*unbalanced) /* right branch has grown higher */
              switch ((*parent)->bf) {
                  case 1:  (*parent)->bf = 0; *unbalanced = FALSE; break;
                  case 0:  (*parent)->bf = -1; break;
                  case -1: rightrotation(parent, unbalanced);
              }
      }
      else {
          *unbalanced = FALSE;
          printf("the key is already in the tree\n");
      }
  } /* end of avlinsert */
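avlinsert calls leftrotation and rightrotation, which the slides do not show. The following is a sketch of what leftrotation might look like, assuming it must handle both the LL and the LR case and restore the balance factors; the variable names child and grandchild are illustrative, and the types are repeated so that the block stands alone. rightrotation would be the mirror image.

```c
#include <stddef.h>

#define FALSE 0
#define TRUE 1

typedef struct { int key; } element;
typedef struct treenode *treepointer;
struct treenode {
    treepointer leftchild;
    element data;
    short int bf;
    treepointer rightchild;
};

/* Called when (*parent)->bf was 1 and the left subtree has grown higher. */
void leftrotation(treepointer *parent, int *unbalanced)
{
    treepointer child = (*parent)->leftchild;
    if (child->bf == 1) {
        /* LL rotation: child moves up, parent becomes its right child */
        (*parent)->leftchild = child->rightchild;
        child->rightchild = *parent;
        (*parent)->bf = 0;
        *parent = child;
    } else {
        /* LR rotation: grandchild moves up two levels */
        treepointer grandchild = child->rightchild;
        child->rightchild = grandchild->leftchild;
        grandchild->leftchild = child;
        (*parent)->leftchild = grandchild->rightchild;
        grandchild->rightchild = *parent;
        switch (grandchild->bf) {
            case 1:  (*parent)->bf = -1; child->bf = 0; break; /* grew in C_L */
            case 0:  (*parent)->bf = child->bf = 0;     break; /* C is the new node */
            case -1: (*parent)->bf = 0; child->bf = 1;  break; /* grew in C_R */
        }
        *parent = grandchild;
    }
    (*parent)->bf = 0;       /* the new subtree root is balanced */
    *unbalanced = FALSE;     /* subtree height is back to its old value */
}
```

Because the rotated subtree has the same height it had before the insertion, no ancestor needs to be adjusted, which is why the flag is cleared.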
Example of AVL Tree Insertion (1)
  (a) insert March
  (b) insert May
  (c) insert November: bf = -2, RR rotation
  [Figure: tree after each step]
Example of AVL Tree Insertion (2)
  (d) insert August
  (e) insert April: bf = +2, LL rotation
  [Figure: tree after each step]
Example of AVL Tree Insertion (3)
  (f) insert January: bf = +2, LR rotation
  [Figure: tree before and after the rotation]
Example of AVL Tree Insertion (4)
  (g) insert December
  (h) insert July
  [Figure: tree after each step]
Example of AVL Tree Insertion (5)
  (i) insert February: bf = -2, RL rotation
  [Figure: tree before and after the rotation]
Example of AVL Tree Insertion (6)
  (j) insert June: bf = +2, LR rotation
  [Figure: tree before and after the rotation]
Example of AVL Tree Insertion (7)
  (k) insert October: bf = -2, RR rotation
  [Figure: tree before and after the rotation]
Example of AVL Tree Insertion (8)
  (l) insert September
  [Figure: final tree]
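The month example can be replayed mechanically. The sketch below is not the balance-factor implementation from the earlier slides: it is a simplified variant that stores subtree heights instead, and it assumes the insertion order March, May, November, August, April, January, December, July, February, June, October, September, with three-letter abbreviations as keys. All names (mnode, avl_insert, build_month_tree) are hypothetical, and the nodes are never freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct mnode {
    const char *key;
    int height;                      /* height of the subtree rooted here */
    struct mnode *left, *right;
} mnode;

static int height(const mnode *t) { return t ? t->height : 0; }

static void update(mnode *t)
{
    int hl = height(t->left), hr = height(t->right);
    t->height = 1 + (hl > hr ? hl : hr);
}

static mnode *rotate_right(mnode *a)   /* single rotation used by LL */
{
    mnode *b = a->left;
    a->left = b->right;
    b->right = a;
    update(a);
    update(b);
    return b;
}

static mnode *rotate_left(mnode *a)    /* single rotation used by RR */
{
    mnode *b = a->right;
    a->right = b->left;
    b->left = a;
    update(a);
    update(b);
    return b;
}

mnode *avl_insert(mnode *t, const char *key)
{
    if (t == NULL) {
        mnode *n = malloc(sizeof *n);
        assert(n != NULL);
        n->key = key;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    int c = strcmp(key, t->key);
    if (c < 0)      t->left = avl_insert(t->left, key);
    else if (c > 0) t->right = avl_insert(t->right, key);
    else            return t;                   /* duplicate key: ignore */

    update(t);
    int bf = height(t->left) - height(t->right);
    if (bf > 1) {                               /* left side too high */
        if (strcmp(key, t->left->key) > 0)      /* LR: rotate the child first */
            t->left = rotate_left(t->left);
        return rotate_right(t);                 /* LL, or second half of LR */
    }
    if (bf < -1) {                              /* right side too high */
        if (strcmp(key, t->right->key) < 0)     /* RL: rotate the child first */
            t->right = rotate_right(t->right);
        return rotate_left(t);                  /* RR, or second half of RL */
    }
    return t;
}

/* Inserts the twelve months in the assumed order of the example. */
mnode *build_month_tree(void)
{
    static const char *months[] = { "Mar", "May", "Nov", "Aug", "Apr", "Jan",
                                    "Dec", "Jul", "Feb", "Jun", "Oct", "Sep" };
    mnode *root = NULL;
    for (size_t i = 0; i < sizeof months / sizeof months[0]; i++)
        root = avl_insert(root, months[i]);
    return root;
}
```

Under these assumptions the final tree has Jan at the root with Dec and Mar as its children, and the longest path runs Jan, Mar, Nov, Oct, Sep.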
Deletion from AVL Tree (1)
  Delete a node x as in an ordinary binary search tree; note that the last node deleted is a leaf
  Then trace the path from the new leaf towards the root
  For each node y encountered, check whether |bf(y)| < 2
    If yes, proceed to parent(y)
    If not, perform an appropriate rotation at y
  For deletion, after we perform a rotation at y, we may have to perform a rotation at some ancestor of y
    Thus, we must continue to trace the path until we reach the root
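The retracing step can be sketched with recursion, where the return path of the recursive call plays the role of the walk towards the root. This is a simplified height-based variant rather than the balance-factor code of the earlier slides; all names (anode, rebalance, avl_insert, avl_delete) are hypothetical, and a node with two children is replaced by its inorder successor, which is one of two possible choices.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct anode {
    int key, height;
    struct anode *left, *right;
} anode;

static int height(const anode *t) { return t ? t->height : 0; }
static int bf(const anode *t) { return height(t->left) - height(t->right); }
static void update(anode *t)
{
    int hl = height(t->left), hr = height(t->right);
    t->height = 1 + (hl > hr ? hl : hr);
}
static anode *rotate_right(anode *a)
{
    anode *b = a->left;
    a->left = b->right; b->right = a;
    update(a); update(b);
    return b;
}
static anode *rotate_left(anode *a)
{
    anode *b = a->right;
    a->right = b->left; b->left = a;
    update(a); update(b);
    return b;
}

/* Restores |bf| <= 1 at t, assuming both subtrees are already AVL trees. */
static anode *rebalance(anode *t)
{
    update(t);
    if (bf(t) > 1) {
        if (bf(t->left) < 0) t->left = rotate_left(t->left);     /* LR shape */
        return rotate_right(t);            /* LL shape, also when bf(left) == 0 */
    }
    if (bf(t) < -1) {
        if (bf(t->right) > 0) t->right = rotate_right(t->right); /* RL shape */
        return rotate_left(t);             /* RR shape, also when bf(right) == 0 */
    }
    return t;
}

anode *avl_insert(anode *t, int key)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        assert(t != NULL);
        t->key = key; t->height = 1; t->left = t->right = NULL;
        return t;
    }
    if (key < t->key)      t->left = avl_insert(t->left, key);
    else if (key > t->key) t->right = avl_insert(t->right, key);
    return rebalance(t);
}

anode *avl_delete(anode *t, int key)
{
    if (t == NULL) return NULL;                      /* key not found */
    if (key < t->key)      t->left = avl_delete(t->left, key);
    else if (key > t->key) t->right = avl_delete(t->right, key);
    else if (t->left == NULL || t->right == NULL) {
        anode *child = t->left ? t->left : t->right; /* at most one child */
        free(t);
        return child;
    } else {
        anode *s = t->right;                         /* inorder successor */
        while (s->left) s = s->left;
        t->key = s->key;
        t->right = avl_delete(t->right, s->key);
    }
    return rebalance(t);     /* runs at every node on the path to the root */
}
```

Unlike the insertion case, the child on the higher side may have balance factor 0 after a deletion; a single rotation handles that case, which is why rebalance tests the child's balance factor rather than the position of a key.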
Deletion from AVL Tree - Replacement?
  [Figure: candidate replacement nodes when an internal node of the month tree is deleted]
Deletion from AVL Tree - Example (1)
  [Figure: a deletion from the month tree leaves a node with bf = -2; it is rebalanced as in the RR case of insertion]
Deletion from AVL Tree - Example (2)
  [Figure: a further deletion leaves nodes with bf = +2 and bf = -2 along the path to the root; they are rebalanced as in the LL and RR cases of insertion]
Deletion from AVL Tree - Example (3)
  [Figure: a node with bf = -2 and the tree after rebalancing]
Comparison of Various Structures
  Operation           | Sequential list (sorted) | Linked list                                 | AVL tree
  Search for x        | O(log n)                 | O(n)                                        | O(log n)
  Search for kth item | O(1)                     | O(k)                                        | O(log n)
  Delete x            | O(n)                     | O(1) (doubly linked or position is known)   | O(log n)
  Delete kth item     | O(n-k)                   | O(k)                                        | O(log n)
  Insert x            | O(n)                     | O(1) (if position is known)                 | O(log n)
Red-Black Trees (1)
  A red-black tree is a binary search tree with one extra attribute for each node: the color, which is either red or black
  Red-black trees are one of the preferred methods of maintaining binary search trees
  The red-black tree was invented by Rudolf Bayer in 1972
    The trees were originally called Symmetric Binary B-Trees
    They were renamed "Red-Black Trees" by Leonidas J. Guibas and Robert Sedgewick in 1978
Red-Black Trees (2)
  A binary search tree is a red-black tree if:
    The root and all leaf nodes (terminal nodes) are colored black
    Every node is either red or black
    On any path from the root to a leaf, red nodes must not be adjacent (color invariant)
    Every simple path from a given node to any of its descendant leaves contains the same number of black nodes (height invariant)
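These conditions can be verified in one recursive pass. The sketch below is illustrative only: NULL links are assumed to stand for the black leaf nodes, the names rbnode, check_subtree, and is_red_black are hypothetical, and the key ordering of the search tree is not checked.

```c
#include <stddef.h>

typedef enum { red, black } color;

typedef struct rbnode {
    int key;
    color col;
    struct rbnode *left, *right;   /* NULL stands for a black leaf (NIL) */
} rbnode;

/*
 * Returns the number of black nodes on every path from t down to a leaf,
 * counting t itself and the NIL leaf, or -1 if a red node has a red child
 * or two paths disagree on the black count.
 */
static int check_subtree(const rbnode *t)
{
    if (t == NULL) return 1;                        /* NIL leaf is black */
    if (t->col == red &&
        ((t->left && t->left->col == red) ||
         (t->right && t->right->col == red)))
        return -1;                                  /* color invariant broken */
    int bl = check_subtree(t->left);
    int br = check_subtree(t->right);
    if (bl < 0 || br < 0 || bl != br) return -1;    /* height invariant broken */
    return bl + (t->col == black ? 1 : 0);
}

int is_red_black(const rbnode *root)
{
    if (root != NULL && root->col == red) return 0; /* root must be black */
    return check_subtree(root) >= 0;
}
```

A checker like this is mainly useful as a debugging aid after insertions and deletions, since it visits every node.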
Red-Black Trees (3)
  Black-height of a node x, bh(x), is the number of black nodes on any path from x to a leaf, not counting x
  A red-black tree with n internal nodes has height at most 2 log_2(n+1)
  Approximate balancing is performed during insertion and deletion operations
    The cost is O(log n), instead of O(n) for a full rebalance after insertion
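The height bound follows from two observations, sketched here with h as the height of the tree and bh(root) as the black-height of its root. The first line is the standard induction on height; the second uses the color invariant, since red nodes are never adjacent.

```latex
\begin{align*}
n &\ge 2^{bh(\mathrm{root})} - 1
  && \text{(a subtree rooted at } x \text{ has at least } 2^{bh(x)} - 1 \text{ internal nodes)} \\
bh(\mathrm{root}) &\ge h/2
  && \text{(at least half of the nodes below the root on any path are black)} \\
n &\ge 2^{h/2} - 1
  \;\Longrightarrow\; h \le 2\log_2(n+1)
\end{align*}
```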
Red-Black Trees - Example
  [Figure: a red-black tree with keys 26, 17, 41, 30, 47, 38, 50 and NIL leaves, each node annotated with its height h and black-height bh; the root 26 has h = 4 and bh = 2.]
Red-Black Tree Implementation in C
  typedef enum {red, black} color;
  typedef struct redblack *redblackptr;
  typedef struct redblack {
      element data;
      redblackptr leftchild;
      redblackptr rightchild;
      color Color;
  } redblack;
Red-Black Tree Insertion (1)
  Insert the node as usual in a BST
  Color the node red
  Check which red-black property is violated
    Every node is red or black?
    NULLs are black?
    If a node is red, both children must be black?
    Every path from a node to its leaf nodes must contain the same number of black nodes?
Red-Black Tree Insertion (2)
  Imbalances (u: the red node under consideration, pu: parent of u, gu: grandparent of u)
    LLb: pu is the left child of gu, u is the left child of pu, and the uncle is black
    LLr: pu is the left child of gu, u is the left child of pu, and the uncle is red
    LRb: pu is the left child of gu, u is the right child of pu, and the uncle is black
    LRr: pu is the left child of gu, u is the right child of pu, and the uncle is red
    RRb, RRr, RLb, RLr are defined symmetrically
  Imbalances of the type XYr are handled by changing colors
  Imbalances of the type XYb require a rotation
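The classification depends only on two child positions and the color of the uncle, so it can be computed directly. The sketch below assumes each node stores a parent pointer, which the struct shown earlier does not have; the names rbnode and classify_imbalance are hypothetical, and a missing uncle is treated as black.

```c
#include <stddef.h>

typedef enum { red, black } color;

typedef struct rbnode {
    color col;
    struct rbnode *left, *right;
    struct rbnode *parent;          /* assumption of this sketch */
} rbnode;

/*
 * Writes the imbalance type ("LLr", "LRb", ...) for the red node u into out.
 * Returns 0 when there is nothing to classify, that is, when u has no
 * red parent or no grandparent.
 */
int classify_imbalance(const rbnode *u, char out[4])
{
    const rbnode *pu = u->parent;
    if (pu == NULL || pu->col != red || pu->parent == NULL) return 0;
    const rbnode *gu = pu->parent;
    int pu_is_left = (gu->left == pu);
    const rbnode *uncle = pu_is_left ? gu->right : gu->left;
    out[0] = pu_is_left ? 'L' : 'R';                  /* pu relative to gu */
    out[1] = (pu->left == u) ? 'L' : 'R';             /* u relative to pu */
    out[2] = (uncle != NULL && uncle->col == red) ? 'r' : 'b';
    out[3] = '\0';
    return 1;
}
```

The third character decides the repair: 'r' calls for a color change and 'b' for a rotation.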
Red-Black Tree Insertion (3)
  [Figure: (a) LLr imbalance, (b) after LLr color change, (c) LRr imbalance, (d) after LRr color change. The nodes are gu, pu, u, the uncle gur, and the subtrees ul, ur, pul, pur; the tree shape is unchanged and only colors change.]
Red-Black Tree Insertion (4)
  [Figure: (a) LLb imbalance, (b) after LLb rotation, where pu becomes the subtree root with gu as its right child, (c) LRb imbalance, (d) after LRb rotation, where u becomes the subtree root with pu and gu as its children.]
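An LLb repair combines a single rotation with a recoloring. The sketch below assumes the usual recoloring (pu becomes black, gu becomes red), since the colors in the figure did not survive extraction; the names rbnode and llb_rotation are hypothetical, and the caller passes the address of the link that points to gu.

```c
#include <stddef.h>

typedef enum { red, black } color;

typedef struct rbnode {
    int key;
    color col;
    struct rbnode *left, *right;
} rbnode;

/*
 * LLb repair at *gu_link. Precondition: gu is black, pu = gu->left is red,
 * u = pu->left is red, and gu's right child (the uncle) is black or NULL.
 */
void llb_rotation(rbnode **gu_link)
{
    rbnode *gu = *gu_link;
    rbnode *pu = gu->left;
    gu->left = pu->right;    /* pur moves under gu */
    pu->right = gu;          /* gu becomes the right child of pu */
    pu->col = black;         /* assumed recoloring: new subtree root is black */
    gu->col = red;
    *gu_link = pu;
}
```

After the repair the subtree root is black and both of its children are red, so the number of black nodes on every path through the subtree is unchanged and no further work is needed higher up.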
Red-Black Tree Insertion Example (1)
  (a) Initial tree with keys 50, 10, 80, 90
  (b) Insert 70
  (c) Insert 60: u = 60, pu = 70, gu = 80
  (d) LLr color change
  [Figure: tree at each step]
Red-Black Tree Insertion Example (2)
  (e) Insert 65: u = 65, pu = 60, gu = 70
  (f) LRb rotation
  [Figure: tree at each step]
Red-Black Tree Insertion Example (3)
  (g) Insert 62: u = 62, pu = 60, gu = 65
  (h) LRr color change, after which u = 65, pu = 80, gu = 50
  (i) RLb rotation: 65 becomes the root, with children 50 and 80
  [Figure: tree at each step]
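The whole example can be replayed with a compact insertion routine. This is a sketch under stated assumptions rather than the textbook's program: nodes carry a parent pointer, a NULL link counts as black, double rotations are done as two single rotations, and the names rbnode, relink, and rb_insert are hypothetical. Nodes are never freed.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { red, black } color;

typedef struct rbnode {
    int key;
    color col;
    struct rbnode *left, *right, *parent;   /* parent link: assumption */
} rbnode;

static int is_red(const rbnode *n) { return n != NULL && n->col == red; }

/* Puts y where x was (under x's parent or at the root) and makes y x's parent. */
static void relink(rbnode **root, rbnode *x, rbnode *y)
{
    y->parent = x->parent;
    if (x->parent == NULL) *root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    x->parent = y;
}

static void rotate_left(rbnode **root, rbnode *x)    /* right child moves up */
{
    rbnode *y = x->right;
    x->right = y->left;
    if (y->left) y->left->parent = x;
    relink(root, x, y);
    y->left = x;
}

static void rotate_right(rbnode **root, rbnode *x)   /* left child moves up */
{
    rbnode *y = x->left;
    x->left = y->right;
    if (y->right) y->right->parent = x;
    relink(root, x, y);
    y->right = x;
}

void rb_insert(rbnode **root, int key)
{
    rbnode *pu = NULL, *cur = *root;
    while (cur != NULL) {                     /* ordinary BST descent */
        if (key == cur->key) return;          /* duplicate key: ignore */
        pu = cur;
        cur = key < cur->key ? cur->left : cur->right;
    }
    rbnode *u = malloc(sizeof *u);
    assert(u != NULL);
    u->key = key;
    u->col = red;                             /* new nodes start red */
    u->left = u->right = NULL;
    u->parent = pu;
    if (pu == NULL) *root = u;
    else if (key < pu->key) pu->left = u;
    else pu->right = u;

    while (is_red(u->parent)) {               /* red u under red pu */
        pu = u->parent;
        rbnode *gu = pu->parent;              /* exists: the root is never red here */
        rbnode *uncle = (pu == gu->left) ? gu->right : gu->left;
        if (is_red(uncle)) {                  /* XYr: color change, move up */
            pu->col = uncle->col = black;
            gu->col = red;
            u = gu;
        } else if (pu == gu->left) {          /* LLb or LRb */
            if (u == pu->right) { rotate_left(root, pu); pu = u; }
            pu->col = black;
            gu->col = red;
            rotate_right(root, gu);
            break;
        } else {                              /* RRb or RLb */
            if (u == pu->left) { rotate_right(root, pu); pu = u; }
            pu->col = black;
            gu->col = red;
            rotate_left(root, gu);
            break;
        }
    }
    (*root)->col = black;                     /* the root is always black */
}
```

Inserting 50, 10, 80, 90, 70, 60, 65, 62 in that order with this routine goes through an LLr color change, an LRb rotation, an LRr color change, and an RLb rotation, and ends with 65 at the root and 50 and 80 as its red children.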