Relational Database Systems 2 4. Trees & Advanced Indexes

1 Relational Database Systems 2 4. Trees & Advanced Indexes Wolf-Tilo Balke Benjamin Köhncke Institut für Informationssysteme Technische Universität Braunschweig

2 4 Trees & Advanced Indexes 4.1 Introduction 4.2 Binary Search Trees 4.3 Self Balancing Binary Search Trees 4.4 B-Trees Datenbanksysteme 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 2

3 4.1 Introduction
Indexes need a suitable data structure. For efficient index look-ups, search keys need to be ordered. Remember: all indexes should be stored in a separate database file, not together with the data. A suitable number of DB blocks (adjacent on disk) is reserved at index creation time. If the space is not sufficient, another file is created and linked to the original index file. Each index entry pairs a search key with a block address: (Search Key 1, Block Address 1), (Search Key 2, Block Address 2), and so on.

4 4.1 Introduction
Search within an index: bisection search is possible and needs about log2 n comparisons, i.e. O(log n). But indexes usually span several DB blocks. If the index is in n blocks, O(n) blocks need to be read from disk.
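
The bisection look-up described above can be sketched in a few lines. This is only an illustration for an index that fits in memory: the function name `index_lookup`, the parallel lists, and the block labels are assumptions made for the example, not part of the lecture.

```python
from bisect import bisect_left

def index_lookup(keys, block_addresses, search_key):
    """Return the block address for search_key, or None if it is not indexed.

    keys must be sorted ascending; block_addresses[i] belongs to keys[i].
    """
    pos = bisect_left(keys, search_key)  # bisection: about log2(n) comparisons
    if pos < len(keys) and keys[pos] == search_key:
        return block_addresses[pos]
    return None

# Hypothetical index entries (search key -> block address)
keys = [3, 8, 15, 23, 42]
block_addresses = ["blk-07", "blk-02", "blk-11", "blk-05", "blk-09"]

print(index_lookup(keys, block_addresses, 23))
print(index_lookup(keys, block_addresses, 5))
```

The comparisons are cheap; the cost that matters on disk is how many different blocks the probed positions fall into.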

5 4.1 Introduction
Maintenance of an index is also difficult. Example: insert a new search key with value 5. In the worst case, all cells need to be shifted and all blocks need to be accessed. A similar problem occurs when deleting a value. Often, values are not shifted on deletion; the key is only marked as deleted.

6 4.1 Introduction
In this lecture, we discuss more efficient multilevel data structures: B-trees. They are prevalent in database systems and offer better access performance and much better update performance. To understand B-trees better, we start by examining binary search trees.

7 4.2 Binary Trees
Binary trees are rooted and directed trees. Each node has zero, one or two children. Each node (except the root) has exactly one parent.

8 4.2 Binary Trees
Some naming conventions:
Nodes without children are called leaf nodes.
The depth of a node N is the path length from the root.
The tree height is the maximum node depth.
If there is a path from node N1 to node N2, N1 is an ancestor of N2 and N2 is a descendant of N1.
The size of a node N is the number of all descendants of N, including N itself.
A subtree of a node N is formed of all descendant nodes including N and the respective links.
[Figure: example tree with root, leaf nodes and a highlighted subtree; tree height = 3, red node: size = 3, depth = 1.]

9 4.2 Binary Trees
Properties of binary trees:
Full (or proper) binary tree: each node has either zero or two children.
Perfect binary tree: all leaf nodes have the same depth. With height h (counted in levels), it contains 2^h - 1 nodes.
Height-balanced binary tree: the depths of all leaf nodes differ by at most 1. With height h, it contains between 2^(h-1) and 2^h - 1 nodes.
Degenerated binary tree: each node has either zero or one child. It behaves like a linked list: search in O(n).
[Figure: three example trees labelled "Full and Balanced", "Full and Perfect" and "Degenerated".]

10 4.2 Binary Search Trees
Binary search trees are binary trees with the following properties:
Each node has a unique value assigned.
There is a total order on all values.
The left subtree of a node contains only values less than the node value.
The right subtree of a node contains only values larger than the node value.
The aim is O(log n) search complexity; the structure resembles bisection search.

11 4.2 Binary Search Trees
Constructing and inserting into binary search trees: values are inserted incrementally. The first value becomes the root. Additional values sink into the tree: to the left subtree if the value is smaller, to the right subtree if the value is larger. If the subtree to descend into is empty, the value is attached to the last node as its left or right child. The insert order of the values strongly influences the properties of the resulting and intermediate trees.

12 4.2 Binary Search Trees
Suppose the insert order is 57, 33, 42, 85, 17, 61, 99.
Insert 57: the tree consists of the root only.
Insert 33, 42: degenerated.
Insert 85, 17: full and balanced.
Insert 61, 99: perfect and full.

13 4.2 Binary Search Trees
Suppose the insert order is 99, 85, 61, 57, 42, 33, 17. The resulting tree is degenerated. Insert complexity is thus O(n) in the worst case and O(log n) in the average case.
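
To make the effect of insert order concrete, here is a minimal sketch of the sink-down insertion, applied to the same seven keys in both orders. The names `Node`, `insert`, `height` and `build` are illustrative choices, and height is counted in levels (an empty tree has height 0).

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value and return the root; duplicates are ignored."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:          # sink to the left subtree
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        elif value > node.value:        # sink to the right subtree
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right
        else:
            return root                 # value already present

def height(node):
    """Number of levels in the subtree rooted at node."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def build(values):
    root = None
    for v in values:
        root = insert(root, v)
    return root

mixed = build([57, 33, 42, 85, 17, 61, 99])
descending = build([99, 85, 61, 57, 42, 33, 17])
print(height(mixed), height(descending))
```

The mixed order yields a perfect tree of three levels, while the sorted order yields a chain with one level per key.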

14 4.2 Binary Search Tree
Search for a key v, starting with the root. Recursive procedure:
If node value = v: return the node.
If the node is a leaf: value not found.
If v < node value: descend to the left subtree.
Else: descend to the right subtree.
Complexity: O(log n) in the average case, O(n) in the worst case.
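
The search procedure can be written iteratively as follows. This is a sketch under the assumption of a simple `Node` class (not from the slides); the example tree is the perfect tree built from 57, 33, 42, 85, 17, 61, 99. Unlike the slide's wording, the loop also stops when the required child is missing on a non-leaf node.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def search(node, v):
    """Return the node holding v, or None if v is not in the tree."""
    while node is not None:
        if v == node.value:
            return node
        # descend left for smaller values, right for larger ones
        node = node.left if v < node.value else node.right
    return None

root = Node(57,
            Node(33, Node(17), Node(42)),
            Node(85, Node(61), Node(99)))

print(search(root, 42) is not None)
print(search(root, 50) is not None)
```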

15 4.2 Binary Search Tree
Tree traversal accesses all nodes of the tree.
Pre-order: visit node, traverse left subtree, traverse right subtree.
In-order (sorted access): traverse left subtree, visit node, traverse right subtree.
Post-order: traverse left subtree, traverse right subtree, visit node.
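
The three traversal orders differ only in where the node itself is visited. The generator-based sketch below assumes a hypothetical `Node` class and uses the perfect tree built from 57, 33, 42, 85, 17, 61, 99 as input.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pre_order(node):
    if node is not None:
        yield node.value                    # visit node first
        yield from pre_order(node.left)
        yield from pre_order(node.right)

def in_order(node):
    if node is not None:
        yield from in_order(node.left)
        yield node.value                    # visit node between subtrees
        yield from in_order(node.right)

def post_order(node):
    if node is not None:
        yield from post_order(node.left)
        yield from post_order(node.right)
        yield node.value                    # visit node last

root = Node(57,
            Node(33, Node(17), Node(42)),
            Node(85, Node(61), Node(99)))

print(list(pre_order(root)))
print(list(in_order(root)))
print(list(post_order(root)))
```

In-order traversal of a binary search tree returns the values in sorted order, which is why the slide calls it sorted access.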

16 4.2 Binary Search Tree
Deleting nodes has complexity O(n) in the worst case and O(log n) in the average case.
Locate the node to delete by searching the tree.
If the node is a leaf, just delete it.
If the node has one child, delete the node and attach the child to the parent.
If the node has two children, replace it either by
a) its in-order successor (the left-most node of the right subtree), or
b) its in-order predecessor (the right-most node of the left subtree).
Example: delete the search key with value 57, using a) or b).
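
The three deletion cases can be sketched as below, using variant a), the in-order successor. The `Node` class and helper names are assumptions for illustration; the example deletes 57 from the perfect tree built from 57, 33, 42, 85, 17, 61, 99.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def delete(node, v):
    """Delete v from the subtree rooted at node; return the new subtree root."""
    if node is None:
        return None                          # v not found
    if v < node.value:
        node.left = delete(node.left, v)
    elif v > node.value:
        node.right = delete(node.right, v)
    else:
        if node.left is None:                # leaf or only a right child
            return node.right
        if node.right is None:               # only a left child
            return node.left
        succ = node.right                    # two children: find successor
        while succ.left is not None:
            succ = succ.left
        node.value = succ.value              # replace by in-order successor
        node.right = delete(node.right, succ.value)
    return node

def in_order(node):
    return [] if node is None else in_order(node.left) + [node.value] + in_order(node.right)

root = Node(57,
            Node(33, Node(17), Node(42)),
            Node(85, Node(61), Node(99)))
root = delete(root, 57)
print(root.value, in_order(root))
```

Choosing the predecessor instead (variant b) is symmetric: take the right-most node of the left subtree.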

17 4.2 Binary Search Trees Summary
A very simple, dynamic data structure. Quite efficient on average: O(log n) for all operations. Can be very inefficient in degenerated cases: O(n) for all operations.

18 4.3 Self-Balancing Binary Search Trees
Observation: binary search trees are very efficient when perfect or balanced.
Idea: continuously optimize the tree structure to keep the tree balanced.
Popular implementations: AVL-Tree (classic example), Red-Black-Tree, Splay-Tree, Scapegoat-Tree.

19 4.3 Self-Balancing Binary Search Trees
Basic concepts for deletion: Global Rebuild (Lazy Deletion)
Start with a balanced tree.
Don't delete a node, just mark it as deleted.
The search algorithm scans deleted nodes, but does not return them.
If the rebuild condition is met, rebuild the whole tree without the deleted nodes. A typical condition: rebuild as soon as half of the nodes are marked as deleted.
A complete rebuild can be performed in O(n).
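
One way to achieve the O(n) rebuild is to collect the unmarked values in order and then build a balanced tree by always picking the middle element as subtree root. The sketch below assumes a `Node` class with a `deleted` flag; all names are illustrative and not taken from the lecture.

```python
class Node:
    def __init__(self, value, left=None, right=None, deleted=False):
        self.value = value
        self.left = left
        self.right = right
        self.deleted = deleted      # lazy-deletion mark

def live_values(node, out):
    """In-order walk: collect values that are not marked deleted (sorted)."""
    if node is not None:
        live_values(node.left, out)
        if not node.deleted:
            out.append(node.value)
        live_values(node.right, out)
    return out

def build_balanced(values, lo=0, hi=None):
    """Build a balanced tree from sorted values[lo:hi] in linear time."""
    if hi is None:
        hi = len(values)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2            # middle element becomes the subtree root
    return Node(values[mid],
                build_balanced(values, lo, mid),
                build_balanced(values, mid + 1, hi))

def rebuild(root):
    return build_balanced(live_values(root, []))

old = Node(57,
           Node(33, Node(17, deleted=True), Node(42)),
           Node(85, Node(61, deleted=True), Node(99)))
new = rebuild(old)
print(live_values(new, []))
```

Both passes touch every node once, so the rebuild is linear in the number of nodes.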

20 4.3 Self-Balancing Binary Search Trees
Global Rebuild (cont.)
Search efficiency: let n be the number of unmarked nodes. The tree is balanced and contains at most 2n nodes overall, so the number of accesses during search usually increases by just 1 (log2 2n = log2 n + 1): O(log n).
Delete efficiency: a global rebuild is in O(n), but it is only necessary after n deletions. The amortized additional cost per deletion is O(1).
Overall complexity: O(log n) on average; O(n) in the worst case, when an actual rebuild is performed.

21 4.3 Self-Balancing Binary Search Trees
Global Rebuild (cont.): Direct Deletion with Rebuild
Similar complexity as with lazy deletion, with increased per-delete effort and reduced per-search effort until the rebuild.
Delete nodes as in normal binary trees.
Increment a deletion counter c_d.
Rebuild the tree as soon as c_d = n, then reset c_d.
[Figure: example tree after deleting 57, 33, 42, 61, followed by a rebuild.]

22 4.3 Self-Balancing Binary Search Trees
Basic concepts for insertion and deletion: Local Balancing (Subtree Balancing)
Start with a balanced tree and insert/delete nodes normally. If a subtree becomes too unbalanced, locally balance that subtree to regain global balance.
To detect unbalanced subtrees, each node v needs to know the size |v| and the height h(v) of its subtree.
Unbalanced condition (height balancing): the subtree is too unbalanced when |h(left(v)) - h(right(v))| > α, where α is an adjustable constant (for AVL, α = 1).
Alternative unbalanced condition: the subtree is too unbalanced when h(v) > α * log2 |v|, where α is an adjustable constant.

23 4.3 Self-Balancing Binary Search Trees
Local Balancing (cont.)
After inserting a node, walk back up the tree and update the stored subtree statistics h(v) and |v|. If a node v is too imbalanced, balance the subtree of v.
[Figure: example tree rooted at key 57, each node annotated with key, h(v) and |v|; one node is height-imbalanced for α = 1 because 2 - 0 = 2 > 1.]

24 4.3 Self-Balancing Binary Search Trees
Local balancing can be achieved by:
Rebuilding the subtree: O(|v|) = O(n) in the worst case, but O(log n) on average. This operation is expensive, but in the context of a DBMS it may pay off, as it can also consolidate and optimize physical storage locations. Especially suited for disk-based trees.
Rotating: only pointers are moved, which is very efficient, O(1). It does not change the physical storage of nodes. Especially suited for main-memory-based trees.

25 4.3 Self-Balancing Binary Search Trees
Local Balancing: Rotating
Simple rotation (left, right) around a pivot.
Double rotation (left-left, right-right, "rollercoaster").
[Figure: rotation diagrams with nodes x, y, z and subtrees 1 to 4.]

26 4.3 Self-Balancing Binary Search Trees
Local Balancing: Rotating
Double rotation (left-right, zig-zag): a left rotation followed by a right rotation.
Double rotation (right-left, zig-zag): analogous to left-right.
[Figure: left-right rotation diagram with nodes x, y, z and subtrees 1 to 4.]
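
Rotations only rewire a constant number of pointers, which is easiest to see in code. The sketch below assumes a minimal `Node` class; the function names are illustrative. It shows the two simple rotations and the left-right (zig-zag) double rotation composed from them.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def rotate_right(y):
    """The left child x of y becomes the new subtree root."""
    x = y.left
    y.left = x.right      # x's right subtree moves under y
    x.right = y
    return x

def rotate_left(x):
    """The right child y of x becomes the new subtree root."""
    y = x.right
    x.right = y.left      # y's left subtree moves under x
    y.left = x
    return y

def rotate_left_right(z):
    """Zig-zag case: rotate z's left child to the left, then z to the right."""
    z.left = rotate_left(z.left)
    return rotate_right(z)

# Left-left chain 3 -> 2 -> 1 is fixed by a single right rotation
a = rotate_right(Node(3, Node(2, Node(1))))
# Zig-zag 3 -> 1 -> 2 needs the double rotation
b = rotate_left_right(Node(3, Node(1, None, Node(2))))
print(a.value, a.left.value, a.right.value)
print(b.value, b.left.value, b.right.value)
```

Each function returns the new subtree root; the caller is responsible for re-attaching it to the parent.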

27 Self-Balancing Binary Search Trees
The presented concepts can be combined in different ways to implement self-balancing trees: AVL-Tree (classic example), Red-Black-Tree, Splay-Tree, Scapegoat-Tree.

28 Self-Balancing Binary Search Trees
Implementation: AVL-Trees
Invented in 1962 by Adelson-Velsky and Landis.
Uses local rebalancing with rotations for insertion and deletion.
Unbalanced criterion: |h(left(v)) - h(right(v))| > 1, i.e. the height difference of the left and right subtree of v is 2 or more.
Height information is stored explicitly within the nodes and updated while backtracking after each insert and delete. This gives a storage overhead of O(n).
Guaranteed maximum height of about 1.44 log2 n.

29 Self-Balancing Binary Search Trees
Implementation: Scapegoat-Trees
Invented in 1993 by Galperin and Rivest.
Uses global rebuilding for deletions and local balancing for insertions (by rebuilding the unbalanced subtree rather than by rotations).
Unbalanced criterion: h(v) > log_(1/α) |v| + 1, with 0.5 ≤ α < 1.
Node statistics (height, size) are determined dynamically during backtracking. Only global statistics are stored, so the storage overhead is O(1).

30 4.4 Problems with Binary Search Trees
Are binary trees really suitable for disk-based databases? Yes and no. Binary trees are great data structures for use in internal memory, but they perform very badly when stored on external storage (i.e. hard disks).

31 4.4 Problems with Binary Search Trees
Binary tree nodes have to be stored within hard disk blocks in linear fashion. When the tree is large, nodes are scattered among the blocks. In the worst case, a new block must be read from disk for every node accessed during search or traversal. Every linearization scheme for binary trees has that problem, and reading a block from disk is very expensive.

32 4.4 Problems with Binary Search Trees
Sample linearization: a search for 42 needs, in the worst case, to fetch 3 blocks from disk for just 4 nodes. The problem is even worse for a full tree traversal.
[Figure: example tree and its linearization into disk/DB blocks 1 to 4.]

33 4.4 B-Trees
B-Trees adapt concepts and techniques learned for binary trees and optimize them for hard disk storage. Basic ideas:
Searching within a DB/disk block is very efficient. Take advantage of the static nature within a block: the search can be performed in memory with bisection search. Treat entire blocks as tree nodes.
Reading blocks from the disk is expensive, and most data resides in the leaf nodes. Thus minimize the height of the tree by dramatically increasing the fan-out factor. The tree becomes bushy.

34 4.4 Block Search Trees
First improvement: the Block Search Tree. Nodes are complete DB blocks. Each node can store up to q pointers p_i and q-1 unique and ordered key entries k_i:
<p_1, k_1, ..., k_(q-1), p_q> with k_i < k_(i+1)
Pointers p_i link to subtrees (or are empty). All keys in the subtree of p_i are less than k_i and greater than k_(i-1).

35 4.4 Block Search Trees
Locate a key k. Recursive procedure:
Start with the root node.
Use bisection search within the current node.
If the key is found: return it.
If the key is not found:
  If there is a p_i with k_(i-1) < k < k_i: follow p_i and repeat the algorithm with the linked node.
  Else: the key is not in the tree.
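
The locate procedure combines bisection inside a node with pointer following between nodes. The following sketch models a node as a sorted key list plus a child list with one more entry than keys; `BlockNode`, `locate` and the example keys are assumptions for illustration, and an empty pointer is represented by `None`.

```python
from bisect import bisect_left

class BlockNode:
    def __init__(self, keys, children=None):
        self.keys = keys  # sorted, at most q-1 keys
        # children[i] covers keys between keys[i-1] and keys[i]; None = empty pointer
        self.children = children or [None] * (len(keys) + 1)

def locate(node, key):
    """Return (node, position) where key is stored, or None if it is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)        # bisection within the block
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        node = node.children[i]                # follow p_i (may be empty)
    return None

leaf = BlockNode([12, 17])
root = BlockNode([10, 20, 30],
                 [BlockNode([3, 7]), leaf, None, BlockNode([35])])

print(locate(root, 17) is not None)
print(locate(root, 25) is not None)
```

Each loop iteration corresponds to one node access, i.e. one block read when the tree lives on disk.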

36 4.4 Block Search Trees
Insert a key k. Recursive procedure:
Start with the root node.
Use bisection search within the current node.
If the key is found: a key cannot be inserted twice, abort.
If the key is not found:
  If there is a p_i with k_(i-1) < k < k_i: follow p_i and repeat the algorithm with the linked node.
  Else:
    If there is space left in the node: insert the key and restore the sort order.
    Else: create a new, empty node, insert k into the new node, and link the new node to the pointer p_i in the current node for which k_(i-1) < k < k_i.

37 4.4 Block Search Trees
Insert a key. [Figure: worked insertion example.]

38 4.4 Block Search Trees
Delete a key k:
Start with the root node and locate k.
If k is in a leaf node: delete k from the node and restore order. If the leaf node is now empty, delete the node.
If k is in an internal node:
  If none or only one of the pointers directly adjacent to k is used: delete k and restore order.
  If k is a separator between two used pointers:
    If the two subnodes together fit into one node: union both nodes into one, delete k and restore order.
    Else: replace k with a new separator key, either the largest key in the left node or the smallest key in the right node.
Any completely empty node is deleted as in binary search trees.

39-41 4.4 Block Search Trees
Delete a key. [Figures: three worked deletion examples.]

42 4.4 Block Search Trees
Block search trees have similar properties to binary search trees: they can be perfect, balanced or degenerated. Assume height h = 3, fan-out factor q = 2048, and total number of keys n.
Block search tree (one node can store up to 2047 keys and 2048 links):
  Perfect: n = 2048^3 - 1 = 8,589,934,591 (about 8590M)
  Balanced: 4M ≤ n ≤ 8590M
  Degenerated: n = 3 * 2047 = 6141
Binary search tree (one node can store 1 key and up to 2 links):
  Perfect: n = 7
  Balanced: 3 < n ≤ 7
  Degenerated: n = 3

43 4.4 Block Search Trees
Assume n = 1,000,000,000, fan-out factor q = 2048, and height h.
Block search tree: balanced h = 3; degenerated h = 488,520.
Binary search tree: balanced h = 30; degenerated h = 1,000,000,000.
During search, there is one disk access per tree level in the worst case. In this example, block search trees are already 10 times more efficient when balanced.
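
The figures in this comparison follow from two small formulas: a perfect tree with fan-out q and h levels holds q^h - 1 keys, and a degenerated tree holds q - 1 keys per level. The helper names below are illustrative; the sketch simply recomputes the heights for n = 1,000,000,000.

```python
def perfect_capacity(h, q):
    """Keys in a perfect tree with h levels and fan-out q."""
    return q ** h - 1

def min_height(n, q):
    """Smallest number of levels whose perfect capacity reaches n keys."""
    h = 0
    while perfect_capacity(h, q) < n:
        h += 1
    return h

def degenerate_height(n, q):
    """Levels needed when every level is a single node with q-1 keys."""
    return -(-n // (q - 1))   # integer ceiling division

n = 1_000_000_000
print(min_height(n, 2048), degenerate_height(n, 2048))
print(min_height(n, 2), degenerate_height(n, 2))
print(perfect_capacity(3, 2048), perfect_capacity(3, 2))
```

With q = 2 the same functions describe the binary search tree, which makes the factor of ten between 30 and 3 levels directly visible.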

44 4.4 Block Search Trees Summary
A data structure optimized for disk storage. Very efficient in the average case: O(log n) for all operations.
Even better: the average number of node accesses to locate a key is log_q n for fan-out q. The fan-out is usually in the order of several thousands, whereas a binary tree averages only log2 n. Since accessing a node is expensive on disks, this is a huge improvement.
Can be very inefficient in degenerated cases: O(n) for all operations. It performs almost as badly as binary trees in the degenerated case.

45 4.4 B-Trees
B-Trees are specialized block search trees for indexing, invented by Rudolf Bayer in 1971. Keys may be non-unique. The tree is self-balancing, so there are no degenerated cases anymore.

46 4.4 B-Trees
Basic structure of a B-tree node: nodes contain key values and the respective data (block) pointers to the actual data records. Additionally, there are node pointers for the left and right interval around each key value.
[Figure: tree node with key values, data pointers and node pointers.]

47 4.4 B-Trees
B-Trees as primary index. [Figure: B-tree over the following data records.]
1, Adams, $ 887,00
2, Bertram, $ 19,99
3, Behaim, $ 167,00
4, Cesar, $ 1866,00
5, Miller, $ 179,99
6, Naders, $ 682,56
7, Ruth, $ 8642,78
8, Smith, $ 675,99
9, Tarrens, $ 99,

48 4.4 B-Trees
All base operations are similar to the block search tree, with small changes that guarantee a fill degree and make the tree self-balancing.
Each node contains between L(ower) and U(pper) links, and therefore between L-1 and U-1 keys. Usually 2 * L = U.
Nodes are split during insertion as soon as they would contain more than U-1 keys.
Nodes are unioned during deletion as soon as they contain fewer than L-1 keys.
If a complete node is created or deleted, use local rebalancing to re-balance the tree: local rebuilding for disk-based storage, rotations for memory-based storage.

49 4.4 B-Trees
All insertions happen at the leaf nodes.
Search the tree to find the leaf node where the new element should be added.
If the leaf node contains fewer than the maximum legal number of elements (fewer than U links, i.e. fewer than U-1 keys), insert the new element in the node and restore order.
Otherwise the leaf node is split into two nodes (node split):
  The median is chosen from among the leaf's elements and the new element.
  Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.
  That separation value is added to the node's parent, which may cause it to be split, and so on.
If the splitting goes all the way up to the root, it creates a new root with a single separator value and two children. Remember: the lower bound on the size of internal nodes does not apply to the root.
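
The leaf-level part of this procedure (insert if there is room, otherwise split around the median) can be sketched as follows. The function `insert_into_leaf`, the list representation of a leaf and the parameter `max_elements` are assumptions for the example; propagating the separator into the parent is left to the caller.

```python
def insert_into_leaf(leaf, element, max_elements):
    """Insert element into a sorted leaf (a list of keys).

    Returns (leaf, None, None) if the element fits, otherwise
    (left, median, right) after a node split; the median is the
    separation value that has to be added to the parent.
    """
    merged = sorted(leaf + [element])
    if len(merged) <= max_elements:
        return merged, None, None
    mid = len(merged) // 2                 # median of old elements + new one
    return merged[:mid], merged[mid], merged[mid + 1:]

print(insert_into_leaf([10, 30], 20, 4))
print(insert_into_leaf([10, 20, 30, 40], 25, 4))
```

The parent receives the median through the same insert-or-split logic, which is how a split can cascade up to the root.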

50 4.4 B-Trees
Deleting elements is problematic for two reasons:
Deleting an element may put a node under the minimum number of elements and children (fewer than L links).
An element deleted from an internal node may be a separator for its child nodes.
Deletion from a leaf node: search for the value to delete. If the value is in a leaf node, it can simply be deleted from the node, perhaps leaving the node with too few elements; in that case the tree has to be rebalanced.

51 4.4 B-Trees
Rebalancing after deletion:
If some leaf node is under the minimum size, some elements must be redistributed from its siblings to bring all child nodes up to the minimum again (stealing).
If all siblings have only the minimum size, the parent node is affected and has to hand over an element.
If the parent then falls under the minimum degree, the redistribution must be applied iteratively up the tree.
Since the minimum element count does not apply to the root, making the root the only deficient node is not a problem.

52 4.4 B-Trees
The rebalancing strategy is to find a sibling of the deficient node that has more than the minimum number of elements. Choose a new separator, move it to the parent node and redistribute the values in both original nodes to the new left and right children. If the sibling node immediately to the right of the deficient node has only the minimum number of elements, examine the sibling node immediately to the left.

53 4.4 B-Trees
If both immediate siblings have only the minimum number of elements, create a new node with all the elements from the deficient node, all the elements from one of its siblings, and the separator in the parent between the two combined sibling nodes. Remove the separator from the parent, and replace the two children it separated with the combined node. If that brings the number of elements in the parent under the minimum, repeat these steps with that deficient node, unless it is the root, since the root may be deficient.
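
The steal-or-join decision can be sketched for the simplest setting: a parent with a list of separator keys and leaf children stored as key lists. This is an illustrative sketch, not the lecture's implementation; `fix_underflow`, the list layout and `min_keys` are assumptions, the deficient node must have at least one sibling, and stealing is done by rotating one key through the parent.

```python
def fix_underflow(parent_keys, children, i, min_keys):
    """Repair children[i], a leaf that fell below min_keys (in place).

    parent_keys[j] separates children[j] and children[j + 1].
    """
    node = children[i]
    right = children[i + 1] if i + 1 < len(children) else None
    left = children[i - 1] if i > 0 else None
    if right is not None and len(right) > min_keys:
        # steal from the right: separator moves down, right's smallest moves up
        node.append(parent_keys[i])
        parent_keys[i] = right.pop(0)
    elif left is not None and len(left) > min_keys:
        # steal from the left: separator moves down, left's largest moves up
        node.insert(0, parent_keys[i - 1])
        parent_keys[i - 1] = left.pop()
    elif right is not None:
        # join: deficient node + separator + right sibling form one node
        children[i:i + 2] = [node + [parent_keys.pop(i)] + right]
    else:
        # join with the left sibling instead
        children[i - 1:i + 1] = [left + [parent_keys.pop(i - 1)] + node]

keys, kids = [20, 40], [[5, 10], [], [50, 60]]
fix_underflow(keys, kids, 1, min_keys=1)
print(keys, kids)

keys2, kids2 = [20, 40], [[10], [], [50]]
fix_underflow(keys2, kids2, 1, min_keys=1)
print(keys2, kids2)
```

After a join the parent has lost one separator, so the caller would have to check the parent for underflow in the same way.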

54 4.4 B-Trees
Example: Steal Keys from Siblings

55 4.4 B-Trees
Example: Join Child Nodes

56 4.4 B-Trees
Each element in an internal node acts as a separation value for two subtrees. When such an element is deleted, there are two cases:
Both child nodes to the left and right of the deleted element have the minimum number of elements (L-1). They can then be joined into a single legal node with 2L-2 elements.
One of the two child nodes contains more than the minimum number of elements. Then a new separator for those subtrees must be found. There are two possible choices:
  The largest element in the left subtree, which is the largest element still less than the separator.
  The smallest element in the right subtree, which is the smallest element still greater than the separator.

57 4.4 B-Trees
Deletion from an internal node: if the value is in an internal node, choose a new separator, remove it from the leaf node it is in, and replace the element to be deleted with the new separator. This removes an element from a child node, so the deletion is passed down the tree iteratively. If the child is a leaf node, the leaf node deletion procedure applies.

58 4.4 B-Trees
Example: Build a B-Tree

59 4.4 B-Trees Summary
A very efficient data structure for disk storage: O(log n) for all operations.
Even better: the guaranteed maximum number of node accesses to locate a key is log_fan-out(n), whereas a balanced binary tree guarantees only log2(n). Since accessing a node is expensive on disks, this is a huge improvement.
No degenerated cases.
Self-balancing is rarely necessary, as most updates affect just one node.
Wasted space is decreased due to the guaranteed minimal fill factor.

60 4.4 B*Trees
The B*Tree is a constrained B-Tree: all non-root nodes need to be at least 2/3 full. It is implemented in various file systems (HFS, Reiser4). It used to be quite popular, but has lost its importance.

61 4.4 B+ Trees
The B+ Tree is an optimization of the B-Tree with improved traversal performance, increased search efficiency and increased memory efficiency.
The B+ Tree uses different node types for leaf nodes and internal nodes:
Internal nodes: only unique keys and node links. No data pointers!
Leaf nodes: replicated keys with data pointers. Data pointers appear only here.
[Figure: internal node (node pointers and key values) and leaf node (key values and data pointers).]

62 4.4 B+ Trees
Internal nodes are used only for search guidance, so a block can contain more keys and the fan-out is higher. Leaves just contain the data links. All leaves are linked to each other in key order for increased traversal performance.
[Figure: internal search nodes on top of linked data (leaf) nodes.]
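
The benefit of the linked leaves shows up in range queries: once the first relevant leaf is known, the scan never has to go back up into the internal nodes. The sketch below models only the leaf level; `Leaf`, `link`, `range_scan` and the record labels are assumptions for illustration, and the scan starts at the first leaf instead of descending through internal nodes.

```python
class Leaf:
    def __init__(self, entries):
        self.entries = entries   # sorted list of (key, data_pointer)
        self.next = None         # link to the next leaf in key order

def link(leaves):
    """Chain the leaves in order and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]

def range_scan(start_leaf, low, high):
    """Yield data pointers for all keys with low <= key <= high."""
    leaf = start_leaf
    while leaf is not None:
        for key, pointer in leaf.entries:
            if key > high:
                return               # keys are sorted: nothing more to find
            if key >= low:
                yield pointer
        leaf = leaf.next             # follow the leaf link, no tree descent

first = link([Leaf([(1, "r1"), (4, "r4")]),
              Leaf([(7, "r7"), (9, "r9")]),
              Leaf([(12, "r12")])])
print(list(range_scan(first, 4, 9)))
```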

63 4.4 B+ Trees Summary
The B+ Tree is THE super index structure for disk-based databases. Compared with the B-Tree it offers improved traversal performance, increased search efficiency and increased memory efficiency.

64 IMDBs
Observation: loading data from the hard disk is a major bottleneck, and available main memory still doubles every 18 months (Moore's Law).
Idea: store all data in fast main memory!
Solutions:
Use a traditional DBMS with a huge buffer pool (block cache). However, DBMS are usually optimized for sequential disk access.
Design special In-Memory Database Systems (IMDB), also called MMDB (Main Memory Database).

65 IMDBs
Why do we need in-memory databases?
Embedded systems: mobile phones, PDAs, sensors, diskless computing devices.
Ultra-high-performance (real-time) scenarios: network applications, telecommunication applications, high-volume trading.

66 IMDBs
Why should IMDBs be different? Traditional DBMS also work in-memory, but they waste potential. Random access has nearly no penalty compared to sequential access, so optimizing for linear storage and block read/write is unnecessary.
Type | Media | Size | Random Acc. Speed | Transfer Speed | Characteristics | Price | Price/GB
Pri | DDR3-RAM (Corsair 1600C7DHX) | 2 GiB | ms | 8000 MB/sec | Vol, Dyn, RA, OL | |
Sec | Harddrive, magnetic (Seagate ST As) | 1000 GB | 12 ms | 80 MB/sec | Stat, RA, OL | |

67 IMDBs
Storing a DB in main memory also has problems. Main memory is usually smaller and more expensive. Main memory is not persistent: what happens in case of a power failure? How can the durability requirement of DBs be ensured? This is a severe problem for transaction management.

68 IMDBs
IMDB index structures:
B-Trees are great, but they are shallow and bushy, which is unnecessary in main memory. Some performance can be saved there.
Hash indexes are very suitable in main memory for unsorted data. Especially bucket-chained hashing is very efficient.
For sorted data, use the T-Tree instead of the B-Tree. It is a specialized tree for main memory databases, a blend between AVL-Tree and B-Tree.

69 The T-Tree
T-Tree design considerations: I/O access is cheap in main memory. The expensive resources are computation time and memory space.
Properties:
The T-Tree is a self-balancing binary tree (AVL algorithm).
T-Tree nodes contain only links. Each node links to m data records (d_1 ... d_m).
Data entries are ordered, smallest left, biggest right.
All nodes contain a maximum of c_max entries.
Each internal node contains c_min to c_max entries (usually c_max - c_min ≤ 2).
Each node has a link to its parent.
Each node has at most a left and a right subtree.
The left subtree contains only entries smaller than the minimal node entry.
The right subtree contains only entries bigger than the maximal node entry.
[Figure: T-tree node with parent link p, entries d_1 ... d_m, and left/right links l and r.]

70 The T-Tree
Search a key: the search is similar to the binary search tree, but
If the key is smaller than the node minimum, go left.
If the key is bigger than the node maximum, go right.
Else do a bisection search within the node.
Naming: internal nodes have 2 children, half-leaves have 1 child, leaves have 0 children.
[Figure: example T-tree, locating key 44.]
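
The T-Tree search can be sketched as a binary-tree descent on node ranges followed by bisection inside the bounding node. For simplicity the sketch stores keys directly in the nodes instead of links to data records; `TNode`, `t_search` and the example keys are illustrative assumptions, and nodes are assumed to be non-empty.

```python
from bisect import bisect_left

class TNode:
    def __init__(self, entries, left=None, right=None):
        self.entries = entries   # sorted, non-empty list of keys
        self.left = left
        self.right = right

def t_search(node, key):
    """Return the node containing key, or None if key is not in the tree."""
    while node is not None:
        if key < node.entries[0]:          # smaller than node minimum
            node = node.left
        elif key > node.entries[-1]:       # bigger than node maximum
            node = node.right
        else:                              # bounding node: bisection inside
            i = bisect_left(node.entries, key)
            return node if node.entries[i] == key else None
    return None

root = TNode([40, 44, 50],
             TNode([5, 12, 13]),
             TNode([55, 60]))

print(t_search(root, 44) is root)
print(t_search(root, 45) is None)
```

Only two comparisons (against minimum and maximum) are needed per node on the way down; the bisection happens once, in the bounding node.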

71 The T-Tree
Insert a key:
1. Locate the responsible node for the key (node_min ≤ key ≤ node_max).
2. If the node contains free space:
   1. Insert the key, restore order.
3. Else:
   1. Replace node_min with the key, store node_min as the new insert key key_m.
   2. Scan the left subtree for the node with the biggest node_min ≤ key_m.
   3. If there is such a node:
      1. Insert key_m and recursively push down its smallest element if the node is full.
   4. Else:
      1. Create a new node and insert key_m.
      2. Rebalance the tree with local rebalancing (as with AVL) if necessary.

72 The T-Tree
Insert a key: 16. [Figure: example T-tree before and after inserting 16.]

73 The T-Tree
Insert a key: 22. [Figure: example T-tree before and after inserting 22.]

74 The T-Tree
Delete a key:
1. Locate the responsible node for the key (node_min ≤ key ≤ node_max).
2. If the node contains the key, delete it; else stop.
3. If the node is an internal node and contains fewer than c_min entries:
   1. Refill it with the greatest node_min from a sub-leaf or sub-half-leaf. That node is the new working node from now on.
4. If the node is a half-leaf and can be merged with a leaf:
   1. Merge the nodes, delete the empty leaf. Go to 6.
5. If the node (a leaf) is not empty, stop. Else delete it.
6. Locally rebalance the tree, similar to an AVL tree.

75 IMDB Indexes
How do main memory index structures compare?

78 IMDB Indexes
Why not always use chained bucket hashing? No range queries, storage overhead, and suboptimal behavior if the amount of data is unknown during initialization.
Why do T-Tree and AVL-Tree perform better than B-Tree and ordered array for search? Bisection search within a B-tree node or array needs to compute the position of the next comparison, whereas AVL and T-Tree only need 2 comparisons in each node.
Why does the T-Tree perform better than AVL for updates? Due to the larger nodes, many updates do not require rebalancing.
Why does the ordered array suck for updates? Reordering of all elements is necessary for each update.

79 References and Timeline
AVL-Tree: G. Adelson-Velskii, E. M. Landis: An algorithm for the organization of information. Proceedings of the USSR Academy of Sciences 146 (Russian); English translation by M. J. Ricci in Soviet Math. Doklady 3, 1962
B-Trees: R. Bayer, E. M. McCreight: Organization and Maintenance of Large Ordered Indexes. Acta Informatica 1, 1972
T-Trees: T. J. Lehman, M. J. Carey: A Study of Index Structures for Main Memory Database Management Systems. 12th Int. Conf. on Very Large Data Bases, Kyoto, August 1986
Scapegoat Trees: I. Galperin, R. L. Rivest: Scapegoat trees. ACM-SIAM Symposium on Discrete Algorithms, Austin, Texas, US, 1993
