15-82 Advanced Topics in Database Systems Performance Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries 2 Indexing Primary Key Indexing primary key indexing B-trees and variants (static) hashing extendible hashing secondary key indexing spatial access methods text... find employee with ssn=12 attributes in a record R n SSN a2 a a4 a5 table R R1 R2 R R4 R5 R sequential scan R1 R2 B-tree R on SSN R4 R5 R usinganindex SSN=12 4 B-trees B-trees: Example Most successful family of index schemes B-trees B +- trees B * -trees Canbeusedfor primary/secondary, or clustering/non-clustering index. Balanced n-way search trees Here is a B-tree of order : < > < > 5 1
B-tree Properties In a B-tree of order n: key order preserved at most n pointers at least n/2 pointers (except root) all leaves at the same level B-tree Properties (cont.) block aware nodes: each node -> disk page O(log (N)) for everything! (ins/del/search) typically, if m = 50-100, then 2 - levels utilization >= 50%, guaranteed; on average % if number of pointers is k, node has exactly k-1 keys (leaves are empty) p1 v1 v2 v n-1 pn 8 Exact-Match Queries Range Queries E.g., ssn=8 E.g., 5<salary<8 < > < > H steps (= disk accesses) < > < > 10 Proximity Queries B-trees: Insertion E.g., nearest neighbor searches: salary ~ 8 Insert in leaf; on overflow, push middle up (recursively) split: preserves B - tree properties < > < > 11 12 2
B-trees: Insertion (cont.) Easy case: Tree T0; insert 8 B-trees: Insertion (cont.) Tree T0; insert 8 < > < > < > < > 8 14 B-trees: Insertion (cont.) B-trees: Insertion (cont.) Hardest case: Tree T0; insert 2 Hardest case: Tree T0; insert 2 < > < > 1 2 2 push middle up 15 1 B-trees: Insertion (cont.) B-trees: Insertion (cont.) Hardest case: Tree T0; insert 2 Hardest case: Tree T0; insert 2 Overflow; push middle 22 Final state 2 1 18
B-trees - insertion B-trees: Insertion Sketch Q: What if there are two middles? (eg, order 4) A: either one is fine Insertinleaf;onoverflow,pushmiddleup (recursively propagate split ) split: preserves all B - tree properties (!!) notice how it grows: height increases when root overflows & splits Automatic, incremental re-organization 1 20 Algorithm: Insertion of Key K B-trees: Deletion find the correct leaf node L ; if ( L overflows ) { split L by pushing middle key up to parent P ; Rough outline of algorithm: Delete key; on underflow, may need to merge if ( P overflows) { In practice, some implementors just allow underflows to happen repeat the split recursively; }else{ add key K in node L ; // maintain key order in L } 21 22 B-trees: Deletion Cases B-trees: Deletion Case 1 Case 1 delete a key at a leaf no underflow Case 2 delete non-leaf key no underflow Case delete leaf-key; underflow, and rich sibling Case 4 delete leaf-key; underflow, and poor sibling Case 1: delete a key at a leaf Easiest case: no underflow (delete from T0) < > < > 2 24 4
B-trees: Deletion Case 1 (cont.) Easiest case: Tree T0; delete B-trees: Deletion Case 2 Case2: delete a key at a non-leaf no underflow (eg., delete from T0) 1 < > < > < > < > Delete & promote, ie: 25 2 B-trees: Deletion Case 2 (cont.) B-trees: Deletion Case 2 (cont.) Case2: delete a key at a non-leaf no underflow (eg., delete from T0) Case2: delete a key at a non-leaf no underflow (eg., delete from T0) < > < > Delete & promote, ie: < > < > Delete & promote, ie: 1 2 28 B-trees: Deletion Case 2 (cont.) B-trees: Deletion Case 2 (cont.) Case2: delete a key at a non-leaf no underflow (eg., delete from T0) FINAL TREE < > < 1 > Case2: delete a key at a non-leaf no underflow (eg., delete from T0) Q: How to promote? A: pick the largest key from the left sub-tree (or the smallest from the right sub-tree) Observation: every deletion eventually becomes a deletion of a leaf key 2 0 5
B-trees: Deletion Case 2 (cont.) B-trees: Deletion Cases (cont.) Case2: delete a key at a non-leaf no underflow (eg., delete from T0) < > < 1 > Delete & promote, ie: Case1: delete a key at a leaf no underflow Case2: delete non-leaf key no underflow Case: delete leaf-key; underflow, and rich sibling Case4: delete leaf-key; underflow, and poor sibling 1 2 B-trees: Deletion Case B-trees: Deletion Case (cont.) Case: underflow & rich sibling (eg., delete from T0) Case: underflow & rich sibling e.g., delete from T0 Delete & borrow < > < > Rich sibling < > < > 4 B-trees: Deletion Case (cont.) B-trees: Deletion Case (cont.) Case: underflow & rich sibling Case: underflow & rich sibling e.g., delete from T0 rich = can give a key, without underflowing borrowing a key: THROUGH the PARENT! Delete & borrow < Rich sibling > < > NO!! 5
B-trees: Deletion Case (cont.) B-trees: Deletion Case (cont.) Case: underflow & rich sibling e.g., delete from T0 Delete & borrow Case: underflow & rich sibling e.g., delete from T0 Delete & borrow < > < > < > < > 8 B-trees: Deletion Case (cont.) B-trees: Deletion Cases (cont.) Case: underflow & rich sibling e.g., delete from T0 Delete & borrow THROUGH the parent < > < > Case1: delete a key at a leaf no underflow Case2: delete non-leaf key no underflow Case: delete leaf-key; underflow, and rich sibling Case4: delete leaf-key; underflow, and poor sibling 40 B-trees: Deletion Case 4 B-trees: Deletion Case 4 (cont.) Case4: underflow & poor sibling eg., delete from T0) Case4: underflow & poor sibling eg., delete from T0) < > < > < > < > 41 42
B-trees: Deletion Case 4 (cont.) Case4: underflow & poor sibling eg., delete from T0) B-trees: Deletion Case 4 (cont.) Case4: underflow & poor sibling eg., delete from T0) < > < > A: merge w/ poor sibling Merge, by pulling a key from the parent exact reversal from insertion: split and push up, vs. merge and pull down Ie.: 4 44 B-trees: Deletion Case 4 (cont.) B-trees: Deletion Case 4 (cont.) Case4: underflow & poor sibling eg., delete from T0) Case4: underflow & poor sibling eg., delete from T0) < A: merge w/ poor sibling FINAL TREE < > > 45 4 B-trees: Deletion Case 4 (cont.) Algorithm: Deletion of Key K Case 4: underflow & poor sibling -> pull key from parent, and merge Q: What if the parent underflows? A: repeat recursively locate key K, in node N if( N is a non-leaf node) { delete K from N ; find the immediately largest key K1 ; /* which is guaranteed to be on a leaf node L */ copy K1 in the old position of K ; invoke DELETION on K1 from the leaf node L ; else { /* N is a leaf node */ next slide 4 48 8
Deletion of Key K (cont.) B-trees in Practice if( N underflows ){ let N1 be the sibling of N ; if( N1 is "rich"){ /* ie., N1 can lend us a key */ borrow a key from N1 THROUGH parent node; } else { /* N1 is 1 key away from underflowing */ MERGE: pull key from parent P, merge it with keys of N and N1 into new node; if( P underflows) { repeat recursively } } } In practice: no empty leaves; pointers to records theory < > < > 4 50 B-trees in Practice (cont.) B-trees in Practice (cont.) In practice: no empty leaves; pointers to records practice < > < > In practice: < > < > SSN 1 51 52 B-trees in Practice (cont.) Overview In practice, the formats are: B trees - leaf nodes: (v1, rp1, v2, rp2, vn, rpn) - Non-leaf nodes: (p1, v1, rp1, p2, v2, rp2, ) B+ - trees, B*-trees < > < > 5 54
B+ trees: Motivation B-tree print keys in sorted order: B+ trees: Motivation (cont.) B-tree needs back-tracking how to avoid it? < > < > < > < > 55 5 Solution: B+ - trees B+ trees Facilitate sequential ops They string all leaf nodes together AND replicate keys from non-leaf nodes, to make sure every key appears at the leaf level < >= < >= 5 58 B+ trees: Insertion B*-trees: Motivation E.g., insert 2 < >= < >= Splits drop utilization to 50% May increase height How to avoid them? 5 0 10
B*-trees: Deferred Split! B*-trees: deferred split! Instead of splitting, LEND keys to sibling! (through PARENT, of course!) < > < > Instead of splitting, LEND keys to sibling! (through PARENT, of course!) FINAL TREE < > < > 1 2 2 2 1 2 B*-trees: Advantages B*-trees: deferred split! Tree becomes Shorter, BUT: What if sibling has no room for lending? More packed, Faster Rare case: improve together space utilization speed BUT: What if sibling has no room for lending? 2-to- split 1. get the keys from the sibling 2. pool them with ours (and a key from the parent). split in Details: too messy (and even worse for deletion) 4 Conclusions Performance Aspects of B-trees Two parameters matter: Main ideas: recursive; block-aware; on overflow -> split; defer splits All B-tree variants have excellent, O(logN) worst-case performance for ins/del/search Height H (maximum search path) H =1+ log F* ( N/C* ) N is the number of tuples C* is the average number of entries in a leaf node, and F* is the average number of entries in an index node. It s the prevailing indexing method Size S (number of pages tree occupies) S = i (F*) i-1,1 i < H 5 11
Reducing the Number of Leafs Increasing the Fanout Increase page size (hard) Shorten data length (values, tuples, pointers) Is it worthwhile to change the tuples to TIDs? Compression Prefix store differences (suffixes) Suffix store prefixes No extra page accesses! From Gray&Reuter: 1.1 log F* x must hold i.e., average fan-out really small or tuples > 1K Prefix compression: sequential scan anchor keys 8 Lehman and Yao CC on B-trees Example safe node: node with <2k entries unsafe node: node with =2k entries Simple CC won t do.why? Transaction 1: read x; look for 15; get ptr to y; x y 15 8 10 12 15 x 10 15 y 8 10 y 12 15 Transaction 2: read x; read y; insert split y into y+y 0 Previous B-tree CC algorithms B link -tree Samadi 1 lock the whole subtree of affected node Bayer & Schkolnick 1 parameters on degree/type of consistency required writer-exclusion locks (readers may proceed) upper exclusive locks on modified nodes Miller & Snyder 18 pioneer and follower locks locked region moves up the tree no modifications Node + P2k+1 pointertonextnodeatthe same level of tree Rightmost node s B-link is NULL IDEA: Splitting is implemented as legal to have left twin and no parent 1 2 12
Advantages Algorithms Allows for temporary fix until all pointers are added correctly Link pointers should be used infrequently because splitting a node is a special case Search No locks needed for reads Just move right as well as down Insertions Level traversal comes for free as a side effect Well-ordered locks Use stack to remember ancestors Split while preserving links Deletions No underflows, no merging 4