CS232A: Database System Principles
INDEXING

Indexing
Given a condition on an attribute, find the qualified records.
  Attr = value
The condition may also be a range: Attr > value, Attr >= value.

Indexing
Data structures used for quickly locating tuples that meet a specific type of condition.
  Equality condition: find Movie tuples where Director = X
  Other conditions possible, e.g., range conditions: find Employee tuples where Salary > ___ AND Salary < ___
Many types of indexes. Evaluate them on:
  Access time
  Insertion time
  Deletion time
  Disk space needed (especially as it affects access time)
Topics
Conventional indexes
B-trees
Hashing schemes

Terms and Distinctions
Primary index: the index on the attribute (a.k.a. search key) that determines the sequencing of the table.
Secondary index: an index on any other attribute.
Dense index: every value of the indexed attribute appears in the index.
Sparse index: many values do not appear.

A Dense Primary Index
[figure: a dense index over a sequential file, one index entry per record]

Dense and Sparse Primary Indexes
Dense primary index:
  + can tell if a value exists without accessing the file (consider projection)
  + better access to overflow records
Sparse primary index:
  Find the index record with the largest value that is less than or equal to the value we are looking for.
  + less index space
More pluses and minuses in a while.
Sparse vs. Dense Tradeoff
Sparse: less index space per record, so more of the index can be kept in memory.
Dense: can tell if any record exists without accessing the file.
(Later: sparse is better for insertions; dense is needed for secondary indexes.)

Multi-Level Indexes
Treat the index as a file and build an index on it.
Two levels are usually sufficient; more than three levels are rare.
Q: Can we build a dense second-level index for a dense index?

A Note on Pointers
Record pointers consist of a block pointer and the position of the record in the block.
Using the block pointer only saves space at no extra disk-access cost.
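The sparse-index lookup rule above ("find the index record with the largest value less than or equal to the search key") can be sketched in a few lines. The index entries and block names below are invented for illustration:

```python
from bisect import bisect_right

# Hypothetical sparse primary index: one (first_key, block_id) entry per block.
index = [(10, "B0"), (40, "B1"), (90, "B2"), (160, "B3")]
keys = [k for k, _ in index]

def find_block(search_key):
    """Return the block that may hold search_key: the index entry with the
    largest key that is <= search_key, per the sparse-index rule."""
    pos = bisect_right(keys, search_key) - 1
    if pos < 0:
        return None          # smaller than every key in the file
    return index[pos][1]

print(find_block(95))        # a key between 90 and 160 lives in block "B2"
```

A dense index would do the same binary search but with an entry per record, which is what lets it answer existence queries without touching the file.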
Representation of Duplicate Values in Primary Indexes
The index may point to the first instance of each value only.

Deletion from Dense Index
Deletion from a dense primary index file with no duplicate values is handled in the same way as deletion from a sequential file.
Q: What about deletion from a dense primary index with duplicates?
[figure: header pointing to lists of available entries]

Deletion from Sparse Index
If the deleted entry does not appear in the index, do nothing.
Deletion from Sparse Index (cont'd)
If the deleted entry does not appear in the index, do nothing.
If the deleted entry appears in the index, replace it with the next search-key value.
Comment: we could leave the deleted value in the index, assuming that no part of the system may assume it still exists without checking the block.
Exception: if the next search-key value has its own index entry, delete the entry instead.

Insertion in Sparse Index
If no new block is created, then do nothing (e.g., insert 35 into a block that has room).
Insertion in Sparse Index
Insert 15: if no new block is created, then do nothing; else create an overflow record and reorganize periodically.
Could we claim the space of the next block?
How often do we reorganize, and how expensive is it?
B-trees offer convincing answers.

Secondary Indexes
The file is not sorted on the secondary search key.
A sparse index on a secondary key does not make sense!
Secondary Indexes
Use a dense index at the first level, with a sparse higher level.
The first level has to be dense; the next levels are sparse (as usual).

Duplicate Values & Secondary Indexes
One option: repeat the value in the index for every record.
Problem: excess overhead!
  disk space
  search time
Duplicate Values & Secondary Indexes
Another option: lists of pointers.
Problem: variable-size records in the index!

Yet another idea: chain records with the same key?
Problems:
  Need to add fields to records; messes up maintenance.
  Need to follow the chain to know all the records.

Final option: buckets of pointers between the index and the data file.
Why the Bucket Idea Is Useful
It enables processing queries by working with pointers only.
This is a very common technique in Information Retrieval.
  Indexes: Name (primary), Dept (secondary), Year (secondary)
  Records: EMP(name, dept, year, ...)

Advantage of Buckets: Process Queries Using Pointers Only
Find employees of the Toys dept with 4 years in the company:
  SELECT Name
  FROM Employee
  WHERE Dept = 'Toys' AND Year = 4
Intersect the Toys bucket (Dept index) and the year-4 bucket (Year index) to get the set of matching EMPs.
  Example records: Aaron (Suits, 4), Helen (Pens, 3), Jack (PCs, 4), Jim (Toys, 4),
  Joe (Toys, 3), Nick (PCs, 2), Walt (Toys, 5), Yannis (Pens, 1)

This idea is used in text information retrieval, where the buckets are known as inverted lists:
  cat -> "...the cat is fat...", "...my cat and my dog like each other..."
  dog -> "...my cat and my dog like each other...", "...fido the dog..."
Information Retrieval (IR) Queries
Find articles with cat and dog: intersect the inverted lists.
Find articles with cat or dog: union the inverted lists.
Find articles with cat and not dog: subtract the list of dog pointers from the list of cat pointers.
Find articles with cat in the title.
Find articles with cat and dog within 5 words.

Common technique: store more info in the inverted list.
  cat -> (d1, Title, 5), (d2, Author, ...), (d3, Abstract, 57), ...   [type and position/location]
  dog -> (d1, Title, 0), (d2, Title, 12), ...
Posting: an entry in an inverted list; it represents an occurrence of a term in an article.
Size of a list: as small as 1 (rare words or, in postings, misspellings), very large for common words.
Size of a posting: on the order of 15 bits or fewer (compressed).
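The three boolean IR operations above map directly onto set operations over the inverted lists. A minimal sketch, with made-up lists mapping each term to sorted document ids:

```python
# Hypothetical inverted lists: term -> sorted list of document ids.
inverted = {
    "cat": [1, 2, 5, 7],
    "dog": [2, 3, 7, 9],
}

def and_q(t1, t2):      # "cat AND dog": intersect the lists
    return sorted(set(inverted[t1]) & set(inverted[t2]))

def or_q(t1, t2):       # "cat OR dog": union the lists
    return sorted(set(inverted[t1]) | set(inverted[t2]))

def and_not_q(t1, t2):  # "cat AND NOT dog": subtract dog's pointers
    return sorted(set(inverted[t1]) - set(inverted[t2]))

print(and_q("cat", "dog"))       # [2, 7]
print(or_q("cat", "dog"))        # [1, 2, 3, 5, 7, 9]
print(and_not_q("cat", "dog"))   # [1, 5]
```

Real systems merge the sorted lists directly rather than materializing sets, which is what makes keeping the lists sorted worthwhile.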
Vector Space Model
          w1 w2 w3 w4 w5 w6 w7
  DOC   = <1  0  0  1  1  0  0>
  Query = <0  0  1  1  0  0  0>
  PRODUCT = 1 + ... = score

Tricks to weigh scores and normalize, e.g.: a match on a common word is not as useful as a match on rare words.

Summary of Indexing So Far
Basic topics in conventional indexes:
  multiple levels
  sparse/dense
  duplicate keys and buckets
  deletion/insertion similar to sequential files
Advantages: simple algorithms; the index is a sequential file.
Disadvantages: eventually sequentiality is lost because of overflows, and reorganizations are needed.
Example
[figure: a sequential index with continuous free space (keys such as 33, 60, 90),
data blocks (31, 32, 34, 35, 36, 38, 39), and an overflow area that is not sequential]

Outline: Conventional indexes; B-Trees (NEXT); Hashing schemes.

NEXT: Another type of index.
Give up on sequentiality of the index; try to get balance.
B+Tree Example (n=3)
[figure: root, internal nodes with keys such as 156 and 179, and leaves with keys such as 3, 5, 11, 35]

Sample Non-Leaf Node
Keys 57, 81, 95; the four pointers lead:
  to keys < 57
  to keys 57 <= k < 81
  to keys 81 <= k < 95
  to keys >= 95

Sample Leaf Node
(reached from a non-leaf node)
Keys 57, 81, 95, with:
  a pointer to the record with key 57
  a pointer to the record with key 81
  a pointer to the record with key 95
  plus a pointer to the next leaf in sequence
In the textbook's notation (n=3):
Leaf and non-leaf nodes are drawn as boxes of keys and pointers (e.g., a leaf holding 35).
Size of nodes: n+1 pointers, n keys (fixed).

Non-root nodes have to be at least half-full.
Use at least:
  Non-leaf: ceil((n+1)/2) pointers
  Leaf:     floor((n+1)/2) pointers to data
n=3: full node vs. minimum node
  Non-leaf full: 3 keys, 4 pointers; minimum: 2 pointers.
  Leaf full: 3 keys (e.g., 3, 5, 11); minimum: 2 data pointers (e.g., 35 and one more).

B+tree rules (tree of order n):
(1) All leaves are at the same lowest level (balanced tree).
(2) Pointers in leaves point to records, except for the sequence pointer.
(3) Number of pointers/keys for a B+tree:

                       Max ptrs  Max keys  Min ptrs/data    Min keys
  Non-leaf (non-root)    n+1        n      ceil((n+1)/2)    ceil((n+1)/2) - 1
  Leaf (non-root)        n+1        n      floor((n+1)/2)   floor((n+1)/2)
  Root                   n+1        n      2                1
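Rule (3) can be checked mechanically. A small sketch (function name and layout are my own) that reproduces the table for any order n:

```python
from math import ceil, floor

def bptree_bounds(n):
    """Max/min pointer and key counts for B+tree nodes of order n (rule 3)."""
    half = (n + 1) / 2
    return {
        "non-leaf": {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": ceil(half), "min_keys": ceil(half) - 1},
        "leaf":     {"max_ptrs": n + 1, "max_keys": n,   # incl. sequence ptr
                     "min_data_ptrs": floor(half), "min_keys": floor(half)},
        "root":     {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": 2, "min_keys": 1},
    }

b = bptree_bounds(3)            # the n=3 example from the slides
print(b["non-leaf"]["min_ptrs"], b["leaf"]["min_data_ptrs"])   # 2 2
```

For n=3 this matches the minimum-node picture: a minimal non-leaf has 2 pointers (1 key), a minimal leaf has 2 data pointers.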
Insert into B+tree
(a) simple case: space available in leaf
(b) leaf overflow
(c) non-leaf overflow
(d) new root

(a) Insert key = 32 (n=3): the target leaf has room; simply insert 32 into it.
(b) Insert key = 7 (n=3): the leaf 3, 5, 11 overflows; split it into leaves (3, 5) and (7, 11) and insert key 7 into the parent.
(c) Insert key = 160 (n=3): the leaf containing 156 and 179 overflows and splits, and the split propagates: the parent non-leaf node also overflows and is split, pushing a key up one level.
(d) New root: inserting 45 (n=3) splits the old root, so a new root is created above it.

Deletion from B+tree
(a) Simple case (no example)
(b) Coalesce with neighbor (sibling)
(c) Redistribute keys
(d) Cases (b) or (c) at non-leaf
(b) Coalesce with sibling: delete a key (n=4); the leaf falls below the minimum and is merged with its sibling.
(c) Redistribute keys: delete a key (n=4); borrow a key from a sibling and update the parent's separator key (e.g., 35 moves up).
(d) Non-leaf coalesce: delete 37 (n=4); the merge propagates to the non-leaf level, and a new (shorter) root results.
B+tree deletions in practice
Often, coalescing is not implemented: too hard, and not worth it!

Comparison: B-trees vs. static indexed sequential file
Ref #1: Held & Stonebraker, "B-Trees Re-examined", CACM, Feb. 1978

Ref #1 claims:
  Concurrency control is harder in B-trees.
  A B-tree consumes more space.
For their comparison: block = 512 bytes, key = pointer = 4 bytes, 4 data records per block.
Example: 1-block static index
127 keys: (127+1) x 4 = 512 bytes.
Pointers in the index are implicit, so one block indexes up to 127 contiguous data blocks.

Example: 1-block B-tree
63 keys: 63 x (4+4) + 8 = 512 bytes.
Pointers are needed in the B-tree because index and data blocks are not contiguous, so one block indexes up to 63 data blocks.

Size comparison (Ref #1):
  Static index:  2-127 data blocks -> height 2;  128-16,129 -> 3;  16,130-2,048,383 -> 4
  B-tree:        2-63 data blocks  -> height 2;  64-3,968   -> 3;  3,969-250,047 -> 4;  250,048-15,752,961 -> 5
Ref #1 analysis claims:
For an 8,000-block file, after 32,000 inserts and 16,000 lookups, the static index saves enough accesses to allow for reorganization.
Ref #1 conclusion: static index better!!

Ref #2: M. Stonebraker, "Retrospective on a database system", TODS, June 1980.
Ref #2 conclusion: B-trees better!!
  The DBA does not know when to reorganize.
  Self-administration is an important target.
  The DBA does not know how full to load the pages of a new index.
Ref #2 conclusion: B-trees better!!
Buffering:
  B-tree: has fixed buffer requirements.
  Static index: large and variable-size buffers are needed due to overflow.

Speaking of buffering: is LRU a good policy for B+tree buffers?
Of course not! We should try to keep the root in memory at all times (and perhaps some nodes from the second level).

Interesting problem: for a B+tree, how large should n be? (n is the number of keys per node)
Assumptions
You have the right to set the disk page size for the disk where the B-tree will reside.
Compute the optimum page size n assuming that:
  The items are 4 bytes long and the pointers are also 4 bytes long.
  Time to read a node from disk is 12 + .003n.
  Time to process a block in memory is unimportant.
  The B+tree is full (i.e., every page has the maximum number of items and pointers).
Then f(n) = time to find a record; find n_opt by setting f'(n) = 0.
The answer should be n_opt = a few hundred.
What happens to n_opt as disks get faster? As CPUs get faster?
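A numeric version of the n_opt exercise. Assumptions beyond the slide are mine: the file size N, arbitrary time units, and finding the minimum by brute force instead of solving f'(n) = 0. Since N only scales f, the location of the minimum does not depend on it:

```python
from math import log

N = 10**9                        # assumed number of records (not on the slide)

def f(n):
    """Time to find a record: (levels of a full tree with fan-out ~n)
    times the slide's per-node read time 12 + .003n."""
    height = log(N) / log(n)
    return height * (12 + 0.003 * n)

n_opt = min(range(2, 100_000), key=f)
print(n_opt)                     # lands in the "few hundred" range
```

If disks get faster the fixed cost 12 shrinks relative to the 0.003n transfer term, pulling n_opt down; a faster CPU does not change f at all under these assumptions.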
Variation on B+tree: B-tree (no +)
Idea: avoid duplicate keys; have record pointers in non-leaf nodes.
  K1 P1 K2 P2 K3 P3
  Pi points to the record with key Ki.
  The tree pointers lead to keys < K1, K1 < x < K2, K2 < x < K3, and > K3.

B-tree example (n=2)
Sequence pointers are not useful now (but keep the space for simplicity).
[figure: B-tree with keys such as 25, 45, 60, 65, 85, 90, 125, 145, 160, 165]
So, for B-trees:

                        MAX                          MIN
                Tree  Rec   Keys        Tree            Rec               Keys
                ptrs  ptrs              ptrs            ptrs
  Non-leaf
   non-root      n+1    n     n    ceil((n+1)/2)  ceil((n+1)/2)-1  ceil((n+1)/2)-1
  Leaf non-root   1     n     n         1         floor((n+1)/2)   floor((n+1)/2)
  Root non-leaf  n+1    n     n         2               1                1
  Root leaf       1     n     n         1               1                1

Tradeoffs:
  B-trees have marginally faster average lookup than B+trees (assuming the height does not change).
  In a B-tree, non-leaf and leaf nodes have different sizes, and the fan-out is smaller.
  In a B-tree, deletion is more complicated.
  B+trees preferred!

Example:
  Pointers: 4 bytes
  Keys: 4 bytes
  Blocks: 100 bytes (just an example)
  Look at a full 2-level tree.
B-tree:
Root has 8 keys + 8 record pointers + 9 son pointers = 8x4 + 8x4 + 9x4 = 100 bytes.
Each of the 9 sons: 12 record pointers (+ 12 keys) = 12x(4+4) + 4 = 100 bytes.
2-level B-tree, max # records = 12x9 + 8 = 116.

B+tree:
Root has 12 keys + 13 son pointers = 12x4 + 13x4 = 100 bytes.
Each of the 13 sons: 12 record pointers (+ 12 keys) = 12x(4+4) + 4 = 100 bytes.
2-level B+tree, max # records = 13x12 = 156.

So: 156 records in the B+tree vs. 116 total in the B-tree.
Conclusion: for a fixed block size, the B+tree is better because it is bushier.
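The capacity arithmetic can be checked mechanically; the 100-byte block size is deduced from the sums (8x4 + 8x4 + 9x4 = 100 and 12x4 + 13x4 = 100):

```python
BLOCK, KEY, PTR = 100, 4, 4

# B-tree: root = 8 keys + 8 record ptrs + 9 son ptrs; son = 12 recs + 12 keys
assert 8*KEY + 8*PTR + 9*PTR == BLOCK
assert 12*(KEY + PTR) + 4 == BLOCK          # 4 bytes left unused per son
btree_records = 12*9 + 8                    # sons' records + root's records

# B+tree: root = 12 keys + 13 son ptrs; sons as above, all records in leaves
assert 12*KEY + 13*PTR == BLOCK
bptree_records = 13*12

print(btree_records, bptree_records)        # 116 156
```

The B+tree root spends no space on record pointers, so it fans out to 13 sons instead of 9, which is exactly the "bushier" advantage.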
Outline/Summary
Conventional indexes: sparse vs. dense, primary vs. secondary.
B-trees: B+trees vs. B-trees, B+trees vs. indexed sequential.
Hashing schemes --> Next

Hashing
A hash function h(key) returns the address of a bucket (or, for a secondary index, a record).
Buckets are required if the keys for a specific hash value do not fit into one page; the bucket is a linked list of pages.

Example hash function
Key = x1 x2 ... xn, an n-byte character string; have b buckets.
h: add x1 + x2 + ... + xn, then compute the sum modulo b.
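The example hash function above (sum the key's bytes, take the result mod b) is one line of Python; the key and bucket count below are arbitrary:

```python
def h(key: str, b: int) -> int:
    """Slide's example hash: add the bytes of the key, sum modulo b."""
    return sum(key.encode()) % b

# 'c' + 'a' + 't' = 99 + 97 + 116 = 312, and 312 % 4 == 0
print(h("cat", 4))   # 0
```

As the next slide warns, this is not a great function: anagrams collide ("cat" and "act" hash identically), which violates the goal of spreading keys evenly over the buckets.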
This may not be the best function.
Read Knuth Vol. 3 if you really need to select a good function.
Good hash function: the expected number of keys per bucket is the same for all buckets.

Within a bucket: do we keep keys sorted?
Yes, if CPU time is critical and inserts/deletes are not too frequent.

Next: an example to illustrate inserts, overflows, and deletes.
EXAMPLE: 2 records/bucket
INSERT with h(a)=1, h(b)=2, h(c)=1, h(d)=0, h(e)=1:
  bucket 0: d
  bucket 1: a, c   (e overflows bucket 1)
  bucket 2: b

EXAMPLE: deletion
Delete e, f, c; when c is deleted, maybe move g up into the freed slot.

Rule of thumb:
Try to keep space utilization between ___% and ___%.
  Utilization = (# keys used) / (total # keys that fit)
If utilization is too low, we are wasting space; if too high, overflows become significant.
How much depends on how good the hash function is and on the number of keys per bucket.
How do we cope with growth?
  Overflows and reorganizations
  Dynamic hashing: Extensible, Linear

Extensible hashing: two ideas
(a) Use i of the b bits output by the hash function (e.g., b=4, h(k)=0011; use the i leftmost bits).
    i grows over time.
(b) Use a directory: h(k)[0..i-1] maps to a pointer to a bucket.
Example: h(k) is 4 bits; 2 keys/bucket.
Start with i=1: directory entries 0 and 1 point to two buckets (e.g., 0001 in one; 1001 and 1100 in the other).
An insert into the full 1-bucket splits it on the second bit: the directory doubles to i=2 (entries 00, 01, 10, 11), and only the split bucket gets two directory entries of its own.

Example continued (i=2): insert 0111 and 0000.
The 0-side bucket (0000, 0001, 0111) overflows and splits into buckets for 00 (0000, 0001) and 01 (0111); the directory already distinguishes these prefixes, so it does not grow.

Example continued: a further insert overflows a bucket whose local depth already equals i=2, so the directory doubles to i=3 (entries 000 through 111) and only the overflowing bucket splits.
Extensible hashing: deletion
Either no merging of blocks, or merge blocks and cut the directory if possible (reverse the insert procedure).
Deletion example: run through the insert example in reverse!

Summary: extensible hashing
+ Can handle growing files, with less wasted space and no full reorganizations.
- Indirection (not bad if the directory fits in memory).
- The directory doubles in size (now it fits, now it does not).
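The two ideas (prefix bits plus a doubling directory) fit in a short runnable sketch. This is not the slides' exact example: keys are assumed to already be 4-bit hash values, the bucket size is 2, and the class layout is my own:

```python
class ExtensibleHash:
    """Toy extensible hash: directory of bucket refs, global depth i,
    per-bucket local depth."""
    B = 4                                   # hash width in bits

    def __init__(self, bucket_size=2):
        self.bucket_size = bucket_size
        self.i = 1                          # idea (a): use i of the b bits
        self.directory = [{"depth": 1, "keys": []},
                          {"depth": 1, "keys": []}]   # idea (b): directory

    def _bucket(self, key):
        return self.directory[key >> (self.B - self.i)]  # top i bits

    def insert(self, key):
        bucket = self._bucket(key)
        if len(bucket["keys"]) < self.bucket_size:
            bucket["keys"].append(key)
            return
        if bucket["depth"] == self.i:       # full bucket, directory too small:
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.i += 1                     # double the directory
        # split the full bucket: upper half of its directory run -> new bucket
        run = [j for j, b in enumerate(self.directory) if b is bucket]
        bucket["depth"] += 1
        new = {"depth": bucket["depth"], "keys": []}
        for j in run[len(run) // 2:]:
            self.directory[j] = new
        for k in bucket["keys"][:]:         # rehash the old bucket's keys
            if self._bucket(k) is new:
                bucket["keys"].remove(k)
                new["keys"].append(k)
        self.insert(key)                    # retry (may split again)

eh = ExtensibleHash()
for k in [0b0001, 0b1001, 0b1100, 0b1010, 0b0111, 0b0000]:
    eh.insert(k)
print(eh.i, [sorted(b["keys"]) for b in eh.directory])
```

Note how a split only doubles the directory when the bucket's local depth has caught up with the global depth i; otherwise the existing directory entries are just repointed.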
Linear hashing
Another dynamic hashing scheme. Two ideas:
(a) Use the i low-order bits of the hash (b=4 bits; i grows over time).
(b) The file grows linearly, one bucket at a time.

Example: b=4 bits, i=2, 2 keys/bucket.
Buckets 00 and 01 exist; m = 01 is the max used block, and blocks past m are future growth buckets.
Buckets can have overflow chains!
Rule: if h(k)[last i bits] <= m, then look at bucket h(k)[i]; else, look at bucket h(k)[i] - 2^(i-1).
Example continued: how to grow beyond this?
When m reaches the last block of the current round (m = 11), increase i from 2 to 3; buckets are then addressed 000 through 111, and the file keeps growing one bucket at a time.

When do we expand the file?
Keep track of U = (# used slots) / (total # of slots).
If U > threshold, then increase m (and maybe i).

Summary: linear hashing
+ Can handle growing files, with less wasted space and no full reorganizations.
+ No indirection like extensible hashing.
- Can still have overflow chains.
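A runnable sketch of linear hashing using the slide's addressing rule. The bucket size, the growth threshold (0.8), and the initial state are my assumptions; keys are assumed to already be hash values:

```python
class LinearHash:
    """Toy linear hash: i low-order bits, file grows one bucket at a time."""
    def __init__(self, bucket_size=2):
        self.bucket_size = bucket_size
        self.i = 2                 # low-order bits of the hash in use
        self.m = 0b01              # max used block number
        self.buckets = [[], []]    # blocks 00 and 01 exist initially

    def _addr(self, key):
        a = key & ((1 << self.i) - 1)       # low-order i bits
        if a <= self.m:
            return a
        return a - (1 << (self.i - 1))      # slide's rule: a - 2^(i-1)

    def insert(self, key):
        self.buckets[self._addr(key)].append(key)   # overflow chains allowed
        used = sum(len(b) for b in self.buckets)
        if used / (self.bucket_size * len(self.buckets)) > 0.8:
            self._grow()

    def _grow(self):
        self.m += 1
        if self.m >= (1 << self.i):
            self.i += 1                      # start the next round
        self.buckets.append([])
        # re-split the one bucket whose keys may belong to the new block
        src = self.m - (1 << (self.i - 1))
        old = self.buckets[src]
        self.buckets[src] = [k for k in old if self._addr(k) == src]
        self.buckets[-1] = [k for k in old if self._addr(k) != src]

lh = LinearHash()
for k in [1, 5, 2, 6, 3, 7, 0]:
    lh.insert(k)
print(lh.m, [sorted(b) for b in lh.buckets])
```

Unlike extensible hashing, growth splits exactly one predetermined bucket per step and needs no directory, which is the "file grows linearly" idea.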
Example: BAD CASE for linear hashing
Some buckets are very full while others are very empty; we would need to move m past the empty ones, which would waste space.

Summary: Hashing
  How it works
  Dynamic hashing: Extensible, Linear

Next: Indexing vs. hashing; index definition in SQL; multiple-key access.
Indexing vs. Hashing
Hashing is good for probes given a key, e.g.:
  SELECT * FROM R WHERE R.A = 5

Indexing (including B-trees) is good for range searches, e.g.:
  SELECT * FROM R WHERE R.A > 5

Index definition in SQL
  CREATE INDEX name ON rel (attr)
  CREATE UNIQUE INDEX name ON rel (attr)   -- defines a candidate key
  DROP INDEX name
Note
You CANNOT specify the type of index (e.g., B-tree, hashing, ...) or its parameters (e.g., load factor, size of hash, ...) ... at least in standard SQL.

Note
The attribute list enables a MULTIKEY INDEX (next), e.g.:
  CREATE INDEX foo ON R(A, B, C)

Multi-key Index
Motivation: find records where DEPT = "Toy" AND SAL > k.
Strategy I: use one index, say on Dept.
Get all Dept = "Toy" records and check their salary.

Strategy II: use two indexes; manipulate pointers.
Intersect the pointer sets for Dept = "Toy" and Sal > k.

Strategy III: multiple-key index.
One idea: a first-level index I1 on Dept whose entries point to per-department salary indexes (I2, I3, ...).
Example
A Dept index (Art, Sales, Toy) where each entry points to a salary index over that department's records (salaries such as 12k, 15k, 17k, 19k, 21k).
Example record: Name=Joe, DEPT=Sales, SAL=15k.

For which queries is this index good?
  Find RECs with Dept = "Sales" and SAL = __k
  Find RECs with Dept = "Sales" and SAL > __k
  Find RECs with Dept = "Sales"
  Find RECs with SAL = __k

Interesting application: geographic data.
  DATA: <X1, Y1, attributes>, <X2, Y2, attributes>, ...
Queries:
  What city is at <Xi, Yi>?
  What is within 5 miles from <Xi, Yi>?
  Which is the closest point to <Xi, Yi>?

Example
[figure: points a through o partitioned into rectangles; search points near f, search points near b]

Queries:
  Find points with Yi > __
  Find points with Xi < 5
  Find points close to i = <12, 38>
  Find points close to b = <7, 24>
Many types of geographic index structures have been suggested:
  Quad Trees
  R-Trees

Two more types of multi-key indexes:
  Grid
  Partitioned hash

Grid Index
Key 1 takes values V1, V2, ..., Vn along one axis; Key 2 takes values X1, X2, ..., Xn along the other.
A grid cell points to the records with, e.g., key1 = V3 and key2 = X2.
CLAIM
We can quickly find records with:
  key1 = Vi and key2 = Xj
  key1 = Vi only
  key2 = Xj only
And also ranges, e.g., key1 >= Vi and key2 < Xj.

But there is a catch with grid indexes!
How is the grid index stored on disk? Like an array:
  row V1: X1 X2 X3 X4; row V2: X1 X2 X3 X4; row V3: X1 X2 X3 X4
Problem: we need regularity so we can compute the position of the <Vi, Xj> entry.

Solution: use indirection.
The grid contains only pointers to buckets; several cells can share a bucket.
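A minimal sketch of a grid index with indirection. The partition bounds, record ids, and attribute values are all invented for illustration:

```python
from bisect import bisect_right

key1_bounds = [20, 50]     # salary partition: <20, 20-49, >=50 (assumed)
key2_bounds = ["M"]        # name partition: A-L, M-Z (assumed)

def cell(salary, name):
    """Compute the (row, col) grid cell for a record's two key values."""
    return bisect_right(key1_bounds, salary), bisect_right(key2_bounds, name)

grid = {}                  # (row, col) -> bucket of record ids (indirection)

def insert(rec_id, salary, name):
    grid.setdefault(cell(salary, name), []).append(rec_id)

insert(1, 15, "Ann"); insert(2, 30, "Joe"); insert(3, 60, "Zoe")

# Exact match on both keys touches one cell...
print(grid[cell(30, "Joe")])                                  # [2]
# ...and a range on key1 alone scans a single grid row.
print([r for (i, j), b in grid.items() if i == 1 for r in b]) # [2]
```

Because cells hold only bucket pointers, the grid itself stays a regular array (position computable from (row, col)) even when the data is skewed.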
With indirection, the grid can be regular without wasting space; we do pay the price of the indirection.

We can also index the grid on value ranges, e.g.:
  Salary split into ranges (0-__K, __K-__K, __K-) mapped to grid rows 1, 2, 3.
  Dept on a linear scale: Toy = 1, Sales = 2, Personnel = 3.

Grid files
+ Good for multiple-key search.
- Space and management overhead (nothing is free).
- Need partitioning ranges that evenly split the keys.
Partitioned hash function
Idea: the bucket address is formed by concatenating bits from two hash functions, h1(Key1) and h2(Key2).

Example (3-bit bucket addresses; h1 gives the first bit, h2 the last two):
  h1(toy) = 0, h1(sales) = 1, h1(art) = 1, ...
  h2 maps each salary to 2 bits (01, 11, 01, 00, ...).
Insert <Fred, toy, __k>, <Joe, sales, __k>, <Sally, art, __k>: each record's bucket address is h1(dept) followed by h2(sal).

Find employees with Dept = "Sales" and Sal = __k: both hash values are fixed, so exactly one bucket needs to be examined.
Find employees with Sal = __k: h2 fixes the low-order bits, so look only at the buckets whose addresses end in h2(__k).
Find employees with Dept = "Sales": h1 fixes the high-order bit, so look only at the buckets whose addresses start with h1(sales) = 1.

Summary
Post-hashing discussion:
  Indexing vs. hashing
  SQL index definition
  Multiple-key access
  Multi-key index variations: Grid, geo data
  Partitioned hash
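The partial-match lookups above can be sketched directly. The hash functions and salary values are invented (the slide's salary constants are elided), and records in a matched bucket would still need to be checked against the query:

```python
H1_BITS, H2_BITS = 1, 2

def h1(dept):                       # 1 high-order bit (made-up mapping)
    return {"toy": 0, "sales": 1, "art": 1}[dept]

def h2(sal):                        # 2 low-order bits (made-up function)
    return sal % 4

def bucket(dept, sal):
    return (h1(dept) << H2_BITS) | h2(sal)   # concatenate the bits

buckets = {b: [] for b in range(1 << (H1_BITS + H2_BITS))}
for name, dept, sal in [("Fred", "toy", 10), ("Joe", "sales", 10),
                        ("Sally", "art", 31)]:
    buckets[bucket(dept, sal)].append(name)

# Dept AND Sal both known: exactly one bucket to examine.
print(buckets[bucket("sales", 10)])                              # ['Joe']
# Sal known only: every bucket whose low-order bits match h2(sal).
print([n for b, rs in buckets.items() if b & 3 == h2(10) for n in rs])
# Dept known only: every bucket whose high-order bit matches h1(dept).
print([n for b, rs in buckets.items() if b >> H2_BITS == h1("sales") for n in rs])
```

Fixing one key halves (or quarters) the buckets to scan, which is what makes partitioned hashing useful for partial-match queries that plain hashing cannot support.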
Reading
Chapter 5. Skim sections 5.3.6, 5.3.7, 5.3.8 and 5.4.2, 5.4.3, 5.4.4; read the rest.

The BIG picture
  Chapters 2 & 3: Storage, records, blocks...
  Chapters 4 & 5: Access mechanisms: indexes, B-trees, hashing, multi-key
  Chapters 6 & 7: Query processing (NEXT)