Database Systems. File Organization-2. A.R. Hurson 323 CS Building

Indexing schemes for Files
Indexing is a technique for reducing the number of accesses to secondary storage in an information retrieval system. An index is usually defined on a single field of the data file, called the indexing field. The index is an auxiliary structure that stores each value of the indexing field along with a list of pointers to all disk blocks that hold a record with that field value.

Indexing schemes for Files
An index is one of two types:
Dense index: the index has an entry for each data record.
Non-dense index: the index has an entry for each data block.
The index file is ordered, which allows us to perform a binary search on the index. The index file is much smaller than the data file; in many cases this allows us to keep the index in primary memory.

Primary Indexes
A primary index is an ordered file whose records are of fixed length with two fields:
A field with the same data type as the ordering key field of the data file.
A pointer to a disk block.
The ordering key field is called the primary key of the data file.

Primary Indexes
The index file has an entry for each data file block. Each index entry i has two parts: K(i), the value of the primary key field for the first record in a block, and P(i), a pointer to that block.
Index entry layout: <Primary key value K(i), Block pointer P(i)>

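[Figure: primary index on the name field. Each index entry holds the key of a block's first record and a pointer to that block.]
Index file entries: Aaron, Ed; Adams, John; Alexander, Ed; Allen, Troy; ...; Wong, James; Wright, Pam
Data file blocks: (Aaron, Ed; Acosta, Marc), (Adams, John; Akers, Jan), (Alexander, Ed; Allen, Sam), (Allen, Troy; Andrea, Ali), ..., (Wong, James; Woods, Bob), (Wright, Pam; Zimmer, Deb)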

Primary Indexes
The total number of entries in the index is the same as the number of blocks of the data file. The first record in each block is called an anchor record. The index file for a primary index needs substantially fewer blocks than the data file, since:
There are fewer entries in the index file than there are records in the data file.
Each index entry is smaller than a data record.

Primary Indexes
A record whose primary key value is K will be in the block whose address is P(i), where K(i) ≤ K < K(i+1). To retrieve a record with primary key value K, we do a binary search on the index file to find the appropriate index entry i and then access the designated data block. If the index file occupies b blocks, we need ⌈log2 b⌉ + 1 disk accesses: ⌈log2 b⌉ for the binary search on the index and one more for the data block.
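The lookup rule K(i) ≤ K < K(i+1) can be sketched in a few lines of Python. This is only an illustration, assuming the index fits in two in-memory lists; the function name and the block numbers are hypothetical, and the anchor keys are a subset of the names in the index-file figure.

```python
from bisect import bisect_right

def primary_index_lookup(index_keys, block_pointers, key):
    """Return the pointer P(i) of the block that may hold `key`, or None.

    index_keys[i] is the anchor key K(i) of block i (sorted ascending);
    block_pointers[i] is the matching pointer P(i).
    """
    # bisect_right counts the anchors <= key, so the entry with
    # K(i) <= key < K(i+1) sits one position to the left of that count.
    i = bisect_right(index_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first anchor record
    return block_pointers[i]

# Anchor records taken from the index-file figure (subset).
anchors = ["Aaron, Ed", "Adams, John", "Alexander, Ed", "Allen, Troy"]
pointers = [0, 1, 2, 3]  # hypothetical block numbers

print(primary_index_lookup(anchors, pointers, "Akers, Jan"))
```

The function only identifies the candidate block; the block itself must still be read and searched for the record.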

Example
Assume an ordered, unspanned file of 30,000 fixed-size records stored on a disk with a block size of 1024 bytes and a record length of 100 bytes. As a result:
bfr = ⌊1024/100⌋ = 10 records/block
# of blocks = 30,000/10 = 3,000
A binary search on the data file needs ⌈log2 3000⌉ = 12 disk accesses.

Example
Now assume an ordering key field of 9 bytes and a block pointer of 6 bytes. As a result:
Each index entry is 9 + 6 = 15 bytes.
bfr of index = ⌊1024/15⌋ = 68 entries/block
# of blocks for index = ⌈3000/68⌉ = 45 blocks
A binary search on the index file needs ⌈log2 45⌉ = 6 disk accesses. Finally, we need one more access to the designated data block, for a total of 7 accesses.
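The arithmetic of this example can be checked with a short Python sketch. The function name and the dictionary keys are made up for illustration; the formulas are the ones used in the example (floor for blocking factors, ceiling for block counts and binary-search depth).

```python
import math

def primary_index_cost(num_records, record_size, block_size, key_size, pointer_size):
    """Block counts and access counts for an ordered file with a primary index."""
    bfr_data = block_size // record_size                  # records per data block
    data_blocks = math.ceil(num_records / bfr_data)
    bfr_index = block_size // (key_size + pointer_size)   # entries per index block
    index_blocks = math.ceil(data_blocks / bfr_index)     # one entry per data block
    return {
        "data_blocks": data_blocks,
        "binary_search_on_data": math.ceil(math.log2(data_blocks)),
        "index_blocks": index_blocks,
        # binary search on the index plus one access to the data block
        "accesses_via_index": math.ceil(math.log2(index_blocks)) + 1,
    }

cost = primary_index_cost(30_000, 100, 1024, 9, 6)
print(cost)
```

With the numbers above, the index brings the cost of a lookup down from 12 block accesses to 7.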

Primary Indexes
Insertion and deletion of records are the major problem with primary index organization, as with any ordered file. With primary indexing the problem is compounded: an insertion also affects the structure of the index file, since moving records in the data file changes the anchor records of some blocks.

Primary Indexes
An unordered overflow file can be used for insertions. To improve retrieval, however, we can keep a sorted linked list of overflow records for each block. Record deletion can be handled by using deletion markers.

Clustering Indexes
A clustering index can be used when the records of a file are physically ordered on a non-key field, that is, a field that does not have a distinct value for each record. Such a field is called the clustering field of the file.

Clustering Indexes
A clustering index is a non-dense ordered file of fixed-length records, each with two fields:
The first field is of the same type as the clustering field of the file.
The second field is a pointer to a data file block.
Index entry layout: <Clustering value K(i), Block pointer P(i)>

Clustering Indexes
Each distinct value of the clustering field has an entry in the index file containing a pointer, P(i), to the first data file block that has a record with clustering field value equal to K(i).
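A minimal Python sketch of how such an index could be built from an ordered data file follows. The function name is hypothetical, and the grouping of the figure's field values into blocks of four records is an assumption made only for this illustration.

```python
def build_clustering_index(blocks):
    """Return one (K(i), P(i)) entry per distinct clustering value.

    blocks[b] lists the clustering-field values stored in data block b;
    the file is assumed to be physically ordered on that field.
    """
    index = []
    seen = set()
    for block_no, values in enumerate(blocks):
        for value in values:
            if value not in seen:
                # First block in which this value appears.
                seen.add(value)
                index.append((value, block_no))
    return index

# Field values from the clustering-index figure, assumed four per block.
data_blocks = [[1, 1, 1, 2], [2, 3, 3, 3], [3, 3, 4, 4], [5, 5, 5, 5], [6, 6, 6, 8]]
clustering_index = build_clustering_index(data_blocks)
print(clustering_index)
```

Note that several index entries may point to the same block (values 1 and 2 here), which is why the index is non-dense with respect to records but has one entry per distinct value.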

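[Figure: clustering index. The index file holds one entry (clustering field value, block pointer) for each of the values 1, 2, 3, 4, 5, 6, 8. The data file is ordered on the clustering field: 1 1 1 2 2 3 3 3 3 3 4 4 5 5 5 5 6 6 6 8.]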

Clustering Indexes
Record insertion and deletion still cause considerable problems. To alleviate the problem of insertion, a whole block can be reserved for each clustering value. If a clustering value requires more than one block, additional blocks are allocated and linked together.

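[Figure: clustering index with a separate block, or a chain of linked blocks, reserved for each clustering field value (1, 2, 3, 4, 5, 6, 8). Each data block ends with a block pointer to the next block for the same value.]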

Secondary Indexes
A secondary index is a dense ordered file of records, each with two fields:
The first field is of the same data type as some non-ordering field of the data file.
The second field is a pointer to a disk block.
Index entry layout: <Indexing value K(i), Block pointer P(i)>

Secondary Indexes
The field on which the index file is constructed is called the indexing field of the file. As will become clear, a secondary index provides a logical ordering on the records (why?). Two cases will be considered:
The indexing field is a key field (a secondary key).
The indexing field is not a key field.

Indexing field as a key field
In this case, the entries <K(i), P(i)> of the index file are unique and ordered by the value of K(i). P(i) points to the data block containing the record whose indexing field value is K(i).
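As a rough sketch, a dense secondary index on a key field can be built by collecting one (K(i), P(i)) pair per record and sorting the pairs. The function names, the record layout, and the field name "ssn" below are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

def build_secondary_index(blocks, field):
    """Return sorted parallel lists (keys, pointers), one entry per record."""
    entries = sorted((record[field], block_no)
                     for block_no, block in enumerate(blocks)
                     for record in block)
    keys = [k for k, _ in entries]
    pointers = [p for _, p in entries]
    return keys, pointers

def secondary_lookup(keys, pointers, key):
    """Binary search for an exact match; the index is dense, so a miss means no record."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return pointers[i]
    return None

# A data file that is NOT ordered on the indexing field.
blocks = [
    [{"id": 1, "ssn": 9}, {"id": 2, "ssn": 5}],
    [{"id": 3, "ssn": 13}, {"id": 4, "ssn": 8}],
    [{"id": 5, "ssn": 6}, {"id": 6, "ssn": 15}],
]
keys, pointers = build_secondary_index(blocks, "ssn")
print(secondary_lookup(keys, pointers, 8))
```

Because the data file is not ordered on this field, anchor records cannot be used; every record needs its own index entry.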

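[Figure: dense secondary index on a non-ordering key field. The index file lists the indexing field values in sorted order (1, 2, 3, ..., 12, ...), each with a block pointer; the data file stores the same values in unsorted order (9, 5, 13, 8, 6, 15, 3, 17, ...).]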

Indexing field as a key field
This structure usually requires much more storage space than a primary index, and the search time for an arbitrary record is longer than with a primary index.

Example
A fixed-length, unspanned, ordered file with 30,000 records of size 100 bytes is stored on a disk with a block size of 1024 bytes. In this environment the data file requires 3,000 blocks, and a linear search would require, on average, 1,500 block accesses.

Example
Suppose we construct a secondary index on a non-ordering key field of length 9 bytes, and assume that the block pointer is 6 bytes long.
Each index entry is 15 bytes and hence bfr = ⌊1024/15⌋ = 68.
The number of entries in the index file is 30,000 and hence the number of blocks needed to store the index file is ⌈30,000/68⌉ = 442.

Example
A binary search on the index file requires ⌈log2 442⌉ = 9 block accesses. Therefore, we need a total of 9 + 1 = 10 block accesses to reach a record, compared with 1,500 accesses in a linear search.

Indexing field as non-key field
In this case, several entries in the index can have the same value for the indexing field. There are several options for implementing the index file:
The index file has several entries with the same K(i) value, one per record: a dense index.

Indexing field as non-key field
The records in the index file are of variable length, of the form <K(i), P(i,1), P(i,2), ..., P(i,k)>, with each P(i,j) pointing to a distinct file block containing a record whose indexing field value is K(i).

Indexing field as non-key field
The index file has fixed-length records, but we create an extra level of indirection to handle multiple pointers. In this case, the P(i) in the index entry <K(i), P(i)> points to a block of pointers, each pointing to a data file block containing a record with indexing field value equal to K(i). Finally, a linked list of blocks can be used if the pointers for the same K(i) cannot fit into one block.
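The extra level of indirection can be sketched as follows, assuming each "block of pointers" is modeled as a Python list. The function name and the sample values are hypothetical.

```python
def build_indirect_index(blocks):
    """Return sorted (K(i), pointer_block) pairs for a non-key indexing field.

    blocks[b] lists the indexing-field value of each record in data block b.
    Each pointer_block plays the role of the block of pointers that P(i)
    refers to; a real system would chain blocks when the list overflows.
    """
    pointer_blocks = {}
    for block_no, values in enumerate(blocks):
        for value in values:
            block_pointers = pointer_blocks.setdefault(value, [])
            if block_no not in block_pointers:
                block_pointers.append(block_no)  # one pointer per distinct data block
    return sorted(pointer_blocks.items())

sample_blocks = [[3, 5, 1], [6, 2, 3], [4, 6, 3]]  # hypothetical field values
indirect_index = build_indirect_index(sample_blocks)
print(indirect_index)
```

The index entries themselves stay fixed-length (one value and one pointer), which keeps binary search on the index simple; the variable-length part is pushed into the pointer blocks.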

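[Figure: secondary index on a non-key field using one level of indirection. Each index entry (indexing field values 1 through 8) points to a block of pointers, which in turn point to the data blocks containing records with that field value.]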

Multilevel indexing
The indexing techniques discussed so far assume a sorted index structure. This allows us to use a binary search to locate a pointer to the disk block containing the designated record. A binary search requires about log2 n block accesses for an index structure of n blocks, because at each stage the search space is divided by two.

Multilevel indexing
A multilevel index structure reduces the search space by a factor of bfr at each step. As a result, it requires about log_bfr n block accesses. Each block of a multilevel index is an ordered file of bfr entries, each entry holding a distinct value K(i) and a pointer to the lower-level block whose anchor record has that value.

Multilevel indexing
In other words, each block in the first level acts as a primary index for the data file: each block points to the bfr anchor records of bfr data blocks. At the second level, each block again acts as a primary index, this time on the first level: each block points to the bfr anchor records of bfr first-level blocks. This process can be repeated until we reach an index level with one block.

Multilevel indexing
The multilevel indexing scheme can be used on any type of index (primary, clustering, or secondary) as long as the first level has distinct values for K(i) and fixed-length entries.

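[Figure: a two-level primary index]
Second (top) level: 2, 35, 55, 85
First (base) level blocks: (2, 8, 15, 24), (35, 39, 44, 51), (55, 63, 71, 80), (85, ...)
Data file blocks: (2, 5), (8, 12), (15, 21), (24, 29), (35, 36), (39, 41), (44, 46), (51, 52), (55, 58), ...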

Example
A fixed-length, unspanned, ordered file with 30,000 records of size 100 bytes is stored on a disk with a block size of 1024 bytes. Suppose the dense secondary index of the previous example (a non-ordering key field of length 9 bytes and a block pointer of 6 bytes) is converted into a multilevel index.
Each index entry is 15 bytes and hence bfr = ⌊1024/15⌋ = 68.
The number of entries in the index file is 30,000 and hence the number of first-level blocks needed is: b_1 = ⌈30,000/68⌉ = 442
The number of second-level blocks is: b_2 = ⌈b_1/bfr⌉ = ⌈442/68⌉ = 7
The number of third-level blocks is: b_3 = ⌈b_2/bfr⌉ = ⌈7/68⌉ = 1
So the third level is the top level of the index structure. To access a record we need 3 + 1 = 4 block accesses. Compare this with the 10 accesses needed in the previous example.
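The level-by-level computation in this example can be reproduced with a few lines of Python. This is a sketch with a made-up function name; it repeats the ceiling division by bfr until a level fits in one block.

```python
import math

def multilevel_index_levels(first_level_entries, bfr):
    """Return the number of blocks at each index level, from the base level up."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / bfr)
        levels.append(blocks)
        if blocks == 1:          # a single block: this is the top level
            break
        entries = blocks         # the next level has one entry per block
    return levels

index_levels = multilevel_index_levels(30_000, 68)
block_accesses = len(index_levels) + 1   # one access per level plus the data block
print(index_levels, block_accesses)
```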

Multilevel indexing
In general, assume entry i at level j of the index is represented as <K_j(i), p_j(i)>, and that we search for a record whose primary key value is K (with no overflow records). If the record is in the file, then there will be some entry at level 1 with K_1(i) ≤ K < K_1(i+1). As a result, the record will be in the data file block whose address is p_1(i).

Algorithm: searching a multilevel index with t levels for a record with key K
p ← address of the top-level block of the index;
for j ← t step -1 to 1 do
begin
    read the index block (at the j-th index level) whose address is p;
    search block p for entry i such that K_j(i) ≤ K < K_j(i+1);
    p ← p_j(i)
end;
read the data file block whose address is p;
search block p for the record with key = K;
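The search algorithm can be tried out in Python under some simplifying assumptions: index and data blocks live in dictionaries keyed by a block address, and "reading a block" is a dictionary lookup. All names are hypothetical, and the sample data is a subset of the two-level index figure.

```python
from bisect import bisect_right

def search_multilevel(levels, data_blocks, top_address, key):
    """Return the address of the data block holding `key`, or None.

    levels[0] is the top index level and levels[-1] is level 1; each level
    maps a block address to a sorted list of (K_j(i), p_j(i)) entries.
    """
    p = top_address
    for level in levels:                      # from level t down to level 1
        block = level[p]                      # "read" the index block at address p
        block_keys = [k for k, _ in block]
        i = bisect_right(block_keys, key) - 1 # entry with K_j(i) <= key < K_j(i+1)
        if i < 0:
            return None                       # key precedes every anchor
        p = block[i][1]
    return p if key in data_blocks[p] else None

top_level = {0: [(2, 0), (35, 1)]}
first_level = {
    0: [(2, 0), (8, 1), (15, 2), (24, 3)],
    1: [(35, 4), (39, 5), (44, 6), (51, 7)],
}
file_blocks = {0: [2, 5], 1: [8, 12], 2: [15, 21], 3: [24, 29],
               4: [35, 36], 5: [39, 41], 6: [44, 46], 7: [51, 52]}

print(search_multilevel([top_level, first_level], file_blocks, 0, 41))
```

Each loop iteration corresponds to one block access, so the total cost is the number of levels plus one access for the data block.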

Dynamic Multilevel Indexes
Multilevel indexing reduces the number of disk accesses, but insertion and deletion operations are still problematic. To keep the advantages of multilevel indexing while reducing the complexity of insertion and deletion, we now discuss a multilevel index that leaves some space in each of its blocks to facilitate insertion and deletion: the dynamic multilevel index.

Tree Data Structure
A tree is a collection of data elements called nodes. The root node is the node with no incoming arcs. Each node except the root node has one parent, and every node has zero or more child nodes. A node without child nodes is called a leaf node.

Tree Data Structure
The level of a node is one more than the level of its parent, with the level of the root node being zero. A sub-tree of a node consists of that node and all of its descendant nodes. A node of a tree is usually represented by some data values and a group of pointers to its children.

[Figure: an example tree with nodes A through K, marking the root node A (level zero), a node at level 1, the sub-tree for node B, and the leaf nodes.]

Tree Data Structure
Search Tree: a search tree of order p is a tree in which each node contains at most p-1 search values and p pointers in the order
<P_1, K_1, P_2, K_2, ..., P_{q-1}, K_{q-1}, P_q>
where q ≤ p, each K_i is a search value from some ordered set of values, and each P_i is a pointer to a child node.

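[Figure: a search tree node <P_1, K_1, ..., K_{i-1}, P_i, K_i, ..., K_{q-1}, P_q>. The sub-tree under P_1 holds values X < K_1, the sub-tree under P_i holds values K_{i-1} < X < K_i, and the sub-tree under P_q holds values K_{q-1} < X.]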

Search Tree
Within each node, K_1 < K_2 < ... < K_{q-1}.
For all values X in the sub-tree pointed at by P_i, we have:
K_{i-1} < X < K_i for 1 < i < q,
X < K_1 for i = 1,
K_{q-1} < X for i = q.
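These ordering constraints are what make a top-down search possible. The following Python sketch assumes a simple in-memory node with a key list and a child list (None standing for a null tree pointer); the class and function names are hypothetical, and the sample tree is one plausible arrangement of the values shown in the search tree figure.

```python
from bisect import bisect_left

class SearchNode:
    def __init__(self, keys, children=None):
        self.keys = keys                                   # K_1 < K_2 < ... < K_{q-1}
        self.children = children or [None] * (len(keys) + 1)  # P_1 ... P_q

def tree_search(node, x):
    """Return True if x is stored somewhere in the search tree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, x)       # number of keys smaller than x
        if i < len(node.keys) and node.keys[i] == x:
            return True
        # Here K_{i} < x < K_{i+1} (0-based), so follow the pointer between them.
        node = node.children[i]
    return False

# Assumed shape: root (5); children (3) and (6, 9); below them (1), (7, 8), (12).
root = SearchNode([5], [
    SearchNode([3], [SearchNode([1]), None]),
    SearchNode([6, 9], [None, SearchNode([7, 8]), SearchNode([12])]),
])

print(tree_search(root, 8), tree_search(root, 4))
```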

[Figure: an example search tree with node values 5; 3; 6, 9; 1; 7, 8; 12, distinguishing null tree pointers from tree node pointers.]

Search Tree
Insertions into and deletions from a search tree are problematic. Deletion could result in an almost empty node in the search tree. Insertion could result in an unbalanced tree.

B-Tree
A B-Tree is a search tree with additional constraints that remedy the issues related to insertion and deletion. As a result, the tree is always kept balanced, and the space wasted by deletion, if any, never becomes excessive. However, these advantages come at the expense of more complicated insertion and deletion procedures when inserting into an already full node, or when deleting from a node leaves it less than half full.

B-Tree
A B-Tree of order p, when used as an access structure on a key field (i.e., each key value is unique), is defined as follows:
Each internal node has the format
<P_1, <K_1, Pr_1>, P_2, <K_2, Pr_2>, ..., P_{q-1}, <K_{q-1}, Pr_{q-1}>, P_q>
where q ≤ p, each P_i is a tree pointer (a pointer to another node in the B-Tree), and each Pr_i is a data pointer (a pointer to a data block that contains the record with search key value equal to K_i).

For all search key field values X in the sub-tree pointed at by P_i, we have:
K_{i-1} < X < K_i for 1 < i < q,
X < K_1 for i = 1,
K_{q-1} < X for i = q.
Each node has at most p tree pointers.
Each node except the root and leaf nodes has at least ⌈p/2⌉ tree pointers.
The root has at least two tree pointers unless it is the only node in the tree.

A node with q tree pointers, q ≤ p, has q-1 search key field values.
Leaf nodes have the same structure as internal nodes, except that all of their tree pointers are nil, and they are all at the same level.

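[Figure: a B-Tree node <P_1, K_1, Pr_1, P_2, ..., K_{i-1}, Pr_{i-1}, P_i, ..., K_{q-1}, Pr_{q-1}, P_q>. Each Pr_i is a data pointer. The sub-tree under P_1 holds values X < K_1, the sub-tree under P_i holds values K_{i-1} < X < K_i, and the sub-tree under P_q holds values K_{q-1} < X.]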

[Figure: a B-Tree of order p = 3 after the values 8, 5, 1, 7, 3, 12, 9, 6 were inserted in that order: root (5, 8); leaf nodes (1, 3), (6, 7), (9, 12).]

Question
What if we use a B-Tree on a non-key field of a file, so that several records can have the same value for the search field? In this case each data pointer Pr_i, instead of pointing to a file block, points to a block (or a linked list of blocks) containing pointers to the file records themselves.

An example:
A B-Tree starts with a single root node at level 0. Once the root node is full, with p-1 search key values, and we attempt to insert another entry into it, the root node is split into two nodes at level 1. The middle value is kept in the root, and the rest of the values are split as evenly as possible and moved to the two new nodes.

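Insert 8 into an empty B-Tree: root (8)
Insert 5: root (5, 8)
Insert 1: root (5); leaf nodes (1), (8)

Insert 7: root (5); leaf nodes (1), (7, 8)
Insert 3: root (5); leaf nodes (1, 3), (7, 8)

Insert 12: root (5, 8); leaf nodes (1, 3), (7), (12)
Insert 9: root (5, 8); leaf nodes (1, 3), (7), (9, 12)

Insert 6: root (5, 8); leaf nodes (1, 3), (6, 7), (9, 12)
Insert 2?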

B-Tree
In general, when a non-root node is full and a new entry is inserted into it, that node is split into two nodes at the same level, and the middle entry is moved up to the parent node along with two pointers to the new nodes. If the parent is full, it is split in the same way. Splitting can propagate all the way to the root node, creating a new level if the root is split.
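The split-and-propagate behavior can be illustrated with a compact Python sketch. It is a simplified model, not a full B-Tree: data pointers are omitted, keys are assumed unique, and an overflowing node is split right after the new entry is placed in it. The class and function names are hypothetical.

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # an empty list marks a leaf node

class BTree:
    def __init__(self, order):
        self.order = order                # p: maximum number of tree pointers
        self.root = BTreeNode()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:             # the root was split: create a new level
            middle, right = split
            self.root = BTreeNode([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if node.children:                 # internal node: insert into the sub-tree
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            middle, right = split         # the child was split: absorb its middle key
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
        else:
            node.keys.insert(i, key)
        if len(node.keys) <= self.order - 1:
            return None                   # still at most p-1 keys: no overflow
        mid = len(node.keys) // 2         # middle entry moves up to the parent
        middle = node.keys[mid]
        right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return middle, right

def btree_levels(tree):
    """Return the key lists of all nodes, grouped level by level."""
    out, frontier = [], [tree.root]
    while frontier:
        out.append([n.keys for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out

tree = BTree(order=3)
for value in (8, 5, 1, 7, 3, 12, 9, 6):
    tree.insert(value)
print(btree_levels(tree))
```

Run with p = 3 and the insertion order 8, 5, 1, 7, 3, 12, 9, 6, the sketch ends with root (5, 8) and leaves (1, 3), (6, 7), (9, 12), the same tree as in the worked example.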

B-Tree
If deletion of a value leaves a node less than half full, the node may be combined with its neighbors. This can propagate all the way to the root. Hence, deletion can reduce the number of tree levels.

Example
Assume the search field is V = 9 bytes long, the disk block is B = 512 bytes, and a block pointer is P = 6 bytes. Each B-Tree node of order p has at most p tree pointers, p-1 data pointers, and p-1 search key field values. A node must fit into a disk block. Hence:
(p * 6) + (p-1) * (6 + 9) ≤ 512
21 * p ≤ 527
p = 25
In practice, however, a node usually holds some additional information, such as a pointer to the parent node and the number of entries in the node.

B+-Tree
Most implementations of a dynamic multilevel index use a variation of the B-Tree called the B+-Tree. In a B-Tree, every value of the search field appears once at some level in the tree, along with a data pointer. In a B+-Tree, data pointers are stored only at the leaf nodes. Hence, the structure of the leaf nodes differs from the structure of the internal nodes.

B+-Tree
Leaf nodes have an entry for each value of the search field, along with a pointer to the data block containing the corresponding record. For non-key search fields, the data pointer points to a block containing pointers to the data file records: an extra level of indirection. The leaf nodes of the B+-Tree are usually linked together to provide ordered access to the records on the search field.

B+-Tree
Each internal node of a B+-Tree of order p is of the form
<P_1, K_1, P_2, K_2, ..., P_{q-1}, K_{q-1}, P_q>
where q ≤ p and each P_i is a tree pointer.
Within each internal node, K_1 < K_2 < ... < K_{q-1}.
For all search field values X in the sub-tree pointed at by P_i, we have:
K_{i-1} < X ≤ K_i for 1 < i < q,
X ≤ K_i for i = 1,
K_{i-1} < X for i = q.

B+-Tree
Each internal node has at most p tree pointers.
Each internal node, except the root, has at least ⌈p/2⌉ tree pointers.
The root has at least two tree pointers if it is an internal node.
An internal node with q pointers, q ≤ p, has q-1 search field values.

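[Figure: an internal B+-Tree node <P_1, K_1, P_2, K_2, ..., K_{i-1}, P_i, K_i, ..., K_{q-1}, P_q>, where every P_i is a tree pointer. The sub-tree under P_1 holds values X ≤ K_1, the sub-tree under P_i holds values K_{i-1} < X ≤ K_i, and the sub-tree under P_q holds values K_{q-1} < X.]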

B+-Tree
Each leaf node of a B+-Tree of order p is of the form
<<K_1, Pr_1>, <K_2, Pr_2>, ..., <K_{q-1}, Pr_{q-1}>, P_next>
where q ≤ p, each Pr_i is a data pointer (a pointer to a data block that contains a record with search key value equal to K_i), and P_next points to the next leaf node of the B+-Tree.

B+-Tree
Within each leaf node, K_1 < K_2 < ... < K_{q-1}, where q ≤ p.
Each Pr_i is a data pointer to the file block containing the record whose search field value is K_i (or to a block of record pointers that point to the records whose search field value is K_i).
Each leaf node has at least ⌊p/2⌋ values.
All leaf nodes are at the same level.

B+-Tree Example
Suppose the search field is V = 9 bytes long, the block size is B = 512 bytes, and a block pointer is 6 bytes long. An internal node of a B+-Tree can have up to p tree pointers and p-1 search field values, all of which must fit into a single block. Hence:
(p * 6) + ((p-1) * 9) ≤ 512
15 * p ≤ 521
p = 34

B+-Tree Example
Comparing this with the previous example (B-Tree), one can conclude that a B+-Tree offers a larger fan-out, and hence the depth of a B+-Tree is less than the depth of a comparable B-Tree. As in a B-Tree, a node in a B+-Tree may need additional information, such as the number of entries in the node and pointers to the parent and sibling nodes.
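Both order calculations follow the same pattern and can be checked with one small Python function. The function name and parameter names are hypothetical; the node is assumed to hold p tree pointers plus p-1 entries, where an entry is a key alone (B+-Tree internal node) or a key with a data pointer (B-Tree node).

```python
def max_order(block_size, tree_pointer_size, key_size, data_pointer_size=0):
    """Largest p with p*tree_pointer + (p-1)*(key + data_pointer) <= block_size."""
    entry_size = key_size + data_pointer_size
    # Rearranged: p * (tree_pointer + entry) <= block_size + entry
    return (block_size + entry_size) // (tree_pointer_size + entry_size)

b_tree_order = max_order(512, 6, 9, data_pointer_size=6)   # 21p <= 527
b_plus_tree_order = max_order(512, 6, 9)                   # 15p <= 521
print(b_tree_order, b_plus_tree_order)
```

The difference between the two results comes entirely from dropping the data pointers in the internal nodes.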

Insertion
Assume a B+-Tree of order p = 3. The keys 8, 5, 1, 7, 3, 12, 9, 6 are inserted, in that order, into an empty B+-Tree. Show a snapshot of the tree after each insertion.
[Figure legend: data pointer, nil pointer, tree node pointer.]

Insert 8 into an empty B+-Tree: root (8); the root is a leaf node.
Insert 5: root (5, 8); the root is a leaf node.
Insert 1: overflow, new level: root (5); leaf nodes (1, 5), (8)

When a leaf node is full and a new entry is inserted, the node overflows and must be split. The first j = ⌈p/2⌉ entries are kept in the original node, and the remaining entries are moved to a new leaf node. The j-th search value is replicated in the parent internal node, and a tree pointer to the new node is created in the parent.

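Insert 7: root (5); leaf nodes (1, 5), (7, 8)
Insert 3: root (3, 5); leaf nodes (1, 3), (5), (7, 8)

Insert 12: root (5); internal nodes (3), (8); leaf nodes (1, 3), (5), (7, 8), (12)

Insert 9: root (5); internal nodes (3), (8); leaf nodes (1, 3), (5), (7, 8), (9, 12)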

If the parent node is full, it must be split as well. The entries in the internal node up to P_j, the j-th tree pointer with j = ⌈p/2⌉, are kept in the same node, while the j-th search value is moved to the parent, not replicated. A new internal node holds the entries from P_{j+1} to the end of the entries in the node.

Insert 6: root (5); internal nodes (3), (7, 8); leaf nodes (1, 3), (5), (6, 7), (8), (9, 12)
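The two split rules (replicate the j-th value for a leaf, move it up for an internal node) can be exercised with a short Python sketch. It is a simplified model under stated assumptions: data pointers and the P_next leaf links are omitted, keys are unique, and j = ⌈p/2⌉ is used for both kinds of split. All names are hypothetical.

```python
from bisect import bisect_left
from math import ceil

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children          # None marks a leaf node

def bplus_insert(root, key, p):
    """Insert key into the B+-Tree rooted at root; return the (possibly new) root."""
    split = _insert(root, key, p)
    if split is None:
        return root
    up_key, right = split
    return Node([up_key], [root, right])  # the overflow reached the root: new level

def _insert(node, key, p):
    i = bisect_left(node.keys, key)       # values X <= K_i follow tree pointer P_i
    j = ceil(p / 2)
    if node.children is None:             # leaf node
        node.keys.insert(i, key)
        if len(node.keys) < p:            # a leaf holds at most p-1 values
            return None
        right = Node(node.keys[j:])       # remaining entries go to a new leaf
        node.keys = node.keys[:j]         # the first j entries stay
        return node.keys[-1], right       # j-th value is replicated in the parent
    split = _insert(node.children[i], key, p)
    if split is None:
        return None
    up_key, child = split
    node.keys.insert(i, up_key)
    node.children.insert(i + 1, child)
    if len(node.children) <= p:           # at most p tree pointers fit
        return None
    up_key = node.keys[j - 1]             # j-th value moves up, not replicated
    right = Node(node.keys[j:], node.children[j:])
    node.keys, node.children = node.keys[:j - 1], node.children[:j]
    return up_key, right

def bplus_levels(root):
    """Return the key lists of all nodes, grouped level by level."""
    out, frontier = [], [root]
    while frontier:
        out.append([n.keys for n in frontier])
        frontier = [c for n in frontier if n.children for c in n.children]
    return out

bplus_root = Node([])
for value in (8, 5, 1, 7, 3, 12, 9, 6):
    bplus_root = bplus_insert(bplus_root, value, p=3)
print(bplus_levels(bplus_root))
```

With p = 3 and the insertion order of the example, the sketch ends with root (5), internal nodes (3) and (7, 8), and leaves (1, 3), (5), (6, 7), (8), (9, 12), matching the final snapshot.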

Deletion
Perform the following delete operations on the following B+-Tree:
root (7); internal nodes (1, 6), (9); leaf nodes (1), (5, 6), (7), (8, 9), (12)

Delete 5: root (7); internal nodes (1, 6), (9); leaf nodes (1), (6), (7), (8, 9), (12)

In a delete operation, the entry is always removed from the leaf node. If it happens to appear in an internal node as well, it must be removed from there too.

Deletion may cause underflow. In this case, we try to find a sibling leaf node that is more than half full. If there is such a sibling, we redistribute the search values so that both nodes are at least half full. Otherwise, the node is merged with one of its siblings. Usually we try the sibling to the left first; if that fails, we try the sibling to the right; if that also fails, the three nodes are merged into two nodes.

Delete 12: this causes an underflow, so the entries are redistributed.
root (7); internal nodes (1, 6), (8); leaf nodes (1), (6), (7), (8), (9)

Delete 9: this causes an underflow.
[Figure: the B+-Tree after deleting 9; node values shown: 1, 6; 1, 6, 7, 8.]