B-Trees & its Variants

Size: px
Start display at page:

Download "B-Trees & its Variants"

Transcription

1 B-Trees & its Variants Advanced Data Structure Spring 2007 Zareen Alamgir

2 Motivation Yet another Tree! Why do we need another Tree-Structure?

3 Data Retrieval from External Storage In database programs, the data is too large to fit in memory, therefore, it is stored on secondary storage (disks or tapes). Disk access is very expensive, the disk I/O operation takes milliseconds while CPU processes data on the order of nanoseconds, one million times faster. When dealing with external storage the disk accesses dominate the running time.

4 Balanced Binary Search Trees Balanced binary search trees (AVL & Red-Black) have good performance if the entire data can fit in the main memory. These trees are not optimized for external storage and require many disk accesses, thus give poor performance for very large data.

5 Reduce Disk Accesses Data is transfer to and from the disk in block (typically block are of 512, 2048, 4096 or 8192 bytes). To reduce disk accesses Store multiple records in a block on the disk. Reduce tree height by increasing the number of children of a node. To achieve above goals we use Multiway (m-way) search tree, which is a generalization of BST, binary search tree

6 Multiway(m-way) Search Trees

7 Multiway(m-way) Search Trees In an m-way tree all the nodes have degree m. Each node has the following structure: P 1 K 1 P 2 K i-1 P i K i K q-1 P q keys<k 1 K i-1 <keys<k i K q-1 <keys K i are the keys, where 1 i q-1, q<=m P i are pointers to subtrees, where 1 i q, q<=m The keys in each node are in ascending order K 1 K 2... K i The key K i is larger than keys in subtree pointed by P i and smaller than keys in subtree pointed by P i+1. The subtrees are the m-way trees.

8 Multiway(m-way) Search Trees M-way tree is a generalization of BST, its working, benefits & issues are same. Benefits Fast information retrieval. Fast update Problems The tree is not balanced. Leaf nodes are on different levels. Bad space usage, tree can become skew M-way tree

9 B-Trees

10 B-Trees B-Tree is a balanced m-way tree that is tuned to minimize disk accesses. The node size of B-Tree is usually equal to the disk block size and the number of keys in a node depends on Key size Disk block size Data organization (either key or entire data record is store in a node) Access paths from root to leaf nodes are small

11 B-Tree: Definition A B-tree of order m has following properties The root has at least two subtrees unless it is a leaf. Each non-root and each non-leaf node holds q-1 keys and q pointers to subtrees, where m / 2 q m. Each leaf node holds q-1 keys where m / 2 q m. All lea are on the same level. It is clear that B-tree is always at least half full, has fewer levels and is perfectly balanced. Fr P 1 K 1 P 2 K i-1 P i K i K q-1 P q keys<k 1 K i-1 <keys<k i K q-1 <keys

12 B-Tree B-Tree can have a field KeyTally, KT, to indicate the number of keys currently stored in the node. B-Tree node usually contains key and data pointer pair. The data pointer points to the data record which is not stored in the node, with this scheme we can pack more keys & pointers in a B-Tree node. KT P 1 K 1 D 1 K i-1 DK i-1 i-1 P i K i D i K q-1 KD q-1 q-1 P q keys<k 1 Data pointer K i-1 <keys<k i Data pointer K q-1 <keys

13 Height of B-Tree Height of the B-Tree with n keys is important as it bound the number of disk accesses. The height of the tree is maximum when each node has minimum number of the subtree pointers, q = m / 2.

14 Height of B-Tree The height of B-tree is maximum if all nodes have minimum number of keys. 1 key in the root + 2(q-1) keys on the second level + + 2q h-2 (q-1) keys in the leaves (level h). 1+ 2(q -1) + 2q(q -1) + + 2q h 2 i = 1+ ( q 1) 2q i= 0 Applying the formula of h 1 q 1 = 1+ 2( q 1) q 1 = 1+ 2q Thus, the number of n 1+ 2q h log q h 1 h 1 n h-2 (q -1) geometric progression keys in B - Tree of height h is given as :

15 Height of B-Tree The height of B-tree is minimum if all nodes are full, thus we have m-1 keys in the root + m(m-1) keys on the second level + + m h-1 (m-1) keys in the leaf nodes (m -1) + m(m -1) + m = Applying the formula of h m 1 = ( m 1) m 1 Thus, the number of h log log h 1 i= 0 = m h n m m ( m 1) m 1 h 1 m ( n + 1) = ( m 1) ( n + 1) h log i q 2 (m -1) + + m h 1 i= 0 m geometric progression keys in B - Tree of n i h-1 (m -1) height h is given as :

16 Height of B-Tree If number of nodes in B-tree equal 2,000,000 (2 million) and m=200 then maximum height of B-tree is 3, where as the binary tree would be of height 20. Note: Order m is chosen so that B-tree node size is nearly equal to the disk block size.

17 Search in a B-Tree Search in a B-tree is similar to the search in BST except that in B-tree we make a multiway branching decision instead of binary branching in BST Search key 71

18 B-Tree Insert Operation Insertion in B-tree is more complicated than in BST. In BST, the keys are added in top down fashion resulting in an unbalanced tree. B-tree is built bottom up, the keys are added in the leaf node, if the leaf node is full another node is created, keys are evenly distributed and middle key is promoted to the parent. If parent is full, the process is repeated. B-tree can also be built in top down fashion using pre-splitting technique.

19 Basic Idea Find position for the key in the appropriate leaf node Insert key in order and adjust pointer No Is node full? yes Split node: Create a new node Move half of the keys from the full node to the new node and adjust pointers Promote the median key (before split) to the parent Split guarantees that each node has m / 2 1 keys. If parent is full

20 Cases in B-Tree Insert Operation In B-tree insertion we have the following cases: Case 1: The leaf node has room for the new key. Case 2: The leaf in which key is to be placed is full. This case can lead to the increase in tree height. Now we explain these cases in detail.

21 B-Tree Insert Operation Case 1: The leaf node has room for the new key. Insert Find appropriate leaf node for key Insert 3 in order

22 B-Tree Insert Operation Case 2: The leaf in which key is to be placed is full. Insert 16 Find appropriate leaf node for key No room for key 16 in leaf node Insert key 19 in parent node in order Move median key 19 up and Split node: create a new node and move keys to the new node

23 B-Tree Insert Operation Case 2: The leaf in which key is to be placed is full and this lead to the increase in tree height

24 B-Tree Insert Operation Case 2: The height of the tree increases. Insert 16 Insert 27 in parent in order 55 No room for 27 in parent, Split node No room for 19 in parent, Split parent node Insert 19 in parent node in order No room for key 16, Move median key 19 up & Split node

25 B-Tree Delete Operation Deletion is analogous to insertion, but a little more complicated. Two major cases Case 1: Deletion from leaf node Case 2: Deletion from nonleaf node Apply delete by copy technique used in BST, this will reduce this case to case 1. In delete by copy, the key to be deleted is replaced by the largest key in the right subtree or smallest in left subtree (which is always a leaf).

26 B-Tree Delete Operation Leaf node deletion cases: After deletion node is at least half full. After deletion underflow occurs Redistribute: if number of keys in siblings > 2 1. m Merge nodes if number of keys in siblings <. Merging leads to decrease in tree height. m 2 1

27 B-Tree Delete Operation After deletion node is at least half full. (inverse of insertion case 1) Search key Key found, delete key 3. Move others keys in the node to eliminate the gap.

28 B-Tree Delete Operation Underflow occurs, evenly redistribute the keys if left or right sibling has keys > m / 2. 1 Delete 14 Search key Underflow occurs, evenly redistribute keys in the underflow node, in its sibling and the separator key.

29 B-Tree Delete Operation Underflow occurs and the keys in the left & right sibling are = m / 2 1. Merge the underflow node and a sibling. Delete Move separator key down. Move the keys to underflow node and discard the sibling Underflow occurs, merge nodes.

30 B-Tree Delete Operation Underflow occurs, height decreases after merging. Delete 21 Underflow occurs, merge nodes Underflow occurs, merge nodes by moving separator key and the keys in sibling node to the underflow node.

31 Issues in B-tree In B-tree, accessing data in sequential order is not efficient. In-order traversal of the B-tree requires many disk accesses. In B-tree, data pointers are stored in each node, thus resulting in less subtree pointers per node and more tree levels. KT P 1 K 1 D 1 K i-1 DK i-1 P i K i D i K q-1 free KD q-1 space q-1 P q keys<k 1 Data pointer K i-1 <keys<k i Data pointer K q-1 <keys

32 B+-Trees

33 B+ -Tree- Variant of B- Tree Resolves the issues in B-tree. In B+ -tree Pointers to data is stored only in leaf nodes. Internal nodes contain only keys and subtree pointers Can accommodate more keys in internal nodes. Less disk accesses due to fewer levels in the tree. B+ -tree provides faster sequential access of data.

34 B+-Tree Structure B+-tree consist of two parts Index Set Provides indexes for fast access of data. Consist of internal nodes that store only key & subtree pointers. Sequence Set Consist of leaf nodes that contain data pointers. Provide efficient sequential access of data (using doubly linked list). Index Set Sequential Search Sequence Set

35 B+-Tree: Index Node Structure The basic structure of the B+-tree index node of order m is same as that of B-tree node of order m. The root has at least two subtrees unless it is a leaf. Each non-root index node holds q-1 keys and q subtree pointers, where m / 2 q m. Only difference is that index node do not contain data pointers. Fr P 1 K 1 P 2 K i-1 P i K i K q-1 P q keys<k 1 K i-1 <keys<k i K q-1 <keys

36 B+-Tree: Sequence Node Structure The structure of the B+-tree sequence node is as follows: K 1 D 1 K i-1 D i-1 K i D i K q-1 D q-1 Pointers to previous and next leaf node in tree Data pointer Data pointer Data pointer Data pointer

37 Search in a B+-Tree The search in B+-tree works similar to B-tree but it always ends at the leaf node because Pointer to the data is stored in the leaf node. Existence of the key in index set does not guarantee that the particular record is present in the tree. A key can occur multiple time in the index set (this does not create problem because the key in index set node act only as a separator) Note 62 is not present in the sequence set B+-tree of order 5

38 B+-Tree Insert Case: The leaf node has room for the key to be inserted. Insert 3 Find appropriate leaf node for key 3, and insert in order

39 B+-Tree Insert Operation Case 2: The leaf in which key is to be placed is full. Insert 16 Find appropriate leaf node for key No room for key 16, Split node: create a new node and move m / 2 keys to the new node. Insert a copy of the first key of the new node in the parent node in order Modify Sequence Set next node links

40 B+ -Tree Insert Operation Case: Only root node exists, and it is full. Insert 18 Find appropriate position for key 18 No room for key 18, Split node: create a new sequence set node and move m / 2 keys to the new node Create a new index set node and make it a root node. Insert the first key of the new sequence set node in the new root.

41 B+-Tree Delete Operation B+-tree deletion follows same rules as that of B- tree deletion but the separator in index set node is not removed when a key is deleted. Deletion cases: After deletion node is at least half full. After deletion underflow occurs Redistribute: if number of keys in siblings >. Merge nodes if number of keys in siblings <. Merging lead to decrease in tree height. m 2 1 m 2 1

42 B+-Tree Delete Case: After deletion node is at least half full. Delete Search key 14 Note: key 14 in the parent node is still a valid separator key. It is not modified Key found, delete key 14. Move others keys in the node to eliminate the gap.

43 B+-Tree Delete Operation Underflow occurs, evenly redistribute the keys if left or right sibling has keys > m / 2. 1 Delete 14 Search Separator key 14 key & delete in the parent it. is no longer valid Insert a copy of the first key of the sibling node in parent in order Underflow occurs, evenly redistribute keys in the underflow node and in the sibling node. Note: unlike B-tree, the separator key in the parent node is not included. Why?

44 B+-Tree Delete Operation Underflow occurs and the keys in the left & right sibling are = m / 2 1. Merge the underflow node and a sibling. Delete 32 Search key 32 & delete it Underflow occurs, merge nodes. Move keys in sibling to underflow node and discard the sibling node.

45 Efficient Sequential Access in B+-Tree For efficient sequential access, start from the beginning of the sequence set and traverse the sequence set using the next pointers in sequence set nodes Sequence Set Start

46 Comparison B-Tree & B+- Tree B-Tree Data pointers are stored in all nodes No redundant keys Search can end at any node Slow sequential access Higher trees B+-Tree Data pointers are stored only in leaf nodes (sequence set) Redundant keys may exist Search always ends at leaf node Efficient sequential access Flatter trees (no data pointers in index set nodes)

47 B* -Trees

48 B* -Tree -- Variant of B- Tree Each node of a B-tree represents a block of secondary memory, therefore, accessing a node is expensive operation. Thus, the fewer nodes that are created, the better. In B*-tree is a variant of B-tree introduced by Donald Knuth and named by Douglas Comer. In a B*-tree, all nodes except the root are required to be at least two-thirds full, not just half full as in B-tree. The number of keys in all non-root nodes in a B*-tree of order m 2m 1 is k, where k m 1. 3 Average utilization of B*-tree is 81%.

49 B* -Tree Insert Operation In B*-tree, the frequency of node splitting is decreased by delaying a split. A split in a B*-tree is delayed by attempting to redistribute the keys between node and its sibling when node overflows. In B*-tree split operation two nodes are split into three instead of one into two as in B-tree. All three nodes participating in the split are guaranteed to be twothirds full after split.

50 B*-Tree Insert Operation Overflow occurs, evenly redistribute the keys between node and its sibling. Insert Overflow occurs, evenly redistribute keys in the overflow node, in its sibling including the separator key in parent and new key.

51 B*-Tree Insert Operation Overflow occurs, sibling is full, split node. Insert Overflow occurs, sibling is full, split node

52 B* -Tree Delete Operation B*-tree deletion follows same rules as that of B-tree deletion. Deletion cases: After deletion node is at least two third full. After deletion underflow occurs 2m 1 3 2m 1 3 Redistribute: if number of keys in siblings >. Merge nodes if number of keys in siblings <. Examples of deletion are omitted as they are similar to B-tree deletion.

53 Questions?

54 References Data Structure and Algorithms in C++, Adam Drozek. Introduction to Algorithms, T.H.Cormen, C.E.Leiserson, R.L.Rivest, and C.Stein. Fundamentals of Database Systems, Elmasri Navathe. Fundamentals of Data Structures in C++, E.Horowitz, S.Sahni and D.Mehta The Ubiquitous B-Tree, DOUGLAS COMER, ACM Computing Surveys (CSUR), Volume 11,Issue 2(June 1979).

55 The End

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is? Multiway searching What do we do if the volume of data to be searched is too large to fit into main memory Search tree is stored on disk pages, and the pages required as comparisons proceed may not be

More information

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree. The Lecture Contains: Index structure Binary search tree (BST) B-tree B+-tree Order file:///c /Documents%20and%20Settings/iitkrana1/My%20Documents/Google%20Talk%20Received%20Files/ist_data/lecture13/13_1.htm[6/14/2012

More information

Background: disk access vs. main memory access (1/2)

Background: disk access vs. main memory access (1/2) 4.4 B-trees Disk access vs. main memory access: background B-tree concept Node structure Structural properties Insertion operation Deletion operation Running time 66 Background: disk access vs. main memory

More information

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree B-Trees Disk Storage What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree Deletion in a B-tree Disk Storage Data is stored on disk (i.e., secondary memory) in blocks. A block is

More information

Balanced Search Trees

Balanced Search Trees Balanced Search Trees Computer Science E-22 Harvard Extension School David G. Sullivan, Ph.D. Review: Balanced Trees A tree is balanced if, for each node, the node s subtrees have the same height or have

More information

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25 Multi-way Search Trees (Multi-way Search Trees) Data Structures and Programming Spring 2017 1 / 25 Multi-way Search Trees Each internal node of a multi-way search tree T: has at least two children contains

More information

Physical Level of Databases: B+-Trees

Physical Level of Databases: B+-Trees Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,

More information

CSE 530A. B+ Trees. Washington University Fall 2013

CSE 530A. B+ Trees. Washington University Fall 2013 CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key

More information

Material You Need to Know

Material You Need to Know Review Quiz 2 Material You Need to Know Normalization Storage and Disk File Layout Indexing B-trees and B+ Trees Extensible Hashing Linear Hashing Decomposition Goals: Lossless Joins, Dependency preservation

More information

What is a Multi-way tree?

What is a Multi-way tree? B-Tree Motivation for studying Multi-way and B-trees A disk access is very expensive compared to a typical computer instruction (mechanical limitations) -One disk access is worth about 200,000 instructions.

More information

Comp 335 File Structures. B - Trees

Comp 335 File Structures. B - Trees Comp 335 File Structures B - Trees Introduction Simple indexes provided a way to directly access a record in an entry sequenced file thereby decreasing the number of seeks to disk. WE ASSUMED THE INDEX

More information

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

B-Trees. Version of October 2, B-Trees Version of October 2, / 22 B-Trees Version of October 2, 2014 B-Trees Version of October 2, 2014 1 / 22 Motivation An AVL tree can be an excellent data structure for implementing dictionary search, insertion and deletion Each operation

More information

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University Extra: B+ Trees CS1: Java Programming Colorado State University Slides by Wim Bohm and Russ Wakefield 1 Motivations Many times you want to minimize the disk accesses while doing a search. A binary search

More information

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Large Trees 1 Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Trees can also be used to store indices of the collection

More information

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler Lecture 12: BT Trees Courtesy to Goodrich, Tamassia and Olga Veksler Instructor: Yuzhen Xie Outline B-tree Special case of multiway search trees used when data must be stored on the disk, i.e. too large

More information

CS350: Data Structures B-Trees

CS350: Data Structures B-Trees B-Trees James Moscola Department of Engineering & Computer Science York College of Pennsylvania James Moscola Introduction All of the data structures that we ve looked at thus far have been memory-based

More information

CS 350 : Data Structures B-Trees

CS 350 : Data Structures B-Trees CS 350 : Data Structures B-Trees David Babcock (courtesy of James Moscola) Department of Physical Sciences York College of Pennsylvania James Moscola Introduction All of the data structures that we ve

More information

Chapter 12: Indexing and Hashing (Cnt(

Chapter 12: Indexing and Hashing (Cnt( Chapter 12: Indexing and Hashing (Cnt( Cnt.) Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition

More information

amiri advanced databases '05

amiri advanced databases '05 More on indexing: B+ trees 1 Outline Motivation: Search example Cost of searching with and without indices B+ trees Definition and structure B+ tree operations Inserting Deleting 2 Dense ordered index

More information

Motivation for B-Trees

Motivation for B-Trees 1 Motivation for Assume that we use an AVL tree to store about 20 million records We end up with a very deep binary tree with lots of different disk accesses; log2 20,000,000 is about 24, so this takes

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,

More information

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss) M-ary Search Tree B-Trees (4.7 in Weiss) Maximum branching factor of M Tree with N values has height = # disk accesses for find: Runtime of find: 1/21/2011 1 1/21/2011 2 Solution: B-Trees specialized M-ary

More information

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive

More information

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height = M-ary Search Tree B-Trees Section 4.7 in Weiss Maximum branching factor of M Complete tree has height = # disk accesses for find: Runtime of find: 2 Solution: B-Trees specialized M-ary search trees Each

More information

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time B + -TREES MOTIVATION An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations finish within O(log N) time The theoretical conclusion

More information

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

2-3 Tree. Outline B-TREE. catch(...){ printf( Assignment::SolveProblem() AAAA!); } ADD SLIDES ON DISJOINT SETS Outline catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } Balanced Search Trees 2-3 Trees 2-3-4 Trees Slide 4 Why care about advanced implementations? Same entries, different insertion sequence:

More information

Multi-way Search Trees

Multi-way Search Trees Multi-way Search Trees Kuan-Yu Chen ( 陳冠宇 ) 2018/10/24 @ TR-212, NTUST Review Red-Black Trees Splay Trees Huffman Trees 2 Multi-way Search Trees. Every node in a binary search tree contains one value and

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 and External Memory 1 1 (2, 4) Trees: Generalization of BSTs Each internal node

More information

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Large Trees 1 Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Trees can also be used to store indices of the collection

More information

B-Trees. Introduction. Definitions

B-Trees. Introduction. Definitions 1 of 10 B-Trees Introduction A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node,

More information

Laboratory Module X B TREES

Laboratory Module X B TREES Purpose: Purpose 1... Purpose 2 Purpose 3. Laboratory Module X B TREES 1. Preparation Before Lab When working with large sets of data, it is often not possible or desirable to maintain the entire structure

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 B-Trees and External Memory 1 (2, 4) Trees: Generalization of BSTs Each internal

More information

The B-Tree. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland

The B-Tree. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland Yufei Tao ITEE University of Queensland Before ascending into d-dimensional space R d with d > 1, this lecture will focus on one-dimensional space, i.e., d = 1. We will review the B-tree, which is a fundamental

More information

Introduction to Indexing R-trees. Hong Kong University of Science and Technology

Introduction to Indexing R-trees. Hong Kong University of Science and Technology Introduction to Indexing R-trees Dimitris Papadias Hong Kong University of Science and Technology 1 Introduction to Indexing 1. Assume that you work in a government office, and you maintain the records

More information

Introduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana

Introduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Introduction to Indexing 2 Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Indexed Sequential Access Method We have seen that too small or too large an index (in other words too few or too

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,

More information

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree Chapter 4 Trees 2 Introduction for large input, even access time may be prohibitive we need data structures that exhibit running times closer to O(log N) binary search tree 3 Terminology recursive definition

More information

CS127: B-Trees. B-Trees

CS127: B-Trees. B-Trees CS127: B-Trees B-Trees 1 Data Layout on Disk Track: one ring Sector: one pie-shaped piece. Block: intersection of a track and a sector. Disk Based Dictionary Structures Use a disk-based method when the

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Algorithms. Deleting from Red-Black Trees B-Trees

Algorithms. Deleting from Red-Black Trees B-Trees Algorithms Deleting from Red-Black Trees B-Trees Recall the rules for BST deletion 1. If vertex to be deleted is a leaf, just delete it. 2. If vertex to be deleted has just one child, replace it with that

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives. CS 310 B-trees, Page 1 Motives Large-scale databases are stored in disks/hard drives. Disks are quite different from main memory. Data in a disk are accessed through a read-write head. To read a piece

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Design and Analysis of Algorithms Lecture- 9: B- Trees

Design and Analysis of Algorithms Lecture- 9: B- Trees Design and Analysis of Algorithms Lecture- 9: B- Trees Dr. Chung- Wen Albert Tsao atsao@svuca.edu www.408codingschool.com/cs502_algorithm 1/12/16 Slide Source: http://www.slideshare.net/anujmodi555/b-trees-in-data-structure

More information

CS F-11 B-Trees 1

CS F-11 B-Trees 1 CS673-2016F-11 B-Trees 1 11-0: Binary Search Trees Binary Tree data structure All values in left subtree< value stored in root All values in the right subtree>value stored in root 11-1: Generalizing BSTs

More information

Tree-Structured Indexes

Tree-Structured Indexes Introduction Tree-Structured Indexes Chapter 10 As for any index, 3 alternatives for data entries k*: Data record with key value k

More information

Main Memory and the CPU Cache

Main Memory and the CPU Cache Main Memory and the CPU Cache CPU cache Unrolled linked lists B Trees Our model of main memory and the cost of CPU operations has been intentionally simplistic The major focus has been on determining

More information

CSE 326: Data Structures B-Trees and B+ Trees

CSE 326: Data Structures B-Trees and B+ Trees Announcements (2/4/09) CSE 26: Data Structures B-Trees and B+ Trees Midterm on Friday Special office hour: 4:00-5:00 Thursday in Jaech Gallery (6 th floor of CSE building) This is in addition to my usual

More information

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation! Lecture 10: Multi-way Search Trees: intro to B-trees 2-3 trees 2-3-4 trees Multi-way Search Trees A node on an M-way search tree with M 1 distinct and ordered keys: k 1 < k 2 < k 3

More information

B-Trees. CS321 Spring 2014 Steve Cutchin

B-Trees. CS321 Spring 2014 Steve Cutchin B-Trees CS321 Spring 2014 Steve Cutchin Topics for Today HW #2 Once Over B Trees Questions PA #3 Expression Trees Balance Factor AVL Heights Data Structure Animations Graphs 2 B-Tree Motivation When data

More information

THE B+ TREE INDEX. CS 564- Spring ACKs: Jignesh Patel, AnHai Doan

THE B+ TREE INDEX. CS 564- Spring ACKs: Jignesh Patel, AnHai Doan THE B+ TREE INDEX CS 564- Spring 2018 ACKs: Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? The B+ tree index Basics Search/Insertion/Deletion Design & Cost 2 INDEX RECAP We have the following query:

More information

Lecture 3: B-Trees. October Lecture 3: B-Trees

Lecture 3: B-Trees. October Lecture 3: B-Trees October 2017 Remarks Search trees The dynamic set operations search, minimum, maximum, successor, predecessor, insert and del can be performed efficiently (in O(log n) time) if the search tree is balanced.

More information

Indexing and Hashing

Indexing and Hashing C H A P T E R 1 Indexing and Hashing Solutions to Practice Exercises 1.1 Reasons for not keeping several search indices include: a. Every index requires additional CPU time and disk I/O overhead during

More information

Tree-Structured Indexes

Tree-Structured Indexes Tree-Structured Indexes Chapter 9 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Introduction As for any index, 3 alternatives for data entries k*: ➀ Data record with key value k ➁

More information

In-Memory Searching. Linear Search. Binary Search. Binary Search Tree. k-d Tree. Hashing. Hash Collisions. Collision Strategies.

In-Memory Searching. Linear Search. Binary Search. Binary Search Tree. k-d Tree. Hashing. Hash Collisions. Collision Strategies. In-Memory Searching Linear Search Binary Search Binary Search Tree k-d Tree Hashing Hash Collisions Collision Strategies Chapter 4 Searching A second fundamental operation in Computer Science We review

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Database Systems. File Organization-2. A.R. Hurson 323 CS Building

Database Systems. File Organization-2. A.R. Hurson 323 CS Building File Organization-2 A.R. Hurson 323 CS Building Indexing schemes for Files The indexing is a technique in an attempt to reduce the number of accesses to the secondary storage in an information retrieval

More information

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli 2-3 and 2-3-4 Trees COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli Multi-Way Trees A binary search tree: One value in each node At most 2 children An M-way search tree: Between 1 to (M-1) values

More information

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees B-Trees AVL trees and other binary search trees are suitable for organizing data that is entirely contained within computer memory. When the amount of data is too large to fit entirely in memory, i.e.,

More information

B-Tree. CS127 TAs. ** the best data structure ever

B-Tree. CS127 TAs. ** the best data structure ever B-Tree CS127 TAs ** the best data structure ever Storage Types Cache Fastest/most costly; volatile; Main Memory Fast access; too small for entire db; volatile Disk Long-term storage of data; random access;

More information

Trees. (Trees) Data Structures and Programming Spring / 28

Trees. (Trees) Data Structures and Programming Spring / 28 Trees (Trees) Data Structures and Programming Spring 2018 1 / 28 Trees A tree is a collection of nodes, which can be empty (recursive definition) If not empty, a tree consists of a distinguished node r

More information

Intro to DB CHAPTER 12 INDEXING & HASHING

Intro to DB CHAPTER 12 INDEXING & HASHING Intro to DB CHAPTER 12 INDEXING & HASHING Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing

More information

Chapter 17 Indexing Structures for Files and Physical Database Design

Chapter 17 Indexing Structures for Files and Physical Database Design Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Indexing and Hashing

Indexing and Hashing C H A P T E R 1 Indexing and Hashing This chapter covers indexing techniques ranging from the most basic one to highly specialized ones. Due to the extensive use of indices in database systems, this chapter

More information

Introduction to File Structures

Introduction to File Structures 1 Introduction to File Structures Introduction to File Organization Data processing from a computer science perspective: Storage of data Organization of data Access to data This will be built on your knowledge

More information

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 2004 Goodrich, Tamassia (2,4) Trees 1 Multi-Way Search Tree A multi-way search tree is an ordered tree such that Each internal node has at least two children and stores d -1 key-element

More information

Spring 2017 B-TREES (LOOSELY BASED ON THE COW BOOK: CH. 10) 1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 1

Spring 2017 B-TREES (LOOSELY BASED ON THE COW BOOK: CH. 10) 1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 1 Spring 2017 B-TREES (LOOSELY BASED ON THE COW BOOK: CH. 10) 1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 1 Consider the following table: Motivation CREATE TABLE Tweets ( uniquemsgid INTEGER,

More information

UNIT III BALANCED SEARCH TREES AND INDEXING

UNIT III BALANCED SEARCH TREES AND INDEXING UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant

More information

Binary Search Trees. Analysis of Algorithms

Binary Search Trees. Analysis of Algorithms Binary Search Trees Analysis of Algorithms Binary Search Trees A BST is a binary tree in symmetric order 31 Each node has a key and every node s key is: 19 23 25 35 38 40 larger than all keys in its left

More information

CISC 235: Topic 4. Balanced Binary Search Trees

CISC 235: Topic 4. Balanced Binary Search Trees CISC 235: Topic 4 Balanced Binary Search Trees Outline Rationale and definitions Rotations AVL Trees, Red-Black, and AA-Trees Algorithms for searching, insertion, and deletion Analysis of complexity CISC

More information

Database System Concepts, 6 th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use

Database System Concepts, 6 th Ed. Silberschatz, Korth and Sudarshan See  for conditions on re-use Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files Static

More information

CS 234. Module 6. October 16, CS 234 Module 6 ADT Dictionary 1 / 33

CS 234. Module 6. October 16, CS 234 Module 6 ADT Dictionary 1 / 33 CS 234 Module 6 October 16, 2018 CS 234 Module 6 ADT Dictionary 1 / 33 Idea for an ADT Te ADT Dictionary stores pairs (key, element), were keys are distinct and elements can be any data. Notes: Tis is

More information

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University Trees Reading: Weiss, Chapter 4 1 Generic Rooted Trees 2 Terms Node, Edge Internal node Root Leaf Child Sibling Descendant Ancestor 3 Tree Representations n-ary trees Each internal node can have at most

More information

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010 Uses for About Binary January 31, 2010 Uses for About Binary Uses for Uses for About Basic Idea Implementing Binary Example: Expression Binary Search Uses for Uses for About Binary Uses for Storage Binary

More information

CS350: Data Structures Red-Black Trees

CS350: Data Structures Red-Black Trees Red-Black Trees James Moscola Department of Engineering & Computer Science York College of Pennsylvania James Moscola Red-Black Tree An alternative to AVL trees Insertion can be done in a bottom-up or

More information

Question Bank Subject: Advanced Data Structures Class: SE Computer

Question Bank Subject: Advanced Data Structures Class: SE Computer Question Bank Subject: Advanced Data Structures Class: SE Computer Question1: Write a non recursive pseudo code for post order traversal of binary tree Answer: Pseudo Code: 1. Push root into Stack_One.

More information

(2,4) Trees. 2/22/2006 (2,4) Trees 1

(2,4) Trees. 2/22/2006 (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 2/22/2006 (2,4) Trees 1 Outline and Reading Multi-way search tree ( 10.4.1) Definition Search (2,4) tree ( 10.4.2) Definition Search Insertion Deletion Comparison of dictionary

More information

Module 4: Tree-Structured Indexing

Module 4: Tree-Structured Indexing Module 4: Tree-Structured Indexing Module Outline 4.1 B + trees 4.2 Structure of B + trees 4.3 Operations on B + trees 4.4 Extensions 4.5 Generalized Access Path 4.6 ORACLE Clusters Web Forms Transaction

More information

Copyright 1998 by Addison-Wesley Publishing Company 147. Chapter 15. Stacks and Queues

Copyright 1998 by Addison-Wesley Publishing Company 147. Chapter 15. Stacks and Queues Copyright 1998 by Addison-Wesley Publishing Company 147 Chapter 15 Stacks and Queues Copyright 1998 by Addison-Wesley Publishing Company 148 tos (-1) B tos (1) A tos (0) A A tos (0) How the stack routines

More information

Search Trees - 1 Venkatanatha Sarma Y

Search Trees - 1 Venkatanatha Sarma Y Search Trees - 1 Lecture delivered by: Venkatanatha Sarma Y Assistant Professor MSRSAS-Bangalore 11 Objectives To introduce, discuss and analyse the different ways to realise balanced Binary Search Trees

More information

Search Trees. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees

Search Trees. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees Unit 9, Part 2 Search Trees Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees Search-tree property: for each node k: all nodes in k s left subtree are < k all nodes

More information

Multiway Search Trees

Multiway Search Trees Multiway Search Trees Intuitive Definition A multiway search tree is one with nodes that have two or more children. Within each node is stored a given key, which is associated to an item we wish to access

More information

CSE 373 OCTOBER 25 TH B-TREES

CSE 373 OCTOBER 25 TH B-TREES CSE 373 OCTOBER 25 TH S ASSORTED MINUTIAE Project 2 is due tonight Make canvas group submissions Load factor: total number of elements / current table size Can select any load factor (but since we don

More information

Trees. Eric McCreath

Trees. Eric McCreath Trees Eric McCreath 2 Overview In this lecture we will explore: general trees, binary trees, binary search trees, and AVL and B-Trees. 3 Trees Trees are recursive data structures. They are useful for:

More information

Database System Concepts, 5th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use

Database System Concepts, 5th Ed. Silberschatz, Korth and Sudarshan See   for conditions on re-use Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Multiway Search Trees. Multiway-Search Trees (cont d)

Multiway Search Trees. Multiway-Search Trees (cont d) Multiway Search Trees Each internal node v of a multi-way search tree T has at least two children contains d-1 items, where d is the number of children of v an item is of the form (k i,x i ) for 1 i d-1,

More information

Algorithms. AVL Tree

Algorithms. AVL Tree Algorithms AVL Tree Balanced binary tree The disadvantage of a binary search tree is that its height can be as large as N-1 This means that the time needed to perform insertion and deletion and many other

More information

8. Binary Search Tree

8. Binary Search Tree 8 Binary Search Tree Searching Basic Search Sequential Search : Unordered Lists Binary Search : Ordered Lists Tree Search Binary Search Tree Balanced Search Trees (Skipped) Sequential Search int Seq-Search

More information

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing 15-82 Advanced Topics in Database Systems Performance Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries 2 Indexing

More information

CSCI Trees. Mark Redekopp David Kempe

CSCI Trees. Mark Redekopp David Kempe CSCI 104 2-3 Trees Mark Redekopp David Kempe Trees & Maps/Sets C++ STL "maps" and "sets" use binary search trees internally to store their keys (and values) that can grow or contract as needed This allows

More information

Red-black trees (19.5), B-trees (19.8), trees

Red-black trees (19.5), B-trees (19.8), trees Red-black trees (19.5), B-trees (19.8), 2-3-4 trees Red-black trees A red-black tree is a balanced BST It has a more complicated invariant than an AVL tree: Each node is coloured red or black A red node

More information

Multi-Way Search Trees

Multi-Way Search Trees Multi-Way Search Trees Manolis Koubarakis 1 Multi-Way Search Trees Multi-way trees are trees such that each internal node can have many children. Let us assume that the entries we store in a search tree

More information

Augmenting Data Structures

Augmenting Data Structures Augmenting Data Structures [Not in G &T Text. In CLRS chapter 14.] An AVL tree by itself is not very useful. To support more useful queries we need more structure. General Definition: An augmented data

More information

Remember. 376a. Database Design. Also. B + tree reminders. Algorithms for B + trees. Remember

Remember. 376a. Database Design. Also. B + tree reminders. Algorithms for B + trees. Remember 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 14 B + trees, multi-key indices, partitioned hashing and grid files B and B + -trees are used one implementation

More information

Cpt S 122 Data Structures. Data Structures Trees

Cpt S 122 Data Structures. Data Structures Trees Cpt S 122 Data Structures Data Structures Trees Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University Motivation Trees are one of the most important and extensively

More information

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Indexing Week 14, Spring 2005 Edited by M. Naci Akkøk, 5.3.2004, 3.3.2005 Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Overview Conventional indexes B-trees Hashing schemes

More information

Chapter 18 Indexing Structures for Files. Indexes as Access Paths

Chapter 18 Indexing Structures for Files. Indexes as Access Paths Chapter 18 Indexing Structures for Files Indexes as Access Paths A single-level index is an auxiliary file that makes it more efficient to search for a record in the data file. The index is usually specified

More information

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler Lecture 11: Multiway and (2,4) Trees 9 2 5 7 10 14 Courtesy to Goodrich, Tamassia and Olga Veksler Instructor: Yuzhen Xie Outline Multiway Seach Tree: a new type of search trees: for ordered d dictionary

More information