Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Size: px
Start display at page:

Download "Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file."

Transcription

1 Large Trees 1 Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Trees can also be used to store indices of the collection of records in a file. In either case, if the collection of records is quite large, the tree may be so large that it is unacceptable to store it all in memory at once. or example, if we have a database file holding 2 30 records, and each index entry requires 8 bytes of storage, a BST holding the index would require 2 30 nodes, each taking 16 bytes of memory (assuming 32-bit pointers), or 16 B of memory. An alternative would be to store the entire tree in a file on disk, and only load the immediately relevant portions of it into memory isk Representation 2 It is a relatively simple matter to write any binary tree to a disk file, by representing each tree node by a data record that holds the data element and two file offsets specifying the locations of the children, if any of that node. B C The nodes don't need to be stored in any particular order. NULL pointers may be represented by a negative offset ata B C lchild rchild

2 isk Representation 3 The problem is that this disk representation will require too many individual disk accesses when processing a typical tree operation, such as a search or a traversal. Why? These tree operations typically require transiting from a node to one or both of its children. But there's no reason that the child nodes will be stored anywhere near the parent node (although we could at least guarantee that siblings are adjacent). Since each node stores only one data value, and the nodes we might well perform one disk access for every node that is accessed during the tree operation. iven the extremely slow nature of disk access, this is unacceptable ata B C lchild rchild Trees 4 These difficulties can be moderated by allowing tree nodes to have more than two children to store more than a single value. In a 2-3 tree: - each node contains either one or two key values - every internal node either holds one key value and has two children, or holds two key values and has three children - every leaf is at the same level in the tree - there is a BST-like arrangement of values for efficient searching Compared to a BST: trees have lower update cost on average - a 2-3 tree storing N keys will be shallower than the corresponding BST, but - search in a 2-3 tree has essentially the same cost as in a BST

3 2-3 Tree Properties A 2-3 tree: ach node can store up to two key values and up to three pointers. The left child holds values less than the first key. The middle child holds values greater than the first key and less than the second key. The right child holds values greater than the second key. 2-3 Tree Insertion Insertion into a 2-3 tree is similar to a BST. irst find the appropriate leaf Inserting 14 leaf is not full, so just insert new key, shifting old key if necesssary. Inserting 55 leaf is full, but parent is not, so create a new right child, promoting median key from the children to the parent

4 2-3 Tree Splitting If a subtree is sufficiently full, insertion may cause the parent to split: 7 Inserting 19 leaf is full, and so is the parent Median value is passed up to the parent node which has now acquired another child Tree Splitting the Root In the worst case, the tree root splits, increasing the height of the tree: 8 Inserting 19 caused an internal node to split and a displaced value to be passed up. ere, the parent is the tree root and is already full so it splits also

5 2-3 Tree eletion 9 eletion from a 2-3 tree is, of course, essentially just the inverse of insertion. As with the BST, there are cases: eletion of a value that lies in an internal node: - replace the value with its closest successor (which must lie in a leaf) - delete the successor from the leaf eletion of a value that lies in a leaf: - if the leaf is still non-empty, nothing else needs to be done - if the leaf is now empty, the parent has an empty child, which is not allowed, and merely deleting the leaf would leave the parent with too many values - if the left or right sibling has two values, we can borrow and rearrange values - if not, we merge the leaf and its sibling (if any), demoting a value from the parent etails are interesting some are addressed on the following slides. eletion from a Leaf I Consider deletion of the value 24: This would leave the middle leaf empty, which is not allowed. owever, the left sibling contains an extra value. We may manage the deletion by demoting and borrowing : We move the separator value between the siblings to the leaf, and move the appropriate value from the sibling to the parent

6 eletion from a Leaf II Consider deletion of the value 24: This would also leave the middle leaf empty, but neither sibling has any values to spare. So, again we demote and borrow, but we also delete an empty leaf We pick a sibling, move the separating value between the empty leaf and the sibling, and the sibling s value both into the empty leaf, and delete the sibling eletion from a Leaf III 12 Consider deletion of the value 24: This would also leave the middle leaf empty, but neither sibling has any values to spare, nor can we delete either child without leaving the parent with a single child, which is unacceptable we can still demote a value from the parent But, now the parent has too few values (underflow), and we must handle that recursively by considering the siblings and parent of that node 20 23

7 trees illustrate the basic concepts for the B-tree family. In a B-tree of order m: - the root has at least two subtrees, unless it is a leaf - each nonroot and nonleaf node holds k 1 data values, and k pointers to subtrees, where m / 2 k m - each leaf node holds k 1 data values, where m / 2 k m - all leaves are on the same level - the data values in each node are in ascending order - for all i, the data values in the first i children are less than the i-th data value - for all i, the data values in the last m i children are larger than the i-th data value So, a B-tree is always at least half full, has a relatively small number of levels, and is perfectly balanced. Typically, m will be fairly large. B Tree xample 14 A B-tree of order 5: L O S W A B C I J K M N P Q R T U V X Y Z Since a binary search may be applied to the data values in each node, searching is highly efficient.

8 B Tree Insertion 15 Insertion follows the same logic as in the 2-3 tree, with the complication that we must make nodes obey the more complex restriction where k is the number of children the node has. m / 2 k m The basic idea is the same: search for the appropriate leaf, add the new value, then split and promote as necessary. or instance, inserting the values W and then X into the B tree at right would cause the right-most leaf to split and the value V to be promoted to the root. Then, inserting the value Z would cause the root to split, and the value Q to be promoted to a new root node. A K Q M O P S T V Insertion xample I 16 K Q Inserting W just fills the leaf. A M O P S T V W Inserting X causes the leaf to overflow. So, we split the leaf and promote the median value, which is V, up one level. K Q V The node there had room for the new value, so no further splitting occurs. A M O P S T W X

9 Insertion xample II 17 K Q V A M O P S T Inserting I just fills the second leaf. Then, inserting J causes the leaf to overflow. So, we split the leaf and promote the median value, which is, up one level. Now, the parent is full, and so splitting proceeds W X K Q V A I M O P S T W X Insertion xample III Splitting the root sends K up: 18 K Q V A I J M O P S T W X So, the B-tree grows by pushing up a new root, which keeps all leaves at the same level. As you can see here, the root must be an exception to the requirement that each node contains a minimum number of data values, since root-splitting will naturally lead to a new root node holding only one value.

10 B Tree Insertion Algorithm 19 Insert(K) find a leaf, node, to accept K while ( true ) find the proper position for K within node if node is not full insert K into node return else split node into node1 and node2 distribute values evenly between node1 and node2 let K = middle value if node was the root create a new root as parent of node1 and node2 return else let node = parent of node rozdek, p 307 Search Cost 20 If we let q = m / 2 then we can derive an upper bound on the height of the B-tree with n nodes: n + 1 h log q This is very small. or example, if m = 200 and n = 2,000,000 then h <= 4. But don t get too excited by this. The cost of doing a binary search of the data values in a node would be at least log 2 (q), and if we do that at each level in the tree, the total cost would be n + 1 n + 1 Θ log( q) logq = Θ log 2 2 Keep in mind that the motivation is to find a tree structure that can be efficiently stored to disk, and matching the search cost of a perfectly balanced binary tree is a plus.

11 Cost of Splitting 21 It would seem that the primary concern about the cost of insertion would be the number of splits that must be performed (everything else is essentially analogous to BST insertion). It is possible to show that as n increases, the average probability of a split is approximated by 1 m / 2 1 So, for example, if m = 100 then the probability of a split is about 2%. That shouldn t be surprising. Splitting a node is fairly expensive since about half the data values in the node must be moved to a new location, but for typical B-trees it won t be required all that often. B Tree eletion 22 As with the 2-3 tree, deletion involves locating the value, replacing it if it s in an internal node, and deleting a value from a leaf. eleting a value from a leaf will require, in some cases, merging the leaf with a sibling and demoting a value from the parent node. etails are interesting

12 B Tree eletion Algorithm 23 elete(k) find the node, node, containing K if node is not a leaf find a leaf with the closest successor, S, of K copy S to replace K in node let node = leaf containing S delete S from node else delete K from node while ( true ) if node did not underflow return if node has a sibling with enough values redistribute values between node and the sibling return... rozdek, p 312 B Tree eletion Algorithm... else if node s parent is the root if the parent has only one key merge node, its sibling, and the root else merge node and its sibling return else merge node and its sibling let node = its parent 24 rozdek, p 312

13 B Tree Storage fficiency 25 In a B tree: - nodes are guaranteed to be (essentially) at least 50% full - node could also be only 50% full, wasting half the data space in the nodes - analysis and simulation indicates that in typical use a B tree will be about 70% full This expectation of wasted space is a motivation for some variants of the basic B tree. B* Trees 26 In B* trees: - all nodes except the root are required to be at least 2/3 full rather than 1/2 full - splitting transforms 2 nodes into 3, rather than 1 node into 2 - analysis indicates the average utilization of a B* tree will be about 81% - can be generalized to specify a fill factor of (n+1)/(n+2); a B n tree Knuth

14 B+ Trees 27 In B+ trees: - Internal nodes store only key values and pointers*. - All records, or pointers to records, are stored in leaves. - Commonly, the leaves are simply the logical blocks of a database file index, storing key values and offsets. In this case, many key values will occur twice in the tree, once at an internal node to guide searching, and again in a leaf. - If the leaves are simply an index, it is common to implement the leaf level as a linked list of B tree nodes why? The B+ tree is the most commonly implemented variant of the B-tree family, and the structure of choice for large databases. * In small databases, it is fairly common to use a B-tree as a direct data structure, with nodes storing records. B Tree Implementation A sample B tree template interface is shown below: 28 template <typename T> class BTreeT { public: BTreeT(unsigned int Order = 3); ~BTreeT(); // some public fns not shown bool Insert(const T& Val); bool elete(const T& Val); void isplay(ostream& Out) const; T* const ind(const T& Target); const T* const ind(const T& Target) const; private: unsigned int Order; BNodeT<T>* Root; // some helper fns not shown }; The order could be supplied as a non-type template parameter, rather than via the constructor, but you could not then use a local variable when declaring a B tree, so the approach shown here is more flexible from the client perspective.

15 B Tree Node Implementation 29 The design of the B tree node type raises some interesting issues: - the node must provide a way to store Order - 1 data values - the node must provide a way to store Order pointers - the node must support binary search of the data values, so dynamically-allocated arrays are appropriate - the node should, perhaps, provide the necessary search function, and functions to remove and add data values (and adjust the pointer array as needed) - the node destructor must destroy the arrays - the node should provide an isull() function, and perhaps other tests such as underflow, or whether the node has extra values that could be borrowed - the split and merge operations needed for the B tree implementation could be made the responsibility of the node Since leaf nodes do not need pointers, there is a case for having two distinct node types. While that would save memory (or disk space), the idea will not be pursued here. B Tree Node Implementation 30 The most common approach seems to be to use one array, of dimension Order 1, to store the data values, and another, of dimension Order, to store the pointers. It is possible to simplify the shifting needed when inserting and removing data values by coalescing the two arrays into one, storing data/pointer pairs. or example: template <typename T> class Pair { public: T ata; BNodeT<T>* Right; // some members perhaps not shown }; The node could then store an array of Pair objects, of dimension Order, using the 0-th cell to store only a pointer (to the left-most child). This approach also simplifies some communication issues when splitting a node. On the other hand, this approach also adds some syntactic complexity. The discussion of the implementation of B tree insertion that follows uses the approach shown here, but is easily adapted to a separate-arrays approach.

16 B Tree Insert 31 Insertion follows the usual public/private pattern. The public stub function might look something like this: template <typename T> bool BTreeT<T>::Insert(const T& Val) { if ( Root == NULL ) { Root = new BNodeT<T>(Order); } Pair<T> Comm(Val, NULL); bool splitoccurred = false; if (!insertelper(val, Root, Comm, splitoccurred) ) return false; } if ( splitoccurred ) { // need to grow new root here... } return true; When a child splits, the parent must be notified of that fact, and supplied with the data value that is passed upwards, and the address of the new child node that has been created. In the implementation here, this communication is supported by adding parameters to the interface of the helper function. B Tree Insert elper The private helper function might look something like this: 32 template <typename T> bool BTreeT<T>::insertelper(const T& Val, BNodeT<T>* sroot, Pair<T>& Comm, bool& splitoccurred) { if ( sroot == NULL ) return false; if ( sroot->isleaf() ) { // OK, we're at a leaf, so do it! if (!(sroot->isull()) ) { // don't need to split // add data value to node splitoccurred = false; return true; } else { // need to split leaf // split leaf, identifying value to pass up, and address of // new child node (of parent of current node) splitoccurred = true; return true; } } //... continues...

17 B Tree Insert elper The private helper function might look something like this: 33 } //... continued... else { // need to continue downward // Look for index, Successor, of first element > Val // check for pre-existence of data, return false if found // OK, the new value would go before Content[currata] // recurse to child preceding index currata // On return, determine if child split // If yes, need to accept the value that was passed up. // If the current node splits in response to that, // need to pass correct value and pointer upwards. } The basic process is straightforward. Note that since you determine the index via which you descend, you already know where to place the passed value when the recursive call returns, and no further searching is required.

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Large Trees 1 Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file. Trees can also be used to store indices of the collection

More information

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is? Multiway searching What do we do if the volume of data to be searched is too large to fit into main memory Search tree is stored on disk pages, and the pages required as comparisons proceed may not be

More information

B-Trees & its Variants

B-Trees & its Variants B-Trees & its Variants Advanced Data Structure Spring 2007 Zareen Alamgir Motivation Yet another Tree! Why do we need another Tree-Structure? Data Retrieval from External Storage In database programs,

More information

CSE 530A. B+ Trees. Washington University Fall 2013

CSE 530A. B+ Trees. Washington University Fall 2013 CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key

More information

CSCI Trees. Mark Redekopp David Kempe

CSCI Trees. Mark Redekopp David Kempe CSCI 104 2-3 Trees Mark Redekopp David Kempe Trees & Maps/Sets C++ STL "maps" and "sets" use binary search trees internally to store their keys (and values) that can grow or contract as needed This allows

More information

CS127: B-Trees. B-Trees

CS127: B-Trees. B-Trees CS127: B-Trees B-Trees 1 Data Layout on Disk Track: one ring Sector: one pie-shaped piece. Block: intersection of a track and a sector. Disk Based Dictionary Structures Use a disk-based method when the

More information

Algorithms. Deleting from Red-Black Trees B-Trees

Algorithms. Deleting from Red-Black Trees B-Trees Algorithms Deleting from Red-Black Trees B-Trees Recall the rules for BST deletion 1. If vertex to be deleted is a leaf, just delete it. 2. If vertex to be deleted has just one child, replace it with that

More information

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives. CS 310 B-trees, Page 1 Motives Large-scale databases are stored in disks/hard drives. Disks are quite different from main memory. Data in a disk are accessed through a read-write head. To read a piece

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,

More information

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key and satisfies the conditions:

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key and satisfies the conditions: 1 The general binary tree shown is not terribly useful in practice. The chief use of binary trees is for providing rapid access to data (indexing, if you will) and the general binary tree does not have

More information

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

B-Trees. Version of October 2, B-Trees Version of October 2, / 22 B-Trees Version of October 2, 2014 B-Trees Version of October 2, 2014 1 / 22 Motivation An AVL tree can be an excellent data structure for implementing dictionary search, insertion and deletion Each operation

More information

B-Trees. Introduction. Definitions

B-Trees. Introduction. Definitions 1 of 10 B-Trees Introduction A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node,

More information

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation! Lecture 10: Multi-way Search Trees: intro to B-trees 2-3 trees 2-3-4 trees Multi-way Search Trees A node on an M-way search tree with M 1 distinct and ordered keys: k 1 < k 2 < k 3

More information

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University Extra: B+ Trees CS1: Java Programming Colorado State University Slides by Wim Bohm and Russ Wakefield 1 Motivations Many times you want to minimize the disk accesses while doing a search. A binary search

More information

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management Binary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from

More information

Binary Tree Node Relationships. Binary Trees. Quick Application: Expression Trees. Traversals

Binary Tree Node Relationships. Binary Trees. Quick Application: Expression Trees. Traversals Binary Trees 1 Binary Tree Node Relationships 2 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the

More information

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree B-Trees Disk Storage What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree Deletion in a B-tree Disk Storage Data is stored on disk (i.e., secondary memory) in blocks. A block is

More information

B-Trees. Based on materials by D. Frey and T. Anastasio

B-Trees. Based on materials by D. Frey and T. Anastasio B-Trees Based on materials by D. Frey and T. Anastasio 1 Large Trees n Tailored toward applications where tree doesn t fit in memory q operations much faster than disk accesses q want to limit levels of

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,

More information

(2,4) Trees. 2/22/2006 (2,4) Trees 1

(2,4) Trees. 2/22/2006 (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 2/22/2006 (2,4) Trees 1 Outline and Reading Multi-way search tree ( 10.4.1) Definition Search (2,4) tree ( 10.4.2) Definition Search Insertion Deletion Comparison of dictionary

More information

B-trees. It also makes sense to have data structures that use the minimum addressable unit as their base node size.

B-trees. It also makes sense to have data structures that use the minimum addressable unit as their base node size. B-trees Balanced BSTs such as RBTs are great for data structures that can fit into the main memory of the computer. But what happens when we need to use external storage? Here are some approximate speeds

More information

Chapter 20: Binary Trees

Chapter 20: Binary Trees Chapter 20: Binary Trees 20.1 Definition and Application of Binary Trees Definition and Application of Binary Trees Binary tree: a nonlinear linked list in which each node may point to 0, 1, or two other

More information

Balanced Search Trees

Balanced Search Trees Balanced Search Trees Computer Science E-22 Harvard Extension School David G. Sullivan, Ph.D. Review: Balanced Trees A tree is balanced if, for each node, the node s subtrees have the same height or have

More information

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25 Multi-way Search Trees (Multi-way Search Trees) Data Structures and Programming Spring 2017 1 / 25 Multi-way Search Trees Each internal node of a multi-way search tree T: has at least two children contains

More information

CS F-11 B-Trees 1

CS F-11 B-Trees 1 CS673-2016F-11 B-Trees 1 11-0: Binary Search Trees Binary Tree data structure All values in left subtree< value stored in root All values in the right subtree>value stored in root 11-1: Generalizing BSTs

More information

IX. Binary Trees (Chapter 10)

IX. Binary Trees (Chapter 10) IX. Binary Trees (Chapter 10) -1- A. Introduction: Searching a linked list. 1. Linear Search /* To linear search a list for a particular Item */ 1. Set Loc = 0; 2. Repeat the following: a. If Loc >= length

More information

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive

More information

Trees. (Trees) Data Structures and Programming Spring / 28

Trees. (Trees) Data Structures and Programming Spring / 28 Trees (Trees) Data Structures and Programming Spring 2018 1 / 28 Trees A tree is a collection of nodes, which can be empty (recursive definition) If not empty, a tree consists of a distinguished node r

More information

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees

Computer Science 210 Data Structures Siena College Fall Topic Notes: Binary Search Trees Computer Science 10 Data Structures Siena College Fall 018 Topic Notes: Binary Search Trees Possibly the most common usage of a binary tree is to store data for quick retrieval. Definition: A binary tree

More information

Chapter 12: Indexing and Hashing (Cnt(

Chapter 12: Indexing and Hashing (Cnt( Chapter 12: Indexing and Hashing (Cnt( Cnt.) Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition

More information

Background: disk access vs. main memory access (1/2)

Background: disk access vs. main memory access (1/2) 4.4 B-trees Disk access vs. main memory access: background B-tree concept Node structure Structural properties Insertion operation Deletion operation Running time 66 Background: disk access vs. main memory

More information

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree. The Lecture Contains: Index structure Binary search tree (BST) B-tree B+-tree Order file:///c /Documents%20and%20Settings/iitkrana1/My%20Documents/Google%20Talk%20Received%20Files/ist_data/lecture13/13_1.htm[6/14/2012

More information

Lecture 3: B-Trees. October Lecture 3: B-Trees

Lecture 3: B-Trees. October Lecture 3: B-Trees October 2017 Remarks Search trees The dynamic set operations search, minimum, maximum, successor, predecessor, insert and del can be performed efficiently (in O(log n) time) if the search tree is balanced.

More information

Motivation for B-Trees

Motivation for B-Trees 1 Motivation for Assume that we use an AVL tree to store about 20 million records We end up with a very deep binary tree with lots of different disk accesses; log2 20,000,000 is about 24, so this takes

More information

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler Lecture 12: BT Trees Courtesy to Goodrich, Tamassia and Olga Veksler Instructor: Yuzhen Xie Outline B-tree Special case of multiway search trees used when data must be stored on the disk, i.e. too large

More information

CS102 Binary Search Trees

CS102 Binary Search Trees CS102 Binary Search Trees Prof Tejada 1 To speed up insertion, removal and search, modify the idea of a Binary Tree to create a Binary Search Tree (BST) Binary Search Trees Binary Search Trees have one

More information

Trees, Part 1: Unbalanced Trees

Trees, Part 1: Unbalanced Trees Trees, Part 1: Unbalanced Trees The first part of this chapter takes a look at trees in general and unbalanced binary trees. The second part looks at various schemes to balance trees and/or make them more

More information

Advanced Set Representation Methods

Advanced Set Representation Methods Advanced Set Representation Methods AVL trees. 2-3(-4) Trees. Union-Find Set ADT DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 1 Advanced Set Representation. AVL Trees Problem with BSTs: worst case operation

More information

amiri advanced databases '05

amiri advanced databases '05 More on indexing: B+ trees 1 Outline Motivation: Search example Cost of searching with and without indices B+ trees Definition and structure B+ tree operations Inserting Deleting 2 Dense ordered index

More information

Tree-Structured Indexes

Tree-Structured Indexes Introduction Tree-Structured Indexes Chapter 10 As for any index, 3 alternatives for data entries k*: Data record with key value k

More information

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010 Uses for About Binary January 31, 2010 Uses for About Binary Uses for Uses for About Basic Idea Implementing Binary Example: Expression Binary Search Uses for Uses for About Binary Uses for Storage Binary

More information

Tree-Structured Indexes

Tree-Structured Indexes Tree-Structured Indexes Chapter 9 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Introduction As for any index, 3 alternatives for data entries k*: ➀ Data record with key value k ➁

More information

Priority Queues. 1 Introduction. 2 Naïve Implementations. CSci 335 Software Design and Analysis III Chapter 6 Priority Queues. Prof.

Priority Queues. 1 Introduction. 2 Naïve Implementations. CSci 335 Software Design and Analysis III Chapter 6 Priority Queues. Prof. Priority Queues 1 Introduction Many applications require a special type of queuing in which items are pushed onto the queue by order of arrival, but removed from the queue based on some other priority

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Spring 2018 Mentoring 8: March 14, Binary Trees

Spring 2018 Mentoring 8: March 14, Binary Trees CSM 6B Binary Trees Spring 08 Mentoring 8: March 4, 08 Binary Trees. Define a procedure, height, which takes in a Node and outputs the height of the tree. Recall that the height of a leaf node is 0. private

More information

Introduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana

Introduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Introduction to Indexing 2 Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Indexed Sequential Access Method We have seen that too small or too large an index (in other words too few or too

More information

Red-Black trees are usually described as obeying the following rules :

Red-Black trees are usually described as obeying the following rules : Red-Black Trees As we have seen, the ideal Binary Search Tree has height approximately equal to log n, where n is the number of values stored in the tree. Such a BST guarantees that the maximum time for

More information

V Advanced Data Structures

V Advanced Data Structures V Advanced Data Structures B-Trees Fibonacci Heaps 18 B-Trees B-trees are similar to RBTs, but they are better at minimizing disk I/O operations Many database systems use B-trees, or variants of them,

More information

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

2-3 Tree. Outline B-TREE. catch(...){ printf( Assignment::SolveProblem() AAAA!); } ADD SLIDES ON DISJOINT SETS Outline catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } Balanced Search Trees 2-3 Trees 2-3-4 Trees Slide 4 Why care about advanced implementations? Same entries, different insertion sequence:

More information

Material You Need to Know

Material You Need to Know Review Quiz 2 Material You Need to Know Normalization Storage and Disk File Layout Indexing B-trees and B+ Trees Extensible Hashing Linear Hashing Decomposition Goals: Lossless Joins, Dependency preservation

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Multi-way Search Trees

Multi-way Search Trees Multi-way Search Trees Kuan-Yu Chen ( 陳冠宇 ) 2018/10/24 @ TR-212, NTUST Review Red-Black Trees Splay Trees Huffman Trees 2 Multi-way Search Trees. Every node in a binary search tree contains one value and

More information

Laboratory Module X B TREES

Laboratory Module X B TREES Purpose: Purpose 1... Purpose 2 Purpose 3. Laboratory Module X B TREES 1. Preparation Before Lab When working with large sets of data, it is often not possible or desirable to maintain the entire structure

More information

Binary Search Trees. Contents. Steven J. Zeil. July 11, Definition: Binary Search Trees The Binary Search Tree ADT...

Binary Search Trees. Contents. Steven J. Zeil. July 11, Definition: Binary Search Trees The Binary Search Tree ADT... Steven J. Zeil July 11, 2013 Contents 1 Definition: Binary Search Trees 2 1.1 The Binary Search Tree ADT.................................................... 3 2 Implementing Binary Search Trees 7 2.1 Searching

More information

Intro to DB CHAPTER 12 INDEXING & HASHING

Intro to DB CHAPTER 12 INDEXING & HASHING Intro to DB CHAPTER 12 INDEXING & HASHING Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing

More information

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing 15-82 Advanced Topics in Database Systems Performance Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries 2 Indexing

More information

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring Instructions:

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring Instructions: VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted

More information

Binary Trees. BSTs. For example: Jargon: Data Structures & Algorithms. root node. level: internal node. edge.

Binary Trees. BSTs. For example: Jargon: Data Structures & Algorithms. root node. level: internal node. edge. Binary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from

More information

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 2004 Goodrich, Tamassia (2,4) Trees 1 Multi-Way Search Tree A multi-way search tree is an ordered tree such that Each internal node has at least two children and stores d -1 key-element

More information

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss) M-ary Search Tree B-Trees (4.7 in Weiss) Maximum branching factor of M Tree with N values has height = # disk accesses for find: Runtime of find: 1/21/2011 1 1/21/2011 2 Solution: B-Trees specialized M-ary

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

V Advanced Data Structures

V Advanced Data Structures V Advanced Data Structures B-Trees Fibonacci Heaps 18 B-Trees B-trees are similar to RBTs, but they are better at minimizing disk I/O operations Many database systems use B-trees, or variants of them,

More information

Introduction. Choice orthogonal to indexing technique used to locate entries K.

Introduction. Choice orthogonal to indexing technique used to locate entries K. Tree-Structured Indexes Werner Nutt Introduction to Database Systems Free University of Bozen-Bolzano 2 Introduction As for any index, three alternatives for data entries K : Data record with key value

More information

Chapter 17 Indexing Structures for Files and Physical Database Design

Chapter 17 Indexing Structures for Files and Physical Database Design Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 and External Memory 1 1 (2, 4) Trees: Generalization of BSTs Each internal node

More information

Lec 17 April 8. Topics: binary Trees expression trees. (Chapter 5 of text)

Lec 17 April 8. Topics: binary Trees expression trees. (Chapter 5 of text) Lec 17 April 8 Topics: binary Trees expression trees Binary Search Trees (Chapter 5 of text) Trees Linear access time of linked lists is prohibitive Heap can t support search in O(log N) time. (takes O(N)

More information

(2,4) Trees Goodrich, Tamassia. (2,4) Trees 1

(2,4) Trees Goodrich, Tamassia. (2,4) Trees 1 (2,4) Trees 9 2 5 7 10 14 (2,4) Trees 1 Multi-Way Search Tree ( 9.4.1) A multi-way search tree is an ordered tree such that Each internal node has at least two children and stores d 1 key-element items

More information

UNIT III BALANCED SEARCH TREES AND INDEXING

UNIT III BALANCED SEARCH TREES AND INDEXING UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant

More information

Physical Level of Databases: B+-Trees

Physical Level of Databases: B+-Trees Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,

More information

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height = M-ary Search Tree B-Trees Section 4.7 in Weiss Maximum branching factor of M Complete tree has height = # disk accesses for find: Runtime of find: 2 Solution: B-Trees specialized M-ary search trees Each

More information

Main Memory and the CPU Cache

Main Memory and the CPU Cache Main Memory and the CPU Cache CPU cache Unrolled linked lists B Trees Our model of main memory and the cost of CPU operations has been intentionally simplistic The major focus has been on determining

More information

IX. Binary Trees (Chapter 10) Linear search can be used for lists stored in an array as well as for linked lists. (It's the method used in the find

IX. Binary Trees (Chapter 10) Linear search can be used for lists stored in an array as well as for linked lists. (It's the method used in the find IX. Binary Trees IX-1 IX. Binary Trees (Chapter 10) A. Introduction: Searching a linked list. 1. Linear Search /* To linear search a list for a particular Item */ 1. Set Loc = 0; 2. Repeat the following:

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 B-Trees and External Memory 1 (2, 4) Trees: Generalization of BSTs Each internal

More information

Tree-Structured Indexes

Tree-Structured Indexes Tree-Structured Indexes Chapter 10 Comp 521 Files and Databases Fall 2010 1 Introduction As for any index, 3 alternatives for data entries k*: index refers to actual data record with key value k index

More information

Hash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell

Hash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell Hash Tables CS 311 Data Structures and Algorithms Lecture Slides Wednesday, April 22, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005

More information

A set of nodes (or vertices) with a single starting point

A set of nodes (or vertices) with a single starting point Binary Search Trees Understand tree terminology Understand and implement tree traversals Define the binary search tree property Implement binary search trees Implement the TreeSort algorithm 2 A set of

More information

Indexing. Outline. Introduction (Ch 10) Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Data Structure Chapter 10

Indexing. Outline. Introduction (Ch 10) Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Data Structure Chapter 10 http://125.216.243.100/csds/ Data Structure Chapter 10 Indexing Dr. Patrick Chan School of Computer Science and Engineering South China University of Technology Outline Introduction (Ch 10) Linear Index

More information

Announcements. Reading Material. Recap. Today 9/17/17. Storage (contd. from Lecture 6)

Announcements. Reading Material. Recap. Today 9/17/17. Storage (contd. from Lecture 6) CompSci 16 Intensive Computing Systems Lecture 7 Storage and Index Instructor: Sudeepa Roy Announcements HW1 deadline this week: Due on 09/21 (Thurs), 11: pm, no late days Project proposal deadline: Preliminary

More information

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli 2-3 and 2-3-4 Trees COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli Multi-Way Trees A binary search tree: One value in each node At most 2 children An M-way search tree: Between 1 to (M-1) values

More information

CS 270 Algorithms. Oliver Kullmann. Binary search. Lists. Background: Pointers. Trees. Implementing rooted trees. Tutorial

CS 270 Algorithms. Oliver Kullmann. Binary search. Lists. Background: Pointers. Trees. Implementing rooted trees. Tutorial Week 7 General remarks Arrays, lists, pointers and 1 2 3 We conclude elementary data structures by discussing and implementing arrays, lists, and trees. Background information on pointers is provided (for

More information

CS 3114 Data Structures and Algorithms READ THIS NOW!

CS 3114 Data Structures and Algorithms READ THIS NOW! READ THIS NOW! Print your name in the space provided below. There are 7 short-answer questions, priced as marked. The maximum score is 100. This examination is closed book and closed notes, aside from

More information

Data Structures Week #6. Special Trees

Data Structures Week #6. Special Trees Data Structures Week #6 Special Trees Outline Adelson-Velskii-Landis (AVL) Trees Splay Trees B-Trees 21.Aralık.2010 Borahan Tümer, Ph.D. 2 AVL Trees 21.Aralık.2010 Borahan Tümer, Ph.D. 3 Motivation for

More information

18.3 Deleting a key from a B-tree

18.3 Deleting a key from a B-tree 18.3 Deleting a key from a B-tree B-TREE-DELETE deletes the key from the subtree rooted at We design it to guarantee that whenever it calls itself recursively on a node, the number of keys in is at least

More information

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree Chapter 4 Trees 2 Introduction for large input, even access time may be prohibitive we need data structures that exhibit running times closer to O(log N) binary search tree 3 Terminology recursive definition

More information

Organizing Spatial Data

Organizing Spatial Data Organizing Spatial Data Spatial data records include a sense of location as an attribute. Typically location is represented by coordinate data (in 2D or 3D). 1 If we are to search spatial data using the

More information

[ DATA STRUCTURES ] Fig. (1) : A Tree

[ DATA STRUCTURES ] Fig. (1) : A Tree [ DATA STRUCTURES ] Chapter - 07 : Trees A Tree is a non-linear data structure in which items are arranged in a sorted sequence. It is used to represent hierarchical relationship existing amongst several

More information

Another solution: B-trees

Another solution: B-trees Trees III: B-trees The balance problem Binary search trees provide efficient search mechanism only if they re balanced Balance depends on the order in which nodes are added to a tree We have seen red-black

More information

Indexing. Introduction. Outline. Search Approach. Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Introduction (Ch 10)

Indexing. Introduction. Outline. Search Approach. Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Introduction (Ch 10) http://125.216.243.100/csds/ Data Structure Chapter 10 Indexing Dr. Patrick Chan School of Computer Science and Engineering South China University of Technology Search Approach Sequential and List Method

More information

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. B-Trees Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory.

More information

Search Trees. The term refers to a family of implementations, that may have different properties. We will discuss:

Search Trees. The term refers to a family of implementations, that may have different properties. We will discuss: Search Trees CSE 2320 Algorithms and Data Structures Alexandra Stefan Based on slides and notes from: Vassilis Athitsos and Bob Weems University of Texas at Arlington 1 Search Trees Preliminary note: "search

More information

B-Trees. nodes with many children a type node a class for B-trees. an elaborate example the insertion algorithm removing elements

B-Trees. nodes with many children a type node a class for B-trees. an elaborate example the insertion algorithm removing elements B-Trees 1 B-Trees nodes with many children a type node a class for B-trees 2 manipulating a B-tree an elaborate example the insertion algorithm removing elements MCS 360 Lecture 35 Introduction to Data

More information

Tree-Structured Indexes

Tree-Structured Indexes Tree-Structured Indexes Chapter 9 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Introduction As for any index, 3 alternatives for data entries k*: Data record with key value k

More information

MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct.

MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct. MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct. In linked list the elements are necessarily to be contiguous In linked list the elements may locate at far positions

More information

Data Structures Week #6. Special Trees

Data Structures Week #6. Special Trees Data Structures Week #6 Special Trees Outline Adelson-Velskii-Landis (AVL) Trees Splay Trees B-Trees October 5, 2015 Borahan Tümer, Ph.D. 2 AVL Trees October 5, 2015 Borahan Tümer, Ph.D. 3 Motivation for

More information

Search Trees. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees

Search Trees. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees Unit 9, Part 2 Search Trees Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Binary Search Trees Search-tree property: for each node k: all nodes in k s left subtree are < k all nodes

More information

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management Binary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from

More information

CMPS 2200 Fall 2017 B-trees Carola Wenk

CMPS 2200 Fall 2017 B-trees Carola Wenk CMPS 2200 Fall 2017 B-trees Carola Wenk 9/18/17 CMPS 2200 Intro. to Algorithms 1 External memory dictionary Task: Given a large amount of data that does not fit into main memory, process it into a dictionary

More information

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT. Week 10 1 2 3 4 5 6 Sorting 7 8 General remarks We return to sorting, considering and. Reading from CLRS for week 7 1 Chapter 6, Sections 6.1-6.5. 2 Chapter 7, Sections 7.1, 7.2. Discover the properties

More information

Trees. Carlos Moreno uwaterloo.ca EIT https://ece.uwaterloo.ca/~cmoreno/ece250

Trees. Carlos Moreno uwaterloo.ca EIT https://ece.uwaterloo.ca/~cmoreno/ece250 Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Today's class: We'll discuss one possible implementation for trees (the general type of trees) We'll look at tree

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information