B-Trees. Large degree B-trees used to represent very large dictionaries that reside on disk.

Similar documents
Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Laboratory Module X B TREES

Chapter 20: Binary Trees

What is a Multi-way tree?

Data Structures. Motivation

Balanced Search Trees

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

Motivation for B-Trees

CS Fall 2010 B-trees Carola Wenk

ECE250: Algorithms and Data Structures B-Trees (Part A)

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

Comp 335 File Structures. B - Trees

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

Trees. Eric McCreath

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

Algorithms. AVL Tree

Lecture 3: B-Trees. October Lecture 3: B-Trees

CS127: B-Trees. B-Trees

Self-Balancing Search Trees. Chapter 11

Augmenting Data Structures

CMPS 2200 Fall 2017 B-trees Carola Wenk

Data Structures Week #6. Special Trees

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =

(2,4) Trees. 2/22/2006 (2,4) Trees 1

m-way Search Tree: Empty, or if not empty then: Each internal node has q children and q 1 elements, for 2 q m.

Data Structures Week #6. Special Trees

Binary Search Trees. Analysis of Algorithms

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

B-Trees Data structures on secondary storage primary memory main memory

CSE 530A. B+ Trees. Washington University Fall 2013

CSE 326: Data Structures B-Trees and B+ Trees

CIS265/ Trees Red-Black Trees. Some of the following material is from:

Data Structures Week #6. Special Trees

Section 4 SOLUTION: AVL Trees & B-Trees

Algorithms. Deleting from Red-Black Trees B-Trees

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

Cpt S 122 Data Structures. Data Structures Trees

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Background: disk access vs. main memory access (1/2)

B-Trees. CS321 Spring 2014 Steve Cutchin

Material You Need to Know

Search Trees. Data and File Structures Laboratory. DFS Lab (ISI) Search Trees 1 / 17

CSE 373 OCTOBER 25 TH B-TREES

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

A red-black tree is a balanced binary search tree with the following properties:

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

Red-black trees (19.5), B-trees (19.8), trees

void insert( Type const & ) void push_front( Type const & )

Red-Black Trees. Based on materials by Dennis Frey, Yun Peng, Jian Chen, and Daniel Hood

Main Memory and the CPU Cache

Section 1: True / False (1 point each, 15 pts total)

CSE 100: B+ TREE, 2-3 TREE, NP- COMPLETNESS

CS350: Data Structures Red-Black Trees

Advanced Tree Data Structures

Multi-Way Search Trees

Trees. (Trees) Data Structures and Programming Spring / 28

CS 315 Data Structures mid-term 2

Multiway Search Trees. Multiway-Search Trees (cont d)

Multi-Way Search Trees

Binary Tree. Binary tree terminology. Binary tree terminology Definition and Applications of Binary Trees

File Structures and Indexing

AVL Trees. (AVL Trees) Data Structures and Programming Spring / 17

Lec 17 April 8. Topics: binary Trees expression trees. (Chapter 5 of text)

Outline. Computer Science 331. Insertion: An Example. A Recursive Insertion Algorithm. Binary Search Trees Insertion and Deletion.

B-Trees & its Variants

Discussion 2C Notes (Week 8, February 25) TA: Brian Choi Section Webpage:

examines every node of a list until a matching node is found, or until all nodes have been examined and no match is found.

Balanced Search Trees. CS 3110 Fall 2010

Question Bank Subject: Advanced Data Structures Class: SE Computer

CS350: Data Structures B-Trees

Operations on Heap Tree The major operations required to be performed on a heap tree are Insertion, Deletion, and Merging.

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

CS102 Binary Search Trees

Indexing. Outline. Introduction (Ch 10) Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Data Structure Chapter 10

Search Trees (Ch. 9) > = Binary Search Trees 1

Balanced search trees

A set of nodes (or vertices) with a single starting point

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Transform & Conquer. Presorting

CSC Design and Analysis of Algorithms

Data Structures and Algorithms

Multi-way Search Trees

Search Trees. The term refers to a family of implementations, that may have different properties. We will discuss:

Indexing. Introduction. Outline. Search Approach. Linear Index (Ch 10.1) Tree Index (Ch 10.3) 2-3 Tree (Ch 10.4) B-Tree (Ch 10.5) Introduction (Ch 10)

CS 380 ALGORITHM DESIGN AND ANALYSIS

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

Dictionaries. Priority Queues

Fall, 2015 Prof. Jungkeun Park

Search Trees - 1 Venkatanatha Sarma Y

CSCI Trees. Mark Redekopp David Kempe

Some Practice Problems on Hardware, File Organization and Indexing

CSC Design and Analysis of Algorithms. Lecture 7. Transform and Conquer I Algorithm Design Technique. Transform and Conquer

Exercise 1 : B-Trees [ =17pts]

EECS 311 Data Structures Midterm Exam Don t Panic!

Intro to DB CHAPTER 12 INDEXING & HASHING

Transcription:

B-Trees Large degree B-trees used to represent very large dictionaries that reside on disk. Smaller degree B-trees used for internalmemory dictionaries to overcome cache-miss penalties.

B-Trees Main Memory ( RAM ) Secondary Memory ( disks ) x a pointer to some object DISK - READ(x) operations that access and/or modify the fields of x DISK - WRITE(x) others operations that access but do not modify the fields of x

AVL Trees n = 2 30 = 10 9 (approx). 30 <= height <= 43. When the AVL tree resides on a disk, up to 43 disk access are made for a search. This takes up to (approx) 4 seconds. Not acceptable.

Red-Black Trees n = 2 30 = 10 9 (approx). 30 <= height <= 60. When the red-black tree resides on a disk, up to 60 disk access are made for a search. This takes up to (approx) 6 seconds. Not acceptable.

A Disk Page an AVL node Useless content A Search Tree Node

m-way Search Trees Each node has up to m 1 pairs and m children. m = 2 => binary search tree.

4-Way Search Tree 10 30 35 k < 10 10 < k < 30 30 < k < 35 k > 35

Maximum # Of Pairs Happens when all internal nodes are m-nodes. Full degree m tree. # of nodes = 1 + m + m 2 + m 3 + + m h-1 = (m h 1)/(m 1). Each node has m 1 pairs. So, # of pairs = m h 1.

Capacity Of m-way Search Tree m = 2 m = 200 h = 3 7 8 * 10 6-1 h = 5 31 3.2 * 10 11-1 h = 7 127 1.28 * 10 16-1

Definition Of B-Tree Definition assumes external nodes (extended m-way search tree). B-tree of order m. m-way search tree. Not empty => root has at least 2 children. Remaining internal nodes (if any) have at least ceil(m/2) children. External (or failure) nodes on same level.

2-3 And 2-3-4 Trees B-tree of order m. m-way search tree. Not empty => root has at least 2 children. Remaining internal nodes (if any) have at least ceil(m/2) children. External (or failure) nodes on same level. 2-3 tree is B-tree of order 3. 2-3-4 tree is B-tree of order 4.

B-Trees Of Order 5 And 2 B-tree of order m. m-way search tree. Not empty => root has at least 2 children. Remaining internal nodes (if any) have at least ceil(m/2) children. External (or failure) nodes on same level. B-tree of order 5 is 3-4-5 tree (root may be 2-node though). B-tree of order 2 is full binary tree.

n = # of pairs. Minimum # Of Pairs # of external nodes = n + 1. Height = h => external nodes on level h + 1. level # of nodes 1 1 2 >= 2 3 >= 2*ceil(m/2) h + 1 >= 2*ceil(m/2) h-1 n + 1 >= 2*ceil(m/2) h-1, h >= 1

Minimum # Of Pairs n + 1 >= 2*ceil(m/2) h-1, h >= 1 m = 200. height # of pairs 2 >= 199 3 >= 19,999 4 >= 2 * 10 6 1 5 >= 2 * 10 8 1 h <= log ceil(m/2) [(n+1)/2] + 1

Worst-case search time. Choice Of m (time to fetch a node + time to search node) * height search time 50 400 m

convention: Root of the B-tree is always in main memory. Any nodes that are passed as parameters must already have had a DISK_READ operation performed on them. Operations: Searching a B-Tree. Creating an empty B-tree. Splitting a node in a B-tree. Inserting a key into a B-tree. Deleting a key from a B-tree.

Node Structure n c 0 k 1 c 1 k 2 c 2 k n c n c i is a pointer to a subtree. k i is a dictionary pair(key).

Search BT_Search(x, k) i 0 while i < n and k > k i+1 [x] do i i +1 if i < n and k = k i+1 [x] then return(x,i +1) if leaf [x] then return NULL else DISK-READ(C i [x]) return B-Tree-Search(C i [x], k)

B-Tree-Created(T): Algorithm: B-Tree-Create(T) { } x Allocate Node() Leaf[ x] TRUE n[x] 0 DISK - WRITE(x) root[t] x time: O(1)

4 Insert 8 15 20 Insert 10? Insert 18? 1 3 5 6 9 16 17 30 40 Insertion into a full leaf triggers bottom-up node splitting pass.

Split An Overfull Node m c 0 k 1 c 1 k 2 c 2 k m c m c i is a pointer to a subtree. k i is a dictionary pair(key). ceil(m/2)-1 c 0 k 1 c 1 k 2 c 2 k ceil(m/2)-1 c ceil(m/2)-1 m-ceil(m/2) c ceil(m/2) k ceil(m/2)+1 c ceil(m/2)+1 k m c m k ceil(m/2) plus pointer to new node is inserted in parent.

Insert 8 4 15 20 1 3 5 6 9 16 17 30 40 Insert a pair with key = 2. New pair goes into a 3-node.

Insert Into A Leaf 3-node Insert new pair so that the 3 keys are in ascending order. 1 2 3 Split overflowed node around middle key. 2 1 3 Insert middle key and pointer to new node into parent.

Insert 8 4 15 20 1 3 5 6 9 16 17 30 40 Insert a pair with key = 2.

Insert 8 4 2 15 20 1 3 5 6 9 16 17 30 40 Insert a pair with key = 2 plus a pointer into parent.

Insert 8 2 4 15 20 1 3 5 6 9 16 17 30 40 Now, insert a pair with key = 18.

Insert Into A Leaf 3-node Insert new pair so that the 3 keys are in ascending order. 16 17 18 Split the overflowed node. 17 16 18 Insert middle key and pointer to new node into parent.

Insert 8 2 4 15 20 1 3 5 6 9 16 17 30 40 Insert a pair with key = 18.

Insert 8 17 2 4 15 20 18 1 3 5 6 9 16 30 40 Insert a pair with key = 17 plus a pointer into parent.

Insert 8 17 2 4 15 20 1 3 5 6 9 16 18 30 40 Insert a pair with key = 17 plus a pointer into parent.

Insert 8 17 2 4 15 20 1 3 5 6 9 16 18 30 40 Now, insert a pair with key = 7.

Insert 8 17 6 7 2 4 15 20 1 3 5 9 16 18 30 40 Insert a pair with key = 6 plus a pointer into parent.

Insert 8 17 4 6 2 15 20 1 9 3 16 18 30 40 5 7 Insert a pair with key = 4 plus a pointer into parent.

Insert 4 8 17 2 6 15 20 1 3 5 7 9 16 18 30 40 Insert a pair with key = 8 plus a pointer into parent. There is no parent. So, create a new root.

Insert 8 4 17 2 6 15 20 1 3 5 7 9 16 18 30 40 Height increases by 1.

Btree::InsertNode(Key k, Element e) { bool overflow = Insert(root, k, e); if (overflow) <Key, Node*> newpair= split(root); root = new Node(root, newpair); return; }

Bool Insert(node* x, Key k, Element e) { if (leaf(x)) insertleaf(x, k, e); if (size(x) > m-1) return true; else return false; idx = keysearch(x, k); bool overflow = Insert(x->C[idx], k, e);

if (overflow) <Key, Node*> newpair = split(x->c[idx]); InsertPair(x, newpair); if(size(x) > m-1) return true; else return false; }

Exercises: P609-3