CS Fall 2010 B-trees Carola Wenk

Similar documents
CMPS 2200 Fall 2017 B-trees Carola Wenk

CS 3343 Fall 2007 Red-black trees Carola Wenk

CMPS 2200 Fall 2015 Red-black trees Carola Wenk

CMPS 2200 Fall 2017 Red-black trees Carola Wenk

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

CSE 530A. B+ Trees. Washington University Fall 2013

Data Structures. Motivation

Laboratory Module X B TREES

Range Searching and Windowing

Orthogonal range searching. Range Trees. Orthogonal range searching. 1D range searching. CS Spring 2009

Data Structures and Algorithms

(2,4) Trees. 2/22/2006 (2,4) Trees 1

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)

Lecture 3: B-Trees. October Lecture 3: B-Trees

CS 350 : Data Structures B-Trees

Augmenting Data Structures

CS350: Data Structures B-Trees

Multiway Search Trees. Multiway-Search Trees (cont d)

Algorithms. Deleting from Red-Black Trees B-Trees

Design and Analysis of Algorithms Lecture- 9: B- Trees

Physical Level of Databases: B+-Trees

Balanced Search Trees

CSE 326: Data Structures B-Trees and B+ Trees

V Advanced Data Structures

(2,4) Trees Goodrich, Tamassia. (2,4) Trees 1

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

Data Structures Week #6. Special Trees

V Advanced Data Structures

Data Structures Week #6. Special Trees

B-Trees and External Memory

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

B-Trees and External Memory

Search Trees - 1 Venkatanatha Sarma Y

B-Trees. CS321 Spring 2014 Steve Cutchin

Lower Bound on Comparison-based Sorting

Material You Need to Know

CISC 235: Topic 4. Balanced Binary Search Trees

CSIT5300: Advanced Database Systems

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

B-Trees Data structures on secondary storage primary memory main memory

CS F-11 B-Trees 1

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

18.3 Deleting a key from a B-tree

Data Structures Week #6. Special Trees

Spring 2017 B-TREES (LOOSELY BASED ON THE COW BOOK: CH. 10) 1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 1

Background: disk access vs. main memory access (1/2)

CIS265/ Trees Red-Black Trees. Some of the following material is from:

Binary Trees

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli

Algorithms. Red-Black Trees

How fast can we sort? Sorting. Decision-tree example. Decision-tree example. CS 3343 Fall by Charles E. Leiserson; small changes by Carola

Trees. Eric McCreath

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

Programming II (CS300)

Balanced search trees. DS 2017/2018

CSCI Trees. Mark Redekopp David Kempe

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

13.4 Deletion in red-black trees

Chapter 2: Basic Data Structures

Multi-Way Search Trees

Search Trees - 2. Venkatanatha Sarma Y. Lecture delivered by: Assistant Professor MSRSAS-Bangalore. M.S Ramaiah School of Advanced Studies - Bangalore

Disk Accesses. CS 361, Lecture 25. B-Tree Properties. Outline

Algorithms for Memory Hierarchies Lecture 2

Multi-Way Search Trees

Comp 335 File Structures. B - Trees

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

Chapter 12: Indexing and Hashing (Cnt(

Chapter 11: Indexing and Hashing

Data Structure: Search Trees 2. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering

What is a Multi-way tree?

B-Trees. Large degree B-trees used to represent very large dictionaries that reside on disk.

Outline. Definition. 2 Height-Balance. 3 Searches. 4 Rotations. 5 Insertion. 6 Deletions. 7 Reference. 1 Every node is either red or black.

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

13.4 Deletion in red-black trees

A red-black tree is a balanced binary search tree with the following properties:

Red-Black Trees. 2/24/2006 Red-Black Trees 1

Self-Balancing Search Trees. Chapter 11

Balanced Trees Part One

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

CSCI 136 Data Structures & Advanced Programming. Lecture 25 Fall 2018 Instructor: B 2

Intro to DB CHAPTER 12 INDEXING & HASHING

CS 380 ALGORITHM DESIGN AND ANALYSIS

OPPA European Social Fund Prague & EU: We invest in your future.

B Tree. Also, every non leaf node must have at least two successors and all leaf nodes must be at the same level.

Sorting. CMPS 2200 Fall Carola Wenk Slides courtesy of Charles Leiserson with small changes by Carola Wenk

Properties of red-black trees

Database Systems. File Organization-2. A.R. Hurson 323 CS Building

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Lecture 6: Analysis of Algorithms (CS )

Computational Geometry

Binary Trees, Binary Search Trees

Lecture 11: Multiway and (2,4) Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Module 4: Dictionaries and Balanced Search Trees

Transcription:

CS 3343 -- Fall 2010 B-trees Carola Wenk 10/19/10 CS 3343 Analysis of Algorithms 1

External memory dictionary Task: Given a large amount of data that does not fit into main memory, process it into a dictionary data structure Need to minimize number of disk accesses With each disk read, read a whole block of data Construct a balanced search tree that uses one disk block per tree node Each node needs to contain more than one key 10/19/10 CS 3343 Analysis of Algorithms 2

k-ary y search trees A k-ary search tree T is defined as follows: For each node x of T: x has at most k children (i.e., T is a k-ary y tree) x stores an ordered list of pointers to its children, and an ordered list of keys For every internal node: #keys = #children-1 x fulfills the search tree property: keys in subtree rooted at i-th child i-th key < keys in subtree rooted at t(i+1) (i+1)-st child 10/19/10 CS 3343 Analysis of Algorithms 3

Example of a 4-ary tree 10/19/10 CS 3343 Analysis of Algorithms 4

Example of a 4-ary search tree 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 1 10/19/10 CS 3343 Analysis of Algorithms 5

B-tree A B-tree T with minimum degree k 2 is defined as follows: 1. T is a (2k)-ary search tree 2. Every node, except the root, stores at least k-11 keys (every internal non-root node has at least k children) 3. The root must store at least one key 4. All leaves have the same depth 10/19/10 CS 3343 Analysis of Algorithms 6

B-tree with k=2 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 1. T is a (2k)-ary search tree 10/19/10 CS 3343 Analysis of Algorithms 7

B-tree with k=2 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 2. Every node, except the root, stores at least k-1 keys 10/19/10 CS 3343 Analysis of Algorithms 8

B-tree with k=2 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 3. The root must store at least one key 10/19/10 CS 3343 Analysis of Algorithms 9

B-tree with k=2 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 4. All leaves have the same depth 10/19/10 CS 3343 Analysis of Algorithms 10

B-tree with k=2 10 25 6 12 15 21 30 45 2 7 8 11 14 20 23 24 27 40 50 Remark: This is a (2,3,4)-tree. (234)t 10/19/10 CS 3343 Analysis of Algorithms 11

Height of a B-tree Theorem: For a B-tree with minimum degree k 2 which stores n keys has height h holds: h log k (n+1)/2 Proof: #nodes 1+2+2k+2k 2 + +2k h-1 level 1 level 3 level 0 level 2 h-1 n = #keys 1+(k-1)Σ2k i = 1+2(k-1) k h -1 = 2k h -1 i=0 k-1 10/19/10 CS 3343 Analysis of Algorithms 12

B-tree search B-TREE-SEARCH(x,key) i 1 while i #keys of x and key > i-th key of x do i i+1 if i #keys of x and key = i-th key of x then return (xi) (x,i) if x is a leaf then return NIL else b=disk-read(i-th child of x) return B-TREE-SEARCH(b,key) 10/19/10 CS 3343 Analysis of Algorithms 13

B-tree search runtime O(k) per node Path has height h = O(log k n) CPU-time: O(k log k n) Disk accesses: O(log k n) disk accesses are more expensive than CPU time 10/19/10 CS 3343 Analysis of Algorithms 14

B-tree insert There are different insertion strategies. We just cover one of them Make one pass down the tree: The goal is to insert the new key into a leaf Search where key should be inserted Only descend into non-full nodes: If a node is full, split it. Then continue descending. Splitting of the root node is the only way a B- tree grows in height 10/19/10 CS 3343 Analysis of Algorithms 15

B-TREE-SPLIT-CHILD(x,i,y),y) has 2k-1 keys Split full node y into two nodes y and z of k-1 keys Median key S of y is moved up into y s parent x Example below for k = 4 10/19/10 CS 3343 Analysis of Algorithms 16

Split root: B-TREE-SPLIT-CHILD(s,1,r) The full root node r is split in two. A new root node s is created s contains the median key H of r and has the two halves of r as children Example below for k = 4 10/19/10 CS 3343 Analysis of Algorithms 17

B-TREE-INSERT(T,key) r = root[t] if (# keys in r) = 2k-1 // root r is full //insert new root node: s ALLOCATE-NODE() root[t] s // split old root r to be two children of new root s B-TREE-SPLIT-CHILD(s,1,r) B-TREE-INSERT-NONFULL(s,key) k else B-TREE-INSERT-NONFULL(r,key) 10/19/10 CS 3343 Analysis of Algorithms 18

B-TREE-INSERT-NONFULL(x,key) y if x is a leaf then insert key at the correct (sorted) position in x DISK-WRITE(x) else find child c of x which by the search tree property should contain key DISK-READ(c) if c is full then // c contains 2k-1 keys B-TREE-SPLIT-CHILD(x,i,c) c=child of x which should contain key B-TREE-INSERT-NONFULL(c,key) 10/19/10 CS 3343 Analysis of Algorithms 19

Insert example (k=3) G M P X A C D E J K N O R S T U V Y Z Insert B: G M P X A B C D E J K N O R S T U V Y Z 10/19/10 CS 3343 Analysis of Algorithms 20

Insert example (k=3) -- cont. G M P X A B C D E J K N O R S T U V Y Z Insert Q: G M P T X node is full A B C D E J K N O Q R SS U V Y Z 10/19/10 CS 3343 Analysis of Algorithms 21

Insert example (k=3) -- cont. node is full G M P T X A B C D E J K N O Y Z Q R S U V Insert L: P GM TX ABCDE JKL J K NO QRS UV YZ 10/19/10 CS 3343 Analysis of Algorithms 22

Insert example (k=3) -- cont. P node is full G M T X A B C D E J K L N O Q R S U V Y Z Insert F: P C G M T X AB DEF D E JKL NO QRS UV YZ 10/19/10 CS 3343 Analysis of Algorithms 23

Runtime of B-TREE-INSERT O(k) runtime per node Path has height h = O(log k n) CPU-time: O(k log k n) Disk accesses: O(log k n) disk accesses are more expensive than CPU time 10/19/10 CS 3343 Analysis of Algorithms 24

Deletion of an element Similar to insertion, but a bit more complicated; see book for details If sibling nodes get not full enough, they are merged into a single node Same complexity as insertion 10/19/10 CS 3343 Analysis of Algorithms 25

B-trees -- Conclusion B-trees are balanced 2k-ary search trees The degree of each node is bounded from above and below using the parameter k All leaves are at the same height No rotations are needed: During insertion (or deletion) the balance is maintained by node splitting (or node merging) ) The tree grows (shrinks) in height only by splitting (or merging) the root 10/19/10 CS 3343 Analysis of Algorithms 26