Data Structures and Algorithms

Similar documents
CS F-11 B-Trees 1

CS 310 B-trees, Page 1. Motives. Large-scale databases are stored in disks/hard drives.

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

B-Trees. Version of October 2, B-Trees Version of October 2, / 22

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

Programming II (CS300)

Binary Trees

Multi-way Search Trees! M-Way Search! M-Way Search Trees Representation!

Design and Analysis of Algorithms Lecture- 9: B- Trees

(2,4) Trees. 2/22/2006 (2,4) Trees 1

CS 171: Introduction to Computer Science II. Binary Search Trees

Recall: Properties of B-Trees

Multiway searching. In the worst case of searching a complete binary search tree, we can make log(n) page faults Everyone knows what a page fault is?

Physical Level of Databases: B+-Trees

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1

CSE 530A. B+ Trees. Washington University Fall 2013

CS 350 : Data Structures B-Trees

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

2-3 and Trees. COL 106 Shweta Agrawal, Amit Kumar, Dr. Ilyas Cicekli

C SCI 335 Software Analysis & Design III Lecture Notes Prof. Stewart Weiss Chapter 4: B Trees

B-Trees. Based on materials by D. Frey and T. Anastasio

UNIT III BALANCED SEARCH TREES AND INDEXING

Advanced Set Representation Methods

Balanced Search Trees

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

B-Trees. Disk Storage. What is a multiway tree? What is a B-tree? Why B-trees? Insertion in a B-tree. Deletion in a B-tree

Trees. Eric McCreath

Data Structures and Algorithms

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

CSIT5300: Advanced Database Systems

TREES. Trees - Introduction

Introduction. for large input, even access time may be prohibitive we need data structures that exhibit times closer to O(log N) binary search tree

Binary Trees, Binary Search Trees

Module 8: Binary trees

Self-Balancing Search Trees. Chapter 11

CS350: Data Structures B-Trees

Material You Need to Know

Introduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana

M-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)

Algorithms. Deleting from Red-Black Trees B-Trees

EE 368. Weeks 5 (Notes)

CS Fall 2010 B-trees Carola Wenk

CSCI Trees. Mark Redekopp David Kempe

Module 9: Binary trees

9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology

Data Structures and Algorithms

Trees. (Trees) Data Structures and Programming Spring / 28

Indexing and Hashing

Balanced Binary Search Trees. Victor Gao

Trees. Courtesy to Goodrich, Tamassia and Olga Veksler

Motivation for B-Trees

Balanced Trees Part One

Multiway Search Trees

Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing

M-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

CSE 326: Data Structures B-Trees and B+ Trees

Multi-way Search Trees

B-Trees. CS321 Spring 2014 Steve Cutchin

Trees. Q: Why study trees? A: Many advance ADTs are implemented using tree-based data structures.

THE B+ TREE INDEX. CS 564- Spring ACKs: Jignesh Patel, AnHai Doan

Graduate Algorithms CS F-07 Red/Black Trees

A set of nodes (or vertices) with a single starting point

(2,4) Trees Goodrich, Tamassia. (2,4) Trees 1

Analysis of Algorithms

Data Structures and Algorithms for Engineers

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

Module 4: Dictionaries and Balanced Search Trees

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

8. Binary Search Tree

Augmenting Data Structures

CSE 214 Computer Science II Introduction to Tree

Comp 335 File Structures. B - Trees

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

CIS265/ Trees Red-Black Trees. Some of the following material is from:

Background: disk access vs. main memory access (1/2)

CS24 Week 8 Lecture 1

Data Structures and Algorithms

Spring 2018 Mentoring 8: March 14, Binary Trees

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

B-Trees & its Variants

B-Trees. Introduction. Definitions

CMPS 2200 Fall 2017 B-trees Carola Wenk

Tree: non-recursive definition. Trees, Binary Search Trees, and Heaps. Tree: recursive definition. Tree: example.

Quiz 1 Solutions. (a) f(n) = n g(n) = log n Circle all that apply: f = O(g) f = Θ(g) f = Ω(g)

Binary heaps (chapters ) Leftist heaps

Algorithms. AVL Tree

Trees. Reading: Weiss, Chapter 4. Cpt S 223, Fall 2007 Copyright: Washington State University

V Advanced Data Structures

Intro to DB CHAPTER 12 INDEXING & HASHING

9. Heap : Priority Queue

INF2220: algorithms and data structures Series 1

Search Trees - 1 Venkatanatha Sarma Y

CS-301 Data Structure. Tariq Hanif

BINARY SEARCH TREES cs2420 Introduction to Algorithms and Data Structures Spring 2015

CS350: Data Structures Red-Black Trees

Data Structures in Java

Graduate Algorithms CS F-08 Extending Data Structures

Transcription:

Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco

19-0: Indexing Operations: Add an element Remove an element Find an element, using a key Find all elements in a range of key values

19-1: Indexing Sorted List Find / Find in Range fast Add / Remove slow Unsorted List / Hash Table Add, Find, Remove fast (hash) Find in Range slow Binary Search Tree All operations are fast (O(lg n)) if the tree is balanced

19-2: Indexing Generalized Binary Search Trees Each node can store several keys, instead of just one Values in subtrees between values in surrounding keys For non leaves, # of children = # of keys + 1 2 6 1 3 4 7

19-3: 2-3 Trees Generalized Binary Search Tree Each node has 1 or 2 keys Each (non-leaf) node has 2-3 children hence the name, 2-3 Trees All leaves are at the same depth

19-4: Example 2-3 Tree 6 16 3 8 13 18 2 5 7 11 12 14 17 20

19-5: Finding in 2-3 Trees How can we find an element in a 2-3 tree?

19-6: Finding in 2-3 Trees How can we find an element in a 2-3 tree? If the tree is empty, return false If the element is stored at the root, return true Otherwise, recursively find in the appropriate subtree

19-7: Inserting into 2-3 Trees Always insert at the leaves To insert an element: Find the leaf where the element would live, if it was in the tree Add the element to that leaf

19-8: Inserting into 2-3 Trees Always insert at the leaves To insert an element: Find the leaf where the element would live, if it was in the tree Add the element to that leaf What if the leaf already has 2 elements?

19-9: Inserting into 2-3 Trees Always insert at the leaves To insert an element: Find the leaf where the element would live, if it was in the tree Add the element to that leaf What if the leaf already has 2 elements? Split!

19-10: Splitting Nodes 5 6 7 6 5 7

19-11: Splitting Nodes 4 1 2 5 6 7 Too many elements

19-12: Splitting Nodes 4 Promote to parent 1 2 5 6 7 Left child of 6 Right child of 6

19-13: Splitting Nodes 4 6 1 2 5 7

19-14: Splitting Root When we split the root: Create a new root Tree grows in height by 1

19-15: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 1

19-16: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 1 2

19-17: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 1 2 3 Too many keys, need to split

19-18: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 1 3

19-19: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 1 3 4

19-20: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 1 3 4 5 Too many keys, need to split

19-21: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 4 1 3 5

19-22: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 4 1 3 5 6

19-23: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 4 1 3 5 6 7 Too many keys need to split

19-24: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 2 4 6 Too many keys need to split 1 3 5 7

19-25: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 4 2 6 1 3 5 7

19-26: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 4 2 6 1 3 5 7 8

19-27: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 4 2 6 1 3 5 7 8 9 Too many keys need to split

19-28: 2-3 Tree Example Inserting elements 1-9 (in order) into a 2-3 tree 4 2 6 8 1 3 5 7 9

19-29: Deleting from 2-3 Tree As with BSTs, we will have 2 cases: Deleting a key from a leaf Deleting a key from an internal node

19-30: Deleting Leaves If leaf contains 2 keys Can safely remove a key

19-31: Deleting Leaves 4 8 3 5 7 11 Deleting 7

19-32: Deleting Leaves 4 8 3 5 11 Deleting 7

19-33: Deleting Leaves If leaf contains 1 key Cannot remove key without making leaf empty Try to steal extra key from sibling

19-34: Deleting Leaves 4 8 3 5 7 11 Deleting 3 we can steal the 5

19-35: Deleting Leaves 4 8 5 7 11 Not a 2-3 tree. What can we do?

19-36: Deleting Leaves 4 8 5 7 11 Steal key from sibling through parent

19-37: Deleting Leaves 5 8 4 7 11 Steal key from sibling through parent

19-38: Deleting Leaves If leaf contains 1 key, and no sibling contains extra keys Cannot remove key without making leaf empty Cannot steal a key from a sibling Merge with sibling split in reverse

19-39: Merging Nodes 5 8 4 7 11 Removing the 4

19-40: Merging Nodes 5 8 Removing the 4 7 11 Combine 5, 7 into one node

19-41: Merging Nodes 8 5 7 11

19-42: Merging Nodes Merge decreases the number of keys in the parent May cause parent to have too few keys Parent can steal a key, or merge again

19-43: Merging Nodes 4 2 6 8 1 3 5 7 9 Deleting the 3 cause a merge

19-44: Merging Nodes 4 6 8 1 2 5 7 9 Deleting the 3 cause a merge Not enough keys in parent

19-45: Merging Nodes 4 6 8 1 2 5 7 9 Steal key from sibling

19-46: Merging Nodes 6 4 8 1 2 5 7 9 Steal key from sibling

19-47: Merging Nodes 6 4 8 1 2 5 7 9 When we steal a key from an internal node, steal nearest subtree as well

19-48: Merging Nodes 6 4 8 1 2 5 7 9 When we steal a key from an internal node, steal nearest subtree as well

19-49: Merging Nodes 6 4 8 1 2 5 7 9 Deleting the 7 cause a merge

19-50: Merging Nodes 6 4 1 2 5 8 9 Parent has too few keys merge again

19-51: Merging Nodes 4 6 1 2 5 8 9 Root has no keys delete

19-52: Merging Nodes 4 6 1 2 5 8 9

19-53: Deleting Interior Keys How can we delete keys from non-leaf nodes? HINT: How did we delete non-leaf nodes in standard BSTs?

19-54: Deleting Interior Keys How can we delete keys from non-leaf nodes? Replace key with smallest element subtree to right of key Recursivly delete smallest element from subtree to right of key (can also use largest element in subtree to left of key)

19-55: Deleting Interior Keys 4 2 7 Deleting the 4 1 3 5 6 8 9

19-56: Deleting Interior Keys 4 2 7 Deleting the 4 1 3 5 6 8 9 Replace 4 with smallest element in tree to right of 4

19-57: Deleting Interior Keys 5 2 7 1 3 6 8 9

19-58: Deleting Interior Keys 5 2 7 Deleting the 5 1 3 6 8 9

19-59: Deleting Interior Keys 5 2 7 Deleting the 5 1 3 6 8 9 Replace the 5 with the smallest element in tree to right of 5

19-60: Deleting Interior Keys 6 2 7 Deleting the 5 1 3 8 9 Replace the 5 with the smallest element in tree to right of 5 Node with two few keys

19-61: Deleting Interior Keys 6 2 7 1 3 8 9 Node with two few keys Steal a key from a sibling

19-62: Deleting Interior Keys 6 2 8 1 3 7 9

19-63: Deleting Interior Keys 6 10 2 8 11 1 3 7 9 12 13 Removing the 6

19-64: Deleting Interior Keys 6 10 2 8 11 1 3 7 9 12 13 Removing the 6 Replace the 6 with the smallest element in the tree to the right of the 6

19-65: Deleting Interior Keys 7 10 2 8 11 1 3 9 Node with too few keys Can t steal key from sibling Merge with sibling 12 13

19-66: Deleting Interior Keys 7 10 2 11 1 3 8 9 12 13 Node with too few keys Can t steal key from sibling Merge with sibling (arbitrarily pick right sibling to merge with)

19-67: Deleting Interior Keys 7 2 10 11 1 3 8 9 12 13

19-68: Generalizing 2-3 Trees In 2-3 Trees: Each node has 1 or 2 keys Each interior node has 2 or 3 children We can generalize 2-3 trees to allow more keys / node

19-69: B-Trees A B-Tree of maximum degree k: All interior nodes have k/2...k children All nodes have k/2 1...k 1 keys 2-3 Tree is a B-Tree of maximum degree 3

19-70: B-Trees 5 11 16 19 1 3 7 8 9 12 15 17 18 22 23 B-Tree with maximum degree 5 Interior nodes have 3 5 children All nodes have 2-4 keys

19-71: B-Trees Inserting into a B-Tree Find the leaf where the element would go If the leaf is not full, insert the element into the leaf Otherwise, split the leaf (which may cause further splits up the tree), and insert the element

19-72: B-Trees 5 11 16 19 1 3 7 8 9 12 15 17 18 22 23 Inserting a 6..

19-73: B-Trees 5 11 16 19 1 3 6 7 8 9 12 15 17 18 22 23

19-74: B-Trees 5 11 16 19 1 3 6 7 8 9 12 15 17 18 22 23 Inserting a 10..

19-75: B-Trees 5 11 16 19 1 3 6 7 8 9 10 12 15 17 18 22 23 Too many keys need to split Promote 8 to parent (between 5 and 11) Make nodes out of (6, 7) and (9, 10)

19-76: B-Trees Too many keys need to split 5 8 11 16 19 1 3 6 7 9 10 12 15 17 18 22 23 Promote 11 to parent (new root) Make nodes out of (5, 8) and (6, 19)

19-77: B-Trees 11 5 8 16 19 1 3 6 7 9 10 12 15 17 18 22 23 Note that the root only has 1 key, 2 children All nodes in B-Trees with maximum degree 5 should have at least 2 keys The root is an exception it may have as few as one key and two children for any maximum degree

19-78: B-Trees B-Tree of maximum degree k Generalized BST All leaves are at the same depth All nodes (other than the root) have k/2 1...k 1 keys All interior nodes (other than the root) have k/2... k children

19-79: B-Trees B-Tree of maximum degree k Generalized BST All leaves are at the same depth All nodes (other than the root) have k/2 1...k 1 keys All interior nodes (other than the root) have k/2... k children Why do we need to make exceptions for the root?

19-80: B-Trees Why do we need to make exceptions for the root? Consider a B-Tree of maximum degree 5 with only one element

19-81: B-Trees Why do we need to make exceptions for the root? Consider a B-Tree of maximum degree 5 with only one element Consider a B-Tree of maximum degree 5 with 5 elements

19-82: B-Trees Why do we need to make exceptions for the root? Consider a B-Tree of maximum degree 5 with only one element Consider a B-Tree of maximum degree 5 with 5 elements Even when a B-Tree could be created for a specific number of elements, creating an exception for the root allows our split/merge algorithm to work correctly.

19-83: B-Trees Deleting from a B-Tree (Key is in a leaf) Remove key from leaf Steal / Split as necessary May need to split up tree as far as root

19-84: B-Trees 5 11 16 19 1 3 7 8 9 12 15 17 18 22 23 Deleting the 15

19-85: B-Trees 5 11 16 19 1 3 7 8 9 12 17 18 22 23 Too few keys

19-86: B-Trees 5 11 16 19 1 3 7 8 9 12 17 18 22 23 Steal a key from sibling

19-87: B-Trees 5 9 16 19 1 3 7 8 11 12 17 18 22 23

19-88: B-Trees 5 9 16 19 1 3 7 8 11 12 17 18 22 23 Delete the 11

19-89: B-Trees 5 9 16 19 1 3 7 8 12 17 18 22 23 Too few keys

19-90: B-Trees 5 9 16 19 1 3 7 8 12 17 18 22 23 Combine into 1 node Merge with a sibling (pick the left sibling arbitrarily)

19-91: B-Trees 5 16 19 1 3 7 8 9 12 17 18 22 23

19-92: B-Trees Deleting from a B-Tree (Key in internal node) Replace key with largest key in right subtree Remove largest key from right subtree (May force steal / merge)

19-93: B-Trees 5 16 19 1 3 7 8 9 12 17 18 22 23 Remove the 5

19-94: B-Trees 5 16 19 1 3 7 8 9 12 17 18 22 23 Remove the 5

19-95: B-Trees 7 16 19 1 3 8 9 12 17 18 22 23

19-96: B-Trees 7 16 19 1 3 8 9 12 17 18 22 23 Remove the 19

19-97: B-Trees 7 16 19 1 3 8 9 12 17 18 22 23 Remove the 19

19-98: B-Trees 7 16 22 1 3 8 9 12 17 18 23 Too few keys

19-99: B-Trees 7 16 22 1 3 8 9 12 17 18 23 Merge with left sibling

19-100: B-Trees 7 16 1 3 8 9 12 17 18 22 23

19-101: B-Trees Almost all databases that are large enough to require storage on disk use B-Trees Disk accesses are very slow Accessing a byte from disk is 10,000 100,000 times as slow as accessing from main memory Recently, this gap has been getting even bigger Compared to disk accesses, all other operations are essentially free Most efficient algorithm minimizes disk accesses as much as possible

19-102: B-Trees Disk accesses are slow want to minimize them Single disk read will read an entire sector of the disk Pick a maximum degree k such that a node of the B-Tree takes up exactly one disk block Typically on the order of 100 children / node

19-103: B-Trees With a maximum degree around 100, B-Trees are very shallow Very few disk reads are required to access any piece of data Can improve matters even more by keeping the first few levels of the tree in main memory For large databases, we can t store the entire tree in main memory but we can limit the number of disk accesses for each operation to only 1 or 2