B-Trees. CS321 Spring 2014 Steve Cutchin



Topics for Today
HW #2 Once Over
B-Trees
Questions
PA #3
Expression Trees
Balance Factor
AVL Heights
Data Structure Animations
Graphs

B-Tree Motivation
When data is too large to fit in main memory, the number of disk accesses becomes important.
A disk access is extremely expensive compared to a typical computer instruction because of the disk's mechanical limitations.
One disk access costs about as much as 200,000 instructions.
The number of disk accesses will therefore dominate the running time.

Motivation (cont.)
Secondary memory (disk) is divided into equal-sized blocks (typical sizes are 512, 2048, 4096, or 8192 bytes).
The basic I/O operation transfers the contents of one disk block to or from main memory.
Our goal is to devise a multiway search tree that minimizes file accesses by exploiting whole-block disk reads.

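m-ary Trees
[Figure: a node holding keys K1, K2, K3, K4, ... with subtrees T1, T2, T3, ... between them. T1 holds keys K < K1, T2 holds keys K1 < K < K2, and so on.]
A node contains multiple keys.
The order of the subtrees is based on the parent node's keys.
If each node has m children and there are n keys, then the average time taken to search the tree is proportional to $\log_m n$.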
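To make the $\log_m n$ claim concrete, here is a minimal Python sketch. It assumes a complete m-ary search tree in which every node stores m - 1 keys, so h full levels hold m^h - 1 keys. The function name `levels_needed` is illustrative and does not come from the slides.

```python
def levels_needed(n: int, m: int) -> int:
    """Smallest number of levels of a complete m-ary search tree holding n keys.

    Assumption: every node stores m - 1 keys, so h full levels
    hold m**h - 1 keys in total.
    """
    if n < 0 or m < 2:
        raise ValueError("need n >= 0 and m >= 2")
    levels, capacity = 0, 0
    while capacity < n:
        levels += 1
        capacity = m ** levels - 1
    return levels


# One million keys: a binary tree needs 20 levels, a 501-way tree only 3.
for m in (2, 16, 501):
    print(m, levels_needed(1_000_000, m))
```

Each level is one node visit, so a larger branching factor directly reduces the number of nodes (and therefore disk blocks) touched per search.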

B-Tree Definition
A B-Tree is a search tree with a root node.
Each node in a B-Tree can have multiple keys.
Each node in a B-Tree can have multiple children.
The number of children depends on the number of keys.
A node in a B-Tree has at most 1 more child than it has keys.

Layout of a B-Tree
Each node has at most 3 keys and 4 children.
Each internal node has a minimum of 2 children.
This is a 2-3-4 B-Tree.

Important Metrics
The minimal degree of a B-Tree is defined as: Degree = t, t >= 2.
Every internal node except the root has at least t children.
Every node except the root has at least t - 1 keys.
Every node has at most 2*t - 1 keys.
The order of a B-Tree is defined as: Order = m.
No node may have more than m children.
Therefore: Order = 2 * Degree.
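The relationships between degree and order can be tabulated with a short Python sketch. The function name `node_bounds` and the dictionary keys are illustrative choices, not part of the slides.

```python
def node_bounds(t: int) -> dict:
    """Key and child limits for a non-root node of a B-Tree with minimal degree t."""
    if t < 2:
        raise ValueError("minimal degree t must be at least 2")
    return {
        "min_keys": t - 1,      # every non-root node
        "max_keys": 2 * t - 1,  # every node
        "min_children": t,      # every non-root internal node
        "max_children": 2 * t,  # every internal node
        "order": 2 * t,         # order m = maximum number of children
    }


print(node_bounds(2))
```

With t = 2 the limits are 1 to 3 keys and 2 to 4 children, which matches the 2-3-4 tree layout described earlier.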

Layout of a B-Tree
What is the degree of this B-Tree?
What is the order of this B-Tree?

Size of B-Trees
All leaves in a B-Tree have the same depth.
The depth of the leaves is uniform and equal to the height of the tree.
By definition, all B-Trees are balanced.

Size of B-Trees
For a given B-Tree with n keys and degree t, the height h satisfies $h \le \log_t \frac{n+1}{2}$.
For a given B-Tree with height h and degree t, the number of keys n satisfies $n \ge 2t^h - 1$.
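The two bounds are the same inequality read in both directions. The following Python sketch checks them numerically. It assumes height is counted in edges (a tree consisting of only a root has height 0), and the function names are illustrative.

```python
def min_keys_for_height(h: int, t: int) -> int:
    """Fewest keys a B-Tree of height h and minimal degree t can hold: 2*t**h - 1."""
    return 2 * t ** h - 1


def max_height(n: int, t: int) -> int:
    """Largest height h with 2*t**h - 1 <= n, i.e. h <= log_t((n + 1) / 2)."""
    if n < 1 or t < 2:
        raise ValueError("need n >= 1 and t >= 2")
    h = 0
    while min_keys_for_height(h + 1, t) <= n:
        h += 1
    return h


print(max_height(1_000_000, 2))    # degree 2
print(max_height(1_000_000, 500))  # degree 500
```

Integer arithmetic is used instead of a floating-point logarithm to avoid rounding errors near exact powers of t.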

B-Tree and Block Size
A B-Tree node is usually the size of a disk page.
So if a disk page is 4096 bytes, we want our node to be that size:
Say, 84 bytes of overhead for the node.
4 bytes for each key.
4 bytes for each child pointer.
4 bytes for the number of keys and 4 bytes for the number of children.

B-Tree and Block Size
4096 = 4K + 4C + 4 + 4 + 84, with C = K + 1.
4096 = 4K + (4K + 4) + 4 + 4 + 84
4096 = 8K + 12 + 84
4096 - 12 - 84 = 8K
K = 500 keys per node for one block.
C = 501 children per node for each block.
A full tree of height 2 has 1 + 501 + 501^2 = 251,503 nodes, that is, 251,503 disk blocks.
A full tree of height 2 has 251,503 * 500 = 125,751,500 keys.
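The arithmetic above can be reproduced with a small Python sketch. The byte sizes are the ones assumed on the slide. The function and parameter names are illustrative.

```python
def keys_per_node(block_size=4096, overhead=84, key_size=4,
                  pointer_size=4, counter_bytes=8):
    """Solve block = key*K + pointer*(K + 1) + counters + overhead for K."""
    usable = block_size - overhead - counter_bytes - pointer_size
    return usable // (key_size + pointer_size)


def full_tree_size(keys, height):
    """Return (nodes, total keys) of a full tree whose nodes all hold `keys` keys."""
    children = keys + 1
    nodes = sum(children ** level for level in range(height + 1))
    return nodes, nodes * keys


k = keys_per_node()
print(k, k + 1)              # 500 keys, 501 children
print(full_tree_size(k, 2))  # (251503, 125751500)
```

Three block reads (root, one internal node, one leaf) are therefore enough to reach any of roughly 125 million keys.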

Definition of a B-Tree
Def: A B-Tree of degree t is a tree with the following properties:
The root has at least 2 children, unless it is a leaf.
Every non-root node must have at least t - 1 keys.
Every non-root internal node has at least t children.
If the tree is non-empty, the root has at least one key.
Every node may have at most 2t - 1 keys.
An internal node may have at most 2t children.
A full tree occurs when every node has 2t - 1 keys.

Components of B-Tree Nodes
Every node x has the following attributes:
x.n = the number of keys in x.
x.keys[1..n] = the actual keys.
x.leaf = is this a leaf? (Can the root be a leaf?)
x.child[1..n+1] = array of pointers to the children.
Rule: key[1] <= key[2] <= ... <= key[n].
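One possible in-memory rendering of these attributes is sketched below in Python. It is only an illustration: the class name `BTreeNode` and the helper `is_well_formed` are assumptions, the lists are 0-indexed rather than 1-indexed, and `n` and `leaf` are derived from the lists instead of being stored separately.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def n(self) -> int:
        """Number of keys stored in this node."""
        return len(self.keys)

    @property
    def leaf(self) -> bool:
        """A node with no children is a leaf."""
        return not self.children

    def is_well_formed(self) -> bool:
        """Keys are in non-decreasing order; an internal node has n + 1 children."""
        ordered = all(a <= b for a, b in zip(self.keys, self.keys[1:]))
        fanout_ok = self.leaf or len(self.children) == self.n + 1
        return ordered and fanout_ok


# A tree with a single node: the root is also a leaf.
root = BTreeNode(keys=[5])
print(root.n, root.leaf, root.is_well_formed())
```

The last lines show that the root can be a leaf, which happens whenever the whole tree fits in one node.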

Definition of a B-Tree
Def: A B-Tree of order m is a tree with the following properties:
The root has at least 2 children, unless it is a leaf.
No node in the tree has more than m children.
Every node except for the root and the leaves has at least $\lceil m/2 \rceil$ children.
All leaves appear at the same level.
An internal node with k children contains exactly k - 1 keys.
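The order-m properties can be checked mechanically. The Python sketch below assumes a node is represented as a `(keys, children)` pair with an empty child list for leaves. This representation, the function name `check_order`, and the sample tree are illustrative, not taken from the slides.

```python
from math import ceil


def check_order(node, m, is_root=True):
    """Return the distance from `node` to its leaves if the subtree obeys
    the order-m rules; raise ValueError otherwise."""
    keys, children = node
    if not children:                      # a leaf sits at distance 0
        return 0
    if len(children) > m:
        raise ValueError("more than m children")
    if len(children) < (2 if is_root else ceil(m / 2)):
        raise ValueError("too few children")
    if len(keys) != len(children) - 1:
        raise ValueError("k children require exactly k - 1 keys")
    depths = {check_order(child, m, False) for child in children}
    if len(depths) != 1:
        raise ValueError("leaves appear at different levels")
    return depths.pop() + 1


# A sample tree of order 3 (a 2-3 tree).
tree = (["G"], [
    (["C"], [(["A"], []), (["D", "E"], [])]),
    (["I", "M"], [(["H"], []), (["J", "K"], []), (["N", "O"], [])]),
])
print(check_order(tree, 3))
```

The check does not verify key ordering. It only covers the structural rules listed in the definition.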

B-Trees & Efficiency
Used in Mac, NTFS, and OS/2 file systems for file structure.
Allow insertion into and deletion from a tree structure in time based on the $\log_m n$ property, where m is the order of the tree.
The idea is that you leave some key spaces open, so in most cases an insert of a new key uses available space.
Less dynamic than our typical binary tree.
Efficient for disk-based operations.

2-3 Trees
[Figure: example 2-3 tree. Root: G. Second level: C | I M. Leaves: A | D E | H | J K | N O.]

B-Tree Operations (ADT)
Search(key)
Insert(key)
Delete(key)

Searching m-ary Trees
A generalized symmetric-order traversal (SOT) will visit all keys in ascending order.
for (i = 1; i <= m-1; i++) {
    visit subtree to left of k_i
    visit k_i
}
visit subtree to right of k_{m-1}
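The traversal loop translates almost directly into Python. This is a sketch under the assumption that a node is a `(keys, children)` pair with 0-indexed lists and an empty child list for leaves. The name `in_order` and the sample tree are illustrative.

```python
def in_order(node):
    """Yield every key of the subtree rooted at `node` in ascending order."""
    keys, children = node
    for i, key in enumerate(keys):
        if children:
            yield from in_order(children[i])   # subtree to the left of key i
        yield key
    if children:
        yield from in_order(children[-1])      # subtree to the right of the last key


tree = (["G"], [
    (["C"], [(["A"], []), (["D", "E"], [])]),
    (["I", "M"], [(["H"], []), (["J", "K"], []), (["N", "O"], [])]),
])
print(" ".join(in_order(tree)))  # A C D E G H I J K M N O
```

Unlike the slide's loop, which runs to a fixed m - 1, this version iterates over however many keys the node actually holds, so partially filled nodes need no special handling.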

Basic Recursive Search
Ordered recursive search. Arrays are indexed from 1.
Search(T, k)
    if (T is empty) return not found;
    for (i = 1; i <= m-1; i++) {
        if (k == k_i) return found;
        if (k < k_i) return Search(T.child[i], k);
    }
    return Search(T.child[m], k);
Notice the for loop! O(?)
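On the question about the for loop: a linear scan inspects up to m - 1 keys in each node, so the search costs O(m log_m n) key comparisons while reading only O(log_m n) nodes. Because the keys inside a node are sorted, the scan can be replaced by a binary search, which lowers the comparison count to O(log m * log_m n) = O(log n). The Python sketch below shows this variant. It swaps binary search in for the slide's linear loop and assumes a node is a `(keys, children)` pair with 0-indexed lists. The name `search` and the sample tree are illustrative.

```python
from bisect import bisect_left


def search(node, key):
    """Return True if `key` occurs in the subtree rooted at `node`."""
    while True:
        keys, children = node
        i = bisect_left(keys, key)        # first position with keys[i] >= key
        if i < len(keys) and keys[i] == key:
            return True
        if not children:                  # reached a leaf without a match
            return False
        node = children[i]                # descend into the only possible subtree


tree = (["G"], [
    (["C"], [(["A"], []), (["D", "E"], [])]),
    (["I", "M"], [(["H"], []), (["J", "K"], []), (["N", "O"], [])]),
])
print(search(tree, "K"), search(tree, "B"))  # True False
```

For disk-resident trees the number of nodes read dominates the cost, so both the linear and the binary variant perform the same number of disk accesses.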