
Binary Search Trees

- Understand tree terminology
- Understand and implement tree traversals
- Define the binary search tree property
- Implement binary search trees
- Implement the TreeSort algorithm

A tree is a set of nodes (or vertices) with a single starting point called the root.
- Each node is connected by an edge to another node.
- A tree is a connected graph: there is a path to every node in the tree.
- A tree has one less edge than the number of nodes.

[Figure: five example graphs. Three are trees (one of them is a tree but not a binary tree, and one is actually a similar graph to the blue one). Two are not trees: in one, not all the nodes are connected; in the other there is an extra edge (5 nodes and 5 edges).]

Node v is said to be a child of u, and u the parent of v, if there is an edge between the nodes u and v, and u is above v in the tree.

This relationship can be generalized. In the example tree, A is the root and the parent of B, C and D:
- E and F are descendants of A
- D and A are ancestors of G
- B, C and D are siblings
- F and G are?

- A leaf is a node with no children.
- A path is a sequence of nodes $v_1, \dots, v_n$ where $v_i$ is a parent of $v_{i+1}$ ($1 \le i < n$).
- A subtree is any node in the tree along with all of its descendants.
- A binary tree is a tree with at most two children per node. The children are referred to as left and right, and we can also refer to left and right subtrees.

[Figure: example tree rooted at A, showing a path from A to D to G and the subtree rooted at B. C, E, F and G are leaves.]

[Figure: binary tree with nodes A to J, labelling the right child of A, the left subtree of A and the right subtree of C.]

- The height of a node v is the length of the longest path from v to a leaf.
- The height of the tree is the height of the root.
- The depth of a node v is the length of the path from v to the root. This is also referred to as the level of a node.

Note that there is a slightly different formulation of the height of a tree, where the height is said to be the number of different levels of nodes in the tree (including the root).
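The height definition above translates directly into a short recursive function. The following is a minimal sketch, not taken from the slides: the `Node` layout mirrors the one described later in the deck, and giving an empty tree height -1 is an assumed convention chosen so that a leaf has height 0. The hypothetical `levels` helper shows the alternative formulation that counts levels.

```cpp
#include <algorithm>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Height measured in edges: the longest path from nd down to a leaf.
// Assumption: an empty tree has height -1, so a leaf has height 0.
int height(const Node* nd) {
    if (nd == nullptr) return -1;
    return 1 + std::max(height(nd->leftchild), height(nd->rightchild));
}

// The alternative formulation: number of levels, including the root level.
int levels(const Node* nd) {
    return height(nd) + 1;
}
```

With these conventions the two formulations always differ by exactly one.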

[Figure: example tree rooted at A with levels 1 to 3 marked. The height of the tree is 3, the height of node B is 2 and the depth of node E is 2.]

A binary tree is perfect if
- no node has only one child, and
- all the leaves have the same depth.

A perfect binary tree of height $h$ has $2^{h+1} - 1$ nodes, of which $2^h$ are leaves. Perfect trees are also complete.

Each level doubles the number of nodes:
- The root level has 1 node ($2^0$).
- Level 1 has 2 nodes ($2^1$).
- Level 2 has 4 nodes ($2^2$), or 2 times the number in level 1.

Therefore a perfect tree of height $h$ has $2^{h+1} - 1$ nodes. The bottom level has $2^h$ nodes; that is, just over 1/2 the nodes are leaves.
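The node count can be checked by summing the level sizes as a geometric series. This is a standard identity rather than something stated on the slides, with level $i$ assumed to hold $2^i$ nodes and the root at level 0:

```latex
\[
\sum_{i=0}^{h} 2^{i} = 2^{h+1} - 1,
\qquad
\frac{2^{h}}{2^{h+1} - 1} > \frac{2^{h}}{2^{h+1}} = \frac{1}{2}.
\]
```

For example, with $h = 3$ the levels hold $1 + 2 + 4 + 8 = 15 = 2^4 - 1$ nodes, 8 of which are leaves.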

A binary tree is complete if
- the leaves are on at most two different levels,
- the second to bottom level is completely filled in, and
- the leaves on the bottom level are as far to the left as possible.

A binary tree is balanced if the leaves are all about the same distance from the root. The exact specification varies:
- Sometimes trees are balanced by comparing the height of nodes, e.g. the height of a node's right subtree is at most one different from the height of its left subtree.
- Sometimes a tree's height is compared to the number of nodes, e.g. red-black trees.
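The first specification (subtree heights differ by at most one at every node) can be checked in a single pass. This is a sketch under assumptions: the `Node` layout, the function names and the use of -2 as an internal "unbalanced" marker are all illustrative choices, not from the slides.

```cpp
#include <algorithm>
#include <cstdlib>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Returns the height of the subtree (empty tree = -1), or -2 as a marker
// if some node's left and right subtree heights differ by more than one.
int checkedHeight(const Node* nd) {
    if (nd == nullptr) return -1;
    int lh = checkedHeight(nd->leftchild);
    if (lh == -2) return -2;
    int rh = checkedHeight(nd->rightchild);
    if (rh == -2) return -2;
    if (std::abs(lh - rh) > 1) return -2;
    return 1 + std::max(lh, rh);
}

// True if every node satisfies the height-difference rule.
bool isHeightBalanced(const Node* root) {
    return checkedHeight(root) != -2;
}
```

Returning the height and the verdict from the same recursion keeps the check linear in the number of nodes, rather than recomputing heights at every node.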


A traversal algorithm for a binary tree visits each node in the tree. Typically, it will do something while visiting each node! Traversal algorithms are naturally recursive. There are three traversal methods:
- Inorder
- Preorder
- Postorder

    void inorder(Node* nd) {
        if (nd != NULL) {
            inorder(nd->leftchild);
            visit(nd);
            inorder(nd->rightchild);
        }
    }

The visit function would do whatever the purpose of the traversal is, for example print the data in the node.

Preorder traversal:

    visit(nd)
    preorder(nd->leftchild)
    preorder(nd->rightchild)

[Figure: preorder traversal of the example tree, with each node numbered in the order it is visited: 17, 13, 9, 11, 16, 27, 20, 39.]

Postorder traversal:

    postorder(nd->leftchild)
    postorder(nd->rightchild)
    visit(nd)

[Figure: postorder traversal of the example tree, with each node numbered in the order it is visited: 11, 9, 16, 13, 20, 39, 27, 17.]
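The preorder and postorder schemes can be sketched as follows. As an assumption for illustration, "visiting" a node here means appending its data to a `std::vector`, and the `Node` layout is the one the slides describe later.

```cpp
#include <vector>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Preorder: visit the node, then its left subtree, then its right subtree.
void preorder(const Node* nd, std::vector<int>& out) {
    if (nd == nullptr) return;
    out.push_back(nd->data);
    preorder(nd->leftchild, out);
    preorder(nd->rightchild, out);
}

// Postorder: visit the left subtree, then the right subtree, then the node.
void postorder(const Node* nd, std::vector<int>& out) {
    if (nd == nullptr) return;
    postorder(nd->leftchild, out);
    postorder(nd->rightchild, out);
    out.push_back(nd->data);
}
```

The only difference between the two functions is where the visit step sits relative to the recursive calls.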

The binary tree ADT can be implemented using different data structures:
- Reference structures (similar to linked lists)
- Arrays

Example implementations:
- Binary search trees (references)
- Red-black trees (references again)
- Heaps (arrays): not a binary search tree
- B-trees (arrays again): not a binary search tree

Consider maintaining data in some order. The data is to be frequently searched on the sort key, e.g. a dictionary. Possible solutions might be:
- A sorted array: access in O(log n) using binary search; insertion and deletion in linear time.
- An ordered linked list: access, insertion and deletion in linear time.

The data structure should be able to perform all these operations efficiently:
- Create an empty dictionary
- Insert
- Delete
- Look up

The insert, delete and look up operations should be performed in at most O(log n) time.

A binary search tree is a binary tree with a special property. For all nodes in the tree:
- All nodes in a left subtree have labels less than the label of the subtree's root.
- All nodes in a right subtree have labels greater than or equal to the label of the subtree's root.

Binary search trees are fully ordered.

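[Figure: example binary search tree. The root is 17 with children 13 and 27; 13 has children 9 and 16; 27 has children 20 and 39; 11 is the right child of 9.]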

Inorder traversal:

    inorder(nd->leftchild)
    visit(nd)
    inorder(nd->rightchild)

An inorder traversal retrieves the data in sorted order.

[Figure: inorder traversal of the example tree, with each node numbered in the order it is visited: 9, 11, 13, 16, 17, 20, 27, 39.]
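To see the sorted-order claim concretely, the sketch below collects an inorder traversal into a vector. Collecting into a `std::vector` and the `Node` layout are assumptions made for illustration.

```cpp
#include <vector>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Inorder: left subtree, then the node itself, then the right subtree.
// On a binary search tree this appends the values in non-decreasing order.
void inorder(const Node* nd, std::vector<int>& out) {
    if (nd == nullptr) return;
    inorder(nd->leftchild, out);
    out.push_back(nd->data);
    inorder(nd->rightchild, out);
}
```

The order is sorted because everything in a node's left subtree is smaller than the node and is emitted first, while everything in its right subtree is at least as large and is emitted afterwards.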

Binary search trees can be implemented using a reference structure. Tree nodes contain data and two pointers to nodes:
- Node* leftchild and Node* rightchild: references or pointers to Nodes
- data: the data to be stored in the tree (usually an object)

To find a value in a BST, search from the root node:
- If the target is less than the value in the node, search its left subtree.
- If the target is greater than the value in the node, search its right subtree.
- Otherwise return true (or a pointer to the data, or ...).

How many comparisons? One for each node on the path. Worst case: height of the tree + 1.


    bool search(Node* nd, int x) {
        if (nd == NULL) {
            // reached the end of this path
            return false;
        } else if (x == nd->data) {
            return true;
        } else if (x < nd->data) {
            // note the similarity to binary search
            return search(nd->leftchild, x);
        } else {
            return search(nd->rightchild, x);
        }
    }

Called by a helper method like this: search(root, target)

The BST property must hold after insertion, therefore the new node must be inserted in the correct position. This position is found by performing a search:
- If the search ends at the NULL left child of a node, make its left child refer to the new node.
- If the search ends at the NULL right child of a node, make its right child refer to the new node.

The cost is about the same as the cost for the search algorithm, O(height).
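The insertion rule above can be sketched recursively. This is one possible formulation, not necessarily the one used in the course: the function returns the (possibly new) root of the subtree so the caller can reattach it, duplicates go to the right to match the "greater than or equal" rule, and `destroy` is a hypothetical cleanup helper.

```cpp
// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Inserts x as a new leaf and returns the root of the updated subtree.
// Reaching a null pointer means the search ended: the new node goes here.
Node* insert(Node* nd, int x) {
    if (nd == nullptr) return new Node(x);
    if (x < nd->data) {
        nd->leftchild = insert(nd->leftchild, x);
    } else {
        nd->rightchild = insert(nd->rightchild, x);  // equal values go right
    }
    return nd;
}

// Hypothetical helper: frees every node in the tree.
void destroy(Node* nd) {
    if (nd == nullptr) return;
    destroy(nd->leftchild);
    destroy(nd->rightchild);
    delete nd;
}
```

A typical call is `root = insert(root, 43);`, starting from `Node* root = nullptr;` for an empty tree.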

[Figure: insert 43 into a BST rooted at 47. Create the new node, find its position by searching, and insert the new node; 43 becomes the left child of 44.]

After deletion the BST property must hold. Deletion is not as straightforward as search or insertion:
- With insertion the strategy is to insert a new leaf, which avoids changing the internal structure of the tree.
- This is not possible with deletion, since the deleted node's position is not chosen by the algorithm.

There are a number of different cases to be considered.

- The node to be deleted has no children: remove it (assigning NULL to its parent's reference).
- The node to be deleted has one child: replace the node with its subtree.
- The node to be deleted has two children.

[Figure: delete 30. Node 30 has no children, so it is removed from the tree.]

[Figure: delete 79. Node 79 has one child, so it is replaced with its subtree (rooted at 96). After deletion, 96 is the right child of 63.]

One of the issues with implementing a BST is the necessity to look at both children and, just like a linked list, to look ahead for insertion and deletion, and to check that a node is not null before accessing its member variables. Consider deleting a node with one child in more detail.

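Delete 59, step 1: we need to find the node to delete and its parent. It's useful to know if nd is a left or right child.

    Node* parent = NULL;
    Node* nd = root;
    bool isleftchild = false;
    // stop when the target is found or the path runs out
    while (nd != NULL && nd->data != target) {
        parent = nd;
        if (target < nd->data) {
            nd = nd->leftchild;
            isleftchild = true;
        } else {
            nd = nd->rightchild;
            isleftchild = false;
        }
    }
    if (nd == NULL) return;  // target is not in the tree

[Figure: the subtree rooted at 63, in which 59 is the right child of 54 and has one child, 57.]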

Delete 59: now we have enough information to detach 59 and attach its child to 54.

[Figure: parent refers to 54, nd refers to 59 and isleftchild = false; 59's only child is 57.]

The most difficult case is when the node to be deleted has two children. The strategy when the deleted node had one child was to replace it with its child, but when the node has two children problems arise:
- Which child should we replace the node with? We could solve this by just picking one.
- But what if the node we replace it with also has two children? This will cause a problem.

Delete 32: let's say that we decide to replace it with its right child (41). But 41 has 2 children, and it also has to inherit (adopt?) the other child of its deleted parent.

[Figure: the example tree rooted at 47, with 32 as the node to delete and 19 and 41 as its children.]

When a node has two children, instead of replacing it with one of its children, find its predecessor.
- A node's predecessor is the rightmost node of its left subtree.
- The predecessor is the node in the tree with the largest value less than the node's value.
- The predecessor cannot have a right child and can therefore have at most one child. Why?

32's predecessor: the predecessor of 32 is the rightmost node in its left subtree (30). The predecessor cannot have a right child, as it wouldn't then be the rightmost node.

The predecessor has some useful properties:
- Because of the BST property it must be the largest value less than its ancestor's value: it is to the right of all of the other nodes in its ancestor's left subtree, so it must be greater than them.
- It is less than the nodes in its ancestor's right subtree.
- It can have at most one child.

These properties make it a good candidate to replace its ancestor.

The successor to a node is the leftmost node of its right subtree.
- It has the smallest value greater than its ancestor's value.
- It cannot have a left child.

The successor can also be used to replace a deleted node. Pick either the predecessor or successor!
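Both lookups are short loops. The sketch below follows the definitions used here, so it only looks inside the node's own subtrees and returns a null pointer when the relevant subtree is empty; the function names and the `Node` layout are assumptions for illustration.

```cpp
// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Rightmost node of nd's left subtree, or nullptr if there is no left subtree.
Node* subtreePredecessor(Node* nd) {
    if (nd == nullptr || nd->leftchild == nullptr) return nullptr;
    Node* cur = nd->leftchild;
    while (cur->rightchild != nullptr) cur = cur->rightchild;
    return cur;  // by construction it has no right child
}

// Leftmost node of nd's right subtree, or nullptr if there is no right subtree.
Node* subtreeSuccessor(Node* nd) {
    if (nd == nullptr || nd->rightchild == nullptr) return nullptr;
    Node* cur = nd->rightchild;
    while (cur->leftchild != nullptr) cur = cur->leftchild;
    return cur;  // by construction it has no left child
}
```

For deletion of a node with two children both subtrees are non-empty, so neither function returns null in that case.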

Example: delete 32 using its successor.
1. Find the successor (37, the leftmost node of 32's right subtree) and detach it, keeping a reference to it in temp.
2. Attach the node's children (19 and 41) to its successor.
3. Make the successor a child of the node's parent (47).

In this example the successor had no subtree.

Example: delete 63 using its predecessor (just because).
1. Find the predecessor (59, the rightmost node of 63's left subtree), keeping a reference to it in temp.
2. Attach the predecessor's subtree (57) to its parent (54).
3. Attach the node's children (54 and 79) to the predecessor.
4. Attach the predecessor to the node's parent (47).

Instead of deleting a BST node, mark it as deleted in some way:
- Set the data object to null, for example.
- Change the insertion algorithm to look for empty nodes, and insert the new item in an empty node that is found on the way down the tree.

An alternative to the deletion approach for nodes with 2 children is to replace the data:
- The data from the predecessor node is copied into the node to be deleted.
- The predecessor node is then deleted, using the approach described for deleting nodes with one or no children.

This avoids some of the complicated pointer assignments.
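The copy-the-predecessor approach can be sketched as a recursive function that handles all three cases. This is an illustrative formulation under assumptions, not the course's reference code: `removeValue`, `insert` and `destroy` are hypothetical names, and each call returns the root of the updated subtree so the parent can reattach it.

```cpp
// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Inserts x as a new leaf (equal values go right); returns the subtree root.
Node* insert(Node* nd, int x) {
    if (nd == nullptr) return new Node(x);
    if (x < nd->data) nd->leftchild = insert(nd->leftchild, x);
    else nd->rightchild = insert(nd->rightchild, x);
    return nd;
}

// Removes one node holding x (if any) and returns the updated subtree root.
Node* removeValue(Node* nd, int x) {
    if (nd == nullptr) return nullptr;  // x is not in the tree
    if (x < nd->data) {
        nd->leftchild = removeValue(nd->leftchild, x);
    } else if (x > nd->data) {
        nd->rightchild = removeValue(nd->rightchild, x);
    } else if (nd->leftchild != nullptr && nd->rightchild != nullptr) {
        // Two children: copy the predecessor's data into this node,
        // then delete the predecessor, which has at most one child.
        Node* pre = nd->leftchild;
        while (pre->rightchild != nullptr) pre = pre->rightchild;
        nd->data = pre->data;
        nd->leftchild = removeValue(nd->leftchild, pre->data);
    } else {
        // No children or one child: replace the node with its subtree.
        Node* child = (nd->leftchild != nullptr) ? nd->leftchild : nd->rightchild;
        delete nd;
        return child;
    }
    return nd;
}

// Hypothetical helper: frees every node in the tree.
void destroy(Node* nd) {
    if (nd == nullptr) return;
    destroy(nd->leftchild);
    destroy(nd->rightchild);
    delete nd;
}
```

Because only data is copied in the two-children case, the deleted node's parent and children never need to be re-linked; the only structural change happens lower down, at the predecessor.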

The efficiency of BST operations depends on the height of the tree. All three operations (search, insert and delete) are O(height). If the tree is complete the height is floor(log2(n)), i.e. O(log n). What if it isn't complete?

Insert 7, insert 4, insert 1, insert 9, insert 5. It's a complete tree!

[Figure: 7 is the root with children 4 and 9; 4 has children 1 and 5. height = floor(log2(5)) = 2.]

Insert 9, insert 1, insert 7, insert 4, insert 5. It's a linked list with a lot of extra pointers!

[Figure: a single chain 9, 1, 7, 4, 5, each node having only one child. height = n - 1 = 4 = O(n).]
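The effect of insertion order on height can be reproduced with a few lines of code. In this sketch `build`, `insert`, `height` and `destroy` are hypothetical helper names, and height is counted in edges with an empty tree at -1 (an assumed convention).

```cpp
#include <algorithm>
#include <vector>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Inserts x as a new leaf (equal values go right); returns the subtree root.
Node* insert(Node* nd, int x) {
    if (nd == nullptr) return new Node(x);
    if (x < nd->data) nd->leftchild = insert(nd->leftchild, x);
    else nd->rightchild = insert(nd->rightchild, x);
    return nd;
}

// Builds a BST by inserting the values in the order given.
Node* build(const std::vector<int>& values) {
    Node* root = nullptr;
    for (int v : values) root = insert(root, v);
    return root;
}

// Height in edges; an empty tree has height -1 by convention.
int height(const Node* nd) {
    if (nd == nullptr) return -1;
    return 1 + std::max(height(nd->leftchild), height(nd->rightchild));
}

// Frees every node in the tree.
void destroy(Node* nd) {
    if (nd == nullptr) return;
    destroy(nd->leftchild);
    destroy(nd->rightchild);
    delete nd;
}
```

Building from `{7, 4, 1, 9, 5}` gives a tree of height 2, while the same values in the order `{9, 1, 7, 4, 5}` give a chain of height 4.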

It would be ideal if a BST was always close to complete, i.e. balanced. How do we guarantee a balanced BST? We have to make the structure and/or the insertion and deletion algorithms more complex, e.g. red-black trees.

It is possible to sort an array using a binary search tree:
- Insert the array items into an empty tree.
- Write the data from the tree back into the array using an inorder traversal.

Running time = n * (insertion cost) + traversal:
- Insertion cost is O(h)
- Traversal is O(n)
- Total = O(n) * O(h) + O(n), i.e. O(n * h)
- If the tree is balanced, this is O(n * log(n))
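The two TreeSort steps can be sketched as below. This is a minimal illustration using a plain, unbalanced BST, so the O(n * log(n)) bound only holds when the input order happens to produce a balanced tree; `treeSort`, `writeBack`, `insert` and `destroy` are hypothetical names, and a `std::vector<int>` stands in for the array.

```cpp
#include <cstddef>
#include <vector>

// Assumed node layout: data plus pointers to the left and right children.
struct Node {
    int data;
    Node* leftchild;
    Node* rightchild;
    explicit Node(int d) : data(d), leftchild(nullptr), rightchild(nullptr) {}
};

// Inserts x as a new leaf (equal values go right); returns the subtree root.
Node* insert(Node* nd, int x) {
    if (nd == nullptr) return new Node(x);
    if (x < nd->data) nd->leftchild = insert(nd->leftchild, x);
    else nd->rightchild = insert(nd->rightchild, x);
    return nd;
}

// Inorder traversal that writes each value into a[i] and advances i.
void writeBack(const Node* nd, std::vector<int>& a, std::size_t& i) {
    if (nd == nullptr) return;
    writeBack(nd->leftchild, a, i);
    a[i++] = nd->data;
    writeBack(nd->rightchild, a, i);
}

// Frees every node in the tree.
void destroy(Node* nd) {
    if (nd == nullptr) return;
    destroy(nd->leftchild);
    destroy(nd->rightchild);
    delete nd;
}

// Sorts a in place: n insertions, then one inorder traversal.
void treeSort(std::vector<int>& a) {
    Node* root = nullptr;
    for (int x : a) root = insert(root, x);
    std::size_t i = 0;
    writeBack(root, a, i);
    destroy(root);
}
```

Feeding this sketch already-sorted input produces the degenerate chain described earlier, which is the O(n * h) = O(n^2) worst case.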