Trees
Dr. Ronaldo Menezes, Hugo Serrano (hbarbosafilh2011@my.fit.edu)
Florida Tech

Introduction to Trees
Trees are very common in computer science.
They come in different variations.
They are used for data representation in many applications.
They appear frequently in algorithmic solutions.

Trees
A tree is a set of nodes (vertices) and edges (arcs) that connect the nodes together.
A path is a connected sequence of edges.
There is exactly one path between any two nodes in a tree, so there are no cycles in trees.

Trees
A tree may have one distinguished node called the root. Such trees are called rooted trees.
A tree with no root is called a free tree.

Trees
The convention is to draw the root at the top.
In a rooted tree, any node is the root of a subtree.
[Figure: a rooted tree with the root, a subtree, and the root of that subtree labeled]

Relationships between nodes
Family trees are the inspiration for naming relationships between nodes: parent, children, siblings, ancestors, grandparents.
Parent node: Every node except the root has exactly one parent. The parent is the first node on the path from a node to the root. The root has no parent. If node p is c's parent, then node c is p's child.
Leaf node: A node can have any number of children. Nodes with no children are called leaf nodes.
Siblings: Nodes that have the same parent.
Ancestors: The nodes on the path from a node to the root, including the root. A node is an ancestor of itself. If a node a is an ancestor of a node b, then b is a descendant of a.

Structural Properties
Length of a path: the number of edges in the path.
Depth of a node: the length of the path from the node to the root.
Height of a node: the length of the path from the node to its deepest descendant.
Height of the tree: the height of the root.
Subtree rooted at a node: the tree formed by the node and all its descendants.
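The depth and height definitions can be sketched in a few lines of Java. This is only an illustration under assumptions: the `Node` class, its `addChild` helper, and the four-node tree are hypothetical and not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

class TreeMetrics {
    // Hypothetical node type: an element, a parent link, and a list of children.
    static class Node {
        final String element;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String element) { this.element = element; }

        Node addChild(String childElement) {
            Node child = new Node(childElement);
            child.parent = this;
            children.add(child);
            return child;
        }
    }

    // Depth: number of edges on the path from the node up to the root.
    static int depth(Node node) {
        int edges = 0;
        while (node.parent != null) {
            node = node.parent;
            edges++;
        }
        return edges;
    }

    // Height: number of edges on the path down to the deepest descendant.
    static int height(Node node) {
        int tallest = -1; // a leaf ends up with height 0
        for (Node child : node.children) {
            tallest = Math.max(tallest, height(child));
        }
        return tallest + 1;
    }

    public static void main(String[] args) {
        // Assumed example: A has children B and C; B has child D.
        Node a = new Node("A");
        Node b = a.addChild("B");
        a.addChild("C");
        Node d = b.addChild("D");
        System.out.println("depth(D) = " + depth(d));   // 2
        System.out.println("height(A) = " + height(a)); // 2, the height of the tree
    }
}
```

Depth follows parent links upward, while height needs the children, which is one reason a node often stores both.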

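Alternate Tree Representation
There are several ways to represent a tree. Using the definition based on sets, we could say that:
T = {A, {G,{Y}}, {P,{K},{L}}, {W,{T},{C},{B}}}
[Figure: two drawings of the same tree. A is the root with children G, P, and W; G has child Y; P has children K and L; W has children T, C, and B.]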

Trees with Specific Number of Children
Binary tree: each node has at most two children (one left child and one right child).
Quadtree: each node has at most 4 children.
Octree: each node has at most 8 children.
M-ary tree: each node has at most M children.

M-ary tree Examples
[Figures: a quadtree and an octree]

The Tree Abstract Data Type
Generic methods: size(), isempty(), enumerate()
Positional methods: swapelement(p,q), replaceelement(p,n)
Query methods: isroot(), isleaf()
Accessor methods: root(), parent(), children()
Update methods: insert(p), delete(p)

The Tree Abstract Data Type
[Figure: a tree node holding a parent reference, an element, and a collection of children]
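A node with a parent reference, an element, and a collection of children could be sketched as below. This is a minimal sketch, not the course's implementation: the class name `TreeNode`, the camel-case method names, and the choice of `ArrayList` for the children are assumptions, and only a few of the ADT methods are shown.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleTreeDemo {
    // One position in the tree: parent, element, and a collection of children.
    static class TreeNode<E> {
        E element;
        TreeNode<E> parent;
        final List<TreeNode<E>> children = new ArrayList<>();

        TreeNode(E element, TreeNode<E> parent) {
            this.element = element;
            this.parent = parent;
        }

        // Query methods.
        boolean isRoot() { return parent == null; }
        boolean isLeaf() { return children.isEmpty(); }

        // Update method: attach a new child holding the given element.
        TreeNode<E> insert(E childElement) {
            TreeNode<E> child = new TreeNode<>(childElement, this);
            children.add(child);
            return child;
        }

        // Generic method: number of nodes in the subtree rooted here.
        int size() {
            int count = 1;
            for (TreeNode<E> child : children) {
                count += child.size();
            }
            return count;
        }
    }

    public static void main(String[] args) {
        TreeNode<String> root = new TreeNode<>("A", null);
        TreeNode<String> g = root.insert("G");
        g.insert("Y");
        root.insert("P");
        System.out.println("size = " + root.size());        // 4
        System.out.println("G is a leaf? " + g.isLeaf());    // false
        System.out.println("A is the root? " + root.isRoot()); // true
    }
}
```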

Structure of a Binary Tree Node
Nodes in binary trees have a structure similar to that of linked-list nodes. Rather than next and previous pointers, they maintain the following:
Right: pointer to the root of the right subtree
Left: pointer to the root of the left subtree
Parent: pointer to the parent node
[Figure: a node drawn with the fields parent, left, DATA, and right]

Tree Traversal

Traversing Trees
In general there are two ways of traversing trees:
Depth-based traversal
Level-order traversal

Cases of Depth-based Traversal
There are three basic ways to traverse a tree using a depth-based approach:
Preorder
Inorder: this traversal is better associated with binary trees.
Postorder: this is what authors normally mean if they mention just depth-first search in the context of trees.

Preorder Traversal
[Figure: example binary tree with nodes A B C D E F G H I]
Visiting one node per step gives:
Result: ABDEHCFGI

Preorder Algorithm
void preorder(tree t) {
    if (t != null) {
        visit(t);                 // visit the node first
        preorder(t.getleft());    // then its left subtree
        preorder(t.getright());   // then its right subtree
    }
}

Inorder Traversal
[Figure: example binary tree with nodes A B C D E F G H I]
Visiting one node per step gives:
Result: DBHEAFCGI

Inorder Algorithm
void inorder(tree t) {
    if (t != null) {
        inorder(t.getleft());     // left subtree first
        visit(t);                 // then the node itself
        inorder(t.getright());    // then the right subtree
    }
}

Postorder Traversal
[Figure: example binary tree with nodes A B C D E F G H I]
Visiting one node per step gives:
Result: DHEBFIGCA

Postorder Algorithm
void postorder(tree t) {
    if (t != null) {
        postorder(t.getleft());   // left subtree first
        postorder(t.getright());  // then the right subtree
        visit(t);                 // the node itself comes last
    }
}
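The three algorithms can be tried on the nine-node example. The sketch below is runnable Java under some assumptions: the `Node` class and `exampleTree` helper are hypothetical, "visiting" a node means appending its label to a `StringBuilder`, and the tree shape (H as the left child of E, I as the right child of G) is inferred from the preorder and inorder results shown in the slides.

```java
class DepthTraversals {
    // Hypothetical binary tree node holding a one-letter label.
    static class Node {
        final char label;
        final Node left, right;
        Node(char label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    static void preorder(Node t, StringBuilder out) {
        if (t != null) {
            out.append(t.label);      // visit, then left, then right
            preorder(t.left, out);
            preorder(t.right, out);
        }
    }

    static void inorder(Node t, StringBuilder out) {
        if (t != null) {
            inorder(t.left, out);     // left, then visit, then right
            out.append(t.label);
            inorder(t.right, out);
        }
    }

    static void postorder(Node t, StringBuilder out) {
        if (t != null) {
            postorder(t.left, out);   // left, then right, then visit
            postorder(t.right, out);
            out.append(t.label);
        }
    }

    // Builds the example tree; the shape is inferred from the slide results.
    static Node exampleTree() {
        Node d = new Node('D', null, null);
        Node h = new Node('H', null, null);
        Node e = new Node('E', h, null);
        Node b = new Node('B', d, e);
        Node f = new Node('F', null, null);
        Node i = new Node('I', null, null);
        Node g = new Node('G', null, i);
        Node c = new Node('C', f, g);
        return new Node('A', b, c);
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        StringBuilder pre = new StringBuilder();
        StringBuilder in = new StringBuilder();
        StringBuilder post = new StringBuilder();
        preorder(root, pre);
        inorder(root, in);
        postorder(root, post);
        System.out.println(pre);   // ABDEHCFGI
        System.out.println(in);    // DBHEAFCGI
        System.out.println(post);  // DHEBFIGCA
    }
}
```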

Generic Traversal
The depth-based traversals described are special cases of a generic traversal for binary trees: the Euler tour traversal.
It consists of walking around the tree, traversing each edge exactly 2 times (or visiting each node 3 times).
[Figure: Euler tour around a binary tree with nodes A B C D E F G U P H R X N Q I W L]
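One way to see why preorder, inorder, and postorder are special cases is to record all three visits of a node in a single recursive walk. The following is a sketch under assumptions: the `Node` class, the `tour` method, and the three-node tree are hypothetical, and a missing child is treated as an immediate return to the node.

```java
class EulerTourDemo {
    // Hypothetical binary tree node.
    static class Node {
        final char label;
        final Node left, right;
        Node(char label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    // Walks around the tree once and records each of the three visits.
    static void tour(Node t, StringBuilder first, StringBuilder second, StringBuilder third) {
        if (t == null) return;
        first.append(t.label);   // visit 1: before the left subtree (preorder)
        tour(t.left, first, second, third);
        second.append(t.label);  // visit 2: between the subtrees (inorder)
        tour(t.right, first, second, third);
        third.append(t.label);   // visit 3: after the right subtree (postorder)
    }

    public static void main(String[] args) {
        // Assumed example: A with left child B and right child C.
        Node root = new Node('A', new Node('B', null, null), new Node('C', null, null));
        StringBuilder first = new StringBuilder();
        StringBuilder second = new StringBuilder();
        StringBuilder third = new StringBuilder();
        tour(root, first, second, third);
        System.out.println(first);   // ABC
        System.out.println(second);  // BAC
        System.out.println(third);   // BCA
    }
}
```

Keeping only one of the three recording points recovers the corresponding depth-based traversal.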

Level-order traversal
[Figure: example binary tree with nodes A B C D E F G H I]
Visiting the nodes level by level, left to right, gives:
Result: ABCDEFGHI
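The slides show only the result of the level-order traversal. A common way to obtain it, sketched here as an assumption rather than as the course's method, is to keep the nodes still to be visited in a FIFO queue. The `Node` class and `exampleTree` helper are hypothetical; the tree shape matches the one implied by the depth-based results.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LevelOrderDemo {
    // Hypothetical binary tree node.
    static class Node {
        final char label;
        final Node left, right;
        Node(char label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    // Visits nodes level by level, left to right, using a FIFO queue.
    static String levelOrder(Node root) {
        StringBuilder out = new StringBuilder();
        if (root == null) return out.toString();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            out.append(current.label);
            // Children join the back of the queue, so they are visited
            // only after every node of the current level.
            if (current.left != null) queue.add(current.left);
            if (current.right != null) queue.add(current.right);
        }
        return out.toString();
    }

    static Node exampleTree() {
        Node d = new Node('D', null, null);
        Node h = new Node('H', null, null);
        Node e = new Node('E', h, null);
        Node b = new Node('B', d, e);
        Node f = new Node('F', null, null);
        Node i = new Node('I', null, null);
        Node g = new Node('G', null, i);
        Node c = new Node('C', f, g);
        return new Node('A', b, c);
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(exampleTree())); // ABCDEFGHI
    }
}
```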

Full & Complete Trees
A full binary tree is a binary tree in which each node has exactly zero or two children.
A perfect binary tree is one in which all the nodes except the leaves have exactly two children and all leaves are at the same level.
A complete binary tree is one in which every level except possibly the last is completely filled, and all the nodes in the bottommost level are shifted to the left.
[Figures: a complete tree and an incomplete tree]

Array Implementation of Binary Trees
There are several ways to implement a binary tree. Although it may appear that we must use pointers, this is not mandatory, and other implementations may be more efficient depending on the tree structure.
One way to represent binary trees is using arrays. For this to be possible, we need to number the elements in such a way that the operations to find other nodes can be done systematically.
One of the best ways to number the elements (nodes) is a level-ordering approach. Start from the top and number the root with 1, move down to the next level and number its nodes 2 and 3 starting from the left, and repeat the process until there are no more levels. This is similar to what level-order traversal does.
Sequential representation is more cost-effective when the tree is complete.

Level Order of a Binary Tree
Given a complete binary tree with the nodes Z F S G J H Q X C P, performing the level-by-level numbering gives the level-order numbering of all nodes:
1: Z, 2: F, 3: S, 4: G, 5: J, 6: H, 7: Q, 8: X, 9: C, 10: P

Implementing Trees with Arrays
By ordering a complete binary tree using level ordering, we can see the relationship between the order and the indexes of an array, as below (index 0 is unused):
Index:    0  1  2  3  4  5  6  7  8  9  10
Element:  -  Z  F  S  G  J  H  Q  X  C  P
Given the structure above, how do we find the nodes of the binary tree? Here n is the number of nodes.
To find                  | Use            | Provided
The left child of bt[i]  | bt[2 * i]      | 2 * i <= n
The right child of bt[i] | bt[2 * i + 1]  | 2 * i + 1 <= n
The parent of bt[i]      | bt[i / 2]      | i > 1
The root                 | bt[1]          | bt is non-empty
Whether bt[i] is a leaf  | TRUE           | 2 * i > n
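The index rules translate directly into code. The sketch below assumes a 1-based layout with index 0 left unused, as the rule "the root is bt[1]" suggests; the helper names and the choice of -1 for "no such node" are hypothetical.

```java
class ArrayBinaryTree {
    // Index 0 is unused so that the root sits at index 1.
    static final char[] bt = {'\0', 'Z', 'F', 'S', 'G', 'J', 'H', 'Q', 'X', 'C', 'P'};
    static final int n = bt.length - 1; // number of nodes, 10 here

    // Each helper returns an index into bt, or -1 when the node does not exist.
    static int left(int i)   { return 2 * i <= n ? 2 * i : -1; }
    static int right(int i)  { return 2 * i + 1 <= n ? 2 * i + 1 : -1; }
    static int parent(int i) { return i > 1 ? i / 2 : -1; }

    // In a complete tree a node is a leaf exactly when it has no left child.
    static boolean isLeaf(int i) { return 2 * i > n; }

    public static void main(String[] args) {
        System.out.println(bt[left(1)]);    // F, the left child of the root Z
        System.out.println(bt[right(1)]);   // S
        System.out.println(bt[parent(10)]); // J, the parent of P
        System.out.println(right(5));       // -1, J has no right child
        System.out.println(isLeaf(6));      // true, H is a leaf
    }
}
```

No pointers are stored at all: the position in the array encodes the structure, which is why this layout works best when the tree is complete.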

Binary Search Trees
BSTs are binary trees where the nodes are organized as follows:
All the elements in the left subtree of a node are less than the node.
All the elements in the right subtree of a node are greater than the node.
Both the left and right subtrees are BSTs, meaning they have to conform to these properties.
Based on the description above, you can see that BSTs do not accept repeated elements: all elements are unique in value.

Implementing Binary Search Trees
The structure of a node for a binary search tree is very much like the structure of a node in a doubly linked list. The main difference is that the references from each node are now interpreted differently: they are left or right.
This leads to the following definition for a node in a binary search tree (an integer tree in this case):
class NodeType {
    int data;          // or other type
    NodeType left;
    NodeType right;
    NodeType parent;
}

Searching in Binary Search Trees
Searching is the reason we have BSTs. Like most operations in a BST, search is better described in a recursive fashion. The pseudo-code below describes the idea.
search (target)
1. If the head of the tree is null, return null.
2. Test whether the target is the same as the key in the head of the tree, and return a pointer to the head if true.
3. If not, and the target is less than the key in the head, return the search of the target in the left subtree.
4. If the target is greater than the key in the head, return the search of the target in the right subtree.

search(10)
[Figure: example BST, nodes listed level by level: 12; 9 16; 2 11 13 21; 1 3 10]

search in (quasi-)java
// Simplified idea of searching a binary search tree.
// You may (or may not) need to account for "special" cases.
// Also, the syntax is not necessarily correct; it is simplified to improve clarity.
NodeType search (NodeType link, int target) {
    if (link == null)
        return null;
    if (target == link.data)                 // Found the node
        return link;
    if (target < link.data)
        return search(link.left, target);    // Try the left subtree
    else
        return search(link.right, target);   // Try the right subtree
}

Insertion in Binary Search Trees
To insert an element in a BST we follow the same principle applied in searching: compare the value to be inserted with the current key and decide whether the new element should be inserted on the left or on the right.
The element is always inserted as a leaf in the tree.
insert (target)
1. If the head of the tree is null, create a new node to store the target.
2. Test whether the target is the same as the key in the head of the tree, and return null if so (this means that the element cannot be added to the tree).
3. If not, and the target is less than the key in the head, insert the target in the left subtree.
4. If the target is greater than the key in the head, insert the target in the right subtree.

insert(14)
[Figure: tree before insertion: 12 9 16 2 11 13 21 1 3 10]
[Figure: tree after insertion, with 14 added as a new leaf: 12 9 16 2 11 13 21 1 3 10 14]
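Search and insertion can be put together in a small runnable sketch. It reuses the `NodeType` fields from the slides, but several details are assumptions: `insert` returns the root of the (possibly updated) subtree instead of null on a duplicate, duplicates are silently ignored, and the example tree is rebuilt by inserting its keys level by level.

```java
class BstInsertSearch {
    static class NodeType {
        int data;
        NodeType left, right, parent;
        NodeType(int data, NodeType parent) {
            this.data = data;
            this.parent = parent;
        }
    }

    // Recursive search: returns the node holding target, or null if absent.
    static NodeType search(NodeType link, int target) {
        if (link == null) return null;
        if (target == link.data) return link;
        if (target < link.data) return search(link.left, target);
        return search(link.right, target);
    }

    // Inserts target as a new leaf and returns the root of this subtree.
    // Assumption: a duplicate key leaves the tree unchanged.
    static NodeType insert(NodeType link, NodeType parent, int target) {
        if (link == null) return new NodeType(target, parent);
        if (target < link.data) {
            link.left = insert(link.left, link, target);
        } else if (target > link.data) {
            link.right = insert(link.right, link, target);
        }
        return link;
    }

    // Rebuilds the example tree from the slides, level by level.
    static NodeType exampleTree() {
        NodeType root = null;
        for (int key : new int[] {12, 9, 16, 2, 11, 13, 21, 1, 3, 10}) {
            root = insert(root, null, key);
        }
        return root;
    }

    public static void main(String[] args) {
        NodeType root = exampleTree();
        System.out.println(search(root, 10).parent.data); // 11
        root = insert(root, null, 14);
        System.out.println(search(root, 14).parent.data); // 13, so 14 hangs below 13
        System.out.println(search(root, 7));              // null, 7 is not in the tree
    }
}
```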

Find Maximum
Finding the maximum is another common operation in binary search trees and is helpful when implementing deletion. Again, recursive thinking makes this implementation much easier to understand.
The idea is simple. If a node has a non-null right pointer, this node cannot be the maximum, because the element on the right must be greater.
findmax ()
1. If the head of the tree is null, return null.
2. If the right pointer from the head is null, the head is the maximum.
3. Else (it is not null), return the result of findmax in the right subtree.

Find Minimum
Finding the minimum value in a binary search tree will also help the delete operation. The idea is very much like that of the maximum.
findmin ()
1. If the head of the tree is null, return null.
2. If the left pointer from the head is null, the head is the minimum.
3. Else (it is not null), return the result of findmin in the left subtree.
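Both operations follow a single chain of pointers, which a short Java sketch makes visible. The `Node` class, its constructors, and the hand-built example tree are assumptions for illustration; the two methods mirror the numbered steps.

```java
class BstExtremes {
    // Hypothetical BST node with convenience constructors.
    static class Node {
        final int key;
        final Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
        Node(int key) { this(key, null, null); }
    }

    // Keep going right: the last node on that chain holds the maximum.
    static Node findMax(Node head) {
        if (head == null) return null;
        if (head.right == null) return head;
        return findMax(head.right);
    }

    // Keep going left: the last node on that chain holds the minimum.
    static Node findMin(Node head) {
        if (head == null) return null;
        if (head.left == null) return head;
        return findMin(head.left);
    }

    // The example tree: 12; 9 16; 2 11 13 21; 1 3 10.
    static Node exampleTree() {
        Node two = new Node(2, new Node(1), new Node(3));
        Node eleven = new Node(11, new Node(10), null);
        Node nine = new Node(9, two, eleven);
        Node sixteen = new Node(16, new Node(13), new Node(21));
        return new Node(12, nine, sixteen);
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        System.out.println(findMax(root).key);           // 21
        System.out.println(findMin(root).key);           // 1
        System.out.println(findMax(root.left.left).key); // 3, the maximum below node 2
    }
}
```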

Delete
Deletion in a BST is probably the least easy of the algorithms. The reason is that nodes can be deleted from anywhere, not only from the leaves. When inserting, we know that we are not dealing with internal nodes, and this makes it straightforward.
When deleting a node we have two cases:
Deleting a leaf. This is easy. Since a leaf does not point to any other node, we can just remove it from the tree.
Deleting an internal node. This involves moving nodes around, because after the deletion the tree must still hold its main property.
Let's see some examples.

delete(10)
[Figure: tree before deletion: 12 9 16 2 11 13 21 1 3 10]
[Figure: tree after deletion, with the leaf 10 removed: 12 9 16 2 11 13 21 1 3]

delete(9)
[Figure: tree before deletion: 12 9 16 2 11 13 21 1 3]
We need to move elements around so that the tree does not become disconnected and we keep the BST property. Simply removing the node is not an option: the tree becomes disconnected and we lose the part of the tree below 9.
Instead, we can try to find an element below 9 that can be used in the position where 9 is. If we want to maintain the property of the tree we have two options: either the maximum element of the left subtree or the minimum element of the right subtree.
Let us choose the maximum of the left subtree, 3. The number 3 is copied to the node where 9 was (12 3 16 2 11 13 21 1 3).
This alone does not solve the problem, because we now have two elements with the same value. But we can solve it by deleting the number 3 from the left subtree of the current node (which now also holds 3).
[Figure: final tree: 12 3 16 2 11 13 21 1]

Delete
Delete can be made very simple if we implement functions like findmax, findmin and isleaf. These functions are used when deciding what to do.
This code should make you appreciate recursion, since it simplifies the job of deleting a node quite a lot.
Using what we've discussed, we can define pseudo-code for delete.
The elements returned from findmax and findmin are equivalent to the in-order predecessor and in-order successor of the node being deleted.

Delete pseudo-code
void delete (head, target) {
    if (head == null) return;                 // element not in the tree
    if (target < head.key)
        delete(head.left, target);
    else if (target > head.key)
        delete(head.right, target);
    else {                                    // found the element to delete
        if (isleaf(head)) {
            remove head from the tree
        } else if (head.left != null) {
            int max = findmax(head.left);
            head.key = max;
            delete(head.left, max);
        } else {
            int min = findmin(head.right);
            head.key = min;
            delete(head.right, min);
        }
    }
}
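The pseudo-code leaves "remove head from the tree" open. One possible way to make it concrete, sketched here as an assumption, is to have `delete` return the root of the updated subtree so that the caller can relink it. The `Node` class, the `insert` and `inorder` helpers, and the versions of `findMax` and `findMin` that return keys are hypothetical.

```java
class BstDeleteDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node head, int key) {
        if (head == null) return new Node(key);
        if (key < head.key) head.left = insert(head.left, key);
        else if (key > head.key) head.right = insert(head.right, key);
        return head;
    }

    // Largest key in a non-empty subtree (its rightmost node).
    static int findMax(Node head) {
        while (head.right != null) head = head.right;
        return head.key;
    }

    // Smallest key in a non-empty subtree (its leftmost node).
    static int findMin(Node head) {
        while (head.left != null) head = head.left;
        return head.key;
    }

    // Returns the root of the subtree after removing target from it.
    static Node delete(Node head, int target) {
        if (head == null) return null;                  // element not in the tree
        if (target < head.key) {
            head.left = delete(head.left, target);
        } else if (target > head.key) {
            head.right = delete(head.right, target);
        } else if (head.left == null && head.right == null) {
            return null;                                // a leaf is simply removed
        } else if (head.left != null) {
            int max = findMax(head.left);               // in-order predecessor
            head.key = max;
            head.left = delete(head.left, max);
        } else {
            int min = findMin(head.right);              // in-order successor
            head.key = min;
            head.right = delete(head.right, min);
        }
        return head;
    }

    // Keys in sorted order, separated by spaces.
    static String inorder(Node head) {
        if (head == null) return "";
        return inorder(head.left) + head.key + " " + inorder(head.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int key : new int[] {12, 9, 16, 2, 11, 13, 21, 1, 3, 10}) {
            root = insert(root, key);
        }
        root = delete(root, 10);  // leaf case
        root = delete(root, 9);   // internal node: replaced by 3
        System.out.println(inorder(root).trim()); // 1 2 3 11 12 13 16 21
        System.out.println(root.left.key);        // 3
    }
}
```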

Final comments about BSTs
Desirable characteristics of a BST:
Most operations can be done in O(h), where h is the height of the tree. In a randomly generated tree, the height is normally O(log n), where n is the number of nodes in the tree.
It can be used to help in other operations, for instance sorting a list.
Undesirable characteristics:
The main one is that it may become unbalanced (possibly leading to something that looks like a linked list), which degrades all operations. To solve this problem we'll look at balanced trees.
Most of the operations are better understood recursively, and recursion is usually more expensive than an iterative solution.

BST Exercise
The BST is a popular topic with interviewers because it is a simple concept that still requires some thinking for some problems. Take for instance the following problem (which has been asked in at least 3 interviews that I know of).
The problem is: given an ordinary binary tree, test whether it is a BST.

Solution 1
Using findmin() and findmax(). These are modified versions of findmax and findmin, not the ones described earlier, because the tree may not be a BST.
boolean isbst(tree node) {
    if (node == null) return true;
    // false if the max of the left subtree is >= the current node
    if ((node.getleft() != null) && (findmax(node.getleft()) >= node.getvalue()))
        return false;
    // false if the min of the right subtree is <= the current node
    if ((node.getright() != null) && (findmin(node.getright()) <= node.getvalue()))
        return false;
    // false if the left subtree or the right subtree is not a BST
    if (!isbst(node.getleft()) || !isbst(node.getright()))
        return false;
    // it is a BST if it gets here
    return true;
}

Solution 2
Uses a range of values. More efficient!
boolean isbst(tree node) {
    // assuming MIN_VALUE and MAX_VALUE cannot be in the tree
    return isbstrange(node, MIN_VALUE, MAX_VALUE);
}
boolean isbstrange(tree node, int min, int max) {
    if (node == null) return true;
    // false if the value in the current node violates the range constraints
    if (node.getvalue() <= min || node.getvalue() >= max)
        return false;
    // check the subtrees recursively;
    // the range has to be narrowed for each subtree
    return isbstrange(node.getleft(), min, node.getvalue())
        && isbstrange(node.getright(), node.getvalue(), max);
}
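A runnable variant of the range idea is sketched below. It departs from the slide in one labeled way: the bounds are `long` values, so every `int` key, including `Integer.MIN_VALUE` and `Integer.MAX_VALUE`, can appear in the tree. The `Node` class and the two sample trees are hypothetical.

```java
class BstRangeCheck {
    // Hypothetical binary tree node with int values.
    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
        Node(int value) { this(value, null, null); }
    }

    static boolean isBst(Node node) {
        // Assumption: long bounds lie strictly outside the int range.
        return isBstRange(node, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    // Every value in this subtree must lie strictly between min and max.
    static boolean isBstRange(Node node, long min, long max) {
        if (node == null) return true;
        if (node.value <= min || node.value >= max) return false;
        // Going left tightens the upper bound; going right tightens the lower one.
        return isBstRange(node.left, min, node.value)
            && isBstRange(node.right, node.value, max);
    }

    public static void main(String[] args) {
        Node good = new Node(12, new Node(9, new Node(2), new Node(11)), new Node(16));
        // 14 is a valid right child of 9 locally, but it sits in the left subtree of 12.
        Node bad = new Node(12, new Node(9, null, new Node(14)), new Node(16));
        System.out.println(isBst(good)); // true
        System.out.println(isBst(bad));  // false
    }
}
```

Each node is examined once, whereas Solution 1 rescans a subtree with findmax and findmin at every node; that difference is what makes the range version more efficient.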

Solution 3
I'll let you write the code. Do an inorder traversal of the tree and verify that the result is a sorted list.