Trees. A tree is a directed graph with the property

2: Trees

Trees

A tree is a directed graph with the following property: there is one node (the root) from which every other node can be reached by exactly one path.

We have seen lots of examples:

  Parse trees
  Decision trees
  Search trees
  Family trees
  Hierarchical structures (management, directories)

Trees (continued)

Trees have a natural recursive structure: any node in a tree has a number of children, each of which is itself the root of a tree.

Descriptions of Graphs and Trees

A directed graph is a pair (N, E) consisting of a set of nodes N together with a relation E on N. There is no restriction on the relation E.

a E b iff there is an edge from a to b.

[Figure: a directed graph on the nodes n1 to n5]

Descriptions of Graphs and Trees (continued)

A tree is a graph (N, P), where the relation P is called "has parent", with the following property: for any node n, there is at most one node n' with n P n'.

[Figure: an example on the nodes n1 to n8]

Descriptions of Graphs and Trees (continued)

If there is no node n' with n P n', then n is called a root node. A tree with a single root node is called a rooted tree. Often the word "tree" is used to mean a rooted tree, and the more general collection is known as a forest of trees.

For any node n, the set {n' | n' P n} is called the set of children of n. If a node n has no children, it is called a leaf.

It is not difficult to see that this is equivalent to the more usual recursive definition of a rooted tree.

Height and Depth

For any node n in a tree, the depth of n is the length of the path from the root to n (so the root has depth 0).

For any node n in a tree, the height of n is the length of the longest path from n to a leaf (so all leaves have height 0).

The height of a tree is the height of its root.
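The two definitions above can be sketched in C for a binary tree. This is only an illustration: the `Node` type and the function names `height` and `depth` are assumptions made for this example and are not part of the course code.

```c
#include <stddef.h>

/* Hypothetical node type used only for this sketch. */
struct Node {
    int key;
    struct Node *left, *right;
};

/* Height of the subtree rooted at t: length of the longest path to a leaf.
   The empty tree is given height -1 so that a leaf comes out as 0. */
int height(const struct Node *t)
{
    if (t == NULL)
        return -1;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Depth of node n below the root t (the root has depth 0),
   or -1 if n does not occur in the tree rooted at t. */
int depth(const struct Node *t, const struct Node *n)
{
    if (t == NULL)
        return -1;
    if (t == n)
        return 0;
    int d = depth(t->left, n);
    if (d < 0)
        d = depth(t->right, n);
    return d < 0 ? -1 : d + 1;
}
```

Giving the empty tree height -1 is a convention chosen here so that the recursion needs no special case for leaves.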

Representations of Trees

In this section we look at different ways in which rooted trees can be represented in a programming language. We have seen both SML and C representations of binary trees. In SML:

    datatype 'a TREE = Empty | Node of ('a TREE * 'a * 'a TREE)

Representations of Trees (continued)

and in C:

    typedef struct TreeNode *PtrToNode;

    struct TreeNode
    {
        ElementType element;
        PtrToNode   left;
        PtrToNode   right;
    };

    typedef PtrToNode Tree;

Representations of Trees (continued)

[Figure: an example tree with nodes A to H]

Representations of Trees (continued)

Non-binary trees are almost as simple:

    datatype 'a TREE = Empty | Node of ('a * 'a TREE list)

Representations of Trees (continued)

and in C:

    typedef struct TreeNode *PtrToNode;

    struct TreeNode
    {
        ElementType element;
        PtrToNode   FirstChild;
        PtrToNode   NextSibling;
    };

    typedef PtrToNode Tree;

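Representations of Trees (continued)

This looks almost the same as the binary tree representation, but is interpreted quite differently: FirstChild points to a node's first child, and NextSibling points to the next child of the same parent.

[Figure: an example tree with nodes A to J]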

Traversing Trees

We can list the nodes of a tree in one of several orders:

  Preorder: list the node, then recursively list all its child subtrees.
  Postorder: recursively list all child subtrees, then list the node.
  Inorder: only suitable for binary trees. List the left subtree, then the node, then the right subtree.

Exercise: write preorder and postorder listing functions for both the binary and n-ary trees.
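As one possible starting point for the binary case, the three orders can be sketched as follows. The `Node` type, the single-character labels and the convention of writing into a caller-supplied buffer are all assumptions made for this example; the n-ary versions are left as in the exercise.

```c
#include <stddef.h>

/* Hypothetical node type used only for this sketch. */
struct Node {
    char label;
    struct Node *left, *right;
};

/* Each function appends the labels it visits to out and returns the
   position just past the last label written. The caller must supply
   a buffer with room for one char per node. */
char *preorder(const struct Node *t, char *out)
{
    if (t == NULL)
        return out;
    *out++ = t->label;               /* node first */
    out = preorder(t->left, out);
    return preorder(t->right, out);
}

char *inorder(const struct Node *t, char *out)
{
    if (t == NULL)
        return out;
    out = inorder(t->left, out);
    *out++ = t->label;               /* node between the subtrees */
    return inorder(t->right, out);
}

char *postorder(const struct Node *t, char *out)
{
    if (t == NULL)
        return out;
    out = postorder(t->left, out);
    out = postorder(t->right, out);
    *out++ = t->label;               /* node last */
    return out;
}
```

For a tree with root A, left child B (with children D and E) and right child C, these list the nodes as A B D E C, D B E A C and D E B C A respectively.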

A Pointer-Free Representation

Suppose the nodes of a tree have names 1...n (or something that we can conveniently map to 1...n). We can represent a tree (or even a forest of trees) on these nodes using a single array. The array element a[i] contains the parent of node i or, if i is a root node, i itself.

A Pointer-Free Representation (continued)

So the array

    index     1  2  3  4  5  6
    contents  1  2  5  5  5  2

represents the forest of trees

    1        2        5
             |       / \
             6      3   4

Note that there is no restriction here on the amount of branching in the tree, since we give the parent relation directly, and not the children.

A Pointer-Free Representation (continued)

How would we find the first child of a node, or its next sibling, in this context?
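One possible answer, offered only as a sketch, is to scan the array. The function names below, the use of slot 0 as unused, the return value 0 for "none" and the decision to order children by increasing index are all assumptions made for this example.

```c
/* a[1..n] holds the parent of each node; a[i] == i marks a root.
   Slot 0 is unused. Both functions return 0 when there is no answer,
   and both take O(n) time because the array stores parents only. */

/* Lowest-numbered child of node i. */
int first_child(const int a[], int n, int i)
{
    for (int j = 1; j <= n; j++)
        if (j != i && a[j] == i)
            return j;
    return 0;
}

/* Next higher-numbered node with the same parent as node i. */
int next_sibling(const int a[], int n, int i)
{
    if (a[i] == i)
        return 0;                    /* a root has no siblings */
    for (int j = i + 1; j <= n; j++)
        if (a[j] == a[i] && j != a[i])   /* skip the parent itself */
            return j;
    return 0;
}
```

The linear scans show the price of storing only the parent relation: moving up the tree is a single array access, while moving down or sideways needs a search.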

A Pointer-Free Representation (continued)

This representation is very useful for representing a partition of the set 1...n.

A partition of a set X is a set of subsets X_i with the following properties:

  The union of all the sets X_i is X, i.e. ⋃_i X_i = X
  All the sets X_i are pairwise disjoint, i.e. for all i ≠ j, X_i ∩ X_j = ∅

So {{1,3},{2},{4}} is a partition of the set {1,2,3,4}.

A Pointer-Free Representation (continued)

A simple way of representing a partition is by using a forest of trees. For example, with X = {1,2,3,4,5,6}, the partition {{1},{2,6},{3,4,5}} can be represented by the forest

    1        2        5
             |       / \
             6      3   4

To determine whether two elements are in the same member of the partition, find the roots of the trees they are in and compare them.

A Pointer-Free Representation (continued)

To combine two elements of the partition, just graft the trees together by making the root of one tree the parent of the root of the other, e.g. to combine {2,6} and {3,4,5}:

        2
       / \
      6   5
         / \
        3   4

We can optimise the tree by flattening it:

          2
       / | | \
      5  6 3  4
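The operations just described (finding a root, testing membership, grafting and flattening) might look like this on the parent array. The function names are hypothetical, and flattening is done here during a lookup, which is one common choice rather than necessarily the one intended in the slides.

```c
/* parent[1..n] holds each node's parent; parent[i] == i marks a root.
   Slot 0 is unused. */

/* Follow parent links from i up to the root of its tree. */
int find_root(const int parent[], int i)
{
    while (parent[i] != i)
        i = parent[i];
    return i;
}

/* Two elements are in the same member of the partition
   exactly when their trees have the same root. */
int same_set(const int parent[], int i, int j)
{
    return find_root(parent, i) == find_root(parent, j);
}

/* Graft the tree containing j onto the root of the tree containing i. */
void union_sets(int parent[], int i, int j)
{
    int ri = find_root(parent, i);
    int rj = find_root(parent, j);
    if (ri != rj)
        parent[rj] = ri;
}

/* Find the root of i and make every node on the path from i
   point directly at that root (flattening). */
int find_and_flatten(int parent[], int i)
{
    int root = find_root(parent, i);
    while (parent[i] != root) {
        int next = parent[i];
        parent[i] = root;
        i = next;
    }
    return root;
}
```

With the array 1 2 5 5 5 2 from the earlier slide, `union_sets(parent, 2, 5)` sets `parent[5]` to 2, which is the grafted tree shown above; a later `find_and_flatten(parent, 4)` then makes 4 a direct child of 2.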

Binary Heaps

This is another type of tree with a pointer-free representation.

Recall from tutorial 3 that an essentially complete binary tree is one in which every level is full, except possibly the lowest level, which is filled from left to right.

Binary Heaps (continued)

A binary tree has the heap property if the value of the key at any node is less than or equal to the values of the keys of all its children.

              13
           /      \
         16        20
        /  \      /  \
      22    35  32    24
     /  \
   26    33

Sometimes "greater than or equal to" is used instead.

Binary Heaps (continued)

A binary heap is a binary tree with the following type invariant:

  it is essentially complete
  it has the heap property

The smallest element in a heap is always at its root. All operations on a heap must preserve the type invariant.

Binary Heaps (continued)

The above heap (or any complete binary tree) can be stored in an array as follows (slot 0 is not used for data):

    index  0   1   2   3   4   5   6   7   8   9   10
    value  -   13  16  20  22  35  32  24  26  33  -

Binary Heaps (continued)

In general, the root is stored in slot 1, the children of the element at position i can be found at positions 2i and 2i + 1, and its parent at position ⌊i/2⌋.

To insert into a heap, just insert after the last element (as long as there is room). This can destroy the heap property, so we then need to repair the heap to re-establish the invariant.

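Binary Heaps (continued)

For example, to insert 15 into the heap above. The four trees on the slide are shown here as array slots 1 to 10, with _ marking the hole that moves up towards the root:

    13  16  20  22  35  32  24  26  33          (original heap)
    13  16  20  22  _   32  24  26  33  35      (35 moves down into the new slot 10)
    13  _   20  22  16  32  24  26  33  35      (16 moves down into slot 5)
    13  15  20  22  16  32  24  26  33  35      (15 is placed in slot 2)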

Binary Heaps (continued)

The major use of heaps is as priority queues, so a general delete is not usually needed. The usual form of delete is deletemin, which removes the smallest element from the heap.

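Binary Heaps (continued)

This is performed as follows (array slots 1 to 10, with _ marking the hole that moves down from the root):

    _   15  20  22  16  32  24  26  33  35      (13 removed)
    15  _   20  22  16  32  24  26  33  35      (15, the smaller child, moves up)
    15  16  20  22  _   32  24  26  33  35      (16 moves up)
    15  16  20  22  35  32  24  26  33          (the last element, 35, fills the hole)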

Binary Heaps (continued)

If we delete the next lowest (array slots 1 to 9):

    _   16  20  22  35  32  24  26  33          (15 removed)
    16  _   20  22  35  32  24  26  33
    16  22  20  _   35  32  24  26  33
    16  22  20  26  35  32  24  _   33
    16  22  20  26  35  32  24  33              (the last element, 33, fills the hole)

Binary Heaps (continued)

and the next lowest (array slots 1 to 8):

    _   22  20  26  35  32  24  33              (16 removed)
    20  22  _   26  35  32  24  33
    20  22  24  26  35  32  _   33
    20  22  24  26  35  32  33                  (the last element, 33, fills the hole)

and so on.

Building Heaps

Given a list of keys, a heap can be built simply by using the insert function described above. A more efficient O(n) technique is the following:

  Put the list elements into the heap array in any order, without worrying about the heap invariant.
  Turn the array into a heap as follows: starting at the rightmost, deepest node with a child, swap its contents with that of one of its children to ensure the heap property for the tree below, then percolate that element down if necessary.
  Work leftwards and upwards to the root.

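Building Heaps (continued)

For example:

    index  0   1   2   3   4   5   6   7   8   9   10
    value  -   43  6   20  16  13  42  4   12  33  -

The trees on the slide, shown as array slots 1 to 9:

    43  6   20  16  13  42  4   12  33      (initial contents)
    43  6   20  12  13  42  4   16  33      (slot 4: 16 swapped with 12)
    43  6   4   12  13  42  20  16  33      (slot 3: 20 swapped with 4)
    43  6   4   12  13  42  20  16  33      (slot 2: 6 is already smaller than its children)
    4   6   43  12  13  42  20  16  33      (slot 1: 43 swapped with 4)
    4   6   20  12  13  42  43  16  33      (43 percolated down past 20)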
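The bottom-up construction can be sketched in C as below. This is a minimal illustration using plain `int` keys in slots 1 to n; the names `percolate_down` and `build_heap` are assumptions for this example and are not taken from the course code. It moves a hole downwards instead of performing explicit swaps, which gives the same result.

```c
/* Restore the heap property for the subtree rooted at slot i of
   a[1..n] (slot 0 unused), assuming both subtrees below i are heaps. */
static void percolate_down(int a[], int n, int i)
{
    int x = a[i];                       /* element being moved down */
    while (2 * i <= n) {
        int child = 2 * i;
        if (child < n && a[child + 1] < a[child])
            child++;                    /* pick the smaller child */
        if (a[child] >= x)
            break;                      /* x can stay in slot i */
        a[i] = a[child];                /* move the child up into the hole */
        i = child;
    }
    a[i] = x;
}

/* Turn a[1..n] into a min-heap. Slot n/2 is the rightmost, deepest
   node with a child, so the loop works leftwards and upwards from it. */
void build_heap(int a[], int n)
{
    for (int i = n / 2; i >= 1; i--)
        percolate_down(a, n, i);
}
```

Applied to the nine keys 43 6 20 16 13 42 4 12 33 from the example, this produces 4 6 20 12 13 42 43 16 33, the same final heap as in the trace above.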

Building Heaps (continued)

Using this technique and then removing the smallest element from the heap until it is empty gives an O(n lg n) sorting technique called heapsort.

Priority Queues

Normal queues are FIFO devices. In a priority queue, each element entering the queue is assigned a value, usually a number, and the first element to leave the queue is the one with the lowest value.

Priority queues are used in operating systems, among other places. The most common implementation of a priority queue is the heap.

Data Types for Priority Queues

    struct HeapStruct
    {
        unsigned int Capacity;    /* Capacity of heap */
        unsigned int Size;        /* Current # of elements in heap */
        ElementType *Elements;
    };

    typedef struct HeapStruct *PriorityQueue;

Initialising a Priority Queue

    #include <stdlib.h>    /* malloc */

    PriorityQueue CreatePq( unsigned int MaxElements )
    {
        PriorityQueue H;

        H = (PriorityQueue) malloc( sizeof( struct HeapStruct ) );
        if( H == NULL )
            FatalError("Out of space!!!");

        /* Allocate the array + one extra for sentinel */
        H->Elements = (ElementType *) malloc( (MaxElements+1) * sizeof(ElementType) );
        if( H->Elements == NULL )
            FatalError("Out of space!!!");

        H->Capacity = MaxElements;
        H->Size = 0;
        H->Elements[0] = MINDATA;    /* MINDATA less than ALL possible data elements */

        return H;
    }

Insertion in a Priority Queue

    /* H->Elements[0] is a sentinel */
    void insert( ElementType x, PriorityQueue H )
    {
        unsigned int i;

        if( IsFull( H ) )
            error("priority queue is full");
        else
        {
            i = ++H->Size;
            /* Move the hole up until x is no smaller than its parent. */
            /* The sentinel in slot 0 stops the loop at the root. */
            while( H->Elements[i/2] > x )
            {
                H->Elements[i] = H->Elements[i/2];
                i /= 2;
            }
            H->Elements[i] = x;
        }
    }

Deletion from a Priority Queue

    ElementType DeleteMin( PriorityQueue H )
    {
        unsigned int i, child;
        ElementType MinElement, LastElement;

        if( IsEmpty( H ) )
        {
            error("priority queue is empty");
            return H->Elements[0];
        }
        MinElement = H->Elements[1];
        LastElement = H->Elements[ H->Size-- ];

        for( i = 1; i*2 <= H->Size; i = child )
        {
            child = i*2;
            /* find smaller child */
            if( (child != H->Size) && (H->Elements[child+1] < H->Elements[child]) )
                child++;
            if( LastElement > H->Elements[child] )    /* percolate */
                H->Elements[i] = H->Elements[child];
            else
                break;
        }
        H->Elements[i] = LastElement;
        return MinElement;
    }

Ordered Binary Trees

A binary tree is said to be ordered if, for every node n in the tree, the values of the keys in its left subtree are smaller than the key at n, and those in the right subtree are greater than the key at n.

The operations required on an ordered binary tree are:

  Initialise a tree.
  Find the location (if any) of a given key in a tree.
  Insert a given key in a tree.
  Delete a given key from a tree.
  List the contents of a tree.

For implementations of these, see the lab exercise.

Ordered Binary Trees (continued)

Another common operation is:

  Find the largest/smallest element in a tree.
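To make the ordering property concrete, here is a minimal sketch of find, insert and smallest/largest on an ordered binary tree with `int` keys. It is not the lab solution: the `Node` type and the `bst_` function names are hypothetical, duplicates are simply ignored, and a failed allocation leaves the tree unchanged.

```c
#include <stdlib.h>

/* Hypothetical node type used only for this sketch. */
struct Node {
    int key;
    struct Node *left, *right;
};

/* Return the node holding key, or NULL if it is absent. */
struct Node *bst_find(struct Node *t, int key)
{
    while (t != NULL && t->key != key)
        t = key < t->key ? t->left : t->right;
    return t;
}

/* Insert key and return the (possibly new) root of the tree. */
struct Node *bst_insert(struct Node *t, int key)
{
    if (t == NULL) {
        struct Node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < t->key)
        t->left = bst_insert(t->left, key);
    else if (key > t->key)
        t->right = bst_insert(t->right, key);
    return t;                        /* equal keys are ignored */
}

/* The smallest key is reached by always going left. */
struct Node *bst_min(struct Node *t)
{
    while (t != NULL && t->left != NULL)
        t = t->left;
    return t;
}

/* The largest key is reached by always going right. */
struct Node *bst_max(struct Node *t)
{
    while (t != NULL && t->right != NULL)
        t = t->right;
    return t;
}
```

Each operation follows a single path from the root, so its cost is proportional to the height of the tree.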

Ordered Binary Trees (continued)

If we adopt a naïve approach to insertion and the data is pre-sorted, the cost of building a tree becomes quadratic in the size of the data. (Why?)

One solution is to attempt to keep the tree balanced. There are several possible definitions of the term balanced; one is:

  A tree is balanced if every node has left and right subtrees whose heights differ by at most 1.

This is another type invariant. An ordered binary tree with this property is called an AVL tree (after Adelson-Velsky and Landis).

Ordered Binary Trees (continued)

For example, these trees are balanced:

[Figure: examples of balanced trees]

These trees are not:

[Figure: examples of unbalanced trees]
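The balance condition can be checked mechanically. The sketch below assumes a hypothetical `Node` type and treats the empty tree as having height -1; it tests only the height condition, not the ordering of the keys.

```c
#include <stddef.h>

/* Hypothetical node type used only for this sketch. */
struct Node {
    int key;
    struct Node *left, *right;
};

/* Return the height of t if every node in t is balanced,
   or -2 as soon as some node is found to be unbalanced. */
static int checked_height(const struct Node *t)
{
    if (t == NULL)
        return -1;
    int hl = checked_height(t->left);
    if (hl == -2)
        return -2;
    int hr = checked_height(t->right);
    if (hr == -2)
        return -2;
    int diff = hl - hr;
    if (diff < -1 || diff > 1)
        return -2;                   /* heights differ by more than 1 */
    return 1 + (hl > hr ? hl : hr);
}

/* Nonzero if the subtree heights at every node differ by at most 1. */
int is_balanced(const struct Node *t)
{
    return checked_height(t) != -2;
}
```

Computing the height and the balance check in one pass keeps the test linear in the number of nodes.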

Ordered Binary Trees (continued)

The problem is that when a new node is inserted, or an old one deleted, the tree can become unbalanced.

When inserting a new node in an AVL tree, find a place to put it in the usual way. Then check the tree to be sure that it is still balanced.

If the tree has lost the AVL property, only nodes between the inserted element and the root can have had their balance destroyed. (Why?)

The balancing algorithm performs at most one constant-time operation on each of the ancestors of the unbalanced node. So restoring the AVL property to the tree is an O(log N) operation.

AVL trees: restoring the balance

For example, inserting 6 into the tree below unbalances it. The first unbalanced ancestor of the new node is the node containing 9. We can restore balance by rotating the unbalanced subtree.

          5                        5
        /   \                    /   \
       2     9                  2     7
      / \   /         ==>      / \   / \
     1   4 7                  1   4 6   9
        / /                      /
       3 6                      3

AVL trees: restoring the balance (continued)

Suppose n is the first node (going up) which is unbalanced. There are four different possibilities:

  Imbalance due to insertion in the left subtree of the left child
  Imbalance due to insertion in the right subtree of the right child
  Imbalance due to insertion in the right subtree of the left child
  Imbalance due to insertion in the left subtree of the right child

AVL trees: restoring the balance (continued)

Single right rotation, imbalance in left subtree of left child:

        B                   A
       / \                 / \
      A   T3     ==>     T1   B
     / \                     / \
   T1   T2                 T2   T3

AVL trees: restoring the balance (continued)

Single left rotation, imbalance in right subtree of right child:

      A                       B
     / \                     / \
   T1   B        ==>        A   T3
       / \                 / \
     T2   T3             T1   T2

AVL trees: restoring the balance (continued)

Now, inserting 8 into the tree below unbalances it. The first unbalanced ancestor of the new node is again the node containing 9. We need two rotations to restore balance.

          5
        /   \
       2     9
      / \   /
     1   4 7
        /   \
       3     8

AVL trees: restoring the balance (continued)

          5                     5                     5
        /   \                 /   \                 /   \
       2     9               2     9               2     8
      / \   /       ==>     / \   /       ==>     / \   / \
     1   4 7               1   4 8               1   4 7   9
        /   \                 / /                   /
       3     8               3 7                   3

AVL trees: restoring the balance (continued)

Double left-right rotation, imbalance in right subtree of left child:

        C                   C                    B
       / \                 / \                 /   \
      A   T4     ==>      B   T4     ==>      A     C
     / \                 / \                 / \   / \
   T1   B               A   T3             T1  T2 T3  T4
       / \             / \
     T2   T3         T1   T2

AVL trees: restoring the balance (continued)

Double right-left rotation, imbalance in left subtree of right child:

      A                   A                       B
     / \                 / \                    /   \
   T1   C      ==>     T1   B        ==>       A     C
       / \                 / \                / \   / \
      B   T4             T2   C             T1  T2 T3  T4
     / \                     / \
   T2   T3                 T3   T4
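The four cases can be sketched as pointer manipulations. This is an illustration under assumptions: the `Node` type and function names are hypothetical, and the height or balance bookkeeping that a full AVL implementation needs is deliberately left out. Each function takes the root of the unbalanced subtree and returns the new subtree root, which the caller must reattach to the parent.

```c
#include <stddef.h>

/* Hypothetical node type used only for this sketch. */
struct Node {
    int key;
    struct Node *left, *right;
};

/* Single right rotation: the left child A of b becomes the subtree root. */
struct Node *rotate_right(struct Node *b)
{
    struct Node *a = b->left;
    b->left = a->right;              /* T2 moves from A to B */
    a->right = b;
    return a;
}

/* Single left rotation: the right child B of a becomes the subtree root. */
struct Node *rotate_left(struct Node *a)
{
    struct Node *b = a->right;
    a->right = b->left;              /* T2 moves from B to A */
    b->left = a;
    return b;
}

/* Double left-right rotation: rotate the left child left,
   then rotate the subtree root right. */
struct Node *rotate_left_right(struct Node *c)
{
    c->left = rotate_left(c->left);
    return rotate_right(c);
}

/* Double right-left rotation: rotate the right child right,
   then rotate the subtree root left. */
struct Node *rotate_right_left(struct Node *a)
{
    a->right = rotate_right(a->right);
    return rotate_left(a);
}
```

In the earlier examples, inserting 6 corresponds to `rotate_right` at the node containing 9, and inserting 8 corresponds to `rotate_left_right` at the same node.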

AVL Trees: An example

The slides show the tree after each of the following steps (figures not reproduced here):

  Starting tree, holding the keys 1 to 7
  Insert 16
  Insert 15
  Insert 14: initial position, then after one rotation, then the final position after two rotations
  Insert 13
  Insert 12
  Insert 11
  Insert 10
  Insert 8
  Insert 9

2-3 Trees

2-3 trees are search trees that maintain their balance by relaxing the structural constraint of being a binary tree. Some nodes have 2 children and some have 3, hence the name.

2-3 Trees (continued)

A 2-3 tree is a tree with the following properties:

  The root is either a leaf or has either 2 or 3 children.
  All data is stored at the leaves.
  All leaves are at the same depth.

The first two of these constraints are structural and the third is part of the type invariant.

2-3 Trees (continued)

The type invariant is completed with the following constraint: in each node, we store the value of the smallest leaf in the tree of the second child, and the value of the smallest leaf in the tree of the third child, if there is one.

                 [7 16]
          /        |        \
      [5 -]     [8 12]     [19 -]
      /  \      / | \       /  \
     2    5    7  8  12   16    19

2-3 Trees (continued)

Insert 18:

                 [7 16]
          /        |         \
      [5 -]     [8 12]      [18 19]
      /  \      / | \       /  |  \
     2    5    7  8  12   16  18   19

2-3 Trees (continued)

Insert 10:

                    [7 16]
        /        /         \          \
    [5 -]     [8 -]      [12 -]      [18 19]
    /  \      /  \       /   \       /  |  \
   2    5    7    8    10     12   16  18   19

10 wants to go between 8 and 12, so we need to split the 3-node into two 2-nodes. But the root now appears to have 4 children, so we need to split it and make a new root.

2-3 Trees (continued)

                        [10 -]
                 /                 \
            [7 -]                   [16 -]
           /     \                 /      \
       [5 -]     [8 -]        [12 -]      [18 19]
       /  \      /  \         /   \       /  |  \
      2    5    7    8      10     12   16  18   19

B-trees

2-3 trees can be generalised to trees that have the following properties, for some natural number M:

  The root is either a leaf or has between 2 and M children.
  All non-leaf nodes (except the root) have between ⌈M/2⌉ and M children.
  All leaves have the same depth.

Such trees are called B-trees of order M.

Their main use is in database systems, where the tree is on disk rather than in memory and we want to minimise the number of disk accesses. B-trees are in general very shallow.