Lecture 32 No computer use today. Reminders: Homework 11 is due today. Project 6 is due next Friday. Questions? Friday, April 1 CS 215 Fundamentals of Programming II - Lecture 32 1
Outline
Introduction to trees
Tree terminology
Binary trees
Binary tree representations
    Array
    Linked nodes
Introduction to Trees
So far, all of the containers that we have looked at are sequential, and access is by position (either index or iterator). Many applications access data by value rather than by position. For example, a phone book entry is accessed by a person's name. We could keep the entries in a list and scan the list for the name, but as we have seen, scanning a list is a slow O(n) operation.
Introduction to Trees
Containers that are accessed by value are called associative containers. Unfortunately, we will not get to the STL associative containers, such as map, set, and the hash map (standardized as unordered_map in C++11), in this class. However, we will look at the tree data structure, which is often used to implement associative containers.
Tree Terminology
A tree is a hierarchical data structure. It places elements in nodes connected by branches that originate from a single root. For example, consider the organizational structure of the academic programs at UE.
[Figure: UE organization chart, listed here by level]
Level 0 (root node): UE
Level 1: SOBA, CEHS, CECS, CAS
Level 2: ACCT & BA, NHS, SED, EXSS, EECS, MCE, FL, MATH
Level 3: NURS, PT, EE, CoE, CS, SPAN, FREN
Tree Terminology
The root node is at the top level and has no predecessor (or parent). It can have multiple successors at the next level, called its children; children of the same node are siblings of each other. Each node contains a value and a set of 0 or more links to child nodes. The links are called edges. A node with no children is called a leaf. All non-root, non-leaf nodes are interior nodes, and each has at least one child.
Tree Terminology
Every node is the root of a subtree consisting of the node and all of its descendants (its children, the children's children, etc.). A node's parent, the parent's parent, etc. are the node's ancestors. A path is a sequence of nodes from a node N to the root R:
    N = X_0, X_1, ..., X_k = R
where k is the length of the path and each node X_{i+1} is the parent of X_i.
Tree Terminology
The length of the path from a node to the root defines the depth of the node, so the root has depth 0. The depth of a tree is the maximum depth of any node in the tree. Sometimes (depth of a tree + 1) is called the height of the tree, though the textbook says that the height of a tree is the same as the depth of the tree.
Tree Terminology
Here is the UE organization chart again, annotated with some of the tree terminology.
[Figure: annotated UE organization chart]
UE: root; parent of CECS; ancestor to all other nodes
SOBA: root of the subtree rooted at SOBA
Interior node: for example CECS, which has the parent UE and the child EECS
Children of NHS (NURS and PT): siblings to each other
CS: leaf; child of EECS; descendant of CECS
Path from CS: CS -> EECS -> CECS -> UE, so CS is at depth 3
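The definitions of path, depth of a node, and depth of a tree can be sketched in C++. This is only an illustration: the Node type with a parent pointer and the function names depth and treeDepth are assumptions made for this example, not code from the course.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type for illustration: each node stores a label and a
// pointer to its parent (nullptr for the root).
struct Node {
    std::string label;
    const Node* parent;
};

// Depth of a node: the length of the path from the node up to the root,
// counted in edges, so the root itself has depth 0.
int depth(const Node* node) {
    assert(node != nullptr);
    int edges = 0;
    while (node->parent != nullptr) {
        node = node->parent;   // step from X_i to its parent X_{i+1}
        ++edges;
    }
    return edges;
}

// Depth of a tree: the maximum depth over all of its nodes.
// Assumes the vector lists every node of a non-empty tree.
int treeDepth(const std::vector<const Node*>& nodes) {
    assert(!nodes.empty());
    int deepest = 0;
    for (const Node* n : nodes) {
        deepest = std::max(deepest, depth(n));
    }
    return deepest;
}
```

If the nodes UE, CECS, EECS, and CS are linked through their parent pointers as in the chart, depth returns 0 for UE and 3 for CS, matching the path CS -> EECS -> CECS -> UE.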
Binary Trees
Organization charts are one of a few applications that need trees where the nodes can have more than 2 children. A file system with its directories is another. Most applications only need a binary tree, a tree where each node has at most 2 children.
Binary Trees
Formally, a binary tree is a finite set of nodes. The set of nodes may be empty, in which case it is called an empty tree. If the set is not empty, it obeys the following rules:
1. There is one special node called the root.
2. Each node may be associated with up to two other different nodes, called its left child and its right child.
3. Each node, except the root, has exactly one parent; the root has no parent.
4. Following parent links from any node leads back to the root.
Binary Trees
A binary tree is said to be full if every leaf has the same depth and every non-leaf node has two children. A binary tree is said to be complete if every level except the deepest is completely filled and the nodes in the deepest level are as far left as possible.
Binary Trees
Here are some binary trees:
[Figure: three example binary trees: a full binary tree, a complete binary tree, and a binary tree that is neither full nor complete]
Array Representation
When a binary tree is complete, we can use a simple array representation (a fixed-size array if the tree has a maximum size, or a dynamic array if it can grow indefinitely). Suppose we number the nodes starting with 0 at the root, going level by level from top to bottom and from left to right within each level. Call this number i. Then we store each node's value in the array at index i.
Array Representation
Here is a picture, with each node's number in parentheses:
Level 0: 'A' (0)
Level 1: 'L' (1), 'G' (2)
Level 2: 'O' (3), 'R' (4), 'I' (5), 'T' (6)
Level 3: 'H' (7), 'M' (8), 'S' (9)

array:
index  [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
value  'A' 'L' 'G' 'O' 'R' 'I' 'T' 'H' 'M' 'S'
Array Representation
This representation is convenient for several reasons:
The root is always at array[0].
Suppose that the data for a node appears in array[i]. The locations of its parent and child nodes can be computed:
For a non-root node, the parent is always located at array[(i-1)/2] (using integer division).
The children (if they exist) are located at array[2i+1] (left child) and array[2i+2] (right child).
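The index formulas above can be written as small helper functions. This is a minimal sketch, assuming the tree is stored in a std::vector<char>; the function names are chosen for this example and are not from the course materials.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the parent of the node stored at index i (i must not be the root).
std::size_t parentIndex(std::size_t i) {
    assert(i > 0);            // the root at index 0 has no parent
    return (i - 1) / 2;       // integer division discards the remainder
}

// Indices where the children of the node at index i would be stored.
std::size_t leftChildIndex(std::size_t i)  { return 2 * i + 1; }
std::size_t rightChildIndex(std::size_t i) { return 2 * i + 2; }

// In a complete tree, a child exists only if its index is inside the array.
bool hasLeftChild(const std::vector<char>& tree, std::size_t i) {
    return leftChildIndex(i) < tree.size();
}
bool hasRightChild(const std::vector<char>& tree, std::size_t i) {
    return rightChildIndex(i) < tree.size();
}
```

With the ten-element example array, the parent of 'H' (index 7) is at index 3 ('O'), and 'R' (index 4) has a left child at index 9 ('S') but no right child, because index 10 is past the end of the array.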
Array Representation
The main problem with the array representation is that if the tree is not complete, there must be a way to indicate which elements of the array actually hold nodes. The representation can also be very inefficient if the tree is very deep with few nodes at each level: the array would have to be very large, but would be mostly empty. We can solve these problems by implementing nodes and edges directly.
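To make both problems concrete, here is one possible way to mark which array elements hold nodes. It is a sketch under assumptions: the Slot struct with its used flag and the helper functions are hypothetical, and other marking schemes would work equally well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One array slot. The 'used' flag records whether a node exists at this
// index; the struct and its field names are hypothetical.
struct Slot {
    bool used = false;
    char value = '\0';
};

// Store a value at level-order index i, growing the array with unused
// slots as needed.
void placeAt(std::vector<Slot>& slots, std::size_t i, char value) {
    if (i >= slots.size()) slots.resize(i + 1);
    slots[i].used = true;
    slots[i].value = value;
}

// Read the value at index i; the slot must hold a node.
char valueAt(const std::vector<Slot>& slots, std::size_t i) {
    assert(i < slots.size() && slots[i].used);
    return slots[i].value;
}

// Count the slots that actually hold a node.
std::size_t nodeCount(const std::vector<Slot>& slots) {
    std::size_t count = 0;
    for (const Slot& s : slots) {
        if (s.used) ++count;
    }
    return count;
}
```

Consider a chain of four nodes in which each node is the right child of the previous one. By the 2i+2 rule the nodes land at indices 0, 2, 6, and 14, so the array needs 15 slots to hold only 4 values.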
Linked Node Representation
We can represent a tree node using a class that has an attribute to hold the node value (data) and two tree node pointer attributes (leftchild and rightchild). The pointer attributes link a node to its child nodes. An entire tree is represented as a pointer to the root node. The empty tree is represented using the null pointer. A partial picture of the previous example tree is shown on the next slide.
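Linked Node Representation
[Figure: partial linked-node picture of the example tree. The pointer root points to the node holding 'A'; each node has the attributes data, leftchild, and rightchild. 'A' links to 'L' (leftchild) and 'G' (rightchild); 'L' links to 'O' and 'R'; 'G' links to 'I' and 'T'.]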
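A node class along these lines might look like the following. The attribute names data, leftchild, and rightchild come from the slides; the class name TreeNode, the char value type, and the helper functions countNodes and destroyTree are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of a linked tree node holding a char value. A new node starts as a
// leaf: both child pointers are null.
class TreeNode {
public:
    explicit TreeNode(char value)
        : data(value), leftchild(nullptr), rightchild(nullptr) {}

    char data;             // the node value
    TreeNode* leftchild;   // link to the left child, or nullptr
    TreeNode* rightchild;  // link to the right child, or nullptr
};

// Count the nodes in the tree whose root is given. The empty tree is the
// null pointer and has zero nodes.
std::size_t countNodes(const TreeNode* root) {
    if (root == nullptr) return 0;
    return 1 + countNodes(root->leftchild) + countNodes(root->rightchild);
}

// Free every node in the tree: children first, then the node itself.
void destroyTree(TreeNode* root) {
    if (root == nullptr) return;
    destroyTree(root->leftchild);
    destroyTree(root->rightchild);
    delete root;
}
```

Building the partial tree from the picture means creating the 'A' node with new, attaching 'L' and 'G' as its leftchild and rightchild, and then attaching 'O', 'R', 'I', and 'T' below them; countNodes on that root returns 7. Raw pointers are used here to mirror the slide, and in this sketch the caller is responsible for calling destroyTree.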