Steven J. Zeil
June 25, 2013

Contents
1 Iterating over Trees
    1.1 begin()
    1.2 operator++
2 Iterators using Parent Pointers
    2.1 Basic operations
    2.2 begin() and end()
    2.3 operator++
        2.3.1 Implementing operator++
3 Threads

CS361
The recursive traversal algorithms work well for implementing tree-based ADT member functions, but if we are trying to hide the trees inside some ADT (e.g., std::set), we may need to provide iterators for walking through the contents of the tree. Iterators for tree-based data structures can be more complicated than those for linear structures.
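To make the goal concrete, here is a small sketch of what client code looks like when the tree is hidden behind iterators. It uses std::set from the standard library; the function name contentsInOrder is made up for this illustration. Nothing in it mentions nodes, children, or recursion.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Copy the contents of a set into a vector by walking it with iterators.
// The caller never sees a tree node, only begin(), end(), ++ and *.
std::vector<int> contentsInOrder(const std::set<int>& s) {
    std::vector<int> out;
    for (std::set<int>::const_iterator it = s.begin(); it != s.end(); ++it) {
        // A set iterator visits its elements in ascending order.
        assert(out.empty() || out.back() < *it);
        out.push_back(*it);
    }
    return out;
}
```

Calling contentsInOrder on a set that received 30, 20, 10 in that order gives back the values in sorted order, which is the same order an in-order traversal of a binary search tree would produce.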
For arrays (and vectors and deques and other array-like structures) and linked lists, a single pointer can implement an iterator:

[Figure: a linked list (Adams, Baker, Chen) and an array (1 3 4 5 7 12 19 23), each with a pointer named current marking one element.]

Given the current position, it is easy to move forward to the next element. For anything but a singly-linked list, we can also easily move backwards.

1 Iterating over Trees
But look at this tree, and suppose that you were implementing tree iterators as a single pointer.

[Figure: a binary search tree holding 10, 20, 30, 40, 50, 60, and 70, with 30 at the root.]

Let's see if we can "think" our way through the process of traversing this tree, one step at a time, without needing to keep a whole stack of unfinished recursive calls around.

It's not immediately obvious what our data structure for storing our "current position" (i.e., an iterator) will be. We might suspect that a pointer to a tree node will be part or whole of that data structure, if only because that worked for us with iterators over linked lists. With that in mind,...

Question: How would you implement begin()?
1.1 begin()

We find the begin() position by starting from the root and working our way down, always taking left children, until we come to a node with no left child.

That wasn't so hard. Careful, now.

Question: How would you implement end()?
Probably by returning a null pointer.

It's tempting to guess that you could do much the same as for begin(), this time seeking out the rightmost node. But that would leave you pointing to the last node in the tree, and end(), for any container, is supposed to denote the position after the last element in the container.

1.2 operator++
Now it gets trickier. Suppose you are still trying to implement iterators using a single pointer, and you have one such pointer named current, as shown in the figure.

[Figure: the same tree, with current pointing to node 40.]

Question: How would you implement ++current?
You can't, not with just a pointer to the node and all the nodes pointing only to their children. The only place you can go within this tree is down, and there is no down from our current position.

In a binary tree, to do operator++, we need to know not only where we are, but also how we got here.

One way to do that is to implement the iterator as a stack of pointers containing the path to the current node. In essence, we would use the stack to simulate the activation stack during a recursive traversal, but that's pretty clumsy. Iterators tend to get assigned (copied) a lot, and we'd really like that to be an O(1) operation. Having to copy an entire stack of pointers just isn't very attractive.
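For comparison, here is a rough sketch of what such a stack-based iterator might look like. It is not the design these notes adopt, and every name in it (Node, StackIterator, atEnd, storedPointers) is invented for the illustration. This variant keeps only the part of the path that is still waiting to be visited: the current node on top, with its not-yet-visited ancestors beneath it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type with child pointers only (no parent pointer).
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Illustrative in-order iterator that simulates the recursion stack.
class StackIterator {
public:
    // Start at the first node in order: run left from the root,
    // remembering every node passed on the way down.
    explicit StackIterator(Node* root) { pushLeftSpine(root); }

    // An empty stack plays the role of end().
    bool atEnd() const { return path.empty(); }

    int& operator*() const {
        assert(!path.empty());
        return path.back()->value;
    }

    StackIterator& operator++() {
        assert(!path.empty());
        Node* n = path.back();
        path.pop_back();           // n is finished
        pushLeftSpine(n->right);   // next is the leftmost node of n's right subtree,
        return *this;              // or else the ancestor now on top of the stack
    }

    // Number of pointers a copy of this iterator has to duplicate.
    std::size_t storedPointers() const { return path.size(); }

private:
    std::vector<Node*> path;

    void pushLeftSpine(Node* n) {
        for (; n != nullptr; n = n->left) path.push_back(n);
    }
};
```

The sketch works, but copying a StackIterator copies the whole vector, so assignment costs time proportional to the depth of the current position rather than O(1). That is the cost the parent-pointer design avoids.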
2 Iterators using Parent Pointers

We can make the task of creating tree iterators much easier if we redesign the tree nodes to add pointers from each node to its parent.

    template <typename T>
    class stnode
    {
    public:
        // stnode is used to implement the binary search tree class
        // making the data public simplifies building the class functions

        T nodevalue;
        // node data

        stnode<T> *left, *right, *parent;
        // child pointers and pointer to the node's parent

        // constructor
        stnode (const T& item, stnode<T> *lptr = NULL,
                stnode<T> *rptr = NULL, stnode<T> *pptr = NULL):
            nodevalue(item), left(lptr), right(rptr), parent(pptr)
        {}
    };
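The node only stores the parent pointer; something still has to set it whenever a node is linked into the tree. As a rough sketch of that bookkeeping, the block below inserts into a binary search tree while keeping the parent pointers consistent. The helpers insertNode, checkParentLinks, and destroyTree are hypothetical and are not part of the stree class in these notes, and the node type is repeated as a struct only so that the block compiles on its own.

```cpp
#include <cassert>

// Same layout as the stnode class above, repeated for self-containment.
template <typename T>
struct stnode {
    T nodevalue;
    stnode<T>* left;
    stnode<T>* right;
    stnode<T>* parent;
    stnode(const T& item, stnode<T>* lptr = nullptr,
           stnode<T>* rptr = nullptr, stnode<T>* pptr = nullptr)
        : nodevalue(item), left(lptr), right(rptr), parent(pptr) {}
};

// Hypothetical helper: insert item into the search tree rooted at root.
// Returns the node holding item (the existing one if item is a duplicate).
template <typename T>
stnode<T>* insertNode(stnode<T>*& root, const T& item) {
    stnode<T>* parent = nullptr;
    stnode<T>* curr = root;
    while (curr != nullptr) {            // ordinary BST descent
        parent = curr;
        if (item < curr->nodevalue)      curr = curr->left;
        else if (curr->nodevalue < item) curr = curr->right;
        else                             return curr;   // already present
    }
    // The new leaf records its parent at construction time.
    stnode<T>* fresh = new stnode<T>(item, nullptr, nullptr, parent);
    if (parent == nullptr)               root = fresh;  // tree was empty
    else if (item < parent->nodevalue)   parent->left = fresh;
    else                                 parent->right = fresh;
    return fresh;
}

// Hypothetical helper: check that every child points back to its parent.
template <typename T>
void checkParentLinks(const stnode<T>* n) {
    if (n == nullptr) return;
    if (n->left != nullptr)  assert(n->left->parent == n);
    if (n->right != nullptr) assert(n->right->parent == n);
    checkParentLinks(n->left);
    checkParentLinks(n->right);
}

// Hypothetical helper: release every node in the tree.
template <typename T>
void destroyTree(stnode<T>* n) {
    if (n == nullptr) return;
    destroyTree(n->left);
    destroyTree(n->right);
    delete n;
}
```

Any operation that relinks nodes (erase, rotations in a balanced tree) would need the same care: whenever a child pointer changes, the child's parent pointer has to change with it.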
These nodes are then used to implement a tree class, which keeps track of the root of our tree in a data member.

    template <typename T>
    class stree
    {
    public:
        typedef stree_iterator<T> iterator;
        typedef stree_const_iterator<T> const_iterator;

        stree();
        // constructor. initialize root to NULL and size to 0.

        iterator find(const T& item);
        // search for item. if found, return an iterator pointing
        // at it in the tree; otherwise, return end()

        const_iterator find(const T& item) const;
        // constant version.

        iterator begin();
        // return an iterator pointing to the first item
        // inorder

        const_iterator begin() const;
        // constant version

        iterator end();
        // return an iterator pointing just past the end of
        // the tree data

        const_iterator end() const;
        // constant version

    private:
        stnode<T> *root;
        // ...

        stnode<T> *findnode(const T& item) const;
        // search for item in the tree. if it is in the tree,
        // return a pointer to its node; otherwise, return NULL.
        // used by find() and erase()

        friend class stree_iterator<T>;
        friend class stree_const_iterator<T>;
        // allow the iterator classes to access the private section
        // of stree
    };

2.1 Basic operations

Here's the basic declaration for an iterator to do in-order traversals.

    template <typename T>
    class stree_iterator
    {
        friend class stree<T>;
        friend class stree_const_iterator<T>;

    public:
        // constructor
        stree_iterator() {}

        // comparison operators. just compare node pointers
        bool operator== (const stree_iterator& rhs) const;
        bool operator!= (const stree_iterator& rhs) const;

        // dereference operator. return a reference to
        // the value pointed to by nodeptr
        T& operator* () const;

        // preincrement. move forward to next larger value
        stree_iterator& operator++ ();

        // postincrement
        stree_iterator operator++ (int);

        // predecrement. move backward to largest value < current value
        stree_iterator& operator-- ();

        // postdecrement
        stree_iterator operator-- (int);

    private:
        // nodeptr is the current location in the tree. we can move
        // freely about the tree using left, right, and parent.
        // tree is the address of the stree object associated
        // with this iterator. it is used only to access the
        // root pointer, which is needed for ++ and --
        // when the iterator value is end()
        stnode<T> *nodeptr;
        stree<T> *tree;

        // used to construct an iterator return value from
        // an stnode pointer
        stree_iterator(stnode<T> *p, stree<T> *t)
            : nodeptr(p), tree(t)
        {}
    };

You will note that the public interface is pretty much a standard iterator. The private section declares a pair of pointers. One points to the tree that we are walking through. The other points to the node denoting our current position within that tree.

2.2 begin() and end()

As noted earlier, begin() works by finding the leftmost node in the tree, and end() uses a null pointer.
    template <typename T>
    class stree
    {
    public:
        // ...
        iterator begin();
        // return an iterator pointing to the first item
        // inorder

        iterator end();
        // return an iterator pointing just past the end of
        // the tree data
        // ...

    private:
        stnode<T> *root;    // pointer to tree root
        int treesize;       // number of elements in the tree
        // ...
    };

    template <typename T>
    typename stree<T>::iterator stree<T>::begin()
    {
        stnode<T> *curr = root;

        // if the tree is not empty, the first node
        // inorder is the farthest node left from root
        if (curr != nullptr)
            while (curr->left != nullptr)
                curr = curr->left;

        // build return value using private constructor
        return iterator(curr, this);
    }

    template <typename T>
    typename stree<T>::iterator stree<T>::end()
    {
        // end indicated by an iterator with nullptr stnode pointer
        return iterator(nullptr, this);
    }

2.3 operator++
Before trying to write the code for this iterator's operator++, let's try to figure out just what it should do.

[Figure: a binary tree with root A and nodes B, C, D, E, F, G.]

Question: Suppose that we are currently at node E. What is the in-order successor (the node that comes next during an in-order traversal) of E?
G is the in-order successor of E.

(If you answered F, remember that in an in-order traversal, we visit a node only after visiting all of its left descendants and before visiting any of its right descendants. Since we're at E, we must have already visited F.)

The previous example suggests that a node's in-order successor tends to be among its right descendants. Let's explore that idea further.

Question: Suppose that we are currently at node A. What is the in-order successor (the node that comes next during an in-order traversal) of A?
F is the in-order successor of A.

If we are at A during an in-order traversal, we have already visited all of A's left descendants. So the answer has to be C or one of its descendants. It's tempting to pick C because it's only one step away from A. But, remember, during an in-order traversal, we visit a node only after visiting all of its left descendants and before visiting any of its right descendants. We have not yet visited C's left descendants. So we have to run down from C to the left as far as we can go.

This suggests that, if a node has any right descendants, we should

  - Take a step down to the right, then
  - Run as far down to the left as we can.

You can see how this would take us from A to F.

But that "step to the right, then run left" procedure raises a new question. What happens if we are at a node with no right descendants?
Question: Suppose that we are currently at node C. What is the in-order successor of C?
C does not have an in-order successor. C is actually the final node in an in-order traversal. After C is only end().

OK, that's an interesting special case, but it doesn't make clear what should happen in the more general case where we have no right child.

Question: What is the in-order successor of F?
E is the in-order successor of F.

So, when we have no right child, we may need to move back up in the tree.

Question: What is the in-order successor of G?
C is the in-order successor of G.

Why did we move up two steps in the tree this time, when from F we only moved up one step? The answer lies in whether we moved back up over a left-child edge or a right-child edge.

If we move up over a right-child edge, we're returning to a node that has already had all of its descendants, left and right, visited. So we must have already visited this node as well, otherwise we would never have made it into its right descendants.

If we move up over a left-child edge, we're returning to a node that has already had all of its left descendants visited but none of its right descendants. That's the definition of when we want to visit a node during an in-order traversal, so it's time to visit this node.

So, if a node has no right child, we move up in the tree (following the parent pointers) until we move back over a left edge. Then we stop.

Notice that, applying this procedure to C, we would move up to A (right edge), then try to move up again to A's parent. But since A is the tree root, its parent pointer will be null, which is our signal that C has no in-order successor.

2.3.1 Implementing operator++

To summarize,

  - If the current node has a non-null right child,
      - Take a step down to the right
      - Then run down to the left as far as possible
  - If the current node has a null right child, move up the tree until we have moved over a left child link

With that in mind, the operator++ code should be easily :-) understood.

    template <typename T>
    stree_iterator<T>& stree_iterator<T>::operator++ ()
    {
        stnode<T> *p;

        if (nodeptr == nullptr)
        {
            // ++ from end(). get the root of the tree
            nodeptr = tree->root;

            // error! ++ requested for an empty tree
            if (nodeptr == nullptr)
                throw underflowerror("stree iterator operator++ (): tree empty");

            // move to the smallest value in the tree,
            // which is the first node inorder
            while (nodeptr->left != nullptr) {
                nodeptr = nodeptr->left;
            }
        }
        else if (nodeptr->right != nullptr)
        {
            // successor is the furthest left node of
            // right subtree
            nodeptr = nodeptr->right;

            while (nodeptr->left != nullptr) {
                nodeptr = nodeptr->left;
            }
        }
        else
        {
            // have already processed the left subtree, and
            // there is no right subtree. move up the tree,
            // looking for a parent for which nodeptr is a left child,
            // stopping if the parent becomes NULL. a non-NULL parent
            // is the successor. if parent is NULL, the original node
            // was the last node inorder, and its successor
            // is the end of the list
            p = nodeptr->parent;

            while (p != nullptr && nodeptr == p->right)
            {
                nodeptr = p;
                p = p->parent;
            }

            // if we were previously at the right-most node in
            // the tree, nodeptr = nullptr, and the iterator specifies
            // the end of the list
            nodeptr = p;
        }

        return *this;
    }

A similar process of analysis would eventually lead us to an implementation of operator--.
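As a hint of where that analysis ends up, here is one possible mirror image of the operator++ logic, written as a free function rather than as the actual member operator--. The names Node, attachLeft, attachRight, and predecessor are invented for this sketch, and a null pointer stands in for end(), as in the iterator above. Every "left" in the successor rules becomes a "right" and vice versa.

```cpp
#include <cassert>

// Hypothetical stand-in for stnode<T>, reduced to what the sketch needs.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int v) : value(v) {}
};

// Link child under parent and set the back pointer at the same time.
inline void attachLeft(Node& parent, Node& child) {
    parent.left = &child;
    child.parent = &parent;
}
inline void attachRight(Node& parent, Node& child) {
    parent.right = &child;
    child.parent = &parent;
}

// In-order predecessor of n in the tree rooted at root.
// n == nullptr means "end()". The result is nullptr when n is the
// first node in order, since there is nothing before begin().
inline Node* predecessor(Node* n, Node* root) {
    if (n == nullptr) {
        // -- from end(): the last node in order is the rightmost node.
        // (This sketch asserts on an empty tree instead of throwing.)
        assert(root != nullptr);
        n = root;
        while (n->right != nullptr) n = n->right;
        return n;
    }
    if (n->left != nullptr) {
        // Step down to the left, then run right as far as possible.
        n = n->left;
        while (n->right != nullptr) n = n->right;
        return n;
    }
    // No left child: climb until we move up over a right-child edge.
    Node* p = n->parent;
    while (p != nullptr && n == p->left) {
        n = p;
        p = p->parent;
    }
    return p;
}
```

Note that the root pointer is needed only for the end() case, which is the same reason the iterator class carries a pointer to its tree.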
3 Threads

Another approach to supporting iteration is threading:

  - Threaded trees replace all null right pointers by a thread (pointer) to that node's in-order successor.
      - We need a boolean flag to tell whether the right pointer is a child or a thread.
  - We can also thread the null left pointers to allow operator--.

[Figure: a threaded version of the example tree (10, 20, 30, 40, 50, 60, 70), with a pointer named current.]
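A minimal sketch of the idea follows, assuming right threads only. The names TNode, rightIsThread, addThreads, and successor are invented for this illustration. For brevity it threads an already-built tree in one in-order pass, whereas a real threaded tree would more likely maintain the threads during insertion and removal.

```cpp
#include <cassert>

// Hypothetical threaded node. When rightIsThread is true, right points
// to the in-order successor instead of to a right child.
struct TNode {
    int value;
    TNode* left = nullptr;
    TNode* right = nullptr;
    bool rightIsThread = false;
    explicit TNode(int v) : value(v) {}
};

// Thread an ordinary tree with one in-order pass. prev tracks the node
// visited just before n; if prev had a null right pointer, that pointer
// becomes a thread to n. Call with prev initialized to nullptr.
inline void addThreads(TNode* n, TNode*& prev) {
    if (n == nullptr) return;
    addThreads(n->left, prev);
    if (prev != nullptr && prev->right == nullptr) {
        prev->right = n;
        prev->rightIsThread = true;
    }
    prev = n;
    // A right pointer that is already a thread is not a subtree.
    if (!n->rightIsThread) addThreads(n->right, prev);
}

// In-order successor with no parent pointers and no stack.
// Returns nullptr after the last node, which stands in for end().
inline TNode* successor(TNode* n) {
    assert(n != nullptr);
    if (n->rightIsThread) return n->right;   // follow the thread directly
    TNode* s = n->right;                      // real child, or null at the last node
    while (s != nullptr && s->left != nullptr) s = s->left;
    return s;
}
```

With threads in place, the "move up the tree" case of operator++ disappears: a node with no right child reaches its successor in a single step, at the price of one flag per node and some extra work whenever the tree changes.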