Implementing the Iterator Pattern on Search Tree’s without Parent Pointers.

Two Posts on Iterators in a row? I know, I know. But at the risk of sounding like some kind of evangelist: If you’re going to insist on implementing your own containers, you must implement iterators for them. I won’t repeat my entire diatribe on the why of the iterator pattern, If you’re reading this post it’s because you’ve already decided on adding iterators to your container.

The academic literature is fairly divided when it comes to the use of parent pointers in search tree structures. Some resources, such as CLRS[1] include them implicitly in all of their examples. Others, such as Sedgewick[2] disavow their use completely.

There are pro’s and con’s to both sides of the argument. The big one for the anti-parent pointer side of the argument is that they waste space by needing an extra pointer in every node. However, including a parent pointer in your node structures means that iterators for that tree can implemented in O(1) space and O(1) complexity. To implement iterators at all – let alone efficiently in a tree without parent pointers requires the use of so called “Fat Iterators” and are the subject of this post.

Fat Iterators

Iterators work by breaking up the act of traversing a container into it’s individual steps, so that the traversal can then be performed step-wise. Tree based structures are by their very nature non-linear. This means that in order for an iterator to know where to go next, for a tree it also needs to know where it came from. In a tree with parent pointers, we can simply follow the link back to the nodes parent.

In a tree with out parent pointers, we need to cache the path that was followed to get to the current node. So-called “Fat” iterators work by saving this path in a stack. If the current node is a leaf node, we pop the first node off the stack and check if it has an unexplored child, if it does, we move down that branch, saving the new path on to the stack. In this way, the iterator can explore the tree in the same step-wise manner as a tree containing a parent pointer.

A quick warm up: Stack-based in-order tree traversal

Search Tree’s maintain a total ordering which can be realized by performing an in-order traversal of a BST. The “natural” form of stack based traversal for binary trees performs a pre-order traversal, but we want to retrieve the items in sorted order. In-order traversal works by first visiting the left child, then process the current node, and finally visiting the right child. When written recursively, it is a very simple algorithm.

 void inorderTraversal(bstnode* node) {
if (node != nullptr) {
inorderTraversal(node->left);
processNode(node);
inorderTraversal(node->right);
}
}

To do the same iteratively, we need to use an explicit stack in place of the call stack utilized by the recursive version. We begin by initializing the stack with the path from the root to the left most node.

 void inorderIter(bstnode* node) {
    stack<bstnode*> sf;
    bstnode* curr = node;
    while (curr) {
        sf.push(curr);
        curr = curr->left;
    }

With our stack initialized, we loop until the stack is empty. During each iteration of the loop the top node is removed from the stack and processed. Once processed, we move to the current nodes right child, and once again perform a traversal to the left most child of the subtree rooted at current node, saving the path to the stack. Once this loop is complete, all nodes will have been processed in an in-order ordering. (Say that ten times fast…)

    while (!sf.empty()) {
        curr = sf.top();
        sf.pop();
        processNode(curr);
        curr = curr->right;
        while (curr) {
            sf.push(curr);
            curr = curr->left;
        }
    }
}

int main() {
    string sed = "abstiteratorexample";
    bstnode* root = nullptr;
    for (char c : sed) {
        root = put(root, c);
    }
    inorderRec(root);
    inorderIter(root);
}

max@MaxGorenLaptop:~/cppEx$ ./a.out
a a a b e e e i l m o p r r s t t t x
a a a b e e e i l m o p r r s t t t x
max@MaxGorenLaptop:~/cppEx$

From Traversal to Iteration

Now that we see how we can traverse the tree without recursion, it’s time to turn this method of traversal into an iterator object. An iterator needs to support at the very least, the following operations:

  • Initialization – the iterator should be directly ready for use upon being instantiated.
  • Data access – we need to be able to retrieve the data held at the position in the tree represented by the iterator.
  • Traversal – the iterator must have a way to move to the next position in the container.
  • Existence – A method for determining if the iteration has completed or should continue.

I’ve covered enough articles on iterators, that if you want to implement an STL compatible operator, you can easily adapt the following interface. To facilitate the required operations, our Iterator will utilize the following interface:

template <class K>
class Iterator {
private:
stack<bstnode<K>*> sf;
public:
Iterator(bstnode<K>* root);
bool hasNext();
void next();
K get() const;
};

Looking at the stack based traversal code we just covered, its not hard to see which pieces of code will go where in order to implement the required methods. To initialize our iterator, we pass the root of the tree to be traversed to the constructor, which initializes our stack with a path to the left more child.

Iterator<K>::Iterator(bstnode<K>* root) {
bstnode<K>* t = root;
while (t) {
sf.push(t);
t = t->left;
}
}

Many of the options now translate directly to stack operations. To check if we have complete iteration of the tree, we check if there are any nodes left on the stack to explore:

bool Iterator<K>::hasNext() {
return !sf.empty();
}

Similarly, to retrieve the value of the current iterator, we simply retrieve the element from the node held on the top of the stack. In an ordered container, you want immutable access to the key-value, as changing it could invalidate the search property. If you’re tree hold’s Key-Value pairs, having the value be mutable is fine however.

template <class K>
K Iterator<K>::get() const {
    return sf.top()->key;
}

To move to the next position in the tree, we want the in-order successor of the current node, which we get by retrieving the left-most child of the current-nodes right child. Being that the current node is the top most item currently on the stack, we can write this method as follows:

template <class K>
void Iterator<K>::next() {
    bstnode<K>* curr = sf.top()->right;
    sf.pop();
    while (curr) {
        sf.push(curr);
        curr = curr->left;
    }
}

For reasons I will never understand, the STL stack’s pop() operation is returns void, or else we could just write curr = sf.pop()->right instead of the ugly top()/pop() idiom you get with the STL.

Anyway, now that our Iterator is complete, if you wanted to make it STL compatible, you would just need to implement the appropriate operators. At the least, you’d need operator*, operator++, operator++(int), operator==, and operator!=. I will leave those as “an exercise to the reader”.

With our iterator complete we can test it using our previous example from above:

int main() {
  string sed = "abstiteratorexample";
  bstnode<char>* root = nullptr;
  for (char c : sed) {
        root = put(root, c);
  }
  Iterator<char> it = Iterator<char>(root);
while (it.hasNext()) {
        cout<<it.get()<<" ";
it.next();
  }
  cout<<endl;
}
max@MaxGorenLaptop:~/$ ./a.out
a a a b e e e i l m o p r r s t t t x
max@MaxGorenLaptop:~/$

That’s all i’ve got for today. Until next time, Happy Hacking!