Section 2 - Implementing a General Tree

9A.2.1 General Tree Implementation

Before jumping in, let's imagine an operation like removeNode(). Removing TreeNodes -- I mean really removing them, not just marking them as deleted -- is one of the trickiest operations to implement in a tree ADT. What would this look like? First, it would be a member function - an instance method -- of our tree template. This means we'd call it using a tree object:

myTree.removeNode( ... );

What goes into the argument list? One idea is that we would pass it a TreeNode*, that is a pointer to our basic node type. Imagine we somehow got our hands on a pointer to nodeK in the above diagram. Call it nodeKPtr. If we wanted to remove nodeK from the tree, the invocation might look like this:

myTree.removeNode( nodeKPtr );

Imagine, in the implementation of removeNode(), trying to move the arrows around in the above diagram to make this happen. Somewhere along the line, you would see that you need to point "around" node K, so that its parent, node F, must be modified to make its firstChild pointer point to node L. The problem, of course, is nodeKPtr, cannot be used to access node F. The best we can do with nodeKPtr is to point either to the right, to node K's sibling, or down to node K's first child (if either or both exist). But that doesn't help us fix node F.

We solve the problem by adding a third pointer to our proposed TreeNode class so that it now contained data plus three pointers, sib, firstChild and prev. prev would point back to the node pointing to it in the implementation diagram:

TreeNode E: prev points to D
TreeNode K: prev points to F
TreeNode J: prev points to I

Note that this prev pointer might point to a sibling or it might point to a parent. In the case of the tree root, it might even be NULL.

We are ready to take on the task of writing this template class, which we'll call FHtree. We first define the FHtreeNode class.

I will make it a separate data template, defined along side the FHtree. The form, within the include file, FHtree.h will be like so:

template <class Object>
class FHtreeNode
{
  friend class FHtree<Object>;
  ...
};

template <class Object>
class FHtree
{
  ...
private:
  FHtreeNode<Object> *mRoot;
};

The effect is notationally simple. The client will define the FHtree objects and any needed FHtreeNode objects separately:

#include "FHtree.h"
#include "iTunes.h"

int main()
{
  FHtree<iTunesEntry> tunesTree;
  FHtreeNode<iTunesEntry> *tn;
  ...

I'll make a few comments before we start writing code.

We will not use iterators. Because it is unclear what, exactly, an iterator would do -- move to the next sib? or move to a child? -- we would have to spend too much time defining different sorts of iterators and iteration operations. It just would not pay, alogorithmically. Instead, we use simple pointers to the individual FHtreeNodes when we have to point to a node in the tree.
You might think that we could somehow build the tree structure on top of STD lists. No such luck. Again, the work that would be required to do even this, is unwieldy. These general trees are just too amorphous to yield a neat solution based on a plain doubly linked list. You are free to try on your own.
As you read these function implementations, note how recursion is used. Recursion is a very common technique in trees because here it does not result in significant overhead.
It's not necessary to follow every detail of every algorithm. However, pick a few and do make an effort to fully understand those. See if you can modify it to work better, faster, smaller. It's good for your programming skills.

9A.2.2 FHtreeNode

The FHtreeNode is very short. Here is the main part, with a couple items removed to make it clearer for the lesson:

template <class Object>
class FHtreeNode
{
  friend class FHtree<Object>;

private:
  FHtreeNode *firstChild, *sib, *prev;
  Object data;
  const FHtreeNode *myRoot;  // needed to test for certain error

public:
  // constructor that takes an Object and some node pointers
  FHtreeNode( ... )
  { }

  // ... a couple others

};

We declare FHtree to be a friend. This enables FHtree algorithms to get to FHtreeNode data without restriction.

As advertized, we have three pointers and a data member. However, there is an extra pointer that I didn't mention: myRoot. Can you guess what this one does? We will use it to guarantee that we are not trying to use a tree A node pointer in a tree B operation:

myTree.removeNode( nodeKPtr );

We have to protect ourselves from an evil client passing in a nodeKPtr which points to a node that does not belong to myTree. The myRoot pointer will help with that, as every FHtreeNode will point to the root of the tree to which it belongs. We can then test the FHtreeNode argument passed in to this function, comparing its myRoot pointer to the mRoot member of the calling object (we define mRoot in FHtree next).

9A.2.3 FHtree Prototype

We complete this section by looking at a representative part of the FHtree class prototype. It is not too scary looking.

template <class Object>
class FHtree
{
private:
  int mSize;
  FHtreeNode<Object>  *mRoot;

public:
  FHtree() { mSize = 0; mRoot = NULL; }
  ~FHtree() { clear(); }

  // several items not shown.
  FHtreeNode<Object> *addChild( FHtreeNode<Object>  *treeNode, const Object &x );

  FHtreeNode<Object> *find(const Object &x) { return find(mRoot, x); }

  FHtreeNode<Object> *find(
  FHtreeNode<Object>  *root, const Object &x, int level = 0);

};

Private Members and Constructor

You can see the private members, mSize and mRoot, as expected. The constructor sets mRoot to NULL, which tells you that we are not using any "header nodes" to our tree. Header nodes don't confer the same advantage to tree algorithms, as they did in linear list algorithms, so there is no reason to have them here. The destructor calls a method, clear(), whose prototype is not shown, but it basically deletes all the nodes in the tree

addChild()

Here we are taking an Object &, x, which is the intrinsic type-parameter of the instantiated (or perhaps we should say "specialized" because this is a template not a class) FHtree. The method will wrap x in an FHtreeNode object (allocated in the function) and link-it-in the tree as a child of the tree node which is also passed in as a parameter. In other words, this method can only be used after the client has identified a particular tree node in the tree that represents where we wish to place our Object -- this treeNode will be the parent of the new object.

The way this will work in practice is that we will use a method find(), described next, that will search for an Object in the tree and return a pointer to the tree node that contains it.

The two find()s

This next part is very important so you must understand it thoroughly. In many recursive implementations, trees being a major example, the client will often call a public member function . However, that public function will go on to call a nearly identical, overloaded, version of itself that takes extra parameters. These extra parameters control the recursion; they change as the call is repeated downward, bringing the recursive calls closer to some terminal non-recursive case. Here, we have two find() methods, one that takes the Object that we wish to find, and a second overloaded variant that takes that Object together with a treeNode pointer which represents the root of the subtree in which we are looking. Initially, the subtree that we are searching is the whole tree, as we see by noting that the first call to the overloaded find() passes mRoot as the extra parameter. What you will see when we study the implementation of find() is that every recursive call will change the parameter root to move it further down the tree toward its leafs, so that we are searching in an ever shrinking tree. Eventually we find the Object (or reach a NULL). When we do, we pass the FHtreeNode pointer where it lives (or the NULL) back up to the original call.

This is a theme in recursion, and you will see pairs of functions, the public version and the more technical, often private version. It is crucial that you understand why this is done and that it must be done.

This overloaded variation is usually private in most data structures since it is not good form to allow the client such technical access to it through these extra parameters. However, in our FHtree, we will find that it is useful to let the client search for something in some subtree that it has already identified. For example, there may be several "left arm" objects in the tree, one that belongs to "Carlo", one that belongs to "Singh", one that belongs to "Nana", and so on. So we first find the node that points to Carlo:

FHtreeNode *tn = sceneTree.find("Carlo");

The returned FHtreeNode pointer now identifies the portion of the tree that defines "Carlo" and it is in this subtree that we wish to search. From there, we can start our search for "left arm":

tn = sceneTree.find(tn, "left arm");

Note the first find() invocation uses the simpler variant that takes no subtree parameter and starts from the root, but the second find() invocation uses the more detailed variant and thus starts from a subtree of interest.

By the way, the int parameter, level, is not important to this discussion so I am ignoring it here.

This is an overview of a general tree. Next I'll show you how to implement a find() and a few other methods of the FHtree. You will, of course, receive a complete include file that you can play with, FHtree.h.