Section 4 - Implementing Kruskal

11A.4.1 Prototpyte

We are implementing the kruskal algorithm using a class. The data of the class consists of

  1. a pointer to the input graph and
  2. a priority queue, or heap: a working data structure to store the bag of input edges.

The input graph is assigned by the constructor, but can be changed later using the mutator, setInGraph(). The driver for the algorithm is genKruskal() which computes the output graph for us, leaving the solution in the reference parameter graph supplied by the client. These are the main ingredients of the class. You can see them inside the prototype along with a few other members and methods that we'll discuss presently. Note the usual band of typedefs at the top which makes the vocabulary and syntax simpler throughout the subsequent member method definitions.

The prototype:

template <class Object, typename CostType>
class FHkruskal
{
   // internal typedefs to simplify syntax
   typedef FHvertex<Object, CostType> Vertex;
   typedef FHvertex<Object, CostType>* VertPtr;
   typedef map<VertPtr, CostType> EdgePairList;
   typedef set<VertPtr> VertPtrSet;
   typedef vector<VertPtrSet> Forest;
   typedef FHedge<Object, CostType>  Edge;
   typedef Edge * EdgePtr;
   typedef FHgraph<Object, CostType> Graph;

   // make priority_queue a min heap rather than max heap
   // it requires > operator be defined for the underlying Edge class
   typedef priority_queue<Edge, vector<Edge>, greater<Edge> > edgeHeap;

private:
   edgeHeap edgeHeap;
   const Graph *inGraph;

public:
   FHkruskal (Graph const & grph) { setInGraph(grph); }
   FHkruskal () : inGraph(NULL) {}
   void clear();
   void setInGraph(Graph const & grph)
   { 
      inGraph = &grph;
      clear();
   }

   // algorithms
   bool genKruskal(Graph &outGraph);

private:
   bool buildEdgeHeap();
   static void setUnion(VertPtrSet &a, VertPtrSet &b);
   static bool memberOf(VertPtr v, VertPtrSet &vset);
};

Let's do commentary now, from top to bottom.

We discussed the typedefs. edgeHeap is a min heap, so we need to supply a greater() functor to the constructor to get that "minimum" behavior. We discussed the one-parameter constructor. The default is included in case a client wants to create an empty object and use setInGraph() to later define it. clear() is used to reset the internal edgeHeap to an empty state. genKruskal() is the main method that we'll study in the next section.

11A.4.2 Helper Methods

There are three private methods which we tackle now:

buildEdgeHeap()

The input graph does not have any FHedge objects; the edges are implied by the adjacency lists buried in the FHvertex sets. Therefore we need to construct a true priority_queue of edges. This is done in a private helper method, buildEdgeHeap(). The plan is straightforward. We iterate through the set of vertices in the graph, and for each vertex, we form one edge for each vertex in its adjacency list. The details are best left to the source code:

template <class Object, typename CostType>
bool FHkruskal<Object, CostType>::buildEdgeHeap()
{
   VertPtrSet vertsInGraph;
   typename VertPtrSet::iterator vertIter;
   typename EdgePairList::iterator adjListIter;
   VertPtr src, dst;
   int cost;
   
   if (inGraph == NULL)
      return false;
   vertsInGraph = inGraph->getVertPtrSet();

   for (vertIter = vertsInGraph.begin(); 
      vertIter != vertsInGraph.end(); vertIter++)
   {
      src =  (*vertIter);
      for (adjListIter = src->adjList.begin(); 
         adjListIter != src->adjList.end(); ++adjListIter)
      {
         dst = (*adjListIter).first;
         cost = (*adjListIter).second;
         edgeHeap.push( Edge(src, dst, cost) );
      }
   }
   return true;
}

Since we are not in an FHgraph member function, we have to grab a copy of the vertex set in a local variable, vertsInGraph. Once we have this set, we can work with it. We are invoking an accessor which was not discussed earlier, getVertPtrSet(). This returns a full set object to the client, and you can go back and examine this accessor inside FHgraph if you like. Also, notice that we continue to work with the set of VerPtrs, not actual FHvertex objects. Dijksrta, kruskal and all algorithms will work with the pointer set for all the reasons we stated when we presented FHgraph.

Two Set Operations: setUnion() and MemberOf()

Because of the unwieldy nature of the STL set_union(), I showed you a simple option last week: a stand-alone setUnion() function. We will implement this idea, as well as a memberOf() operation. They are included in the template for expediency, but a more flexible solution would be to make these external template functions or functors in their own right.
// puts the union of sets a and b into a.  b unchanged
template <class Object, typename CostType>
void FHkruskal<Object, CostType>::setUnion(
   VertPtrSet &a, VertPtrSet &b)
{
   typename VertPtrSet::iterator iter;
   for (iter = b.begin(); iter != b.end(); iter++)
      a.insert(*iter);
}

template <class Object, typename CostType>
bool FHkruskal<Object, CostType>::memberOf(
   VertPtr v, VertPtrSet &vset)
{
   return ( vset.find(v) != vset.end() );
}

These two private methods are used by the kruskal implementation

11A.4.3 genKruskal()

We now get to the meat. The preparation for the main loop consists of the following:

After this, we have the main loop. The correct condition is while (vertexSets.size() > 1) since we must keep going until all the vertices merge into a single set. However, if the client hands us a disconnected set this may never happen. So, we add the condition !edgeHeap.empty() to protect ourselves.

Inside the loop we do exactly what the abstract kruskal outline (and our picturesque description) promised. We look at the two vertices in the edge, and if they belong to the same set in vert_sets[] we toss them. Otherwise, we add the edge to newEdges and merge the two vert_sets[]. Specifically, we add the second set into the first and erase() the second from our vector.

Upon completion of the loop, we use the mutator setGraph() which builds a new graph from scratch using newEdges (see FHgraph for details).

template <class Object, typename CostType>
bool FHkruskal<Object, CostType>::genKruskal(Graph &outGraph)
{
   typename VertPtrSet::iterator iter;
   typename Forest::iterator fIter, fIterSrc, fIterDst;
   Forest vertexSets;
   VertPtrSet vertsInGraph, singleton;
   Edge smallestEdge;
   VertPtr src, dst;
   vector<Edge> newEdges;
   int k, numVertsFound;

   // can't use same Graph for input and output
   if (inGraph == NULL || inGraph == &outGraph)
      return false;

   // get a local list of vertices
   vertsInGraph = inGraph->getVertPtrSet();

   // form a forest of sets, initializing each with only one vertex from the graph
   for (k = 0, iter = vertsInGraph.begin(); 
      iter != vertsInGraph.end(); iter++, k++)
   {
      singleton.clear(); 
      singleton.insert(*iter);
      vertexSets.push_back(singleton);
   }

   // form a binary min heap of edges so we can easily find min costs
   if (!buildEdgeHeap())
      return false;

   // test for empty to avoid inf. loop resulting from disconnected graph
   while (!edgeHeap.empty() && vertexSets.size() > 1)
   {
      // pop smallest edge left in heap
      smallestEdge = edgeHeap.top();
      src = smallestEdge.source;
      dst = smallestEdge.dest;
      edgeHeap.pop();

      // see if src and dst are in different sets.  if so, take union
      for (fIter = vertexSets.begin(), numVertsFound = 0 ; 
         fIter != vertexSets.end()  &&  (numVertsFound < 2) ; 
         fIter++)
      {
         if ( memberOf(src, *fIter) )
         {
            fIterSrc = fIter;
            numVertsFound++;
         }

         if ( memberOf(dst, *fIter) )
         {
            fIterDst = fIter;
            numVertsFound++;
         }
      }
      if (fIterSrc == fIterDst)  // same sets: reject
         continue;

      newEdges.push_back(smallestEdge);
      setUnion( *fIterSrc, *fIterDst );
      vertexSets.erase(fIterDst);
   }

   outGraph.setGraph(newEdges);
   return true;
}

On the next page, I provide both the full template file, FHkruskal.h, and the client main() for your convenience. Of course, you have FHkruskal.h as part of your course download.