Section 2 - mergeSort On Vectors

8B.2.1 merge On Vectors

The vector (or FHvector) version of merge() is not significantly different from its array counterpart. It has an additional complexity because we cannot pass sub-arrays as if they were original arrays (like we can with true arrays, by passing array + offset), so we need more int parameters defining the size and limits of the arrays. We can no longer assume, for example, that each array coming in to us starts at 0, as we could in the pure-array version.

Besides this, the logic is the same.

template <typename Comparable>
void merge(FHvector<Comparable> & client, FHvector<Comparable> & working,
   int leftPos, int rightPos, int rightStop)
{
   int leftStop, workingPos, arraySize;

   workingPos = leftPos;
   leftStop = rightPos - 1;
   arraySize = rightStop - leftPos + 1;

   // as soon as we reach the end of either input array, stop
   while(leftPos <= leftStop && rightPos <= rightStop)
      if(client[leftPos] < client[rightPos])
         working[workingPos++] = client[leftPos++];
      else
         working[workingPos++] = client[rightPos++];

   // merge is over; copy the remainder of one or the other input array
   while(leftPos <= leftStop)
      working[workingPos++] = client[leftPos++];
   while( rightPos <= rightStop )
      working[workingPos++] = client[rightPos++];

   // copy back into client array
   for( ; arraySize > 0; arraySize--, rightStop-- )
      client[rightStop] = working[rightStop];
}

8B.2.2 mergeSort()

The vector mergeSort() (same name for FHvectors as arrays allowed by overloading), is almost identical to the array couterpart:

// mergeSort internal
template <typename Comparable>
void mergeSort(FHvector<Comparable> & a, FHvector<Comparable> & working,
               int start, int stop)
{
   int rightStart;

   if (stop - start < 1)
      return;

   rightStart = (start + stop)/2 + 1;
   mergeSort(a, working, start, rightStart - 1);
   mergeSort(a, working, rightStart, stop);
   merge(a, working, start, rightStart, stop);
}

And we have the vector-version of the public client:

template <typename Comparable>
void mergeSort(FHvector<Comparable> & a)
{
   if (a.size() < 2)
      return;

   FHvector working(a.size());
   mergeSort(a, working, 0, a.size() - 1);
}

8B.2.3 A Client

Since we are now working with vectors, I wanted to show you the results when compared with STL sort().

First, the source.

// Comparison of vector merge and shell
#include <iostream>
using namespace std;
#include "Foothill_Sort.h"
#include "FHsort.h"
#include "FHvector.h"
#include <time.h>
#include <algorithm>
#include <vector>

// --------------- main ---------------
#define ARRAY_SIZE 1000000

int main()
{
   int k;
   FHvector<int> fhVectorOfInts;
   FHvector<int> fhVectorOfInts2;
   vector<int> stlVectorOfInts;

   // build an array and two vector for comparing sorts
   for (k = 0; k < ARRAY_SIZE; k++)
   {
      fhVectorOfInts.push_back( rand()%ARRAY_SIZE );
      fhVectorOfInts2.push_back( fhVectorOfInts[k] );
      stlVectorOfInts.push_back( fhVectorOfInts[k] );
   }

   clock_t startTime, stopTime;
   startTime = clock();  // ------------------ start 

   // sort using insertionSort (in FHsort.h)
   shellSort1(fhVectorOfInts); 
 
   stopTime = clock();  // ---------------------- stop
   cout << "\nFH Shell-1 Elapsed Time: " 
      << (double)(stopTime - startTime)/(double)CLOCKS_PER_SEC 
      << " seconds." << endl << endl;

   startTime = clock();  // ------------------ start 

   // sort using shellSort #11 (in FHsort.h)
   mergeSort(fhVectorOfInts2); 
 
   stopTime = clock();  // ---------------------- stop
   cout << "\nFH merge Elapsed Time: " 
      << (double)(stopTime - startTime)/(double)CLOCKS_PER_SEC 
      << " seconds." << endl << endl;

   startTime = clock();  // ------------------ start 

   // a pure stl sort - STL's best effort on its own vector types
   sort(stlVectorOfInts.begin(), stlVectorOfInts.end());

   stopTime = clock();  // ---------------------- stop
   cout << "\nSTL Elapsed Time: " 
      << (double)(stopTime - startTime)/(double)CLOCKS_PER_SEC 
      << " seconds." << endl << endl;

   // confirming that we got the correct results on all three arrays
   for (k = 0; k < ARRAY_SIZE; k+= ARRAY_SIZE/5)
   {
      cout << "shell 1 #" << k << ": " << fhVectorOfInts[k] << ", ";
      cout << "merge #" << k << ": " << fhVectorOfInts2[k] << ", ";
      cout << "STL sort #" << k << ": " << stlVectorOfInts[k] << endl;
   }
   return 0; 
}

I ran it twice, once on Windows and once on a Mac (unix-based) using a large (1 million element) vector.

//Windows
FH Shell-1 Elapsed Time: 0.281 seconds.


FH merge Elapsed Time: 0.156 seconds.


STL Elapsed Time: 0.063 seconds.

[snip]

Press any key to continue . . .


// Unix-based Mac
FH Shell-1 Elapsed Time: 1.44355 seconds.


FH merge Elapsed Time: 0.836456 seconds.


STL Elapsed Time: 0.068792 seconds.

[snip]

mergeSort is twice as fast as Shell (with Shell's original gap sequence). On Windows, STL sort is twice as fast as mergeSort, and on a Mac it is 12× as fast on this large vector. A little progress ...