Section 4 - Find() in Hash Tables
6B.4.1 find()
Did anyone notice that there is no find() mechanism? There's a contains(), but for that, you need the object you are looking for. It may be that you only have one field value of the object - the key, say - and you want to know if an object with that key is in the table. If anyone tries to sell you a hash table without a find(), hold on to your wallet.
For learning, what we did was fine. But if you want something you can actually use, you'll need find(). That's why you are going to write one this week. And this find() is different from the find() in a tree, so you'll explore some new horizons.
6B.4.2 Requirements for find()
We work backwards to see what we'll need to do to our hash class templates in order to accommodate find(). Here is a typical use of find() by the client:
// test find: try { // book = hashTable.find( 28493 ); book = hashTable.find( "Fanny, Aunt, 1822-1894" ); cout << "YES: " << book.getSCreator().substr(0,8) << " " << book.getSTitle().substr(0,10); } catch (...) { cout << "no. "; }
If we can make this little client-side fragment work, we will be happy. There are two find() lines, one commented-out. We want to be able to handle a look-up based on an int key or a string key, so we will try to make the program work with either find() (but not in the same run!)
We need a find() that takes the data type that was used as the key. Using the line that is not commented-out above, we see that this is a string from the creator field. Somehow we have to allow a program that lets us pass in a creator to find(), and find() will either return the book with that creator or throw a NotFoundException (which we will catch without caring exactly what the exception is called).
So, what things do you have to change to make this work?
On the Client Side
The hash function that the client supplies which takes an object of the underlying class (always required) must be defined correctly:
int Hash( const EBookEntry & item ) { // define carefully }
This is not as mysterious as it sounds. First, you decide which key field you want to base your equality on. It will either be an int or a string field. This is the same as the one used as a sort key (see below). Next, you pass this key field to hash() - which is one of the two primitive overloads, int or string, already written. So, really, this is literally a one line function.
Also, the client needs to provide a second global-scope function getKey() that returns the key value. We are assuming that we don't have access to the underlying data class, which is why we will make this global, like hash(). The return value will be either int or string, depending on what the type of the key is. Here is an example. If the underlying class had a string member email_addr, with accessor getEmailAddress(), and this was the choice for the key for building and using the hash table, then the client would supply the following one-liner global function for use by one of the hash table template functions (namely findPosKey(), described shortly):
string getKey( const MyDataType & item ) { return item.getEmailAddress() ; }
We'll need a find() function whose signature reflects the key field parameter, not an object. This could be an int or a string. Hmm. How can we allow this? Overload? No. Template parameter. Yes. Our class needs a second type parameter, KeyType, that we can establish when we instantiate (specialize) the template.
FHhashQPwFind<EBookEntry, string> hashTable;
This will inform the template what kind of key value we will be passing to find().
Also, our underlying object class must define equality correctly. This will be handled inside that class, but for us, we have designed-in the static setSortType() method which we can call in the client. This allows us to pick-and-choose our equality member at will.
On the Template Side
To make that last trick, on the client side, work, we need to define our class template with two parameters:
template <class Object, typename KeyType> class FHhashQPwFind { }
The second parameter, KeyType, is inside the FHhashQPwFind class in three places:
- find(): The signature will be
const Object find(const KeyType & key);
- findPosKey(): a function that will be called by find() which finds the position based on the key, not on the object:
int findPosKey( const KeyType & key ) const;
- myHashKey(): a function called by findPosKey() which gets the hash value based on the key, not on the object:
int myHashKey(const KeyType & key) const;
find() is very much like contains(), which should be used as a starting point. The hint is, while contains() has one line, find() has about three or four lines. The other two methods are trivial modifications to the findPos() and myHash() functions. We are not replacing those functions in the class - they are still needed. We are copying them and giving the copies these new names. Then we can adjust the copies to work with the key field.
Since all the other methods will be the same, we can use the existing FHhashQP or FHhashSC as a base class and derive FHhashQPwFind from it. This will be much easier than going into the FHhashQP or FHhashSC files and tinkering there. For one thing, since we are adding a second type parameter, if we tried to modify the original files, we would have to add this second parameter atop each definition. However, using inheritance, we can avoid that.
6B.4.3 Syntax Details
As we have seen before, whenever we derive from a template, there are a few thorns that might stick you if you're not careful. I'll list everything you need for this assignment so you are ready:
- From the new derived FHhashQPwithFind class methods, always refer to the base class instance members using this-> syntax, e.g., this->mTableSize and this->mArray.
- From the new derived FHhashQPwithFind class methods, always refer to base class static constants using full class qualifiers, e.g., FHhashQP<Object>::ACTIVE.
- From the new derived FHhashQPwithFind class methods, always use the typename keyword and full qualification in front of the declaration of any base class nested template type, e.g., typename FHhashQP<Object>::HashEntry searchResult;.
There are only a few places where these details will be needed, but now you've got a reference.
6B.4.4 An Alternate Approach
Another idea that might be used by some developers is to make find() take the object type. (We talked about this in the module on trees. ) This would obviate the requirement for a second type parameter. However, then the client would have to pass an object to find(). What object? The idea here would be that the client would build a default object, prior to the call, then mutate that one key field so it contains the value he is looking for. Then, he would pass that object to find() and find() would, if successful, return the actual object which matches the key field. This would require adding the find() method, only. No other changes to the template would be need. This is less work for the template designer and more work for the client designer.