Sunday 29 January 2012

Hash Table, Hash Map

A hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys  to their associated values. Thus, a hash table implements an associative array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought. Ideally, the hash function should map each possible key to a unique slot index.

HashMap has a number of "buckets" which it uses to store key-value pairs in it. Each bucket has a unique number - that's what identifies the bucket. When you put a key-value pair into the map, the hashmap will look at the hash code of the key, and store the key-value pair in the bucket of which the identifier is the hash code of the key. For example: The hash code of the key is 235 -> the pair is stored in bucket number 235. (Note that one bucket can store more then one key-value pair).

When you lookup a value in the hashmap, by giving it a key, it will first look at the hash code of the key that you gave. The hashmap will then look into the corresponding bucket, and then it will compare the key that you gave with the keys of all pairs in the bucket, by comparing them with equals().

The equals() and hashCode()
The methods hashCode() and equals() play a distinct role in the objects you insert into Java collections.

The equals() is used in most collections to determine if a collection contains a given element.
The hashCode() method of objects is used when you insert them into a HashTable, HashMap or HashSet.

When inserting an object into a hastable you use a key. The hash code of this key is calculated, and used to determine where to store the object internally.
When you need to lookup an object in a hashtable you also use a key. The hash code of this key is calculated and used to determine where to search for the object.

The hash code only points to a certain "area" (bucket etc) internally. Since different key objects could potentially have the same hash code, the hash code itself is no guarantee that the right key is found. The hashtable then iterates this area (all keys with the same hash code) and uses the key's equals() method to find the right key. Once the right key is found, the object stored for that key is returned.

So, as you can see, a combination of the hashCode() and equals() methods are used when storing and when looking up objects in a hashtable.

Creation of HashTable:
Always we create Hashtable using the empty constructor Hashtable(). Which is a poor decision and an often repeated mistake. Hashtable has two other constructors
Hashtable(int initialCapacity)
and
Hashtable(int initialCapacity, float loadFactor)

Initial capacity is number of buckets created at the time of Hashtable instantiation. Bucket is a logical space of storage for Hashtable.
The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hashtable exceeds the product of the load factor and the current capacity, the capacity is increased by calling the rehash method.


No comments:

Post a Comment