Learnerslesson




Java - equals() & hashCode() with HashSet

So far we have learnt that when an object is added to a HashSet using the add() method, it is not simply appended to the end of the HashSet.


i.e. hashSet.add(s1)

Instead, Java calculates the hash code of the object and uses it to pick the location (bucket) where the object is stored. These locations are slots in the HashSet's internal table, not raw memory addresses.

i.e. If the hash code of 'object1' maps to location 6, it will be stored in the 6th location. Similarly, if the hash code of 'object2' maps to location 3, it will be stored in the 3rd location. And if there is an object 'object3' whose hash code also maps to location 6, that results in a hash collision. So 'object3' will be added to the 6th location, just below 'object1'.



But how is 'object3' getting added after 'object1'?

It's because, when there is a collision, a linked list is formed at that location. i.e. There is a collision in the 6th location, so a linked list is created at the 6th location and both 'object1' and 'object3' are added to that list.
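The idea of locations and collision chains can be sketched with a small model. This is a simplified illustration, not the real HashSet implementation: the class BucketSketch, its table size of 16, and the hand-picked hash codes 6 and 3 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Simplified model of hash buckets; NOT the real HashSet implementation.
class BucketSketch {

    // Hypothetical table: each slot holds a chain (linked list) of entries.
    private final List<LinkedList<String>> buckets = new ArrayList<>();

    BucketSketch(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Map a hash code to a slot; floorMod keeps the index non-negative.
    static int bucketIndex(int hashCode, int size) {
        return Math.floorMod(hashCode, size);
    }

    // Append the entry to the chain stored in its slot.
    void add(String name, int hashCode) {
        buckets.get(bucketIndex(hashCode, buckets.size())).add(name);
    }

    List<String> chainAt(int index) {
        return buckets.get(index);
    }

    public static void main(String[] args) {
        BucketSketch table = new BucketSketch(16);
        table.add("object1", 6);
        table.add("object2", 3);
        table.add("object3", 6); // collides with object1

        System.out.println(table.chainAt(6)); // [object1, object3]
        System.out.println(table.chainAt(3)); // [object2]
    }
}
```

Here 'object3' ends up in the same chain as 'object1', just below it, which mirrors the collision described above.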




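And how is the hash code of an object calculated by Java?

By default it is calculated from the identity of the object. This is often described as being based on the object's memory location, although the exact mechanism is left to the JVM. The important point is that it does not depend on the values stored inside the object.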
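A small check can make the default behaviour visible. The sketch below assumes a hypothetical helper class IdentityHashDemo and compares the inherited hashCode() with System.identityHashCode(), which returns the value the default hashCode() would give for an object.

```java
class IdentityHashDemo {

    // True when the object's hashCode() is the default, identity-based one.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Object first = new Object();
        Object second = new Object();

        // A plain Object does not override hashCode(), so this prints true.
        System.out.println(usesIdentityHash(first));

        // The two values are tied to the two instances, not to any content.
        System.out.println(first.hashCode() + " vs " + second.hashCode());
    }
}
```

The actual numbers vary from run to run, so the example prints them without promising specific values.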

So, whenever a new object is added to the HashSet, the hash code of that object is calculated by Java and the object is put in the respective location (same as the above example of object1, object2, object3).

Now, when the object is to be looked up using the contains() method, the hash code is calculated again and Java goes to that particular location to find the object.

Just like in the above case, if you provide 'Object2' to the contains() method of HashSet, Java will calculate the hash code of 'Object2' and look for the object in the 3rd location.


i.e. hashSet.contains(Object2)

But if you ask java to retrieve 'Object3', the contains() method will calculate the Hash Code using hashCode() method and reach the 6th location.


i.e. hashSet.contains(Object3)

But there are two objects in 6th location.

In this case Java uses the equals() method to check each object in that location until it finds 'Object3'.
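The lookup described above can be tried out with a class whose hash code is chosen by hand. The class Item and its fixedHash field are hypothetical names introduced only to force 'object1' and 'object3' into the same location.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionLookupDemo {

    // Hypothetical class: the hash code is passed in to force a collision.
    static class Item {
        final String name;
        final int fixedHash;

        Item(String name, int fixedHash) {
            this.name = name;
            this.fixedHash = fixedHash;
        }

        @Override
        public int hashCode() {
            return fixedHash;
        }

        // Compares by value; used to tell colliding objects apart.
        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Item)) {
                return false;
            }
            Item item = (Item) other;
            return fixedHash == item.fixedHash && name.equals(item.name);
        }
    }

    static Set<Item> buildSet() {
        Set<Item> hashSet = new HashSet<>();
        hashSet.add(new Item("object1", 6));
        hashSet.add(new Item("object2", 3));
        hashSet.add(new Item("object3", 6)); // same hash code as object1
        return hashSet;
    }

    public static void main(String[] args) {
        Set<Item> hashSet = buildSet();
        System.out.println(hashSet.contains(new Item("object3", 6))); // true
        System.out.println(hashSet.contains(new Item("object4", 6))); // false
        System.out.println(hashSet.size()); // 3
    }
}
```

Both 'object1' and 'object3' share the hash code 6, so only equals() can decide which one was asked for.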

Now, the point to note is that there is a hashCode() method in java.lang.Object, which takes care of generating the hash code for objects, unless we run into a situation where we have to calculate the hash code ourselves.


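But why would we have to calculate the hash code ourselves?

Note: By default, Java calculates hashCode() based on the identity of the object (often described as its memory location), not on the values stored in it.

Also remember that HashSet has no get() method. A Set has no index or key to fetch by, so we can only ask whether an element is present using contains(), or iterate over the elements.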

So, let's take the Human example to understand this:

We know there is a Human class with 'age' and 'name' as attributes.


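class Human {

  int age;
  String name;

  // Constructor: takes the age first and then the name.
  Human(int age, String name) {
    this.age = age;
    this.name = name;
  }

  int getAge() { return age; }
  String getName() { return name; }

}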

Now, let's create Human objects and add them to the HashSet.


import java.util.HashSet;
import java.util.Set;

public class TestCollection{
  public static void main(String[] arg){

  Human human1 = new Human(21,"Sham");
  Human human2 = new Human(42,"Paul");
  Human human3 = new Human(18,"John");

  // We will be adding these Human objects to the HashSet.
  Set<Human> hashSet = new HashSet<>(); // Declare a HashSet of Human objects.
  hashSet.add(human1); // Add the Human objects to the HashSet.
  hashSet.add(human2);
  hashSet.add(human3);

  // Below code creates a new object which has the details of 'Paul'.
  Human human4 = new Human(42,"Paul");
  hashSet.add(human4);

  // Shows how many objects the HashSet now holds.
  System.out.println(hashSet.size());
  }
}

So, what we have done in the above example is:

1) Created three Human objects


Human human1 = new Human(21,"Sham");
Human human2 = new Human(42,"Paul");
Human human3 = new Human(18,"John");

2) Declared a HashSet and added the Human objects to the HashSet.


Set<Human> hashSet = new HashSet<>();
hashSet.add(human1);

3) We are creating a new object 'human4' for holding the details of 'Paul'.


Human human4 = new Human(42,"Paul");

4) We have added the object 'human4' to the hashSet.


hashSet.add(human4);

But 'human4' contains the details of 'Paul', whose age is '42'. And if we look closely, 'Paul' is already in the hashSet, since we have already added the 'human2' object. So a duplicate entry is getting inserted into the HashSet. But why?
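The duplicate can be confirmed by counting the elements. The following is a minimal sketch, assuming a hypothetical wrapper class DuplicateDemo and a Human class that does not override equals() or hashCode().

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateDemo {

    // Human WITHOUT equals()/hashCode() overrides, as in the example above.
    static class Human {
        final int age;
        final String name;

        Human(int age, String name) {
            this.age = age;
            this.name = name;
        }
    }

    // Adds 'Paul' twice (as two separate objects) and reports the set size.
    static int sizeAfterAddingPaulTwice() {
        Set<Human> hashSet = new HashSet<>();
        hashSet.add(new Human(21, "Sham"));
        hashSet.add(new Human(42, "Paul"));
        hashSet.add(new Human(18, "John"));
        hashSet.add(new Human(42, "Paul")); // second 'Paul'
        return hashSet.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingPaulTwice()); // 4
    }
}
```

The size is 4 rather than 3, because the HashSet treats the two 'Paul' objects as different elements.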

Let's understand the cause:


How does a Set prevent duplicate entries on insertion?

When add() is called, the HashSet first uses hashCode() to find the location, and then uses the equals() method to compare the new object with the objects already stored at that location.

And if you remember, the default equals() inherited from java.lang.Object compares objects by reference, i.e. it checks whether both references point to the same object.

Now, when we try to insert an object ('human4' in the above case), the default hashCode() of 'human4' is unrelated to that of 'human2', and even if both landed in the same location, the default equals() would only compare the references.

In simple words, the default equals() method does not check what is stored inside the object. It only checks whether two references point to the same object in memory.
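The reference comparison can be seen directly by calling equals() on two objects with the same details. This sketch assumes a hypothetical class DefaultEqualsDemo with its own bare Human class that relies on the inherited equals().

```java
class DefaultEqualsDemo {

    // Human relying on the equals() inherited from java.lang.Object.
    static class Human {
        final int age;
        final String name;

        Human(int age, String name) {
            this.age = age;
            this.name = name;
        }
    }

    // Delegates to whatever equals() the first object provides.
    static boolean areEqual(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Human human2 = new Human(42, "Paul");
        Human human4 = new Human(42, "Paul"); // same details, different object
        Human alias = human2;                 // same object, second reference

        System.out.println(areEqual(human2, human4)); // false
        System.out.println(areEqual(human2, alias));  // true
    }
}
```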


How to solve this problem?

The solution is to override both the hashCode() and equals() methods in our class.

Let us redefine the hashCode() and equals() in the Human class.


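class Human {

  int age;
  String name;

  Human(int age, String name) {
    this.age = age;
    this.name = name;
  }

  int getAge() { return age; }
  String getName() { return name; }

  // Two Human objects are equal when their age and name match.
  @Override
  public boolean equals(Object object) {
    if (this == object)
      return true;
    // Rejects null and objects of other types instead of throwing an exception.
    if (!(object instanceof Human))
      return false;

    Human human = (Human) object;
    return human.getAge() == age && human.getName().equals(name);
  }

  // Equal objects must return the same hash code, so it is built from the same fields.
  // Assumes 'name' is never null.
  @Override
  public int hashCode() {
    return age + name.hashCode();
  }

}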

We have redefined the equals() method so that objects are compared based on their values and not on their memory location.

We have also overridden the hashCode() method with custom logic, so that the hash code is calculated from 'age' and 'name' and no longer from the identity of the object (which is what the default implementation of hashCode() uses). Two Human objects with the same details now produce the same hash code, land in the same location, and are recognised as equal.
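To see the effect, the earlier experiment can be repeated with the overridden methods. This is a sketch under the assumption that the hypothetical wrapper class OverrideDemo carries its own copy of Human, with the same equals() and hashCode() logic as described above.

```java
import java.util.HashSet;
import java.util.Set;

class OverrideDemo {

    // Human with value-based equals() and hashCode().
    static class Human {
        final int age;
        final String name;

        Human(int age, String name) {
            this.age = age;
            this.name = name;
        }

        @Override
        public boolean equals(Object object) {
            if (this == object) {
                return true;
            }
            if (!(object instanceof Human)) {
                return false;
            }
            Human other = (Human) object;
            return age == other.age && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return age + name.hashCode();
        }
    }

    static Set<Human> buildSet() {
        Set<Human> hashSet = new HashSet<>();
        hashSet.add(new Human(21, "Sham"));
        hashSet.add(new Human(42, "Paul"));
        hashSet.add(new Human(18, "John"));
        return hashSet;
    }

    public static void main(String[] args) {
        Set<Human> hashSet = buildSet();

        // add() returns false when an equal element is already present.
        boolean added = hashSet.add(new Human(42, "Paul"));

        System.out.println(added);          // false
        System.out.println(hashSet.size()); // 3
        System.out.println(hashSet.contains(new Human(42, "Paul"))); // true
    }
}
```

The second 'Paul' is rejected this time, and contains() finds 'Paul' even when it is given a freshly created object with the same details.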