'Hash Table' is a data structure, where data is stored in Key Value Pairs.
Let me make it a little simple.
Say, you are a class teacher and you made a rule in your class that all the students in your class should sit based on their roll number.
And off course, roll numbers are pasted on their seats.
Now, if you want to search for a particular student, you would simply look into your register where the names of the students are written based on their roll number.
The below is how your register looks like.
Roll Number | Name |
---|---|
1 | Sean Miller |
2 | Rahul Kumar |
... | ... |
... | ... |
Now, it is quite simple! If you want to find a student. You just have to know his/her roll number and you can find him/her.
And exactly the same logic applies for a 'Hash Table'.
So, let us represent your 'register' that has the roll number and the name of the students as a 'Hash Table'.
And, the 'roll number' would be the 'key' for the 'Hash Table' and the 'Name' would be the 'value' for 'Hash Table'.
Seems quite easy, right ?
But in the real world there could be data that would be quite large. Say, you were asked to store the names and addresses of all the people in a particular State.
There could be around a millions of people residing in a State. You can't simply assign a roll number to each people.
So, how would you store bulk amount of people in your 'Hash Table'?
In such cases you would have to calculate the 'Hash Code' and associate the 'Hash Code' to all the people for that State(The same way you have assigned a roll number for each student).
So, that anytime you want to find a particular person, you pass the 'Hash Code' to the 'Hash Table' and you get back the name of that person.
Now, the million dollar question would be,
Let us take the below example to understand 'Hash Code'.
Just imagine there is a hall and the desks and benches in the hall are placed in onlya single row. Benches are marked from roll numbers 1 to 10.
Now, there are a group of students and we need to calculate their rollno based on their names.And make them sit on their seats based on their roll numbers.
But how? Let's mark the alphabets from 1 to 26. i.e. A is 1, B is 2, C is 3 and so on.
Let's suppose there is a guy 'CAB' and we will make his roll number by adding the letters in his names.
i.e.
So, the added value of 'CAB' is 3+1+2 = 6.
So, 'CAB' is assigned roll no 6. And he is made to sit on seat number 6.
Similarly, if a girl named 'ABA' comes, her rollno would be A-1, B-2, A-1. i.e. 1+2+1=4.
Thus, 'ABA' would be assigned roll no 4. And she is made to sit on seat number 4.
So, in this way roll numbers would be assigned to all the students in that group.
And that is Hashing and the roll no we are calculating is called its HashCode.
But stop, think for a while. There can be a guy named 'BAC'.
So, his roll number would be: B-2, A-1, C-3. i.e 2+1+3=6.
Strange ! 'CAB' and 'BAC' both have roll number 6.
This is called a 'Hash Collision' where two elements(or guys in this case) have the same roll numbers.
So, one more seat will be reserved for 'BAC' with number 6 attached on the seat.
Now, just think for a moment!
If 'CAB' and 'BAC' can have the same roll number i.e. 6. There can be so many other students who can have the same roll number 6.
So, it sounds a little weird because in a class, ideally one student should have a unique roll number assigned to him.
How can two or more students have the same roll number ?
Well! In case of 'Hash Table' data structure, 'Hash Collisions' can happen.
So, the logic of calculating 'Hash Code' (roll number in this case) should be in such a way that 'Hash Collisions' (equal roll numbers in this case) should be as minimal as possible.
Now, let us consider the above structure to be a 'Hash Table'. And also let us suppose, we have inserted the 'age' of the particular student along with their names.
So, for 'ABA', we calculate the 'Hash Code' which is,
And we go to location 4 and insert 'ABA' along with her age there(Say ABA's age is 13).
And we go to location 4 and insert 'ABA' along with her age there(Say ABA's age is 13).
So, far we have seen, how to store a name and age to the 'Hash Table'.
Now, we want to retrieve the age of a particular student.
As we know 'ABA' is stored in the 'Hash Table'. Now, let us retrieve ABA's age from the 'Hash Table'.
So, the first thing we would do is, calculate the 'Hash Code' for 'ABA'.
And we go to location 4 and find 'ABA's age i.e. 13.
Now, let us try to find 'BAC's age.
So, we will calculate the 'Hash Code' for 'BAC'.
So, we go to location 6 and find 'CAB' there. But we also know 'BAC' should be in that location.
And we start searching all the names in location 6. And find 'BAC's age i.e. 15.