
Separate Chaining Collision Handling Technique in Hashing

Last Updated : 12 Jun, 2024

Separate chaining is one of the most popular and commonly used techniques for handling collisions in a hash table. This article explains what separate chaining is, how it works, and what its advantages and disadvantages are.

What is Collision? 

Since a hash function maps a key, which may be a big integer or a string, to a small number, two different keys can produce the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called a collision, and it must be handled using a collision handling technique.

What are the chances of collisions with a large table? 

Collisions are very likely even if we have a big table to store keys. The Birthday Paradox illustrates this: with only 23 people, the probability that at least two of them share a birthday is about 50%, even though there are 365 possible birthdays.
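
The birthday figure can be checked with a few lines of Python. This is a minimal sketch that assumes every key is equally likely to land in any slot; the function name `collision_probability` is made up for this illustration.

```python
def collision_probability(n_keys: int, n_slots: int) -> float:
    """Probability that at least two of n_keys land in the same slot,
    assuming each key is equally likely to hash to any of n_slots."""
    if n_keys > n_slots:
        return 1.0  # pigeonhole: more keys than slots forces a collision
    p_all_distinct = 1.0
    for i in range(n_keys):
        # the (i+1)-th key must avoid the i slots already taken
        p_all_distinct *= (n_slots - i) / n_slots
    return 1.0 - p_all_distinct


# 23 people, 365 possible birthdays
print(round(collision_probability(23, 365), 3))  # 0.507
```

The same function can be called with a table size in place of 365 to see how quickly collisions become likely as keys are added.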

How to handle Collisions? 

There are mainly two methods to handle collisions: 

  • Separate Chaining 
  • Open Addressing 

In this article, only separate chaining is discussed. Open addressing is discussed in the next post.

Separate Chaining:

The idea behind separate chaining is to make each slot of the hash table array point to a linked list of entries, called a chain.

When multiple elements hash to the same slot index, they are all inserted into the singly-linked list that belongs to that slot. 

To search for a key K, we hash it to find the slot and then traverse that slot's linked list linearly. If the key of an entry is equal to K, we have found our entry. If we reach the end of the linked list without finding it, the entry does not exist. In short, if two different elements have the same hash value in separate chaining, both are stored in the same linked list, one after the other.

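Example: Let us consider a simple hash function “key mod 5” and the sequence of keys 12, 22, 15, 25. Keys 12 and 22 both hash to slot 2, and keys 15 and 25 both hash to slot 0, so each pair ends up in the same chain while slots 1, 3 and 4 stay empty.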
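
A minimal Python sketch of this example follows. It assumes integer keys and a table of 5 slots; the names `Node`, `ChainedHashTable` and `chain` are hypothetical and chosen only for this illustration. New keys are appended at the tail of a chain so that the printed chains keep insertion order.

```python
class Node:
    """One entry in a chain (singly linked list)."""

    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedHashTable:
    def __init__(self, num_slots=5):
        self.num_slots = num_slots
        self.slots = [None] * num_slots  # each slot holds the head of a chain

    def _index(self, key):
        return key % self.num_slots  # the "key mod 5" hash from the example

    def insert(self, key):
        idx = self._index(key)
        if self.slots[idx] is None:
            self.slots[idx] = Node(key)  # first key in this slot
            return
        node = self.slots[idx]
        while True:
            if node.key == key:
                return  # key already present, nothing to do
            if node.next is None:
                break  # reached the tail of the chain
            node = node.next
        node.next = Node(key)  # collision: link the new key at the tail

    def search(self, key):
        node = self.slots[self._index(key)]
        while node is not None:  # linear traversal of one chain
            if node.key == key:
                return True
            node = node.next
        return False

    def chain(self, idx):
        """Return the keys stored in slot idx as a plain list."""
        keys, node = [], self.slots[idx]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedHashTable(num_slots=5)
for key in (12, 22, 15, 25):
    table.insert(key)

for i in range(table.num_slots):
    print(i, table.chain(i))
# 0 [15, 25]
# 1 []
# 2 [12, 22]
# 3 []
# 4 []
```

Inserting at the head of the chain instead of the tail would avoid the walk to the end, at the cost of not detecting duplicates.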

For a C++ implementation of separate chaining, refer to the following link:
C++ program for hashing with chaining 

Advantages:

  • Simple to implement. 
  • The hash table never fills up; we can always add more elements to a chain. 
  • Less sensitive to the hash function or load factors. 
  • It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted. 

Disadvantages: 

  • The cache performance of chaining is not good as keys are stored using a linked list. Open addressing provides better cache performance as everything is stored in the same table. 
  • Wastage of space (some slots of the hash table may never be used) 
  • If the chain becomes long, then search time can become O(n) in the worst case
  • Uses extra space for links

Performance of Chaining: 

Performance of hashing can be evaluated under the assumption that each key is equally likely to be hashed to any slot of the table (simple uniform hashing).  

m = Number of slots in hash table
n = Number of keys to be inserted in hash table

Load factor α = n/m
Expected time to search = O(1 + α)
Expected time to delete = O(1 + α)

Time to insert = O(1)
Time complexity of search, insert and delete is O(1) if α is O(1)
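
The load factor is also the average chain length, which is where the O(1 + α) bound comes from: one step to compute the slot, plus a walk over a chain of α keys on average. The sketch below checks this on random data. It assumes distinct integer keys and uses `key % m` as a stand-in for a uniform hash function; `build_chains` is a hypothetical helper.

```python
import random


def build_chains(keys, m):
    """Distribute keys over m slots; each slot's chain is a Python list."""
    chains = [[] for _ in range(m)]
    for key in keys:
        chains[key % m].append(key)  # assumption: key % m spreads keys evenly
    return chains


random.seed(0)
m = 100                                    # number of slots
keys = random.sample(range(10**6), 300)    # n = 300 distinct keys
chains = build_chains(keys, m)

alpha = len(keys) / m                             # load factor n/m
average_length = sum(len(c) for c in chains) / m  # mean chain length
longest = max(len(c) for c in chains)             # worst chain in this run

print(alpha, average_length, longest)
```

The first two printed values are always equal, since every key sits in exactly one chain. The longest chain is usually larger than α, which is why the bound is an expected cost rather than a worst-case one.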

Data Structures For Storing Chains: 

1. Linked lists

  • Search: O(l) where l = length of linked list
  • Delete: O(l)
  • Insert: O(l) if the chain is first searched for a duplicate; O(1) if the key is simply added at the head
  • Not cache friendly

2. Dynamic Sized Arrays ( Vectors in C++, ArrayList in Java, list in Python)

  • Search: O(l) where l = length of array
  • Delete: O(l)
  • Insert: O(l) if the array is first searched for a duplicate; amortized O(1) if the key is simply appended
  • Cache friendly

3. Self Balancing BST ( AVL Trees, Red-Black Trees)

  • Search: O(log(l)) where l = number of keys in the tree
  • Delete: O(log(l))
  • Insert: O(log(l))
  • Not cache friendly
  • From Java 8 onwards, HashMap converts a long chain into a red-black tree
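
Of the three options above, the dynamic-array variant is the shortest to sketch in Python, because a built-in `list` can serve as each chain. The class name `ListChainedTable` and its methods are hypothetical, and Python's built-in `hash` is used here only as a convenient stand-in for a hash function.

```python
class ListChainedTable:
    """Separate chaining where every chain is a dynamic array (Python list)."""

    def __init__(self, num_slots=5):
        self.buckets = [[] for _ in range(num_slots)]

    def _bucket(self, key):
        # hash(key) is an assumption for this sketch; any hash function works
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key):
        bucket = self._bucket(key)
        if key not in bucket:    # O(l) scan for a duplicate
            bucket.append(key)   # amortized O(1) append

    def search(self, key):
        return key in self._bucket(key)  # O(l) linear scan

    def delete(self, key):
        bucket = self._bucket(key)
        try:
            bucket.remove(key)   # O(l): find the key, then shift the rest
            return True
        except ValueError:
            return False         # key was not in the table


t = ListChainedTable(num_slots=5)
for key in (12, 22, 15, 25):
    t.insert(key)

print(t.delete(12), t.search(12), t.search(22))  # True False True
```

Because each chain is stored contiguously, scanning it touches adjacent memory, which is the reason this variant is listed as cache friendly.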

Related Post: Hashing | Set 1 (Introduction)

Next Post: 
Open Addressing for Collision Handling 

