
Hash Table Data Structure

Last Updated : 08 May, 2024

What is Hash Table?

A hash table is a data structure used to insert, look up, and remove key-value pairs quickly. It relies on hashing: a hash function translates each key into an index in an array, and that index serves as the storage location for the matching value. Different keys can map to the same index (a collision), so the table also needs a strategy for resolving collisions. In simple words, it maps keys to values.

What is Load factor?

A hash table’s load factor is the number of stored elements divided by the number of slots in the table. If the load factor is high, the table becomes crowded, collisions become more frequent, and searches take longer. A suitable load factor can be maintained by resizing the table as it fills, and a good hash function helps keep the elements evenly spread across the slots.

What is a Hash function?

A function that translates keys to array indices is known as a hash function. A good hash function distributes the keys evenly across the array, which reduces collisions and keeps lookups fast.

  • Integer universe assumption: The keys are assumed to be integers within a certain range. This enables the use of basic hashing operations such as division or multiplication hashing.
  • Hashing by division: This straightforward technique uses the remainder of the key divided by the array’s size as the index. It tends to perform well when the array size is a prime number, because regular patterns in the keys are then less likely to map to the same few indices.
  • Hashing by multiplication: This technique multiplies the key by a constant between 0 and 1 and takes the fractional part of the result. The index is then obtained by multiplying that fractional part by the array’s size and rounding down. It also works well when the keys are evenly distributed.
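
The two methods above can be sketched in a few lines of Python. This is a minimal illustration, not a production hash function: the function names, the sample keys, and the table size of 7 are made up for the example, and the constant (√5 − 1)/2 is one commonly suggested choice for the multiplier, assumed here for convenience.

```python
import math


def hash_division(key: int, m: int) -> int:
    """Division method: the remainder of key divided by the table size m."""
    return key % m


def hash_multiplication(key: int, m: int, a: float = (math.sqrt(5) - 1) / 2) -> int:
    """Multiplication method: scale the fractional part of key * a by m."""
    fractional = (key * a) % 1.0   # fractional part, in [0, 1)
    return int(m * fractional)     # round down to an index in [0, m)


keys = [10, 22, 31, 4, 15]  # hypothetical integer keys
m = 7                       # small prime table size, chosen only for illustration

print([hash_division(k, m) for k in keys])        # [3, 1, 3, 4, 1]
print([hash_multiplication(k, m) for k in keys])  # indices between 0 and 6
```

Note that keys 10 and 31 (and 22 and 15) land on the same index under the division method, which is exactly the kind of collision the techniques described later have to resolve.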

Choosing a hash function:

Selecting a decent hash function is based on the properties of the keys and the intended functionality of the hash table. Using a function that evenly distributes the keys and reduces collisions is crucial.

Criteria based on which a hash function is chosen:

  • To keep the number of collisions to a minimum, a good hash function should distribute the keys uniformly throughout the hash table. This implies that, for any pair of distinct keys, the probability of both hashing to the same position should be roughly the same and close to 1/m for a table with m slots.
  • To enable speedy hashing and key retrieval, the hash function should be computationally efficient.
  • Where keys are sensitive or may be chosen by an adversary, it should be hard to deduce a key from its hash value, so that attempts to guess the key from the hash are less likely to succeed. This matters mainly for cryptographic uses rather than for ordinary hash tables.
  • A hash function should be flexible enough to cope as the data being hashed changes. For instance, it needs to keep performing well if the keys change in size or format.

Collision resolution techniques:

Collisions happen when two or more keys map to the same array index. Separate chaining and open addressing (which includes linear probing, quadratic probing, and double hashing) are common techniques for resolving collisions.

  • Open addressing: Collisions are handled by searching for another empty slot in the table itself. If the slot given by the hash function is already taken, further slots are examined according to a probe sequence until an empty one is found. Variants of this approach include linear probing, quadratic probing, and double hashing.
  • Separate chaining: Each slot in the hash table holds a linked list of the entries that hash to it. If two keys hash to the same slot, both are stored in that slot’s list. This method is simple to implement and copes well with many collisions.
  • Robin Hood hashing: A variant of open addressing that evens out probe lengths by displacing keys. When a new key probes an already-occupied slot, the algorithm compares how far each of the two keys is from its ideal slot. If the existing key is closer to its ideal slot than the new key is, the new key takes the slot and the displaced key continues probing for a new position. This does not reduce the number of collisions, but it tends to reduce the variance in probe length and the worst-case lookup cost.
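
As a rough sketch of separate chaining, the class below keeps one chain per slot. The class name `ChainedHashTable` and its methods are hypothetical, Python lists stand in for the linked lists described above, and Python's built-in `hash()` is assumed as the hash function.

```python
class ChainedHashTable:
    """Minimal hash table using separate chaining (illustrative sketch only)."""

    def __init__(self, num_slots: int = 8) -> None:
        # Each slot holds a list of (key, value) pairs that acts as the chain.
        self.slots = [[] for _ in range(num_slots)]

    def _index(self, key) -> int:
        return hash(key) % len(self.slots)

    def put(self, key, value) -> None:
        chain = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # key already present: overwrite
                return
        chain.append((key, value))       # new key: append to the chain

    def get(self, key, default=None):
        for existing_key, value in self.slots[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def remove(self, key) -> bool:
        chain = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                del chain[i]
                return True
        return False


table = ChainedHashTable(num_slots=4)
# In CPython, small non-negative ints hash to themselves,
# so 1, 5 and 9 all land in slot 1 and share one chain.
for k in (1, 5, 9):
    table.put(k, str(k))
print(table.get(5))  # 5
```

Lookups only scan the chain of a single slot, so the cost of an operation depends on how long that chain is rather than on the size of the whole table.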

Dynamic resizing:

Dynamic resizing lets the hash table expand or contract as the number of stored elements changes. When the load factor crosses a chosen threshold, a new array of a different size is allocated and every element is rehashed into it. This keeps the load factor in a suitable range and lookups fast.
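
The sketch below shows one way growth can work, using open addressing with linear probing. The class name `ProbingHashTable`, the starting capacity of 8, the 0.5 load-factor limit, and the doubling policy are all assumptions made for the example. Only growth is shown; shrinking and deletion (which needs extra care under open addressing) are left out.

```python
class ProbingHashTable:
    """Linear-probing table that doubles in size when it gets too full (sketch)."""

    _EMPTY = object()  # sentinel marking an unused slot

    def __init__(self, capacity: int = 8, max_load: float = 0.5) -> None:
        # max_load is assumed to be below 1 so an empty slot always exists.
        self.capacity = capacity
        self.max_load = max_load
        self.size = 0
        self.slots = [self._EMPTY] * capacity

    def _find_slot(self, key) -> int:
        # Probe forward until we find the key or an empty slot.
        i = hash(key) % self.capacity
        while self.slots[i] is not self._EMPTY and self.slots[i][0] != key:
            i = (i + 1) % self.capacity
        return i

    def put(self, key, value) -> None:
        i = self._find_slot(key)
        if self.slots[i] is self._EMPTY:
            self.size += 1
        self.slots[i] = (key, value)
        if self.size / self.capacity > self.max_load:
            self._resize(self.capacity * 2)

    def get(self, key, default=None):
        entry = self.slots[self._find_slot(key)]
        return default if entry is self._EMPTY else entry[1]

    def _resize(self, new_capacity: int) -> None:
        # Indices depend on the capacity, so every entry must be rehashed.
        old_entries = [e for e in self.slots if e is not self._EMPTY]
        self.capacity = new_capacity
        self.slots = [self._EMPTY] * new_capacity
        self.size = 0
        for key, value in old_entries:
            self.put(key, value)


table = ProbingHashTable()
for n in range(20):
    table.put(n, n * n)
print(table.get(7), table.size, table.capacity)  # 49 20 64
```

With these settings the table grows from 8 to 16, 32, and then 64 slots while the 20 keys are inserted, so the load factor never stays above 0.5.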

Implementations of Hash Table

Most programming languages, including Python, Java, C++, and Ruby, provide hash tables in their standard libraries (for example, Python’s dict, Java’s HashMap, C++’s std::unordered_map, and Ruby’s Hash). A hash table can also be implemented as a custom data structure.

Example – Count characters in the String “geeksforgeeks”.

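In this example, we use a simple hashing technique to store the count of each character in the string: each lowercase letter is mapped to an array index by subtracting 'a' from it.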

C++
#include <iostream>
#include <string>
using namespace std;

int main() {
  //initialize a string
  string s="geeksforgeeks";
  
  // Using an array to store the count of each alphabet 
  // by mapping the character to an index value

  int arr[26]={0};
  
  //Storing the count
  for(size_t i=0;i<s.size();i++){
    arr[s[i]-'a']++;
  }
  
  //Search the count of the character
  char ch='e';
  
  // get count
  cout<<"The count of " <<ch<< " is " <<arr[ch-'a']<<endl;
  return 0;
}
Java
public class CharacterCount {
    public static void main(String[] args) {
        // Initialize a string
        String s = "geeksforgeeks";

        // Using an array to store the count of each alphabet
        // by mapping the character to an index value

        int[] arr = new int[26];

        // Storing the count
        for (int i = 0; i < s.length(); i++) {
            arr[s.charAt(i) - 'a']++;
        }

        // Search the count of the character
        char ch = 'e';

        // Get count
        System.out.println("The count of " + ch + " is " + arr[ch - 'a']);
    }
}
Python
# Initialize a string
s = "geeksforgeeks"

# Using a list to store the count of each alphabet
# by mapping the character to an index value
arr = [0] * 26

# Storing the count
for c in s:
    arr[ord(c) - ord('a')] += 1

# Search the count of the character
ch = 'e'

# Get count
print(f"The count of {ch} is {arr[ord(ch) - ord('a')]}")
C#
using System;

class Program {
    static void Main(string[] args) {
        //initialize a string
        string s = "geeksforgeeks";
        // Using an array to store the count of each alphabet 
        // by mapping the character to an index value
        int[] arr = new int[26];
        //Storing the count
        for (int i = 0; i < s.Length; i++) {
            arr[s[i] - 'a']++;
        }
        //Search the count of the character
        char ch = 'e';
        // get count
        Console.WriteLine("The count of " + ch + " is " + arr[ch - 'a']);
    }
}
Javascript
// Initialize a string
const s = "geeksforgeeks";

// Using an array to store the count of each alphabet
// by mapping the character to an index value

const arr = Array(26).fill(0);

// Storing the count
for (let i = 0; i < s.length; i++) {
  arr[s.charCodeAt(i) - 'a'.charCodeAt(0)]++;
}

// Search the count of the character
const ch = 'e';

// Get count
console.log(`The count of ${ch} is ${arr[ch.charCodeAt(0) - 'a'.charCodeAt(0)]}`);


Output:

The count of e is 4
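
The array-based versions above only work for lowercase letters, because the index is computed as the character minus 'a'. As a hedged alternative, the same count can be done with a built-in hash table. The sketch below uses Python's `collections.Counter`, a `dict` subclass, which accepts any character as a key.

```python
from collections import Counter

s = "geeksforgeeks"

# Counter is a dict subclass, so each character is hashed to find its count.
counts = Counter(s)

ch = "e"
print(f"The count of {ch} is {counts[ch]}")  # The count of e is 4
```

Characters that never occur simply report a count of 0, without the table needing a slot reserved for them in advance.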

Complexity Analysis of a Hash Table:

For lookup, insertion, and deletion, hash tables have an average-case time complexity of O(1). In the worst case, however, when many keys collide in the same slot or probe sequence, these operations can take O(n) time, where n is the number of elements in the table.
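
The gap between the average and worst case can be made concrete by measuring chain lengths under a good and a bad hash function. The helper `longest_chain`, the 1000 keys, and the 128 slots below are hypothetical choices for illustration, and separate chaining is assumed.

```python
def longest_chain(keys, num_slots, hash_fn):
    """Distribute keys into chained buckets and return the longest chain length."""
    buckets = [[] for _ in range(num_slots)]
    for key in keys:
        buckets[hash_fn(key) % num_slots].append(key)
    return max(len(bucket) for bucket in buckets)


keys = range(1000)

# Keys spread evenly: no chain is longer than 8, so lookups stay cheap.
print(longest_chain(keys, 128, lambda k: k))   # 8

# Degenerate hash function: every key collides, so one chain holds all n keys.
print(longest_chain(keys, 128, lambda k: 42))  # 1000
```

A lookup has to scan one chain, so the first case behaves like O(1) on average while the second degrades to O(n).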

Applications of Hash Table:

  • Hash tables are frequently used for indexing and searching massive volumes of data. A search engine might use a hash table to store the web pages that it has indexed.
  • Hash tables are commonly used to implement in-memory caches, enabling rapid access to frequently used information.
  • Hash functions are also widely used in cryptography to create digital signatures, validate data, and check data integrity, although those applications rely on cryptographic hash functions rather than on hash tables.
  • Hash tables can be used for implementing database indexes, enabling fast access to data based on key values. 

