Cuckoo Hashing – Worst case O(1) Lookup!
Last Updated :
11 Jan, 2023
Background :
There are three basic operations that must be supported by a hash table (or a dictionary):
- Lookup(key): return true if key is there on the table, else false
- Insert(key): add the item ‘key’ to the table if not already present
- Delete(key): removes ‘key’ from the table
Collisions are very likely even if we have a big table to store keys. Using the results from the birthday paradox: with only 23 persons, the probability that two people share the same birth date is 50%! There are 3 general strategies towards resolving hash collisions:
- Closed addressing or Chaining: store colliding elements in an auxiliary data structure like a linked list or a binary search tree.
- Open addressing: allow elements to overflow out of their target bucket and into other spaces.
Although above solutions provide expected lookup cost as O(1), the expected worst-case cost of a lookup in Open Addressing (with linear probing) is ?(log n) and ?(log n / log log n) in simple chaining (Source : Standford Lecture Notes). To close the gap of expected time and worst case expected time, two ideas are used:
- Multiple-choice hashing: Give each element multiple choices for positions where it can reside in the hash table
- Relocation hashing: Allow elements in the hash table to move after being placed
Cuckoo Hashing :
Cuckoo hashing applies the idea of multiple-choice and relocation together and guarantees O(1) worst case lookup time!
- Multiple-choice: We give a key two choices the h1(key) and h2(key) for residing.
- Relocation: It may happen that h1(key) and h2(key) are preoccupied. This is resolved by imitating the Cuckoo bird: it pushes the other eggs or young out of the nest when it hatches. Analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location. This leaves us with the problem of re-placing the older key.
- If the alternate position of older key is vacant, there is no problem.
- Otherwise, the older key displaces another key. This continues until the procedure finds a vacant position, or enters a cycle. In the case of a cycle, new hash functions are chosen and the whole data structure is ‘rehashed’. Multiple rehashes might be necessary before Cuckoo succeeds.
Insertion is expected O(1) (amortized) with high probability, even considering the possibility of rehashing, as long as the number of keys is kept below half of the capacity of the hash table, i.e., the load factor is below 50%.
Deletion is O(1) worst-case as it requires inspection of just two locations in the hash table.
Illustration
Input:
{20, 50, 53, 75, 100, 67, 105, 3, 36, 39}
Hash Functions:
h1(key) = key%11
h2(key) = (key/11)%11
Let’s start by inserting 20 at its possible position in the first table determined by h1(20):
Next: 50
Next: 53. h1(53) = 9. But 20 is already there at 9. We place 53 in table 1 & 20 in table 2 at h2(20)
Next: 75. h1(75) = 9. But 53 is already there at 9. We place 75 in table 1 & 53 in table 2 at h2(53)
Next: 100. h1(100) = 1.
Next: 67. h1(67) = 1. But 100 is already there at 1. We place 67 in table 1 & 100 in table 2
Next: 105. h1(105) = 6. But 50 is already there at 6. We place 105 in table 1 & 50 in table 2 at h2(50) = 4. Now 53 has been displaced. h1(53) = 9. 75 displaced: h2(75) = 6.
Next: 3. h1(3) = 3.
Next: 36. h1(36) = 3. h2(3) = 0.
Next: 39. h1(39) = 6. h2(105) = 9. h1(100) = 1. h2(67) = 6. h1(75) = 9. h2(53) = 4. h1(50) = 6. h2(39) = 3.
Here, the new key 39 is displaced later in the recursive calls to place 105, which it displaced.
Implementation:
Below is the implementation of Cuckoo hashing
C++
#include<bits/stdc++.h>
#define MAXN 11
#define ver 2
int hashtable[ver][MAXN];
int pos[ver];
void initTable()
{
for ( int j=0; j<MAXN; j++)
for ( int i=0; i<ver; i++)
hashtable[i][j] = INT_MIN;
}
int hash( int function, int key)
{
switch (function)
{
case 1: return key%MAXN;
case 2: return (key/MAXN)%MAXN;
}
}
void place( int key, int tableID, int cnt, int n)
{
if (cnt==n)
{
printf ( "%d unpositioned\n" , key);
printf ( "Cycle present. REHASH.\n" );
return ;
}
for ( int i=0; i<ver; i++)
{
pos[i] = hash(i+1, key);
if (hashtable[i][pos[i]] == key)
return ;
}
if (hashtable[tableID][pos[tableID]]!=INT_MIN)
{
int dis = hashtable[tableID][pos[tableID]];
hashtable[tableID][pos[tableID]] = key;
place(dis, (tableID+1)%ver, cnt+1, n);
}
else
hashtable[tableID][pos[tableID]] = key;
}
void printTable()
{
printf ( "Final hash tables:\n" );
for ( int i=0; i<ver; i++, printf ( "\n" ))
for ( int j=0; j<MAXN; j++)
(hashtable[i][j]==INT_MIN)? printf ( "- " ):
printf ( "%d " , hashtable[i][j]);
printf ( "\n" );
}
void cuckoo( int keys[], int n)
{
initTable();
for ( int i=0, cnt=0; i<n; i++, cnt=0)
place(keys[i], 0, cnt, n);
printTable();
}
int main()
{
int keys_1[] = {20, 50, 53, 75, 100, 67, 105,
3, 36, 39};
int n = sizeof (keys_1)/ sizeof ( int );
cuckoo(keys_1, n);
int keys_2[] = {20, 50, 53, 75, 100, 67, 105,
3, 36, 39, 6};
int m = sizeof (keys_2)/ sizeof ( int );
cuckoo(keys_2, m);
return 0;
}
|
Java
import java.util.*;
class GFG
{
static int MAXN = 11 ;
static int ver = 2 ;
static int [][]hashtable = new int [ver][MAXN];
static int []pos = new int [ver];
static void initTable()
{
for ( int j = 0 ; j < MAXN; j++)
for ( int i = 0 ; i < ver; i++)
hashtable[i][j] = Integer.MIN_VALUE;
}
static int hash( int function, int key)
{
switch (function)
{
case 1 : return key % MAXN;
case 2 : return (key / MAXN) % MAXN;
}
return Integer.MIN_VALUE;
}
static void place( int key, int tableID, int cnt, int n)
{
if (cnt == n)
{
System.out.printf( "%d unpositioned\n" , key);
System.out.printf( "Cycle present. REHASH.\n" );
return ;
}
for ( int i = 0 ; i < ver; i++)
{
pos[i] = hash(i + 1 , key);
if (hashtable[i][pos[i]] == key)
return ;
}
if (hashtable[tableID][pos[tableID]] != Integer.MIN_VALUE)
{
int dis = hashtable[tableID][pos[tableID]];
hashtable[tableID][pos[tableID]] = key;
place(dis, (tableID + 1 ) % ver, cnt + 1 , n);
}
else
hashtable[tableID][pos[tableID]] = key;
}
static void printTable()
{
System.out.printf( "Final hash tables:\n" );
for ( int i = 0 ; i < ver; i++, System.out.printf( "\n" ))
for ( int j = 0 ; j < MAXN; j++)
if (hashtable[i][j] == Integer.MIN_VALUE)
System.out.printf( "- " );
else
System.out.printf( "%d " , hashtable[i][j]);
System.out.printf( "\n" );
}
static void cuckoo( int keys[], int n)
{
initTable();
for ( int i = 0 , cnt = 0 ; i < n; i++, cnt = 0 )
place(keys[i], 0 , cnt, n);
printTable();
}
public static void main(String[] args)
{
int keys_1[] = { 20 , 50 , 53 , 75 , 100 ,
67 , 105 , 3 , 36 , 39 };
int n = keys_1.length;
cuckoo(keys_1, n);
int keys_2[] = { 20 , 50 , 53 , 75 , 100 ,
67 , 105 , 3 , 36 , 39 , 6 };
int m = keys_2.length;
cuckoo(keys_2, m);
}
}
|
Python3
MAXN = 11
ver = 2
hashtable = [[ float ( 'inf' )] * MAXN for _ in range (ver)]
pos = [ 0 ] * ver
def init_table():
for i in range (ver):
for j in range (MAXN):
hashtable[i][j] = float ( 'inf' )
def hash (function, key):
if function = = 1 :
return key % MAXN
elif function = = 2 :
return (key / / MAXN) % MAXN
def place(key, table_id, cnt, n):
if cnt = = n:
print (f "{key} unpositioned" )
print ( "Cycle present. REHASH." )
return
for i in range (ver):
pos[i] = hash (i + 1 , key)
if hashtable[i][pos[i]] = = key:
return
if hashtable[table_id][pos[table_id]] ! = float ( 'inf' ):
dis = hashtable[table_id][pos[table_id]]
hashtable[table_id][pos[table_id]] = key
place(dis, (table_id + 1 ) % ver, cnt + 1 , n)
else :
hashtable[table_id][pos[table_id]] = key
def print_table():
print ( "Final hash tables:" )
for i in range (ver):
print ()
for j in range (MAXN):
if hashtable[i][j] = = float ( 'inf' ):
print ( "- " , end = "")
else :
print (f "{hashtable[i][j]} " , end = "")
print ()
def cuckoo(keys, n):
init_table()
for i in range (n):
cnt = 0
place(keys[i], 0 , cnt, n)
print_table()
def main():
keys_1 = [ 20 , 50 , 53 , 75 , 100 , 67 , 105 , 3 , 36 , 39 ]
cuckoo(keys_1, len (keys_1))
keys_2 = [ 20 , 50 , 53 , 75 , 100 , 67 , 105 , 3 , 36 , 39 , 6 ]
cuckoo(keys_2, len (keys_2))
if __name__ = = "__main__" :
main()
|
C#
using System;
class GFG
{
static int MAXN = 11;
static int ver = 2;
static int [,]hashtable = new int [ver, MAXN];
static int []pos = new int [ver];
static void initTable()
{
for ( int j = 0; j < MAXN; j++)
for ( int i = 0; i < ver; i++)
hashtable[i, j] = int .MinValue;
}
static int hash( int function, int key)
{
switch (function)
{
case 1: return key % MAXN;
case 2: return (key / MAXN) % MAXN;
}
return int .MinValue;
}
static void place( int key, int tableID,
int cnt, int n)
{
if (cnt == n)
{
Console.Write( "{0} unpositioned\n" , key);
Console.Write( "Cycle present. REHASH.\n" );
return ;
}
for ( int i = 0; i < ver; i++)
{
pos[i] = hash(i + 1, key);
if (hashtable[i, pos[i]] == key)
return ;
}
if (hashtable[tableID,
pos[tableID]] != int .MinValue)
{
int dis = hashtable[tableID, pos[tableID]];
hashtable[tableID, pos[tableID]] = key;
place(dis, (tableID + 1) % ver, cnt + 1, n);
}
else
hashtable[tableID, pos[tableID]] = key;
}
static void printTable()
{
Console.Write( "Final hash tables:\n" );
for ( int i = 0; i < ver;
i++, Console.Write( "\n" ))
for ( int j = 0; j < MAXN; j++)
if (hashtable[i, j] == int .MinValue)
Console.Write( "- " );
else
Console.Write( "{0} " ,
hashtable[i, j]);
Console.Write( "\n" );
}
static void cuckoo( int []keys, int n)
{
initTable();
for ( int i = 0, cnt = 0;
i < n; i++, cnt = 0)
place(keys[i], 0, cnt, n);
printTable();
}
public static void Main(String[] args)
{
int []keys_1 = {20, 50, 53, 75, 100,
67, 105, 3, 36, 39};
int n = keys_1.Length;
cuckoo(keys_1, n);
int []keys_2 = {20, 50, 53, 75, 100,
67, 105, 3, 36, 39, 6};
int m = keys_2.Length;
cuckoo(keys_2, m);
}
}
|
Javascript
<script>
let MAXN = 11;
let ver = 2;
let hashtable = new Array(ver);
for ( var i = 0; i < hashtable.length; i++) {
hashtable[i] = new Array(2);
}
let pos = Array(ver).fill(0);
function initTable()
{
for (let j = 0; j < MAXN; j++)
for (let i = 0; i < ver; i++)
hashtable[i][j] = Number.MIN_VALUE;
}
function hash( function , key)
{
switch ( function )
{
case 1: return key % MAXN;
case 2: return (Math.floor(key / MAXN)) % MAXN;
}
return Number.MIN_VALUE;
}
function place(key, tableID, cnt, n)
{
if (cnt == n)
{
document.write(key + " unpositioned" + "<br/>" );
document.write( "Cycle present. REHASH." + "<br/>" );
return ;
}
for (let i = 0; i < ver; i++)
{
pos[i] = hash(i + 1, key);
if (hashtable[i][pos[i]] == key)
return ;
}
if (hashtable[tableID][pos[tableID]] != Number.MIN_VALUE)
{
let dis = hashtable[tableID][pos[tableID]];
hashtable[tableID][pos[tableID]] = key;
place(dis, (tableID + 1) % ver, cnt + 1, n);
}
else
hashtable[tableID][pos[tableID]] = key;
}
function printTable()
{
document.write( "Final hash tables:" + "<br/>" );
for (let i = 0; i < ver; i++, document.write( "<br/>" ))
for (let j = 0; j < MAXN; j++)
if (hashtable[i][j] == Number.MIN_VALUE)
document.write( "- " );
else
document.write(hashtable[i][j] + " " );
document.write( "<br/>" );
}
function cuckoo(keys, n)
{
initTable();
for (let i = 0, cnt = 0; i < n; i++, cnt = 0)
place(keys[i], 0, cnt, n);
printTable();
}
let keys_1 = [20, 50, 53, 75, 100,
67, 105, 3, 36, 39];
let n = keys_1.length;
cuckoo(keys_1, n);
let keys_2 = [20, 50, 53, 75, 100,
67, 105, 3, 36, 39, 6];
let m = keys_2.length;
cuckoo(keys_2, m);
</script>
|
Output
Final hash tables:
- 100 - 36 - - 50 - - 75 -
3 20 - 39 53 - 67 - - 105 -
105 unpositioned
Cycle present. REHASH.
Final hash tables:
- 67 - 3 - - 39 - - 53 -
6 20 - 36 50 - 75 - - 100 -
Time Complexity: O(N), the time complexity of the Cuckoo Hashing algorithm is O(N), where N is the number of keys to be stored in the hash table. This is because the algorithm requires only one pass over the list of keys to place them in the hash table.
Auxiliary Space: O(N), the space complexity of the Cuckoo Hashing algorithm is O(N), where N is the number of keys stored in the hash table. This is because the algorithm requires an auxiliary space of size equal to the hash table, where all the keys are stored.
Generalizations of cuckoo hashing that use more than 2 alternative hash functions can be expected to utilize a larger part of the capacity of the hash table efficiently while sacrificing some lookup and insertion speed. Example: if we use 3 hash functions, it’s safe to load 91% and still be operating within expected bounds.
Please Login to comment...