How to do the following operations efficiently if there are large number of queries for them.
- Insertion
- Deletion
- Searching
- Clearing/Removing all the elements.
One solution is to use a Self-Balancing Binary Search Tree like Red-Black Tree, AVL Tree, etc. Time complexity of this solution for insertion, deletion and searching is O(Log n).
We can also use Hashing. With hashing, time complexity of first three operations is O(1). But time complexity of the fourth operation is O(n).
We can also use bit-vector (or direct access table), but bit-vector also requires O(n) time for clearing.
Sparse Set outperforms all BST, Hashing and bit vector. We assume that we are given range of data (or maximum value an element can have) and maximum number of elements that can be stored in set. The idea is to maintain two arrays: sparse[] and dense[].
dense[] ==> Stores the actual elements
sparse[] ==> This is like bit-vector where
we use elements as index. Here
values are not binary, but
indexes of dense array.
maxVal ==> Maximum value this set can
store. Size of sparse[] is
equal to maxVal + 1.
capacity ==> Capacity of Set. Size of sparse
is equal to capacity.
n ==> Current number of elements in
Set.
insert(x): Let x be the element to be inserted. If x is greater than maxVal or n (current number of elements) is greater than equal to capacity, we return.
If none of the above conditions is true, we insert x in dense[] at index n (position after last element in a 0 based indexed array), increment n by one (Current number of elements) and store n (index of x in dense[]) at sparse[x].
search(x): To search an element x, we use x as index in sparse[]. The value sparse[x] is used as index in dense[]. And if value of dense[sparse[x]] is equal to x, we return dense[x]. Else we return -1.
delete(x): To delete an element x, we replace it with last element in dense[] and update index of last element in sparse[]. Finally decrement n by 1.
clear(): Set n = 0.
print(): We can print all elements by simply traversing dense[].
Illustration:
Let there be a set with two elements {3, 5}, maximum
value as 10 and capacity as 4. The set would be
represented as below.
Initially:
maxVal = 10 // Size of sparse
capacity = 4 // Size of dense
n = 2 // Current number of elements in set
// dense[] Stores actual elements
dense[] = {3, 5, _, _}
// Uses actual elements as index and stores
// indexes of dense[]
sparse[] = {_, _, _, 0, _, 1, _, _, _, _,}
'_' means it can be any value and not used in
sparse set
Insert 7:
n = 3
dense[] = {3, 5, 7, _}
sparse[] = {_, _, _, 0, _, 1, _, 2, _, _,}
Insert 4:
n = 4
dense[] = {3, 5, 7, 4}
sparse[] = {_, _, _, 0, 3, 1, _, 2, _, _,}
Delete 3:
n = 3
dense[] = {4, 5, 7, _}
sparse[] = {_, _, _, _, 0, 1, _, 2, _, _,}
Clear (Remove All):
n = 0
dense[] = {_, _, _, _}
sparse[] = {_, _, _, _, _, _, _, _, _, _,}
Below is the implementation of above functions.
C++
#include<bits/stdc++.h>
using namespace std;
class SSet
{
int *sparse;
int *dense;
int n;
int capacity;
int maxValue;
public :
SSet( int maxV, int cap)
{
sparse = new int [maxV+1];
dense = new int [cap];
capacity = cap;
maxValue = maxV;
n = 0;
}
~SSet()
{
delete [] sparse;
delete [] dense;
}
int search( int x);
void insert( int x);
void deletion( int x);
void print();
void clear() { n = 0; }
SSet* intersection(SSet &s);
SSet *setUnion(SSet &s);
};
int SSet::search( int x)
{
if (x > maxValue)
return -1;
if (sparse[x] < n && dense[sparse[x]] == x)
return (sparse[x]);
return -1;
}
void SSet::insert( int x)
{
if (x > maxValue)
return ;
if (n >= capacity)
return ;
if (search(x) != -1)
return ;
dense[n] = x;
sparse[x] = n;
n++;
}
void SSet::deletion( int x)
{
if (search(x) == -1)
return ;
int temp = dense[n-1];
dense[sparse[x]] = temp;
sparse[temp] = sparse[x];
n--;
}
void SSet::print()
{
for ( int i=0; i<n; i++)
printf ( "%d " , dense[i]);
printf ( "\n" );
}
SSet* SSet::intersection(SSet &s)
{
int iCap = min(n, s.n);
int iMaxVal = max(s.maxValue, maxValue);
SSet *result = new SSet(iMaxVal, iCap);
if (n < s.n)
{
for ( int i = 0; i < n; i++)
if (s.search(dense[i]) != -1)
result->insert(dense[i]);
}
else
{
for ( int i = 0; i < s.n; i++)
if (search(s.dense[i]) != -1)
result->insert(s.dense[i]);
}
return result;
}
SSet* SSet::setUnion(SSet &s)
{
int uCap = s.n + n;
int uMaxVal = max(s.maxValue, maxValue);
SSet *result = new SSet(uMaxVal, uCap);
for ( int i = 0; i < n; i++)
result->insert(dense[i]);
for ( int i = 0; i < s.n; i++)
result->insert(s.dense[i]);
return result;
}
int main()
{
SSet s1(100, 5);
s1.insert(5);
s1.insert(3);
s1.insert(9);
s1.insert(10);
printf ( "The elements in set1 are\n" );
s1.print();
int index = s1.search(3);
if (index != -1)
printf ( "\n3 is found at index %d in set1\n" ,index);
else
printf ( "\n3 doesn't exists in set1\n" );
s1.deletion(9);
s1.print();
SSet s2(1000, 6);
s2.insert(4);
s2.insert(3);
s2.insert(7);
s2.insert(200);
printf ( "\nThe elements in set2 are\n" );
s2.print();
SSet *intersect = s2.intersection(s1);
printf ( "\nIntersection of set1 and set2\n" );
intersect->print();
SSet *unionset = s1.setUnion(s2);
printf ( "\nUnion of set1 and set2\n" );
unionset->print();
return 0;
}
|
Java
class SSet {
private int [] sparse;
private int [] dense;
private int capacity;
private int maxValue;
private int n;
public SSet( int maxV, int cap) {
sparse = new int [maxV + 1 ];
dense = new int [cap];
capacity = cap;
maxValue = maxV;
n = 0 ;
}
public int search( int x) {
if (x > maxValue) {
return - 1 ;
}
if (sparse[x] < n && dense[sparse[x]] == x) {
return sparse[x];
}
return - 1 ;
}
public void insert( int x) {
if (x > maxValue || n >= capacity || search(x) != - 1 ) {
return ;
}
dense[n] = x;
sparse[x] = n;
n++;
}
public void deletion( int x) {
int index = search(x);
if (index == - 1 ) {
return ;
}
int temp = dense[n - 1 ];
dense[index] = temp;
sparse[temp] = index;
n--;
}
public void printSet() {
for ( int i = 0 ; i < n; i++) {
System.out.print(dense[i] + " " );
}
System.out.println();
}
public SSet intersection(SSet s) {
int iCap = Math.min(n, s.n);
int iMaxVal = Math.max(maxValue, s.maxValue);
SSet result = new SSet(iMaxVal, iCap);
if (n < s.n) {
for ( int i = 0 ; i < n; i++) {
if (s.search(dense[i]) != - 1 ) {
result.insert(dense[i]);
}
}
} else {
for ( int i = 0 ; i < s.n; i++) {
if (search(s.dense[i]) != - 1 ) {
result.insert(s.dense[i]);
}
}
}
return result;
}
public SSet setUnion(SSet set2) {
int uCap = n + set2.n;
int uMaxVal = Math.max(maxValue, set2.maxValue);
SSet unionSet = new SSet(uMaxVal, uCap);
for ( int i = 0 ; i < n; i++) {
unionSet.insert(dense[i]);
}
for ( int i = 0 ; i < set2.n; i++) {
unionSet.insert(set2.dense[i]);
}
return unionSet;
}
}
public class Main {
public static void main(String[] args) {
SSet s1 = new SSet( 100 , 5 );
s1.insert( 5 );
s1.insert( 3 );
s1.insert( 9 );
s1.insert( 10 );
System.out.println( "The elements in set1 are:" );
s1.printSet();
int index = s1.search( 3 );
if (index != - 1 ) {
System.out.printf( "\n3 is found at index %d in set1" , index);
} else {
System.out.println( "\n3 doesn't exist in set1" );
}
s1.deletion( 9 );
s1.printSet();
SSet s2 = new SSet( 1000 , 6 );
s2.insert( 4 );
s2.insert( 3 );
s2.insert( 7 );
s2.insert( 200 );
System.out.println( "\nThe elements in set2 are:" );
s2.printSet();
SSet intersect = s2.intersection(s1);
System.out.println( "\nIntersection of set1 and set2" );
intersect.printSet();
SSet unionset = s1.setUnion(s2);
System.out.println( "\nUnion of set1 and set2" );
unionset.printSet();
}
}
|
Python3
class SSet:
def __init__( self , maxV, cap):
self .sparse = [ 0 ] * (maxV + 1 )
self .dense = [ 0 ] * cap
self .capacity = cap
self .maxValue = maxV
self .n = 0
def search( self , x):
if x > self .maxValue:
return - 1
if self .sparse[x] < self .n and self .dense[ self .sparse[x]] = = x:
return self .sparse[x]
return - 1
def insert( self , x):
if x > self .maxValue:
return
if self .n > = self .capacity:
return
if self .search(x) ! = - 1 :
return
self .dense[ self .n] = x
self .sparse[x] = self .n
self .n + = 1
def deletion( self , x):
if self .search(x) = = - 1 :
return
temp = self .dense[ self .n - 1 ]
self .dense[ self .sparse[x]] = temp
self .sparse[temp] = self .sparse[x]
self .n - = 1
def print_set( self ):
for i in range ( self .n):
print ( self .dense[i], end = ' ' )
print ()
def intersection( self , s):
iCap = min ( self .n, s.n)
iMaxVal = max (s.maxValue, self .maxValue)
result = SSet(iMaxVal, iCap)
if self .n < s.n:
for i in range ( self .n):
if s.search( self .dense[i]) ! = - 1 :
result.insert( self .dense[i])
else :
for i in range (s.n):
if self .search(s.dense[i]) ! = - 1 :
result.insert(s.dense[i])
return result
def setUnion( self , set2):
uCap = self .n + set2.n
uMaxVal = max ( self .maxValue, set2.maxValue)
union_set = SSet(uMaxVal, uCap)
for i in range ( self .n):
union_set.insert( self .dense[i])
for i in range (set2.n):
union_set.insert(set2.dense[i])
return union_set
if __name__ = = "__main__" :
s1 = SSet( 100 , 5 )
s1.insert( 5 )
s1.insert( 3 )
s1.insert( 9 )
s1.insert( 10 )
print ( "The elements in set1 are:" )
s1.print_set()
index = s1.search( 3 )
if index ! = - 1 :
print (f "\n3 is found at index {index} in set1" )
else :
print ( "\n3 doesn't exist in set1" )
s1.deletion( 9 )
s1.print_set()
s2 = SSet( 1000 , 6 )
s2.insert( 4 )
s2.insert( 3 )
s2.insert( 7 )
s2.insert( 200 )
print ( "\nThe elements in set2 are:" )
s2.print_set()
intersect = s2.intersection(s1)
print ( "\nIntersection of set1 and set2" )
intersect.print_set()
unionset = s1.setUnion(s2)
print ( "\nUnion of set1 and set2" )
unionset.print_set()
|
C#
using System;
public class SSet
{
private int [] sparse;
private int [] dense;
private int n;
private int capacity;
private int maxValue;
public SSet( int maxV, int cap)
{
sparse = new int [maxV + 1];
dense = new int [cap];
capacity = cap;
maxValue = maxV;
n = 0;
}
public int Search( int x)
{
if (x > maxValue)
return -1;
if (sparse[x] < n && dense[sparse[x]] == x)
return (sparse[x]);
return -1;
}
public void Insert( int x)
{
if (n == capacity)
return ;
if (x > maxValue)
return ;
int i = sparse[x];
if (i >= n || dense[i] != x)
{
dense[n] = x;
sparse[x] = n;
n++;
}
}
public void Deletion( int x)
{
if (x > maxValue)
return ;
int i = sparse[x];
if (i < n && dense[i] == x)
{
n--;
dense[i] = dense[n];
sparse[dense[n]] = i;
}
}
public void Print()
{
Console.Write( "Sparse Set: " );
for ( int i = 0; i < n; i++)
Console.Write(dense[i] + " " );
Console.WriteLine();
}
public void Clear()
{
n = 0;
}
public SSet Intersection(SSet s)
{
SSet result = new SSet(maxValue, capacity);
for ( int i = 0; i < n; i++)
{
if (s.Search(dense[i]) != -1)
result.Insert(dense[i]);
}
return result;
}
public SSet SetUnion(SSet s)
{
SSet result = new SSet(maxValue, capacity);
for ( int i = 0; i < n; i++)
result.Insert(dense[i]);
for ( int i = 0; i < s.n; i++)
result.Insert(s.dense[i]);
return result;
}
}
public class Program
{
public static void Main()
{
SSet s1 = new SSet(100, 5);
s1.Insert(5);
s1.Insert(3);
s1.Insert(9);
s1.Insert(10);
Console.WriteLine( "The elements in set1 are" );
s1.Print();
int index = s1.Search(3);
if (index != -1)
Console.WriteLine( "\n3 is found at index {0} in set1" , index);
else
Console.WriteLine( "\n3 doesn't exists in set1" );
s1.Deletion(9);
s1.Print();
SSet s2 = new SSet(1000, 6);
s2.Insert(4);
s2.Insert(3);
s2.Insert(7);
s2.Insert(200);
Console.WriteLine( "\nThe elements in set2 are" );
s2.Print();
SSet intersect = s2.Intersection(s1);
Console.WriteLine( "\nIntersection of set1 and set2" );
intersect.Print();
SSet unionset = s1.SetUnion(s2);
Console.WriteLine( "\nUnion of set1 and set2" );
unionset.Print();
}
}
|
Javascript
class SSet {
constructor(maxV, cap) {
this .sparse = new Array(maxV+1).fill(0);
this .dense = new Array(cap).fill(0);
this .capacity = cap;
this .maxValue = maxV;
this .n = 0;
}
search(x) {
if (x > this .maxValue)
return -1;
if ( this .sparse[x] < this .n && this .dense[ this .sparse[x]] === x)
return this .sparse[x];
return -1;
}
insert(x) {
if (x > this .maxValue)
return ;
if ( this .n >= this .capacity)
return ;
if ( this .search(x) !== -1)
return ;
this .dense[ this .n] = x;
this .sparse[x] = this .n;
this .n++;
}
deletion(x) {
const index = this .search(x);
if (index === -1)
return ;
const temp = this .dense[ this .n-1];
this .dense[index] = temp;
this .sparse[temp] = index;
this .n--;
}
print() {
console.log( this .dense.slice(0, this .n).join( ' ' ));
}
clear() { this .n = 0; }
intersection(s) {
const iCap = Math.min( this .n, s.n);
const iMaxVal = Math.max(s.maxValue, this .maxValue);
const result = new SSet(iMaxVal, iCap);
if ( this .n < s.n) {
for (let i = 0; i < this .n; i++)
if (s.search( this .dense[i]) !== -1)
result.insert( this .dense[i]);
}
else {
for (let i = 0; i < s.n; i++)
if ( this .search(s.dense[i]) !== -1)
result.insert(s.dense[i]);
}
return result;
}
setUnion(s) {
const uCap = s.n + this .n;
const uMaxVal = Math.max(s.maxValue, this .maxValue);
const result = new SSet(uMaxVal, uCap);
for (let i = 0; i < this .n; i++)
result.insert( this .dense[i]);
for (let i = 0; i < s.n; i++)
result.insert(s.dense[i]);
return result;
}
}
const s1 = new SSet(100, 5);
s1.insert(5);
s1.insert(3);
s1.insert(9);
s1.insert(10);
console.log('The elements in set1 are ');
s1.print();
const index = s1.search(3);
// ' index ' variable stores the index of the number to
// be searched.
if (index !== -1) // 3 exists
console.log(`\n3 is found at index ${index} in set1`);
else // 3 doesn' t exist
console.log( '\n3 doesn\'t exists in set1' );
s1.deletion(9);
s1.print();
const s2 = new SSet(1000, 6);
s2.insert(4);
s2.insert(3);
s2.insert(7);
s2.insert(200);
console.log( '\nThe elements in set2 are' );
s2.print();
const intersect = s2.intersection(s1);
console.log( '\nIntersection of set1 and set2' );
intersect.print();
const unionset = s1.setUnion(s2);
console.log( "\nUnion of set1 and set2" );
unionset.print()
|
Output :
The elements in set1 are
5 3 9 10
3 is found at index 1 in set1
5 3 10
The elements in set2 are-
4 3 7 200
Intersection of set1 and set2
3
Union of set1 and set2
5 3 10 4 7 200
This is a C++ program that implements a sparse set and its operations. A sparse set is a data structure that is useful when the maximum value of the set is known, and the set contains only a small subset of the maximum value range. The sparse set uses two arrays, sparse and dense, to store the elements. The sparse array maps each element to its index in the dense array, and the dense array contains the actual elements of the set.
The program defines a class SSet to represent the sparse set. The constructor takes the maximum value of the set and the capacity of the dense array as input parameters. The search, insert, deletion, print, clear, intersection, and setUnion functions are defined for the class.
The search function takes an element as input and returns its index in the dense array if it is present, otherwise -1. The insert function takes an element as input and inserts it into the set if it is not already present and the set is not full. The deletion function takes an element as input and removes it from the set if it is present. The print function prints the elements of the set in the dense array. The clear function removes all elements from the set.
The intersection function takes another SSet object as input and returns a new SSet object that contains the intersection of the two sets. The setUnion function takes another SSet object as input and returns a new SSet object that contains the union of the two sets. The intersection and setUnion functions use the search and insert functions to perform the set operations.
Overall, this program provides a simple and efficient implementation of a sparse set and its operations.
Additional Operations:
The following operations are also efficiently implemented using sparse set. It outperforms all the solutions discussed here and bit vector based solution, under the assumptions that range and maximum number of elements are known.
union():
1) Create an empty sparse set, say result.
2) Traverse the first set and insert all elements of it in result.
3) Traverse the second set and insert all elements of it in result (Note that sparse set doesn’t insert an entry if it is already present)
4) Return result.
intersection():
1) Create an empty sparse set, say result.
2) Let the smaller of two given sets be first set and larger be second.
3) Consider the smaller set and search every element of it in second. If element is found, add it to result.
4) Return result.
A common use of this data structure is with register allocation algorithms in compilers, which have a fixed universe(the number of registers in the machine) and are updated and cleared frequently (just like- Q queries) during a single processing run.
References:
http://research.switch.com/sparse
http://codingplayground.blogspot.in/2009/03/sparse-sets-with-o1-insert-delete.html
Please Login to comment...