Median of Stream of Running Integers using STL
Last Updated: 02 Jun, 2023
Given a stream of integers, find the median of all elements read so far after each new integer arrives, from the first integer to the last. This is also called the median of running integers. The data stream can be any source of data: for example, a file, an array of integers, or an input stream.
What is Median?
The median is the element of a data set that separates the higher half of the sample from the lower half. When the input size is odd, the median is the middle element of the sorted data. When the input size is even, it is the average of the two middle elements of the sorted data.
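As a minimal sketch of this definition (the function name medianOf is hypothetical and not part of the article's code), the median of a small non-empty collection can be computed by sorting a copy and inspecting the middle:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a collection, following the definition above.
// Assumption: values is non-empty; an empty input is not handled.
double medianOf(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) {
        // Odd size: the single middle element.
        return values[n / 2];
    }
    // Even size: average of the two middle elements.
    return (static_cast<double>(values[n / 2 - 1]) + values[n / 2]) / 2.0;
}
```

For instance, medianOf({1, 2, 3, 4}) averages the two middle elements 2 and 3.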
Examples:
Input: 5, 10, 15
Output: 5, 7.5, 10
Explanation: Given the input stream as an array of integers [5, 10, 15]. Read the integers one by one and print the median after each one. After reading the first element 5, the median is 5. After reading 10, the median is 7.5. After reading 15, the median is 10.
Input: 1, 2, 3, 4
Output: 1, 1.5, 2, 2.5
Explanation: Given the input stream as an array of integers [1, 2, 3, 4]. Read the integers one by one and print the median after each one. After reading the first element 1, the median is 1. After reading 2, the median is 1.5. After reading 3, the median is 2. After reading 4, the median is 2.5.
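The examples above can be reproduced with a naive baseline that keeps every element seen so far in a sorted vector. This is only a reference sketch for checking outputs, not the heap-based method this article describes, and the name naiveRunningMedians is hypothetical. Each insertion shifts elements, so it costs O(n) per element.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Naive baseline: keep all elements read so far in sorted order and
// read the median from the middle after each insertion.
std::vector<double> naiveRunningMedians(const std::vector<int>& stream) {
    std::vector<int> sorted;
    std::vector<double> medians;
    for (int x : stream) {
        // Insert x at the position that keeps the vector sorted.
        sorted.insert(std::lower_bound(sorted.begin(), sorted.end(), x), x);
        const std::size_t n = sorted.size();
        if (n % 2 == 1) {
            medians.push_back(sorted[n / 2]);
        } else {
            medians.push_back(
                (static_cast<double>(sorted[n / 2 - 1]) + sorted[n / 2]) / 2.0);
        }
    }
    return medians;
}
```

A baseline like this is handy for testing a faster implementation against small random inputs.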
Approach: The idea is to use a max heap to store the elements of the lower half and a min heap to store the elements of the higher half. Both heaps can be implemented using priority_queue in C++ STL. Below is the step-by-step algorithm to solve this problem.
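The following sketch illustrates how std::priority_queue behaves as a max heap by default and as a min heap when given std::greater as its comparator. The aliases MaxHeap and MinHeap and the helper heapTops are hypothetical names used for illustration only.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Default priority_queue: top() is the largest element.
using MaxHeap = std::priority_queue<int>;
// With std::greater as comparator: top() is the smallest element.
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Pushes every value into both heaps and returns {max-heap top, min-heap top}.
// Assumption: values is non-empty, since top() on an empty heap is undefined.
std::pair<int, int> heapTops(const std::vector<int>& values) {
    MaxHeap maxHeap;
    MinHeap minHeap;
    for (int v : values) {
        maxHeap.push(v);
        minHeap.push(v);
    }
    return {maxHeap.top(), minHeap.top()};
}
```

In the running-median setting the max heap would hold only the lower half, so its top is the largest of the small elements, and the min heap's top is the smallest of the large elements.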
Algorithm:
- Create two heaps: a max heap that holds the elements of the lower half and a min heap that holds the elements of the higher half at any point in time.
- Initialize the median. The implementations below use the first element of the stream, which is pushed into the max heap.
- For every newly read element, insert it into either the max heap or the min heap and calculate the median based on the following conditions:
  - If the max heap is larger than the min heap: when the element is less than the previous median, pop the top element from the max heap, insert it into the min heap, and insert the new element into the max heap. Otherwise, insert the new element into the min heap. In both cases the heaps end up the same size, so the new median is the average of the top elements of the max heap and the min heap.
  - If the max heap is smaller than the min heap: when the element is greater than the previous median, pop the top element from the min heap, insert it into the max heap, and insert the new element into the min heap. Otherwise, insert the new element into the max heap. The new median is again the average of the top elements of both heaps.
  - If both heaps are the same size: when the current element is less than the previous median, insert it into the max heap, and the new median is the top element of the max heap. Otherwise (the element is greater than or equal to the previous median), insert it into the min heap, and the new median is the top element of the min heap.
Below is the implementation of the above approach.
C++
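#include <functional>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Prints the running median after each element of arr[0..n-1] is read.
void printMedians(double arr[], int n)
{
    // Nothing to print for an empty stream.
    if (n <= 0)
        return;

    // Max heap holding the smaller half of the elements.
    priority_queue<double> s;

    // Min heap holding the greater half of the elements.
    priority_queue<double, vector<double>, greater<double> > g;

    double med = arr[0];
    s.push(arr[0]);
    cout << med << endl;

    for (int i = 1; i < n; i++)
    {
        double x = arr[i];

        // Case 1: the max heap has more elements than the min heap.
        if (s.size() > g.size())
        {
            if (x < med)
            {
                g.push(s.top());
                s.pop();
                s.push(x);
            }
            else
                g.push(x);

            med = (s.top() + g.top()) / 2.0;
        }

        // Case 2: both heaps have the same number of elements.
        else if (s.size() == g.size())
        {
            if (x < med)
            {
                s.push(x);
                med = (double)s.top();
            }
            else
            {
                g.push(x);
                med = (double)g.top();
            }
        }

        // Case 3: the min heap has more elements than the max heap.
        else
        {
            if (x > med)
            {
                s.push(g.top());
                g.pop();
                g.push(x);
            }
            else
                s.push(x);

            med = (s.top() + g.top()) / 2.0;
        }

        cout << med << endl;
    }
}

int main()
{
    double arr[] = {5, 15, 10, 20, 3};
    int n = sizeof(arr) / sizeof(arr[0]);
    printMedians(arr, n);
    return 0;
}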
Java
import java.util.Collections;
import java.util.PriorityQueue;

public class MedianMaintain
{
    // Prints the running median after each element of a is read.
    public static void printMedian(int[] a)
    {
        // Nothing to print for an empty stream.
        if (a.length == 0)
            return;

        double med = a[0];

        // Max heap holding the smaller half of the elements.
        PriorityQueue<Integer> smaller =
            new PriorityQueue<>(Collections.reverseOrder());

        // Min heap holding the greater half of the elements.
        PriorityQueue<Integer> greater = new PriorityQueue<>();

        smaller.add(a[0]);
        System.out.println(med);

        for (int i = 1; i < a.length; i++)
        {
            int x = a[i];

            // Case 1: the max heap has more elements than the min heap.
            if (smaller.size() > greater.size())
            {
                if (x < med)
                {
                    greater.add(smaller.remove());
                    smaller.add(x);
                }
                else
                    greater.add(x);

                med = (double)(smaller.peek() + greater.peek()) / 2;
            }

            // Case 2: both heaps have the same number of elements.
            else if (smaller.size() == greater.size())
            {
                if (x < med)
                {
                    smaller.add(x);
                    med = (double)smaller.peek();
                }
                else
                {
                    greater.add(x);
                    med = (double)greater.peek();
                }
            }

            // Case 3: the min heap has more elements than the max heap.
            else
            {
                if (x > med)
                {
                    smaller.add(greater.remove());
                    greater.add(x);
                }
                else
                    smaller.add(x);

                med = (double)(smaller.peek() + greater.peek()) / 2;
            }

            System.out.println(med);
        }
    }

    public static void main(String[] args)
    {
        int[] arr = new int[]{5, 15, 10, 20, 3};
        printMedian(arr);
    }
}
Python3
from heapq import heappush, heappop


# heapq only provides a min heap, so the lower half is stored as negated
# values: -s[0] is then the largest element of the lower half.
def printMedians(arr, n):
    # Nothing to print for an empty stream.
    if n <= 0:
        return

    s = []  # max heap (negated values) holding the smaller half
    g = []  # min heap holding the greater half

    med = arr[0]
    heappush(s, -arr[0])
    print(med)

    for i in range(1, n):
        x = arr[i]

        # Case 1: the max heap has more elements than the min heap.
        if len(s) > len(g):
            if x < med:
                heappush(g, -heappop(s))
                heappush(s, -x)
            else:
                heappush(g, x)
            med = (-s[0] + g[0]) / 2

        # Case 2: both heaps have the same number of elements.
        elif len(s) == len(g):
            if x < med:
                heappush(s, -x)
                med = -s[0]
            else:
                heappush(g, x)
                med = g[0]

        # Case 3: the min heap has more elements than the max heap.
        else:
            if x > med:
                heappush(s, -heappop(g))
                heappush(g, x)
            else:
                heappush(s, -x)
            med = (-s[0] + g[0]) / 2

        print(med)


arr = [5, 15, 10, 20, 3]
printMedians(arr, len(arr))
C#
using System;
using System.Collections.Generic;

public class MedianMaintain
{
    // Prints the running median after each element of a is read.
    // Sorted lists stand in for heaps here: smaller is kept in descending
    // order and greater in ascending order, so index 0 acts as the top.
    public static void printMedian(int[] a)
    {
        // Nothing to print for an empty stream.
        if (a.Length == 0)
            return;

        double med = a[0];
        List<int> smaller = new List<int>();
        List<int> greater = new List<int>();
        smaller.Add(a[0]);
        Console.WriteLine(med);

        for (int i = 1; i < a.Length; i++)
        {
            int x = a[i];

            // Case 1: smaller has more elements than greater.
            if (smaller.Count > greater.Count)
            {
                if (x < med)
                {
                    smaller.Sort();
                    smaller.Reverse();
                    greater.Add(smaller[0]);
                    smaller.RemoveAt(0);
                    smaller.Add(x);
                }
                else
                    greater.Add(x);

                smaller.Sort();
                smaller.Reverse();
                greater.Sort();
                med = (double)(smaller[0] + greater[0]) / 2;
            }

            // Case 2: both lists have the same number of elements.
            else if (smaller.Count == greater.Count)
            {
                if (x < med)
                {
                    smaller.Add(x);
                    smaller.Sort();
                    smaller.Reverse();
                    med = (double)smaller[0];
                }
                else
                {
                    greater.Add(x);
                    greater.Sort();
                    med = (double)greater[0];
                }
            }

            // Case 3: greater has more elements than smaller.
            else
            {
                if (x > med)
                {
                    greater.Sort();
                    smaller.Add(greater[0]);
                    greater.RemoveAt(0);
                    greater.Add(x);
                }
                else
                    smaller.Add(x);

                smaller.Sort();
                smaller.Reverse();
                // Re-sort greater too: the appended x may be smaller than
                // the elements already in the list.
                greater.Sort();
                med = (double)(smaller[0] + greater[0]) / 2;
            }

            Console.WriteLine(med);
        }
    }

    public static void Main(String[] args)
    {
        int[] arr = new int[]{5, 15, 10, 20, 3};
        printMedian(arr);
    }
}
Javascript
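<script>
// Prints the running median after each element of a is read.
// Sorted arrays stand in for heaps here: smaller is kept in descending
// order and greater in ascending order, so index 0 acts as the top.
function printMedian(a)
{
    // Nothing to print for an empty stream.
    if (a.length === 0)
        return;

    let med = a[0];
    let smaller = [];
    let greater = [];
    smaller.push(a[0]);
    document.write(med + "<br>");

    for (let i = 1; i < a.length; i++)
    {
        let x = a[i];

        // Case 1: smaller has more elements than greater.
        if (smaller.length > greater.length)
        {
            if (x < med)
            {
                smaller.sort(function (p, q) { return q - p; });
                greater.push(smaller.shift());
                smaller.push(x);
            }
            else
            {
                greater.push(x);
            }
            smaller.sort(function (p, q) { return q - p; });
            greater.sort(function (p, q) { return p - q; });
            med = (smaller[0] + greater[0]) / 2;
        }

        // Case 2: both arrays have the same number of elements.
        else if (smaller.length === greater.length)
        {
            if (x < med)
            {
                smaller.push(x);
                smaller.sort(function (p, q) { return q - p; });
                med = smaller[0];
            }
            else
            {
                greater.push(x);
                greater.sort(function (p, q) { return p - q; });
                med = greater[0];
            }
        }

        // Case 3: greater has more elements than smaller.
        else
        {
            if (x > med)
            {
                greater.sort(function (p, q) { return p - q; });
                smaller.push(greater.shift());
                greater.push(x);
            }
            else
            {
                smaller.push(x);
            }
            smaller.sort(function (p, q) { return q - p; });
            // Re-sort greater too: the appended x may be smaller than
            // the elements already in the array.
            greater.sort(function (p, q) { return p - q; });
            med = (smaller[0] + greater[0]) / 2;
        }

        document.write(med + "<br>");
    }
}

let arr = [5, 15, 10, 20, 3];
printMedian(arr);
</script>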
Complexity Analysis:
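- Time Complexity: O(n log n).
Each insertion into or removal from a heap takes O(log n) time, so processing n elements takes O(n log n). This bound applies to the heap-based versions; the C# and JavaScript versions keep each half in a list and re-sort it at every step, so they do more work per element.
- Auxiliary Space: O(n).
The two heaps together store all n elements read so far.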
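Since the input is a stream, it can be convenient to wrap the two heaps in a small class that accepts one value at a time. The sketch below is a variant rather than the listing above: it inserts first and then rebalances, instead of branching on the three size cases, while keeping the same O(log n) cost per element. The class name RunningMedian and its methods are hypothetical.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Variant of the two-heap approach: insert, then rebalance so that the
// max heap holds either as many elements as the min heap or one more.
class RunningMedian {
public:
    void add(int x) {
        // Route x to the half it belongs to.
        if (lower_.empty() || x <= lower_.top()) {
            lower_.push(x);
        } else {
            upper_.push(x);
        }
        // Restore the size invariant by moving one top element across.
        if (lower_.size() > upper_.size() + 1) {
            upper_.push(lower_.top());
            lower_.pop();
        } else if (upper_.size() > lower_.size()) {
            lower_.push(upper_.top());
            upper_.pop();
        }
    }

    // Assumption: add() has been called at least once.
    double median() const {
        if (lower_.size() > upper_.size()) {
            return lower_.top();
        }
        return (static_cast<double>(lower_.top()) + upper_.top()) / 2.0;
    }

private:
    std::priority_queue<int> lower_;  // max heap: lower half
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper_;  // min heap: higher half
};
```

A caller would invoke add() for each value read from the stream and median() whenever the current median is needed.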