Open In App

Median of Stream of Running Integers using STL

Last Updated : 02 Jun, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Given that integers are being read from a data stream. Find the median of all the elements read so far starting from the first integer till the last integer. This is also called the Median of Running Integers. The data stream can be any source of data, for example, a file, an array of integers, input stream etc.

What is Median?

Median can be defined as the element in the data set which separates the higher half of the data sample from the lower half. In other words, we can get the median element as, when the input size is odd, we take the middle element of sorted data. If the input size is even, we pick an average of middle two elements in the sorted stream.
 

Examples: 

Input: 5 10 15 
Output: 5, 7.5, 10 
Explanation: Given the input stream as an array of integers [5,10,15]. Read integers one by one and print the median correspondingly. So, after reading first element 5,median is 5. After reading 10,median is 7.5 After reading 15 ,median is 10.
Input: 1, 2, 3, 4 
Output: 1, 1.5, 2, 2.5 
Explanation: Given the input stream as an array of integers [1, 2, 3, 4]. Read integers one by one and print the median correspondingly. So, after reading first element 1,median is 1. After reading 2,median is 1.5 After reading 3 ,median is 2.After reading 4 ,median is 2.5. 
 

Recommended Practice

Approach: The idea is to use max heap and min heap to store the elements of higher half and lower half. Max heap and min heap can be implemented using priority_queue in C++ STL. Below is the step by step algorithm to solve this problem.
Algorithm: 

  1. Create two heaps. One max heap to maintain elements of lower half and one min heap to maintain elements of higher half at any point of time..
  2. Take initial value of median as 0.
  3. For every newly read element, insert it into either max heap or min-heap and calculate the median based on the following conditions: 
    • If the size of max heap is greater than the size of min-heap and the element is less than the previous median then pop the top element from max heap and insert into min-heap and insert the new element to max heap else insert the new element to min-heap. Calculate the new median as the average of top of elements of both max and min heap.
    • If the size of max heap is less than the size of min-heap and the element is greater than the previous median then pop the top element from min-heap and insert into the max heap and insert the new element to min heap else insert the new element to the max heap. Calculate the new median as the average of top of elements of both max and min heap.
    • If the size of both heaps is the same. Then check if the current is less than the previous median or not. If the current element is less than the previous median then insert it to the max heap and a new median will be equal to the top element of max heap. If the current element is greater than the previous median then insert it to min-heap and new median will be equal to the top element of min heap.

Below is the implementation of above approach. 

C++




// C++ program to find med in
// stream of running integers
#include<bits/stdc++.h>
using namespace std;
 
// function to calculate med of stream
void printMedians(double arr[], int n)
{
    // max heap to store the smaller half elements
    priority_queue<double> s;
 
    // min heap to store the greater half elements
    priority_queue<double,vector<double>,greater<double> > g;
 
    double med = arr[0];
    s.push(arr[0]);
 
    cout << med << endl;
 
    // reading elements of stream one by one
    /*  At any time we try to make heaps balanced and
        their sizes differ by at-most 1. If heaps are
        balanced,then we declare median as average of
        min_heap_right.top() and max_heap_left.top()
        If heaps are unbalanced,then median is defined
        as the top element of heap of larger size  */
    for (int i=1; i < n; i++)
    {
        double x = arr[i];
 
        // case1(left side heap has more elements)
        if (s.size() > g.size())
        {
            if (x < med)
            {
                g.push(s.top());
                s.pop();
                s.push(x);
            }
            else
                g.push(x);
 
            med = (s.top() + g.top())/2.0;
        }
 
        // case2(both heaps are balanced)
        else if (s.size()==g.size())
        {
            if (x < med)
            {
                s.push(x);
                med = (double)s.top();
            }
            else
            {
                g.push(x);
                med = (double)g.top();
            }
        }
 
        // case3(right side heap has more elements)
        else
        {
            if (x > med)
            {
                s.push(g.top());
                g.pop();
                g.push(x);
            }
            else
                s.push(x);
 
            med = (s.top() + g.top())/2.0;
        }
 
        cout << med << endl;
    }
}
 
// Driver program to test above functions
int main()
{
    // stream of integers
    double arr[] = {5, 15, 10, 20, 3};
    int n = sizeof(arr)/sizeof(arr[0]);
    printMedians(arr, n);
    return 0;
}


Java




// Java program to find med in
// stream of running integers
import java.util.Collections;
import java.util.PriorityQueue;
 
public class MedianMaintain
{
     
    // method to calculate med of stream
    public static void printMedian(int[] a)
    {
         
        double med = a[0];
         
        // max heap to store the smaller half elements
        PriorityQueue<Integer> smaller = new PriorityQueue<>
        (Collections.reverseOrder());
         
        // min-heap to store the greater half elements
        PriorityQueue<Integer> greater = new PriorityQueue<>();
         
        smaller.add(a[0]);
        System.out.println(med);
         
        // reading elements of stream one by one
        /* At any time we try to make heaps balanced and
            their sizes differ by at-most 1. If heaps are
            balanced,then we declare median as average of
            min_heap_right.top() and max_heap_left.top()
            If heaps are unbalanced,then median is defined
            as the top element of heap of larger size */
        for(int i = 1; i < a.length; i++)
        {
             
            int x = a[i];
             
            // case1(left side heap has more elements)
            if(smaller.size() > greater.size())
            {
                if(x < med)
                {
                    greater.add(smaller.remove());
                    smaller.add(x);
                }
                else
                    greater.add(x);
                med = (double)(smaller.peek() + greater.peek())/2;
            }
             
            // case2(both heaps are balanced)
            else if(smaller.size() == greater.size())
            {
                if(x < med)
                {
                    smaller.add(x);
                    med = (double)smaller.peek();
                }
                else
                {
                    greater.add(x);
                    med = (double)greater.peek();
                }
            }
             
            // case3(right side heap has more elements)
            else
            {
                if(x > med)
                {
                    smaller.add(greater.remove());
                    greater.add(x);
                }
                else
                    smaller.add(x);
                med = (double)(smaller.peek() + greater.peek())/2;
                 
            }
            System.out.println(med);
        }
    }
     
    // Driver code
    public static void main(String []args)
    {
         
        // stream of integers
        int[] arr = new int[]{5, 15, 10, 20, 3};
        printMedian(arr);
    }
}
 
// This code is contributed by Kaustav kumar Chanda.


Python3




# python3 program to find med in
# stream of running integers
from heapq import *
 
# function to calculate med of stream
def printMedians(arr, n):
    # max heap to store the smaller half elements
    s = []
    # min heap to store the greater half elements
    g = []
 
    heapify(s)
    heapify(g)
 
    med = arr[0]
    heappush(s, arr[0])
 
    print(med)
 
    # reading elements of stream one by one
    for i in range(1, n):
        x = arr[i]
 
        # case1(left side heap has more elements)
        if len(s) > len(g):
            if x < med:
                heappush(g, heappop(s))
                heappush(s, x)
            else:
                heappush(g, x)
            med = (nlargest(1, s)[0] + nsmallest(1, g)[0])/2
 
        # case2(both heaps are balanced)
        elif len(s) == len(g):
            if x < med:
                heappush(s, x)
                med = nlargest(1, s)[0]
            else:
                heappush(g, x)
                med = nsmallest(1, g)[0]
 
        # case3(right side heap has more elements)
        else:
            if x > med:
                heappush(s, heappop(g))
                heappush(g, x)
            else:
                heappush(s, x)
            med = (nlargest(1, s)[0] + nsmallest(1, g)[0])/2
 
        print(med)
 
# Driver program to test above functions
arr = [5, 15, 10, 20, 3]
printMedians(arr, len(arr))
 
# This code is contributed by cavi4762.


C#




// C# program to find med in
// stream of running integers
using System;
using System.Collections.Generic;
public class MedianMaintain
{
 
  // method to calculate med of stream
  public static void printMedian(int[] a)
  {   
    double med = a[0];
 
    // max heap to store the smaller half elements
    List<int> smaller = new List<int>();
 
    // min-heap to store the greater half elements
    List<int>  greater = new  List<int>();    
    smaller.Add(a[0]);
    Console.WriteLine(med);
 
    // reading elements of stream one by one
    /* At any time we try to make heaps balanced and
            their sizes differ by at-most 1. If heaps are
            balanced,then we declare median as average of
            min_heap_right.top() and max_heap_left.top()
            If heaps are unbalanced,then median is defined
            as the top element of heap of larger size */
    for(int i = 1; i < a.Length; i++)
    {
 
      int x = a[i];
 
      // case1(left side heap has more elements)
      if(smaller.Count > greater.Count)
      {
        if(x < med)
        {
          smaller.Sort();
          smaller.Reverse();
          greater.Add(smaller[0]);
          smaller.RemoveAt(0);
          smaller.Add(x);
        }
        else
          greater.Add(x);
        smaller.Sort();
        smaller.Reverse();
        greater.Sort();
        med = (double)(smaller[0] + greater[0])/2;
      }
 
      // case2(both heaps are balanced)
      else if(smaller.Count == greater.Count)
      {
        if(x < med)
        {
          smaller.Add(x);
          smaller.Sort();
          smaller.Reverse();
          med = (double)smaller[0];
        }
        else
        {
          greater.Add(x);
          greater.Sort();
          med = (double)greater[0];
        }
      }
 
      // case3(right side heap has more elements)
      else
      {
        if(x > med)
        {
          greater.Sort();
          smaller.Add(greater[0]);
          greater.RemoveAt(0);
          greater.Add(x);
        }
        else
          smaller.Add(x);
        smaller.Sort();
        smaller.Reverse();
        med = (double)(smaller[0] + greater[0])/2;
 
      }
      Console.WriteLine(med);
    }
  }
 
  // Driver code
  public static void Main(String []args)
  {
 
    // stream of integers
    int[] arr = new int[]{5, 15, 10, 20, 3};
    printMedian(arr);
  }
}
 
// This code is contributed by Rajput-Ji


Javascript




<script>
// Javascript program to find med in
// stream of running integers
 
// method to calculate med of stream
function printMedian(a)
{
    let med = a[0];
           
        // max heap to store the smaller half elements
        let smaller = [];
           
        // min-heap to store the greater half elements
        let greater = [];
           
        smaller.push(a[0]);
        document.write(med+"<br>");
        // reading elements of stream one by one
        /* At any time we try to make heaps balanced and
            their sizes differ by at-most 1. If heaps are
            balanced,then we declare median as average of
            min_heap_right.top() and max_heap_left.top()
            If heaps are unbalanced,then median is defined
            as the top element of heap of larger size */
        for(let i = 1; i < a.length; i++)
        {
               
            let x = a[i];
               
            // case1(left side heap has more elements)
            if(smaller.length > greater.length)
            {
                if(x < med)
                {
                    smaller.sort(function(a,b){return b-a;});
                     
                    greater.push(smaller.shift());
                    smaller.push(x);
                }
                else
                {    greater.push(x);}
                    smaller.sort(function(a,b){return b-a;});
                    greater.sort(function(a,b){return a-b;});
                    med = (smaller[0] + greater[0])/2;
                 
            }
               
            // case2(both heaps are balanced)
            else if(smaller.length == greater.length)
            {
                if(x < med)
                {
                    smaller.push(x);
                    smaller.sort(function(a,b){return b-a;});
         
                    med = smaller[0];
                }
                else
                {
                    greater.push(x);
                     
                    greater.sort(function(a,b){return a-b;});
                    med = greater[0];
                }
            }
               
            // case3(right side heap has more elements)
            else
            {
                if(x > med)
                {
                 
                    greater.sort(function(a,b){return a-b;});
                    smaller.push(greater.shift());
                    greater.push(x);
                }
                else
                {
                    smaller.push(x);}
                    smaller.sort(function(a,b){return b-a;});
         
                    med = (smaller[0] + greater[0])/2;
                   
            }
             
            document.write(med+"<br>");
             
        }
}
 
// Driver code
// stream of integers
let arr=[5, 15, 10, 20, 3];
printMedian(arr);
 
// This code is contributed by avanitrachhadiya2155
</script>


Output: 

5
10
10
12.5
10

 

Complexity Analysis: 

  • Time Complexity: O(n Log n). 
    Time Complexity to insert element in min heap is log n. So to insert n element is O( n log n).
  • Auxiliary Space : O(n). 
    The Space required to store the elements in Heap is O(n).

 



Previous Article
Next Article

Similar Reads

Median of Stream of Running Integers using STL | Set 2
Given an array arr[] of size N representing integers required to be read as a data stream, the task is to calculate and print the median after reading every integer. Examples: Input: arr[] = { 5, 10, 15 } Output: 5 7.5 10 Explanation: After reading arr[0] from the data stream, the median is 5. After reading arr[1] from the data stream, the median i
6 min read
Mode in a stream of integers (running integers)
Given that integers are being read from a data stream. Find the mode of all the elements read so far starting from the first integer till the last integer. Mode is defined as the element which occurs the maximum time. If two or more elements have the same maximum frequency, then take the one with the last occurrence. Examples: Input: stream[] = {2,
6 min read
Median Of Running Stream of Numbers - (using Set)
Given that integers are being read from a data stream. Find the median of all the elements read so far starting from the first integer until the last integer. This is also called Median of Running Integers. The given link already contains solution of this problem using Priority Queue. However, the following solution uses the same concept but the im
9 min read
Find Median from Running Data Stream
Given that integers are read from a data stream. Find the median of elements read so far in an efficient way. There are two cases for median on the basis of data set size. If the data set has an odd number then the middle one will be consider as median.If the data set has an even number then there is no distinct middle value and the median will be
15 min read
Expression for mean and variance in a running stream
Let we have a running stream of numbers as x1,x2,x3,...,xn. The formula for calculating mean and variance at any given point is given as : Mean = E(x) = u = 1/n ∑i=1n xiStandard Deviation = s = 1/n ∑i=1n (xi - u) 2Variance = s2 However, it would be a very slow approach if we calculate these expressions by looping through all numbers each time a new
2 min read
Find Median for each Array element by excluding the index at which Median is calculated
Given an array A[] of N integers where N is even, . the task is to generate an array of medians where the median of the array is calculated by taking the median of array A[] excluding A[i]-th element. Examples: Input N = 4, A = [2, 4, 4, 3]Output: [4, 3, 3, 4]Explanation: Median after removing A[0]: New sorted array will be [3, 4, 4]. Median = 4. M
10 min read
Finding Median of unsorted Array in linear time using C++ STL
Given an unsorted array arr[] having N elements, the task is to find out the median of the array in linear time complexity. Examples: Input: N = 5, arr[] = {4, 1, 2, 6, 5} Output: 4 Explanation: Since N = 5, which is odd, therefore the median is the 3rd element in the sorted array. The 3rd element in the sorted arr[] is 4. Hence the median is 4. In
3 min read
Median after K additional integers
Given an array of n integers. We are allowed to add k additional integer in the array and then find the median of the resultant array. We can choose any k values to be added. Constraints: k &lt; n n + k is always odd. Examples : Input : arr[] = { 4, 7 } k = 1 Output : 7 Explanation : One of the possible solutions is to add 8 making the array [4, 7,
5 min read
Sort a stream of integers
Given an array, arr[] of size N whose elements from left to right, must be read as an incoming stream of integers, the task is to sort the stream of integers and print accordingly. Examples: Input: arr[] = {2, 4, 1, 7, 3}Output: 1 2 3 4 7Explanation: First element in the stream: 2 ? Sorted list: {2}Second element in the stream: 4 ? Sorted list: {2,
13 min read
Find maximum XOR of given integer in a stream of integers
You are given a number of queries Q and each query will be of the following types: Query 1 : add(x) This means add x into your data structure.Query 2 : maxXOR(y) This means print the maximum possible XOR of y with all the elements already stored in the data structure. 1 &lt;= x, y &lt;= 10^9 1 &lt;= 10^5 &lt;= Q The data structure begins with only
10 min read
three90RightbarBannerImg