
Python – Group dates in K ranges

Last Updated : 01 May, 2023

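Given a list of dates, group them into successive ranges of K days, measured from the earliest date in the list. Each date is assigned to a range by its distance from that earliest date: dates less than K days after it fall in group 0, dates from K up to (but not including) 2K days after it fall in group 1, and so on.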

Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 10

Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0)]), (1, [datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]

Explanation : 27 Dec to 4 Jan fall in the same group because each of these dates is less than 10 days after the earliest date (27 Dec). Each later group likewise covers the next 10-day range.

Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 14

Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0)]), (1, [datetime.datetime(2020, 1, 10, 0, 0), datetime.datetime(2020, 1, 20, 0, 0)])]

Explanation : 27 Dec to 7 Jan fall in the same group because each of these dates is less than 14 days after the earliest date (27 Dec). Each later group likewise covers the next 14-day range.
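The grouping rule in these examples can be read as integer division of each date's day offset from the earliest date by K. The snippet below is a minimal sketch under that reading; the helper name `bucket_index` is hypothetical and only used for illustration.

```python
from datetime import datetime


def bucket_index(date, start, k):
    """Return the index of the k-day range (counted from start) containing date."""
    return (date - start).days // k


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
start = min(dates)

# Day offsets of the sorted dates from 27 Dec 2019
offsets = [(d - start).days for d in sorted(dates)]
# Range index of each sorted date for K = 10
indices = [bucket_index(d, start, 10) for d in sorted(dates)]

print(offsets)  # [0, 3, 8, 11, 14, 24]
print(indices)  # [0, 0, 0, 1, 1, 2]
```

The indices match the group numbers in the first example output: three dates in group 0, two in group 1 and one in group 2.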

Method #1: Using groupby() + sort()

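Here we sort the dates and then group them with groupby(), using a key function that maps each date to the index of its K-day range. The sort is needed because groupby() only merges consecutive items that share a key.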

Python3




# Python3 code to demonstrate working of
# Group dates in K ranges
# Using groupby() + sort()
from itertools import groupby
from datetime import datetime
 
# initializing list
test_list = [datetime(2020, 1, 4),
             datetime(2019, 12, 30),
             datetime(2020, 1, 7),
             datetime(2019, 12, 27),
             datetime(2020, 1, 20),
             datetime(2020, 1, 10)]
              
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 7
 
# initializing start date
min_date = min(test_list)
 
# utility function: index of the K-day range a date falls into,
# counted from the earliest date
def group_util(date):
    return (date - min_date).days // K

# sorting before grouping, since groupby() only merges
# consecutive items that share a key
test_list.sort()

temp = []
# grouping by utility function to group by K days
for key, val in groupby(test_list, key=group_util):
    temp.append((key, list(val)))
 
# using strftime to convert to a user-friendly
# format
res = []
for sub in temp:
    intr = []
    for ele in sub[1]:
        intr.append(ele.strftime("%Y/%m/%d"))
    res.append((sub[0], intr))

# printing result
print("Grouped dates : " + str(res))


Output:

The original list is : [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2020, 1, 20, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]

Grouped dates : [(0, ['2019/12/27', '2019/12/30']), (1, ['2020/01/04', '2020/01/07']), (2, ['2020/01/10']), (3, ['2020/01/20'])]
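To see why the sort matters in this method, the sketch below runs groupby() on the same dates with and without sorting and collects only the keys it produces. The names `range_key`, `unsorted_keys` and `sorted_keys` are illustrative and do not appear in the listing above.

```python
from datetime import datetime
from itertools import groupby

dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
K = 7
start = min(dates)


def range_key(d):
    # Index of the K-day range, counted from the earliest date
    return (d - start).days // K


# groupby() starts a new group every time the key changes,
# so unsorted input produces repeated, fragmented keys.
unsorted_keys = [k for k, _ in groupby(dates, key=range_key)]
sorted_keys = [k for k, _ in groupby(sorted(dates), key=range_key)]

print(unsorted_keys)  # [1, 0, 1, 0, 3, 2]
print(sorted_keys)    # [0, 1, 2, 3]
```

With unsorted input every date ends up in its own run, while the sorted input yields one group per K-day range.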

Method #2: Using sort() and iteration

Unlike the groupby() method above, which uses fixed K-day ranges counted from the earliest date, this method starts a new range at the first date that falls more than K days after the current group's start date, so the two methods can group the same input differently.

Approach

1. Sort the list of dates in ascending order.
2. Initialize a dictionary that maps each group number to its list of dates.
3. Initialize variables to keep track of the current group number and the start date of the current group.
4. Iterate through the sorted list of dates, comparing the current date with the start date of the current group.
5. If the difference between the current date and the start date is less than or equal to K days, add the current date to the current group.
6. If the difference between the current date and the start date is greater than K days, create a new group with the current date as the start date and add the current date to the new group.
7. Return the groups as a list of (group number, dates) tuples.

Algorithm

1. Sort the given list of dates in ascending order.
2. Initialize an empty dictionary (a defaultdict of lists) to store the groups of dates.
3. For each date in the sorted list, calculate the number of days since the start date of the current group by subtracting the two datetime objects and reading the days attribute of the resulting timedelta.
4. If the number of days is greater than K, start a new group at this date. Otherwise, add the date to the current group.
5. Convert the dictionary into a list of tuples and return the result.
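The day count used in the steps above comes from subtracting two datetime objects, which produces a timedelta whose days attribute holds whole days only. A minimal sketch, using made-up timestamps that are not part of the article's data, shows what that means when the values carry a time of day:

```python
from datetime import datetime

# Hypothetical timestamps chosen only for illustration
start = datetime(2020, 1, 1, 18, 0)
later = datetime(2020, 1, 8, 6, 0)

delta = later - start  # a timedelta of 6 days and 12 hours
whole_days = delta.days
print(whole_days)  # 6

# Comparing calendar dates instead ignores the time of day
calendar_days = (later.date() - start.date()).days
print(calendar_days)  # 7
```

All of the article's inputs are at midnight, so both forms give the same result there.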

Python3




from datetime import datetime
from collections import defaultdict


def group_dates(dates, K):
    groups = defaultdict(list)
    group_num = 0
    start_date = None
    # iterate over a sorted copy so the caller's list is left unchanged
    for date in sorted(dates):
        if start_date is None:
            # the earliest date opens the first group
            start_date = date
        else:
            diff = (date - start_date).days
            # more than K days after the group's start: open a new group
            if diff > K:
                group_num += 1
                start_date = date
        groups[group_num].append(date)
    return list(groups.items())


dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))


Output

[(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), (1, [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]

Time complexity: O(n log n) – sorting the list of dates takes O(n log n) time, where n is the number of dates. The loop that iterates through the sorted list of dates takes O(n) time.

Auxiliary Space: O(n) – we store the groups of dates in a dictionary that can potentially contain n elements.
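For K = 7 this method returns three groups, while the groupby() method returns four for the same dates. The sketch below places the two rules side by side to show where they diverge; the names `fixed` and `anchored` are illustrative and not from the listings above.

```python
from datetime import datetime

dates = sorted([datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
                datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)])
K = 7

# Rule 1: fixed K-day ranges counted from the earliest date
fixed = {}
for d in dates:
    fixed.setdefault((d - dates[0]).days // K, []).append(d)

# Rule 2: a new range starts at the first date that is more than
# K days after the start of the current group
anchored = []
for d in dates:
    if anchored and (d - anchored[-1][0]).days <= K:
        anchored[-1].append(d)
    else:
        anchored.append([d])

print(len(fixed))     # 4
print(len(anchored))  # 3
```

The difference comes from 10 Jan: it is 14 days after 27 Dec, so the fixed rule puts it in range 2 on its own, but it is only 6 days after 4 Jan, so the second rule keeps it in the group that started on 4 Jan.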

Method #3: Using a while loop to iterate over the dates and create groups based on the K value

Approach:

  1. Sort the dates in ascending order.
  2. Initialize an empty list called “groups”.
  3. Set a variable called “current_group” to 0.
  4. Set a variable called “group_start_date” to the first date in the sorted list.
  5. Set a variable called “group_end_date” to None.
  6. While there are still dates left in the list:
     a. Get the next date in the list.
     b. If the difference between the current date and the group start date is less than or equal to K, set the group end date to the current date.
     c. Otherwise, append the current group (i.e., the dates between the group start date and the group end date) to the “groups” list, set the group start date to the current date, set the group end date to None, and increment the current group number.
  7. Append the final group to the “groups” list.
  8. Return the “groups” list.
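The steps above can be sketched with an explicit while loop as follows. This is one possible reading, not a definitive implementation: the function name `group_dates_while` is hypothetical, and the sketch collects each group's dates in a list (`current_dates`) instead of tracking a separate end date.

```python
from datetime import datetime


def group_dates_while(dates, k):
    """Group dates so that each group spans at most k days from its first date."""
    ordered = sorted(dates)
    groups = []
    if not ordered:
        return groups

    current_group = 0
    group_start_date = ordered[0]
    current_dates = []  # assumption: stands in for the start/end date pair

    i = 0
    while i < len(ordered):
        date = ordered[i]
        if (date - group_start_date).days <= k:
            # still within k days of the group's start
            current_dates.append(date)
        else:
            # close the current group and open a new one at this date
            groups.append((current_group, current_dates))
            current_group += 1
            group_start_date = date
            current_dates = [date]
        i += 1

    # append the final group
    groups.append((current_group, current_dates))
    return groups


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
result = group_dates_while(dates, 7)
print([(num, len(group)) for num, group in result])  # [(0, 2), (1, 3), (2, 1)]
```

The sketch returns an empty list for empty input and leaves the caller's list unsorted, since it works on a sorted copy.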

Python3




from collections import defaultdict
from datetime import datetime


def group_dates(dates, K):
    # group labels are stored as string keys: '0', '1', ...
    groups = defaultdict(list)
    dates.sort()

    group_num = 0
    start_date = None

    # a for loop visits the sorted dates in the same order as the
    # while loop described in the approach
    for date in dates:
        if start_date is None:
            start_date = date
        else:
            diff = (date - start_date).days
            # more than K days after the group's start: open a new group
            if diff > K:
                group_num += 1
                start_date = date

        groups[str(group_num)].append(date)

    # show the intermediate dictionary before converting it
    print(groups)

    return list(groups.items())

# input
dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]

K = 7

print(group_dates(dates, K))


Output

defaultdict(<class 'list'>, {'0': [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)], '1': [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)], '2': [datetime.datetime(2020, 1, 20, 0, 0)]})
[('0', [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), ('1', [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), ('2', [datetime.datetime(2020, 1, 20, 0, 0)])]

Time complexity: O(n log n), where n is the number of dates in the input list. Sorting dominates; the loop over the sorted dates takes O(n) time.
Auxiliary space: O(n), since the groups dictionary stores every date.


