Open In App

Python – Join Tuples if similar initial element

Last Updated : 01 Jun, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Sometimes, while working with Python tuples, we can have a problem in which we need to perform concatenation of records from the similarity of initial element. This problem can have applications in data domains such as Data Science. Let’s discuss certain ways in which this task can be performed.

Input  : test_list = [(5, 6), (5, 7), (5, 8), (6, 10), (7, 13)] 
Output : [(5, 6, 7, 8), (6, 10), (7, 13)] 
Input  : test_list = [(5, 6), (6, 7), (6, 8), (6, 10), (7, 13)] 
Output : [(5, 6), (6, 7, 8, 10), (7, 13)]

Method #1 : Using loop 

This is brute way in which this task can be done. In this, we create new tuple, if we find no occurrence of similar tuple values previously. Slicing is used to add rest of elements to created tuple. 

Python3




# Python3 code to demonstrate working of
# Join Tuples if similar initial element
# Using loop
 
# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using loop
res = []
for sub in test_list:
    if res and res[-1][0] == sub[0]:
        res[-1].extend(sub[1:])
    else:
        res.append([ele for ele in sub])
res = list(map(tuple, res))
 
# printing result
print("The extracted elements : " + str(res))


Output : 

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

Time complexity: O(n) where n is the number of tuples in the list.
Auxiliary space: O(n) where n is the number of tuples in the list.

Method #2 : Using defaultdict() + loop 

The combination of above functions can be used to solve this problem. The advantages it offers from above method are, that it reduces one check of initializing new key, and also works well even if similar elements are not consecutive. 

Python3




# Python3 code to demonstrate working of
# Join Tuples if similar initial element
# Using defaultdict() + loop
from collections import defaultdict
 
# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using defaultdict() + loop
mapp = defaultdict(list)
for key, val in test_list:
    mapp[key].append(val)
res = [(key, *val) for key, val in mapp.items()]
 
# printing result
print("The extracted elements : " + str(res))


Output : 

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #3: Using for loop

Python3




# Python3 code to demonstrate working of
# Join Tuples if similar initial element
# Using loop
 
# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using loop
res = []
x = []
for i in test_list:
    if i[0] not in x:
        x.append(i[0])
for i in x:
    p = []
    p.append(i)
    for j in test_list:
        if i == j[0]:
            p.append(j[1])
    res.append(p)
 
# printing result
print("The extracted elements : " + str(res))


Output

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [[5, 6, 7], [6, 8, 10], [7, 13]]

Time complexity: O(n^2), where n is the length of the input list test_list. 
Auxiliary space: O(n), where n is the length of the input list test_list.

Method #4: using dictionary and list comprehension

step-by-step algorithm for the given approach

  1. Initialize an empty dictionary temp_dict to store the unique initial elements as keys and a list of corresponding second elements as values.
  2. Iterate through each tuple x in the given list test_list.
  3. If the initial element of x already exists as a key in temp_dict, then append the second element of x to the corresponding value list.
    If the initial element of x does not exist as a key in temp_dict, then add the initial element as a new key and a list containing the second element as its value.
  4. After all tuples are processed, create a new list res by iterating through the key-value pairs of temp_dict, and joining each key with its corresponding list of values.
  5. Return the final list res as the result.

Python3




# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using dictionary and list comprehension
temp_dict = {}
for x in test_list:
    temp_dict[x[0]] = temp_dict.get(x[0], []) + list(x[1:])
 
res = [(k,) + tuple(v) for k, v in temp_dict.items()]
 
# printing result
print("The extracted elements : " + str(res))


Output

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

Time Complexity:

The time complexity of the above algorithm is O(n), where n is the length of test_list. It requires one traversal of test_list and temp_dict.

Auxiliary Space:

The auxiliary space complexity of the above algorithm is O(n), where n is the length of test_list. It requires temp_dict to store the values extracted from test_list. The size of temp_dict will be proportional to the number of unique keys in test_list.

Method #5: Using itertools.groupby()

Step-by-step approach:

  • Import groupby from itertools.
  • Loop through the groups in test_list generated by groupby function. Group the tuples based on their first element (key=lambda x: x[0]).
  • Extract the values of each group and store them in a list (values).
  • Append a new tuple to the result list res that contains the key and all the values.
  • Print the result.

Below is the implementation of the above approach:

Python3




from itertools import groupby
 
# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using itertools.groupby()
res = []
for k, g in groupby(test_list, key=lambda x: x[0]):
    values = [v for _, v in g]
    res.append((k, *values))
 
# printing result
print("The extracted elements : " + str(res))


Output

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

Time complexity: O(n log n), where n is the length of test_list because we are sorting the input list by the first element before grouping.
Auxiliary space: O(n), where n is the length of test_list because we are creating a new list to store the result.

Method #6: Using pandas

Import the pandas library.
Create a DataFrame from the list test_list, with column names “A” and “B” for the two elements in each tuple.
Group the DataFrame by column “A” and aggregate column “B” as a list for each group.
Reset the index of the resulting DataFrame.
Convert the DataFrame to a list of tuples.

Python3




# Define the list test_list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# Import the pandas library
import pandas as pd
 
# Create a DataFrame from the list test_list
df = pd.DataFrame(test_list, columns=["A", "B"])
 
# Group the DataFrame by column A and aggregate column B as a list for each group
grouped = df.groupby("A")["B"].apply(list)
 
# Convert the resulting Series to a list of tuples
res = [tuple([k] + v) for k, v in grouped.items()]
 
# Print the result
print("The extracted elements : " + str(res))


OUTPUT : 
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

Time complexity: O(n log n), where n is the length of test_list. 
Auxiliary space: O(n), where n is the length of test_list. 

Method #7 : Using recursion

Step-by-step approach:

Define a recursive function that takes the input list and an index as parameters.
Inside the function, check if the index is equal to the length of the list minus 1. If it is, return the list.
Otherwise, compare the initial element of the tuple at the current index with the initial element of the tuple at the next index.
If they are the same, concatenate the tuples and recursively call the function with the updated list and index.
If they are different, recursively call the function with the original list and the next index.
Call the recursive function with the input list and index 0.
Print the extracted elements.

Python3




# Python3 code to demonstrate working of
# Join Tuples if similar initial element
# Using recursion
 
def join_tuples(test_list, index):
    if index == len(test_list) - 1:
        return test_list
    elif test_list[index][0] == test_list[index + 1][0]:
        test_list[index] += test_list[index + 1][1:]
        test_list.pop(index + 1)
        return join_tuples(test_list, index)
    else:
        return join_tuples(test_list, index + 1)
 
# initializing list
test_list = [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Join Tuples if similar initial element
# Using recursion
res = join_tuples(test_list, 0)
 
# printing result
print("The extracted elements : " + str(res))


Output

The original list is : [(5, 6), (5, 7), (6, 8), (6, 10), (7, 13)]
The extracted elements : [(5, 6, 7), (6, 8, 10), (7, 13)]

The time complexity of this method is O(n^2) in the worst case, where n is the length of the input list. 

The auxiliary space complexity is O(1) since the modifications are done in-place.



Previous Article
Next Article

Similar Reads

Python | Find the tuples containing the given element from a list of tuples
Given a list of tuples, the task is to find all those tuples containing the given element, say n. Examples: Input: n = 11, list = [(11, 22), (33, 55), (55, 77), (11, 44)] Output: [(11, 22), (11, 44)] Input: n = 3, list = [(14, 3),(23, 41),(33, 62),(1, 3),(3, 3)] Output: [(14, 3), (1, 3), (3, 3)] There are multiple ways we can find the tuples contai
6 min read
Python | Pandas str.join() to join string/list elements with passed delimiter
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Pandas str.join() method is used to join all elements in list present in a series with passed delimiter. Since strings are also array of
2 min read
Python Pandas - Difference between INNER JOIN and LEFT SEMI JOIN
In this article, we see the difference between INNER JOIN and LEFT SEMI JOIN. Inner Join An inner join requires two data set columns to be the same to fetch the common row data values or data from the data table. In simple words, and returns a data frame or values with only those rows in the data frame that have common characteristics and behavior
3 min read
Python - Join Tuples to Integers in Tuple List
Sometimes, while working with Python records, we can have a problem in which we need to concatenate all the elements, in order, to convert elements in tuples in List to integer. This kind of problem can have applications in many domains such as day-day and competitive programming. Let's discuss certain ways in which this task can be performed. Inpu
5 min read
Python | Tuples with maximum key of similar values
Sometimes, while working with Python, we can have a problem in which we need to get all the records. This data can have similar values and we need to find maximum key-value pair. This kind of problem can occur while working with data. Let's discuss certain ways in which this task can be done. Method #1: Using max() + groupby() + itemgetter() + list
6 min read
Python - Find the Maximum of Similar Indices in two list of Tuples
Sometimes, while working with Python records, we can have a problem in which we need to perform cross maximization of list of tuples. This kind of application is popular in web development domain. Let’s discuss certain ways in which this task can be performed. Method #1 : Using list comprehension + zip() The combination of above functionalities can
7 min read
Python - Maximum of Similar Keys in Tuples
Sometimes, while working with Python tuples, we can have a problem in which we need to perform maximum of all values of the equal keys in Tuple list. This kind of application in useful in many domains such as web development and day-day programming. Let's discuss certain ways in which this task can be performed. Input : test_list = [(4, 8), (4, 2),
4 min read
PySpark Join Types - Join Two DataFrames
In this article, we are going to see how to join two dataframes in Pyspark using Python. Join is used to combine two or more dataframes based on columns in the dataframe. Syntax: dataframe1.join(dataframe2,dataframe1.column_name == dataframe2.column_name,"type") where, dataframe1 is the first dataframedataframe2 is the second dataframecolumn_name i
13 min read
Outer join Spark dataframe with non-identical join column
In PySpark, data frames are one of the most important data structures used for data processing and manipulation. The outer join operation in PySpark data frames is an important operation to combine data from multiple sources. However, sometimes the join column in the two DataFrames may not be identical, which may result in missing values. In this a
4 min read
Python | Remove duplicate tuples from list of tuples
Given a list of tuples, Write a Python program to remove all the duplicated tuples from the given list. Examples: Input : [(1, 2), (5, 7), (3, 6), (1, 2)] Output : [(1, 2), (5, 7), (3, 6)] Input : [('a', 'z'), ('a', 'x'), ('z', 'x'), ('a', 'x'), ('z', 'x')] Output : [('a', 'z'), ('a', 'x'), ('z', 'x')] Method #1 : List comprehension This is a naive
5 min read
Practice Tags :
three90RightbarBannerImg