Python – Sort String by Custom Integer Substrings
Last Updated :
28 Apr, 2023
Given a list of strings that each end in an integer substring, sort the strings by the position of that substring in a second, custom-ordered list.
Input : test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"], subord_list = ["11", "7", "4", "6"]
Output : ['Sleep at 11', 'Wake at 7', 'Good at 4', 'Work till 6']
Explanation : Strings are ordered by where their trailing integer appears in subord_list ("11" first, then "7", "4", "6").
Input : test_list = ["Good at 9", "Wake at 7", "Work till 6", "Sleep at 11"], subord_list = ["11", "7", "9", "6"]
Output : ['Sleep at 11', 'Wake at 7', 'Good at 9', 'Work till 6']
Explanation : Strings are ordered by where their trailing integer appears in subord_list ("11" first, then "7", "9", "6").
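To make the expected behaviour concrete, the two examples above can be checked with a minimal sketch. The helper name `sort_by_suborder` is hypothetical, and the sketch assumes every string ends with a space-separated token that appears in `subord_list`. It uses `list.index()` for the rank lookup, which is simple but does a linear search for every string.

```python
def sort_by_suborder(strings, order):
    """Sort strings by the position of their last token in `order` (hypothetical helper)."""
    # rsplit(" ", 1)[-1] takes the last space-separated token, e.g. "11"
    return sorted(strings, key=lambda s: order.index(s.rsplit(" ", 1)[-1]))


example_1 = sort_by_suborder(
    ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"],
    ["11", "7", "4", "6"],
)
example_2 = sort_by_suborder(
    ["Good at 9", "Wake at 7", "Work till 6", "Sleep at 11"],
    ["11", "7", "9", "6"],
)
print(example_1)
print(example_2)
```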
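Method #1 : Using sorted() + zip() + lambda + regex
The combination of the above functions can be used to solve this problem. A regular expression (re.search()) extracts the integer at the end of each string, sorted() orders the strings by that integer's position in the custom list, and zip() separates the sorted strings from their ranks to produce the end result.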
Python3
import re

test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"]
print("The original list : " + str(test_list))
subord_list = ["6", "7", "4", "11"]
# Map each substring to its position in subord_list
temp_dict = {val: key for key, val in enumerate(subord_list)}
# Pair each string with the rank of its trailing integer, then sort by rank
temp_list = sorted(
    [[ele, temp_dict[re.search(r"(\d+)$", ele).group()]] for ele in test_list],
    key=lambda x: x[1],
)
# zip(*temp_list) separates the strings from their ranks
res = list(list(zip(*temp_list))[0])
print("The sorted list : " + str(res))
Output
The original list : ['Good at 4', 'Wake at 7', 'Work till 6', 'Sleep at 11']
The sorted list : ['Work till 6', 'Wake at 7', 'Good at 4', 'Sleep at 11']
Time Complexity: O(n log n), where n is the number of elements in the list
Auxiliary Space: O(n), where n is the number of elements in the list
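The extraction and lookup steps of Method #1 can be inspected on their own. The following is a minimal sketch under the same assumptions as the method (every string ends in digits that appear in `subord_list`). The names `rank` and `pairs` are introduced here for illustration only.

```python
import re

test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"]
subord_list = ["6", "7", "4", "11"]

# Map each substring to its position in subord_list: {"6": 0, "7": 1, ...}
rank = {sub: idx for idx, sub in enumerate(subord_list)}

# r"(\d+)$" matches the run of digits at the very end of the string
pairs = []
for ele in test_list:
    trailing = re.search(r"(\d+)$", ele).group()
    pairs.append((ele, trailing, rank[trailing]))

for row in pairs:
    print(row)
```

Sorting `pairs` by the third field gives the same order as the method's output.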
Method #2 : Using sort() + helper function + regex
This is another way to perform the task. It works like the method above; the difference is that a named helper function, rather than an inline lambda, computes the sort key, and the list is sorted in place with sort().
Python3
import re

def hlper_fnc(ele):
    # Extract the trailing integer substring, e.g. "11" from "Sleep at 11"
    temp = re.search(r"(\d+)$", ele).group()
    # Use the custom rank when available, otherwise fall back to the integer value
    return temp_dict[temp] if temp in temp_dict else int(temp)

test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"]
print("The original list : " + str(test_list))
subord_list = ["6", "7", "4", "11"]
# Map each substring in subord_list to its position
temp_dict = {val: key for key, val in enumerate(subord_list)}
test_list.sort(key=hlper_fnc)
print("The sorted list : " + str(test_list))
Output
The original list : ['Good at 4', 'Wake at 7', 'Work till 6', 'Sleep at 11']
The sorted list : ['Work till 6', 'Wake at 7', 'Good at 4', 'Sleep at 11']
The time and space complexity are the same as for Method #1:
Time Complexity: O(n log n), dominated by the sort
Auxiliary Space: O(n)
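If a true two-argument comparator is preferred over a key function, `functools.cmp_to_key` from the standard library can adapt one for `sorted()`. The sketch below is one possible arrangement, not the only one. The names `rank`, `trailing_rank` and `compare` are hypothetical, and it assumes every trailing integer appears in `subord_list`.

```python
import re
from functools import cmp_to_key

test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"]
subord_list = ["6", "7", "4", "11"]

# Map each substring to its position in subord_list
rank = {sub: idx for idx, sub in enumerate(subord_list)}


def trailing_rank(ele):
    """Rank of the integer substring at the end of `ele`."""
    return rank[re.search(r"(\d+)$", ele).group()]


def compare(a, b):
    """Comparator: negative if a sorts first, positive if b does, zero if tied."""
    return trailing_rank(a) - trailing_rank(b)


res = sorted(test_list, key=cmp_to_key(compare))
print("The sorted list : " + str(res))
```

A plain key function is usually simpler and evaluates the regex once per element, whereas a comparator re-evaluates it on every comparison.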
Method #3 : Using a dictionary + sorted() + re module
Step-by-step explanation:
First, a list of strings called “test_list” is initialized with four values.
The original list is printed using the print() function.
A list of substrings called “subord_list” is initialized with the same number of values as the original list.
An empty dictionary called “order_dict” is initialized to store the mappings of substrings to orders.
A for loop is used to iterate through the indices of the “subord_list”. In each iteration, the substring at the current index is added as a key to the “order_dict”, and its corresponding index is added as the value.
The sorted() function is used to sort “test_list” based on the mappings in “order_dict”. The key argument of sorted() is set to a lambda function that extracts the integer substring at the end of each string using a regular expression and looks up its index in “order_dict”.
The sorted list is printed using the print() function.
Python3
import re

test_list = ["Good at 4", "Wake at 7", "Work till 6", "Sleep at 11"]
print("The original list : " + str(test_list))
subord_list = ["6", "7", "4", "11"]
# Map each substring to its index in subord_list
order_dict = {}
for i in range(len(subord_list)):
    order_dict[subord_list[i]] = i
# Sort by the index of each string's trailing integer
test_list = sorted(test_list, key=lambda x: order_dict[re.search(r"(\d+)$", x).group()])
print("The sorted list : " + str(test_list))
Output
The original list : ['Good at 4', 'Wake at 7', 'Work till 6', 'Sleep at 11']
The sorted list : ['Work till 6', 'Wake at 7', 'Good at 4', 'Sleep at 11']
The time complexity of this approach is O(nlogn) due to the use of sorted().
Auxiliary space complexity is O(n) due to the use of the order_dict dictionary.
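All three methods assume that every string ends in an integer listed in the custom order; otherwise `re.search()` returns None or the dictionary lookup raises a KeyError. The sketch below shows one way to tolerate such inputs by placing them after the ranked strings. The function name `sort_by_custom_order` and the "unranked go last" policy are assumptions made for illustration, not part of the methods above.

```python
import re


def sort_by_custom_order(strings, order):
    """Sort by rank of the trailing integer; unranked strings go last (assumed policy)."""
    rank = {sub: idx for idx, sub in enumerate(order)}
    unranked = len(order)  # larger than every valid rank

    def key(ele):
        match = re.search(r"(\d+)$", ele)
        if match is None:
            return unranked  # no trailing integer at all
        return rank.get(match.group(), unranked)  # integer not in `order`

    return sorted(strings, key=key)


res = sort_by_custom_order(
    ["Good at 4", "No number here", "Work till 6", "Sleep at 11", "Wake at 7", "Lunch at 1"],
    ["6", "7", "4", "11"],
)
print(res)
```

Because sorted() is stable, unranked strings keep their original relative order at the end of the result.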