Python – Split Strings on Prefix Occurrence
Last Updated :
16 May, 2023
Given a list of strings, split it into sublists, starting a new sublist at each string that begins with a given prefix.
Input : test_list = ["geeksforgeeks", "best", "geeks", "and"], pref = "geek"
Output : [['geeksforgeeks', 'best'], ['geeks', 'and']]
Explanation : A new sublist starts at each string that begins with "geek".
Input : test_list = ["good", "fruits", "goodness", "spreading"], pref = "good"
Output : [['good', 'fruits'], ['goodness', 'spreading']]
Explanation : A new sublist starts at each string that begins with "good".
Method #1 : Using loop + startswith()
Iterate over the list and use startswith() to check each element for the prefix. When an element starts with the prefix, begin a new sublist; otherwise, append the element to the most recent sublist. This assumes the first element starts with the prefix.
Python3
# Initialize the list and the prefix
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
print("The original list is : " + str(test_list))
pref = "geek"
res = []
for val in test_list:
    # Start a new sublist at each string that begins with the prefix
    if val.startswith(pref):
        res.append([val])
    else:
        # Otherwise extend the most recent sublist
        res[-1].append(val)
print("Prefix Split List : " + str(res))
Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
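One caveat worth illustrating: the loop above accesses res[-1] before any sublist exists if the first string does not start with the prefix, which raises an IndexError. The sketch below shows one possible guard. The function name split_with_guard is hypothetical, and keeping the leading strings in their own sublist is an assumption rather than behavior the article specifies.

```python
def split_with_guard(strings, pref):
    res = []
    for val in strings:
        # Open a new sublist on a prefix match, or when no sublist exists yet
        # (assumption: leading non-matching strings form their own sublist)
        if val.startswith(pref) or not res:
            res.append([val])
        else:
            res[-1].append(val)
    return res

print(split_with_guard(["the", "geeks", "and", "geeksforgeeks"], "geek"))
# [['the'], ['geeks', 'and'], ['geeksforgeeks']]
```

Dropping the leading strings instead would only require replacing the `or not res` test with an `elif res:` branch.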
Method #2 : Using loop + zip_longest() + startswith()
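Pair each element with its successor using zip_longest(test_list, test_list[1:]). Each element is added to a temporary sublist; when the successor starts with the prefix, the temporary sublist is appended to the result and a new one is started.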
Python3
from itertools import zip_longest
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
print("The original list is : " + str(test_list))
pref = "geek"
res, temp = [], []
# Pair each element x with its successor y (None after the last element)
for x, y in zip_longest(test_list, test_list[1:]):
    temp.append(x)
    # Close the current sublist when the next element starts with the prefix
    if y and y.startswith(pref):
        res.append(temp)
        temp = []
# Append the final sublist
res.append(temp)
print("Prefix Split List : " + str(res))
Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
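To make the pairing step concrete, the short sketch below prints the pairs that zip_longest() produces for a four-element list. The variable names words and pairs are illustrative only.

```python
from itertools import zip_longest

words = ["geeksforgeeks", "best", "geeks", "and"]
# Each element is paired with its successor; the last one is padded with None
pairs = list(zip_longest(words, words[1:]))
print(pairs)
# [('geeksforgeeks', 'best'), ('best', 'geeks'), ('geeks', 'and'), ('and', None)]
```

The trailing None is why the loop tests `y` before calling `y.startswith(pref)`.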
Method #3 : Using a function with a running sublist
Step-by-step approach:
- Iterate over the input list of strings using a for loop:
- For each string, check if it starts with the given prefix using the startswith() method.
- If the string starts with the prefix, append the current sublist to the result list (if it is non-empty) using the append() method, and start a new sublist by reassigning sublist to an empty list ([]).
- In either case, add the string to the current sublist using the append() method.
- After the loop completes, add the final sublist to the result list (if it is non-empty) using the append() method.
- Return the result list.
- Test the function by calling it with a sample test_list and prefix, and print the output.
Python3
def split_list_at_prefix(test_list, pref):
    result = []
    sublist = []
    for string in test_list:
        if string.startswith(pref):
            # Close the current sublist (if any) before starting a new one
            if sublist:
                result.append(sublist)
            sublist = []
        # Every string goes into the current sublist
        sublist.append(string)
    # Append the final sublist
    if sublist:
        result.append(sublist)
    return result
test_list = ['geeksforgeeks', 'best', 'geeks', 'and']
prefix = 'geek'
result = split_list_at_prefix(test_list, prefix)
print(result)
Output
[['geeksforgeeks', 'best'], ['geeks', 'and']]
Time complexity: O(n)
Auxiliary space: O(n)
Method #4 : Using loop + find() method
Step-by-step approach:
- Initiate a for loop to traverse list elements
- Use find() to test whether each string starts with the prefix (find() returns 0 in that case). If it does, start a new sublist; otherwise, append the string to the most recent sublist.
- Display new list
Python3
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
print("The original list is : " + str(test_list))
pref = "geek"
res = []
for val in test_list:
    # find() returns 0 when the string starts with the prefix
    if val.find(pref) == 0:
        res.append([val])
    else:
        # Otherwise extend the most recent sublist
        res[-1].append(val)
print("Prefix Split List : " + str(res))
Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
Time Complexity: O(N), where N is the length of the list
Auxiliary Space: O(N)
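The comparison with 0 matters because find() reports where the first match occurs, not whether the string starts with it. A small sketch of the three possible outcomes, using illustrative sample words:

```python
pref = "geek"
# find() returns the index of the first match, or -1 if there is none
results = [(word, word.find(pref), word.startswith(pref))
           for word in ["geeksforgeeks", "forgeeks", "best"]]
for row in results:
    print(row)
# ('geeksforgeeks', 0, True)
# ('forgeeks', 3, False)
# ('best', -1, False)
```

Only an index of 0 corresponds to a prefix match, so `find(pref) == 0` and `startswith(pref)` agree on every string.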
Method #5 : Using itertools.groupby()
Step-by-step approach:
- Initialize the list and prefix to be used for splitting.
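- Use the groupby() function from the itertools module on the original list, with a lambda function as the key that checks whether each element starts with the prefix.
- Loop through the groups produced by groupby(). If the group's key is True (its elements start with the prefix), append the group to the result list as a new sublist. Otherwise, extend the last sublist in the result list with the group's elements, provided a sublist already exists. Note that consecutive strings that start with the prefix fall into the same group and therefore share one sublist.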
- Print the resulting list of sublists.
Python3
from itertools import groupby
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
print("The original list is : " + str(test_list))
pref = "geek"
res = []
# k is True for runs of strings that start with the prefix, False otherwise
for k, g in groupby(test_list, lambda x: x.startswith(pref)):
    if k:
        res.append(list(g))
    else:
        # Skip non-matching strings that appear before the first match
        if res:
            res[-1].extend(list(g))
print("Prefix Split List : " + str(res))
Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
Time complexity: O(n), where n is the length of the input list. This is because the algorithm iterates over each element in the input list once.
Auxiliary space: O(n), where n is the length of the input list. This is because the algorithm creates a list of sublists, where each sublist contains some or all of the elements from the input list. The size of this list of sublists is proportional to the length of the input list.
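Because groupby() groups consecutive elements with the same key, this method can differ from the loop in Method #1 when two prefixed strings are adjacent. The sketch below compares the two on a sample list chosen for illustration; the variable names are hypothetical.

```python
from itertools import groupby

words = ["geeks", "geeky", "best", "geeksforgeeks"]
pref = "geek"

# Loop-based split (as in Method #1): every match opens its own sublist
loop_res = []
for w in words:
    if w.startswith(pref):
        loop_res.append([w])
    else:
        loop_res[-1].append(w)

# groupby-based split (as in Method #5): adjacent matches share one group
group_res = []
for is_match, group in groupby(words, lambda w: w.startswith(pref)):
    if is_match:
        group_res.append(list(group))
    elif group_res:
        group_res[-1].extend(group)

print(loop_res)   # [['geeks'], ['geeky', 'best'], ['geeksforgeeks']]
print(group_res)  # [['geeks', 'geeky', 'best'], ['geeksforgeeks']]
```

On inputs where prefixed strings are never adjacent, such as the article's test_list, both approaches give the same result.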
Method #6: Using the reduce() function from the functools module
Python3
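from functools import reduce
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
pref = "geek"
def split_on_prefix(res, x):
    # Start a new sublist at each string that begins with the prefix
    if x.startswith(pref):
        res.append([x])
    # Otherwise extend the last sublist, if one exists
    elif res:
        res[-1].append(x)
    return res
# Fold the list into the accumulator, starting from an empty list
res = reduce(split_on_prefix, test_list, [])
print("Prefix Split List : " + str(res))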
Output
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
Time complexity: O(n), where n is the length of the input list of strings, since it involves iterating through the list only once.
Auxiliary space: O(n), where n is the length of the input list. One sublist is created per prefix occurrence, but together the sublists hold up to all n input strings.