Python | Split by repeating substring

Last Updated : 23 Apr, 2023

Sometimes, while working with Python strings, we need to split a string that is made up of repetitions of a substring into a list of those repetitions. This kind of custom split can have applications in many domains. Let us discuss certain ways in which this task can be performed.

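Method #1: Using * operator + len() This is one of the ways in which we can perform this task. In this, we divide the length of the string by the length of the repeated substring to get the number of repetitions, and then construct the result list using the * operator. This approach looks only at lengths, so it assumes the string consists entirely of repetitions of K.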

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
# Using * operator + len()
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
# Using * operator + len()
temp = len(test_str) // len(K)
res = [K] * temp
 
# printing result
print("The split string is : " + str(res))


Output : 

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
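
Method #1 never inspects the contents of test_str; it relies only on its length. A minimal sketch of a guarded version follows, assuming we want an empty list when the string is not made up purely of repetitions of the target. The function name split_pure_repeats is hypothetical and not part of the original code.

```python
def split_pure_repeats(text, target):
    """Return [target] * n if text is exactly target repeated n times, else []."""
    if not target:
        raise ValueError("target must be a non-empty string")
    count, remainder = divmod(len(text), len(target))
    # Rebuild the string from the computed count and compare it with the input
    if remainder == 0 and target * count == text:
        return [target] * count
    return []


print(split_pure_repeats("gfggfggfg", "gfg"))  # pure repetition
print(split_pure_repeats("gfgxyzgfg", "gfg"))  # same length, different content
```

The first call returns three copies of 'gfg'; the second returns an empty list, whereas the length-only computation would still report three.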

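Method #2 : Using re.findall() This is yet another way in which this problem can be solved. In this, we use findall() to get all the non-overlapping occurrences of the substring as a list, so no separate split step is needed. Note that findall() treats K as a regular expression pattern, so a target containing metacharacters such as '.' or '*' should be passed through re.escape() first.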

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
# Using re.findall()
import re
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
# Using re.findall()
res = re.findall(re.escape(K), test_str)
 
# printing result
print("The split string is : " + str(res))


Output : 

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
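
Because re.findall() interprets its first argument as a regular expression, a target containing metacharacters can match more than intended. The sketch below uses a hypothetical target 'a.c' and input string, chosen only for illustration, to show how re.escape() keeps the match literal.

```python
import re

text = "a.cabca.c"
target = "a.c"

# Unescaped: '.' matches any character, so "abc" is matched as well
loose = re.findall(target, text)

# Escaped: only the literal substring "a.c" is matched
literal = re.findall(re.escape(target), text)

print(loose)
print(literal)
```

For a plain alphabetic target such as 'gfg', both calls return the same list.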

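Method #3 : Using count() method and * operator In this, count() returns the number of non-overlapping occurrences of K in the string, and the * operator builds a list with that many copies of K.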

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
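# (the count is stored in 'occurrences' to avoid shadowing the 're' module name)
occurrences = test_str.count(K)
res = [K] * occurrences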
 
# printing result
print("The split string is : " + str(res))


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
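
Since count() counts non-overlapping occurrences and ignores any other characters, this method also works when the repetitions are separated by other text. A small sketch with hypothetical input strings (the helper name split_by_count is an assumption for illustration):

```python
def split_by_count(text, target):
    # str.count() returns the number of non-overlapping occurrences of target
    return [target] * text.count(target)


print(split_by_count("gfg-is-gfg", "gfg"))  # occurrences separated by other text
print(split_by_count("aaaa", "aa"))         # overlapping candidates are counted once
```

The second call returns two entries, not three, because the match starting at index 1 overlaps the one at index 0.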

The time and space complexity of Methods #1 to #3 are the same:

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #4 : Using loop and slicing

Python3




# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring using loop and slicing
res = []
start = 0
while start < len(test_str):
    end = start + len(K)
    if test_str[start:end] == K:
        res.append(K)
        start = end
    else:
        start += 1
 
# printing result
print("The split string is : " + str(res))
#This code is contributed by Vinay Pinjala.


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']

Time complexity: O(n * m) in the worst case, where n is the length of the input string and m is the length of K, since each loop iteration builds and compares a slice of up to m characters. For a short, fixed K this is effectively linear in n.
Auxiliary Space: O(n), as the result list holds one entry per match, and the number of matches is at most n / m.
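
The same scan can also be written with str.find(), which jumps directly to the next occurrence instead of advancing one character at a time. This is a sketch of an alternative rather than the code above, and the function name split_by_find is hypothetical.

```python
def split_by_find(text, target):
    """Collect non-overlapping occurrences of target, scanning left to right."""
    if not target:
        raise ValueError("target must be a non-empty string")
    res = []
    start = text.find(target)
    while start != -1:
        res.append(target)
        # Resume the search just past the match so occurrences never overlap
        start = text.find(target, start + len(target))
    return res


print(split_by_find("gfggfggfggfggfggfggfggfg", "gfg"))
```

Rejecting an empty target avoids an endless loop, since find() would keep reporting a match at the current position.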

Method #5 : Using the regular expression module re. This relies on re.findall(), as in Method #2, and is described here step by step.

  1. Import the ‘re’ module, which provides regular expression support in Python.
  2. Initialize a string ‘test_str’ made up of repeated substrings.
  3. Initialize the target string ‘K’ with the substring to split by.
  4. Call ‘re.findall()’ with ‘K’ as the pattern and ‘test_str’ as the string. It returns a list of all non-overlapping matches of the pattern, which is stored in ‘res’.
  5. Print the original string by concatenating “The original string is : ” with ‘test_str’ using the ‘+’ operator.
  6. Convert ‘res’ to a string using ‘str()’, concatenate it with “The split string is : ”, and print the result.

Python3




import re
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# initializing target
K = 'gfg'
 
# Split by repeating substring using re.findall() method
res = re.findall(re.escape(K), test_str)
 
# printing result
print("The original string is : " + test_str)
print("The split string is : " + str(res))


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']

The time complexity of this approach is O(n), where n is the length of the input string. 

The auxiliary space required is O(k), where k is the number of occurrences of the target substring in the input string.


