Python | Remove all duplicate words from a given sentence
Last Updated :
18 Mar, 2023
Given a sentence containing n words/strings, remove all duplicate words/strings so that each word appears only once.
Examples:
Input : Geeks for Geeks
Output : Geeks for
Input : Python is great and Java is also great
Output : is also Java Python and great
We can solve this problem quickly using Python's Counter() class from the collections module. The approach is simple.
1) Split the input sentence on spaces into a list of words.
2) Create a dictionary using Counter(), with the words as keys and their frequencies as values.
3) Join the keys of the dictionary, which are the unique words, into a single string.
Python
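from collections import Counter

def remove_duplicates(sentence):
    # Split the sentence into a list of words
    words = sentence.split(" ")
    # Counter maps each distinct word to its frequency
    unique_words = Counter(words)
    # The keys are the unique words; join them back into one string
    s = " ".join(unique_words.keys())
    print(s)

if __name__ == "__main__":
    sentence = 'Python is great and Java is also great'
    remove_duplicates(sentence)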
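Output
Python is great and Java also
On Python 3.7 and later, Counter keeps its keys in first-seen order, which gives the output above. Older versions may print the same words in a different order.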
Time Complexity: O(N)
Auxiliary Space: O(N)
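To make the role of Counter more concrete, here is a minimal sketch that inspects the frequencies it stores for the sample sentence. The helper name word_frequencies is hypothetical and used only for illustration.

```python
from collections import Counter

def word_frequencies(sentence):
    """Hypothetical helper: map each word to the number of times it occurs."""
    return Counter(sentence.split())

freq = word_frequencies("Python is great and Java is also great")
print(freq["is"])    # 2
print(freq["Java"])  # 1
print(len(freq))     # 6 distinct words
```

Since every distinct word appears exactly once as a key, joining the keys yields the sentence without duplicates.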
Method 2: Using count() and a list
Python
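s = "Python is great and Java is also great"
l = s.split()
k = []
for i in l:
    # l.count(i) is always >= 1 for a word taken from l,
    # so the membership test on k is what filters out duplicates
    if l.count(i) >= 1 and i not in k:
        k.append(i)
print(' '.join(k))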
Output
Python is great and Java also
Time Complexity: O(N*N)
Auxiliary Space: O(N)
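When counting words, keep in mind that str.count() counts substring occurrences, while list.count() counts whole elements. The short sketch below uses a made-up sentence (not from the examples above) to show the difference.

```python
sentence = "this is his island"
words = sentence.split()

# str.count looks for the substring "is" anywhere in the text
print(sentence.count("is"))  # 4

# list.count only matches list elements that equal "is"
print(words.count("is"))     # 1
```

For whole-word counts, calling count() on the split list is therefore the safer choice.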
Method 3: A shorter implementation using dict.fromkeys()
Python3
string = 'Python is great and Java is also great'
print(' '.join(dict.fromkeys(string.split())))
Output
Python is great and Java also
Time Complexity: O(N)
Auxiliary Space: O(N)
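To see why this one-liner works, the sketch below prints the intermediate dictionary. The helper name unique_words is hypothetical; the approach relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
def unique_words(sentence):
    """Hypothetical helper: drop repeated words, keeping the first occurrence of each."""
    # dict.fromkeys builds a dict whose keys are the words and whose values are all None;
    # a repeated word simply maps onto the key that already exists.
    return list(dict.fromkeys(sentence.split()))

print(dict.fromkeys("Geeks for Geeks".split()))  # {'Geeks': None, 'for': None}
print(unique_words("Geeks for Geeks"))           # ['Geeks', 'for']
```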
Method 4: Using set()
Python3
string = 'Python is great and Java is also great'
print(' '.join(set(string.split())))
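Output
Java also great and Python is
Note: a set is unordered, so the order of the words in the output can differ from one run to the next.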
Time Complexity: O(N)
Auxiliary Space: O(N)
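If the original word order matters, one possible workaround is to sort the set by each word's first position in the sentence. This is a sketch only; the function name unique_in_order is hypothetical, and the words.index lookup makes it O(N*N) rather than O(N).

```python
def unique_in_order(sentence):
    """Hypothetical helper: set-based de-duplication that restores first-seen order."""
    words = sentence.split()
    # set() removes the duplicates; sorting by words.index places each
    # remaining word back at the position of its first occurrence.
    return " ".join(sorted(set(words), key=words.index))

print(unique_in_order("Python is great and Java is also great"))
# Python is great and Java also
```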
Method 5: using operator.countOf()
Python3
import operator as op

s = "Python is great and Java is also great"
l = s.split()
k = []
for i in l:
    # op.countOf(l, i) returns how many times i occurs in the list l
    if op.countOf(l, i) >= 1 and i not in k:
        k.append(i)
print(' '.join(k))
Output
Python is great and Java also
Time Complexity: O(N*N), because op.countOf() and the membership test on k each scan a list for every word
Auxiliary Space: O(N)
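As a small illustration of what operator.countOf() returns, the sketch below compares it with list.count() on the same list of words.

```python
import operator as op

words = "Python is great and Java is also great".split()

# countOf(sequence, value) returns the number of items equal to value,
# which for a list gives the same answer as list.count(value)
print(op.countOf(words, "great"))  # 2
print(words.count("great"))        # 2
print(op.countOf(words, "Ruby"))   # 0
```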
Method 6: Using a loop and a list
This method uses a loop to traverse each word of the sentence and stores the unique words in a separate list, using an if condition to check whether the word is already present in that list.
Follow the steps below to implement the above idea:
- Split the given sentence into words/strings and store them in a list.
- Create an empty list to store the distinct words/strings.
- Iterate over the list of words/strings, and for each word, check if it is already in the result list.
- If the word is not in the result list, append it.
- If the word is already in the result list, skip it.
- Finally, join the words in the result list using a space and return it as the output.
Below is the implementation of the above approach:
Python3
def remove_duplicates(sentence):
    words = sentence.split(" ")
    result = []
    for word in words:
        # Keep only the first occurrence of each word
        if word not in result:
            result.append(word)
    return " ".join(result)

sentence = "Python is great and Java is also great"
print(remove_duplicates(sentence))
Output
Python is great and Java also
Time complexity: O(n^2) because of the list result that stores unique words, which is searched for every word in the input sentence.
Auxiliary space: O(n) because we are storing unique words in the result list.
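The quadratic cost comes from searching a list for every word. A common variation, sketched here under the hypothetical name remove_duplicates_fast, tracks the words already seen in a set, where membership checks take O(1) time on average, while a list still records the order.

```python
def remove_duplicates_fast(sentence):
    """Hypothetical variant: set for membership tests, list for output order."""
    seen = set()
    result = []
    for word in sentence.split():
        if word not in seen:
            seen.add(word)       # remember the word for later lookups
            result.append(word)  # keep first occurrences in order
    return " ".join(result)

print(remove_duplicates_fast("Python is great and Java is also great"))
# Python is great and Java also
```

This brings the average time complexity down to O(n) at the cost of storing each unique word in both the set and the list.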
Method 7: Using recursion
Algorithm:
- Split the input sentence into words.
- If there is only one word, return it.
- If the first word is present in the rest of the words, drop it and call the function recursively with the rest of the words.
- If the first word is not present in the rest of the words, concatenate it with the result of calling the function recursively with the rest of the words.
- Return the final result as a string.
Note that, unlike the previous methods, this approach keeps the last occurrence of each repeated word rather than the first, so the order of the output differs.
Python3
def remove_duplicates(sentence):
    words = sentence.split(" ")
    # Base case: a single word has no duplicates
    if len(words) == 1:
        return words[0]
    # The first word appears again later, so drop this occurrence
    if words[0] in words[1:]:
        return remove_duplicates(" ".join(words[1:]))
    else:
        return words[0] + " " + remove_duplicates(" ".join(words[1:]))

sentence = "Python is great and Java is also great"
print(remove_duplicates(sentence))
Output
Python and Java is also great
Time complexity:
The time complexity of this algorithm is O(n^2), where n is the number of words in the input sentence. For each word, the in operator checks whether it is present in the rest of the words, which takes O(n) time in the worst case, and each call also re-splits and re-joins the remaining words.
Space complexity:
The recursion depth equals the number of words, so the call stack holds up to n frames. Each frame keeps its own list of the remaining words alive until the deeper calls return, so the total auxiliary space can grow to O(n^2) in the worst case rather than O(n). Because CPython's default recursion limit is 1000, this method also raises a RecursionError on sentences with roughly a thousand or more words.
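The same keep-the-last-occurrence behaviour can be obtained without recursion. The following is a sketch under the hypothetical name keep_last_occurrences: it walks the words from the end, de-duplicates them with dict.fromkeys(), and reverses the result back.

```python
def keep_last_occurrences(sentence):
    """Hypothetical iterative equivalent: keep the last occurrence of each word."""
    words = sentence.split()
    # Scanning from the end means the first time a word is met is its last occurrence
    deduped_reversed = dict.fromkeys(reversed(words))
    # Reverse again to restore left-to-right order
    return " ".join(reversed(list(deduped_reversed)))

print(keep_last_occurrences("Python is great and Java is also great"))
# Python and Java is also great
```

This version runs in O(n) time on average and is not affected by the recursion limit.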
Method 8: Using reduce()
- The remove_duplicates function takes a string as input and splits it into a list of words using the split() method. This takes O(n) time, where n is the length of the input string.
- The function uses reduce() from the functools module to fold the list of words into a list of unique words, starting from an empty list as the accumulator.
- The lambda function passed to reduce() checks whether the word y is already in the accumulator list x. If it is, x is returned unchanged; otherwise a new list x + [y] is returned.
- Finally, the function returns a string joined from the list of unique words using the join() method. This takes O(n) time, where n is the length of the output string.
Python3
from functools import reduce

def remove_duplicates(input_str):
    words = input_str.split()
    # Fold the words into a list, starting from an empty accumulator;
    # a word is added only if the accumulator does not already contain it
    unique_words = reduce(lambda x, y: x if y in x else x + [y], words, [])
    return ' '.join(unique_words)

input_str = 'Python is great and Java is also great'
print(remove_duplicates(input_str))
Output
Python is great and Java also
Time complexity: O(n^2), where n is the number of words in the input string. reduce() visits each word once, and for each word both the membership check on the list of unique words and the list copy made by x + [y] take O(n) time in the worst case.
Auxiliary space: O(n), because the list of unique words is as large as the input list in the worst case, when there are no duplicates.
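As a quick cross-check, the sketch below runs three of the order-preserving approaches on the same sentence and confirms that they agree. The function names via_counter, via_fromkeys and via_reduce are hypothetical wrappers, and the agreement assumes Python 3.7 or later, where Counter and dict keep insertion order.

```python
from collections import Counter
from functools import reduce

def via_counter(sentence):
    return " ".join(Counter(sentence.split()).keys())

def via_fromkeys(sentence):
    return " ".join(dict.fromkeys(sentence.split()))

def via_reduce(sentence):
    return " ".join(
        reduce(lambda acc, w: acc if w in acc else acc + [w], sentence.split(), [])
    )

sentence = "Python is great and Java is also great"
results = {f.__name__: f(sentence) for f in (via_counter, via_fromkeys, via_reduce)}
print(results)
# All three strings should be identical
print(len(set(results.values())) == 1)
```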