
Python Counter to find the size of largest subset of anagram words

Last Updated : 27 Jul, 2023

Given an array of n strings containing lowercase letters, find the size of the largest subset of strings that are anagrams of each other.

An anagram of a string is another string that contains the same characters with the same frequencies; only the order of the characters can differ. For example, “abcd” and “dabc” are anagrams of each other. Examples:

Input: 
ant magenta magnate tan gnamate
Output: 3
Explanation:
Anagram strings(1) - ant, tan
Anagram strings(2) - magenta, magnate, gnamate
Thus, the second subset has the largest size, i.e., 3

Input:
cars bikes arcs steer
Output: 2
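The definition of an anagram can be checked directly in code. The snippet below is a minimal sketch; the helper name is_anagram is an assumption introduced for illustration and does not come from the solutions in this article.

```python
from collections import Counter


def is_anagram(a, b):
    """Return True if a and b contain the same characters with the same counts.

    is_anagram is a hypothetical helper name used only for this illustration.
    """
    return Counter(a) == Counter(b)


print(is_anagram("abcd", "dabc"))    # True
print(is_anagram("cars", "arcs"))    # True
print(is_anagram("ant", "magenta"))  # False
```

Comparing Counter objects checks character frequencies as well as the character set, so "aab" and "abb" are correctly reported as not being anagrams.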

Python Counter to find the size of largest subset of anagram words

An existing solution to this problem is described in the article Find the size of largest subset of anagram words. In Python, we can solve it quickly using Counter() from the collections module. The approach is simple:

  1. Split the input string on spaces to get a list of words.
  2. Two strings are anagrams of each other if they contain the same characters with the same counts, so sorting the characters of each string maps every anagram to the same sorted string.
  3. Create a dictionary with Counter() that has the sorted strings as keys and their frequencies as values.
  4. Find the maximum frequency; that is the size of the largest subset of anagram strings.

Python3




# Function to find the size of the largest subset
# of anagram words
from collections import Counter


def maxAnagramSize(text):

    # split the input string on whitespace
    # (the parameter is named text to avoid shadowing the built-in input)
    words = text.split()

    # sort the characters of each word so that all
    # anagrams map to the same key
    keys = [''.join(sorted(word)) for word in words]

    # create a dictionary using Counter, which has the
    # sorted strings as keys and their frequencies as values
    freqDict = Counter(keys)

    # return the maximum frequency (0 if there are no words)
    return max(freqDict.values(), default=0)


# Driver program
if __name__ == "__main__":
    text = 'ant magenta magnate tan gnamate'
    print(maxAnagramSize(text))


Output:

3
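To see why the maximum frequency is the answer, it can help to look at the intermediate values for the sample input. The sketch below only reuses the sorting and Counter steps described above; the variable names keys and freq are chosen for illustration.

```python
from collections import Counter

words = 'ant magenta magnate tan gnamate'.split()

# Sorting the letters gives every anagram the same key.
keys = [''.join(sorted(word)) for word in words]
print(keys)
# ['ant', 'aaegmnt', 'aaegmnt', 'ant', 'aaegmnt']

freq = Counter(keys)

# most_common(1) returns the key with the highest count together with that count.
print(freq.most_common(1))
# [('aaegmnt', 3)]
```

Here most_common(1) is an alternative to max(freq.values()) that also reports which sorted key the largest group belongs to.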

Python Counter to find the size of largest subset of anagram words Using dictionary

Approach

This approach uses a plain dictionary to group words that contain the same characters. Instead of using a frozen set of character counts as the key, it sorts the characters in each word and uses the resulting string as the key.

Algorithm

1. Create an empty dictionary called anagram_dict.
2. Loop through each word in the input list words:
   a. Sort the characters in the word and store the result as a string.
   b. If the sorted string is not already in the dictionary, add it as a key and set its value to an empty list.
   c. Append the original word to the list of values for the corresponding key in the dictionary.
3. Find the maximum length of the values in the dictionary.
4. Return the maximum length.

Python3




def largest_anagram_subset_size(words):
    anagram_dict = {}
    for word in words:
        sorted_word = ''.join(sorted(word))
        if sorted_word not in anagram_dict:
            anagram_dict[sorted_word] = []
        anagram_dict[sorted_word].append(word)
    # default=0 avoids a ValueError when words is empty
    max_count = max((len(val) for val in anagram_dict.values()), default=0)
    return max_count
 
words = ['ant', 'magenta', 'magnate', 'tan', 'gnamate']
print(largest_anagram_subset_size(words))


Output

3

Time complexity: O(n * k log k), where n is the number of words and k is the maximum length of a word in the list. Sorting the characters of one word takes O(k log k) time, and this is done once for each of the n words.

Auxiliary Space: O(n * k), where n is the number of words and k is the maximum length of a word in the list. The dictionary stores each input word exactly once across its value lists, plus at most n sorted keys of length up to k. A single value list can hold up to n words, but the total over all lists is still n, so the space required is proportional to n times k.
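The k log k factor comes from sorting each word. Because the problem statement restricts the input to lowercase letters, one possible variation is to use a tuple of 26 letter counts as the key, which takes O(k) time per word. This is a sketch under that assumption, not part of the original solutions; the names signature and largest_anagram_subset_size_by_counts are hypothetical.

```python
from collections import Counter


def signature(word):
    """Return a 26-tuple of letter counts; assumes only 'a'..'z' appear."""
    counts = [0] * 26
    for ch in word:
        counts[ord(ch) - ord('a')] += 1
    return tuple(counts)


def largest_anagram_subset_size_by_counts(words):
    # Anagrams share the same letter counts, so they share the same signature.
    group_sizes = Counter(signature(word) for word in words)
    # default=0 handles an empty list of words.
    return max(group_sizes.values(), default=0)


words = ['ant', 'magenta', 'magnate', 'tan', 'gnamate']
print(largest_anagram_subset_size_by_counts(words))  # 3
```

With a fixed alphabet of 26 letters, building all signatures takes O(n * k) time, at the cost of keys that always have 26 entries even for short words.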

Python Counter to find the size of largest subset of anagram words Using lambda

We need to find the size of the largest subset of anagram words in the given list of words. The collections.Counter class builds a dictionary of the count of each character in a string, and two words are anagrams exactly when their Counter objects are equal. Using nested lambda functions, we compare the Counter of each word with the Counter of every word in the list, and take the maximum number of matches for any word as the size of the largest subset of anagram words.

Algorithm

1. For each word x in the list of words:
   a. Map each word y in the list of words to True if Counter(y) equals Counter(x), and False otherwise.
   b. Sum the results (True counts as 1 and False as 0) to obtain the number of words that are anagrams of x, including x itself.
2. Set max_anagrams to the maximum of these sums, using 0 as the default when the list is empty.
3. Output max_anagrams.

Python3




from collections import Counter
 
words = ['cars', 'bikes', 'arcs', 'steer']
 
max_anagrams = max(
    list(
        map(
            lambda x: sum(
                map(
                    lambda y: Counter(y) == Counter(x),
                    words
                )
            ),
            words
        )
    ),
    default=0
)
 
print(max_anagrams)


Output

2

Time complexity: O(n^2 * k), where n is the length of the list of words and k is the maximum length of a word. Each of the n^2 pairs of words builds two Counter objects, which takes O(k) time.

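Auxiliary Space: O(n + k), where n is the length of the list of words and k is the maximum number of distinct characters in a word. The list of per-word counts holds n integers, and only the two Counter objects being compared, each with at most k entries, exist at any one time.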
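The pairwise comparison can be avoided by turning each Counter into a hashable key and grouping in a single pass. The sketch below is one way to do that, using a frozenset of (character, count) pairs; the function name largest_anagram_subset_size_hashed is an assumption made for illustration.

```python
from collections import Counter


def largest_anagram_subset_size_hashed(words):
    # A frozenset of (character, count) pairs is hashable and is identical
    # for any two words that are anagrams of each other.
    group_sizes = Counter(frozenset(Counter(word).items()) for word in words)
    # default=0 handles an empty list of words.
    return max(group_sizes.values(), default=0)


words = ['cars', 'bikes', 'arcs', 'steer']
print(largest_anagram_subset_size_hashed(words))  # 2
```

Each word is processed once, so the running time grows linearly with the number of words rather than quadratically.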


