
Python Program for Anagram Substring Search (Or Search for all permutations)

Last Updated : 07 Jul, 2021

Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(pat, txt) that prints every index in txt[] at which pat[] or one of its permutations (anagrams) occurs. You may assume that n > m.
The expected time complexity is O(n).

Examples:

1) Input:  txt[] = "BACDGABCDA"  pat[] = "ABCD"
   Output:   Found at Index 0
             Found at Index 5
             Found at Index 6
2) Input: txt[] =  "AAABABAA" pat[] = "AABA"
   Output:   Found at Index 0
             Found at Index 1
             Found at Index 4
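
The expected outputs above can be checked with a brute-force sketch that sorts every window of the text and compares it with the sorted pattern. It is not the O(n) solution; it takes roughly O(n * m log m) time and is meant only as a reference. The function name `brute_force_anagram_indices` is a hypothetical helper, not part of the original program.

```python
def brute_force_anagram_indices(pat, txt):
    """Return every start index whose window is a permutation of pat."""
    m = len(pat)
    target = sorted(pat)
    # A window is an anagram of pat exactly when its sorted characters
    # equal the sorted characters of pat.
    return [i for i in range(len(txt) - m + 1)
            if sorted(txt[i:i + m]) == target]


print(brute_force_anagram_indices("ABCD", "BACDGABCDA"))  # [0, 5, 6]
print(brute_force_anagram_indices("AABA", "AAABABAA"))    # [0, 1, 4]
```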


A simple idea is to modify the Rabin-Karp algorithm. For example, we can use the sum of the ASCII values of all characters in the window, taken modulo a large prime, as the hash value. To slide the window by one character, we add the incoming character to the hash value and subtract the first character of the previous window. This solution looks good, but like standard Rabin-Karp, its worst-case time complexity is O(mn). The worst case occurs when the hash values of all windows match and each window must then be verified character by character.
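
A minimal sketch of this hash-based idea follows. The modulus `PRIME`, the function name `search_with_sum_hash`, and the use of `sorted()` for the verification step are all assumptions made for illustration; the text above does not fix any of them. Verification is required because different windows can share the same sum (for example "AD" and "BC" both sum to 133).

```python
PRIME = 1_000_000_007  # assumed modulus; any large prime would do


def search_with_sum_hash(pat, txt):
    """Return start indices of anagrams of pat in txt using a rolling sum hash."""
    m, n = len(pat), len(txt)
    if m == 0 or m > n:
        return []
    target = sorted(pat)
    pat_hash = sum(ord(c) for c in pat) % PRIME
    win_hash = sum(ord(c) for c in txt[:m]) % PRIME
    matches = []
    for start in range(n - m + 1):
        # Equal hashes only suggest a match, so verify the window explicitly.
        if win_hash == pat_hash and sorted(txt[start:start + m]) == target:
            matches.append(start)
        if start + m < n:
            # Slide the window: add the incoming character, drop the outgoing one.
            win_hash = (win_hash + ord(txt[start + m]) - ord(txt[start])) % PRIME
    return matches


print(search_with_sum_hash("ABCD", "BACDGABCDA"))  # [0, 5, 6]
print(search_with_sum_hash("AD", "BC"))            # [] (hash collision rejected)
```

When many windows collide with the pattern's hash, the verification runs for almost every window, which is where the poor worst-case behavior comes from.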
We can achieve O(n) time complexity under the assumption that the alphabet size is fixed, which is typically true because an 8-bit character has at most 256 possible values. The idea is to use two count arrays:

1) The first count array stores the frequencies of the characters in the pattern.
2) The second count array stores the frequencies of the characters in the current window of the text.

The important thing to note is that comparing the two count arrays takes O(1) time, because the number of elements in them is fixed (independent of the pattern and text sizes). The steps of the algorithm are as follows.
1) Store the character frequencies of the pattern in the first count array countP[]. Also store the character frequencies of the first window of the text in the array countTW[].

2) Now run a loop from i = M to N-1. Do the following in the loop.
    a) If the two count arrays are identical, we have found an occurrence.
    b) Increment the count of the current character of the text in countTW[].
    c) Decrement the count of the first character of the previous window in countTW[].

3) The last window is not checked by the above loop, so check it explicitly.

Python3




# Python program to search all
# anagrams of a pattern in a text
  
MAX = 256 
  
# This function returns true
# if contents of arr1[] and arr2[]
# are same, otherwise false.
def compare(arr1, arr2):
    for i in range(MAX):
        if arr1[i] != arr2[i]:
            return False
    return True
      
# This function searches for all
# permutations of pat[] in txt[]  
def search(pat, txt):
  
    M = len(pat)
    N = len(txt)
  
    # countP[]:  Store count of
    # all characters of pattern
    # countTW[]: Store count of
    # current window of text
    countP = [0]*MAX
  
    countTW = [0]*MAX
  
    for i in range(M):
        countP[ord(pat[i])] += 1
        countTW[ord(txt[i])] += 1
  
    # Traverse through remaining
    # characters of text
    for i in range(M, N):
  
        # Compare counts of current
        # window of text with
        # counts of pattern[]
        if compare(countP, countTW):
            print("Found at Index", (i-M))
  
        # Add current character to current window
        countTW[ord(txt[i])] += 1
  
        # Remove the first character of previous window
        countTW[ord(txt[i-M])] -= 1
      
    # Check for the last window in text    
    if compare(countP, countTW):
        print("Found at Index", N-M)
          
# Driver program to test above function       
txt = "BACDGABCDA"
pat = "ABCD"       
search(pat, txt)   
  
# This code is contributed
# by Upendra Singh Bartwal


Output:

Found at Index 0
Found at Index 5
Found at Index 6
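
The program above indexes its count arrays with ord(), so it assumes every character code is below 256 and that the pattern is no longer than the text. As a hedged alternative sketch, the same sliding-window idea can be written with `collections.Counter`, which accepts any character, and can return the indices instead of printing them. The name `find_anagram_indices` is hypothetical and this variant is not part of the original program.

```python
from collections import Counter


def find_anagram_indices(pat, txt):
    """Return start indices of all anagrams of pat in txt."""
    m, n = len(pat), len(txt)
    if m == 0 or m > n:
        return []
    need = Counter(pat)          # character counts of the pattern
    window = Counter(txt[:m])    # character counts of the current window
    matches = [0] if window == need else []
    for i in range(m, n):
        window[txt[i]] += 1      # character entering the window
        left = txt[i - m]        # character leaving the window
        window[left] -= 1
        if window[left] == 0:
            del window[left]     # drop zero counts so the comparison stays exact
        if window == need:
            matches.append(i - m + 1)
    return matches


print(find_anagram_indices("AABA", "AAABABAA"))  # [0, 1, 4]
```

Here each comparison costs time proportional to the number of distinct characters in the window rather than a fixed 256, so it stays bounded whenever the alphabet is bounded.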



