Algorithm Wildcard Searching with *

Правка en6, от cardcounter, 2024-12-21 22:58:46

I am thinking about an effient algorithm for wildcard searching with * representing any characters with any length.

aa*c, she*he, *she*he

Example:

  1. caa find aa*c

    return "DOES NOT EXIST"

  2. hesherheshe find she*he return 2

    Because sherheshe begins at index 2

  3. sherheshe find she*he return 0 Because the whole string When I am supposed to return, say, the beginning index of the first matching instance.

Say the pattern is of length M and the document is of length N, and the pattern has K '*' signs. I can think of a solution that first uses AC Automation to find all occurences of each chunk in O(N + M), with bitmask.

While converting the bitmask to indexes takes O(N * K)

Then binary search for the last possible beginning positions for each chunk. This could take O (K log N)

So the overall time complexity is still O (N * K), any way to do better?

References

https://mirror.codeforces.com/blog/entry/111380

https://mirror.codeforces.com/problemset/problem/1023/A

https://mirror.codeforces.com/blog/entry/127169

Теги pattern_matching, wildcard

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en7 Английский cardcounter 2024-12-21 23:01:06 18
en6 Английский cardcounter 2024-12-21 22:58:46 229
en5 Английский cardcounter 2024-12-21 12:46:01 44 Tiny change: 'm/1023/A\n' -> 'm/1023/A\n\nhttps://mirror.codeforces.com/blog/entry/127169\n'
en4 Английский cardcounter 2024-12-21 12:41:11 2 Tiny change: 'y/111380\nhttps://' -> 'y/111380\n\nhttps://'
en3 Английский cardcounter 2024-12-21 12:40:45 108
en2 Английский cardcounter 2024-12-21 12:32:51 7
en1 Английский cardcounter 2024-12-21 12:30:21 746 Initial revision (published)