Efficient algorithms with simple regex search

Revision en4, by cardcounter, 2024-11-26 10:45:12

What is an efficient algorithm for finding all occurrences (starting indexes) of a regex pattern in a long string?

Say the only special character in this question is *, which matches zero or more occurrences of the preceding character, as in https://leetcode.com/problems/regular-expression-matching/

a* search in aaa returns [0, 1, 2]

a*b search in abb returns [0]

Here a* and a*b are patterns, and aaa and abb are the text strings to search within. The text strings could be very long.

Borrowing the approach from https://leetcode.com/problems/regular-expression-matching/, I can only come up with an O(N * N * M) solution, where N is the length of the text and M is the length of the pattern.
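For concreteness, here is a minimal Python sketch of one way to get that O(N * N * M) bound. It is a baseline under stated assumptions, not an efficient answer. The names parse, matches_at and find_all are made up for illustration. It assumes that * applies only to the single preceding literal character, and that a start index counts when the pattern matches some non-empty substring beginning there. Under this reading, a*b in abb gives [0, 1, 2], because a bare b matches with zero a's; the sketch does not try to guess a different rule that would yield only [0].

```python
def parse(pattern):
    """Split a pattern such as 'a*b' into (char, starred) tokens."""
    tokens = []
    for ch in pattern:
        if ch == "*":
            if not tokens or tokens[-1][1]:
                raise ValueError("'*' must follow a literal character")
            tokens[-1] = (tokens[-1][0], True)
        else:
            tokens.append((ch, False))
    return tokens


def matches_at(text, start, tokens):
    """True if the tokens match some non-empty substring starting at `start`."""
    m = len(tokens)

    def closure(states):
        # A starred token may match zero characters, so it can be skipped.
        out = set(states)
        for k in range(m):  # increasing k propagates chained skips in one pass
            if k in out and tokens[k][1]:
                out.add(k + 1)
        return out

    # states holds the token positions reachable after the consumed prefix.
    states = closure({0})
    for j in range(start, len(text)):
        nxt = set()
        for k in states:
            if k < m and tokens[k][0] == text[j]:
                # A starred token can repeat; a plain token advances.
                nxt.add(k if tokens[k][1] else k + 1)
        states = closure(nxt)
        if m in states:   # the whole pattern has been consumed
            return True
        if not states:    # no way to extend this attempt
            return False
    return False


def find_all(text, pattern):
    """Start indexes where the pattern matches; O(N * M) work per start."""
    tokens = parse(pattern)
    return [i for i in range(len(text)) if matches_at(text, i, tokens)]


print(find_all("aaa", "a*"))   # [0, 1, 2]
print(find_all("abb", "a*b"))  # [0, 1, 2] under the assumptions above
```

Each start index costs up to O(N * M) because the state set has at most M + 1 entries and is advanced once per text character, which is where the extra factor of N comes from.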

References:

https://mirror.codeforces.com/blog/entry/73568

https://mirror.codeforces.com/blog/entry/111380

Tags substring search, regular expressions, kmp, fft, fsm, aho-corasick, wildcard
