Algorithm Wildcard Searching with *

cardcounter's blog

By cardcounter, 7 hours ago, in English

I am thinking about an efficient algorithm for wildcard searching, where * stands for any sequence of characters of any length.

Example patterns: aa*c, she*he, *she*he

I am supposed to return, say, the starting index of the first match.

Say the pattern has length M, the document has length N, and the pattern contains K '*' signs. I can think of a solution that first uses an Aho-Corasick automaton to find all occurrences of each chunk in O(N + M), stored as bitmasks.

Converting the bitmasks to lists of indices then takes O(N * K).

Then I binary search for the last possible starting position of each chunk. This takes O(K log N).

So the overall time complexity is still O(N * K). Is there any way to do better?
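To make the steps above concrete, here is a rough sketch of one possible reading of them, not a definitive implementation. It assumes '*' may match the empty string and that chunks must not overlap. A plain `str.find` loop stands in for the Aho-Corasick pass, so it does not reach the O(N + M) bound for that step. The names `occurrences` and `latest_starts` are made up for illustration.

```python
from bisect import bisect_right


def occurrences(text, chunk):
    """Sorted start indices of all (possibly overlapping) occurrences of chunk.

    Stand-in for the Aho-Corasick pass; this is an assumption for brevity.
    """
    out = []
    i = text.find(chunk)
    while i != -1:
        out.append(i)
        i = text.find(chunk, i + 1)
    return out


def latest_starts(text, pattern):
    """For each chunk, the last start that still leaves room for the later chunks.

    Returns None if the chunks cannot be placed in order without overlapping.
    """
    chunks = [c for c in pattern.split("*") if c]
    limit = len(text)  # the current chunk must end at or before this index
    latest = []
    # Walk the chunks right to left, one binary search per chunk.
    for chunk in reversed(chunks):
        occ = occurrences(text, chunk)
        j = bisect_right(occ, limit - len(chunk)) - 1
        if j < 0:
            return None
        latest.append(occ[j])
        limit = occ[j]  # the previous chunk must end before this one starts
    latest.reverse()
    return latest
```

For example, `latest_starts("she he she he", "she*he")` gives `[7, 11]`, and `latest_starts("caa", "aa*c")` gives `None` because no `c` appears after `aa`.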

References

https://mirror.codeforces.com/blog/entry/111380

https://mirror.codeforces.com/problemset/problem/1023/A

https://mirror.codeforces.com/blog/entry/127169

pattern_matching, wildcard


Comments (2)


piggy123

4 hours ago

I'm not acquainted with regular expressions, so the following might be a misinterpretation of your problem, but it seems that what you are asking is essentially to find a series of non-intersecting occurrences of several strings. We then need a way to find all occurrences of a given string and pick the first one past a certain threshold (determined by the position of the previous string). This can be accomplished with a Suffix Automaton, which turns the problem into online queries for the lower bound over the values within a node's subtree (since the subtree values in the fail tree/suffix tree represent the endpos set of a string). This can be done with a persistent segment tree and binary search on its structure, resulting in a time and memory complexity of O(n log n).


piggy123

4 hours ago

Wait wait wait, if my interpretation is correct, there is no need to be so complex. Each time you just need to find the first occurrence of each string between the '*'s, so you can just run KMP on each of them while scanning through the document once. Each position of the document is visited O(1) times, so it should only take O(n+m) time in total.
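A minimal sketch of this greedy scan, under my own assumptions: '*' may match the empty string, chunks must not overlap, and a pattern that starts with '*' (or has no chunks at all) is reported as matching from index 0. The helper names `kmp_find` and `first_wildcard_match` are made up for illustration.

```python
def kmp_find(text, pat, start=0):
    """Smallest index i >= start with text[i:i+len(pat)] == pat, or -1 (pat non-empty)."""
    # Build the KMP failure function for pat.
    fail = [0] * len(pat)
    k = 0
    for i in range(1, len(pat)):
        while k and pat[i] != pat[k]:
            k = fail[k - 1]
        if pat[i] == pat[k]:
            k += 1
        fail[i] = k
    # Scan text from `start`, stopping at the first full match.
    k = 0
    for i in range(start, len(text)):
        while k and text[i] != pat[k]:
            k = fail[k - 1]
        if text[i] == pat[k]:
            k += 1
        if k == len(pat):
            return i - len(pat) + 1
    return -1


def first_wildcard_match(text, pattern):
    """Start index of the first match of pattern in text, or -1 if there is none."""
    chunks = [c for c in pattern.split("*") if c]
    if not chunks:
        return 0  # assumption: an empty or all-'*' pattern matches at index 0
    pos = 0  # the next chunk must start at or after this index
    first_start = -1
    for chunk in chunks:
        idx = kmp_find(text, chunk, pos)
        if idx == -1:
            return -1
        if first_start == -1:
            first_start = idx
        pos = idx + len(chunk)  # chunks may not overlap
    # Assumption: a leading '*' absorbs the prefix, so the match starts at 0.
    return 0 if pattern.startswith("*") else first_start
```

Taking the earliest occurrence of each chunk is safe because starting later never leaves more room for the remaining chunks. Each call resumes where the previous chunk ended, so the document is scanned once overall.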

I'm now convinced I've misunderstood. Please give some more explanation :(

