Hello all,
I'm trying to solve this problem on UVa: http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&category=24&problem=3783 (from ICPC South America Regional Finals 2011).
I've seen how to solve it using Suffix Arrays/LCP Arrays. However, I'm trying to solve it using a Suffix Automaton, mainly to learn more/get more used to it.
So, my approach is this:
Concatenate each file using a different separator $ (one that is not a-z).
Build the Suffix Automaton for the resulting string
Now, instead of finding the end-sets of each node (the indexes on where the substrings represented by that node end), I use a bitmask on each node the following way: If index i would appear in the end-set of node n and i belongs to file f, turn f on in node n (n.mask |= (1<<f)).
To see the how many searchable subsets are there, I do a DFS from the root (ignoring any '$/!/..." transitions) and count how many different bitmasks I have. (Also, I ignore the source node that represents the empty string).
The rationale behind this is that if the substring appears on indexes belonging to the files f0, f1..., that substring (and all substrings represented by that node) can be used to search that subset (and only that one). Ignoring the '$'-transitions should guarantee that we don't end up checking any invalid substring.
I'm getting WA with this approach. (It gives the correct answer for the mojority of the test cases, though :( ).
Could anyone please help me find out what's wrong with this idea?
Полный текст и комментарии »