Блог пользователя Mr.Awesome

Автор Mr.Awesome, история, 5 лет назад, По-английски

Here's a problem I crossed lately:

You have a large String S, and q queries, each query consist of a small string b.

The answer of each query is to find the length of the longest common substring between S and b. ( |S| <= 10^5, |b| <= 100, q <= 100 )

My dp solution to find the length of the largest LCS is O(n*m) per query, but it doesn't seem to pass!

I think will need to do some pre-prcessing of S before starting quering, but I can't find a solution.

Any hint or idea will be appreciated.

  • Проголосовать: нравится
  • +4
  • Проголосовать: не нравится

»
5 лет назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

I guess it's a standard application of Suffix Trees. Build a suffix tree for S$b# and then query for the LCS in O(|S|+|b|) for each query. And that should be fast enough, given the constraints.

Read more about it here

»
5 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

You can use hashing. Let the original string be s.
First precompute and store hash (in a 2d vector let's say) for every substring of (size from $$$1$$$ to $$$100$$$) string s. Now for every query hash the current string, and look any of it's substring hash is present in precomputed vector of it's size.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +5 Проголосовать: не нравится

    Thanks for your reply, but isn't the complexity of this is 100 * |S| * (complexity of hashing each substring) ? Is this a right code for your solution ?

    for(int i = 0; i < S.length(); i++) {
        string tmp = "";
        for(int j = i; j < min(S.length(), i + 100); j++) {
            tmp += s[j];
            hash_and_add_to_vector(tmp);
         }
    }
    
    • »
      »
      »
      5 лет назад, # ^ |
      Rev. 2   Проголосовать: нравится +5 Проголосовать: не нравится
      Spoiler
»
5 лет назад, # |
  Проголосовать: нравится +7 Проголосовать: не нравится

Could you post the link of the problem?