MarioYC's blog

By MarioYC, 13 years ago, In English
Any idea about how to solve http://acm.tju.edu.cn/toj/showp2463.html ?
I guess dp approach doesn't work here.
Tags lcs
  • Vote: I like it
  • 0
  • Vote: I do not like it

13 years ago, # |
  Vote: I like it 0 Vote: I do not like it
Algorithm here
  • 13 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it
    It's a very clever algorithm, but I'm afraid that's not the answer for the question.
13 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Let's build suffix tree for string s = s1a1s2a2...skak, where a is set of different delimeters (si don't contains aj ). Consider leaf of builded tree. Let's define color[v] (where v is leaf) as first delimeter symbol which contains suffix of this leaf. Then if vertex has x colors in its subtree then string on this vertex is common string of x given strings. So if it contains k colors it's common string for all given strings. So we need found number of colors in subtree of each vertex. It can be solved using DFS, DSU and Tarjan algorithm. And finally we need only find the deepest vertex which contains k colors. Final complexity is O(n * α(n)), where α(n) is inverse of Ackerman function, and n is sum of strings' lengthes
  • 13 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it
    It looks like the task is to find common subsequence, not substring.
    • 13 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it
      Oh, sorry, for me abbreviation LCS is associated with a largest common substring, and I'm to lazy to follow the link
13 years ago, # |
  Vote: I like it 0 Vote: I do not like it
I've heard of this problem in computational biology. I swear it was proven to be NP-hard. I don't see any constraints that make it solvable otherwise. =/
  • 13 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it
    Are there any approaches using heuristics or a randomized algorithm?
13 years ago, # |
  Vote: I like it 0 Vote: I do not like it
There is a backtracking solution on the average case would run pretty fast given the small alphabet size. And would guarantee a quick solution if one of the strings had a small length. There also are heuristic algorithms. I keep rereading the problem because I find it unlikely that a regional would give an NP-hard problem then call it an easy computational biology problem. =/ So far I have noticed nothing significant.