Блог пользователя Enchom

Автор Enchom, 12 лет назад, По-английски

Hello everybody,

Recently I faced yet another problem and I thought asking the codeforces community is the best decision.

Lately I've been learning Ukkonen's algorithm for linear building of suffix trees. However as I was reading some applications I noticed that often you have to build a generalized suffix tree of usually 2, but sometimes more strings. In the papers about Ukkonen's algorithm I managed to find only how to build a single suffix tree.

Now my initial thought was to build a suffix tree for each string and then try to merge them in linear time, but that seemed like a very annoying to implement idea and looking around the web many people said that Ukkonen's algorithm can be used to produce generalized suffix tree.

I was wondering if someone could outline a solution that keeps the structure and suffix links correct and builds a generalized suffix tree based on Ukkonen's algorithm.

Thanks in advance! :)

  • Проголосовать: нравится
  • +3
  • Проголосовать: не нравится

»
12 лет назад, скрыть # |
 
Проголосовать: нравится +24 Проголосовать: не нравится

The "standard" (as far as I know) method is to concatenate all the strings together, separated with some special characters. Say, for two strings s1 and s2, run Ukkonen's on s1$s2.

»
12 лет назад, скрыть # |
Rev. 2  
Проголосовать: нравится 0 Проголосовать: не нравится

True but if we have "abc#" and "def$" concatenating them we get "abc#def$" which includes suffixes such as "c#def$". In most generalized suffix trees such suffixes are not present, but I guess it won't really worsen the structure, so that's a solution I will have in mind.

Edit: Meant to put the comment as a reply to gawry