Blog entries - Codeforces

skloj's blog

Criteria to decide when a Suffix Tree is required (instead of a suffix array)

By skloj, history, 5 years ago, In English

I was curious to know whether anyone uses suffix trees instead of suffix arrays. I settled on suffix arrays because I was not able to implement a fast suffix tree, but I am not sure whether people here have had a different experience. For instance, is there a problem that can be solved only with a suffix tree and not with a suffix array?

Thank you,


#suffix tree, #suffix_array


A faster Python with Bitset and BST in the stdlib — F#

By skloj, history, 5 years ago, In English

F# feels as high-level as Python, but it is faster than PyPy thanks to static typing (and even faster than Java thanks to native support for value types). Its standard library provides additional data structures such as BitArray (a bitset) and a BST. F# is open source, runs on Linux and macOS, and has the IDE support you would expect from a Microsoft programming language.

I have been using the language on other sites (Kattis, HackerRank, AtCoder, Exercism...) and I can say it is a wonderful tool for competitive programming.

I see that .NET Core is already supported on Codeforces for C#, so the infrastructure for F# is already in place. It would be great to be allowed to use the language in competitions, thanks!


f#, .net, #python, #language


Implementation of Suffix Tree (Ukkonen)

By skloj, history, 6 years ago, In English

After trying many implementations of suffix trees and suffix arrays in Python, the fastest I managed to get was based on adamant's version of Ukkonen's algorithm: https://mirror.codeforces.com/blog/entry/16780

Here is my Python version, enhanced with memoization (I tested it with UVA 10679, "I Love Strings!!"). I am happy to receive feedback on how to improve the algorithm:

from sys import stdin, stdout, stderr, setrecursionlimit
from functools import lru_cache
setrecursionlimit(100000)

def read():
    return stdin.readline().rstrip()

def readint():
    return int(read())

def make_node(_pos, _len):
    global s, n, sz, to, link, fpos, slen, pos, node
    fpos[sz] = _pos
    slen[sz] = _len
    sz += 1
    return sz-1

def go_edge():
    global s, n, sz, to, link, fpos, slen, pos, node
    while (pos > slen[to[node].get(s[n - pos], 0)]):
        node = to[node].get(s[n - pos], 0)
        pos -= slen[node]

def add_letter(c):
    global s, n, sz, to, link, fpos, slen, pos, node
    s[n] = c
    n += 1
    pos += 1
    last = 0
    while(pos > 0):
        go_edge()
        edge = s[n - pos]
        v = to[node].get(edge, 0)
        t = s[fpos[v] + pos - 1]
        if (v == 0):
            to[node][edge] = make_node(n - pos, inf)
            link[last] = node
            last = 0
        elif (t == c):
            link[last] = node
            return
        else:
            u = make_node(fpos[v], pos - 1)
            to[u][c] = make_node(n - 1, inf)
            to[u][t] = v
            fpos[v] += pos - 1
            slen[v] -= pos - 1
            to[node][edge] = u
            link[last] = u
            last = u
        if(node == 0):
            pos -= 1
        else:
            node = link[node]

def init_tree(st):
    global ans, inf, maxn, s, to, fpos, slen, link, node, pos, sz, n
    inf = int(1e9)
    maxn = len(st)*2+1 #int(1e6+1)
    s = [0]*maxn
    to = [{} for i in range(maxn)]
    fpos, slen, link = [0]*maxn, [0]*maxn, [0]*maxn
    node, pos = 0, 0
    sz = 1
    n = 0
    slen[0] = inf
    ans = 0
    for c in st:
        add_letter(ord(c))

def traverse_edge(st, idx, start, end):
    global len_text, len_st
    k = start
    while k <= end and k < len_text and idx < len_st:
        if text[k] != st[idx]:
            return -1
        k += 1
        idx += 1
    if idx == len_st:
        return idx
    return 0

def edgelen(v, init, e):
    if(v == 0):
        return 0
    return e-init+1

@lru_cache(maxsize=10000001)
def traverse(v, st, idx):
    global len_st
    r = -1
    init = fpos[v]
    end = fpos[v]+slen[v]
    e = end-1
    if v != 0:
        r = traverse_edge(st, idx, init, e)
        if r != 0:
            if r == -1:
                return []
            return [r]
    idx = idx + edgelen(v, init, e)
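    # idx == len_st would make st[idx] below raise IndexError
    # (only reachable with an empty query), so stop here as well.
    if idx >= len_st:
        return []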
    k = ord(st[idx])
    children = to[v]
    if k in children:
        vv = children.get(k, 0)
        return traverse(vv, st, idx)
    return []

@lru_cache(maxsize=1001*10)
def solve(T, query):
    traverse.cache_clear()
    return "y\n" if traverse(0, query, 0) else "n\n"

def main():
    global text, len_st, len_text
    k = readint()
    for ki in range(k):
        text = read()+"$"
        len_text = len(text)
        init_tree(text)
        q = readint()
        for qi in range(q):
            query = read()
            len_st = len(query)
            stdout.write(solve(text, query))

main()
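
To cross-check the answers of the suffix tree above on small inputs, a suffix-array baseline can be sketched as follows. This is not part of the original solution: `build_suffix_array` and `contains` are hypothetical helper names, and the construction is deliberately naive (it sorts whole suffixes), so it is only suitable for testing, not for the time limits of UVA 10679.

```python
def build_suffix_array(text):
    # Naive construction (assumption: small inputs only): sort the start
    # positions of all suffixes by the suffix they point to.
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text, sa, query):
    # Binary search for the first suffix whose prefix of length len(query)
    # is >= query; truncated prefixes keep the sorted order of the suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(query)] < query:
            lo = mid + 1
        else:
            hi = mid
    # The query occurs in text iff that suffix actually starts with it.
    return lo < len(sa) and text.startswith(query, sa[lo])
```

Comparing `contains(text, sa, query)` with the "y"/"n" answers of the tree on random short strings is a cheap way to catch indexing mistakes in the Ukkonen code.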


#suffix tree, #python, #strings


Problems about online-construction of suffix structures

By skloj, history, 6 years ago, In English

Does anyone know of problems where online construction of suffix structures is required? I am looking for some practice with Ukkonen's algorithm. Thanks!


#suffix tree, ukkonen


How common is the need of online-construction of suffix structures?

By skloj, history, 10 years ago, In English

I was just reading about a new algorithm to build a Suffix Tree: Simplified Weiner

The algorithm is very short and it makes a ton of sense to me (more than Ukkonen). But the algorithm has a drawback: "It is online from right to left but not from left to right. This is not an issue if the problem you solve is offline. Moreover, online problems often can be solved using reversed strings".

To me, the choice between Simplified Weiner and Ukkonen comes down to how often online construction of suffix structures is needed in contests. As far as I understand, suffix arrays do not support online building, and everyone seems to use them anyway.

What is your opinion about it? Thanks,
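
The remark about reversed strings in the quote can be illustrated with a small sketch. Appending a character to s is the same as prepending it to the reverse of s, and a pattern occurs in s exactly when the reversed pattern occurs in the reversed text. In the code below, a plain Python string stands in for the suffix tree, and `occurs_after_each_append` is a hypothetical name; it shows only the index mapping, not Weiner's algorithm itself.

```python
def occurs_after_each_append(stream, pattern):
    """For each prefix of `stream`, report whether `pattern` occurs in it.

    The text is only ever extended on the LEFT, which is the direction a
    right-to-left online builder supports.
    """
    rev_text = ""
    rev_pattern = pattern[::-1]
    answers = []
    for ch in stream:
        # Appending ch to the original text == prepending ch to its reverse.
        rev_text = ch + rev_text
        # Stand-in for a suffix-tree query on the reversed text.
        answers.append(rev_pattern in rev_text)
    return answers


print(occurs_after_each_append("abcab", "ca"))
# prints: [False, False, False, True, True]
```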


string suffix structures, suffix array, suffix tree


Wrong ordering in binaryheap

By skloj, history, 10 years ago, In English

I am using the BinaryHeap offered by the D language:

import std.stdio, std.container;
void main() {
  auto q = heapify(Array!int([44, 22, 100, -1, 0, 5, 6, 7]));
  while (!q.empty) {
    write(q.front, " ");
    q.removeFront;
  }
}

On my machine (and even on ideone) I got the elements sorted. But on Codeforces I got this strange ordering:

44 100 22 7 6 5 0 -1

===== Used: 0 ms, 2856 KB

The Codeforces version of D (v2.069.2) is even newer than the one on ideone (v2.067.1), but it is older than my installation (v2.071.0).

Do you know what the reason is? Thank you very much.
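
For reference, here is what draining a binary max-heap of the same values should produce, sketched in Python rather than D. It assumes that BinaryHeap with its default predicate behaves as a max-heap, which matches the sorted output seen locally. `drain_max_heap` is a hypothetical helper name.

```python
import heapq


def drain_max_heap(values):
    # heapq implements a min-heap, so negated values are stored to get
    # max-heap behaviour.
    heap = [-v for v in values]
    heapq.heapify(heap)
    out = []
    while heap:
        # Each pop removes the current maximum of the remaining values.
        out.append(-heapq.heappop(heap))
    return out


print(*drain_max_heap([44, 22, 100, -1, 0, 5, 6, 7]))
# prints: 100 44 22 7 6 5 0 -1
```

The Codeforces output above differs from this only in its first three elements.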


dlang
