storrealbac's blog

By storrealbac, history, 2 days ago, In English

Hello everyone! In this blog I want to share a simple but elegant technique: answering RMQ (Range Minimum/Maximum Queries) using binary lifting on what I'm calling the Monotonic Stack Tree.

I came up with this technique independently and I'm not sure if it has an established name. If you've seen it before under a different name, I'd love to know! For now, I'll call it the Monotonic Stack Tree since the tree structure comes directly from the classic monotone stack.

The idea is not complicated, but I find it beautiful: the next-greater-element relationships naturally form a tree, and climbing that tree with binary lifting solves RMQ in $$$O(\log n)$$$ per query. No segment tree, no sparse table. Just a stack and a lifting table.

Prerequisites

You should be comfortable with:

  • The classic stack-based algorithm for computing the Next Greater Element.
  • Binary lifting.

Building the tree

Given an array $$$a[0], a[1], \ldots, a[n-1]$$$ of elements, define $$$\text{nge}(i)$$$ as the index of the first element to the right of $$$i$$$ that is strictly greater than $$$a[i]$$$. If no such element exists, set $$$\text{nge}(i) = n$$$ (a virtual root with value $$$+\infty$$$).

Now build a rooted tree: the parent of node $$$i$$$ is $$$\text{nge}(i)$$$. This is the Monotonic Stack Tree.

For example, consider $$$a = [2, 5, 1, 4, 3, 7, 6]$$$:

Index 0 1 2 3 4 5 6
Value 2 5 1 4 3 7 6
Parent 1 5 3 5 5 $$$\infty$$$ $$$\infty$$$

The parent of each node is the index of its next greater element. Nodes 5 and 6 have no greater element to their right, so their parent is the virtual root ($$$+\infty$$$).

The resulting tree looks like this:

Monotonic Stack Tree

We compute this with the classic monotone stack in $$$O(n)$$$:

vector<int> nge(n, n);
stack<int> st;
for (int i = 0; i < n; i++) {
    while (!st.empty() && a[st.top()] < a[i]) {
        nge[st.top()] = i;
        st.pop();
    }
    st.push(i);
}

Two important properties of this tree:

  1. The path from any node to the root is strictly increasing in value. This follows directly from the definition: each parent is strictly greater than its child.

  2. The maximum of any range $$$[l, r]$$$ is an ancestor of $$$l$$$. This is the crucial insight that makes RMQ possible.

Why property 2 holds

Claim: For any range $$$[l, r]$$$, the position of the maximum element is an ancestor of $$$l$$$ in the Monotonic Stack Tree.

Proof: Let $$$m$$$ be the position of the maximum in $$$[l, r]$$$. We prove $$$m$$$ is an ancestor of $$$l$$$ by induction.

If $$$m = l$$$, we're done. Otherwise $$$m \gt l$$$ and $$$a[m] \gt a[l]$$$, so there exists a first element greater than $$$a[l]$$$ in the range $$$(l, r]$$$. That element is $$$\text{nge}(l)$$$, and by definition $$$\text{nge}(l) \leq m$$$, so $$$\text{nge}(l) \in [l+1, r]$$$.

Now the maximum of the subrange $$$[\text{nge}(l), r]$$$ is still $$$a[m]$$$ (since $$$m \geq \text{nge}(l)$$$). By induction, $$$m$$$ is an ancestor of $$$\text{nge}(l)$$$, which is the parent of $$$l$$$. So $$$m$$$ is an ancestor of $$$l$$$. $$$\blacksquare$$$

RMQ with binary lifting

Since the path from $$$l$$$ to the root is strictly increasing and $$$\max(a[l..r])$$$ is an ancestor of $$$l$$$, we just need to find the deepest ancestor of $$$l$$$ whose index is $$$\leq r$$$. That ancestor holds the maximum value.

We precompute the standard binary lifting table: $$$\text{up}[i][k]$$$ = the $$$2^k$$$-th ancestor of $$$i$$$ in the Monotonic Stack Tree. Then:

int rmq(int l, int r) {
    int v = l;
    for (int k = LOG - 1; k >= 0; k--) {
        if (up[v][k] <= r) {
            v = up[v][k];
        }
    }
    return a[v];
}

We try jumps from largest to smallest. If jumping $$$2^k$$$ levels would land us at an index $$$\leq r$$$, we take the jump. At the end, $$$v$$$ is the deepest reachable ancestor within $$$[l, r]$$$, and $$$a[v]$$$ is the answer.

For example, to answer $$$\max(a[0..4])$$$: we start at index 0 (value 2) and walk up. Its parent is index 1 (value 5), which satisfies $$$1 \leq 4$$$, so we jump there. The next parent is index 5 (value 7), but $$$5 \gt 4$$$, so we stop. The answer is $$$a[1] = 5$$$.

RMQ query walkthrough

Note: this is plain binary lifting on the Monotonic Stack Tree, we are not computing any LCA. We are simply walking up the ancestors of $$$l$$$ and stopping at the right one.

Complexity

  • Preprocessing: $$$O(n)$$$ for the stack + $$$O(n \log n)$$$ for the lifting table = $$$O(n \log n)$$$.
  • Per query: $$$O(\log n)$$$.
  • Space: $$$O(n \log n)$$$.

Comparison with other approaches

Approach Build Query Space
Sparse Table $$$O(n \log n)$$$ $$$O(1)$$$ $$$O(n \log n)$$$
Segment Tree $$$O(n)$$$ $$$O(\log n)$$$ $$$O(n)$$$
Monotonic Stack Tree + Lifting $$$O(n \log n)$$$ $$$O(\log n)$$$ $$$O(n \log n)$$$

On paper, the Monotonic Stack Tree doesn't beat sparse table or segment tree at their own game. So why care? Maybe because learning new stuff is fun :) And also because the tree structure opens up possibilities beyond plain RMQ. In the Alfajores application below, we'll see a problem where having the actual tree lets us do something that a sparse table or segment tree can't do as naturally.

Min vs max, and duplicates

Everything works symmetrically. If you build the tree using the Next Smaller Element instead (parent of $$$i$$$ is the first $$$j \gt i$$$ with $$$a[j] \lt a[i]$$$), the path from any node to the root is strictly decreasing, and you get Range Minimum Queries.

If the array can have duplicate values, use next greater or equal ($$$a[j] \geq a[i]$$$) when building the tree, with ties broken by index. The proof of property 2 goes through unchanged.

Implementation as a struct

Here is a compact, reusable struct in the style of a competitive programming library. It supports both min and max via a template parameter:

/*
 *Description:* Static RMQ via Monotonic Stack Tree + binary lifting,
 *              build O(n log n), query O(log n), positions [0, n - 1]
 *Status:* Tested on CSES Static Range Minimum Queries
 */
template<class T, bool isMin = true>
struct mono_st {
    int n; 
    vector<T> a;
    vector<array<int, 20>> up;

    mono_st() {}
    mono_st(vector<T> &v) : n(v.size()), a(v), up(n + 1) {
        auto cmp = [&](int i, int j) {
            return isMin ? a[j] < a[i] : a[j] > a[i];
        };
        vector<int> nge(n, n);
        stack<int> st;
        for (int i = 0; i < n; i++) {
            while (!st.empty() && cmp(st.top(), i)) {
                nge[st.top()] = i;
                st.pop();
            }
            st.push(i);
        }
        for (int i = 0; i < n; i++) up[i][0] = nge[i];
        up[n][0] = n;
        for (int k = 1; k < 20; k++)
            for (int i = 0; i <= n; i++)
                up[i][k] = up[up[i][k - 1]][k - 1];
    }

    T query(int l, int r) { // [l, r], 0-indexed
        int v = l;
        for (int k = 19; k >= 0; k--)
            if (up[v][k] <= r) v = up[v][k];
        return a[v];
    }
};

Usage:

mono_st<int, true> rmq(a);   // range minimum
mono_st<int, false> rmq(a);  // range maximum
cout << rmq.query(l, r);

Practice problem: Static Range Minimum Queries (CSES)

Static Range Minimum Queries

Given an array of $$$n$$$ integers, process $$$q$$$ queries: what is the minimum value in range $$$[a, b]$$$?

$$$1 \leq n, q \leq 2 \cdot 10^5$$$, $$$\quad 1 \leq x_i \leq 10^9$$$, $$$\quad 1 \leq a \leq b \leq n$$$

This is a direct application. Build the Next Smaller Element tree and answer each query with binary lifting.

Solution

Application: Alfajores with range queries

Now let's see where the Monotonic Stack Tree structure truly shines: in problems where you need to walk along the relevant elements in a range, not just find an aggregate.

The original problem

Alfajores from 2023 Argentinian Programming Tournament (TAP).

Seba has $$$X$$$ alfajores and visits $$$M$$$ offices in order. The $$$i$$$-th office has $$$E_i$$$ employees. At each office, he distributes alfajores equally, so the remainder is $$$X \bmod E_i$$$, which becomes the new $$$X$$$ for the next office. Given $$$X$$$, compute $$$X \bmod E_1 \bmod E_2 \bmod \cdots \bmod E_M$$$.

The basic insight

Taking $$$X \bmod E_i$$$ only changes $$$X$$$ when $$$E_i \leq X$$$. If $$$E_i \gt X$$$, nothing happens. Furthermore, each time $$$X$$$ actually changes, it at least halves: if $$$E_i \leq X$$$, then either $$$E_i \leq X/2$$$ so $$$X \bmod E_i \lt E_i \leq X/2$$$, or $$$X/2 \lt E_i \leq X$$$ so $$$X \bmod E_i = X - E_i \lt X/2$$$. Either way, $$$X$$$ halves. So the number of effective reductions is $$$O(\log X)$$$.

Extending to range queries

Now suppose we have $$$Q$$$ queries, each of the form: given $$$X$$$, $$$l$$$, $$$r$$$, compute $$$X \bmod E_l \bmod E_{l+1} \bmod \cdots \bmod E_r$$$.

Build the Next Smaller on the array $$$E$$$: the parent of position $$$i$$$ is the first $$$j \gt i$$$ with $$$E_j \leq E_i$$$. Precompute binary lifting on this tree.

The path from any position $$$l$$$ to the root is the sequence of decreasing elements, exactly the elements that could reduce $$$X$$$. To answer a query $$$(X, l, r)$$$, we walk along this path using binary lifting, but instead of finding the minimum, we find the first ancestor of $$$l$$$ within $$$[l, r]$$$ with value $$$\leq X$$$:

int find_next(int pos, int r, long long X) {
    if (E[pos] <= X) return pos;
    int v = pos;
    for (int k = LOG - 1; k >= 0; k--)
        if (up[v][k] <= r && E[up[v][k]] > X)
            v = up[v][k];
    int p = up[v][0];
    return (p <= r) ? p : -1;
}

The logic: we skip over ancestors with value $$$ \gt X$$$ (they can't reduce $$$X$$$), and stop at the first one with value $$$\leq X$$$.

Then the full query is:

long long solve(long long X, int l, int r) {
    int pos = l;
    while (pos <= r) {
        int p = find_next(pos, r, X);
        if (p == -1) break;
        X %= E[p];
        pos = up[p][0];
    }
    return X;
}

Each call to find_next is $$$O(\log M)$$$, and we call it at most $$$O(\log X)$$$ times (since $$$X$$$ halves each time). Total per query: $$$O(\log X \cdot \log M)$$$.

The crucial point: the path in the Monotonic Stack Tree from $$$l$$$ to the root is exactly the decreasing subsequence that matters for the modulo chain. Binary lifting lets us jump along this path efficiently, and the constraint $$$\leq r$$$ lets us restrict to any subrange.

Final thoughts

I should be honest: most problems where you'd use this technique can also be solved with a segment tree + walking (finding the first element satisfying some condition in a range). The complexity ends up being the same, and segment trees are more standard.

But I still think it's a beautiful idea. There's something satisfying about realizing that the monotone stack relationships form a tree, that paths in this tree are monotonic, and that this single observation is enough to answer RMQ. It connects two things that don't obviously belong together, and I like that.

If you know problems where this viewpoint leads to a nicer solution, or if this technique already has a name, I'd love to hear about it in the comments!

Full text and comments »

  • Vote: I like it
  • +31
  • Vote: I do not like it

By storrealbac, history, 11 days ago, In English

Hello everyone! In this blog I want to share a simple but powerful technique: counting subsequences of a string that belong to a given regular language, by adding a DFA state dimension to the DP.

The idea itself is not new, but I haven't seen it written down explicitly as a general technique. Once you see it, you'll start recognizing it in many problems.

Maybe this is well known in China, but I'm happy because I discovered it independently.

Prerequisites

You should be comfortable with:

  • Dynamic programming.

  • Basic concepts from the theory of formal languages: alphabets, regular languages, and deterministic finite automata (DFA).

A fun motivating problem

Let's start with something concrete. You have a string $$$s$$$ of length $$$n$$$ over the alphabet $$$\Sigma = {\texttt{a}, \texttt{b}, \ldots, \texttt{z}, \texttt{@}, \texttt{.}}$$$. You want to count how many subsequences of $$$s$$$ form a valid email address.

We define a "valid email" as any string in the language:

$$$L = \{ \; w_1 \texttt{@} w_2 \texttt{.} w_3 \;\mid\; w_1, w_2 \in \{\texttt{a}\text{-}\texttt{z}\}^+, \; w_3 \in \{\texttt{a}\text{-}\texttt{z}\}^3 \;\}$$$

For example, a@b.com is valid. How would you solve this?

Step 1: build the DFA

The language $$$L$$$ is regular, so we can build a DFA $$$\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$$$ that recognizes it. Here's the diagram Email DFA

Step 2: the DP

Now comes the key idea. We define:

$$$dp[i][q] = \text{number of subsequences of } s[0..i-1] \text{ that leave } \mathcal{A} \text{ in state } q$$$

The base case is $$$dp[0][q_0] = 1$$$ (the empty subsequence sits at the start state) and $$$dp[0][q] = 0$$$ for all other states.

When we go from row $$$i$$$ to row $$$i+1$$$, we consider character $$$s[i]$$$ and for each state $$$q$$$:

  • Skip $$$s[i]$$$: the subsequence doesn't change, so $$$dp[i+1][q]$$$ gets $$$dp[i][q]$$$.

  • Take $$$s[i]$$$: the subsequence extends by one character, and $$$\mathcal{A}$$$ transitions from $$$q$$$ to $$$\delta(q, s[i])$$$, so $$$dp[i+1][\delta(q, s[i])]$$$ gets $$$dp[i][q]$$$.

More precisely:

$$$dp[i+1][q] = dp[i][q] + \sum_{\substack{q' \in Q \\ \delta(q', s[i]) = q}} dp[i][q']$$$

The answer is $$$\sum_{q \in F} dp[n][q]$$$.

// dp[i][q] = number of subsequences of s[0..i-1] that leave the DFA in state q
vector<vector<long long>> dp(n + 1, vector<long long>(8, 0));
dp[0][0] = 1;

for (int i = 0; i < n; i++) {
    for (int q = 0; q < 8; q++) {
        // skip s[i]: carry over
        dp[i + 1][q] = (dp[i + 1][q] + dp[i][q]) % MOD;

        // take s[i]: transition
        int nq = delta[q][s[i]];
        if (nq != DEAD) {
            dp[i + 1][nq] = (dp[i + 1][nq] + dp[i][q]) % MOD;
        }
    }
}

cout << dp[n][7] << "\n"; // accepting state

Done. $$$O(n \cdot 8)$$$ time, and the table $$$dp[i][q]$$$ makes the structure completely explicit: row $$$i$$$ summarizes everything about the first $$$i$$$ characters.

The general technique

There's nothing special about emails. The same idea works for any regular language $$$L$$$ recognized by a DFA $$$\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$$$.

We define:

$$$dp[i][q] = \text{number of subsequences of } s[0..i-1] \text{ that leave } \mathcal{A} \text{ in state } q$$$

Base case: $$$dp[0][q_0] = 1$$$.

Transition: for each position $$$i$$$ and each state $$$q$$$, we can skip $$$s[i]$$$ (carry $$$dp[i][q]$$$ to $$$dp[i+1][q]$$$) or take $$$s[i]$$$ (add $$$dp[i][q]$$$ to $$$dp[i+1][\delta(q, s[i])]$$$).

Answer: $$$\sum_{q \in F} dp[n][q]$$$.

Complexity: $$$O(n \cdot |Q|)$$$ time, $$$O(n \cdot |Q|)$$$ space (or $$$O(|Q|)$$$ if you only keep two consecutive rows).

That's the whole technique. The rest of this blog is about recognizing where to apply it.

Why does this work?

Each subsequence of $$$s$$$ is itself a string over $$$\Sigma$$$. We want to count how many of those strings belong to $$$L$$$. The DP simulates $$$\mathcal{A}$$$ on all $$$2^n$$$ possible subsequences simultaneously, grouping them by their current DFA state. This is the product construction: the intersection of the "language of all subsequences of $$$s$$$" with $$$L$$$, done implicitly through the DP.

Composing constraints

What if your problem has two constraints on the subsequence?

  • If both are regular: build the product automaton (states are pairs), or equivalently, add both state dimensions to the DP.

  • If one is regular and one is not (e.g., context-free): track the regular part with a DFA state, and the non-regular part with whatever other DP dimension it needs.

The dimensions just multiply. This is where the technique really shines.

Application: counting C comments

Here's another example. Given a string $$$s$$$ of length $$$n$$$ over the alphabet $$$\Sigma = {\texttt{/}, \texttt{*}, \texttt{a}\text{-}\texttt{z}}$$$, count the number of subsequences that form a valid C-style comment, that is, strings in the language:

$$$L_{\text{comment}} = \{\; \texttt{/} \texttt{*} \, w \, \texttt{*} \texttt{/} \;\mid\; w \in \Sigma^* \;\}$$$

For example, /* hello */ is valid. The subsequence must start with /*, end with */, and can have anything in between.

The DFA has 5 states. Here is the diagram: Comment DFA

We plug this directly into our technique:

$$$dp[i][q] = \text{number of subsequences of } s[0..i-1] \text{ that leave } \mathcal{A} \text{ in state } q$$$

And the answer is $$$dp[n][4]$$$. Five states, one pass, $$$O(5n)$$$. Done.

Now, what if we wanted to count subsequences that are both a balanced bracket sequence and contain a C comment? Suppose $$$\Sigma = {\texttt{(}, \texttt{)}, \texttt{/}, \texttt{*}}$$$. We just compose: track balance $$$b$$$ for the bracket structure and DFA state $$$q$$$ for the comment pattern:

$$$dp[i][b][q]$$$

The answer is $$$dp[n][0][4]$$$. That's it: composing constraints costs nothing beyond multiplying dimensions.

Application: Sub-RBS (hard version)

Let's see this in action on a harder problem. Consider B2. Sub-RBS (Hard Version):

Given a bracket sequence $$$s$$$ of length $$$n \leq 100$$$. For each subsequence $$$t$$$ of $$$s$$$, define its score as the length of the longest regular bracket subsequence of $$$t$$$ that is "better" than $$$t$$$ (under a custom ordering where ( $$$ \lt $$$ )), or $$$0$$$ if no such subsequence exists or $$$t$$$ is not a regular bracket sequence. Find the sum of scores over all non-empty subsequences, modulo $$$998244353$$$.

The key insight: a regular bracket sequence $$$t$$$ admits a "better" regular subsequence if and only if $$$t$$$ contains the subsequence $$$\texttt{)((}$$$, a ) followed (not necessarily immediately) by two (s. In formal terms, $$$t$$$ belongs to the language:

$$$L_{\text{pattern}} = \Sigma^* \texttt{)} \, \Sigma^* \texttt{(} \, \Sigma^* \texttt{(} \, \Sigma^*$$$

where $$$\Sigma = {\texttt{(}, \texttt{)}}$$$. This is a regular language! A 5-state DFA captures it. Here's the diagram: Sub RBS Automaton

Now we need subsequences that satisfy two conditions simultaneously:

  1. Regular bracket sequence: this is context-free (not regular), but we can track it with a "balance" counter $$$b$$$ bounded by $$$n$$$.
  2. Contains the $$$\texttt{)((}$$$ pattern: regular, tracked by our 5-state DFA.

Our DP becomes:

$$$dp[i][b][q]$$$

where $$$dp[i][b][q]$$$ counts the number of subsequences of $$$s[0..i-1]$$$ with bracket balance $$$b$$$ and DFA state $$$q$$$. We process each character choosing to take or skip, updating both dimensions simultaneously. At the end, we look at $$$dp[n][0][3]$$$: balance $$$= 0$$$ (valid bracket sequence) and DFA state $$$= 3$$$ (pattern confirmed).

But the problem doesn't just ask us to count these subsequences, it asks for the sum of their scores, where the score of a qualifying subsequence of length $$$\ell$$$ is $$$\ell - 2$$$. So we need to track not only how many subsequences reach each state, but also the sum of their lengths.

We add one more dimension: for each $$$(i, b, q)$$$ triplet, we store two values: - $$$dp[i][b][q][0]$$$ = the count of subsequences of $$$s[0..i-1]$$$ with balance $$$b$$$ and DFA state $$$q$$$. - $$$dp[i][b][q][1]$$$ = the sum of lengths of those subsequences.

When we extend a subsequence by taking a character, its length increases by 1. So if we have $$$\text{cnt}$$$ subsequences with total length sum $$$\text{sum}$$$, after extending all of them by one character, the new sum becomes $$$\text{sum} + \text{cnt}$$$ (each of the $$$\text{cnt}$$$ subsequences grew by 1).

At the end, the answer is:

$$$\sum_{\text{qualifying}} (\ell - 2) = dp[n][0][3][1] - 2 \cdot dp[n][0][3][0]$$$

The complexity is $$$O(n \cdot n \cdot 5) = O(n^2)$$$.

Solution code for Sub-RBS (hard version)

Code

Final thoughts

Thanks for reading! This is my first blog, so I hope it was useful! I hope this technique becomes a useful tool in your problem-solving toolbox. The recipe is always the same: spot the regular constraint, draw the DFA, add the dimension.

If you found any mistakes, have suggestions to improve the blog, or know other problems where this technique applies, please let me know in the comments. I'd also be curious to hear if this idea has been written up before somewhere else, I'd love to link to it.

Full text and comments »

  • Vote: I like it
  • +55
  • Vote: I do not like it

By storrealbac, history, 7 months ago, In English

Hola Codeforces community!

Last semester I was doing an exchange program abroad, and my ICPC team back home couldn't meet in person for practice sessions.

Normally, when teams practice together in person, they use a simple physical sheet of paper to track problems — you know, just writing down who's working on what, crossing out solved problems, adding quick notes. It's fast, everyone can see it, and it works perfectly.

But when we had to practice remotely, we needed a digital alternative. Like most teams in this situation, we started using a shared Google Sheets to track problem status during virtual contests.

You know, a simple spreadsheet with columns like this:

Example

It worked... sort of. But Google Sheets wasn't really designed for this. I'd find myself spending way too much time adjusting column widths, making sure text fit properly, formatting cells, and dealing with all the spreadsheet overhead that had nothing to do with solving problems.

What is lahojita?

lahojita is a real-time team manager that runs entirely in your browser. No accounts, no servers, no spreadsheet formatting headaches. Just pure focus on what matters: solving problems as a team.

The interface is simple — think of it as a digital version of that physical sheet of paper, but with the benefits of being online and shareable.

lahojita is completely free with no ads, no registration required, and no premium plans. The P2P (peer-to-peer) architecture means your team's data never touches my servers (In fact there are no servers) — it goes directly between your browsers using WebRTC. I literally can't see what problems you're working on or spy on your team strategy, even if I wanted to.

However, there's an important privacy consideration: since team members connect directly to each other, they can potentially see each other's IP addresses. This is inherent to how P2P connections work. Only share your room code with people you trust.

This is just a weekend side project I put together in my spare time, so you’ll probably run into a few bugs and rough edges. I’d really appreciate any feedback.

With more time and effort, this could definitely be improved. If there’s enough interest from the community, I’d be happy to open-source the code and maintain it properly. For now, it’s just a proof of concept to test whether it actually helps solve a real problem for other teams.

Let me know what you think! Does it make remote team coordination any less painful? What features do you feel are missing? And what might I be approaching completely wrong?

Try it out at: https://storrealbac.github.io/lahojita-p2p/

Full text and comments »

  • Vote: I like it
  • +20
  • Vote: I do not like it