Simple rmq O(n)/O(1) method and its improved version that supports updates(with other asymptotic)

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

RMQ $$$O(n)/O(1)$$$

Problem

For convenience, we will assume that the array numbering starts from zero.

Let's split our array into blocks of $$$\lceil\log_{2}(n)\rceil$$$ numbers. Let's denote $$$bl = \lceil\log_{2}(n)\rceil$$$. The first block contains numbers from $$$a_{0}$$$ to $$$a_{bl - 1}$$$, the second from с $$$a_{bl}$$$ to $$$a_{2 * bl - 1}$$$ and so on, the last block may contain less than $$$bl$$$ numbers.

Thus, we got $$$\lceil\frac{n}{bl}\rceil$$$ blocks. Let's learn how to find the minimum for a segment that is located entirely inside the block.

Let us consider blocks independently, in each block we will go from left to right and will maintain stack of minimums. For each boundary $$$r$$$ we will save the stack of minimums for a segment from beginning of the block, in which $$$r$$$ is located, to $$$r$$$. We will keep the stack of minimums as a mask of zeroes and ones, the $$$ith$$$ bit will contain $$$1$$$, if the $$$ith$$$ number in the block is now located in the stack of minimums.

Suppose now we want to find the minimum in the segment from $$$l$$$ to $$$r$$$, while $$$l$$$ is in the same block with $$$r$$$. Note that the minimum is at the position of the leftmost bit $$$1$$$ located to the right of the $$$lth$$$ bit(or $$$lth$$$ bit) from the stack of minimums, which we keep in $$$r$$$. Let's denote the mask of stack of minimums located in $$$r$$$ as mask. Then the bit $$$1$$$ that we need is the first bit in $$$mask » (l - start_{l})$$$, where $$$start_{l}$$$ is the start of the block. $$$start_{l} = \lfloor\frac{l}{bl}\rfloor * bl$$$. The count of different masks is $$$2^{bl}$$$ < $$$2 * n$$$, so with the help of dynamic programming we can pre-calculate the index of its leftmost bit $$$1$$$ for each mask.

Thus, we made a pre-calculation $$$O(n)$$$ and are able to find the mininmum on the segment inside the block in $$$O(1)$$$. Now we need to learn how to look for the minimum if $$$l$$$ and $$$r$$$ are in different blocks. Then the minimum we need is $$$min(getmin(l, end_{l}), getmin(start_{r}, r), getmin(end_{l} + 1, start_{r} - 1))$$$. We know how to search for the first 2 values, since the boundaries are in one block, and the third value is the minimum on the segment of blocks. Then, let's find the minimum in each block and build sparse table on array of these minimums. Pre-calculation of sparse table takes $$$O(\lceil\frac{n}{bl}\rceil * log_{2}(\lceil\frac{n}{bl}\rceil))$$$, that is $$$O(n)$$$.

We learnt how to find the minimum on segment in $$$O(1)$$$ based on the $$$O(n)$$$ pre-calculation.

For $$$n = 10^6$$$ my implementation of this algorithm works as fast as the non-recursive segment tree.

Code

#include <bits/stdc++.h>

using namespace std;
const int INF = 1e9;

inline int get_min_sparse_table(int l, int r, vector <vector <int>>& st, vector <int>& max_st2) {
    if (l > r) return INF;
    int len = r - l + 1;
    return min(st[max_st2[len]][l], st[max_st2[len]][r - (1 << max_st2[len]) + 1]);
}

inline int get_min_in_block(vector <int>& a, int l, int r, vector <int>& first_bit, vector <int>& stack_min_mask, int left_block) {
    return a[first_bit[(stack_min_mask[r] >> (l - left_block))] + l];
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(nullptr);
    cout.tie(nullptr);
    int n, m;
    cin >> n >> m;
    vector <int> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];
    int lg = (int)ceil(log2(n));
    vector <int> first_bit(1 << lg);
    for (int i = 0; i < lg; ++i) {
        for (int mask = 1; mask < (1 << i); ++mask) {
            first_bit[mask + (1 << i)] = first_bit[mask];
        }
        first_bit[(1 << i)] = i;
    }
    vector <int> all_blocks_min;
    vector <int> stack_min_mask(n);
    vector <int> stack_min;
    for (int i = 0; i < n; i += lg) {
        stack_min.clear();
        int cur_min = a[i];
        for (int j = i; j < min(i + lg, n); ++j) {
            cur_min = min(cur_min, a[j]);
        }
        all_blocks_min.push_back(cur_min);
        int cur_mask = 0;
        for (int j = i; j < min(i + lg, n); ++j) {
            while (!stack_min.empty() && a[stack_min.back()] >= a[j]) {
                cur_mask -= (1 << (stack_min.back() - i));
                stack_min.pop_back();
            }
            cur_mask += (1 << (j - i));
            stack_min_mask[j] = cur_mask;
            stack_min.push_back(j);
        }
    }
    int new_len = (int)all_blocks_min.size();
    int new_lg = (int)ceil(log2(new_len));
    vector <vector <int>> st(new_lg, vector <int> (new_len));
    for (int len = 0; len < new_lg; ++len) {
        for (int start = 0; start + (1 << len) - 1 < new_len; ++start) {
            if (len == 0) st[len][start] = all_blocks_min[start];
            else st[len][start] = min(st[len - 1][start], st[len - 1][start + (1 << (len - 1))]);
        }
    }
    vector <int> max_st2(new_len + 1);
    for (int i = 2; i <= new_len; ++i) {
        max_st2[i] = max_st2[i / 2] + 1;
    }
    while (m--) {
        int l, r;
        cin >> l >> r; --l; --r;
        int block_l = l / lg, block_r = r / lg;
        if (block_l == block_r) cout << get_min_in_block(a, l, r, first_bit, stack_min_mask, block_l * lg) << "\n";
        else {
            cout << min(min(get_min_in_block(a, l, lg * (block_l + 1) - 1, first_bit, stack_min_mask, block_l * lg),
                            get_min_in_block(a, block_r * lg, r, first_bit, stack_min_mask, block_r * lg)),
                        get_min_sparse_table(block_l + 1, block_r - 1, st, max_st2)) << "\n";
        }
    }
}

An improved version of this algorithm, supporting updates

Problem

Recently I thought about how this structure can be changed and came up with idea that instead of sparse table on minimums in blocks, I can once again break minimums into blocks by $$$log_{2}(n)$$$, calculate stacks of minimums and again build this structure. In fact, we will have severals levels of the structure, at each it is necessary to support the needed masks and block sizes. We will reduce the length of the next level as long as the count of the numbers on level is greater than $$$2$$$. On each level the count of numbers is reduced by $$$log_{2}(len)$$$ times, where $$$len$$$ is the length of the array on last level. At each level pre-calculation is lineary.

Thus, let $$$f(n) = f(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + n$$$, then pre-calculation requires $$$O(f(n))$$$.

Let's consider an update request: The stack of minimums changed in one block at the longest level, so let's recalculate it simply, then the minimum in the block can change, so we need to recalculate the stack of minimums in one block at the higher level and so on. Thus, let $$$g(n) = g(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + \log_{2}(n)$$$, then an update request takes $$$O(g(n))$$$.

Let's consider a request for a minimum on a segment: If the boundaries are located in one block at the longest level, then we simply find the answer. Otherwise, we need to find the minimum in small subsegment on the left and on the right (in $$$O(1)$$$) and on the segment of blocks. Thus, let $$$t(n) = t(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + 1$$$, then minimum on a segment request takes $$$O(t(n))$$$.

Thus, update operation takes longer than $$$O(log_{2}(n))$$$, but faster than $$$O(log_{2}(n)^2)$$$(because the number of levels is not more than $$$log_{2}(n)$$$ and at each level the length of block is not more than $$$log_{2}(n)$$$. I could not give a more accurate estimate.

The get operation is faster than $$$O(log_{2}(n))$$$, since each time the length of the level is reduced by at least two times $$$2$$$. Again, i don't know how to estimate this more accurately, but i completed several tests.

For $$$n = 10^6$$$, this structure takes on average $$$1250ms$$$, recursive segment tree takes $$$850ms$$$, non-recursive segment tree $$$680ms$$$.

When the number of get requests is $$$100$$$ times more than the number of update requests, this structure takes on average $$$1170ms$$$, recursive segment tree $$$1600ms$$$, non-recursive segment tree $$$1200ms$$$. I don't know why the running time of my structure has not changed much, most likely this is because of a constant, but in theory it should work much faster, since for $$$n = 10^6$$$, $$$t(n) = 6$$$, $$$g(n) = 65$$$. $$$f(n) = 1053421$$$, that is almost $$$10^6$$$. The tests I did are random.

Code

#include <bits/stdc++.h>

using namespace std;
int INF = 1e9;

signed main() {
    if (1) {
        ios_base::sync_with_stdio(false);
        cin.tie(nullptr);
        cout.tie(nullptr);
    }
    int n, m;
    cin >> n >> m;
    vector <int> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];
    vector <int> block_sizes;
    vector <vector <int>> all_levels_values;
    vector <vector <int>> all_stack_min_masks;
    int max_lg = (int)ceil(log2(n));
    vector <int> pre_calc((1 << max_lg));
    for (int i = 0; i < max_lg; ++i) {
        for (int mask = 1; mask < (1 << i); ++mask) {
            pre_calc[mask + (1 << i)] = pre_calc[mask];
        }
        pre_calc[1 << i] = i;
    }
    all_levels_values.push_back(a);
    block_sizes.push_back(max_lg);
    while (true) {
        all_stack_min_masks.push_back({});
        for (int i = 0; i < all_levels_values.back().size(); i += block_sizes.back()) {
            int cur_mask = 0;
            vector <int> stack_min;
            for (int j = i; j < min((int)all_levels_values.back().size(), i + block_sizes.back()); ++j) {
                while (!stack_min.empty() && all_levels_values.back()[stack_min.back()] >= all_levels_values.back()[j]) {
                    cur_mask -= (1 << (stack_min.back() - i));
                    stack_min.pop_back();
                }
                stack_min.push_back(j);
                cur_mask += (1 << (j - i));
                all_stack_min_masks.back().push_back(cur_mask);
            }
        }
        if (all_levels_values.back().size() > 1) {
            vector <int> now = all_levels_values.back();
            all_levels_values.push_back({});
            for (int i = 0; i < now.size(); i += block_sizes.back()) {
                int minn = now[i];
                for (int j = i; j < min((int)now.size(), i + block_sizes.back()); ++j) minn = min(minn, now[j]);
                all_levels_values.back().push_back(minn);
            }
            block_sizes.push_back(max((int)ceil(log2((int)all_levels_values.back().size())), 2));
        } else break;
    }
    while (m--) {
        int type;
        cin >> type;
        if (type == 1) {
            int i, x;
            cin >> i >> x; --i;
            for (int lvl = 0; lvl < (int)all_levels_values.size(); ++lvl) {
                all_levels_values[lvl][i] = x;
                int start = i / block_sizes[lvl] * block_sizes[lvl];
                vector <int> stack_min;
                int cur_mask = 0, minn = INF;
                for (int j = start; j < min((int)all_levels_values[lvl].size(), start + block_sizes[lvl]); ++j) {
                    while (!stack_min.empty() && all_levels_values[lvl][stack_min.back()] >= all_levels_values[lvl][j]) {
                        cur_mask -= (1 << (stack_min.back() - start));
                        stack_min.pop_back();
                    }
                    minn = min(minn, all_levels_values[lvl][j]);
                    stack_min.push_back(j);
                    cur_mask += (1 << (j - start));
                    all_stack_min_masks[lvl][j] = cur_mask;
                }
                x = minn;
                i /= block_sizes[lvl];
            }
        } else {
            int l, r;
            cin >> l >> r; --l; --r;
            if (l == 0 && r == n - 1) {
                cout << all_levels_values.back().back() << "\n";
                continue;
            }
            int minn = INF;
            for (int lvl = 0; lvl < (int)all_levels_values.size(); ++lvl) {
                if (l > r) break;
                int start_l = l / block_sizes[lvl] * block_sizes[lvl];
                int start_r = r / block_sizes[lvl] * block_sizes[lvl];
                if (start_l == start_r) {
                    int new_mask = (all_stack_min_masks[lvl][r] >> (l - start_l));
                    int ans = all_levels_values[lvl][start_l + pre_calc[new_mask] + l - start_l];
                    minn = min(minn, ans);
                    break;
                }
                int new_mask = (all_stack_min_masks[lvl][start_l + block_sizes[lvl] - 1] >> (l - start_l));
                minn = min(minn, all_levels_values[lvl][start_l + pre_calc[new_mask] + l - start_l]);
                new_mask = all_stack_min_masks[lvl][r];
                minn = min(minn, all_levels_values[lvl][start_r + pre_calc[new_mask]]);
                l /= block_sizes[lvl];
                r /= block_sizes[lvl];
                ++l;
                --r;
            }
            cout << minn << "\n";
        }
    }
}

Comments (10)

Write comment?

allvik66

3 years ago, # |

Auto comment: topic has been updated by daubi (previous revision, new revision, compare).

→ Reply

Makise_Kurisu

+24

Finally, an educational blog that makes sense to me. Thanks!!

sunboy99

This happens and tutorial comes out.. what are the odds though :P

tfg

First part of this blog is described in https://mirror.codeforces.com/blog/entry/78931.

okwedook

+15

Here's my code for the static data structure above. It's not perfect, but it's pretty fast.

As for the dynamic structure. $$$f(n)$$$ is obv linear. My scientific guess is that $$$g(n)=log(n)\cdot t(n)$$$ and $$$t(n)=log^*(n)$$$.

3 years ago, # ^ |

Yes, thanks, I agree about linearity of $$$f(n)$$$, but the second statement does not work if you take a large enough $$$n$$$ and look at these functions.

clyring

FWIW, the number of 'levels' $$$t(n)$$$ is $$$\Theta(\frac{\log{n}}{\log{\log{n}}})$$$, and likewise $$$g(n)$$$ is in $$$\Theta(\frac{(\log{n})^2}{\log{\log{n}}})$$$.

Proof sketch for t(n)

mango_lassi

← Rev. 2 →

+25

For g's complexity, going from $$$n$$$ to $$$\sqrt{n}$$$ adds $$$\mathcal{O}\left(\log_2(n) \log_{\log_2(n)}(n)\right)$$$ or $$$\frac{\log_2(n)^2}{\log_2 \log_2(n)}$$$. Note that $$$\log_2(\sqrt{n}) = \frac{1}{2} \log_2(n)$$$, so the next term $$$\frac{\log_2(\sqrt{n})^2}{\log_2 \log_2(\sqrt{n})}$$$ is at most $$$\frac{1}{2}$$$ the first term. Thus, the complexity is $$$\mathcal{O}\left(\frac{\log_2(n)^2}{\log_2 \log_2(n)}\right)$$$. Similarly, the complexity of t is $$$\mathcal{O}\left(\frac{\log_2(n)}{\log_2 \log_2(n)}\right)$$$.

Edit: sniped :)

allvik66's blog