Offline Range MEX queries in O(log n)

Consider the following problem:

You are given an array $$$a$$$ of size $$$n$$$ and $$$q$$$ queries of the form $$$(l_1, r_1), (l_2, r_2), ..., (l_q, r_q)$$$. For each query $$$(l, r)$$$ you have to find the MEX of the subarray $$$a_l, a_{l+1}, ..., a_r$$$. The queries are offline.

$$$0 \le a_1, a_2, ..., a_n \le n$$$

$$$1 \le l_i \le r_i \le n$$$, for all $$$1 \le i \le q$$$

The approach to the problem is discussed in the comment here.

First let's consider a special case of the queries where $$$r = n$$$. In this case we want to find the smallest non-negative integer that does not occur in the subarray $$$a_l, ..., a_n$$$. That is, the smallest non-negative integer that does not occur at index $$$l$$$ or after it.

For all $$$0 \le x \le n$$$, let $$$b_x$$$ be the greatest index $$$i$$$ such that $$$a_i = x$$$, and $$$0$$$ if no such $$$a_i$$$ exists.

Now the MEX, say $$$m$$$, of the subarray $$$a_l, ..., a_n$$$ either does not occur in $$$a$$$ at all, in which case $$$b_m = 0$$$, or its last occurrence in $$$a$$$ is before $$$l$$$, so $$$b_m \lt l$$$. Either way $$$b_m \lt l$$$ (since $$$l \ge 1$$$), and conversely every $$$x$$$ with $$$b_x \ge l$$$ occurs in the subarray. So $$$m$$$ is the minimum of all such numbers: $$$m = \min \{ i \mid b_i \lt l \}$$$.
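As a quick sanity check of this formula, here is a minimal sketch for the special case $$$r = n$$$. The function name `suffixMex` and the convention that `a[0]` is unused padding are my own assumptions for illustration, and it scans $$$b$$$ linearly instead of using any data structure.

```cpp
#include <vector>

// Hypothetical helper: MEX of a[l..n] (1-indexed) via the last-occurrence array b.
// Assumes 0 <= a[i] <= n and that a[0] is unused padding.
int suffixMex(const std::vector<int>& a, int l) {
    int n = (int)a.size() - 1;
    std::vector<int> b(n + 1, 0);        // b[x] = last index i with a[i] == x, or 0
    for (int i = 1; i <= n; ++i) b[a[i]] = i;
    for (int x = 0; x <= n; ++x)
        if (b[x] < l) return x;          // smallest x whose last occurrence is before l
    return n + 1;                        // not reached: n + 1 values, only n positions
}
```

For $$$a = [1, 0, 2, 0, 1]$$$ this gives $$$3$$$ for $$$l = 1$$$, $$$2$$$ for $$$l = 4$$$ and $$$0$$$ for $$$l = 5$$$.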

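Now since the queries are offline, we can sort the queries by $$$r$$$ and process them in that order, so that at each step we can ignore the subarray $$$a_{r + 1}, ..., a_n$$$. We iterate over $$$r = 1, 2, ..., n$$$, and for each $$$r$$$ we process all queries $$$(l_i, r_i)$$$ where $$$r_i = r$$$. At that moment $$$b$$$ only contains last occurrences within the prefix $$$a_1, ..., a_r$$$, so each such query looks exactly like the special case above with $$$n$$$ replaced by $$$r$$$.

Now how do we efficiently calculate the answer for a query $$$(l, r)$$$ on the current prefix?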
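To make the sweep concrete, here is a rough reference sketch that groups the queries by $$$r$$$ and answers each one with a plain linear scan over $$$b$$$. The name `offlineMexNaive` and the 1-indexed layout with `a[0]` unused are assumptions of mine; it is $$$O(n)$$$ per query and only meant for checking a faster solution on small inputs.

```cpp
#include <utility>
#include <vector>

// Hypothetical reference implementation: answers (l, r) queries offline by
// sweeping r from 1 to n. a is 1-indexed (a[0] unused), with 0 <= a[i] <= n.
std::vector<int> offlineMexNaive(const std::vector<int>& a,
                                 const std::vector<std::pair<int, int>>& qs) {
    int n = (int)a.size() - 1;
    // Bucket the queries by right endpoint, remembering their original order.
    std::vector<std::vector<std::pair<int, int>>> byR(n + 1);
    for (int i = 0; i < (int)qs.size(); ++i)
        byR[qs[i].second].push_back({qs[i].first, i});

    std::vector<int> b(n + 1, 0), res(qs.size());
    for (int r = 1; r <= n; ++r) {
        b[a[r]] = r;                     // b now describes the prefix a[1..r]
        for (auto [l, id] : byR[r]) {
            int x = 0;
            while (b[x] >= l) ++x;       // stops in range: some b[x] is still 0
            res[id] = x;
        }
    }
    return res;
}
```

Only the inner `while` loop is slow here; the rest of the sweep stays the same when a faster search over $$$b$$$ is used.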

We maintain a minimum segment tree on the array $$$b$$$, and when the sweep reaches $$$r$$$ we have applied the updates $$$b_{a_i} = i$$$ for $$$i = 1, 2, ..., r$$$. Since there are at most $$$n$$$ updates but $$$n + 1$$$ leaf nodes in the segment tree, at least one of the leaves is zero, so the value of the root node is $$$0$$$, which is $$$\lt l$$$. Now we try to find the leaf node with the minimum index whose value is $$$\lt l$$$. We start at the root node and travel down to this leaf. At each node we check whether the value of its left child is $$$\lt l$$$. If it is, we go to the left child. Otherwise we go to the right child. The node we are standing on always has value $$$\lt l$$$, and its value is the minimum of its two children, so at least one of the children has value $$$\lt l$$$ and the walk never gets stuck. Finally we reach a leaf node; the index of this leaf (after adjusting for the offset) is the MEX of the subarray $$$a_l, a_{l + 1}, ..., a_r$$$.

Consider the array $$$a = [6, 1, 0]$$$.

Here $$$b = [3, 2, 0, 0, 0, 0, 1, 0]$$$, indexed by the values $$$0$$$ through $$$7$$$ (so the tree has $$$8$$$ leaves).

Consider evaluation for $$$l = 2, r = 3$$$. The subarray is $$$[a_2, a_3] = [1, 0]$$$.

We start at root node.

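The left child of the root covers the values $$$0..3$$$ and has value $$$0 \lt l$$$, so we go to the left child.

Now the left child covers the values $$$0..1$$$ and has value $$$\min(3, 2) = 2 \ge l$$$. So we go to the right child.

Now the left child is the leaf for the value $$$2$$$, and its value is $$$0 \lt l$$$. So we go to the left child.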

We have reached a leaf node. The corresponding index is 2 and thus the MEX of the subarray $$$[a_2, a_3]$$$ is 2.
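The walk above can be replayed with a small sketch. The helper `descend` is hypothetical (it is not part of the solution code) and assumes the size of $$$b$$$ is already a power of two; it rebuilds the min tree from scratch and records each step as `L` or `R`.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds a min segment tree over b (size assumed to be a
// power of two) and returns the leaf reached for a given l, plus the path taken.
std::pair<int, std::string> descend(const std::vector<int>& b, int l) {
    int n = (int)b.size();
    std::vector<int> tree(2 * n);
    for (int i = 0; i < n; ++i) tree[n + i] = b[i];
    for (int v = n - 1; v >= 1; --v)
        tree[v] = std::min(tree[2 * v], tree[2 * v + 1]);

    std::string path;
    int node = 1;                        // start at the root
    while (node < n) {                   // until we stand on a leaf
        if (tree[2 * node] < l) { node = 2 * node;     path += 'L'; }
        else                    { node = 2 * node + 1; path += 'R'; }
    }
    return {node - n, path};             // leaf index after removing the offset
}
```

For $$$b = [3, 2, 0, 0, 0, 0, 1, 0]$$$ and $$$l = 2$$$ this returns leaf $$$2$$$ with path `LRL`, matching the steps listed above. For $$$l = 3$$$ it returns leaf $$$1$$$ with path `LLR`, which is the MEX of $$$[a_3] = [0]$$$.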

Calculating the MEX for each query takes $$$O(\log n)$$$ time, $$$O(q \log n)$$$ in total. Updating the tree for $$$b$$$ takes $$$O(n \log n)$$$ time in total. Sorting the $$$q$$$ queries takes $$$O(q \log q)$$$ time (the code below groups the queries by $$$r$$$ into buckets instead, which avoids the explicit sort).

Overall time complexity: $$$O(n \log n + q(\log n + \log q))$$$

#include<bits/stdc++.h>
using namespace std;
#define ll long long
struct SegmentTree{
    int n;
    vector<int> tree;
    SegmentTree(int sz){
        n = 1;
        while(n < sz){
            n <<= 1;
        }
        tree.assign(2 * n, 0);
    }
    void set(int ind, int val){
        ind += n;
        tree[ind] = val;
        ind >>= 1;
        while(ind > 0){
            tree[ind] = min(tree[2 * ind], tree[2 * ind + 1]);
            ind >>= 1;
        }
    }
    int get(int x){
        // return the smallest leaf index i whose stored value is < x
        int node = 1;
        while(node < n){
            int left = (node << 1);
            int right = (node << 1) + 1;
            if(tree[left] < x){
                node = left;
            }else{
                node = right;
            }
        }
        return (node - n);
    }
};
int main(){
    int n;
    cin >> n;
    vector<int> a(n + 1);
    for(int i = 1; i <= n; ++i){
        cin >> a[i];
    }
    int q;
    cin >> q;
    vector<vector<pair<int, int>>> queries(n + 1);
    for(int i = 0; i < q; ++i){
        int l, r;
        cin >> l >> r;
        queries[r].push_back({l, i});
    }

    vector<int> res(q);

    SegmentTree s(n + 1);
    for(int i = 1; i <= n; ++i){
        // set the last occurrence of a[i] to i
        s.set(a[i], i);

        for(auto [l, ind] : queries[i]){
            // find the smallest x such that the last occurrence of x is < l
            res[ind] = s.get(l);
        }
    }
    for(int elem : res){
        cout << elem << '\n';
    }
    cout << '\n';

    return 0;
}

// Author: Sahil Yasar
#include <iostream>
#include <vector>
#include <cstring>
using namespace std;
#define endl '\n'

namespace persistentSegTree{
    const int N = 1e5 + 10;
    const int M = 5e6 + 10;
    typedef int T;
    T arr[M];
    int L[M], R[M], root[N];
    size_t sz;
    int nodes = 0, cnt = 0;
    T val;
    T (*f)(T, T);

    void build(T a[], int v, int tl, int tr){
        if (tl == tr) arr[v] = a[tl];
        else{
            int tm = (tl + tr) / 2;
            build(a, L[v] = nodes++, tl, tm);
            build(a, R[v] = nodes++, tm+1, tr);
            arr[v] = f(arr[L[v]], arr[R[v]]);
        }
    }
    void PSEGtree(T a[], size_t n, T (*func)(T, T), T x){
        sz = n, f = func, val = x;
        build(a, root[cnt++] = nodes++, 0, sz-1);
    }
    void reset(){ nodes = cnt = 0; }
    T query(int v, int l, int r, int tl, int tr){
        if (l > tr || r < tl) return val;
        if (l <= tl && r >= tr) return arr[v];
        int tm = (tl + tr) / 2;
        return f(query(L[v], l, r, tl, tm), query(R[v], l, r, tm+1, tr));
    }
    T query(int l, int r, int rt){
        return query(root[rt], l, r, 0, sz-1);
    }
    void update(int cur, int prev, int pos, T x, int tl, int tr){
        if (tl == tr) return void(arr[cur] = x);
        int tm = (tl + tr) / 2;
        if (pos <= tm){
            R[cur] = R[prev], L[cur] = nodes++;
            update(L[cur], L[prev], pos, x, tl, tm);
        }
        else{
            L[cur] = L[prev], R[cur] = nodes++;
            update(R[cur], R[prev], pos, x, tm+1, tr);
        }
        arr[cur] = f(arr[L[cur]], arr[R[cur]]);
    }
    int update(int pos, T x, int rt = -1){
        if (rt < 0) rt = cnt-1;
        update(root[cnt] = nodes++, root[rt], pos, x, 0, sz-1);
        return cnt++;
    }
};
using namespace persistentSegTree;

const int MAX = 1e5 + 10;
vector<int> g[MAX];
int in[MAX], out[MAX], timer;
vector<int> node;

void dfs(int s, vector<int>& nums, int p = -1){
    node.push_back(nums[s]-1);
    in[s] = timer++;
    for (int& c: g[s]) if (c != p) dfs(c, nums, s);
    out[s] = timer-1;
}

T rangeMEX(int v, int x, int tl, int tr){
    if (tl == tr) return tl;
    int tm = (tl + tr) / 2;
    if (arr[L[v]] < x) return rangeMEX(L[v], x, tl, tm);
    return rangeMEX(R[v], x, tm+1, tr);
}
int rangeMEX(int rt[], int l, int r){
    return rangeMEX(root[rt[r]], l, 0, sz-1);
}

int main(){
    cin.tie(0)->sync_with_stdio(0);
    cin.exceptions(cin.failbit);
    int n, i;
    cin>>n;
    vector<int> parents(n);
    vector<int> nums(n);
    for (i = 0; i < n; ++i) cin>>parents[i];
    for (i = 0; i < n; ++i) cin>>nums[i];
    // 0 is root
    for (int i = 1; i < parents.size(); ++i){
        g[i].push_back(parents[i]);
        g[parents[i]].push_back(i);
    }
    int temp[MAX];
    memset(temp, -1, sizeof(temp));
    PSEGtree(temp, MAX, [](int a, int b){ return min(a, b); }, 1e9);
    timer = 0;
    dfs(0, nums);
    int rt[n];
    for (i = 0; i < n; ++i) rt[i] = update(node[i], i);
    vector<int> ans;
    for (i = 0; i < n; ++i) ans.push_back(rangeMEX(rt, in[i], out[i]) + 1);
    for (i = 0; i < n; ++i) cout<<ans[i]<<" ";
    cout<<endl;
    return 0;
}

Comments (9)

one_autum_leaf

3 years ago

Auto comment: topic has been updated by one_autum_leaf (previous revision, new revision, compare).

igzou

23 months ago

Practice question

https://leetcode.com/problems/smallest-missing-genetic-value-in-each-subtree/description/

ClosetNarcissist

18 months ago

Was able to solve this using persistent segment tree to do range MEX queries online instead of offline. https://leetcode.com/problems/smallest-missing-genetic-value-in-each-subtree/solutions/6126019/online-range-mex-with-persistent-segment-tree

C++ code

hydra_cody

How to solve this

I am thinking of making dfs tree array and work on this array. But how do we solve range MEX queries with update on array?

Hmzaawy

23 months ago

Did you think of square root decomposition?

How will you combine results from two square roots? We can't work with frequency for each square root.

lemelisk

But how do we solve range MEX queries with update on array?

Can be done easily in $$$\mathcal{O}(n^{5/3}+q\sqrt{n})$$$ via 3D Mo and a bit harder in $$$\mathcal{O}(n \log{n}+q\log^2{n})$$$. For the second approach, if the value $$$x$$$ is at positions $$$p_1, p_2, \dots, p_k$$$, consider the segments $$$[1, p_1 - 1]$$$, $$$[p_1+1, p_2-1]$$$, ..., $$$[p_{k-1}+1, p_k - 1]$$$, $$$[p_k+1, n]$$$ with the value $$$x$$$ associated with these segments. There will be a total of $$$\mathcal{O}(n)$$$ such segments, and each update changes only $$$\mathcal{O}(1)$$$ of them. And for a range MEX query, we need to find the smallest $$$x$$$ among all the segments in which the query segment is nested. This can be done in $$$\mathcal{O}(\log^2{n})$$$ in various ways using data structures.

KluydQ

15 months ago

maybe u should try 3D mo.

saturina

22 months ago

Why is this blog flop?

one_autum_leaf's blog