arthur_9548's blog

By arthur_9548, 6 months ago, In English

We hope everyone enjoyed the problems of the XIII UnB Contest Mirror! This editorial contains the description of the solutions and their implementation. Feel free to discuss them in the comments!

Problem A

Solution
Code

Problem B

Solution
Code

Problem C

Solution
Code

Problem D

Solution
Code

Problem E

Solution
Code

Problem F

Solution
Code

Problem G

Solution
Code

Problem H

Solution
Code

Problem I

Solution
Code

Problem J

Solution
Code

Problem K

Solution
Code

Problem L

Solution
Code

Problem M

Solution
Code

Problem N

Solution
Code


By arthur_9548, 6 months ago, In English

Hello, Codeforces!

We are happy to invite you to the XIII UnB Contest Mirror on Nov/04/2025 20:00 (Moscow time). This contest was originally held at the University of Brasília, during our annual University Week.

This contest is a Codeforces Gym, and therefore is NOT an official Codeforces Round and is NOT rated. The contest will follow standard ICPC rules.

The problems were created and prepared by:

Also, we would like to thank:

  • Our problem reviewers and testers: Guilherme Rocha (TheRockRocha), Lucas Gabriel (lucasg05), Nikolle Licá (june16th) and Ruan Petrus (MagePetrus);
  • Víctor Lamarca (VLamarca) for contributing to the preparation of the problems;
  • DOM Judge team for the platform used in the on-site contest;
  • dario2994 for the brilliant pol2dom tool;
  • MikeMirzayanov for the amazing Codeforces and Polygon platforms.

Most of the problems are aimed at those who are starting their journey in competitive programming. However, we also included a few problems we believe will be an interesting challenge for more experienced participants, such as finalists of ICPC Regionals.

Good luck to all participants!

Note: if you participated in the on-site contest, please don't make any comments regarding the problems in this blog. As soon as the Mirror finishes, we will publish an editorial blog — you may discuss the problems there.

Announcement of XIII UnB Contest Mirror

By arthur_9548, 8 months ago, In English

Introduction

This article was started by LeoRiether and concluded by me. I found his idea incredible and offered to help him finish the text he started writing here some years ago. I hope everyone finds it interesting! I think it's the most natural way to understand the KMP algorithm.

When following the examples, you should use LeoRiether's KMP simulator to see for yourself what is happening.

The definitions presented here will not be so rigorous, but will instead follow a more intuitive approach. We will use examples with strings composed of characters from $$$a$$$ to $$$z$$$, but you could do the exact same with any other alphabet (like the natural numbers or whatever you want).

Automaton?

The text will assume you know what a nondeterministic finite automaton (NFA) and a deterministic finite automaton (DFA) are, but I believe that an intuitive comprehension of them is enough to understand the ideas presented. You can think of an automaton as a graph with letters on the edges, where we will call the vertices states. There is a starting state and some accepting states. To use an automaton, you maintain "pointers" to its states/vertices, which will be called "threads".

Feeding a string $$$S$$$ to a nondeterministic finite automaton as input is to feed all of its characters $$$S_i$$$ sequentially, considering that at the start there is a thread at the starting state. To feed a character to an automaton is basically to follow this procedure:

For all current threads: let's say the thread is at the state $$$A$$$. For all edges from $$$A$$$ to some $$$B$$$ that have the character $$$S_i$$$, we will prepare to create a new thread at $$$B$$$ and delete the thread at $$$A$$$. After considering all current threads, first delete all of them and then create all the new ones we had prepared to create.

If at the end there is any thread at any accepting state, the automaton "accepts" the string. You can assume the deterministic version is the same, but each state should have exactly one outgoing edge for each letter (so simulating it is just traversing a graph).

The Problem

We want to solve the string matching problem: given two strings $$$S$$$ and $$$P$$$, find all substrings of $$$S$$$ that match the pattern $$$P$$$. There are other uses for the concepts shown here, but we'll focus on this one first.

A Simple Solution

There's a pretty simple NFA construction that solves this exact problem. We'll feed this automaton the characters of $$$S$$$ as input, and if at any point a thread is at the accepting state, we'll know there's a substring that matches $$$P$$$. Let the starting state have an edge to itself for every character of the alphabet (remember this is the alphabet of the automaton, not necessarily the English alphabet). This loop will create a thread at every input character and leave it there, at the starting node, ready to match the pattern when the other inputs come. Next, we'll make a path containing the characters of the pattern $$$P$$$ at every edge. Here's an example of an NFA that matches the pattern "abacaba":

abacabautomaton

Notice that the $$$\Sigma$$$ edge in the first node represents "an edge for every symbol of the alphabet", and that it includes the letter "a" in this case. Thus, if the automaton receives an "a", the thread in the first node splits into one for the second node and one that keeps looping on the starting node. When this split occurs, one of the threads will keep "walking" to the right while the symbols of the string are matching the forward edges, but if at any point they don't match, the thread "dies". Take a moment to appreciate how this simple NFA solves the matching problem.

An Example

Let's look at an automaton that matches the pattern "abaa". The nodes filled in red represent alive threads.

abaa

We'll input the string "abaab" and see how the threads behave. First, let's input the letter "a".

abaa1

The thread we had before split into two, and now the one on the right will try to walk to the right as far as it can. Now we input the second character of our string, "b".

abaa2

Nothing surprising here. Let's input "a".

abaa3

Now there are 3 threads, look at them go! Now input another "a".

abaa4

Notice that the thread that was on the second node didn't have any edge to follow, so it died, but then the first node split and another thread is now at the second state. Also, we have a filled accepting state, so there's a match with the pattern "abaa"! One last input, "b".

abaa5

We already have a thread matching the "b" from "abaa", even if this first "a" overlaps with our first match. This kind of behaviour allows us to match overlapping patterns efficiently.

Running the Automaton

The naive way to run this automaton is to keep a set of threads and advance them individually. This is, however, not efficient at all. Imagine the pattern is something like "aaaaaa", and we match a string "aaaaaaaaaaaaaaaaa". At every iteration a new thread would be spawned and every other thread would advance, with no deaths. This gives us a time complexity of $$$O(|S| \times |P|)$$$. We can do better by exploiting some properties.
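For concreteness, the naive thread-set simulation can be sketched directly. This is a rough sketch of the procedure described above, not code from the article; `feed_nfa` and the convention of counting how many times a thread reaches the accepting state are my own framing:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Feeds S to the matching NFA of pattern P: state i means "this thread
// has matched P[0..i-1]"; state 0 has the loop over the whole alphabet,
// and state |P| is accepting. Returns how many times some thread reaches
// the accepting state. This is the naive O(|S| * |P|) simulation.
int feed_nfa(const string& P, const string& S) {
    int n = (int)P.size(), matches = 0;
    set<int> threads = {0};            // one thread at the starting state
    for (char c : S) {
        set<int> nxt;
        for (int a : threads)          // advance every thread that matches c
            if (a < n && P[a] == c) nxt.insert(a + 1);
        nxt.insert(0);                 // the loop at the starting state
        threads = nxt;                 // threads with no edge to follow die
        if (threads.count(n)) matches++;
    }
    return matches;
}
```

Running it on the example above, `feed_nfa("abaa", "abaab")` finds exactly one match, with the thread sets evolving just like the pictures.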

How many states are reachable?

There is potential to do better than the naive simulation by analyzing what is called the "subset construction" of our NFA (think of all possible combinations (subsets) of threads that can be alive at the same time). Indeed, only $$$|P|+1$$$ subsets would be reachable in it! The proof is as follows: imagine we've input some string to the automaton and there's some set of alive threads. Let $$$u$$$ be the most advanced (closest to the right) thread alive -- we'll call it the "leader" thread --, and let $$$n$$$ be the number of forward edges (the ones that aren't the loop) it followed. There are no threads to the right of $$$u$$$, so we only need to consider the last $$$n$$$ symbols the automaton has seen, and there's only one length-n string possible because there's only one path the thread could have followed. Because of that, the alive threads to the left of $$$u$$$ are uniquely determined! In other words, if we know the leader, there's only one possible set of alive threads to the left of it. With this, it's possible to conclude that there's exactly one reachable subset in the subset construction for every leader state, which gives us $$$|P|+1$$$ states in total (since we are including the subset with only the starting thread).

An idea one could have now is: when simulating the automaton, what if we only keep track of the leader thread, instead of the entire alive subset? This is exactly what we'll do, but there's a problem: what if the leader dies? We need to know who will occupy his place. Before that, though, we'll need one more concept.

Left-set

It turns out that "there's only one possible set of alive threads to the left of the leader" does not apply exclusively to the leader. In fact, if we know some thread is alive, all of the threads to the left of it can be uniquely determined. The argument is very similar to the one shown before, here's a sketch: if a thread is alive in node $$$n$$$ ($$$0$$$-indexed), then there's one possible string of length $$$n$$$ that must have been fed to the automaton in the last $$$n$$$ steps, thus the threads in smaller positions must have come from this string.

Now we can finally know what to do in the case of a failed match.

Dealing with failure

Suppose we've built the automaton for "abababax" and fed "abababa" to it.

abababax

What would happen if we input the letter "b"? Well, the leader can't go forward, so it dies. The thread immediately to the left of the leader, however, can go forward, and thus becomes the new leader. We'll call this "thread immediately to the left of a thread" the "neighbor". This idea outlines a possible efficient way of simulating our NFA. We'll keep track of the position of the leader at every step. How? There are two cases:

  1. The leader matches the input and goes forward. This case is easy, just increment the current leader position.
  2. The leader fails to match and dies. Then, the neighbor of the leader (let it be at position $$$n_0$$$) may or may not survive. If it does, the new leader is at $$$n_0+1$$$. If it dies, then maybe the neighbor of this other thread (at $$$n_1 \lt n_0$$$) survives. If it does, the new leader is at $$$n_1+1$$$. If it dies, then maybe its neighbor survives. You get the idea; in the worst case, we can always go all the way back to the starting node, which always has an alive thread.

Then, if we could know the neighbor of a thread in $$$O(1)$$$, the entire matching process would take time $$$O(|S|)$$$. Yes, even if we walk through neighbors one by one as they die, the algorithm runs in linear time. To see why, consider that we spawn at most one thread per step, so the total number of dead threads is bounded by $$$|S|$$$. It's like pushing threads to a queue when they are spawned and popping the queue when they die.
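The amortized argument can even be checked empirically: every fallback to a neighbor corresponds to a dying thread, and at most one thread spawns per input character, so the total number of fallbacks over the whole run never exceeds $$$|S|$$$. A sketch (the function name and the counting are mine; the neighbor array is built the same way the next section derives it):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Counts how many times the leader dies and its neighbor takes over
// while matching P against S. The returned count is at most |S|.
long long count_fallbacks(const string& P, const string& S) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);          // neighbor array
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);

    long long fallbacks = 0;
    int state = 0;                     // position of the leader
    for (char c : S) {
        while (state && !(state < n && P[state] == c)) {
            state = nb[state];         // the leader dies, neighbor takes over
            fallbacks++;
        }
        if (state < n && P[state] == c) state++;
        else state = (P[0] == c);
    }
    return fallbacks;
}
```

The leader's position increases by at most $$$1$$$ per character and each fallback decreases it by at least $$$1$$$, which is exactly the queue argument above.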

Precalculating the Neighbor

When building our automaton, we'll precalculate an array neighbor, where neighbor[u] indicates the neighbor of $$$u$$$, in the case where $$$u$$$ is the leader. If we do that, we'll be able to simulate our automaton as described in "Dealing with failure" in $$$O(|S|)$$$!

It turns out precalculating the neighbor is very similar to what we've seen in "Dealing with failure". Let's compute the neighbor array from left to right. Suppose we've already computed the result for the first $$$k$$$ vertices. How do we compute it for the $$$k+1$$$-th? The only way the leader could move from $$$k$$$ to $$$k+1$$$ is to go through the edge between them, right? This means we got $$$P_k$$$ as input ($$$0$$$-indexed). If neighbor[k] has an edge with the letter $$$P_k$$$, then neighbor[k+1] = neighbor[k]+1. If not, let's check the edge from neighbor[neighbor[k]], and so on, until we succeed or reach the start state. You can use the same argument from the previous section to prove that this is $$$O(|P|)$$$.

In fact, this is the same as calculating who is the new leader when the leader is neighbor[k] and we receive $$$P_k$$$ as input!

Implementation

I'll take this moment to mention LeoRiether's amazing KMP simulator once more. It is very useful for simulating some examples.

Let's get to the actual code. We have concluded that the only thing we need is to calculate the new leader based on the previous leader and the input character, which depends on whether the leader's edge matches the input or not. We can implement the exact logic we've been discussing:

struct KMP {
	string P;
	int n; // n = |P|
	vector<int> neighbor;
	
	KMP(string& p){
		P = p;
		n = (int)P.size();
		neighbor.resize(n+1);
		neighbor[1] = 0; //starting node is always alive
		for(int k = 1; k < n; k++)
			neighbor[k+1] = next_leader(neighbor[k], P[k]);
	}
	
	bool match(int state, char c){
		return state < n and P[state] == c;
	}
	
	int next_leader(int leader, char input){
		if (leader == 0)return P[0] == input; 
		// either advances to 1 or remains in start
		if (match(leader, input))return leader+1;
		else return next_leader(neighbor[leader], input);
	}
};

Whenever the leader is at the position $$$|P|$$$, we have found a match:

int matches(string P, string S){
    KMP kmp(P);
    int count = 0, state = 0;
    for(char c : S){
        state = kmp.next_leader(state, c);
        if (state == (int)P.size())count++;
    }
    return count;
}

I personally prefer a more compact implementation:

struct KMP {
	string P; int n; vector<int> nb;
	
	KMP(string& p) : P(p), n((int)P.size()), nb(n+1) {
		for(int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);
	}
	
	int nxt(int i, char c){
		for(; i; i = nb[i])if (i < n and P[i]==c)return i+1;
		return P[0]==c;
	}
};

LeoRiether's implementation can be found here.

You can test your implementation here.

Prefix Function

The most popular interpretation of the KMP algorithm nowadays is as a Prefix Function: for each prefix $$$p$$$ of a string $$$S$$$, it returns the length of the second largest prefix of $$$p$$$ that is also a suffix of it (the first largest is, of course, $$$p$$$ itself).

This is equivalent to the following: when the leader thread is at state $$$|p|$$$, the prefix function of $$$p$$$ is the position of the second most advanced alive thread, which is exactly our neighbor array. You can see this clearly if you think about what happens when we feed the characters of $$$P$$$ to the automaton we have just built.
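The equivalence is easy to verify directly. Below is a sketch that computes the classic prefix function and the neighbor array independently and checks that they agree (with the indexing shift: the neighbor of state $$$k$$$ equals the prefix function of the length-$$$k$$$ prefix). The helper names are mine:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Classic prefix function: pi[i] = length of the longest proper prefix
// of P[0..i] that is also a suffix of it.
vector<int> prefix_function(const string& P) {
    int n = (int)P.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i-1];
        while (j && P[i] != P[j]) j = pi[j-1];
        pi[i] = j + (P[i] == P[j]);
    }
    return pi;
}

// The neighbor array, built exactly as in the article's KMP struct.
vector<int> neighbor_array(const string& P) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);
    return nb;
}

// Checks that nb[k] == pi[k-1] for every prefix length k >= 1.
bool equivalent(const string& P) {
    auto pi = prefix_function(P);
    auto nb = neighbor_array(P);
    for (int k = 1; k <= (int)P.size(); k++)
        if (nb[k] != pi[k-1]) return false;
    return true;
}
```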

The KMP Automaton

We have seen how to solve the string matching problem in $$$O(|P| + |S|)$$$. However, in some problems (like this one), we may want to store something like: for all possible leaders and input characters, what would be the new leader. This means constructing an explicit DFA that is equivalent to the NFA we were simulating earlier. When people talk about the "KMP Automaton", this is usually what they mean.

This construction can be made using dynamic programming. If the result has been calculated for the first $$$k$$$ states, for the $$$k+1$$$-th we iterate each character $$$c$$$ of the alphabet. Then, we have two possibilities:

  1. If $$$P_k = c$$$: the new leader is $$$k+1$$$.
  2. If not, the answer is the same as the one already calculated at the state's neighbor.

You can add the following to the KMP implementation to achieve this (considering the alphabet from $$$a$$$ to $$$z$$$):

    vector<vector<int>> dfa;
    void build_dfa(){
        dfa.assign(n+1, vector<int>(26, 0));
        dfa[0][P[0]-'a'] = 1; //only way to advance at 0
        for(int k = 1; k <= n; k++)
            for(int c = 0; c < 26; c++)
                if (k < n and P[k] == 'a'+c) dfa[k][c] = k+1;
                else dfa[k][c] = dfa[neighbor[k]][c];
    }
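Once the table is built, matching becomes a pure table walk: one lookup per character, with no neighbor-chasing at query time. A self-contained sketch (the function name is mine; it assumes a nonempty lowercase pattern, as in the snippet above):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Builds the neighbor array and the explicit KMP DFA for pattern P,
// then counts matches of P in S using only table lookups.
int count_matches_dfa(const string& P, const string& S) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);

    // dfa[k][c] = new leader when the leader is at k and we read 'a'+c
    vector<array<int,26>> dfa(n + 1);
    dfa[0].fill(0);
    dfa[0][P[0]-'a'] = 1;                    // only way to advance at 0
    for (int k = 1; k <= n; k++)
        for (int c = 0; c < 26; c++)
            dfa[k][c] = (k < n && P[k] == 'a'+c) ? k+1 : dfa[nb[k]][c];

    int state = 0, count = 0;
    for (char c : S) {
        state = dfa[state][c-'a'];           // O(1) per character
        if (state == n) count++;
    }
    return count;
}
```

The trade-off versus the plain neighbor array is the $$$O(|P| \cdot |\Sigma|)$$$ memory and build time, which is what some problems (like the one linked above) actually need.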

Conclusion

I'd like to thank LeoRiether again for letting me complete his work and share the final result with the Codeforces community. Also, thanks to duduFreire and felipe_massa for the excellent review on this blog.

I hope you liked the ideas presented as much as I did!


By arthur_9548, history, 11 months ago, In English

Hey everyone! I hope everyone enjoyed the problems of the VI UnBalloon Contest Mirror. This editorial contains the description of the solutions and their implementation. Feel free to discuss them in the comments!

Problem A

Solution
Code

Problem B

Solution
Code

Problem C

Solution
Code

Problem D

Solution
Code

Problem E

Solution
Code

Problem F

Solution
Code

Problem G

Solution
Code

Problem H

Solution
Code

Problem I

Solution
Code

Problem J

Solution
Code

Problem K

Solution
Code

Problem L

Solution
Code

Problem M

Solution
Code

Problem N

Solution
Code


By arthur_9548, 12 months ago, In English

Hello, Codeforces!

We are happy to invite you to the VI UnBalloon Contest Mirror on May/18/2025 20:00 (Moscow time). UnBalloon is the competitive programming community of the University of Brasília.

This contest is a Codeforces Gym, and therefore is NOT an official Codeforces Round and is NOT rated. The contest will follow standard ICPC rules.

All problems were created and prepared by me, duduFreire, lucassala, PedroGallo, MagePetrus and wallacelw.

Also, we would like to thank:

Most of the problems are aimed at those who are starting their journey in competitive programming. However, we also included a few problems we believe will be an interesting challenge for more experienced participants, such as finalists of ICPC Regionals. We hope the contest is balanced enough to have all kinds of teams thinking about challenging tasks until the end!

Good luck to all participants!

Note: if you participated in the on-site contest, please don't make any kind of comment regarding the problems in this blog. As soon as the Mirror finishes, we will publish an editorial — you may discuss the problems there.

Announcement of VI UnBalloon Contest Mirror

By arthur_9548, history, 15 months ago, In English

Hello, Codeforces!

I'm Arthur Botelho (arthur_9548), and this is my first blog. After reading this blog by mouse_wireless and solving this problem, my perspective on multidimensionality changed. I decided to implement other multidimensional data structures and solve problems using them. Now, I wish to explain my understanding of this subject and my implementations of these structures. I hope it will be useful or, at least, interesting.

Summary

In this blog, we will discuss the general idea of Range Query Data Structures in one dimension and extrapolate the concepts to higher dimensions. We will also study implementations of multidimensional versions of these four structures:

  • Prefix/Partial/Cumulative Sum (Psum)
  • Disjoint Sparse Table (DiST)
  • Binary Indexed/Fenwick Tree (BIT)
  • Segment Tree (SegTree)

The idea of the implementations is to be scalable, generic, understandable and efficient in time complexity. It is possible to make small optimizations which improve the runtime and adapt the structures to work better in specific problems, but this is not the goal of this blog.

Customized versions of these implementations are available at my Competitive Programming repository.

Prerequisites

Links about these topics and more will appear throughout the blog. It is not necessary to know everything — I'm not a specialist in any of these myself. Just a basic notion is enough.

Blog Organization

  • Introduction: 1D Range Queries
  • Multidimensional Range Queries
  • Data Structures Implementation:
    • Static Range Queries:
      • Psum
      • DiST
    • Range Queries + Point Updates:
      • BIT
      • SegTree
  • Conclusion

This blog has ended up bigger than I expected, but I believe that's mostly due to the code used in the explanations. You also don't need to read every section if you don't want to. The first two are the theoretical basis for all structures — they may be important even if you already know the subject, since I will use definitions and concepts from them later on. On the other hand, the subsections of "Data Structures Implementation" are more or less independent.

Introduction: 1D Range Queries

From now on, I may refer to Data Structures as DS and Range Query Data Structures as RQDS.

We know that RQDS are used to solve problems of the form:

Let $$$*$$$ be the symbol for an associative binary operation and let $$$A$$$ be a list of elements, where $$$A_i$$$ is the element at the $$$(i+1)$$$-th position in the list (that is, $$$0$$$-indexed). Given queries of the form $$$[l,\ r]$$$, calculate the value of:

$$$ A_l * A_{l+1} * A_{l+2} * \ldots * A_{r-2} * A_{r-1} * A_r $$$

This abstract definition is meant to reflect that we can solve this problem in the same way for many different types of elements and associative operations. Some concrete examples:

  • Given a list of numbers, find the sum of the numbers in positions $$$[l,\ r]$$$.
  • Given a list of matrices, find the product of matrices in positions $$$[l,\ r]$$$.

The main idea is to store the results of applying the operation in some specific ranges, such as prefixes of the list $$$A$$$ or some subarrays of $$$A$$$ whose size is a power of $$$2$$$.

If some elements $$$A_i$$$ are changed between queries, we are dealing with range queries and point updates (we won't discuss range queries and range updates here). If the list $$$A$$$ doesn't change, we are dealing with static range queries.

The associative binary operation is also important. Let's suppose this operation has an identity/neutral element $$$id$$$ (if it doesn't, we can always redefine the operation to include one more element such that this new element becomes the identity). Then, we are working with a Monoid. If the operation is invertible, which means for every $$$x$$$ there is an element $$$x^{-1}$$$ such that $$$x * x^{-1} = x^{-1} * x = id$$$, the Monoid is actually a Group.

Depending on the problem faced and operation type, we will use different data structures:

  • Groups and static range queries: Psum
  • Monoids and static range queries: DiST
  • Groups and range queries + point updates: BIT
  • Monoids and range queries + point updates: SegTree

Also, if the operation is idempotent, which means $$$x * x = x$$$ for any $$$x$$$, we have an Idempotent Monoid. The Sparse Table structure works with Idempotent Monoids and static range queries. We won't discuss it here, but I've made a multidimensional implementation of it as well (it's in the repository I've mentioned earlier).

In this blog, our data structures will be implemented in a way to make it easier to modify the operation type and the structure size. The definition of the operation and the element's type will be made outside the DS (passed through a template parameter) and the size will be passed as a parameter to the constructor (usually read from input), making it a generic implementation (similar to what is done in this blog).

The decision to code them as a C++ struct aims to make the usage and creation of multiple structures easier. For example: imagine that you want to create a list of $$$N$$$ Psums (where $$$N$$$ is a variable), or that you have one SegTree that multiplies integers and another SegTree that sums floats. You'll probably want to avoid having to copy/paste code and manually change variables, functions and types.

To sum up, our RQDS code will look like this:

template<class S> //information on the operation and element type
struct DataStructure{ //implementation of the DS, doesn't depend on operation
    using T = typename S::T; //to reduce verbosity. T is the element type
    int n; vector<T> v; //size and structure itself
    DataStructure(int s):n(s),v(n, S::id){} //initialization example
    //the line above is making use of C++ Member Initializer lists
    T query(int l, int r){ //range query
    /* we will suppose "merge" uses stored information of some
    specific ranges to produce the answer for any range */
        return merge(l, r);
    }
};

struct AlgebraicStructure{ //implementation of operation, doesn't depend on DS
    using T = some_type; //element type
    static constexpr T id = some_value; //operation identity
    static T op(T a, T b){return a * b;} //some associative operation
};

void usage(){
    DataStructure<AlgebraicStructure> ds(size);
    //...
    //after structure is ready, answer queries of [l, r] like this:
    cout << ds.query(l, r) << endl; //0-indexed
}

We will get to concrete examples of this implementation later, when viewing specific data structures. Depending on whether we want point updates or not, different additional methods will be implemented to insert data into the structure. Also, look here if the identity $$$id$$$ cannot be made constexpr.
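To make the skeleton concrete before moving to higher dimensions, here is one possible 1D instantiation: a prefix-sum structure for a group. The names `Psum` and `SumGroup`, the constructor taking the whole list, and making `op` static are my choices for this sketch, not fixed by the blog:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct SumGroup { // an AlgebraicStructure: (long long, +) is a group
    using T = long long;
    static constexpr T id = 0;                // identity of +
    static T op(T a, T b) { return a + b; }   // associative operation
    static T inv(T a) { return -a; }          // inverse: what makes it a group
};

template<class S> // works for any group providing op, id and inv
struct Psum {
    using T = typename S::T;
    int n; vector<T> pre; // pre[i] = fold of the first i elements
    Psum(const vector<T>& a) : n((int)a.size()), pre(n + 1, S::id) {
        for (int i = 0; i < n; i++) pre[i+1] = S::op(pre[i], a[i]);
    }
    T query(int l, int r) { // 0-indexed, inclusive, as in ds.query(l, r)
        return S::op(pre[r+1], S::inv(pre[l]));
    }
};
```

For example, `Psum<SumGroup> ps({1, 2, 3, 4})` answers `ps.query(1, 2) == 5`. The inverse in `query` is exactly why Psum requires a group and not just a monoid.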

Multidimensional Range Queries

I'm supposing you already know how to efficiently solve most types of range query problems in lists, i.e., in dimension $$$1$$$. However, we may want to solve similar problems in higher dimensions. The range query problem we will be solving is as follows:

Let $$$*$$$ be the symbol for a commutative associative binary operation and let $$$A$$$ be a set of points, where each point is represented by its integer coordinates $$$(x,\ y,\ z,\ \ldots)$$$ and has a value. Given queries of the form $$$[(x_1,\ y_1,\ z_1, \ \ldots), \ (x_2,\ y_2,\ z_2,\ \ldots)]$$$ calculate the result of applying the operation to the values of all points whose coordinates satisfy:

$$$x_1 \leq x \leq x_2, \ y_1 \leq y \leq y_2, \ z_1 \leq z \leq z_2,\ \ldots$$$

As an example, we may imagine we have a grid ($$$2D$$$) filled with numbers and queries regarding the sum of numbers in subgrids of this grid, or a $$$3D$$$ space where each point is associated to a polynomial and queries regarding the product of all polynomials in a cuboid whose edges are aligned with the space axes.

The general idea will be the same as the $$$1D$$$ case: calculate the results of applying the operation in some specific ranges and answer queries by efficiently merging these results. In fact, we are going to proceed essentially the same way. The only difference is that, instead of merging single values (the results of ranges), we will be merging data structures.

Let's study an example problem: given a grid with numbers, answer queries of calculating the sum of all numbers inside a given subgrid. Suppose we want to calculate sums of subgrids of the following grid:

$$$a \ b$$$
$$$c \ d$$$

For this example, let's try to create a data structure that will calculate the sum of numbers for any subgrid $$$[(x_1,\ y_1),\ (x_2,\ y_2)]$$$.

We already know how to calculate sums in rows ($$$1D$$$). Let's calculate, for each row, the list of values we get by applying the operation (sum) in the column ranges: $$$[0,\ 0]$$$, $$$[1,\ 1]$$$ and $$$[0,\ 1]$$$. For the first row, we will have the list: $$$a,\ b,\ a+b$$$. For the second: $$$c,\ d,\ c+d$$$. With these, we can calculate the sum of elements in any range of a single row.

However, if the range of a query contains more than one row, we can't seem to get the answer immediately with the lists we have just calculated. We will overcome this by extrapolating the process: let's calculate the list of lists of values we get by applying the operation between corresponding elements of the lists associated to rows in the row ranges: $$$[0,\ 0]$$$, $$$[1,\ 1]$$$ and $$$[0,\ 1]$$$. This will yield the following:

  1. For $$$[0,\ 0]$$$: $$$a,\ b,\ a+b$$$.
  2. For $$$[1,\ 1]$$$: $$$c,\ d,\ c+d$$$.
  3. For $$$[0,\ 1]$$$: $$$a+c,\ b+d,\ a+b+c+d$$$.

Now, we are able to answer any subgrid query of the form $$$[(x_1,\ y_1), \ (x_2,\ y_2)]$$$. For all possible ranges in the first dimension ($$$[x_1,\ x_2]$$$), we have a list associated to it which can solve queries of all possible ranges in the second dimension ($$$[y_1,\ y_2]$$$). Let's solve the query $$$[(0,\ 1),\ (1,\ 1)]$$$:

  1. The list associated to the range of the first dimension ($$$[0,\ 1]$$$) is $$$a+c,\ b+d,\ a+b+c+d$$$.
  2. The element of this list associated to the range of the second dimension ($$$[1,\ 1]$$$) is $$$b+d$$$. This is the answer.

Despite being a very simple example, this is very similar to how an actual $$$2$$$x$$$2$$$ $$$2D$$$ SegTree would work. Of course, the idea is not calculating the query answer for all ranges in all dimensions, but instead calculating for the same ranges we would in the $$$1D$$$ case.
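The worked example above is precisely how a $$$2D$$$ prefix sum (the multidimensional Psum) behaves. A minimal non-generic sketch for sums, just to ground the idea (the struct name and layout are mine):

```cpp
#include <bits/stdc++.h>
using namespace std;

// 2D prefix sums over a grid g. The blog's generic MDRQ template
// generalizes this to any dimension and any commutative group.
struct Psum2D {
    int n, m;
    vector<vector<long long>> pre; // pre[i][j] = sum of g[0..i)[0..j)
    Psum2D(const vector<vector<long long>>& g)
        : n((int)g.size()), m((int)g[0].size()),
          pre(n + 1, vector<long long>(m + 1, 0)) {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                pre[i+1][j+1] = g[i][j] + pre[i][j+1] + pre[i+1][j] - pre[i][j];
    }
    // sum over rows [x1, x2] and columns [y1, y2], 0-indexed inclusive
    long long query(int x1, int y1, int x2, int y2) {
        return pre[x2+1][y2+1] - pre[x1][y2+1] - pre[x2+1][y1] + pre[x1][y1];
    }
};
```

On the $$$2$$$x$$$2$$$ grid with values $$$a=1,\ b=2,\ c=3,\ d=4$$$, the query $$$[(0,\ 1),\ (1,\ 1)]$$$ returns $$$b+d = 6$$$, matching the worked example.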

You should notice the restriction of commutativity in the operation. When dealing with multidimensional queries, we will assume there is no specific order in which we should apply the operation. All the reasoning and implementations in this blog are based on this. I don't know if it is possible to answer non-commutative multidimensional range queries with multidimensional versions of the structures we're working with — at least, I wasn't able to do so without heavily worsening the time complexity or removing the generality of the structures.

Why?

Generalizing the $$$2D$$$ example, the idea will be maintaining data structures of data structures. In dimension $$$D \gt 0$$$, our multidimensional RQDS will maintain a list of internal DS of dimension $$$D-1$$$, where each of them is associated to some $$$D$$$-dimensional range (usually prefixes or ranges whose size is a power of two). The $$$0D$$$ data structures will simply contain single values, since they represent points. This way, we can see that the $$$1D$$$ RQDS we know so far actually operate on lists of $$$0$$$-dimensional data structures.

Suppose we want to answer a $$$D$$$-dimensional range query $$$[(x_1,\ y_1,\ z_1, \ \ldots), \ (x_2,\ y_2,\ z_2,\ \ldots)]$$$. The range related to dimension $$$D$$$ is $$$[x_1,\ x_2]$$$, and we will suppose (recursively) that our internal data structures will be able to solve $$$(D-1)$$$-dimensional range queries of the form $$$[(y_1,\ z_1, \ \ldots), \ (y_2,\ z_2,\ \ldots)]$$$ — the base case, of course, is a $$$0$$$-dimensional query $$$[(),\ ()]$$$: simply returning a value. What we want to do is to be able to propagate this $$$(D-1)$$$-dimensional query to as few internal data structures as possible, using the fact that they represent ranges in dimension $$$D$$$. Each structure will do this in a different way, but all of them build their list of internal DS in a similar fashion: merging data structures associated to smaller ranges.

Let's understand a little better what it means to merge data structures. Suppose we are working in a dimension $$$D \gt 1$$$ and currently have the following list of internal $$$D-1$$$ dimensional data structures:

  1. $$$DS_{0,0}$$$: associated to the range $$$[0,\ 0]$$$ in $$$D$$$.
  2. $$$DS_{1,1}$$$: associated to the range $$$[1,\ 1]$$$ in $$$D$$$.

If the query range in $$$D$$$ is $$$[0,\ 0]$$$ or $$$[1,\ 1]$$$, we already know where to propagate the query. However, we may want to merge $$$DS_{0,0}$$$ and $$$DS_{1,1}$$$ to form $$$DS_{0,1}$$$, to whom we will propagate queries if the range in $$$D$$$ is $$$[0,\ 1]$$$. For this, suppose (recursively) that we are already able to merge $$$(D-2)$$$-dimensional data structures. Then, we can define their merging as follows:

For all the ranges $$$[l,\ r]$$$ for which $$$DS_{0,0}$$$ and $$$DS_{1,1}$$$ calculate merges of their internal data structures, we will proceed in the same way — let's look to a particular range $$$[i,\ j]$$$. Suppose $$$DS_a$$$ is the merge of $$$[i,\ j]$$$ in $$$DS_{0,0}$$$ and $$$DS_b$$$ is the same in $$$DS_{1,1}$$$. $$$DS_{0,1}$$$ will maintain, for $$$[i,\ j]$$$, the merge of $$$DS_a$$$ and $$$DS_b$$$ (which are $$$D-2$$$ dimensional). Note that this is recursively defined (base case: $$$DS_a$$$ and $$$DS_b$$$ are $$$0$$$-dimensional (just values) and their merge is $$$DS_a * DS_b$$$ ).

A small 3D example

Now, let's see how we do this in practice using the RQDS we know and solve actual problems. From now on, I may refer to Multidimensional Range Query Data Structures as MDRQDS.

Data Structures Implementation

We will make use of template metaprogramming and ellipses (the "...") in C++ for the recursive definition of data structures and their merging. Our code will look like this:

#define MAs template<class... As> //we are going to type this a lot
//MAs stands for multiple arguments
template<int D, class S> //dimension and algebraic structure
struct MDRQ{ //multidimensional case
    using T = typename S::T; //to further reduce verbosity
    int n; vector<MDRQ<D-1, S>> v; //size and our list of data structures
    MAs MDRQ(int s, As... ds):n(s),v(n, MDRQ<D-1, S>(ds...)){}
    /* in this dimension, we save our size and propagate the
    sizes of other dimensions (ds) to our internal structures */
    MAs T query(int l, int r, As... ps){ //the query range in this dimension is [l, r]
        /* suppose "merge" uses stored information of some specific ranges
        to efficiently produce the merge of the structures in any range */
        return merge(l, r).query(ps...);
        /* we are propagating the query to lower dimensions,
        passing forward their query ranges (ps) */
    }
};

template<class S>
struct MDRQ<0, S>{ //base case: dimension 0
    using T = typename S::T;
    T val = S::id; //value starts with identity
    T query(){return val;} //0D query
};

This way, we will be able to use it like this:

void example(){ //3D example
    MDRQ<3, SomeAlgebraicStructure> ds(3, 4, 5);
    /* this creates a 3x4x5 3D data structure
    based on "SomeAlgebraicStructure" */
    //...
    /* after we are ready to answer queries of the form
    [(x1, y1, z1), (x2, y2, z2)], we do as follows: */
    cout << ds.query(x1, x2, y1, y2, z1, z2) << endl; //also 0-indexed
}

We can see that the code matches our reasoning: we define how the DS should work in any positive dimension (by merging lower dimension data structures) and define the base cases for recursive processes. It may seem magical, but you can think of it this way: every time you declare MDRQ<X, S> (an MDRQDS with dimension $$$X$$$, $$$X \gt 0$$$), the compiler will ensure that all versions of MDRQ<D, S> for all non-negative values of D up to $$$X$$$ exist, and it will create the missing ones. This usage of templates provides the correct number of arguments for each method version and correct typing for each variable in all dimensions — all of this is resolved at compile time.
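For concreteness, the structures we pass as S are assumed throughout to expose a value type T, an identity id, and an operation op (plus inv, for the Group-based structures). The exact member forms below (e.g. id being a static constant) are my assumption based on how the code uses them; a Group for range sums could look like:

```cpp
#include <cstdint>

// A sketch of the algebraic-structure interface assumed by MDRQ:
// value type T, identity element id, operation op and (for Groups) inverse inv.
// This one models addition of 64-bit integers.
struct SumGroup {
    using T = int64_t;
    static constexpr T id = 0;              // identity of addition
    static T op(T a, T b){ return a + b; }  // the group operation
    static T inv(T a){ return -a; }         // inverse; only Group-based DS need it
};
```

With this, `MDRQ<3, SumGroup>` would answer 3D range-sum queries.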

Notice that this implementation is not sparse/dynamically sized: we define beforehand the corresponding size in each dimension. I believe that for most structures this simplifies the implementation and usage, but I think it's also possible to implement some of them the other way. However, it may significantly affect the running time in problems where this wouldn't be necessary. Also, notice that the number of dimensions of our structures should be known beforehand in order to instantiate them, as it's a template parameter (if the dimension of the problem is variable you may have to do some ifs to solve the problem in the correct dimension).
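That last point can be sketched like this (solve<D> here is a hypothetical placeholder for a dimension-specific solution, not something from the code above): since the dimension is a template parameter, every supported value must be instantiated explicitly and selected at runtime.

```cpp
// Dispatching a runtime-chosen dimension to compile-time instantiations.
// solve<D>() stands in for "build an MDRQ<D, S> and solve the problem";
// here it just returns D so the dispatch itself can be checked.
template<int D> int solve(){ return D; } // placeholder body for illustration

int solve_for(int dim){
    if (dim == 1) return solve<1>();
    if (dim == 2) return solve<2>();
    if (dim == 3) return solve<3>();
    return -1; // dimension not supported by this build
}
```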

That will be the skeleton for our implementation of MDRQDS. Now, depending on the type of problem we want to solve and the restrictions on the operation to be applied in the range, we will insert the data into the structure and answer queries in different ways.

Static Range Queries

The general idea to solve static range query problems is to do some preprocessing and then answer queries very quickly. We should add these two methods in our implementation:

//... In the multidimensional version:
    MAs void set(T x, int p, As... ps){
        //we want to set the value of the point (p, ps...) to x
        v[p].set(x, ps...); 
        /* the structure corresponding to the coordinate p (or range [p, p])
        will propagate the recursion in the correct direction */
    }
    void init(){
        //we will call this to do the preprocessing, becoming able to answer queries
    }
//...
//... In the 0D version:
    void set(T x){val = x;} //setting the value to x
    void init(){} //no preprocessing needed
//...

The interface will end up looking like this:

void example(){
    MDRQ<3, S> ds(3, 4, 5); //3x4x5 3D structure based on S
    ds.set(val_1, 0, 2, 1); //point (0, 2, 1) has value = val_1
    ds.set(val_2, 2, 3, 0); //point (2, 3, 0) has value = val_2
    ds.set(val_3, 1, 0, 4); //point (1, 0, 4) has value = val_3
    ds.init(); //ready to answer queries
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_1 and val_3 influence this query
}

We can see how simple it will be in practice to use the structures to solve multidimensional problems. However, we have to actually develop these structures before solving anything.

Psum

As mentioned before, Psums are used to answer static range queries when we are working with a Group operation (like sums and xor), which for higher dimensions should be commutative (i.e. an Abelian Group). The idea in dimension $$$1$$$ is to calculate the answer for all prefixes of a list — we preprocess this in $$$O(n)$$$, where $$$n$$$ is the list size. Now, we can answer queries in $$$O(1)$$$: answer($$$[l,\ r]$$$) = answer($$$[0,\ r]$$$) $$$*$$$ (answer($$$[0,\ l-1]$$$))$$$^{-1}$$$, where $$$*$$$ is the Group operation and $$$a^{-1}$$$ is the inverse of $$$a$$$. We will proceed exactly the same way in any dimension $$$D$$$.

Note: in our Psum code, we will use $$$1$$$-indexing, because it makes the implementation easier. However, we will consider that set and query will always be called with $$$0$$$-indexing.

Firstly, we should process the prefixes. Suppose we have a $$$n$$$-sized list of $$$(D-1)$$$-dimensional Psums, and each of them has already concluded the preprocessing for lower dimensions. Now, we will, for each prefix of the list, simply calculate the merge of all Psums contained in it:

//... In the multidimensional version:
    void init(){
        //we will use 1-indexing to access internal Psums, located in "v", which has size n
        for(int i = 1; i < n; i++){
            v[i].init(); //lower dimensions processed first
            v[i].merge(v[i-1]); //v[i] will merge itself with v[i-1]
            //v[0] is a Psum filled with S::id and v[i] is the merge of Psums representing [0, i-1]
        }
    }
    void merge(Psum& p){ //merging psums
        for(int i = 1; i < n; i++){
            v[i].merge(p.v[i]); //recursive definition of merge
        }
    }
//...
//... In the 0D version:
    //merging in dimension 0 is simply applying the operation
    void merge(Psum& p){val = S::op(val, p.val);} 
//...

We can use induction to prove that, if all dimensions have size $$$n$$$ (we will always assume this in calculations for simplicity), we are in dimension $$$D$$$, and S::op has complexity $$$op$$$, the init method has complexity $$$O(D \cdot op \cdot n^D)$$$.

After finishing the preprocessing, we will want to answer queries. We can simply use the same logic for the $$$1D$$$ case, but propagating the query to lower dimensions:

//...
//If S is a group, it should have an inv() function: returning the inverse of the element
    MAs T query(int l, int r, As... ps){ //multidimensional query
        return S::op(v[r+1].query(ps...), S::inv(v[l].query(ps...))); //ans([0, r]) * (ans([0, l-1]))^-1
        /* note the use of 1-indexing: when l is 0, instead of accessing an invalid position,
        we will just be accessing a Psum filled with identities, which won't cause errors */
    }
//...

If $$$inv$$$ is the complexity of S::inv, we can see that this is $$$O((op + inv) \cdot 2^D)$$$. Joining all the code, we get the following (fully functional) implementation:

Code

The memory used will be something like $$$n^D \cdot sizeof(T)$$$.

You can use this structure in the following problems:

DiST

If we have to work with Monoid operations (like multiplication and minimum/maximum) instead of Groups, we can use DiST for static range queries. This DS is not as well-known as the others; you can read more about it here and here. The idea is similar to Segment Trees and Merge Sort Trees: we will consecutively divide our list in halves and each node will maintain information about a sublist of our list.

What a $$$1D$$$ DiST does is to store, for each of these sublists, all steps of calculating the operation from the middle of the sublist to its beginning and to its end (with $$$O(n \cdot log(n))$$$ total complexity). To answer a query $$$[l,\ r]$$$, we will locate the node in the DiST which can answer the query in $$$O(1)$$$ by merging the answers from its middle to $$$l$$$ and from its middle to $$$r$$$. This location is found using bitwise operations.

Note: we will use $$$0$$$-indexing for DiST, since it works well with the common implementation.

DiST requires that its size, $$$n$$$, is a power of two. We will also store the value of $$$h = log_2(n)$$$. If lg(x) returns $$$\lfloor log_2(x) \rfloor$$$, we can initialize our structure like this:

#define MAs template<class... As>
template<int D, class S>
struct DiST{
    using T = typename S::T;
    int lg(int x){return __builtin_clz(1)-__builtin_clz(x);} //log_2 floor
    int n, h; vector<vector<DiST<D-1, S>>> v; //DiST stores a matrix instead of a list
    MAs DiST(int s, As... ds):n(1<<(lg(s)+(s!=(1<<lg(s))))), //n is the smallest power of two >= s
    h(lg(n)), v(h+(n==1), vector(n, DiST<D-1, S>(ds...))){} //storing h and dealing with corner case n = 1
//...

As always, we will process lower dimensions recursively. Initially, we will have a $$$n$$$-sized list of $$$(D-1)$$$-dimensional DiSTs. Now, we should consecutively partition the list in blocks of size $$$s$$$, where $$$s$$$ is a power of two. In each block, we will merge the DiSTs from the middle to the left and from the middle to the right and store each step:

//... In the multidimensional version:
    void init(){
        for(int i = 0; i < n; i++)v[0][i].init(); //v[0] is the base list of DiSTs
        //lower dimensions have already been processed
        for(int d = 1, s = 2; d < h; d++, s *= 2) //blocks of size 2s, s = 2^d
        for(int m = s; m < n; m += 2*s){ //iterating the middle of each block
            v[d][m] = v[0][m]; v[d][m-1] = v[0][m-1]; //middle of the block is between m and m-1
            //having processed the middle, we will merge structures in both directions:
            //from the middle to the right
            for(int i = m+1; i < m+s; i++)v[d][i].merge(v[d][i-1], v[0][i]);
            //from the middle to the left
            for(int i = m-2; i >= m-s; i--)v[d][i].merge(v[0][i], v[d][i+1]);
            /* v[d][i] is some information on a block of size 2^(d+1). It may be the middle of a block or
            the result of merging data structures from the middle to the left or right of the block */
        }
    }
    void merge(DiST& a, DiST& b){ //merging two DiSTs
        for(int d = 0; d < h; d++)
            for(int i = 0; i < n; i++)
                v[d][i].merge(a.v[d][i], b.v[d][i]); //recursive definition of merge
    }
//...
//... In the 0D version:
    void merge(DiST& a, DiST& b){val = S::op(a.val, b.val);} //base case: applying the operation
    void init(){} //no preprocessing needed
//...

You may verify that the complexity of this process is $$$O(op \cdot (2 \cdot n \cdot log(2 \cdot n))^D)$$$, where $$$n$$$ is the original size of the structure. The $$$2$$$ factors are due to rounding up $$$n$$$ to the nearest power of two.

To answer queries, we should firstly deal with the trivial case $$$l = r$$$, in which we can simply propagate the query to the $$$l$$$-th DiST in our list. The basis of the DiST's efficiency is that, if $$$l \neq r$$$, we can take the value $$$k = $$$ lg( $$$l \oplus r$$$ ) and be sure that a block of size $$$2^{k+1}$$$ has $$$l$$$ located to the left and $$$r$$$ to the right of its middle. Since we already processed this block, the query becomes simple:

//...
    MAs T query(int l, int r, As... ps){ //multidimensional query
        if (l==r)return v[0][l].query(ps...); //trivial case
        int k = lg(l^r);
        return S::op(v[k][l].query(ps...), v[k][r].query(ps...));
        /* v[k][l] is the merge from l to the middle of some block of size 2^(k+1) and v[k][r] is the merge
        from this middle to r. Thus, merging their answers will be the same as merging the whole range [l, r] */
    }
//...

Similarly to Psum query, this process is $$$O(op \cdot 2^D)$$$. The implementation is done:

Code

The memory used may get to $$$(2 \cdot n \cdot log(2 \cdot n))^D \cdot sizeof(T)$$$ due to power of two rounding.
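To see why $$$k = $$$ lg( $$$l \oplus r$$$ ) locates the right block, here is a small self-check (my own, not part of the DiST code): for $$$0 \le l \lt r$$$, the aligned block of size $$$2^{k+1}$$$ containing $$$l$$$ also contains $$$r$$$, with its middle strictly after $$$l$$$ and at or before $$$r$$$.

```cpp
// Verifying the DiST locating trick: for 0 <= l < r, with k = floor(log2(l^r)),
// l and r share an aligned block of size 2^(k+1) whose middle separates them.
int lg(int x){ return 31 - __builtin_clz(x); } // floor(log2(x)), x > 0

bool block_property(int l, int r){       // assumes 0 <= l < r
    int k = lg(l ^ r);
    int block = 2 << k;                  // block size 2^(k+1)
    int start = (l / block) * block;     // start of the aligned block containing l
    int mid = start + (block >> 1);      // middle of that block
    return start <= l && r < start + block && l < mid && mid <= r;
}
```

Intuitively, $$$k$$$ is the highest bit where $$$l$$$ and $$$r$$$ differ, so they agree on all bits above $$$k$$$ (same block of size $$$2^{k+1}$$$) and disagree at bit $$$k$$$ (opposite halves of it).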

The following problems can be solved using this structure:

The last three problems can be solved using the Sparse Table structure. It may be an interesting exercise to try and create your own multidimensional version of it using the same process we've used so far. You can compare yours to mine afterwards.

Range Queries + Point Updates

When we are required to process both range queries and point updates, our strategy will be a bit different. Our structure will be initialized with the identity value, and inserting the initial values will be the same as updating values. Update logic will be the most difficult part.

We know that in each dimension $$$D$$$, our DS will store a list of $$$(D-1)$$$-dimensional data structures, each representing the merge of data structures in some ranges. Let's suppose we know how to efficiently update in dimension $$$D-1$$$ (base case: updating a value in $$$0D$$$). When we want to update the point with coordinates $$$(p_1,\ p_2,\ \ldots)$$$ ($$$p_1$$$ is the coordinate in dimension $$$D$$$), we should only update the data structures representing the merge of ranges that contain $$$p_1$$$ (usually there will be $$$O(log(n))$$$ of them). For each of them, we will propagate a $$$(D-1)$$$-dimensional update with coordinates $$$(p_2,\ \ldots)$$$ that will correctly adjust the merges in each dimension with the updated value. The way this works in practice will be explored when studying specific data structures.

In the implementation, we won't be keeping the set and init methods; instead, we will use a different method depending on our structure. If we want to support updates of the form $$$past \ value = new \ value$$$, we will name it u_set, but if what we want to do is $$$past \ value = past \ value * new \ value$$$ (remember $$$*$$$ is the operation symbol), we will name it u_op.

If our structure had an u_set method, its implementation would look like this:

//...
//... In the multidimensional version:
    MAs void u_set(T x, int p, As... ps){ //set the value of point (p, ps...) to x
        set<int> U = related_to(p); 
        //suppose U contains the indexes of all data structures whose associated range contains p
        for(int i : U){
            T cur_x = recalculate(x, p, range_of(i));
            /* in each i we want to set the value of the point (ps...) in data structure i to the result of applying
            the operation to all points (j, ps...) such that j is in the range associated to the DS i. We need to be 
            able to quickly calculate this value considering that the value of (p, ps...) was changed to x. */
            v[i].u_set(cur_x, ps...); //we will propagate the update to lower dimensions
        }
    }
//...
//... In the 0D version:
    void u_set(T x){val = x;} //setting the value
//...

The usage will also be simple:

void example(){
    MDRQ<3, S> ds(3, 4, 5); //3x4x5 3D structure based on S
    ds.u_set(val_1, 0, 2, 1); //point (0, 2, 1) has now value = val_1
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_1 influences this query
    ds.u_set(val_2, 2, 3, 0); //point (2, 3, 0) has now value = val_2
    ds.u_set(val_3, 0, 2, 1); //point (0, 2, 1) has now value = val_3
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_3 influences this query
}

BIT

I probably wouldn't have written this blog, studied MDRQDS at a deeper level, or come to understand and implement them the way I do now if I hadn't read "Nifty implementation of multi-dimensional Binary Indexed Trees using templates." by mouse_wireless. My approach will be slightly different from theirs, mostly due to different objectives and coding styles — it may be a good idea to read their blog too and make your own implementation based on both.

We know that our multidimensional BIT should answer queries related to Abelian Groups (the same restriction as Psums). Besides, single values will be changed between queries. The modification that BITs support is the u_op type. However, since we are working with Groups, it's easy to create our own u_set when we want to:

    MAs void u_set(T x, int p, As... ps){
        T past_val = get_value(p, ps...); //suppose we know the current value at (p, ps...)
        u_op(S::op(S::inv(past_val), x), p, ps...);
        //past_val = past_val * (past_val)^-1 * x = x
    }

Note: as with Psums, we will use $$$1$$$-indexing when working with BITs. It's important for the idea and makes the implementation easier. As always, we will still assume that the interface methods ( query and u_op ) use $$$0$$$-indexing.

What a $$$1D$$$ BIT does is to store, in each $$$1$$$-indexed position $$$i$$$, the result of applying the operation on the subarray of size $$$lp(i)$$$ that ends in $$$i$$$, where $$$lp(i)$$$ is the largest power of two that divides $$$i$$$. This enables us to get the answer corresponding to any prefix by simply applying the operation to the stored values at $$$O(log(n))$$$ positions (similarly to Psums, we can transform a query $$$[l,\ r]$$$ into two prefix queries).
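Here is that $$$1D$$$ behavior in isolation for the sum group (my own minimal sketch, not the generic code developed below), with the internal $$$1$$$-indexing and the $$$0$$$-indexed interface the blog assumes:

```cpp
#include <vector>
#include <cstdint>

// Minimal 1D BIT for sums. Internally 1-indexed: t[i] stores the sum of the
// subarray of size lp(i) = i & -i that ends at position i.
struct Bit1D {
    int n; std::vector<int64_t> t;
    Bit1D(int n) : n(n), t(n + 1, 0) {}
    void u_op(int p, int64_t x){             // 0-indexed: A[p] becomes A[p] + x
        for (p++; p <= n; p += p & -p) t[p] += x; // all positions covering p
    }
    int64_t pref(int p){                     // sum of the 1-indexed prefix [1, p]
        int64_t s = 0;
        for (; p; p -= p & -p) s += t[p];    // O(log n) stored blocks
        return s;
    }
    int64_t query(int l, int r){ return pref(r + 1) - pref(l); } // 0-indexed [l, r]
};
```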

Our multidimensional BIT will work basically the same way. To get the answer for a query $$$[(0,\ l_2,\ \ldots),\ (r_1,\ r_2,\ \ldots)]$$$, we will iterate some positions of the BIT such that the ranges associated to these positions are disjoint and cover completely the range $$$[0,\ r_1]$$$. In each of these positions, we will propagate the query for lower dimensions (and, of course, the query in dimension $$$0$$$ is simply returning a value). The answer for this prefix will be the merge (applying the operation) of the answers calculated by the internal data structures in each of the iterated positions.

//...
//... In the multidimensional version:
    int lp(int i){return i&-i;} //largest power of two that divides i
    MAs T query(int l, int r, As... ps){
        T lv=S::id, rv=S::id; //answers of prefixes l-1 and r
        r++; //due to 1-indexing
        for(; r; r-=lp(r))rv=S::op(rv, v[r].query(ps...)); //r = 0 means the prefix ending in r was fully processed
        // we add to our answer the result of the range [r-lp(r)+1, r] then proceed to the range ending on r-lp(r)
        for(; l; l-=lp(l))lv=S::op(lv, v[l].query(ps...));
        // as always, we are propagating the query to lower dimensions
        return S::op(rv,S::inv(lv)); //use of group property
    }
//...
//... In the 0D version:
    T query(){return val;} //query in 0D is returning a value
//...

You may see that query complexity becomes $$$O(op \cdot (2 \cdot log(n))^D + inv \cdot (2 \cdot log(n))^{D-1})$$$. However, besides answering queries, we also have to process updates ( u_op ). Let's remember how the update is done in dimension $$$1$$$:

Suppose the value at position $$$i$$$, $$$A_i$$$ ($$$1$$$-indexed), is changed to $$$A_i * x$$$. This update will change the stored value in $$$O(log(n))$$$ positions in our BIT: all positions $$$j$$$ such that $$$[j-lp(j)+1,\ j]$$$ contains $$$i$$$. Thus, to process an update, we should iterate all of these $$$j$$$ and update the value stored at $$$j$$$, $$$val_j$$$, to $$$val_j * x$$$. This works due to commutativity: since $$$val_j$$$ can be expressed as $$$A_{j-lp(j)+1} * \ldots * A_i * \ldots * A_j$$$, after this update we have (at $$$j$$$): $$$A_{j-lp(j)+1} * \ldots * (A_i * x) * \ldots * A_j = (A_{j-lp(j)+1} * \ldots * A_i * \ldots * A_j) * x = val_j * x$$$.

We can easily generalize this to any dimension $$$D$$$, with the base case being updating in dimension $$$0$$$ (setting $$$val$$$ to $$$val * x$$$). Let's say the point of the update is $$$(p_1,\ p_2,\ \ldots)$$$ and the value of the update is $$$x$$$. We will iterate all internal data structures whose range contains $$$p_1$$$ and, in each of them, update its point $$$(p_2,\ \ldots)$$$ with the value $$$x$$$. We are doing exactly what we should do: for all positions (in all dimensions) related to the point $$$(p_1,\ p_2,\ \ldots)$$$, the value $$$val$$$ of this position will become $$$val * x$$$. We can see that this is simpler than the u_set skeleton we've seen in the previous section, due to the nature of these types of updates.

//...
//... In the multidimensional version:
    MAs void u_op(T x, int p, As... ps){//updating (p, ps...) with x
        for(p++; p < n; p += lp(p)){//p++ is for 1-indexing
            /* p += lp(p) proceeds to the next endpoint of a range containing p. You can verify this by
            proving that this endpoint's range contains p and no other endpoints' ranges up to it contain p. */
            v[p].u_op(x, ps...); //propagating the update for lower dimensions
        }
    }
//...
//... In the 0D version:
    void u_op(T x){val = S::op(val, x);} //u_op in 0D
//...

It's easy to see that the update complexity is $$$O(op \cdot log(n)^D)$$$. And this concludes our implementation:

Code

We end up using around $$$n^D \cdot sizeof(T)$$$ memory. Now you can solve the following problems:

SegTree

Again: Groups are nice, but sometimes we are forced to work with Monoids. Range query + point update with Monoids is the most general version of the range query problem we will study here, solvable by one of the most famous data structures in competitive programming: Segment Tree.

Note: in our SegTree, we will index the root at $$$1$$$ and the query logic will be based on half-open intervals ($$$[l,\ r[$$$). However, we will still consider that query and update methods are called with $$$0$$$-indexing and closed intervals ($$$[l,\ r]$$$).

Each node in the SegTree stores the merge of a range whose size is a power of two. We will use the iterative implementation as a basis, as it's shorter, faster and easier to adapt for our needs. You can read more about it in this amazing blog.

Queries in SegTree are simple. In $$$1D$$$, what we do is select some nodes in the SegTree such that their associated ranges are disjoint and they cover completely the range of the query. Then, the answer of the query is simply merging (applying the operation to) the values stored at each of these nodes (similarly to BITs).
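As a standalone illustration of that node-selection logic (my own $$$1D$$$ sketch for range minimum, a Monoid with no inverse; since min is commutative I collapse the two accumulators into one, unlike the generic code below):

```cpp
#include <vector>
#include <algorithm>
#include <climits>

// Minimal 1D iterative SegTree for range minimum. Leaves live at
// positions n..2n-1 and node i/2 is the parent of node i.
struct SegTree1D {
    int n; std::vector<int> t;
    SegTree1D(int n) : n(n), t(2 * n, INT_MAX) {}
    void u_set(int p, int x){                // 0-indexed point assignment
        for (t[p += n] = x; p /= 2; ) t[p] = std::min(t[2*p], t[2*p+1]);
    }
    int query(int l, int r){                 // 0-indexed closed range [l, r]
        int res = INT_MAX;
        for (l += n, r += n + 1; l < r; l /= 2, r /= 2){ // half-open [l, r+1[
            if (l & 1) res = std::min(res, t[l++]);      // l is a right child
            if (r & 1) res = std::min(res, t[--r]);      // r-1 is a left child
        }
        return res;
    }
};
```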

Now, suppose we want to answer a query $$$[(l_1,\ l_2,\ \ldots),\ (r_1,\ r_2,\ \ldots)]$$$ in dimension $$$D$$$ and we already know how to answer $$$D-1$$$-dimensional queries in our multidimensional SegTree. Then, answering a query will be simply computing the merge of answers of the $$$D-1$$$-dimensional query $$$[(l_2,\ \ldots),\ (r_2,\ \ldots)]$$$ given by internal data structures whose associated ranges are disjoint and cover completely the range $$$[l_1,\ r_1]$$$.

//...
//... In the multidimensional version:
    MAs T query(int l, int r, As... ps){
        T lv=S::id, rv=S::id;
        /* in a usual iterative SegTree, we would calculate the merge of node values in a way to maintain the
        order of operations, using a value to calculate from left to right and other to calculate from right
        to left. This is not necessary here, due to the restriction of commutativity, but I believe it helps
        resembling a usual SegTree (and it does not change the complexity). */
        for(l += n, r += n+1; l < r; l /= 2, r /= 2){ //i/2 is the parent of i
            //i+n corresponds to the position of a leaf in the SegTree representing the range [i, i]. 
            /* summing n+1 to r instead of n is to get the answer of [l,r] through a query in the half-open
            interval [l,r+1[. */
            if (l&1)lv = S::op(lv, v[l++].query(ps...)); //node at l is in the border of query range
        if (r&1)rv = S::op(v[--r].query(ps...), rv); //node at r-1 is in the border of query range
            //we propagate the query for lower dimensions
        }
        return S::op(lv, rv); //merge values from left and right
    }
//...
//... In the 0D version:
    T query(){return val;} //query in 0D is returning a value
//...

We can see that the complexity of query is $$$O(op \cdot (2 \cdot log(n))^D)$$$. Now, we should worry about updates. A SegTree can actually handle both u_op and u_set updates, but we will implement only u_set. Whenever we want to, we can create our own u_op:

//...
    MAs void u_op(T x, int p, As... ps){//set the value val of the point (p, ps...) to val * x
        T val = get_value(p, ps...); //suppose we know the current value at (p, ps...)
        u_set(S::op(val, x), p, ps...); //simply set val to val * x
    }
//...

In a SegTree, the update is as follows: firstly, update the leaf, and then update the ancestors from the leaf's parent up to the root using their children. Let's understand better how to unite this idea with the u_set skeleton we have seen earlier:

Suppose we know how to update $$$(D-1)$$$-dimensional SegTrees (base case: $$$0D$$$, a single value). We want to support updates of the form: set the value of the point at coordinates $$$(p_1,\ p_2,\ \ldots)$$$ to $$$x$$$ and update our $$$D$$$-dimensional SegTree accordingly. Firstly, we will update our internal data structure (a node) corresponding to the range $$$[p_1,\ p_1]$$$ by propagating an update with the value $$$x$$$ at the point $$$(p_2,\ \ldots)$$$. Then, we will successively update the parents of the current node. In each of them, we will propagate an update at the point $$$(p_2,\ \ldots)$$$ with a value $$$y$$$: the merge of values at points $$$(i,\ p_2,\ \ldots)$$$ for $$$i$$$ in the parent's associated range. This $$$y$$$ can be easily computed, because the children have already been updated: $$$y$$$ is simply the merge of the values of point $$$(p_2,\ \ldots)$$$ in the left child and right child.

We can implement this using an additional helper method:

//...
//... In the multidimensional version:
    MAs T get(int p, As... ps){ //return the value of point (p, ps...)
        return v[p+n].get(ps...); //v[p+n] is the leaf associated to the range [p, p]
    }
    MAs void u_set(T x, int p, As... ps){
        v[p+=n].u_set(x, ps...); //we should update the corresponding leaf first
        while(p/=2){ //then iterate ancestors up to the root
            T y = S::op(v[2*p].get(ps...), v[2*p+1].get(ps...)); //merge values of updated children
            v[p].u_set(y, ps...);
        }
    }
//...
//... In the 0D version:
    T get(){return val;} //the value at this point
    void u_set(T x){val = x;} //setting the value of the point
//...

You may verify that the update complexity is $$$O(op \cdot log(n)^D)$$$. We now have a functional Multidimensional Segment Tree:

Code

This implementation uses around $$$O((2 \cdot n)^D \cdot sizeof(T))$$$ memory.

I wasn't able to find a good amount of multidimensional range query problems not solvable by any of the structures we have studied before (most are solvable with BITs or DiSTs). If you want, you may go back to the recommended problems in previous sections and try to solve them with SegTree, although for some of them it may be impossible.

You can use the multidimensional SegTree at:

Conclusion

I hope this blog has helped you better understand multidimensional range query data structures or, at least, showed an interesting approach to them. You should now be ready to solve range query problems in any dimension! No more suffering with $$$2D$$$ or $$$3D$$$ implementations, and if you ever want to solve something $$$4D$$$, $$$5D$$$ or any $$$D$$$, it won't be any more difficult to solve or implement than a $$$1D$$$ structure.

On the downside, these types of problems aren't very common out there. I think it's actually very rare to find range query problems where a multidimensional data structure is part of a concrete solution. However, as seen in this blog, implementing MDRQDS can be very simple and the idea is not difficult — there is no reason why this topic shouldn't appear more in competitions, and many variations of the multidimensional range query problem can be explored. I'm talking to you, fellow problemsetters!

Of course, we should note there is no magic going on here. Complexity increases quickly as dimension increases, both in time and space, and usually this is an important part of problems. As I have stated earlier, my goal with the code was to provide something scalable, generic, understandable and efficient — specific problems may require specific techniques. A lot of recurring topics in range query problems weren't discussed here (range updates, coordinate compression, persistency...).

Also, even though we've only properly addressed four RQDS, there are many more out there, solving different problems or the same ones in different complexity. I would be very interested in seeing how you would implement multidimensional versions of them! I will try to keep my MDRQDS repository updated with new structures I eventually implement.

Lastly, I would like to thank mouse_wireless (again) for helping me understand C++ and MDRQDS a lot more through their blog, duduFreire, Vilsu, dharinha and MagePetrus for giving me feedback on this blog, and you for reading it!

Feel free to ask questions, share problems and express your opinions in the comments (:
