[Tutorial] On Applications of CDQ Divide & Conquer

This tutorial is nominated for the blog posts competition from cadmiumky. Thanks to him for this initiative!

Also thanks to Ghos007, dima210121012101, dmitryAdams and ChatGPT for proofreading this post and giving useful feedback.

CDQ Divide & Conquer is an interesting algorithmic technique that nobody talks about (at least in Russian CP community). However, it is quite powerful, as it often provides unexpectedly clean solutions to seemingly tedious problems. In this tutorial, I will discuss different contexts where this method is useful. I also want to collect what I have seen about this technique on the Internet in one place and add my own insights. Credit to robert1003's blog where I first learned this method.

The main idea of this technique is to calculate the influence/contribution of one half to the other while dividing. Let’s see how this can be useful.

3D Queries

Problem. You are given $$$n$$$ points $$$(x_i, y_i, z_i)$$$ and $$$q$$$ queries $$$(a_i, b_i, c_i)$$$. For each query $$$i$$$, you need to count the number of points $$$j$$$ such that $$$x_j \lt a_i$$$, $$$y_j \lt b_i$$$ and $$$z_j \lt c_i$$$.

The 2D version of this problem is well known and can be solved using sweepline with Fenwick tree, but the 3D case seems tedious. With CDQ, however, it becomes much simpler.

As in a typical sweepline problem, define events $$$(x, y, z, \text{type})$$$, where $$$\text{type} = 1$$$ means "add a point $$$(x, y, z)$$$", $$$\text{type} = 2$$$ means "count the number of points with all coordinates smaller than $$$(x, y, z)$$$". Sort the events by $$$x$$$. If two events have the same $$$x$$$ value, sort them by type in descending order so that type 2 comes before type 1. This is needed to ensure that points with $$$x_j = a_i$$$ are not counted.

Then write a function $$$\text{solve3d}(l, r)$$$ that solves the problem as if only the events in the interval $$$[l, r)$$$ existed. The idea is simple:

If $$$r - l = 1$$$, return.
Solve the problem recursively for the left half: $$$\text{solve3d}(l, \text{mid})$$$
Compute the contribution of the left part to the right — that is, how update queries from the left affect calculation queries from the right. Details of this computation are explained below.
Solve the problem recursively for the right half: $$$\text{solve3d}(\text{mid}, r)$$$

The nice thing about partitioning into halves is that, for each "add point" $$$(x_i, y_i, z_i)$$$ event from the left part, we have $$$x_i \lt a_j$$$ for every "calculate" event $$$(a_j, b_j, c_j)$$$ in the right part because of the sorting. So for one dimension, the inequality holds automatically. The problem now reduces to this:

Problem (reduced). You are given $$$n$$$ points $$$(y_i, z_i)$$$ from the left half and $$$q$$$ queries $$$(b_i, c_i)$$$ from the right half. For each query $$$i$$$, count the number of points $$$j$$$ such that $$$y_j \lt b_i$$$ and $$$z_j \lt c_i$$$.

This is the standard 2D problem (for convenience, assume it's implemented in a $$$\text{solve2d}$$$ function). Once we computed the full contribution of the left half, we treat the halves as independent. The time complexity is $$$\mathcal{O}(n \log^2 n)$$$, since we perform an $$$\mathcal{O}(n \log n)$$$ sweepline on each of $$$\log n$$$ layers of D&C.

You can see the implementation of this algorithm in the next section.

Note that the algorithm above does not rely on the fact that each point contributes exactly $$$1$$$ to the answer. It would still work if points had arbitrary weights. Also you can query sums on arbitrary 3D ranges by doing inclusion-exclusion like in prefix sums.

Tasks for practice

APIO 2019 Street Lamps

JOISC 2019 Examination

JOI Final Round 2020 Fire

ROI Regional Stage 2026 Sliding Windows

Arbitrary-dimensional queries

The recursive algorithm above just drops one dimension. To handle a fourth dimension, for example, you only need to write a $$$\text{solve4d()}$$$ function that calls $$$\text{solve3d()}$$$ to compute the contribution.

This method generalizes to any number of dimensions, giving a time complexity of $$$\mathcal{O}(n \log^{d - 1} n)$$$ for $$$d$$$ dimensions.

Implementation

struct cdq_solver {
  struct query {
    array<int, 4> point;
    long long weight;
  };
 
  struct fenwick {
    int size;
    vector<long long> bit;
 
    fenwick(int n) : size(n), bit(n + 1) {}
 
    void update(int i, long long value) {
      for (i++; i <= size; i += i & -i) {
        bit[i] += value;
      }
    }
 
    long long query(int r) {
      long long answer = 0;
      for (; r > 0; r -= r & -r) {
        answer += bit[r];
      }
      return answer;
    }
  };
 
  vector<query> queries;
  vector<long long> result;
 
  void add_point(int a, int b, int c, int d, long long weight) {
    queries.push_back({{a, b, c, d}, weight});
  }
 
  void add_query(int a, int b, int c, int d) {
    queries.push_back({{a, b, c, d}, INF});
  }
 
  bool comp(int qa, int qb, int start) {
    for (int i = start; i < 4; i++) {
      if (queries[qa].point[i] != queries[qb].point[i]) {
        return queries[qa].point[i] < queries[qb].point[i];
      }
    }
    return queries[qa].weight > queries[qb].weight;
  }
 
  void solve2d(vector<int> query_ids) {
    sort(query_ids.begin(), query_ids.end(), [&](int x, int y) {
      return comp(x, y, 2);
    });
    vector<int> diff;
    for (int i : query_ids) {
      diff.push_back(queries[i].point[3]);
    }
    sort(diff.begin(), diff.end());
    diff.resize(unique(diff.begin(), diff.end()) - diff.begin());
    fenwick bit(diff.size());
    for (int i : query_ids) {
      int y = lower_bound(diff.begin(), diff.end(), queries[i].point[3]) - diff.begin();
      if (queries[i].weight == INF) {
        result[i] += bit.query(y + 1);
      } else {
        bit.update(y, queries[i].weight);
      }
    }
  }
 
  void solve3d(vector<int> query_ids) {
    if (query_ids.size() <= 1) {
      return;
    }
    sort(query_ids.begin(), query_ids.end(), [&](int x, int y) {
      return comp(x, y, 1);
    });
    int mid = (int)query_ids.size() / 2;
    solve3d(vector<int>(query_ids.begin(), query_ids.begin() + mid));
    solve3d(vector<int>(query_ids.begin() + mid, query_ids.end()));
    vector<int> query_mid;
    for (int i = 0; i < mid; i++) {
      if (queries[query_ids[i]].weight != INF) {
        query_mid.push_back(query_ids[i]);   
      }
    }
    for (int i = mid; i < (int)query_ids.size(); i++) {
      if (queries[query_ids[i]].weight == INF) {
        query_mid.push_back(query_ids[i]);
      }
    }
    solve2d(query_mid);
  }
 
  void solve4d(vector<int> query_ids) {
    if (query_ids.size() <= 1) {
      return;
    }
    sort(query_ids.begin(), query_ids.end(), [&](int x, int y) {
      return comp(x, y, 0);      
    });
    int mid = (int)query_ids.size() / 2;
    solve4d(vector<int>(query_ids.begin(), query_ids.begin() + mid));
    solve4d(vector<int>(query_ids.begin() + mid, query_ids.end()));
    vector<int> query_mid;
    for (int i = 0; i < mid; i++) {
      if (queries[query_ids[i]].weight != INF) {
        query_mid.push_back(query_ids[i]);   
      }
    }
    for (int i = mid; i < (int)query_ids.size(); i++) {
      if (queries[query_ids[i]].weight == INF) {
        query_mid.push_back(query_ids[i]);
      }
    }
    solve3d(query_mid);
  }
 
  vector<long long> solve() {
    vector<int> query_ids(queries.size());
    iota(query_ids.begin(), query_ids.end(), 0);
    result.resize(queries.size());
    solve4d(query_ids);
    vector<long long> answer;
    for (int i = 0; i < (int)query_ids.size(); i++) {
      if (queries[query_ids[i]].weight == INF) {
        answer.push_back(result[i]);
      }
    }
    return answer;
  }
};

Tasks for practice

You can try to apply 4D CDQ in IZhO 2020 Nasty Donchick, good luck :)

I didn't manage to fit 40 4-dimensional CDQs in the limit though even with heavy constant factor optimizations.

Generalization for min/max

By the way, this method works not only for sums but also for min/max. The only limitation is that you can compute min/max only over prefix ranges, not over general ones directly, because the operation is not invertible. There is a workaround though: just add more dimensions!

For example, suppose we want the minimum on $$$[x_l, x_r] \times [y_l, y_r] \times [z_l, z_r]$$$. Then we want points $$$(x, y, z)$$$ that satisfy $$$x_l \le x \le x_r$$$ and $$$y_l \le y \le y_r$$$ and $$$z_l \le z \le z_r$$$.

For the $$$z$$$-dimension, there's no such problem — we can update any segment in segment tree not just prefixes. For $$$x$$$ and $$$y$$$, however, we need to "double" them. That is, consider a 5D point $$$p = (x, x, y, y, z)$$$ and require $$$p_1 \ge x_l$$$, $$$p_2 \le x_r$$$, $$$p_3 \ge y_l$$$, $$$p_4 \le y_r$$$ and $$$z_l \le p_5 \le z_r$$$. Suffix queries can be handled similarly to prefix queries, for example, by multiplying the relevant coordinates by $$$-1$$$. The time complexity is $$$\mathcal{O}(n \log^4 n)$$$ because of 5-dimensional CDQ 💀

MEX on segment with modifications (offline)

Problem. You are given an array $$$a$$$ of $$$n$$$ integers. Also there are $$$q$$$ queries:

Type 1. Set $$$a_i := x$$$.
Type 2. Find $$$\text{MEX}(a_l, a_{l + 1}, \cdots, a_{r - 1}, a_r)$$$.

The idea is to maintain intervals where each value $$$x$$$ is excluded.

Formally, let the positions of $$$x$$$ be $$$p_1, p_2, \cdots, p_k$$$. Then, for each $$$i$$$, we create an interval $$$(p_i, p_{i + 1})$$$ with weight $$$x$$$. For convenience, we also add $$$(-1, p_1)$$$ and $$$(p_k, n)$$$. If $$$x$$$ does not appear in the array at all, we add a single interval $$$(-1, n)$$$. Now a query of type 2 can be answered by taking the minimum weight of all intervals fully covering $$$[l, r]$$$. Intuitively, if some occurrence of $$$x$$$ was before the left border $$$l$$$, and the next occurrence is after $$$r$$$, this means that $$$x$$$ does not appear in this segment.

Without modifications, this reduces the problem to "minimum in prefix subrectangle" queries: treat intervals as points and queries as subrectangles. The time complexity is $$$\mathcal{O}(n \log n)$$$.

To handle modifications, we need to update the set of intervals after each change. To do this, we store positions of each value $$$x$$$ in a std::vector of std::set’s. When processing a modification at position $$$i$$$, we remove $$$i$$$ from the set corresponding to the old value of $$$a_i$$$. Before the removal, $$$i$$$ splits one existing interval into two, so we delete the two intervals that had $$$i$$$ as an endpoint and add the merged interval formed by its neighbors. Then we set $$$a_i := x$$$ and insert $$$i$$$ into the set for the new value. This insertion splits one existing interval into two, so we delete the merged interval and add the two new intervals created by inserting $$$i$$$.

The main difficulty is that now the set of intervals is dynamic, i.e., intervals may be added or removed as the algorithm runs. The idea is to maintain the "time of life" of each object. That is, each interval is now described by four integers: $$$(l, r, T_l, T_r)$$$. This means that the interval was added at query with index $$$T_l$$$ (or $$$-1$$$ if it existed from the beginning) and was removed in query with index $$$T_r$$$ (or $$$q$$$ if it was never removed). Note that we describe a query by a triple $$$(l, r, T)$$$ now. The problem reduces to the following:

Problem (reduced). You are given $$$n$$$ points $$$(l_i, r_i)$$$ with weights $$$w_i$$$. Each point $$$i$$$ appears at time $$$lt_i$$$ and disappears at time $$$rt_i$$$. Also there are $$$q$$$ queries $$$(lq_i, rq_i)$$$. For each query, find the minimum weight of all points $$$j$$$ such that:

$$$l_j \lt lq_i$$$
$$$r_j \gt rq_i$$$
$$$j$$$ is "alive" at time $$$i$$$. Formally, $$$lt_j \le i \le rt_j$$$

At first glance, this looks like a four-dimensional problem, since each point is described by four parameters (and weight). However, there is a way to avoid this.

Run CDQ on events sorted by $$$l_i$$$ / $$$lq_i$$$. When computing the contributions, the inequality on left borders is satisfied automatically, so we can drop this dimension. Then run a sweepline over time. For each point, create two events: an add event at time $$$lt_i$$$ and a remove event at time $$$rt_i$$$. Also create one event for each query. When processing an add event for point $$$i$$$, insert weight $$$w_i$$$ at position $$$r_i$$$ in a segment tree. When processing a remove event, replace this value with $$$+\infty$$$. For each query $$$i$$$, query the minimum on the suffix $$$(rq_i, +\infty)$$$ of the segment tree.

For correctness, right borders must be unique. Otherwise, removing one point could incorrectly delete another point with the same $$$r$$$. This is easy to fix by compressing $$$r_i$$$ values and assigning unique indices (by sorting and using their order). The time complexity is $$$\mathcal{O}(n \log^2 n)$$$ because of three-dimensional queries.

Implementation

#include <bits/stdc++.h>
using namespace std;

struct query {
  int l, r, lt, rt, value;
  int pos;
};

struct modification {
  int l, r, i, t;
};

int Min(int a, int b) {
  return a < b ? a : b;
}

struct segment_tree {
  int n;
  vector<int> tree;
  vector<int> updates;

  segment_tree() {}

  segment_tree(int _n) : n(_n), tree(_n * 2, INT_MAX) {}

  void modify(int i, int v) {
    for (updates.push_back(i += n), tree[i] = v; i > 1; i >>= 1) {
      updates.push_back(i >> 1);
      tree[i >> 1] = Min(tree[i], tree[i ^ 1]);
    }
  }

  int query(int l, int r) {
    int ans = INT_MAX;
    for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
      if (l & 1) {
        ans = Min(ans, tree[l++]);
      }
      if (r & 1) {
        ans = Min(ans, tree[--r]);
      }
    }
    return ans;
  }

  void rollback() {
    for (int x : updates) {
      tree[x] = INT_MAX;
    }
    updates.clear();
  }
};

segment_tree ST;

void CDQ(vector<query>& queries, vector<int>& answers, int L, int R) {
  if (R - L == 1) {
    return;
  }
  int M = (L + R) / 2;
  CDQ(queries, answers, L, M);
  CDQ(queries, answers, M, R);
  vector<array<int, 4>> events;
  for (int i = L; i < M; i++) {
    if (queries[i].pos == -1) {
      events.push_back({queries[i].lt, -1, queries[i].r, queries[i].value});
      events.push_back({queries[i].rt, 1, queries[i].r, queries[i].value});
    }
  }
  for (int i = M; i < R; i++) {
    if (queries[i].pos != -1) {
      events.push_back({queries[i].lt, 0, queries[i].r, queries[i].pos});
    }
  }
  sort(events.begin(), events.end());
  for (auto [x, type, r, value] : events) {
    if (type == -1) {
      ST.modify(r, value);
    } else if (type == 0) {
      answers[value] = min(answers[value], ST.query(r, ST.n));
    } else {
      ST.modify(r, INT_MAX);
    }
  }
  ST.rollback();
}

int main() {
  cin.tie(nullptr)->sync_with_stdio(false);
  int n, q;
  cin >> n >> q;
  vector<int> a(n);
  for (int &v : a) {
    cin >> v;
  }
  vector<set<int>> pos(n + 1);
  for (int i = 0; i < n; i++) {
    pos[a[i]].insert(i);
  }
  vector<modification> mods;
  int T = 0;
  for (int i = 0; i <= n; i++) {
    int last = -1;
    pos[i].insert(-1);
    pos[i].insert(n);
    for (auto it = next(pos[i].begin()); it != pos[i].end(); it++) {
      int j = *it;
      mods.push_back({last, j, i, T});
      last = j;
    }
    mods.push_back({last, n, i, T});
  }
  vector<query> queries;
  int answer_c = 0;
  while (q--) {
    T++;
    char type;
    cin >> type;
    if (type == '?') {
      int l, r;
      cin >> l >> r;
      l--, r--;
      queries.push_back({l - 1, r + 1, T, T, -1, answer_c++});
    } else {
      int i, x;
      cin >> i >> x;
      i--;
      auto it = pos[a[i]].lower_bound(i);
      mods.push_back({*prev(it), *it, a[i], T});
      mods.push_back({*it, *next(it), a[i], T});
      mods.push_back({*prev(it), *next(it), a[i], T});
      pos[a[i]].erase(i);
      a[i] = x;
      pos[a[i]].insert(i);
      it = pos[a[i]].lower_bound(i);
      mods.push_back({*prev(it), *next(it), a[i], T});
      mods.push_back({*prev(it), *it, a[i], T});
      mods.push_back({*it, *next(it), a[i], T});
    }
  }
  sort(mods.begin(), mods.end(), [](modification a, modification b) {
    return a.l != b.l ? a.l < b.l : (a.r != b.r ? a.r < b.r : (a.i != b.i ? a.i < b.i : a.t < b.t));
  });
  T++;
  for (int i = 0; i < (int)mods.size(); i++) {
    int j = i;
    vector<int> times;
    while (j < (int)mods.size() && mods[i].l == mods[j].l && mods[i].r == mods[j].r && mods[i].i == mods[j].i) {
      times.push_back(mods[j++].t);
    }
    times.push_back(T);
    for (int k = 0; k < (int)times.size() - 1; k += 2) {
      queries.push_back({mods[i].l, mods[i].r, times[k], times[k + 1], mods[i].i, -1});
    }
    i = j - 1;
  }
  sort(queries.begin(), queries.end(), [](query a, query b) {
    return a.r != b.r ? a.r < b.r : a.pos > b.pos;
  });
  int last = -1;
  for (int i = 0; i < (int)queries.size(); i++) {
    int j = i;
    while (j < (int)queries.size() && queries[i].r == queries[j].r) {
      last = (queries[j].r = max(last + 1, queries[j].r));
      j++;
    }
    i = j - 1;
  }
  ST = segment_tree(last + 1);
  sort(queries.begin(), queries.end(), [](query a, query b) {
    return a.l != b.l ? a.l < b.l : (a.r != b.r ? a.r > b.r : (a.lt != b.lt ? a.lt < b.lt : a.rt > b.rt));
  });
  vector<int> answers(answer_c, INT_MAX);
  CDQ(queries, answers, 0, (int)queries.size());
  for (int value : answers) {
    cout << value << "\n";
  }
}

Tasks for practice

Long tour of Moscow Open Olympiad 2018-2019, problem "Dima and Array"

Online FFT

Problem. You are given arrays $$$a$$$, $$$b$$$ and $$$r$$$, all of length $$$n$$$. You need to calculate a recurrent array $$$dp_k = a_k \cdot \sum_{i + j = k}{(dp_i \cdot r_j)} + b_k$$$ also of length $$$n$$$.

The problem is that this recurrence depends on its previous terms so we can't solve it by just multiplying polynomials. Once again, we can apply CDQ on this DP.

Suppose we are computing the contribution of DP's in the range $$$[L, M)$$$ to the range $$$[M, R)$$$ (and we have already computed the DP values in the range $$$[L; M)$$$). For now, let's ignore $$$a_k$$$'s and $$$b_k$$$'s for a moment — we will apply the transform $$$dp_k := a_k \cdot dp_k + b_k$$$ inside the $$$\text{solve}(k, k + 1)$$$ call later. At this stage, we treat unprocessed $$$dp_k$$$ as $$$\sum_{i + j = k}{dp_i \cdot r_j}$$$. For our current ranges $$$[L; M)$$$ and $$$[M; R)$$$, what values of $$$i$$$ and $$$j$$$ are actually relevant for computing the influence? Exactly those that form cross contributions: $$$i \in [L; M)$$$ and $$$j \in [0; R - L)$$$. The second bound holds because the smallest contributing index is $$$i = L$$$ and the largest receiving index is $$$k = R - 1$$$, so the maximum possible difference is $$$(R - 1) - L \lt R - L$$$.

We can then use FFT to multiply $$$DP[L:M]$$$ with $$$r[0:R - L]$$$ (here ":" denotes slice as in Python). Then add the results of multiplications to the DP's from the right half. The time complexity is $$$\mathcal{O}(n \log^2 n)$$$ because we convolve arrays of total size $$$\mathcal{O}(n)$$$ in $$$\mathcal{O}(n \log n)$$$ time on each of $$$\mathcal{O(\log n)}$$$ recursion layers.

Implementation for a similar problem (without array a)

#include <bits/stdc++.h>
using namespace std;

const int MOD = 998244353;
vector<int> c, r, dp;

int power(int x, int y) {
  int result = 1;
  for (int base = x; y > 0; y /= 2, base = 1LL * base * base % MOD) {
    if (y & 1) {
      result = 1LL * result * base % MOD;
    }
  }
  return result;
}

void FFT(vector<int>& A, bool inverse = false) {
  int N = 1;
  while (N < (int)A.size()) {
    N *= 2;
  }
  A.resize(N);
  vector<int> rev_index(N);
  for (int len = 1; len < N; len *= 2) {
    for (int i = 0; i < len; i++) {
      rev_index[i] *= 2;
      rev_index[i + len] = rev_index[i] + 1;
    }
  }
  for (int i = 0; i < N; i++) {
    if (i < rev_index[i]) {
      swap(A[i], A[rev_index[i]]);
    }
  }
  for (int len = 1; len < N; len *= 2) {
    int omega = power(3, (MOD - 1) / (2 * len));
    for (int i = 0; i < N; i += len * 2) {
      int mult = 1;
      for (int j = 0; j < len; j++, mult = 1LL * mult * omega % MOD) {
        int x = A[i + j], y = 1LL * A[i + j + len] * mult % MOD;
        A[i + j] = x + y - (x + y >= MOD ? MOD : 0);
        A[i + j + len] = x - y + (x - y < 0 ? MOD : 0);
      }
    }
  }
  if (inverse) {
    reverse(A.begin() + 1, A.end());
    int inv = power(N, MOD - 2);
    for (int i = 0; i < N; i++) {
      A[i] = 1LL * A[i] * inv % MOD;
    }
  }
}

void CDQ(int L, int R) {
  if (R - L == 1) {
    dp[L] += c[L];
    if (dp[L] >= MOD) {
      dp[L] -= MOD;
    }
    return;
  }
  int mid = (L + R) / 2;
  CDQ(L, mid);
  vector<int> DP = vector<int>(dp.begin() + L, dp.begin() + mid);
  vector<int> A = vector<int>(r.begin(), r.begin() + R - L);
  DP.resize(2 * (R - L)), A.resize(2 * (R - L));
  FFT(DP);
  FFT(A);
  for (int i = 0; i < (int)DP.size(); i++) {
    DP[i] = 1LL * DP[i] * A[i] % MOD;
  }
  FFT(DP, true);
  for (int i = mid - L; i < 2 * (R - L) && L + i < R; i++) {
    dp[L + i] += DP[i];
    if (dp[L + i] >= MOD) {
      dp[L + i] -= MOD;
    }
  }
  CDQ(mid, R);
}

int main() {
  cin.tie(nullptr)->sync_with_stdio(false);
  int N;
  cin >> N;
  c.resize(N), r.resize(N);
  dp.resize(N);
  for (int &value : c) {
    cin >> value;
  }
  for (int &value : r) {
    cin >> value;
  }
  CDQ(0, N);
  for (int i = 0; i < N; i++) {
    cout << dp[i] << " \n"[i + 1 == N];
  }
}

Tasks for practice

JOISC 2023 Festival

CDQ + SOS DP

This is similar to the previous section. Let's look at the following DP:

Problem. You are given arrays $$$a$$$, $$$b$$$ and $$$r$$$, all of length $$$2^n$$$. You need to calculate recurrence $$$dp_i = a_i \cdot \sum_{j \subset i}{dp_j} + b_j$$$. That is, for each mask we take the sum of DPs of all submasks, multiply by given constant and add another constant.

The idea is simple: run CDQ D&C on masks. Again, we want to compute the influence of $$$[L; M)$$$ on $$$[M; R)$$$. The binary representations of numbers in the first half look like [LCP]0[anything] and in the second half [LCP]1[anything]. Here, LCP means longest common prefix. Let its length be $$$k$$$. When computing the influence, note that the submask condition automatically holds for the first $$$k + 1$$$ bits. On the remaining bits, it becomes a standard SOS DP on an array of size $$$2^{n - k - 1}$$$.

So the method is: recursively compute the left half, run SOS DP to propagate its contribution, then recursively compute the right half. At a leaf, apply $$$dp_i := a_i \cdot dp_i + b_i$$$. The time complexity is $$$\mathcal{O}(2^n \cdot n^2)$$$, because we do $$$O(2^n)$$$ work per level and there are $$$\log_2(2^n) = n$$$ levels.

Tasks for practice

USACO 2023 Feb Platinum, Problem Setting.

CDQ on Graph Trick

The following 1D problem is well known. I assume you already know how to solve it:

Problem (1D). You are given a directed graph on $$$n$$$ vertices. Initially, there are $$$m$$$ edges $$$(u_i, v_i)$$$. Then you need to process queries. Each query gives three integers $$$v$$$, $$$l$$$ and $$$r$$$ and asks you to add edges $$$(v, l), (v, l + 1), \cdots, (v, r - 1), (v, r)$$$. After all queries, find the shortest path between $$$s$$$ and $$$t$$$.

This can be solved in $$$\mathcal{O}(n \log n)$$$ using segment tree on graph trick. But there is an even harder problem that is still solvable using this trick:

Problem (2D). You are given a directed graph on $$$n$$$ vertices and an array $$$a$$$. Initially there are $$$m$$$ edges $$$(u_i, v_i)$$$. Then you need to process queries. Each query gives four integers $$$v$$$, $$$l$$$, $$$r$$$ and $$$x$$$ and asks you to add edges $$$(v, i)$$$ for each $$$i$$$ such that $$$l \le i \le r$$$ and $$$a_i \lt x$$$. After all queries, find the shortest path between $$$s$$$ and $$$t$$$.

Let's use persistent segment tree on graph trick to solve this problem. The idea is to do a sweepline where we sort by $$$a_i$$$ / $$$x$$$. We can treat $$$a_i$$$ as the "time" when $$$i$$$-th element appears (activates). Then, there are two types of events:

Activate $$$i$$$-th position
Add an edge from given node $$$v$$$ to all already activated nodes in range $$$[l; r]$$$

Initially there are no virtual nodes (as in implicit segment tree). For the queries of the first type, add the path from the root to $$$i$$$ in the segment tree. If some node from this path already exists, add a link from the new version to the old one. For the queries of the second type, as in the 1D problem, add edges from $$$v$$$ to the latest versions of nodes of segment tree that form range $$$[l; r]$$$. The copying of versions is needed to ensure that we don't break old second type queries by adding a new position that was not activated at that moment. The time complexity is $$$\mathcal{O}(n \log n)$$$.

Moreover, we can solve even the 3D version:

Problem (3D). You are given a directed graph on $$$n$$$ vertices and arrays $$$a$$$ and $$$b$$$. Initially there are $$$m$$$ edges $$$(u_i, v_i)$$$. Then you need to process queries. Each query gives you five integers $$$v$$$, $$$l$$$, $$$r$$$, $$$x$$$, $$$y$$$, and and asks you to add edges $$$(v, i)$$$ for each $$$i$$$ such that $$$l \le i \le r$$$, $$$a_i \lt x$$$ and $$$b_i \lt y$$$. After all queries, find the shortest path between $$$s$$$ and $$$t$$$.

The solution is to do CDQ on $$$b_i$$$, build a persistent segment tree on the nodes from the left half and use the 2D solution with queries from the right half (that have greater $$$y$$$'s than $$$b_i$$$'s from the left). The time complexity is $$$\mathcal{O}(n \log^2 n)$$$.

Of course, this algorithm generalizes to arbitrary number of dimensions $$$d$$$ by using $$$(d - 1)$$$-dimensional CDQ inside of $$$d$$$-dimensional CDQ.

Tasks for practice

I am not aware of any problems in which you can use the 3D algorithm. Here are some problems where you can use the 2D method:

JOISC 2020 Treatment Project

JOISC 2020 Capital City

Rev.	By	When	Δ	Comment
en5	tch1cherin	2026-02-14 00:40:55	211
en4	tch1cherin	2026-02-14 00:31:43	116	Tiny change: 't-1346828)\n\nGenera' -> 't-1346828).\n\nGenera'
en3	tch1cherin	2026-02-09 18:51:14	1691	Tiny change: 'ely add $\delta = [\t' -> 'ely add $\Delta = [\t'
en2	tch1cherin	2026-02-08 13:31:07	16	Tiny change: 'com/gym/101225)\n\nOnlin' -> 'com/gym/102069/problem/D)\n\nOnlin'
en1	tch1cherin	2026-02-08 11:58:57	27075	Initial revision (published)

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	turmax	3559
6	tourist	3541
7	strapple	3515
8	ksun48	3461
9	dXqwq	3436
10	Otomachi_Una	3413

#	User	Contrib.
1	Qingyu	157
2	adamant	153
3	Um_nik	147
3	Proof_by_QED	147
5	Dominater069	145
6	errorgorn	142
7	cry	139
8	YuukiS	135
9	TheScrasse	134
10	chromate00	133

3D Queries

Tasks for practice

Arbitrary-dimensional queries

Tasks for practice

Generalization for min/max

MEX on segment with modifications (offline)

Tasks for practice

Online FFT

Tasks for practice

CDQ + SOS DP

Tasks for practice

CDQ on Graph Trick

Tasks for practice

History