Divide and conquer. Dynamic connectivity offline and DP optimization. Divide and conquer by queries.

Правка en1, от Wind_Eagle, 2022-07-02 15:01:02

Hello, Codeforces! I've been wanting to write an educational blog for a long time, but I couldn't find a topic for it. And recently I remembered that Codeforces doesn't have a good blog on one of my favorite topics: the "divide and conquer" technique. So I want to talk about it. The outline will be something like this:

1) Divide and Conquer. What this technique is and what it is for. Example. 2) Dynamic connectivity offline with divide and conquer. Divide and conquer dp optimization. 3) Let me tell you about a method I invented myself: divide and conquer by queries. In this case, we divide not only the array, but also the queries into groups.

So, here we go.

First, about what this technique is. The point is about the following: let's say we have an array. Then, if we know how to calculate the answer for two parts of it, and if we know how to get the answer for the whole array from the answers of two parts, we can apply the technique of divide and conquer. Let's break it down on the simplest example: merge sorting.

In merge-sorting we have an array of $$$n$$$ elements, then, for simplicity, numbers. To sort the subarray $$$[l ... r]$$$, we sort the subarrays $$$[l ... m]$$$ and $$$[m + 1 ... r]$$$, where $$$l \le m \le r$$$, then merge the two sorted arrays into one. Let's see how to do this in the easiest way.

In fact, let us have two arrays. We know that they are sorted, for example, by non-decreasing. Then to merge the two arrays into one, we can apply the following algorithm:

1) If the first array is empty, add the second array to the array with the result and quit. 2) If the second array is empty, add the first array to the array with the result and quit. 3) If both arrays are not empty, then take the smallest of the first two elements of the arrays, add it to the answer and remove it from the original array.

Of course, we should not remove from the beginning of the array, but it is much better to store a pointer to the current element we are considering. If you write in C++, I recommend using the std::merge function instead of this handwritten algorithm. Still, sometimes it is useful to understand how it works.

Now we will act recursively: write a function sort(l, r) that sorts the segment from l to r:

sort(l, r):
    if l == r:
        return
    m = (l + r) / 2
    sort(l, m)
    sort(m + 1, r)
    merge(l, m, m + 1, r)

So now let's estimate the running time of such sorting. It seems logical to divide the array roughly in half, that is, $$$m = \lfloor \frac{l + r}{2} \rfloor $$$. Let's see how many operations our algorithm can perform. For simplicity, let's put the number of array elements $$$n = 2^k$$$. Then it is clear that combining subarrays of size $$$1$$$ will take a total of $$$O(n)$$$ operations, combining subarrays of size $$$2$$$ will take $$$O(n)$$$ operations, then $$$4$$$, $$$8$$$, and so on.

From this you can see that the algorithm will perform $$$O(n)$$$ actions when combining each of the <>, and since there are $$$k$$$ such layers, it works for $$$O(nk) = O(n \log n)$$$ in total. This construction resembles a tree of segments. In fact, it almost is. For example, if in a segment tree each vertex has a sorted subtraction corresponding to that vertex, then the amount of memory spent on it will be $$$O (n \log n)$$$, and so will the construction time (you can prove it by yourself as an exercise).

Some other algorithms, such as dynamic connectivity offline, are also based on the divide and conquer method. The challenge is to handle such requests offline:

1) Add an edge to the graph. 2) Remove an edge from the graph. 3) Check if two vertices lie in the same connectivity component.

This topic may be a bit complicated for beginners, so if it is difficult, you can skip it. Since I could not quickly find English-language materials on this method, I will describe it here, although it is quite well known.

The problem is not very difficult to solve using SQRT decomposition, but I will tell the solution for $$$O(q \log n \log q)$$$.

Suppose the graph has $$$n$$$ vertices and $$$q$$$ queries, moreover, the graph is initially empty. If this is not the case, then simply add queries to the beginning to add the original edges.

Now let's define for each edge a lifetime: a continuous interval of request numbers during which the edge exists. Each check request corresponds to some time: essentially, just its input number.

Now let us build such a segment tree: it will store the numbers of edges. Initially this segment tree is empty. Now for an edge with lifetime [l ... r] write this addition to the segment tree:

add(v, tl, tr, l, r):
    if l > r:
        return
    if tl == l && tr == r:
        tree[v].insert(edge)
        return
    tm = (tl + tr) / 2
    add(v * 2, tl, tm, l, min(r, tm))
    add(v * 2 + 1, tm + 1, tr, max(l, tm + 1), r)

That is, we wrote a segment tree that is very similar to the one that assigns to a segment, only without the lazy push-through. So what did we end up with? You can see that now the segment [l ... r] turns out to be divided into $$$O(\log n)$$$ segments from the segment tree. Let us now run the search in depth this way:

dfs(v, tl, tr):
    if tl == tr:
        answer_query(tl)
        return
    tm = (tl + tr) / 2
    edge_set.insert_edges(tree[v])
    dfs(v * 2, tl, tm)
    dfs(v * 2 + 1, tm + 1, tr)
    edge_set.erase_edges(tree[v])

Now notice that when we come to the sheet corresponding to the segment [w ... w], the $$$edge$$$_$$$set$$$ will contain those and only those edges that were <> at time $$$w$$$, so in fact we will get the list of edges that were at time $$$w$$$.

So, if we knew how to respond quickly to a list of edges, we would have a ready solution. Unfortunately, we can only add edges to a regular SNM, and with this solution we still need to remove them. That is why we need to use a persistent SNM. Don't be afraid, it doesn't work at all complicated. Let's write a regular SNM with, for example, rank heuristics, or size heuristics. Then every time an edge is added, it just changes the immediate ancestor of one of the vertices, and changes the rank/size of the subtree of one of the vertices. Let's add to a special array after each added edge information about which vertex and what changes were made. Then we can cancel the last added edge for $$$O(\log n)$$$. In principle, this is enough for us.

The asymptotic evaluation is now clear: adding one edge to the segment tree takes $$$O(\log q)$$$ operations, and the whole dfs will take $$$O(q \cdot t_1 + S \cdot t_2)$$$, where $$$t_1$$$ is time to respond to the request, $$$S$$$ is the total number of edges stored in the segment tree, and $$$t_2$$$ is time to add and remove one edge. Given that $$$t_1 = O(\log n)$$$, $$$S = O(q \log q)$$$, and $$$t_2 = O(\log n)$$$, we come to the required asymptotics.

Quite a challenge on this topic: https://mirror.codeforces.com/problemset/problem/813/F

Now briefly mention DP optimization divide and conquer. It allows you to reduce the running time of some DPs under certain conditions. You can read about it in English, for example, here: https://cp-algorithms.com/dynamic_programming/divide-and-conquer-dp.html

A task on this topic: https://mirror.codeforces.com/problemset/problem/868/F

Now let's get to the fun part: divide and conquer by queries. I came up with this method on my own while solving a problem (more on that later). I don't know if I'm the first person to come up with this method, but honestly, I haven't seen it before. If it's been described somewhere before me, please write me a comment about it.

This method is somewhat similar to the segment tree that stores queries, but it seems to me somewhat easier to write and understand.

Let's solve the following problem: https://mirror.codeforces.com/problemset/problem/763/E

Let's proceed this way: first we build a divide-and-conquer on an array. We can use it to construct an SNM for each segment in question, and we can also learn how to find the answer at the prefix and suffix of each segment: indeed, this is just the number of different connectivity components on the segment. We can do this by adding one vertex at a time to the SNM.

But how do we respond to requests? This is where divide and conquer by segment comes into play. When considering segment [l ... r], let us answer only such queries [ql ... qr] that $$$ql \le m < qr$$$, that is, only such queries whose left bound lies on segment [l ... m] and whose right bound lies on segment [m + 1 ... r]. Clearly, other queries will be considered either in [l ... m] or in [m + 1 ... r].

Now it becomes easy to answer such queries: we can take two SNMs, connect them into one, and draw edges on the boundary. Since $$$k$$$ is small, there will be few such connections.

Thus, we got a pretty fast and nice solution to this problem. You can see that it doesn't use a query-based segment tree.

Finally, let's look at another problem that can be solved using this method: https://mirror.codeforces.com/gym/101590/problem/D

Unfortunately, it only has a Russian condition, so I'll tell you what you need to do:

An array of $$$n$$$ triples of numbers is given, as well as the number $$$D$$$. Also given $$$q$$$ of queries with segments $$$l_i$$$ and $$$r_i$$$. You need to choose one number from each triplet on the segment, so that the sum is as large as possible, but still divisible by $$$D$$$. For example, if the triples (1, 2, 3), (4, 5, 6) and (1, 7, 8) are given, and $$$D = 8$$$, then you can choose $$$3$$$, $$$6$$$ and $$$7$$$.

$$$n \le 50,000, q \le 300,000, D \le 50$$$.

From this problem it is clear why a segment tree is not always better. In fact, let's try to write a segment tree, each vertex of which will have a DP on that segment. But then let's estimate the running time. Clearly, the DP will be one-dimensional: $$$dp[r]$$$, where $$$r$$$ from the word remainder -- remainder. It is not difficult to calculate for each remainder what the maximal sum will be.

But here's the problem: when it comes to the segment tree, it turns out that the union of two DUs works for $$$O(D ^ 2)$$$, and the segment tree simply will not pass!

But here we can apply divide and conquer to the queries: we will separately calculate LPs for the left half, and separately calculate LPs for the right half. We will store DPs for each suffix of the left half and for each prefix of the right half (remember, DPs are an array of $$$D$$$ numbers). We will answer only those queries that intersect both segments. To do this, simply take the DP of the corresponding suffix, and the DP of the corresponding prefix, and go through the remainder $$$r$$$ of the prefix (the remainder of the suffix will equal $$$D - r$$$). Thus, we got the solution for $$$O(q \cdot D + n \cdot \log n \cdot D)$$$.

I hope you enjoyed this blog. This is my first educational blog, and I have tried to write as clearly as possible. If you still have questions, feel free to leave them in the comments.

Теги divide and conquer, dp optimization, dynamic connectivity, range queries

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
ru12 Русский Wind_Eagle 2022-07-02 16:11:22 4
en7 Английский Wind_Eagle 2022-07-02 16:08:45 0 (published)
ru11 Русский Wind_Eagle 2022-07-02 16:08:38 0 (опубликовано)
en6 Английский Wind_Eagle 2022-07-02 16:08:28 88
ru10 Русский Wind_Eagle 2022-07-02 16:05:48 7 Мелкая правка: 'бра за $O(\log n)$. В прин' -> 'бра за $O(1)$. В прин'
en5 Английский Wind_Eagle 2022-07-02 15:16:58 14 Tiny change: 'a Russian condition, so I'll ' -> 'a Russian statem, so I'll '
en4 Английский Wind_Eagle 2022-07-02 15:14:33 4 Tiny change: 'n\nThanks for [user:gep' -> 'n\nThanks [user:gep'
ru9 Русский Wind_Eagle 2022-07-02 15:13:52 8
en3 Английский Wind_Eagle 2022-07-02 15:13:37 8
ru8 Русский Wind_Eagle 2022-07-02 15:09:52 80
en2 Английский Wind_Eagle 2022-07-02 15:09:32 383
en1 Английский Wind_Eagle 2022-07-02 15:01:02 11010 Initial revision for English translation (saved to drafts)
ru7 Русский Wind_Eagle 2022-07-02 14:56:57 197
ru6 Русский Wind_Eagle 2022-07-02 14:50:10 32
ru5 Русский Wind_Eagle 2022-07-02 14:49:33 44
ru4 Русский Wind_Eagle 2022-07-02 14:48:38 3 Мелкая правка: 'то в $edge\_set$ окажу' -> 'то в $edge$_$set$ окажу'
ru3 Русский Wind_Eagle 2022-07-02 14:48:24 1 Мелкая правка: 'то в $edge_set$ окаж' -> 'то в $edge\_set$ окаж'
ru2 Русский Wind_Eagle 2022-07-02 14:48:05 1
ru1 Русский Wind_Eagle 2022-07-02 14:47:06 11134 Первая редакция (сохранено в черновиках)