Sacha's blog

By Sacha, history, 2 months ago, In English

This blog post is a submission for the Codeforces Month of Blog Posts Pt. III challenge. Thank you cadmiumky for the initiative!

Hello, Codeforces! Recently I heard a short tale about an idea from algocourses.ru (credit to teraqqq), and I realized the idea is powerful, so it probably deserves more recognition.

This post was written to the best of my knowledge. If something is incorrect, please let me know!

Intro

Dynamic Forest queries are usually solved with a Link Cut Tree (LCT). However, for offline queries there is a powerful Divide & Conquer technique that avoids LCT's heavy implementation. This blog explains that technique. Spoiler: LCT is faster.


Related techniques (where $$$N$$$ denotes the number of vertices and $$$Q$$$ the number of queries):

Core idea:

Unlike Methods 1️⃣ and 2️⃣, we don't merge vertices. However, similar to both 1️⃣ and 2️⃣, we perform a DFS traversal on the segment tree built over the timeline. We observe that edges persisting throughout the current time segment can be compressed, provided we preserve the connectivity between vertices involved in that interval's queries.


I. Illustration of one application of this technique

We have $$$Q$$$ offline queries of the following types:

  1. link $$$u$$$ $$$v$$$ ($$$1 \le$$$ $$$u$$$, $$$v$$$ $$$\le N$$$) — add an undirected edge between the vertices $$$u$$$ and $$$v$$$. The graph must remain a forest.

  2. cut $$$i$$$ — remove the existing edge that was added in the $$$i$$$-th query.

  3. get_path $$$u$$$ $$$v$$$ ($$$1 \le$$$ $$$u$$$, $$$v$$$ $$$\le N$$$) — output the number of edges on the simple path $$$u$$$ $$$\rightarrow$$$ $$$v$$$ or -1 if no such path exists.
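To make the setup concrete, here is a sketch of how the offline query list can be preprocessed into edges annotated with the time segments during which they exist. The names (`Query`, `TimedEdge`, `edge_lifetimes`) are illustrative, not from the post, and it assumes 0-indexed query times where an edge linked at time $$$t$$$ and cut at time $$$t'$$$ exists during $$$[t, t'-1]$$$:

```cpp
#include <string>
#include <vector>

// Illustrative names, not from the post. Assumes 0-indexed query times and
// that an edge linked at time t and cut at time t' exists during [t, t'-1];
// an edge that is never cut lives until time Q-1.
struct Query { std::string type; int a, b; };     // "link" u v, "cut" i, "get_path" u v
struct TimedEdge { int u, v, start, end; };       // edge alive during [start, end]

std::vector<TimedEdge> edge_lifetimes(const std::vector<Query>& qs) {
    int Q = (int)qs.size();
    std::vector<TimedEdge> edges;
    std::vector<int> edge_of_query(Q, -1);        // link query index -> position in `edges`
    for (int t = 0; t < Q; t++) {
        if (qs[t].type == "link") {
            edge_of_query[t] = (int)edges.size();
            edges.push_back({qs[t].a, qs[t].b, t, Q - 1});
        } else if (qs[t].type == "cut") {         // qs[t].a = index of the link query
            edges[edge_of_query[qs[t].a]].end = t - 1;
        }
    }
    return edges;
}
```

After this pass, every edge is exactly the "time segment" object the D&C below operates on.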

Let's elaborate.

Standard D&C Setup (As in Method 2️⃣):

  • Each edge is characterized by a time segment during which it exists.

  • We define a recursive function solve(l, r, G) that receives time boundaries and some set of edges $$$G$$$.

  • The job of the function is to answer queries occurring within $$$[l, r]$$$.

  • The $$$i$$$-th query is answered in the leaf node of the segment tree (i.e., when $$$l=r=i$$$).

  • The set $$$G$$$ contains sufficient information to answer all the queries in $$$[l, r]$$$.

  • The sum of $$$|G|$$$ over all calls of solve is $$$O(Q \log Q)$$$.

  • The initial call is solve(l=0, r=Q-1, G=all_edges).

In segment tree leaves, we simply perform one BFS or DFS on $$$G$$$ to determine the length of the path $$$u \rightarrow v$$$.

Let $$$G_l$$$ be the subset of edges in $$$G$$$ whose time segments intersect $$$[l, m]$$$.

Similarly, let $$$G_r$$$ be the subset of edges in $$$G$$$ whose time segments intersect $$$[m+1, r]$$$.

If we didn't care about time complexity, we could simply pass $$$G$$$ to both segment tree children (i.e., call solve(l, m, G_l) and solve(m+1, r, G_r)).
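Here is a sketch of this naive version (names such as `solve` and `forest_dist` are illustrative, not from the post). Since the graph at a leaf is a forest, the path between $$$u$$$ and $$$v$$$ is unique, so a plain DFS summing edge lengths suffices; no shortest-path machinery is needed:

```cpp
#include <array>
#include <map>
#include <utility>
#include <vector>

struct TimedEdge { int u, v, start, end; };

// Leaf query: length of the unique u -> v path in a forest given as
// weighted edges (u, v, len), or -1 if u and v are disconnected.
long long forest_dist(const std::vector<std::array<int,3>>& edges, int u, int v) {
    std::map<int, std::vector<std::pair<int,int>>> adj;   // vertex -> (neighbor, len)
    for (const auto& e : edges) {
        adj[e[0]].push_back({e[1], e[2]});
        adj[e[1]].push_back({e[0], e[2]});
    }
    std::map<int, long long> dist;
    std::vector<int> stack = {u};
    dist[u] = 0;
    while (!stack.empty()) {
        int x = stack.back(); stack.pop_back();
        if (x == v) return dist[x];
        for (auto [y, w] : adj[x])
            if (!dist.count(y)) { dist[y] = dist[x] + w; stack.push_back(y); }
    }
    return -1;
}

// The unoptimized recursion: every edge is passed down to each child
// whose half of the timeline its segment intersects.
void solve(int l, int r, std::vector<TimedEdge> G) {
    if (l == r) { /* answer query l with forest_dist over the edges of G */ return; }
    int m = (l + r) / 2;
    std::vector<TimedEdge> Gl, Gr;
    for (const auto& e : G) {
        if (e.start <= m)   Gl.push_back(e);   // intersects [l, m]
        if (e.end >= m + 1) Gr.push_back(e);   // intersects [m+1, r]
    }
    solve(l, m, Gl);
    solve(m + 1, r, Gr);
}
```

Without the compression described next, an edge alive for the whole timeline is copied into every node of the segment tree, which is what makes this version too slow.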

Now, let's talk about the optimization that allows us to achieve the aforementioned $$$\sum |G| = O(Q \log Q)$$$.

Unlike 2️⃣, we never merge vertices. Instead, we compress paths of edges. Consequently, every edge is also characterized by a length (initially $$$1$$$ for all the edges).

Suppose we are at some node of the segment tree corresponding to some time segment:

  • We call an edge Stable if it exists during the whole time segment, and Unstable otherwise.

  • We call a vertex Unstable if it has incident Unstable edges and/or it is an endpoint of some path(s) requested during that time segment.

The Stable edges must form a forest, while the Unstable edges run between Unstable vertices from different trees of this forest (and can create cycles).

Let's look at the forest formed by the Stable edges. Intuitively, we only care about the auxiliary forest induced by the Unstable vertices, so we can replace our Stable edges with the edges of this auxiliary forest.

Visualize the way Stable edges are merged

So,

  1. We add the Unstable edges of $$$G$$$ to $$$G_l$$$ and/or $$$G_r$$$ (depending on which halves their time segments intersect).
  2. We add the auxiliary forest of Stable edges (merging chains of edges into single longer ones) induced by the Unstable vertices to BOTH $$$G_l$$$ and $$$G_r$$$.
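A hedged sketch of the contraction step (the name `compress` and the exact interface are illustrative, not from the post). One detail worth making explicit: branching (Steiner) vertices of the induced subtrees are kept even when they are not Unstable themselves, otherwise connectivity between three or more Unstable vertices meeting at an unmarked vertex could not be represented:

```cpp
#include <array>
#include <functional>
#include <utility>
#include <vector>

// Contract the Stable forest onto the Unstable ("marked") vertices.
// Returns edges (a, b, len) of the auxiliary forest. Unmarked chain
// vertices are contracted away; unmarked branching (Steiner) vertices
// are kept. (If a component's DFS root happens to be a chain vertex it
// may be kept too; this adds O(1) vertices and preserves all distances.)
std::vector<std::array<int,3>> compress(
        int n,
        const std::vector<std::pair<int,int>>& stable_edges,
        const std::vector<char>& marked) {
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : stable_edges) { adj[u].push_back(v); adj[v].push_back(u); }

    std::vector<std::array<int,3>> out;
    std::vector<char> vis(n, 0);
    // dfs(v, p) returns {nearest kept vertex in v's subtree, distance to it},
    // or {-1, 0} if the subtree contains no kept vertex.
    std::function<std::pair<int,int>(int,int)> dfs =
            [&](int v, int p) -> std::pair<int,int> {
        vis[v] = 1;
        std::vector<std::pair<int,int>> subs;
        for (int to : adj[v]) if (to != p) {
            auto r = dfs(to, v);
            if (r.first != -1) subs.push_back({r.first, r.second + 1});
        }
        if (marked[v]) {                       // Unstable vertex: always kept
            for (auto& s : subs) out.push_back({v, s.first, s.second});
            return {v, 0};
        }
        if (subs.empty()) return {-1, 0};      // nothing interesting below
        if (subs.size() == 1) return subs[0];  // chain vertex: contracted away
        for (auto& s : subs) out.push_back({v, s.first, s.second}); // Steiner vertex
        return {v, 0};
    };
    for (int v = 0; v < n; v++) if (!vis[v]) dfs(v, -1);
    return out;
}
```

For example, a path of four Stable edges whose only Unstable vertices are its two endpoints collapses to a single edge of length $$$4$$$.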
Complexity analysis

II. Generalize this technique to other scenarios

Here are some problems this method solves (order matters):

1. P9 from Errichto's blog on SQRT decomposition (I offer a simple $$$O(Q \log Q)$$$ solution that Errichto/Radewoosh didn't mention there).

Solution sketch

2. link(u,v), cut(i), compute_LCA(u,v,root).

Solution sketch

3. 342E - Xenia and Tree modification: link(u,v,weight), cut(i), make_blue(u), make_red(u), get_closest_blue(u), get_closest_red(u).

Solution sketch for nonnegative weights
(Bonus, not sure about it) Solution sketch for general case

4. (Bonus, not sure about it) link(u,v,weight), cut(i), compute_sum_over_all_paths().

Solution sketch

III. Conclusion

We can use this technique for various problems about paths on Dynamic Forest.

This technique can be coded as an alternative to $$$O(N \log^2 N)$$$ HLD.

The efficiency of this technique relies on the observation that, across the entire timeline, there are only $$$O(Q)$$$ «events» where the state of a vertex changes (indeed, a change to edge $$$u \rightarrow v$$$'s weight/existence/etc. means a change of both $$$u$$$'s and $$$v$$$'s states).

(When we need to compute aggregate values, as in Tasks 3 and 4:) The Stable vertices should have fixed states. That way, we can push values from/through the vertices that may «leave» from that point on.

IV. Outro

One can practice four of these six problems in the Codeforces Gym I created. My C++ solutions are in "Contest materials".

I would love to hear your opinions/ideas/suggestions on this blog post and/or on the technique overall in the comments. Have a good day!


»
  • »
    »

    We can't answer encoded (i.e. online) queries from ABC350G with this technique, because we need to know edges' time segments in advance.

»

The technique is pretty cool but the explanation is a bit awkward.

  • »
    »

    Thank you for the feedback! Which wording would you change?

    • »
      »
      »
      1. I think you could have explained everything from scratch (in the same amount of text) without reliance on the two older blogposts.
      2. You don't mention that the reason we can answer queries naively at leaves in the recursion is because the virtual forest gets reduced to a set of $$$O(1)$$$ nodes that are directly relevant to that leaf's query.
      3. Explain how the virtual forest is actually reduced as we go down to a child in the recursion (since the resultant virtual forest is going to have only a subset of the nodes that the current virtual forest does, we only have to perform edge contractions, which can be done by doing a dfs over the current virtual forest), and why this is efficient (The size of the virtual forest when we're at some recursive call that covers $$$e$$$ edges/queries is $$$O(e)$$$, and $$$\sum {e} = O(q \log{q})$$$. You allude to this by mentioning $$$O(q \log {q})$$$ edges being "passed down", and while someone who already knows the technique will immediately understand, it's confusing to someone who doesn't).
      4. The visualization you've attached doesn't convey anything clearly.

      None of these are difficult for the reader to figure out on his own, but a blog like this (since you're nominating it for that contest) should put more effort into communicating its ideas, and not assume that the reader is already familiar with them.

»

Very cool blog! I think (or at least hope, the details are kinda funky) this problem can be solved with this technique, maybe pushing it to the very edge.

»

Nice blog! I love it

»

You should be careful when implementing this technique.

For example, if you have $$$O(Q)$$$ queries in one LEAF, you may end up with quadratic time complexity. This can happen when you think: "Why not store many read-only queries in one leaf? That way, the number of leaves equals the number of updates."