Sacha's blog

By Sacha, history, 2 months ago, In English

This blog post is a submission for the Codeforces Month of Blog Posts Pt. III challenge. Thank you cadmiumky for the initiative!

Hello, Codeforces! Recently I heard a short tale about an idea from algocourses.ru (credit to teraqqq), and I realized the idea is powerful, so it probably deserves more recognition.

This post was written to the best of my knowledge. If something is incorrect, please let me know!

Intro

Dynamic Forest queries are usually solved with a Link Cut Tree (LCT). However, for offline queries there is a powerful Divide & Conquer technique that avoids LCT's heavy implementation. This blog explains that technique. Spoiler: LCT is faster.


Related techniques (where $$$N$$$ denotes the number of vertices and $$$Q$$$ the number of queries):

Core idea:

Unlike Methods 1️⃣ and 2️⃣, we don't merge vertices. However, similar to both 1️⃣ and 2️⃣, we perform a DFS traversal on the segment tree built over the timeline. We observe that edges persisting throughout the current time segment can be compressed, provided we preserve the connectivity between vertices involved in that interval's queries.


I. Illustration of one application of this technique

We have $$$Q$$$ offline queries of the following types:

  1. link $$$u$$$ $$$v$$$ ($$$1 \le$$$ $$$u$$$, $$$v$$$ $$$\le N$$$) — add an undirected edge between the vertices $$$u$$$ and $$$v$$$. The graph must remain a forest.

  2. cut $$$i$$$ — remove the existing edge that was added in the $$$i$$$-th query.

  3. get_path $$$u$$$ $$$v$$$ ($$$1 \le$$$ $$$u$$$, $$$v$$$ $$$\le N$$$) — output the number of edges on the simple path $$$u$$$ $$$\rightarrow$$$ $$$v$$$ or -1 if no such path exists.
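To make the setup concrete, here is a sketch of how the offline query list can be preprocessed into edges annotated with the time segments during which they exist. The names (`Query`, `TimedEdge`, `edge_lifetimes`) are illustrative, not from the post, and it assumes 0-indexed query times where an edge linked at time $$$t$$$ and cut at time $$$t'$$$ exists during $$$[t, t'-1]$$$:

```cpp
#include <string>
#include <vector>

// Illustrative names, not from the post. Assumes 0-indexed query times and
// that an edge linked at time t and cut at time t' exists during [t, t'-1];
// an edge that is never cut lives until time Q-1.
struct Query { std::string type; int a, b; };     // "link" u v, "cut" i, "get_path" u v
struct TimedEdge { int u, v, start, end; };       // edge alive during [start, end]

std::vector<TimedEdge> edge_lifetimes(const std::vector<Query>& qs) {
    int Q = (int)qs.size();
    std::vector<TimedEdge> edges;
    std::vector<int> edge_of_query(Q, -1);        // link query index -> position in `edges`
    for (int t = 0; t < Q; t++) {
        if (qs[t].type == "link") {
            edge_of_query[t] = (int)edges.size();
            edges.push_back({qs[t].a, qs[t].b, t, Q - 1});
        } else if (qs[t].type == "cut") {         // qs[t].a = index of the link query
            edges[edge_of_query[qs[t].a]].end = t - 1;
        }
    }
    return edges;
}
```

After this pass, every edge is exactly the "time segment" object the D&C below operates on.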

Let's elaborate.

Standard D&C Setup (As in Method 2️⃣):

  • Each edge is characterized by a time segment during which it exists.

  • We define a recursive function solve(l, r, G) that receives time boundaries and some set of edges $$$G$$$.

  • The job of the function is to answer queries occurring within $$$[l, r]$$$.

  • The $$$i$$$-th query is answered in the leaf node of the segment tree (i.e., when $$$l=r=i$$$).

  • The set $$$G$$$ contains sufficient information to answer all the queries in $$$[l, r]$$$.

  • The sum of $$$|G|$$$ over all calls of solve is $$$O(Q \log Q)$$$.

  • The initial call is solve(l=0, r=Q-1, G=all_edges).

In segment tree leaves, we simply perform one BFS or DFS on $$$G$$$ to determine the length of the path $$$u \rightarrow v$$$.

Let $$$G_l$$$ be the subset of edges in $$$G$$$ whose time segments intersect $$$[l, m]$$$.

Similarly, let $$$G_r$$$ be the subset of edges in $$$G$$$ whose time segments intersect $$$[m+1, r]$$$.

If we didn't care about time complexity, we could simply pass $$$G$$$ to both segment tree children (i.e., call solve(l, m, G_l) and solve(m+1, r, G_r)).
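Here is a sketch of this naive version (names such as `solve` and `forest_dist` are illustrative, not from the post). Since the graph at a leaf is a forest, the path between $$$u$$$ and $$$v$$$ is unique, so a plain DFS summing edge lengths suffices; no shortest-path machinery is needed:

```cpp
#include <array>
#include <map>
#include <utility>
#include <vector>

struct TimedEdge { int u, v, start, end; };

// Leaf query: length of the unique u -> v path in a forest given as
// weighted edges (u, v, len), or -1 if u and v are disconnected.
long long forest_dist(const std::vector<std::array<int,3>>& edges, int u, int v) {
    std::map<int, std::vector<std::pair<int,int>>> adj;   // vertex -> (neighbor, len)
    for (const auto& e : edges) {
        adj[e[0]].push_back({e[1], e[2]});
        adj[e[1]].push_back({e[0], e[2]});
    }
    std::map<int, long long> dist;
    std::vector<int> stack = {u};
    dist[u] = 0;
    while (!stack.empty()) {
        int x = stack.back(); stack.pop_back();
        if (x == v) return dist[x];
        for (auto [y, w] : adj[x])
            if (!dist.count(y)) { dist[y] = dist[x] + w; stack.push_back(y); }
    }
    return -1;
}

// The unoptimized recursion: every edge is passed down to each child
// whose half of the timeline its segment intersects.
void solve(int l, int r, std::vector<TimedEdge> G) {
    if (l == r) { /* answer query l with forest_dist over the edges of G */ return; }
    int m = (l + r) / 2;
    std::vector<TimedEdge> Gl, Gr;
    for (const auto& e : G) {
        if (e.start <= m)   Gl.push_back(e);   // intersects [l, m]
        if (e.end >= m + 1) Gr.push_back(e);   // intersects [m+1, r]
    }
    solve(l, m, Gl);
    solve(m + 1, r, Gr);
}
```

Without the compression described next, an edge alive for the whole timeline is copied into every node of the segment tree, which is what makes this version too slow.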

Now, let's talk about the optimization that allows us to achieve the aforementioned $$$\sum |G| = O(Q \log Q)$$$.

Unlike 2️⃣, we never merge vertices. Instead, we compress paths of edges. Consequently, every edge is also characterized by a length (initially $$$1$$$ for all the edges).

Suppose we are at some node of the segment tree corresponding to some time segment:

  • We call an edge Stable if it exists during the whole time segment, and Unstable otherwise.

  • We call a vertex Unstable if it has incident Unstable edges and/or it is an endpoint of some path(s) requested during that time segment.

The Stable edges must form a forest, while the Unstable edges run between Unstable vertices from different trees of this forest (and can create cycles).

Let's look at the forest formed by the Stable edges. Intuitively, we only care about the auxiliary forest induced by the Unstable vertices, so we can replace our Stable edges with the edges of this auxiliary forest.

Visualize the way Stable edges are merged

So,

  1. We add the Unstable edges of $$$G$$$ to $$$G_l$$$ and/or $$$G_r$$$ (depending on which halves their time segments intersect).
  2. We add the auxiliary forest of Stable edges (merging chains of edges into single longer ones) induced by the Unstable vertices to BOTH $$$G_l$$$ and $$$G_r$$$.
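A hedged sketch of the contraction step (the name `compress` and the exact interface are illustrative, not from the post). One detail worth making explicit: branching (Steiner) vertices of the induced subtrees are kept even when they are not Unstable themselves, otherwise connectivity between three or more Unstable vertices meeting at an unmarked vertex could not be represented:

```cpp
#include <array>
#include <functional>
#include <utility>
#include <vector>

// Contract the Stable forest onto the Unstable ("marked") vertices.
// Returns edges (a, b, len) of the auxiliary forest. Unmarked chain
// vertices are contracted away; unmarked branching (Steiner) vertices
// are kept. (If a component's DFS root happens to be a chain vertex it
// may be kept too; this adds O(1) vertices and preserves all distances.)
std::vector<std::array<int,3>> compress(
        int n,
        const std::vector<std::pair<int,int>>& stable_edges,
        const std::vector<char>& marked) {
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : stable_edges) { adj[u].push_back(v); adj[v].push_back(u); }

    std::vector<std::array<int,3>> out;
    std::vector<char> vis(n, 0);
    // dfs(v, p) returns {nearest kept vertex in v's subtree, distance to it},
    // or {-1, 0} if the subtree contains no kept vertex.
    std::function<std::pair<int,int>(int,int)> dfs =
            [&](int v, int p) -> std::pair<int,int> {
        vis[v] = 1;
        std::vector<std::pair<int,int>> subs;
        for (int to : adj[v]) if (to != p) {
            auto r = dfs(to, v);
            if (r.first != -1) subs.push_back({r.first, r.second + 1});
        }
        if (marked[v]) {                       // Unstable vertex: always kept
            for (auto& s : subs) out.push_back({v, s.first, s.second});
            return {v, 0};
        }
        if (subs.empty()) return {-1, 0};      // nothing interesting below
        if (subs.size() == 1) return subs[0];  // chain vertex: contracted away
        for (auto& s : subs) out.push_back({v, s.first, s.second}); // Steiner vertex
        return {v, 0};
    };
    for (int v = 0; v < n; v++) if (!vis[v]) dfs(v, -1);
    return out;
}
```

For example, a path of four Stable edges whose only Unstable vertices are its two endpoints collapses to a single edge of length $$$4$$$.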
Complexity analysis

II. Generalize this technique to other scenarios

Here are some problems this method solves (order matters):

1. P9 from Errichto's blog on SQRT decomposition (I offer a simple $$$O(Q \log Q)$$$ solution that Errichto/Radewoosh didn't mention there).

Solution sketch

2. link(u,v), cut(i), compute_LCA(u,v,root).

Solution sketch

3. 342E - Xenia and Tree modification: link(u,v,weight), cut(i), make_blue(u), make_red(u), get_closest_blue(u), get_closest_red(u).

Solution sketch for nonnegative weights
(Bonus, not sure about it) Solution sketch for general case

4. (Bonus, not sure about it) link(u,v,weight), cut(i), compute_sum_over_all_paths().

Solution sketch

III. Conclusion

We can use this technique for various problems about paths on Dynamic Forest.

This technique can be coded as an alternative to $$$O(N \log^2 N)$$$ HLD.

The efficiency of this technique relies on the observation that, across the entire timeline, there are only $$$O(Q)$$$ «events» where the state of a vertex changes (indeed, a change to edge $$$u \rightarrow v$$$'s weight/existence/etc. means a change of both $$$u$$$'s and $$$v$$$'s states).

(When we need to compute aggregate values, as in Tasks 3 and 4:) The Stable vertices should have fixed states. That way, we can push values from/through the vertices that may «leave» from that point on.

IV. Outro

One can practice four of these six problems in the Codeforces Gym I created. My C++ solutions are in "Contest materials".

I would love to hear your opinions/ideas/suggestions on this blog post and/or on the technique overall in the comments. Have a good day!


»
  • »
    »

    We can't answer encoded (i.e. online) queries from ABC350G with this technique, because we need to know edges' time segments in advance.

»

The technique is pretty cool but the explanation is a bit awkward.

  • »
    »

    Thank you for the feedback! Which wording would you change?

    • »
      »
      »
      1. I think you could have explained everything from scratch (in the same amount of text) without reliance on the two older blogposts.
      2. You don't mention that the reason we can answer queries naively at leaves in the recursion is because the virtual forest gets reduced to a set of $$$O(1)$$$ nodes that are directly relevant to that leaf's query.
      3. Explain how the virtual forest is actually reduced as we go down to a child in the recursion (since the resultant virtual forest is going to have only a subset of the nodes that the current virtual forest does, we only have to perform edge contractions, which can be done by doing a dfs over the current virtual forest), and why this is efficient (The size of the virtual forest when we're at some recursive call that covers $$$e$$$ edges/queries is $$$O(e)$$$, and $$$\sum {e} = O(q \log{q})$$$. You allude to this by mentioning $$$O(q \log {q})$$$ edges being "passed down", and while someone who already knows the technique will immediately understand, it's confusing to someone who doesn't).
      4. The visualization you've attached doesn't convey anything clearly.

      None of these are difficult for the reader to figure out on his own, but a blog like this (since you're nominating it for that contest) should put more effort into communicating its ideas, and not assume that the reader is already familiar with them.

»

Very cool blog! I think (or at least hope, the details are kinda funky) this problem can be solved with this technique, maybe pushing it to the very edge.

»

Nice blog! I love it

»

You should be careful when implementing this technique.

For example, if you have $$$O(Q)$$$ queries in one LEAF, you may end up with quadratic time complexity. This can happen when you think: "Why not store many read-only queries in one leaf? That way, the number of leaves equals the number of updates."