A(n) (less?) intuitive way to LCA with Euler Tour and Segment Tree

№	Пользователь	Рейтинг
1	tourist	3803
2	jiangly	3707
3	Benq	3627
4	ecnerwala	3584
5	orzdevinwang	3573
6	Geothermal	3569
6	cnnfls_csy	3569
8	Radewoosh	3542
9	jqdai0815	3532
10	gyh20	3447

№	Пользователь	Вклад
1	awoo	163
2	maomao90	160
3	adamant	157
4	maroonrk	155
5	-is-this-fft-	150
6	Petr	148
6	SecondThread	148
8	atcoder_official	147
9	TheScrasse	145
9	nor	145

So, a few months ago, I was working on Euler Tour from the usaco guide website. There, they explained how you could compress a tree into an array of start[] and end[] arrays to build a segment tree on top of it. However, with LCA (least common ancestor, they built a RMQ (range minimum query) segtree off of the full inorder traversal of the tree, which confused me greatly. After all, in the examples of summing subtrees and updating vertexes, we hadn't needed to do that. They basically used a different way to euler tour, which confused me. Additionally, I was still greatly uncomfortable with changing the size of the segtree, and a segtree with 2 * (2n-1) memory (I use the iterative segtree) pissed me off. I gave up and went back to dynamic programming. ↵
↵
Yesterday, however, I returned. The most annoying thing, in my opinion, was how the start[] and end[] arrays were combined so the segtree could be built, and I had to remember two ways to do the euler tour. Indeed, in my code to declare the segtree I wrote the following:↵
↵
<spoiler summary="peak commenting">↵
~~~~~↵
N = euler.size();  //do not do this↵
~~~~~↵
</spoiler>↵
↵
<spoiler summary="standard LCA with segtree explanation">↵
Basically, if we maintain all the vertices that we visit in a dfs in order traversal of the tree, and record each of their depths in arrays, by building a segtree on these arrays we can find, with RMQ, the LCA between two nodes; it is the node with least depth (distance from the root). Because between the first times we reach the nodes being queried, we record all the nodes visited, including the LCA, we can do this. However, this requires the full traversal, all 2 * N — 1 nodes. cp-algo has a very clear explanation: https://cp-algorithms.com/graph/lca.html↵
</spoiler>↵
↵
So, I present a way to build the segtree off of only the array ordering vertices in which we list them, similar to the other one.↵
To understand this, make sure you know RMQ segtree (I don't know fenwick tree/BIT but it likely also works here) and how to euler tour for, say, Subtree Queries on CSES, or have read the corresponding section (18.2) in CPH or whatever.↵
↵
<spoiler summary="written for CSES Company Queries II">↵
~~~~~↵
#include <bits/stdc++.h>↵
↵
using namespace std;↵
vector<vector<int>> tree;↵
vector<int> start;↵
vector<int> stop;↵
vector<int> par;↵
vector<int> depth;↵
vector<int> seg;↵
int n;↵
int euler = 0;↵
void dfs(int v, int d) {↵
    start[v] = euler;↵
    depth[v] = d;↵
    euler++;↵
    for (int x : tree[v]) {↵
        dfs(x, d + 1);↵
    }↵
    stop[v] = euler;↵
}↵
int comp(int u, int v) {↵
    return depth[v] < depth[u] ? v : u;↵
}↵
void init() {↵
    seg.resize(2 * n);↵
    for (int i = 0; i < n; i++) {↵
        seg[start[i] + n] = i;↵
    }↵
    for (int i = n - 1; i > 0; i--) {↵
        seg[i] = comp(seg[i * 2], seg[i * 2 + 1]);↵
    }↵
}↵
int query(int l, int r) {↵
    int ans = seg[l + n];↵
    for (l += n, r += n; l < r; l /= 2, r /= 2) {↵
        if (l & 1) { ans = comp(ans, seg[l++]); }↵
        if (r & 1) { ans = comp(ans, seg[--r]); }↵
    }↵
    return ans;↵
}↵
int main() {↵
    cin.tie(0) -> sync_with_stdio(0);↵
    int q; cin >> n >> q;↵
    tree.resize(n);↵
    start.resize(n);↵
    stop.resize(n);↵
    depth.resize(n);↵
    par.resize(n);↵
    for (int i = 1; i < n; i++) {↵
        int b; cin >> b; b--;↵
        tree[b].push_back(i);↵
        par[i] = b;↵
    }↵
    dfs(0, 0);↵
    init();↵
    for (int i = 0; i < q; i++) {↵
        int a, b; cin >> a >> b; a--; b--;↵
        if (start[b] < start[a]) { swap(a, b); }↵
        if (stop[a] >= stop[b]) { cout << a+1 << "\n"; continue; }↵
        cout << par[query(start[a], start[b] + 1)] + 1 << "\n";↵
    }↵
}↵
~~~~~↵
</spoiler>↵
↵
To see why this works, let's take a look at an example.↵
↵
![ ](/predownloaded/f9/7f/f97f02017fa7ca34ede1fd74bd5fd2342928aa7e.png)↵
↵
The corresponding array we build off of (and the depths):↵
↵
<spoiler summary="yes">↵
~~~~~↵
[1, 2, 5, 6, 3, 4, 7]↵
[0, 1, 2, 2, 1, 1, 2]↵
~~~~~↵
</spoiler>↵
↵
Essentially what we are building off of is the first array, where the nodes are ordered by when we first meet them in the dfs traversal.↵
Then:↵
To find the LCA of two nodes, if any of the two is not within the others subtree, then the LCA is the _parent node_ of the RMQ (based on depth, of course) in the range between the two. Of course, if one is contained within another's subtree, the LCA is that node of least depth.↵
↵
Rudimentary attempt at a proof:↵
The parent node will not be present in this range, because it must have already been visited before. Then we must prove that the parent of the node with least depth in this range is the LCA. This is done by noticing that we are required to traverse to a yet unvisited node (after visiting the first vertex) that is one level under the LCA in order to get to the second node.↵
↵
The code can be further simplified by removing the stop[] array and modifying the query call to `query(start[a] + 1, start[b] + 1)`, and then checking for the edge case where a == b. However, for the sake of keeping consistency with euler tour, I have kept the stop[] array.↵
↵
So, is this any better than before? We built on a segtree of half the size, after all.↵
↵
Well, not really. Because segtree operations take O(log N) after processing, actually, we are reducing the time per query by from like 20 operations to 19. And likely because we are adding a vector to track the parent of each vertex, it isn't any faster.↵
Memory isn't any better, either.↵
↵
So what's the use?↵
↵
Now, at least, there aren't two ideas for euler tour running inside my head. I just understand that I can use start[] and stop[] arrays for this. It's mostly for simplicity, really. And because I'm doing usaco guide, I probably won't learn binary lifting until later, as it's in platinum.↵
↵
This was just a fun thing to think about and do.↵
Open to any code simplifications and blog criticisms! It's my first time writing one.↵
Thanks to [user:iframe_,2024-07-15] for helping optimize and proofread, and thank you for reading!

Rev.	Кто	Когда	Δ	Комментарий
en9	SnoopyCodes	2024-07-15 20:06:25	0	(published)
en8	SnoopyCodes	2024-07-15 20:05:24	0	Tiny change: 'all 2 * N — 1 nodes. ' -> 'all 2 * N - 1 nodes. '
en7	SnoopyCodes	2024-07-15 19:30:41	42	Tiny change: 's\n~~~~~\n\n</spoile' -> 's\n~~~~~\n</spoile'
en6	SnoopyCodes	2024-07-15 19:09:46	985
en5	SnoopyCodes	2024-07-15 17:31:24	70
en4	SnoopyCodes	2024-07-15 08:52:38	1090	Tiny change: 'n</spoiler\n\n<spoil' -> 'n</spoiler>\n\n<spoil'
en3	SnoopyCodes	2024-07-15 08:32:19	78	Tiny change: 'example.\nPicture\nThe corr' -> 'example.\n\nPicture\n\nThe corr'
en2	SnoopyCodes	2024-07-15 08:29:50	1696	Tiny change: 'ths):\n\n</spoiler su' -> 'ths):\n\n<spoiler su'
en1	SnoopyCodes	2024-07-15 08:09:15	2691	Initial revision (saved to drafts)

История