A(n) (less?) intuitive way to LCA with Euler Tour and Segment Tree

So, a few months ago, I was working through Euler Tour on the USACO Guide website. There, they explained how you can flatten a tree into an array using start[] and end[] arrays and build a segment tree on top of it. However, for LCA (lowest common ancestor), they built an RMQ (range minimum query) segtree off of the full Euler tour of the tree, where a vertex is listed again every time the traversal returns to it, which confused me greatly. After all, in the examples of summing subtrees and updating vertices, we hadn't needed to do that. They basically used a different way to Euler tour. Additionally, I was still greatly uncomfortable with changing the size of the segtree, and a segtree with 2 * (2n-1) memory (I use the iterative segtree) pissed me off. I gave up and went back to dynamic programming.

Yesterday, however, I returned. The most annoying thing, in my opinion, was how the start[] and end[] arrays were combined so the segtree could be built, and I had to remember two ways to do the euler tour. Indeed, in my code to declare the segtree I wrote the following:

peak commenting

N = euler.size();  //do not do this

standard LCA with segtree explanation

So, I present a way to build the segtree off of only the array that lists the vertices in the order we first visit them, the same array used for subtree queries. To understand this, make sure you know the RMQ segtree (I don't know Fenwick tree/BIT, but it likely also works here) and how to Euler tour for, say, Subtree Queries on CSES, or have read the corresponding section (18.2) in CPH or whatever.

written for CSES Company Queries II

#include <bits/stdc++.h>
 
using namespace std;
vector<vector<int>> tree;
vector<int> start;
vector<int> stop;
vector<int> par;
vector<int> depth;
vector<int> seg;
int n;
int euler = 0;
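// Preorder DFS: start[v] is v's position in first-visit order, and the
// subtree of v occupies positions [start[v], stop[v]).
void dfs(int v, int d) {
    start[v] = euler;
    depth[v] = d;
    euler++;
    for (int x : tree[v]) {
        dfs(x, d + 1);
    }
    stop[v] = euler;
}
// Returns whichever of u, v is shallower (u on ties).
int comp(int u, int v) {
    return depth[v] < depth[u] ? v : u;
}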
// Iterative segtree over first-visit positions; each node stores the
// shallowest vertex in its range.
void init() {
    seg.resize(2 * n);
    for (int i = 0; i < n; i++) {
        seg[start[i] + n] = i;
    }
    for (int i = n - 1; i > 0; i--) {
        seg[i] = comp(seg[i * 2], seg[i * 2 + 1]);
    }
}
// Shallowest vertex among positions [l, r); requires l < r.
int query(int l, int r) {
    int ans = seg[l + n];
    for (l += n, r += n; l < r; l /= 2, r /= 2) {
        if (l & 1) { ans = comp(ans, seg[l++]); }
        if (r & 1) { ans = comp(ans, seg[--r]); }
    }
    return ans;
}
int main() {
    cin.tie(0) -> sync_with_stdio(0);
    int q; cin >> n >> q;
    tree.resize(n);
    start.resize(n);
    stop.resize(n);
    depth.resize(n);
    par.resize(n);
    for (int i = 1; i < n; i++) {
        int b; cin >> b; b--;
        tree[b].push_back(i);
        par[i] = b;
    }
    dfs(0, 0);
    init();
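    for (int i = 0; i < q; i++) {
        int a, b; cin >> a >> b; a--; b--;
        // make a the vertex that is visited first
        if (start[b] < start[a]) { swap(a, b); }
        // b lies in a's subtree (or a == b), so a is the LCA
        if (stop[a] >= stop[b]) { cout << a+1 << "\n"; continue; }
        // otherwise: parent of the shallowest vertex in positions [start[a], start[b]]
        cout << par[query(start[a], start[b] + 1)] + 1 << "\n";
    }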
}

To see why this works, let's take a look at an example.

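The corresponding array we build off of, listing the vertices in first-visit order, and the depth of each listed vertex. (Read together, the two arrays describe a tree rooted at 1 with children 2, 3 and 4, where 2 has children 5 and 6, and 4 has child 7.)

order: [1, 2, 5, 6, 3, 4, 7]
depth: [0, 1, 2, 2, 1, 1, 2]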

Essentially, what we build the segtree on is the first array, where the nodes are ordered by when we first meet them in the DFS traversal. Then, to find the LCA of two nodes: if neither of the two is within the other's subtree, the LCA is the parent of the node returned by the RMQ (based on depth, of course) over the range between the two. Of course, if one is contained within the other's subtree, the LCA is simply that node, the one of lesser depth.
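To make the rule concrete, here is a small sketch that applies it to the example arrays with a plain linear scan instead of a segtree. The names ExampleTree, pos, isAncestor and lcaByScan are made up for this illustration, and the parent[] array is an assumption read off the two arrays above (root 1 with children 2, 3, 4; 2 has children 5, 6; 4 has child 7).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical container for the example; labels are 1-based as in the text.
struct ExampleTree {
    std::vector<int> order{1, 2, 5, 6, 3, 4, 7};      // vertices by first visit
    std::vector<int> depth{0, 1, 2, 2, 1, 1, 2};      // depth of order[i]
    std::vector<int> parent{0, 0, 1, 1, 1, 2, 2, 4};  // parent[v]; 0 means "none" (assumed tree)
};

// Position of vertex v in first-visit order (linear search, fine for 7 vertices).
int pos(const ExampleTree& t, int v) {
    for (int i = 0; i < (int)t.order.size(); i++)
        if (t.order[i] == v) return i;
    return -1;
}

// True if a is v itself or an ancestor of v (walks up the parent pointers).
bool isAncestor(const ExampleTree& t, int a, int v) {
    while (v != 0) {
        if (v == a) return true;
        v = t.parent[v];
    }
    return false;
}

// LCA by the rule from the text, using a scan in place of the RMQ segtree.
int lcaByScan(const ExampleTree& t, int a, int b) {
    if (isAncestor(t, a, b)) return a;  // b is inside a's subtree
    if (isAncestor(t, b, a)) return b;  // a is inside b's subtree
    int l = pos(t, a), r = pos(t, b);
    if (l > r) std::swap(l, r);
    int best = l;                       // index of the shallowest entry in [l, r]
    for (int i = l + 1; i <= r; i++)
        if (t.depth[i] < t.depth[best]) best = i;
    return t.parent[t.order[best]];
}
```

For example, lcaByScan(t, 6, 7) scans positions 3 through 6 (vertices 6, 3, 4, 7), finds vertex 3 at depth 1, and returns its parent, 1. The segtree in the real solution only replaces the scan.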

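Rudimentary attempt at a proof: Say a is visited before b and neither is in the other's subtree. The LCA itself will not be present in this range, because it must have already been visited before a. Every node in the range lies inside the LCA's subtree (a subtree occupies a contiguous block of the array, and both a and b are in it), so every node in the range is at least one level under the LCA. Then we must show that some node in the range is exactly one level under the LCA. This is done by noticing that, to get from a to b, the DFS is required to enter a yet unvisited child of the LCA (the one whose subtree contains b), and that child is visited after a and no later than b. So the node of least depth in the range is a child of the LCA, and its parent is the LCA.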
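If the argument feels too hand-wavy, one way to gain confidence is to brute-force it. The sketch below is not part of the solution: it builds a random tree (vertex 0 as root, each parent index smaller than its child, both of which are choices made just for this test) and compares the range-minimum rule against a naive "climb until they meet" LCA for every pair. The function names naiveLca and checkRandomTree are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <vector>
using namespace std;

// Naive LCA: repeatedly lift the deeper vertex until both coincide.
int naiveLca(const vector<int>& par, const vector<int>& depth, int a, int b) {
    while (a != b) {
        if (depth[a] < depth[b]) swap(a, b);
        a = par[a];
    }
    return a;
}

// Returns true if the rule from the text agrees with naiveLca on every pair.
bool checkRandomTree(int n, unsigned seed) {
    mt19937 rng(seed);
    vector<int> par(n, 0), depth(n, 0), start(n, 0), stop(n, 0), order;
    vector<vector<int>> children(n);
    for (int v = 1; v < n; v++) {
        par[v] = (int)(rng() % v);  // parent has a smaller index, so 0 is the root
        children[par[v]].push_back(v);
    }
    function<void(int)> dfs = [&](int v) {
        start[v] = (int)order.size();
        order.push_back(v);
        for (int c : children[v]) { depth[c] = depth[v] + 1; dfs(c); }
        stop[v] = (int)order.size();
    };
    dfs(0);
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
            int u = a, v = b;
            if (start[v] < start[u]) swap(u, v);
            int got;
            if (stop[u] >= stop[v]) {
                got = u;  // v lies in u's subtree (or u == v)
            } else {
                int best = order[start[u]];  // shallowest vertex in [start[u], start[v]]
                for (int i = start[u]; i <= start[v]; i++)
                    if (depth[order[i]] < depth[best]) best = order[i];
                got = par[best];
            }
            if (got != naiveLca(par, depth, a, b)) return false;
        }
    }
    return true;
}
```

Calling checkRandomTree with a few sizes and seeds is a cheap sanity check; it is cubic in n because of the linear scan, so keep n in the low hundreds.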

The code can be further simplified by removing the stop[] array and modifying the query call to query(start[a] + 1, start[b] + 1), and then checking for the edge case where a == b. However, for the sake of keeping consistency with euler tour, I have kept the stop[] array.
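A minimal sketch of that simplification, wrapped in a struct so it can be tried in isolation. The struct name SimpleLca and its interface are made up for this example; it assumes 0-indexed vertices with vertex 0 as the root and parent[0] unused, and otherwise reuses the same iterative segtree as the code above.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

struct SimpleLca {
    int n;
    vector<int> par, depth, start, seg;
    vector<vector<int>> children;
    int timer = 0;

    // parent[v] is the parent of v for v >= 1; parent[0] is ignored.
    explicit SimpleLca(const vector<int>& parent)
        : n((int)parent.size()), par(parent), depth(n, 0), start(n, 0),
          seg(2 * n, 0), children(n) {
        for (int v = 1; v < n; v++) children[par[v]].push_back(v);
        dfs(0);
        for (int v = 0; v < n; v++) seg[start[v] + n] = v;
        for (int i = n - 1; i > 0; i--) seg[i] = shallower(seg[2 * i], seg[2 * i + 1]);
    }

    // Preorder DFS; no stop[] array is recorded.
    void dfs(int v) {
        start[v] = timer++;
        for (int c : children[v]) { depth[c] = depth[v] + 1; dfs(c); }
    }

    int shallower(int u, int v) const { return depth[v] < depth[u] ? v : u; }

    // Shallowest vertex among positions [l, r); the caller guarantees l < r.
    int query(int l, int r) const {
        int ans = seg[l + n];
        for (l += n, r += n; l < r; l /= 2, r /= 2) {
            if (l & 1) ans = shallower(ans, seg[l++]);
            if (r & 1) ans = shallower(ans, seg[--r]);
        }
        return ans;
    }

    int lca(int a, int b) const {
        if (a == b) return a;  // edge case: the query range would be empty
        if (start[b] < start[a]) swap(a, b);
        // a itself is excluded, so an ancestor a is recovered as the parent of its child
        return par[query(start[a] + 1, start[b] + 1)];
    }
};
```

The a == b check matters because the range (start[a], start[b]] is empty in that case, and the query would read a position that may not exist.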

So, is this any better than before? We built on a segtree of half the size, after all.

Well, not really. Because segtree operations take O(log N) after preprocessing, halving the size only reduces the time per query from something like 20 operations to 19. And because we are adding a vector to track the parent of each vertex, it likely isn't any faster. Memory isn't any better, either.

So what's the use?

Now, at least, there aren't two ideas for euler tour running inside my head. I just understand that I can use start[] and stop[] arrays for this. It's mostly for simplicity, really. And because I'm doing usaco guide, I probably won't learn binary lifting until later, as it's in platinum.

This was just a fun thing to think about and do. Open to any code simplifications and blog criticisms! It's my first time writing one. Thanks to iframe_ for helping optimize and proofread, and thank you for reading!

Comments (8)


JeffLegendPower

happy snoopy

4 months ago

jef legen powuh

lrvideckis

this is my preferred way of doing lca because you can also find the next node on path.

but i like to write it like:

if (a==b) // lca is a
if (start[b] < start[a]) { swap(a, b); }
//lca is: par[query(start[a] + 1, start[b] + 1)]

Yes, this is the simplification where we can remove the stop[] array. I hadn't considered the problem of finding next node on path, but it should be easily doable with this too.

Codertang

snoopy orz

nou
