I came across this paper

On Finding Lowest Common Ancestors: Simplification and Parallelization by Baruch Schieber, Uzi Vishkin April, 1987

so naturally I tried to code golf it

lca https://judge.yosupo.jp/submission/188189

rmq https://judge.yosupo.jp/submission/188190

edit: minor golf

Cool! This algorithm seems to be almost the same size of the one described here, although maybe a bit slower.

rip

although looking at the test cases, this approach does worse on the small width query where the usual 4-russians approach would have an advantage: you hit the case with less operations

on cses rmq: your implementation: .10s, my implementation: 0.12s

revisiting this, I code golfed it another way, and now it is a bit faster

https://judge.yosupo.jp/submission/222465

Whoa, pretty small code! which algorithm is it using?

So it uses the same alg as in the paper; but you calculate the cartesian tree of the array, then the index of RMQ of [le, ri] is the LCA of nodes

`le`

,`ri`

in the cartesian tree.So the paper describes how to calculate 3 arrays INLABEL, ASCENDANT, HEAD from a rooted tree; and then how to answer O(1) LCA using these arrays. I thought the paper describes the alg quite well, so I won't re-explain it here. But I can mention a few key points:

So our task is to calculate these arrays INLABEL, ASCENDANT, HEAD from the cartesian tree of the array.

For a node u, the paper defines INLABEL[u] as (considering all nodes in u's subtree) the dfs-time-in value with maximum count of trailing zeros (__builtin_ctz) in it's binary representation.

And noting, among the numbers le, le+1, ..., ri-1, ri, the number with maximum count of trailing zeros is given by

`ri & -bit_floor(ri^(le-1))`

.So we could calculate dfs-time-in of cartesian tree, then use the above formula with: le = time_in[u], ri = time_in[u]+subtree_size[u]-1; but there's a better way:

the paper has this remark:

Remark: The above computation is based on PREORDER numbering of the vertices of T. This numbering has the property that the numbers assigned to the subtree rooted at any vertex of T provide a consecutive series of integers. In fact, any alternative numbering having this property (e.g. POSTORDER, INORDER) will produce INLABEL numbers which will be suitable for our preprocessing stage.

And for cartesian tree, It makes more sense to calculate INLABEL over INORDER traversial instead of PREORDER, because the INORDER value of node u is just u.

And for cartesian tree, the INORDER values of the subtree of u is the range [index_of_previous_smaller[u]+1, index_of_next_smaller[u]-1], where the previous and next smaller indexes can be calculated with the monotonic stack alg.

Finally one more point is I don't actually build the cartesian tree, rather it is thought of as being "implicitly" used; just like for rectangular-array-graph problems, where you just loop over the 4 (or 8) neighbors "implicitly" in a loop instead of building the graph "explicitly"

So if you take the code:

It's kind of like doing an "implicit" dfs on cartesian tree: the order which nodes are popped off the stack is POSTORDER of cartesian tree for example. Also there's a bunch of invariants:

at spot 1, we have:

at spot 2, we have:

One more thing which may help: as you pop off the stack, it's like looping up the "right-spine" of the cartesian tree:

Need fix

`int query`

to`T query`

Kudos for a clean, neat code and usage of C++20 features!

thanks

Auto comment: topic has been updated by lrvideckis (previous revision, new revision, compare).