O(n),O(1) lca/rmq; not method of 4 russians

→ Pay attention

Before contest
Codeforces Global Round 28
07:25:16
Register now »

*has extra registration

→ Streams

Codeforces Round 994 Solution Discussion

By aryanc403

Before stream 33:35:15

View all →

→ Top rated

#	User	Rating
1	tourist	3985
2	jiangly	3741
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3489
7	Radewoosh	3483
8	Kevin114514	3442
9	ecnerwala	3392
9	Um_nik	3392

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
2	atcoder_official	162
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	nor	150

View all →

→ Find user

→ Recent actions

Detailed →

lrvideckis's blog

O(n),O(1) lca/rmq; not method of 4 russians

By lrvideckis, history, 11 months ago, In English

I came across this paper

On Finding Lowest Common Ancestors: Simplification and Parallelization by Baruch Schieber, Uzi Vishkin April, 1987

so naturally I tried to code golf it

lca https://judge.yosupo.jp/submission/188189

rmq https://judge.yosupo.jp/submission/188190

edit: minor golf

lrvideckis
11 months ago
11

Comments (11)

Write comment?

brunomont

11 months ago, # |

Cool! This algorithm seems to be almost the same size of the one described here, although maybe a bit slower.

→ Reply

lrvideckis

11 months ago, # ^ |

rip

→ Reply

lrvideckis

10 months ago, # ^ |

although looking at the test cases, this approach does worse on the small width query where the usual 4-russians approach would have an advantage: you hit the case with less operations

→ Reply

lrvideckis

10 months ago, # ^ |

on cses rmq: your implementation: .10s, my implementation: 0.12s

→ Reply

lrvideckis

5 months ago, # ^ |

← Rev. 2 →

+16

revisiting this, I code golfed it another way, and now it is a bit faster

https://judge.yosupo.jp/submission/222465

→ Reply

brunomont

5 months ago, # ^ |

Whoa, pretty small code! which algorithm is it using?

→ Reply

lrvideckis

5 months ago, # ^ |

+10

So it uses the same alg as in the paper; but you calculate the cartesian tree of the array, then the index of RMQ of [le, ri] is the LCA of nodes le, ri in the cartesian tree.

So the paper describes how to calculate 3 arrays INLABEL, ASCENDANT, HEAD from a rooted tree; and then how to answer O(1) LCA using these arrays. I thought the paper describes the alg quite well, so I won't re-explain it here. But I can mention a few key points:

So our task is to calculate these arrays INLABEL, ASCENDANT, HEAD from the cartesian tree of the array.

For a node u, the paper defines INLABEL[u] as (considering all nodes in u's subtree) the dfs-time-in value with maximum count of trailing zeros (__builtin_ctz) in it's binary representation.

And noting, among the numbers le, le+1, ..., ri-1, ri, the number with maximum count of trailing zeros is given by ri & -bit_floor(ri^(le-1)).

So we could calculate dfs-time-in of cartesian tree, then use the above formula with: le = time_in[u], ri = time_in[u]+subtree_size[u]-1; but there's a better way:

the paper has this remark:

Remark: The above computation is based on PREORDER numbering of the vertices of T. This numbering has the property that the numbers assigned to the subtree rooted at any vertex of T provide a consecutive series of integers. In fact, any alternative numbering having this property (e.g. POSTORDER, INORDER) will produce INLABEL numbers which will be suitable for our preprocessing stage.

And for cartesian tree, It makes more sense to calculate INLABEL over INORDER traversial instead of PREORDER, because the INORDER value of node u is just u.

And for cartesian tree, the INORDER values of the subtree of u is the range [index_of_previous_smaller[u]+1, index_of_next_smaller[u]-1], where the previous and next smaller indexes can be calculated with the monotonic stack alg.

Finally one more point is I don't actually build the cartesian tree, rather it is thought of as being "implicitly" used; just like for rectangular-array-graph problems, where you just loop over the 4 (or 8) neighbors "implicitly" in a loop instead of building the graph "explicitly"

So if you take the code:

vector<int> st;
for(int i = 0; i < n; i++) {
  int prev = -1;
  while(!empty(st) && a[st.back()] > a[i]) {
    // (1)
    prev = st.back();
    st.pop_back();
  }
  // (2)
  st.push_back(i);
}

It's kind of like doing an "implicit" dfs on cartesian tree: the order which nodes are popped off the stack is POSTORDER of cartesian tree for example. Also there's a bunch of invariants:

at spot 1, we have: