How do you usually build HLD?
Usually we compute $$$sz_v$$$ — the size of the subtree of $$$v$$$ — and from $$$v$$$ we go to the child $$$u$$$ with the largest $$$sz_u$$$. That is still a good way, but today I want to show you something different.
What if from $$$v$$$ we go toward a random vertex $$$u$$$ in the whole subtree of $$$v$$$? This is the same as going to the child $$$u$$$ of $$$v$$$ with probability $$$\frac{sz_u}{sz_v - 1}$$$, so the edge $$$(v, u)$$$ becomes heavy with probability $$$\frac{sz_u}{sz_v - 1}$$$. Intuitively this favors the largest subtree.
Proof
Let's prove that this works well. For a vertex $$$v$$$, the expected number of light edges on the path to the root is $$$\sum_{i = 1}^{|u| - 1}{\left(1-\frac{sz_{u_i}}{sz_{u_{i + 1}} - 1}\right)}$$$, where $$$u_1, u_2, \ldots, u_{|u|}$$$ are the vertices on the path from $$$v$$$ to the root. Let $$$a_i = sz_{u_i}$$$. Then $$$a$$$ is an increasing array and $$$a_{|a|}$$$ equals $$$n$$$, the size of the whole tree. Cross-multiplying shows that $$$1-\frac{a_i}{a_{i + 1} - 1} \leq \frac{a_{i + 1} - a_i}{a_{i + 1}}$$$ (it reduces to $$$a_i \geq 0$$$), so $$$\sum_{i = 1}^{|a| - 1}{\left(1-\frac{a_i}{a_{i + 1} - 1}\right)} \leq \sum_{i = 1}^{|a| - 1}{\frac{a_{i + 1} - a_i}{a_{i + 1}}}$$$.
$$$\frac{a_{i+1} - a_i}{a_{i+1}} \le \int_{a_i}^{a_{i+1}} \frac{1}{x} \, dx = \ln(a_{i+1}) - \ln(a_i)$$$, since $$$\frac{1}{x} \geq \frac{1}{a_{i+1}}$$$ on $$$[a_i, a_{i+1}]$$$.
The right-hand side telescopes, so $$$\sum_{i = 1}^{|a| - 1}{\left(1-\frac{a_i}{a_{i + 1} - 1}\right)} \leq \ln(a_{|a|}) - \ln(a_1) \leq \ln(n)$$$.
So the expected number of light edges on the path to the root is at most $$$\ln(n)$$$.
Usages
Left to the reader.
Auto comment: topic has been updated by PvPro (previous revision, new revision, compare).
Wow, such a cool new idea!
I've thought about something like this before.
One way to implement this idea is to assign a random priority to each node and always extend into the child which has the lowest priority in its subtree. One use case I saw for this is a dynamically growing tree, i.e. updates where you add new nodes, and you only learn where to add the next node after answering queries. Standard HLD can easily be forced into $$$O(n^2)$$$ by building a long path followed by a split and then alternately adding two nodes to either side of the split. With this approach, however, the number of times a node gets added to a new path is the size of the Pareto front of the priorities in its subtree, which is expected $$$O(\log n)$$$.
There's also another way to do tree decomposition. I call it lowbit decomposition.
Consider the lowbit value of each leaf's dfs order. For a non-leaf node, you chain it to the child whose subtree contains the leaf with the largest lowbit value. It's not hard to prove that you visit at most $$$2 \log_2(N)$$$ chains on the path between any pair of nodes.
https://gitlab.com/oToToT/chika/-/blob/main/lbd.hpp
goated impl! deserves more attention imo
Thanks. And I think this one can easily be combined with the linear LCA algorithm you posted.
We can also use this idea to build a suffix tree without any advanced algorithms.
Cool idea lol
What if the queries aren't random? Suppose a query is repeated many times, and it so happens that the number of light edges on its path to the root is very large. Indeed, one must also give an upper bound on the probability that at least one path is too long.
Absolutely crazy data structure