PvPro's blog

By PvPro, history, 2 months ago, In English

How do you usually build HLD?

Usually we calculate $$$sz_v$$$ — the size of the subtree of $$$v$$$. Then from $$$v$$$ we go to the child $$$u$$$ with the largest $$$sz_u$$$. This is still a good way, but today I want to show you something different.

What if from $$$v$$$ we go toward a uniformly random vertex $$$u$$$ in the whole subtree of $$$v$$$? This is the same as going to the child $$$u$$$ of $$$v$$$ with probability $$$\frac{sz_u}{sz_v - 1}$$$, so the probability that the edge $$$(v, u)$$$ is heavy is $$$\frac{sz_u}{sz_v - 1}$$$. Intuitively, we are most likely to go to the largest subtree.
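A minimal Python sketch of this selection rule (the post itself gives no code, so all names here are my own): compute subtree sizes, then pick each heavy child $$$u$$$ with probability $$$\frac{sz_u}{sz_v - 1}$$$.

```python
import random

def randomized_heavy_children(adj, root=0, rng=random):
    """For each vertex v, choose the heavy child u with probability
    sz_u / (sz_v - 1) -- equivalent to walking toward a uniformly
    random vertex of v's subtree.  Returns (heavy, sz)."""
    n = len(adj)
    parent = [-1] * n
    order = []
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:                          # iterative DFS (preorder)
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)
    sz = [1] * n
    for v in reversed(order):             # subtree sizes, bottom-up
        if parent[v] != -1:
            sz[parent[v]] += sz[v]
    heavy = [-1] * n
    for v in order:
        kids = [u for u in adj[v] if u != parent[v]]
        if kids:
            # P(heavy child = u) = sz_u / (sz_v - 1)
            heavy[v] = rng.choices(kids, weights=[sz[u] for u in kids])[0]
    return heavy, sz
```

On a path graph every internal vertex has a single child, so that child is always chosen and the whole path forms one chain, regardless of the random choices.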

Proof

Let's prove that it works well. For a vertex $$$v$$$, the expected number of light edges on the path to the root is $$$\sum_{i = 1}^{|u| - 1}{\left(1-\frac{sz_{u_i}}{sz_{u_{i + 1}} - 1}\right)}$$$, where $$$u_1 = v, u_2, \ldots, u_{|u|}$$$ are the vertices on the path from $$$v$$$ to the root. Let $$$a_i = sz_{u_i}$$$. Then $$$a$$$ is an increasing array and $$$a_{|a|}$$$ equals $$$n$$$, the size of the whole tree. Since $$$1 - \frac{a_i}{a_{i+1} - 1} \leq \frac{a_{i+1} - a_i}{a_{i+1}}$$$ for $$$a_i \geq 0$$$, we get $$$\sum_{i = 1}^{|a| - 1}{\left(1-\frac{a_i}{a_{i + 1} - 1}\right)} \leq \sum_{i = 1}^{|a| - 1}{\frac{a_{i + 1} - a_i}{a_{i + 1}}}$$$.

$$$\frac{a_{i+1} - a_i}{a_{i+1}} \le \int_{a_i}^{a_{i+1}} \frac{1}{x} \, dx = \ln(a_{i+1}) - \ln(a_i)$$$

Summing over $$$i$$$ and telescoping: $$$\sum_{i = 1}^{|a| - 1}{\left(1-\frac{a_i}{a_{i + 1} - 1}\right)} \leq \ln(a_{|a|}) - \ln(a_1) \leq \ln(n)$$$.

So the expected number of light edges on the path from any vertex to the root is less than $$$\ln(n)$$$.
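As a quick empirical sanity check (mine, not part of the post), one can average the number of light edges on a root path over many independent random heavy-child assignments and compare with $$$\ln(n)$$$. The tree below is a random recursive tree, and the bound in the assertion is deliberately loose:

```python
import math
import random

def simulate(n, trials=200, seed=1):
    """Average number of light edges on the path from vertex n-1 to the
    root, over `trials` independent random heavy-child assignments.
    The tree is a random recursive tree (parent of v uniform in [0, v))."""
    rng = random.Random(seed)
    parent = [-1] + [rng.randrange(v) for v in range(1, n)]
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    sz = [1] * n
    for v in range(n - 1, 0, -1):         # labels increase down the tree
        sz[parent[v]] += sz[v]
    total = 0
    for _ in range(trials):
        # heavy child of v drawn with probability sz_u / (sz_v - 1)
        heavy = [rng.choices(children[v],
                             weights=[sz[u] for u in children[v]])[0]
                 if children[v] else -1
                 for v in range(n)]
        v, light = n - 1, 0
        while parent[v] != -1:
            light += heavy[parent[v]] != v    # edge (parent[v], v) is light
            v = parent[v]
        total += light
    return total / trials
```

For $$$n = 200$$$ the observed average stays comfortably below a few multiples of $$$\ln(200) \approx 5.3$$$, consistent with the expectation bound above.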

Usages

Left as an exercise to the reader.

Score: +135


»
2 months ago, +18

Wow, such a cool new idea!

»
2 months ago, +56

I've thought about something like this before.

One way to implement this idea is to assign a random priority to each node and always extend the chain toward the child which has the lowest priority in its subtree.

One use case I saw for this is a dynamically growing tree, i.e. updates where you add new nodes, and you only learn where to add the new nodes after answering queries. Standard HLD can easily be forced into O(n^2) by a long path followed by a split, then alternately adding 2 nodes to either side of the split. With this approach, however, the number of times a node gets assigned to a new path equals the size of the Pareto front of the priorities in its subtree, which is expected O(log(n)).
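A hedged Python sketch of the static version of this priority idea (function and variable names are mine, not the commenter's): each node gets a random priority, and the heavy child of $$$v$$$ is the child whose subtree contains the smallest priority.

```python
import random

def priority_heavy_children(adj, root=0, seed=0):
    """Assign a random priority to each node; the heavy child of v is the
    child whose subtree contains the smallest priority.  Returns heavy[]."""
    rng = random.Random(seed)
    n = len(adj)
    pri = [rng.random() for _ in range(n)]
    parent = [-1] * n
    order = []
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:                          # iterative DFS
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)
    best = pri[:]                         # min priority inside each subtree
    for v in reversed(order):
        if parent[v] != -1:
            best[parent[v]] = min(best[parent[v]], best[v])
    heavy = [-1] * n
    for v in order:
        kids = [u for u in adj[v] if u != parent[v]]
        if kids:
            heavy[v] = min(kids, key=lambda u: best[u])
    return heavy
```

In the dynamic setting the comment describes, a newly added node starts its own chain and later changes chains only when its priority becomes a new subtree minimum on the way up, which is the Pareto-front argument giving expected O(log(n)) chain changes per node.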

»
2 months ago, Rev. 2, +13

There's also another way to do tree decomposition. I call it lowbit decomposition.

Consider the lowbit value ($$$x \mathbin{\&} -x$$$) of the DFS order of each leaf. Each non-leaf node extends its chain toward the child whose subtree contains the leaf with the largest lowbit value. It's not hard to prove that you visit at most $$$2 \log_2(N)$$$ chains between any pair of nodes.

https://gitlab.com/oToToT/chika/-/blob/main/lbd.hpp
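The linked `lbd.hpp` is the authoritative implementation; below is a Python sketch of my reading of the scheme (assuming leaves are numbered $$$1..L$$$ in DFS order; helper names are mine):

```python
def lowbit_heavy_children(adj, root=0):
    """Number the leaves 1..L in DFS order; each internal node extends its
    chain toward the child whose subtree contains the leaf whose DFS index
    has the largest lowbit (x & -x)."""
    n = len(adj)
    parent = [-1] * n
    order = []
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:                          # iterative DFS, preorder
        v = stack.pop()
        order.append(v)
        for u in reversed(adj[v]):        # preserve adjacency order
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)
    best = [0] * n                        # max leaf lowbit in the subtree
    leaves = 0
    for v in order:
        if all(u == parent[v] for u in adj[v]):   # v is a leaf
            leaves += 1
            best[v] = leaves & (-leaves)
    for v in reversed(order):
        if parent[v] != -1:
            best[parent[v]] = max(best[parent[v]], best[v])
    heavy = [-1] * n
    for v in order:
        kids = [u for u in adj[v] if u != parent[v]]
        if kids:
            heavy[v] = max(kids, key=lambda u: best[u])
    return heavy
```

On a star with leaves numbered 1, 2, 3 the lowbits are 1, 2, 1, so the chain from the center extends toward the second leaf.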

»
2 months ago, 0

Cool idea lol

»
2 months ago, 0

What if queries aren't random? Suppose a query is repeated many times, and it so happens that the number of light edges on its path to the root is very large. Indeed, one should also give an upper bound on the probability that at least one path is too long.

»
2 months ago, +14

Absolutely crazy data structure