Dominant Path Queries

→ Pay attention

Before contest
CodeTON Round 9 (Div. 1 + Div. 2, Rated, Prizes!)
33:42:41
Register now »

*has extra registration

→ Top rated

#	User	Rating
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

View all →

→ Find user

→ Recent actions

Detailed →

Epilogue's blog

Dominant Path Queries

By Epilogue, history, 14 months ago, In English

Hi!. I'm solving a problem. Given a tree with N nodes with colors and q queries. For each query given two integers (u, v), answer if the path(u, v) on the tree is dominant. A path is called dominant when the most frequently occuring element appears strictly more than pathlength / 2.

Are there any offline, online solutions for this? If you have any ideas please comment bellow.

Epilogue
14 months ago
23

Comments (23)

Write comment?

NeverCompromise

14 months ago, # |

-28

Did you try to google before asking this? If you did, you would atleast know that it can be solved easily offline with Mo's.

→ Reply

Epilogue

14 months ago, # ^ |

← Rev. 2 →

I know Mo for the subtree one but this one is path query. Can you using MO with it or Did you even read the problem properly?

→ Reply

NeverCompromise

14 months ago, # ^ |

-16

Did you even read the problem properly?

Yes, I read your problem properly. You however, still didn't read my comment properly. A simple google search yields this.

→ Reply

Epilogue

14 months ago, # ^ |

Oh I saw that one before but the constraint is like 5*10^5 so.

Anyways tks for reminding me that.

→ Reply

CentroidDecomposition

14 months ago, # |

← Rev. 6 →

Arbitrarily root the tree. Let $$$cnt(i, j)$$$ denote the frequency of color $$$j$$$ on the path from root to node $$$i$$$. Now the number of occurrences of a color $$$x$$$ in $$$path(u, v)$$$ is $$$cnt(u, x) + cnt(v, x) - 2 * cnt(lca(u, v), x) + (color[lca(u, v)] == x)$$$. A path is dominant if the number of occurrences of the median color in the path is strictly greater than $$$\frac{pathlength}{2}$$$. $$$cnt(i, j)$$$ can be calculated online in $$$\mathcal{O}(\log N)$$$ with the help of persistent segment tree and the median color in a path can be calculated by binary search with range sum queries on the same tree. Overall time complexity will be $$$\mathcal{O}(N \log N + Q \log^2 N)$$$ and memory complexity $$$\mathcal{O}(N \log N)$$$.

The problem can also be solved in $$$\mathcal{O}((N + Q) \log N)$$$ by an approach similar to walking on a segment tree once you have the persistent segment trees containing information on the frequency of colors on the path to root for each node. It can also be solved by a binary lifting approach using the fact that the majority of a sequence must be the majority of one of the sequences when it is partitioned into two.

→ Reply

Epilogue

14 months ago, # ^ |

Wow that's nice. I have thought about Persistent Segment Tree but I didn't think about finding the median on the path. Thanks.

→ Reply

JaberSH1

14 months ago, # ^ |

Actually you can get rid of the binary search and just search the segtree going to the child node with the maximum sum on each step, so it'll become $$$O(NlogN + QlogN)$$$.

→ Reply

CentroidDecomposition

14 months ago, # ^ |

← Rev. 3 →

~~You cannot do that.~~ The segment tree of node $$$i$$$ contains the information of the frequencies of colors on the path from root to node $$$i$$$. To get the frequency of colors on $$$path(u, v)$$$ you need to combine the information of three segment trees. On a step in binary search you need to check the sum of frequencies of colors less than mid. The median will be the largest element for which such sum is less than $$$\frac{pathlength}{2}$$$.

→ Reply

JaberSH1

14 months ago, # ^ |

You can keep the 3 segtree roots with you and go to their right or left children on each step of the search.

→ Reply

CentroidDecomposition

14 months ago, # ^ |

Yeah that works.

→ Reply

chromate00

14 months ago, # |

← Rev. 5 →

Two approaches I can think of.

Approach 1

Approach 2

→ Reply

bicsi

14 months ago, # ^ |

You don’t need HLD, you can just apply Approach 2 as a ‘binary lifting’-like dp and check the candidate in $$$O(\log n)$$$, making it $$$O(q \log n)$$$

→ Reply

chromate00

14 months ago, # ^ |

But how would you adapt the Boyer-Moore Majority Vote on the binary-lifting like method? Can we just try two possible candidates for the majority using the fact (which I have not proven, but does seem trivial) that if an element is a majority of $$$A + B$$$ it must either be a majority of $$$A$$$ or a majority of $$$B$$$?

→ Reply

bicsi

14 months ago, # ^ |

The fact that you mentioned is indeed true, it’s not too hard to prove. However, you can just aggregate the results on the two independent paths with the same combining function.

→ Reply

abdelrahman_rashed

14 months ago, # ^ |

Proof for "if an element is a majority of $$$A + B$$$ it must either be a majority of $$$A$$$ or a majority of $$$B$$$" : $$$\newline$$$ Suppose both $$$A$$$ and $$$B$$$ have length of $$$2n + 1$$$. Then let $$$X$$$ and $$$Y$$$ be the number of majority for $$$A$$$ and $$$B$$$ respectively each with count of $$$n + 1$$$. For the majority of $$$A + B$$$ to be some element other than $$$X$$$ or $$$Y$$$, such element will need to have a count greater than that of $$$X + Y$$$. The maximum count we can get for any element other than $$$X$$$ or $$$Y$$$ is $$$|A + B| - cnt(X) - cnt(Y) = 4n + 2 - 2(n + 1) = 2n$$$. Since $$$2n < 2(n + 1)$$$ then it is impossible for any number other than $$$X$$$ or $$$Y$$$ to be the majority of $$$A + B$$$. $$$\newline$$$
Explanation for "how would you adapt the Boyer-Moore Majority Vote on the binary-lifting like method" : $$$\newline$$$ let $$$lft(x, y)$$$ be the resulting node after lifting $$$x$$$ by $$$y$$$, and let $$$dp_{i,j}$$$ be a pair $$$(cnt, val)$$$ that stores the answer after applying Boyer-Moore on all nodes on the path from $$$lft(i, 2^j)$$$ to $$$i$$$. updating the $$$dp$$$ is as follows : $$$\newline$$$ $$$dp_{i,0} = (1, val_i) \newline$$$ $$$(dp_{i,j} \neq dp_{lft(i, 2^j),j}) \implies dp_{i,j+1} = max(dp_{i,j}, dp_{lft(i, 2^j),j}) \newline$$$ $$$otherwise \implies dp_{i, j+1} = (dp_{i,j}.first + dp_{lft(i, 2^j),j}.first, dp_{i,j}.second) \newline$$$ This $$$dp$$$ can be preprocessed in a similar manner to $$$lft$$$. Now lets consider each query in the form : "is $$$path(x, y)$$$ dominant?", you can get Boyer-Moore for $$$path(x, lca(x, y))$$$ in using binary lifting on $$$dp$$$ in a similar manner to getting $$$lca(x, y)$$$. The answer is simply to check if the value of Boyer-Moore on $$$path(x, y)$$$ is the majority or not.

→ Reply

Epilogue

14 months ago, # |

I solved it using Persistent Segment Tree, I think will try some others approach too.

→ Reply

0x0002

14 months ago, # |

Maybe you can randomly choose some vertexs and count the occurance of it. If you randomly choose $$$x$$$ times, the error rate will be $$$\dfrac{1}{2^x}$$$, so $$$x$$$ be about $$$O(\log V)$$$ is enough.

→ Reply

Mr.Whitebird

14 months ago, # ^ |

thats so cooolll

→ Reply

estoy-re-sebado

14 months ago, # |

← Rev. 3 →

First find a candidate for the most common color by calculating a pair (color, score) with a HLD path query. To combine pairs its easy. $$$(c1,s1)+(c2,s2)$$$ is just $$$(c1,s1+s2)$$$ if $$$c1=c2$$$ and $$$(c1,s1-s2)$$$ if $$$c1!=c2$$$ and $$$s1>s2$$$. Otherwise its $$$(c2, s2-s1)$$$

(The cool property of this operation is that, if there is a dominant color then the result of the operation has that color.

Then use another path query to compute real amount of appearances of that color along path (also HLD).

This way you can do online queries with updates.

It's probably overkill but maybe it gives some ideas

→ Reply

estoy-re-sebado

14 months ago, # |

← Rev. 2 →

Another idea

If the path is short (L < 2sqrt(N)) we can just check it in linear time using a hashmap or something.

If the path is long (L > 2sqrt(N)) the color must have at least L/2 appearances on the path, so it must have at least sqrt(N) appearances on the tree. This means that there will not be many colors that have this property and we can try and check all of them using something like binary lifting idk.

This works for online queries but no update s

→ Reply

TimDee

14 months ago, # ^ |

-8

I don't see that updates would be mentioned in post, so I think your solution can be further simplified:

We can have at most sqrt colours with freq >= sqrt.

For each we can build array d[c][u] = how many nodes have colour 'c' on path from u to root. Then calculating frequency of c on path from u to v is trivial in O(1), path_len >= 2sqrt can be solved in O(sqrt) per query. The other case with small length can be checked in O(length) as you said.

*(Note, it works only if ML allows NsqrtN memory)

→ Reply

1802042

14 months ago, # |

Epilogue problem link plz..

→ Reply

lmn207

11 months ago, # |

An O(q.logn.C) solution is you randomize choosing value in path (u, v) Than break the query(u, v) into queries(u, v, w): count number of value w in path (u, v) you can iterate the value of a and than build fenwick tree => we count in O(logn)

→ Reply