I'm stuck on the following problem:
Given a tree with $$$N$$$ nodes and a positive integer $$$D$$$ $$$(1 \leq N, D \leq 2\times 10^5)$$$, a set of marked nodes is a subset of nodes from the original graph such that marked nodes may not be closer to each other than distance $$$D$$$. What is the maximum number of marked nodes for this graph?
I tried tree dp with $$$dp[i][j]$$$ representing the answer for node $$$i$$$'s subtree such that all marked nodes in $$$i$$$'s subtree are $$$\geq j$$$ edges away from $$$i$$$ but it's too inefficient. Any ideas?
You can speed up your DP to $$$\mathcal O(N)$$$ with two optimizations:
A proof for why this is linear time is detailed in the last paragraph of this blog.
As an aside, this is exactly BOI 2017 – Cat in a tree.
I'm not sure how the first one helps with the linear solution. Let's consider a chain tree from $$$1$$$ to $$$n$$$ and say the deepest node in the tree (when rooted from $$$1$$$) is $$$n$$$. Then $$$dp[n]$$$ will have one entry, $$$dp[n - 1]$$$ will have $$$2$$$ entries and so on, $$$dp[1]$$$ will have $$$n$$$ entries. This makes $$$\mathcal{O}(n^2)$$$ memory. Where don't I understand, exactly?
I probably should have clarified, both optimizations are needed simultaneously for $$$\mathcal O(n)$$$ complexity, just one is not enough. Here's code for reference that I wrote a few months ago for the BOI problem.
I am not sure, how your n^2 dp is correct. Lets say we are at the ith node... if we select the ith node then we need to select nodes at a distance >= D from that node, this transition you mentioned in the blog. But if we do not select the ith node then all selected nodes need not to be at >=D distance from ith node as we can construct >=D distance by considering i as the LCA. You are constructing the answer considering all selected nodes are >=D from ith which may not be the case. Am I missing something?
According to my definition of dp[i][j], I got that if $$$2j \geq D$$$ then you can just get the answer from the child subtrees since marked nodes in different subtrees wouldn’t break the condition of being D edges apart. Otherwise, for an optimal set of marked nodes (still considering the subtree rooted at i), exactly one child node is between j and (max integer k such that 2k < D) edges away from i, and I’m trying to optimize the overall computation of this dp.
Agreed with the first para of your comment if $$$2j>=D$$$ this will not break any condition. But I am saying if that's not the case then also you can mark nodes. Like consider a tree with 4 nodes $$$(1,2) (1,3) (3,4)$$$ and you have rooted at 1 and lets say value of D=3. then at root how does your $$$dp[i][j]$$$ method gives correct answer? Here you can still select 3 and 4 , only dp[i][1] will give 2 as answer but $$$2*1<3$$$ which may violate the condition.
Let k be the least integer such that 2k >= D, then only one child subtree can have a marked node not >= k edges from i. Then dp[i][j] is the maximum of dp[v][j-1] + sum over the rest of the children of dp[u][k-1]. (edit: I think it is actually more complicated than this but it still works)
yeah thanks got it.
Greedy works. Root the tree somewhere and run DFS. After you visit all children of a particular node $$$u$$$, either:
To see that it works just note that, after this, we'll have the optimal pair (maximum number of marked nodes in the subtree of $$$u$$$, depth of highest marked node).