I hope you liked the contest. We apologize for the inconvenience caused by problem $$$I$$$.
595453A - Diverse Subarray
Author: Proelectro444
Inspiration
https://mirror.codeforces.com/problemset/problem/1748/b
Why the Classic Two-Pointer Technique Fails: The classic two-pointer technique relies on a monotonicity property: either every subarray of a diverse array must itself be diverse, or no subarray of a non-diverse array may be diverse. The diverse property satisfies neither, so a straightforward application of the two-pointer method does not work.
Instead, we modify the approach and compute the count in $$$\max(a)$$$ steps. In the $$$m^\text{th}$$$ iteration, we compute the number of diverse subarrays that have exactly $$$m$$$ distinct elements. Notice that if a subarray has exactly $$$m$$$ distinct elements, then in order to be diverse every element in the subarray must appear at most $$$m$$$ times.
Implementation
For each $$$m$$$ from $$$1$$$ to $$$\max(a)$$$:
- For every right endpoint $$$r$$$, calculate two left indices:
- $$$l_1$$$: the smallest index such that the subarray $$$a[l_1:r]$$$ has exactly $$$m-1$$$ distinct elements.
- $$$l_2$$$: the smallest index such that the subarray $$$a[l_2:r]$$$ has exactly $$$m$$$ distinct elements.
These can be computed efficiently with the two-pointer technique in $$$\mathcal{O}(n)$$$ time.
- Subsequently, for every right endpoint $$$r$$$, calculate the smallest left index $$$l_3$$$ such that in the subarray $$$a[l_3:r]$$$ every element appears at most $$$m$$$ times.
This, too, can be determined using a two-pointer approach in $$$\mathcal{O}(n)$$$ time.
Note that for all $$$l$$$ with $$$l_2 \le l \lt l_1$$$, if $$$l_3 \le l$$$ then the subarray $$$a[l:r]$$$ is diverse.
Thus, the number of valid subarrays ending at $$$r$$$ for this iteration is given by: $$$\max(0,\, l_1 - \max(l_2, l_3))$$$.
The final answer is then: $$$\sum_{m=1}^{\max(a)} \sum_{r=1}^{n} \max(0,\, l_1 - \max(l_2, l_3))$$$.
Time Complexity: $$$\mathcal{O}(n \cdot \max(a))$$$
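To make the per-$$$m$$$ pass concrete, here is a sketch in Python. It assumes, as in the inspiration problem, that a subarray is diverse when every element's frequency is at most its number of distinct elements; the function and variable names are illustrative, not from the editorial's code.

```python
from collections import defaultdict

def count_diverse_subarrays(a):
    """Count subarrays in which every element's frequency is at most the
    number of distinct elements (assumed meaning of 'diverse')."""
    n, mx = len(a), max(a)
    total = 0
    for m in range(1, mx + 1):
        f1, f2, f3 = defaultdict(int), defaultdict(int), defaultdict(int)
        d1 = d2 = 0        # distinct counts of windows [p1..r] and [p2..r]
        over = 0           # number of values with frequency > m in [p3..r]
        p1 = p2 = p3 = 0   # the three left pointers (0-indexed)
        for r in range(n):
            x = a[r]
            f1[x] += 1; d1 += (f1[x] == 1)
            f2[x] += 1; d2 += (f2[x] == 1)
            f3[x] += 1; over += (f3[x] == m + 1)
            while d1 > m - 1:   # p1: smallest l with <= m-1 distinct in a[l..r]
                f1[a[p1]] -= 1; d1 -= (f1[a[p1]] == 0); p1 += 1
            while d2 > m:       # p2: smallest l with <= m distinct in a[l..r]
                f2[a[p2]] -= 1; d2 -= (f2[a[p2]] == 0); p2 += 1
            while over > 0:     # p3: smallest l with every frequency <= m
                over -= (f3[a[p3]] == m + 1); f3[a[p3]] -= 1; p3 += 1
            # l with exactly m distinct lie in [p2, p1); diverse needs l >= p3
            total += max(0, p1 - max(p2, p3))
    return total
```

Each subarray with $$$d$$$ distinct elements is counted exactly once, in the iteration $$$m = d$$$.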
Code: 310459724
595453B - Assosiative
Author: Proelectro444
Solution for $$$n = 2k + 2$$$
Let us define two vectors $$$\mathbf{v}$$$ and $$$\mathbf{w}$$$ as follows:
Here, $$$\mathrm{IDT}$$$ denotes the identity element of the operation $$$@$$$. We then define
Operation Count
- Vector $$$\mathbf{v}$$$: Computed in $$$k$$$ operations total (by building partial products successively).
- Vector $$$\mathbf{w}$$$: Computed in $$$k$$$ operations in a similar way.
- Vector $$$\mathbf{b}$$$: Computed in $$$k$$$ operations by computing each $$$b_i$$$.
Hence, the total number of operations is $$$3k$$$.
Extending to the Case $$$n \neq 2k + 2$$$
When $$$n \neq 2k + 2$$$, partition the array $$$\mathbf{a}$$$ into chunks of length $$$k+2$$$. For each chunk, apply the same procedure to compute the corresponding part of $$$\mathbf{b}$$$. Each chunk requires $$$3k$$$ operations, taking a total of roughly $$$3k \cdot \frac{n}{k+2} \lt 3n$$$ operations.
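The definitions of $$$\mathbf{v}$$$, $$$\mathbf{w}$$$, and $$$\mathbf{b}$$$ are in formulas not reproduced above, so the following is only a sketch of the underlying partial-products idea, under the assumption that the task is to compute the $$$@$$$-product of every window of length $$$k+1$$$; `sliding_products`, `op`, and `idt` are illustrative names. Each block stores suffix products up to its right end and prefix products from its left end, so any window is the combination of at most two precomputed products, about three `op` calls per element and no inverses needed.

```python
def sliding_products(a, k, op, idt):
    """Sketch (assumed task): b[i] = a[i] @ a[i+1] @ ... @ a[i+k] for every
    window of length k+1, where op is associative and idt is its identity.
    idt is only a placeholder for never-combined cells."""
    n, w = len(a), k + 1
    suf = [idt] * n   # suffix products within each length-(k+1) block
    pre = [idt] * n   # prefix products within each block
    for L in range(0, n, w):
        R = min(L + w, n) - 1
        suf[R] = a[R]
        for j in range(R - 1, L - 1, -1):   # ~1 op per element
            suf[j] = op(a[j], suf[j + 1])
        pre[L] = a[L]
        for j in range(L + 1, R + 1):       # ~1 op per element
            pre[j] = op(pre[j - 1], a[j])
    b = []
    for i in range(n - k):                  # ~1 op per window
        if i % w == 0:                      # window coincides with one block
            b.append(suf[i])
        else:                               # window spans two adjacent blocks
            b.append(op(suf[i], pre[i + k]))
    return b
```

With a non-commutative `op` such as string concatenation, the left-to-right order of the operands is preserved, which is why only associativity is required.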
Code: 310459799
Input and Function Definitions
Input Description
The interactor receives the following parameters: n, k, h, and fnc.
- n and k are defined as in the problem statement (they are identical).
- h is a unique key.
- fnc is the function number.
Constants
- mod = $$$2^{30}$$$ (i.e., 1,073,741,824).
Function Definitions
The functions are defined for integer inputs x and y.
- f₀(x, y)
- f₁(x, y)
- f₂(x, y)
- f₃(x, y)
- f₄(x, y)
- f₅(x, y)
Note: Here, $$$\oplus$$$ represents the bitwise XOR operation.
595453C - Large Regions
Author: ghoul932
Definitions
Let $$$T$$$ be a tree of size $$$n$$$, and let $$$k$$$ be a positive integer. Define a partition of $$$T$$$ as a collection of disjoint subtrees whose union is $$$T$$$. A subtree is called a good tree if its size is at least $$$k$$$. The good value of $$$T$$$ is the maximum number of good trees that can appear in any partition of $$$T$$$.
A subtree $$$X \subseteq T$$$ is called a minimal subtree with respect to $$$k$$$ if $$$|X| \geq k$$$ and every proper subtree $$$Y \subsetneq X$$$ satisfies $$$|Y| \lt k$$$.
Theorem
There exists an optimal partition of $$$T$$$ (i.e., one attaining the good value) that includes every minimal subtree $$$X$$$ as one of its parts.
Proof
Let $$$P^*$$$ be an optimal partition of $$$T$$$, meaning that it contains the maximum possible number of good trees.
Claim: The nodes of $$$X$$$ must be contained in at most one good tree in $$$P^*$$$.
Proof by contradiction: Suppose the nodes of $$$X$$$ are distributed among at least two good trees in $$$P^*$$$. Let $$$T_1$$$ and $$$T_2$$$ be two such good trees, and assume without loss of generality that $$$T_1$$$ contains the top node (or root) of $$$X$$$. Since $$$X$$$ is a minimal subtree, every proper subtree of $$$X$$$ has size strictly less than $$$k$$$. In particular, the intersection $$$X \cap T_2$$$ is a proper subset of $$$X$$$ and thus must have size $$$|X \cap T_2| \lt k$$$. This contradicts the assumption that $$$T_2$$$ is a good tree (i.e., has size at least $$$k$$$). Therefore, $$$X$$$ can be contained in at most one good tree in $$$P^*$$$.
Since at most one good tree in $$$P^*$$$ contains the nodes of $$$X$$$, we can replace this tree with $$$X$$$ itself (along with any necessary adjustments to maintain a partition) without decreasing the number of good trees in $$$P^*$$$. This guarantees that there exists an optimal partition that includes $$$X$$$ as an independent subtree.
Consequently, the greedy strategy of selecting every minimal subtree (with respect to $$$k$$$) to construct a partition finds the good value of $$$T$$$.
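The theorem justifies a simple greedy for a single $$$k$$$. A minimal sketch (illustrative names; the editorial's actual solution amortizes this over all $$$k$$$ with chain decomposition and a segment tree, as described below):

```python
def good_value(n, edges, k):
    """Greedy from the theorem: traverse children before parents; whenever a
    node's remaining subtree size reaches k, cut that (minimal) subtree off
    and count it. O(n) for one fixed k."""
    adj = [[] for _ in range(n + 1)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = [0] * (n + 1)
    visited = [False] * (n + 1)
    order = []
    stack = [1]
    visited[1] = True
    while stack:                      # iterative DFS for a top-down order
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                parent[v] = u
                stack.append(v)
    size = [1] * (n + 1)
    size[0] = 0
    cuts = 0
    for u in reversed(order):         # children are processed before parents
        if size[u] >= k:              # u roots a minimal subtree: cut it
            cuts += 1
            size[u] = 0               # removed; contributes nothing upward
        size[parent[u]] += size[u]
    return cuts
```

For example, a path on $$$5$$$ vertices has good value $$$2$$$ for $$$k=2$$$, while a star on $$$5$$$ vertices has good value $$$1$$$, since no set of leaves forms a connected part without the center.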
Additional Definitions
A chain in a tree $$$T$$$ is a sequence of nodes $$${v_1, v_2, \dots, v_m}$$$ such that they form a connected path in the tree and for every pair of nodes $$$(v_i, v_j)$$$ in the chain, one is an ancestor of the other in $$$T$$$.
A chain decomposition of $$$T$$$ is a decomposition of the tree into chains such that each node is in exactly one chain.
A subchain of a chain $$$X$$$ is a chain $$$Y$$$ such that an ancestor of each node in $$$Y$$$ lies in $$$X$$$.
An extended tree is the tree formed by a chain and all its subchains.
A minimal chain is a chain whose extended tree has size $$$\geq k$$$ but all its subchains have extended trees of size $$$ \lt k$$$.
Greedy Algorithm
The greedy algorithm removes minimal subtrees at each step. The root of a minimal subtree lies in the minimal chain. The idea is to maintain the subtree size of each node and find the lowest node (considering the root of the chain to be the highest) in the chain with subtree size $$$\geq k$$$.
- Find a chain decomposition of the tree.
- Store the count of descendants in the same chain for each node.
When the size of an extended tree becomes $$$ \lt k$$$, add its size to its ancestors in the immediate parent chain (the chain which contains the parent of the root of the extended tree) and remove the current chain from consideration.
$$$O(\frac{n}{k})$$$ modify and query operations on a segment tree are needed to find the value for a fixed $$$k$$$; summing $$$\frac{n}{k}$$$ over all $$$k$$$ gives $$$O(n\log n)$$$ operations in total, each costing $$$O(\log n)$$$, so the total time complexity is $$$O(n\log^2 n)$$$.
Code: 310459944
595453D - Another Mex Problem
Author: Soumil69
Firstly, see that for $$$A[j] = k$$$, we must have every integer from $$$1$$$ to $$$k-1$$$ within $$$P[1:j]$$$. So, $$$A[j] \le j+1$$$.
Secondly, since $$$P$$$ is a permutation, $$$A[n] = n+1$$$.
Also, $$$A[i] \le A[j]$$$ for all $$$i \le j$$$, as all integers in $$$P[1:i]$$$ are present in $$$P[1:j]$$$ as well.
These conditions are necessary and sufficient for $$$A$$$ to be valid.
Now, see that if $$$A[j] = k_1$$$ and $$$A[j+1] = k_2$$$ for some $$$k_2 \gt k_1$$$, then $$$P[j+1] = k_1$$$. This is because $$$P[i] \neq k_1$$$ for all $$$i \le j$$$ (otherwise $$$A[j] \ne k_1$$$), and $$$P[i] = k_1$$$ for some $$$i \le j+1$$$ (because $$$A[j+1] \gt k_1$$$). From these two, we get $$$P[j+1] = k_1$$$.
As stated earlier, for $$$A[j] = k$$$, every integer from $$$1$$$ to $$$k-1$$$ must appear in $$$P[1:j]$$$.
Every permutation $$$P$$$ satisfying the above two constraints will have $$$A$$$ as its corresponding prefix MEX array.
To enforce these two while counting the number of permutations $$$P$$$, we use the following method:
- Initialization: Set $$$ans = 1$$$; this variable will track the number of possible permutations $$$P$$$.
- Iteration: Iterate from $$$j = n$$$ to $$$j = 2$$$, and maintain a variable $$$rem$$$, which is the number of integers that can be placed at the current position in $$$P$$$.
- If $$$A[j-1] = A[j]$$$, multiply $$$ans$$$ by $$$rem$$$ and decrease $$$rem$$$ by $$$1$$$ to account for the integer we just placed.
- Else if $$$A[j-1] \ne A[j]$$$, then $$$P[j] = A[j-1]$$$, so $$$ans$$$ remains the same, and $$$rem$$$ increases by $$$A[j] - A[j-1] - 1$$$, because now we can place integers from $$$A[j-1]+1$$$ to $$$A[j]-1$$$ anywhere before this $$$j$$$.
Time Complexity: $$$O(n)$$$
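The counting procedure above can be sketched as follows. The validity checks and the handling of position $$$1$$$ via a conceptual $$$A[0] = 1$$$ (the MEX of an empty prefix) are my reading of the iteration, and the names are illustrative:

```python
def count_permutations(A, mod=10**9 + 7):
    """Count permutations P of 1..n whose prefix MEX array equals A
    (MEX counted from 1, so A[n] must be n+1)."""
    n = len(A)
    # necessary conditions: A[n] = n+1, A[j] <= j+1, A non-decreasing
    if n == 0 or A[-1] != n + 1:
        return 0
    if any(A[j] < 1 or A[j] > j + 2 for j in range(n)):   # 0-indexed bound
        return 0
    if any(A[j] > A[j + 1] for j in range(n - 1)):
        return 0
    prev = [1] + list(A)            # prev[j] = A[j], with conceptual A[0] = 1
    ans, rem = 1, 0
    for j in range(n, 0, -1):       # iterate j = n .. 1
        if prev[j - 1] == prev[j]:
            ans = ans * rem % mod   # a free choice for P[j]
            rem -= 1
        else:                       # P[j] is forced to equal prev[j-1]
            rem += prev[j] - prev[j - 1] - 1
        if rem < 0:
            return 0
    return ans
```

For instance, $$$A = [1, 1, 4]$$$ yields $$$2$$$ (the permutations $$$[2,3,1]$$$ and $$$[3,2,1]$$$), while $$$A = [1, 3]$$$ yields $$$1$$$.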
Code: 310460072
595453E - Boredom
Author: BladeRunner
Let's find the beauty of only the string $$$s$$$.
We firstly observe that the final string must be some subsequence of $$$s$$$, but not every subsequence of $$$s$$$ can be obtained. For example, you cannot obtain $$$AA$$$ from $$$ABA$$$. In fact, the count is the number of distinct subsequences of $$$s$$$ for which, if the subsequence contains any two consecutive $$$A$$$s, then there are no $$$B$$$s in $$$s$$$ between the two positions corresponding to these $$$A$$$s. Similarly, if the subsequence starts or ends with $$$A$$$, then there should be no $$$B$$$s before or after the position in $$$s$$$ corresponding to this $$$A$$$. Call such a subsequence good. Call a subsequence semi-good if we relax only the constraint that there should be no $$$B$$$s after the position of the last $$$A$$$ when the subsequence ends in $$$A$$$ (so $$$B$$$s are allowed after this position).
Let $$$G$$$ denote the set of distinct good subsequences of $$$s$$$. Let $$$T_i$$$ denote the set of distinct semi-good subsequences that can be obtained from the string $$$s[1,i]$$$ (the prefix of $$$s$$$ of length $$$i$$$). Clearly, $$$T_{i-1} \subseteq T_i$$$. Let $$$S_i = T_i \setminus T_{i-1}$$$ (with $$$T_0 = \emptyset$$$).
Any string $$$t \in G$$$ also belongs to a unique $$$S_i$$$, but not all strings in $$$\bigcup_{i=1}^{n} S_i$$$ are in $$$G$$$. For example, if $$$s$$$ is $$$AAAB$$$, then $$$AAA \in S_3$$$ but $$$AAA \notin G$$$. Moreover, $$$S_i \cap S_j = \emptyset$$$ for $$$i \neq j$$$.
We first try to find $$$dp_i = |S_i|$$$
for each $$$i$$$. Consider any string $$$w \in S_i$$$. We try to think what happens when we append an $$$A$$$ or $$$B$$$ to $$$w$$$. Since $$$w \in T_i$$$ and $$$w \notin T_{i-1}$$$, the last character of $$$s[1,i]$$$ must also be the last character of $$$w$$$. So let's take cases on the last character of $$$w$$$.
- Case 1: The last character of $$$w$$$ is $$$B$$$.
- $$$wB \in S_{j_1}$$$ where $$$j_1$$$ is the least index $$$ \gt i$$$ for which $$$s[j_1] = B$$$. So we do $$$dp_{j_1} \gets dp_{j_1} + dp_i$$$.
If no such $$$j_1$$$ exists, we skip this step.
- $$$wA \in S_{j_2}$$$ where $$$j_2$$$ is the least index $$$ \gt i$$$ for which $$$s[j_2] = A$$$. So we do $$$dp_{j_2} \gets dp_{j_2} + dp_i$$$.
If no such $$$j_2$$$ exists, we skip this step.
- Case 2: The last character of $$$w$$$ is $$$A$$$.
- $$$wB \in S_{j_1}$$$ where $$$j_1$$$ is the least index $$$ \gt i$$$ for which $$$s[j_1] = B$$$. So we do $$$dp_{j_1} \gets dp_{j_1} + dp_i$$$.
If no such $$$j_1$$$ exists, we skip this step.
- The case for appending $$$A$$$ is more complex. Firstly, observe that the number of consecutive $$$A$$$s at the end of $$$w$$$ must be the same as the number of consecutive $$$A$$$s at the end of $$$s[1,i]$$$, otherwise $$$w \notin S_i$$$. If $$$s[i+1] = A$$$ then clearly $$$wA \in S_{i+1}$$$. However, if $$$s[i+1]$$$ is $$$B$$$:
- Suppose $$$w$$$ contains a $$$B$$$, then $$$wA$$$ belongs to $$$S_{j_1}$$$ where $$$j_1$$$ is the least index $$$ \gt i$$$ for which the number of consecutive $$$A$$$s at the end of $$$s[1,j_1]$$$ is greater than the number of consecutive $$$A$$$s at the end of $$$s[1,i]$$$. We skip this step if no such $$$j_1$$$ exists.
- Suppose $$$w$$$ does not contain any $$$B$$$. This can only happen when $$$w = s[1,i]$$$ (because if $$$s[1,i]$$$ contains a $$$B$$$, then by the nature of the operation, $$$w$$$ must contain a $$$B$$$ too). Thus, $$$wA$$$ can never be an element of any $$$S_j$$$ for $$$j \gt i$$$.
Clearly, the $$$dp$$$ array can be computed in $$$O(n)$$$ space and time. Now, for each $$$S_i$$$, let's consider which strings in it are contained in $$$G$$$.
- If $$$s[i]$$$ is $$$B$$$, then any string in $$$S_i$$$ is also in $$$G$$$. So we update $$$ans \gets ans + dp_i$$$.
- If $$$s[i]$$$ is $$$A$$$ and $$$s[1,i]$$$ contains a $$$B$$$ (which means all strings in $$$S_i$$$ contain a $$$B$$$), then we update $$$ans \gets ans + dp_i$$$
only if the number of $$$A$$$s at the end of $$$s[1,i]$$$ is $$$\leq$$$ the number of $$$A$$$s at the end of $$$s$$$. Otherwise, $$$ans$$$ doesn't change.
- If $$$s[i]$$$ is $$$A$$$ and $$$s[1,i]$$$ does not contain a $$$B$$$ (which means $$$S_i = \{s[1,i]\}$$$), then we update $$$ans \gets ans + dp_i$$$
only if $$$s$$$ does not contain a $$$B$$$. Otherwise, $$$ans$$$ doesn't change.
Thus, the problem can be solved in $$$O(n)$$$ time for a single string $$$s$$$.
Now we are asked to find the beauty count for each prefix of $$$s$$$. Observe that the same $$$dp$$$ array can be used for solving each prefix! For the string $$$s[1,i]$$$, we do the following:
- If $$$s[1,i]$$$ contains only $$$A$$$s, then the answer for this prefix is simply $$$i$$$.
- If $$$s[1,i]$$$ ends in a $$$B$$$, then
- If $$$s[1,i]$$$ ends in an $$$A$$$, then
where $$$I$$$ is the set of all $$$j \lt i$$$ for which the number of consecutive $$$A$$$s at the end of $$$s[1,j]$$$ is $$$\leq$$$ the number of consecutive $$$A$$$s at the end of $$$s[1,i]$$$ (note that $$$s[1,j]$$$ can also end in $$$B$$$).
All of this can be efficiently computed as we process the string from left to right using a Fenwick tree.
Thus, the overall time complexity is $$$O(n \log n)$$$.
Code: 310460120
595453F - Count Valid Arrays
Author: BladeRunner
A greedy algorithm for checking whether there exists a subsequence of $$$a$$$ such that $$$a_{i_j} \ge b_j$$$ for all $$$j$$$, is as follows: Initialize $$$i = 1$$$. For $$$j$$$ from $$$1$$$ to $$$|b|$$$, find the first position $$$k \ge i$$$ such that $$$a_k \ge b_j$$$. If no such position exists, the required subsequence does not exist. Otherwise, set $$$i = k + 1$$$ and continue. If the loop terminates, a required subsequence exists.
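The greedy check described above can be sketched directly (the name `dominates` is illustrative):

```python
def dominates(a, b):
    """Greedy check: does `a` contain a subsequence i_1 < i_2 < ... with
    a[i_j] >= b[j] for every j?"""
    i = 0
    for x in b:
        while i < len(a) and a[i] < x:   # first position i with a[i] >= x
            i += 1
        if i == len(a):
            return False                 # ran out of positions for this b[j]
        i += 1                           # match found; continue after it
    return True
```

Correctness follows from a standard exchange argument: taking the earliest feasible position for each $$$b_j$$$ leaves the largest possible suffix of $$$a$$$ for the remaining elements.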
We can use this greedy algorithm to count the number of valid arrays $$$b$$$. Let $$$dp_{i,x}$$$ denote the number of new valid arrays $$$b$$$ formed, ending at $$$x$$$ if an element $$$\ge x$$$ is found at some position $$$ \gt i$$$ in $$$a$$$. We have
Iterate over $$$i$$$ from $$$1$$$ to $$$n$$$. The number of new valid arrays formed at index $$$i$$$ is $$$s = \sum_{x=1}^{a_i} dp_{i-1,x}$$$.
Add $$$s$$$ to the answer. For every new valid array formed, we can form another one by appending a value $$$y \geq 1$$$ after finding an element $$$\geq y$$$ in $$$a$$$ after index $$$i$$$. Since the valid arrays corresponding to $$$dp_{i-1, y}$$$ $$$(y \leq a_i)$$$ are formed at index $$$i$$$, those $$$dp$$$ values do not carry over to $$$dp_i$$$. So, update the $$$dp$$$ values as follows:
This dynamic programming solution runs in $$$O(n \cdot \max a)$$$ time, but it can be optimized to $$$O(n \log n)$$$ by using a lazy segment tree to store the $$$dp$$$ array for a fixed $$$i$$$. This segment tree supports range set and range sum updates, and range sum queries. The transition from $$$i-1$$$ to $$$i$$$ requires us to obtain the range sum of the $$$dp$$$ array from $$$1$$$ to $$$a_i$$$, set that range to $$$0$$$, and add this sum to the entire $$$dp$$$ array.
Code: 310460238
595453G - Array Walking
Author: MridulAhi
Suppose we are solving for a prefix of length $$$l$$$. Let $$$S$$$ be the set of indices in the prefix such that the element originally at that index is encountered during the process (i.e., for every element originally at an index in $$$S$$$, there is some position such that, when we move to that position during the process, we find this element there and add its value to the cost). Define $$$T$$$ as the complement of $$$S$$$ within this prefix. To visit an index in $$$T$$$ with an element originally from outside $$$T$$$ present at it, we must swap the element at that index with some element originally outside $$$T$$$ before visiting it; hence, at least $$$|T|$$$ swaps are necessary.
Let $$$m = \min(a_1,a_2,\dots,a_n)$$$ be the minimum value in the array.
Claim: If $$$\geq|T|+1$$$ swaps are made, then it is optimal to perform all swaps involving the minimum element $$$m$$$, and visit all indices in $$$T$$$ with the cost of that element.
Proof: If $$$\geq|T|+1$$$ swaps are made, then the cost of the process is at least $$$\sum_{i\in S} a_i \;+\; (|T|+1)\cdot k \;+\; m\cdot|T|.$$$ We can achieve this cost using the following process:
Let $$$X = T \cup \{i_m\}$$$, where $$$i_m$$$ is an index with $$$a_{i_m} = m$$$. Iterate from $$$x=0$$$ to $$$l-1$$$. If $$$x+1\in X$$$, swap the current position of $$$m$$$ with $$$x+1$$$. Move to $$$x+1$$$.
Now, we are left with the case of exactly $$$|T|$$$ swaps. For each index in $$$T$$$, we must perform exactly one swap to it, either from an index in $$$S$$$ or from the suffix (positions $$$l+1,\dots,n$$$). No other type of swap is allowed, as the required number of swaps would then exceed $$$|T|$$$. Notice that if an index $$$i$$$ in $$$S$$$ is swapped with an index in $$$T$$$ before position $$$i$$$ is reached in the process, an additional swap would be required later to recover that position (since position $$$i$$$ then holds an element from $$$T$$$ and has not yet been reached). Hence, an index $$$i$$$ in $$$S$$$ can only be swapped to an index $$$j$$$ in $$$T$$$ with $$$j \gt i$$$ (after position $$$i$$$ has been reached), while indices in the suffix can be swapped to any position in the prefix at any time.
Thus, for this case, the cost of reaching an index $$$x$$$ (i.e. moving from position $$$x-1$$$ to $$$x$$$, where the move normally costs $$$a_x$$$) is $$$\min\bigl(a_x,\; \operatorname{pref}_{\min}(x)+k,\; \operatorname{suf}_{\min}(l+1)+k\bigr),$$$
where $$$\operatorname{pref}_{\min}(x)$$$ is the minimum element in the prefix up to $$$x$$$ and $$$\operatorname{suf}_{\min}(l+1)$$$ is the minimum element in the suffix (i.e. positions $$$l+1,\dots,n$$$). The summation of this over $$$x\le l$$$ can be computed for each $$$l$$$ by iterating from $$$l=1$$$ to $$$n$$$, maintaining in a multiset the values of $$$\min\bigl(a_x,\; \operatorname{pref}_{\min}(x)+k\bigr)$$$ that exceed $$$\operatorname{suf}_{\min}(l+1)+k$$$, and their sum, so that we can replace these costs with $$$\operatorname{suf}_{\min}(l+1)+k$$$ to compute the cost for this case.
For the first case (with at least $$$|T|+1$$$ swaps), the total cost can be viewed as covering each element with $$$\min(a_i,\; m+k)$$$ and adding an extra cost of $$$k$$$ for the additional swap.
Code: 310468120
595453H - K+1 not K
Author: BladeRunner
The following pattern works for any $$$n$$$ and $$$k$$$:
Each element from $$$1$$$ to $$$n$$$ occurs twice in the array. Furthermore, for any two occurrences of an element $$$e$$$, there are at least $$$k-1$$$ different elements between them. Thus, no subarray of length $$$k$$$ can contain a repeated element.
Finally, the $$$(k+1)$$$-length subarray
contains the repeated element $$$k$$$.
Time Complexity: $$$O(n)$$$
Code: 310460341
595453J - Mex Tree
Author: Soumil69
First note that for any two vertices $$$u$$$ and $$$v$$$, to minimise the sum of weights, it is always optimal to take the path either directly from $$$u$$$ to $$$v$$$, or from $$$u$$$ to $$$x$$$ to $$$v$$$, where $$$x$$$ is a vertex on the unique path from $$$u$$$ to $$$v$$$ in the original tree.
Second, note that the weight of any edge in the complete graph lies in $$$\{0,1,2,3\}$$$; that is, the range of the function $$$d$$$ is $$$\{0,1,2,3\}$$$.
To see this, let $$$u$$$ and $$$v$$$ be any two vertices, let the path between them be $$$u$$$, $$$v_1$$$, $$$v_2$$$, $$$\dots$$$, $$$v$$$, and let $$$a_1$$$, $$$a_2$$$, $$$\dots$$$ be the values assigned to these vertices. If no vertex on the path is assigned the value $$$0$$$, clearly $$$\mathrm{MEX}(S)=0$$$ and correspondingly $$$d(u,v)=0$$$. If $$$0$$$ is present, we take two cases.
Say $$$a_i=0$$$. If neither $$$a_{i-1}$$$ nor $$$a_{i+1}$$$ is equal to $$$1$$$, then we can easily divide the path into two parts, say $$$u$$$ to $$$x$$$ and $$$x$$$ to $$$v$$$, such that the $$$\mathrm{MEX}$$$ of one part is one and that of the other is zero. Hence in this case, $$$d(u,v)=1$$$.
If either of them is $$$1$$$, (without loss of generality, assume $$$a_{i+1}=1$$$), we further divide this into two cases. It can be similarly seen by case analysis that if $$$a_{i-1}=2$$$ then $$$d(u,v)=3$$$ and if $$$a_{i-1} \neq 2$$$, then $$$d(u,v)=2$$$.
Hence, we can see that the value of $$$d(u,v)$$$ for any vertices is determined by the position of vertices with values $$$0$$$, $$$1$$$, and $$$2$$$, whether they are in the path between $$$u$$$ and $$$v$$$ in the tree or not.
In the optimal assignment, we want to maximize the number of occurrences of the pattern $$$2$$$-$$$0$$$-$$$1$$$ in the paths. Hence, the values $$$2$$$, $$$0$$$, and $$$1$$$ need to occur in this order.
To maximize the required occurrences, we run a DFS and, at each step, consider placing $$$0$$$ at the current node. Then, to maximize the occurrences, we place $$$1$$$ at the root of the subtree (of the original tree, considering the current node as the root) containing the largest number of nodes, and $$$2$$$ at the root of the subtree with the second largest number of nodes.
This is because if the sizes of the subtrees with roots assigned $$$1$$$ and $$$2$$$ are $$$a_1$$$ and $$$a_2$$$, respectively, then it can be shown that the answer we can achieve in this setting is:
where the first term represents the case when $$$d(u,v)\ge1$$$, the second term represents the case when $$$d(u,v)\ge2$$$, and the third term represents when $$$d(u,v)\ge3$$$.
Now, we check whether the value that we have achieved with $$$0$$$ at the current node is the maximum we have achieved so far, and if it is so, we update the answer.
Note: If the current node is a leaf node, we would have to handle it separately, as it would have only one subtree.
Time Complexity: $$$O(n\log(n))$$$
Code: 310460432
595453K - Tree and Queries
Author: Azm1t
A naive approach for solving this problem would be to calculate the minimum distance from every node in set $$$S$$$ and every node in set $$$T$$$ to all vertices of the tree using either dynamic programming (DP) or multi-source Dijkstra. Then, we take the minimum sum over all nodes. However, this approach runs in $$$O(n)$$$ per query, which is inefficient for large values of $$$n$$$. We need to optimize this solution.
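A sketch of this naive per-query approach, using multi-source Dijkstra and the fact that the shortest $$$s$$$-$$$t$$$ path passes through some vertex where the two distance arrays meet (names are illustrative):

```python
import heapq

def min_dist_between_sets(n, edges, S, T):
    """Naive per-query solution on a weighted tree with vertices 1..n:
    multi-source Dijkstra from S and from T, then take the best meeting
    vertex. O(n log n) per query."""
    adj = [[] for _ in range(n + 1)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def multi_source(src):
        dist = [float('inf')] * (n + 1)
        pq = []
        for s in src:
            dist[s] = 0
            pq.append((0, s))
        heapq.heapify(pq)
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue                  # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(pq, (d + w, v))
        return dist

    dS, dT = multi_source(S), multi_source(T)
    # for any u, dS[u] + dT[u] >= min dist(s, t), with equality on the path
    return min(dS[u] + dT[u] for u in range(1, n + 1))
```

On a tree the Dijkstra passes could be replaced by two DFS sweeps, but the heap version keeps the sketch short.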
Let the minimum distance be achieved between some $$$s \in S$$$ and some $$$t \in T$$$. Then, the shortest path between them must pass through their Lowest Common Ancestor (LCA). Thus, the only relevant nodes we need to consider are:
- All nodes in $$$S$$$
- All nodes in $$$T$$$
- All nodes $$$u \in V$$$ such that $$$u = \text{LCA}(s, t)$$$ for some $$$s \in S$$$, $$$t \in T$$$
Enumerating all such LCAs is a common idea in tree problems. We run a Depth First Search (DFS) from any root and sort the vertices of $$$S \cup T$$$ according to their in-times. Let these sorted vertices be $$$v_1, v_2, \dots, v_{|S| + |T|}$$$. We only consider the LCAs of consecutive pairs $$$(v_i, v_{i+1})$$$ for $$$1 \leq i \lt |S| + |T|$$$.
Once we have the complete set of nodes, say $$$X$$$, we need to construct a tree using only these vertices. This structure is called the Virtual Tree of $$$(S \cup T)$$$. A simple way to construct it is as follows:
- Sort the nodes in $$$X$$$ by their DFS in-times.
- Construct a Range Minimum Query (RMQ) structure over the depths of the nodes in $$$X$$$ (depth from the root, not weighted distance).
- Recursively build the virtual tree:
- Find the node in the range $$$[l, r]$$$ with the minimum depth (say at index $$$\text{idx}$$$).
- Make this node the root of the virtual tree.
- Recursively construct the left subtree using the range $$$[l, \text{idx} - 1]$$$ and the right subtree using the range $$$[\text{idx} + 1, r]$$$.
The edge lengths in the Virtual Tree are computed as $$$\text{dist}[v] - \text{dist}[u]$$$ for an edge from parent $$$u$$$ to child $$$v$$$,
where $$$\text{dist}[i]$$$ denotes the distance of the $$$i^{th}$$$ vertex from the initial DFS root.
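A compact sketch of virtual-tree construction. It uses the equivalent stack-based build rather than the recursive RMQ build described above, a naive $$$O(\text{depth})$$$ LCA for brevity, and omits the edge lengths (the dist differences); names are illustrative:

```python
def build_virtual_tree(n, edges, nodes):
    """Virtual tree of `nodes` in a tree with vertices 1..n rooted at 1.
    Returns a dict mapping each virtual-tree node to its children."""
    adj = [[] for _ in range(n + 1)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    tin = [0] * (n + 1); tout = [0] * (n + 1)
    depth = [0] * (n + 1); par = [0] * (n + 1)
    timer = 0
    stack = [(1, 0)]
    while stack:                           # iterative DFS for in/out times
        u, p = stack.pop()
        if u < 0:                          # negative marker = exit event
            tout[-u] = timer
            continue
        par[u], depth[u] = p, depth[p] + 1
        tin[u] = timer; timer += 1
        stack.append((-u, 0))
        for v in adj[u]:
            if v != p:
                stack.append((v, u))

    def lca(u, v):                         # naive: walk up from the deeper node
        while depth[u] > depth[v]: u = par[u]
        while depth[v] > depth[u]: v = par[v]
        while u != v: u, v = par[u], par[v]
        return u

    def is_anc(u, v):                      # is u an ancestor of v (or v itself)?
        return tin[u] <= tin[v] < tout[u]

    xs = sorted(set(nodes), key=lambda v: tin[v])
    # LCAs of consecutive nodes in DFS order are the only extra nodes needed
    xs = sorted({lca(xs[i], xs[i + 1]) for i in range(len(xs) - 1)} | set(xs),
                key=lambda v: tin[v])
    vt = {v: [] for v in xs}
    st = [xs[0]]                           # stack of currently open ancestors
    for v in xs[1:]:
        while not is_anc(st[-1], v):
            st.pop()
        vt[st[-1]].append(v)
        st.append(v)
    return vt
```

In a real solution the naive `lca` would be replaced by binary lifting or the RMQ structure from the steps above.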
Since the Virtual Tree contains only $$$O(|S| + |T|)$$$ vertices, we can efficiently compute the result using our initial approach. The overall time complexity of this optimized approach is $$$O\big((|S| + |T|)\log n\big)$$$ per query,
which is significantly better than the naive $$$O(n)$$$ per query approach.
Code: 310461150