rastogimudit02's blog

Tell approach for it — "Find Index with minimum cost".

By rastogimudit02, history, 13 months ago, In English

Find Index with minimum cost

There are n indices from 1 to n.

Some indices are marked and given as input in an array (size m).

For the unmarked indices, find the index having the minimum cost.

Cost of an index is defined as -

the sum of reciprocal of absolute difference between this index and all other marked index.

$$$COST(i) = \sum_{m_{idx} \in \text{marked indices}} \frac{1}{|i - m_{idx}|}$$$

Constraint: 1 <= n, m <= 1e5

coding

rastogimudit02
13 months ago
17

Comments (15)

Show archived | Write comment?

onepersonintheuniverse

13 months ago, hide # |

Hint 1

Hint 2

Solution

→ Reply

rastogimudit02

13 months ago, hide # ^ |

← Rev. 2 →

Consider an example where marked indices are —
[2, 100, 101, 102, 103, 105]

According to your approach, largest gap is between (2 and 100). So, d = (100 — 2) = 98
half = 49
Hence, we should be selecting l + (r — l)/2 as the index,
which is 2 + (100 — 2) / 2 = 51 (in this case).
Ckearly, 51 is equidistant from 2 and 100.

But is it optimal?
Think about it.
Don't you think selected index should be leaning towards 2 (given the distribution of marked indices in this example).

→ Reply

RGB_ICPC_7

13 months ago, hide # |

← Rev. 2 →

-19

I wanted to share a neat “one-pass” trick for efficiently evaluating

$$$ \text{cost}(i) = \sum_{x \in \text{marked}} \frac{1}{|i - x|} $$$

for all $$$i=1,\dots,n$$$ in only $$$O(n+m)$$$ time, rather than the naive $$$O(n \cdot m)$$$. Here’s the idea:

Preprocessing:
Build an array $$$\text{freq}[1 \dots n]$$$ where $$$\text{freq}[x]$$$ counts the number of marks at position $$$x$$$.
Counting Distances:
For each $$$i$$$, let:

$$$\text{countLeft}[d]$$$ = number of marked points exactly $$$d$$$ units left of $$$i$$$ (i.e., at position $$$i-d$$$).
$$$\text{countRight}[d]$$$ = number of marked points exactly $$$d$$$ units right of $$$i$$$ (i.e., at position $$$i+d$$$).

Then, the cost at $$$i$$$ is:

$$$ \text{cost}(i) = \sum_{d=1}^{i-1} \text{countLeft}[d] \cdot \frac{1}{d} + \sum_{d=1}^{n-i} \text{countRight}[d] \cdot \frac{1}{d}. $$$

Initializing for $$$i=1$$$:
At the leftmost position $$$i=1$$$:

No points to the left: $$$\text{countLeft}[d] = 0$$$ for all $$$d \gt 0$$$.
$$$\text{countRight}[d] = \text{freq}[1 + d]$$$.

Therefore,

$$$ \text{cost}(1) = \sum_{d=1}^{n-1} \text{countRight}[d] \cdot \frac{1}{d} = \sum_{x=2}^{n} \text{freq}[x] \cdot \frac{1}{x-1}. $$$

This can be computed in $$$O(n+m)$$$ time if you loop only over the marked positions.

Left-to-Right Sweep:
The trick is to update the cost in constant time when moving from $$$i$$$ to $$$i+1$$$. As you increment $$$i$$$:

Every marked point to the left of $$$i$$$ is now one unit further away.
Every marked point to the right of $$$i$$$ is one unit closer.

Define two sums:
— $$$\text{sumLeft} = \sum_{d} \text{countLeft}[d] \cdot \frac{1}{d}$$$,
— $$$\text{sumRight} = \sum_{d} \text{countRight}[d] \cdot \frac{1}{d}$$$.

So, $$$\text{cost}(i) = \text{sumLeft} + \text{sumRight}$$$.

The updates are:

$$$ \begin{aligned} \text{sumLeft}' &= \text{sumLeft} + \sum_{d=1}^{i-1} \text{countLeft}[d] \cdot \left( \frac{1}{d+1} - \frac{1}{d} \right), \\ \text{sumRight}' &= \text{sumRight} + \sum_{d=1}^{n-i} \text{countRight}[d] \cdot \left( \frac{1}{d-1} - \frac{1}{d} \right). \end{aligned} $$$

Notice that:

$$$ \frac{1}{d+1} - \frac{1}{d} = -\frac{1}{d(d+1)}, $$$

so you only need to adjust each part by a simple term. By using clever shifting (or two pointers), the counts can be updated without re-scanning the entire array.

Finding the Minimum:
As you sweep $$$i$$$ from $$$1$$$ to $$$n$$$, skip positions where $$$\text{freq}[i] \gt 0$$$ (since these are marked) and track the unmarked $$$i$$$ with the minimum $$$\text{cost}(i)$$$.
Time Complexity:
Each increment is $$$O(1)$$$ (apart from minor array shifts via an offset), so the overall time is $$$O(n + m)$$$, which is optimal for $$$n, m \le 10^5$$$.

Key takeaway:
The main trick is recognizing that the difference

$$$ \frac{1}{|i+1 - x|} - \frac{1}{|i - x|} $$$

can be computed in constant time if you know how many $$$x$$$ are exactly $$$d$$$ units away from $$$i$$$. This avoids recomputing a summation over $$$m$$$ elements for each $$$i$$$.

Hope you find this method as elegant and useful as I did!

→ Reply

onepersonintheuniverse

13 months ago, hide # ^ |

How do you update the reciprocals? You have to increase every denominator in sumLeft, decrease every denominator in sumRight, and (more manageably) sometimes swap reciprocals from sumLeft to sumRight.

→ Reply

RGB_ICPC_7

13 months ago, hide # ^ |

-19

For the left side (sumLeft):
Every marked point to the left of position i is now one unit further away when you move from i to i + 1. If a marked point was at distance d (with reciprocal 1/*d*), its new distance becomes d + 1 (with reciprocal 1/(*d + 1*)). The change in its contribution is:

$$$ \frac{1}{d+1} - \frac{1}{d} = -\frac{1}{d(d+1)}. $$$

Instead of recomputing 1/(*d + 1*) for every point, you can subtract $$$ \frac{1}{d(d+1)} $$$ for each of the countLeft[d] points at distance d.

For the right side (sumRight):
Similarly, each marked point to the right gets one unit closer. A point at distance d will now be at distance d − 1, changing its reciprocal from 1/*d* to 1/(*d − 1*) (for d > 1). The difference is:

$$$ \frac{1}{d-1} - \frac{1}{d} = \frac{1}{d(d-1)}. $$$

So, for each countRight[d] marked points at distance d, you add $$$ \frac{1}{d(d-1)} $$$ to sumRight.

Swapping at the boundary:
The “swap” happens for points exactly one unit away. When moving from i to i + 1, any marked point that was exactly one unit to the right (with distance 1) becomes exactly at position i (thus no longer contributing to sumRight but now to sumLeft as distance 0, which is effectively not counted). You handle this by adjusting your counts: decrement countRight[1] and increment countLeft[1].

Efficient update:
By maintaining the counts of marked points at each distance (using your freq array), you can update sumLeft and sumRight in constant time for each shift. Instead of looping over all marked points, you only update based on the counts for the affected distances. This gives you the overall O(n + m) performance.

So, rather than “increasing every denominator” or “decreasing every denominator” explicitly, you capture the net change for each distance in one go, using the formulas above. This “one-pass” trick leverages the fact that the differences between consecutive reciprocals are simple fractions that depend only on the distance, not on the specific marked point.

→ Reply

rastogimudit02

13 months ago, hide # ^ |

-6

Is your 'd' always fixed?

Didn't found you iterating over d.

→ Reply

RGB_ICPC_7

13 months ago, hide # ^ |

-12

No, d isn't fixed—it represents the variable distance between the current index and each marked index. In our approach, we don't iterate over every possible d each time. Instead, we maintain aggregated counts (or sums) that let us update the overall contribution in constant time when we move from index (i) to (i+1). This way, the change in every reciprocal is handled implicitly via a difference update, rather than by iterating over each d.

→ Reply

rastogimudit02

13 months ago, hide # ^ |

Kindly provide code for the same. Would be of great help.

→ Reply

RGB_ICPC_7

13 months ago, hide # ^ |

-32

import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) throws IOException {
        // Read input values: n = total indices, m = number of marked indices.
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String[] parts = br.readLine().trim().split("\\s+");
        int n = Integer.parseInt(parts[0]);
        int m = Integer.parseInt(parts[1]);

        // Read and sort marked indices.
        String[] mStr = br.readLine().trim().split("\\s+");
        int[] mark = new int[m];
        for (int i = 0; i < m; i++) {
            mark[i] = Integer.parseInt(mStr[i]);
        }
        Arrays.sort(mark);

        // bestCand will hold the best candidate index,
        // bestDist holds the maximum minimum distance from any marked index.
        int bestCand = -1;
        int bestDist = -1;

        // Check left boundary: if index 1 is unmarked, compute its distance.
        if (mark[0] != 1) {
            int d = mark[0] - 1;
            if (d > bestDist) {
                bestDist = d;
                bestCand = 1;
            }
        }

        // Check right boundary: if index n is unmarked, compute its distance.
        if (mark[m - 1] != n) {
            int d = n - mark[m - 1];
            if (d > bestDist) {
                bestDist = d;
                bestCand = n;
            }
        }

        // For each gap between consecutive marked indices,
        // choose a candidate index that maximizes the minimum distance from both marks.
        for (int i = 0; i < m - 1; i++) {
            int l = mark[i];     // left marked index
            int r = mark[i + 1]; // right marked index
            if (r - l > 1) {
                int gap = r - l - 1; // gap size (unmarked indices in between)
                int cand;
                // Count how many marked are to the left/right of this gap.
                int leftCount = i + 1;  // all marked up to current
                int rightCount = m - leftCount;

                // If gap is even, take the exact midpoint.
                // If odd, lean towards the side with fewer marks.
                if (gap % 2 == 0) {
                    cand = l + gap / 2;
                } else {
                    if (leftCount < rightCount) {
                        cand = l + gap / 2;
                    } else if (leftCount > rightCount) {
                        cand = l + gap / 2 + 1;
                    } else {
                        cand = l + gap / 2;
                    }
                }

                // The candidate's quality is measured as the smaller distance to either mark.
                int dLeft = cand - l;
                int dRight = r - cand;
                int currDist = Math.min(dLeft, dRight);
                if (currDist > bestDist) {
                    bestDist = currDist;
                    bestCand = cand;
                }
            }
        }

        // Output the best candidate index.
        System.out.println(bestCand);
    }
}

→ Reply

arvindf232

13 months ago, hide # |

It should be a bannable offense to use ChatGPT to write a solution with such confidence while being absolute nonsense.

Anyways, your desired cost for each position can be calculated with FFT. You need two separate convolutions, one for handling marked index to the left and one for the right. Hopefully there is no precision issues with Doubles.

→ Reply

RGB_ICPC_7

13 months ago, hide # ^ |

-20

Sorry sir, I think there's been a misunderstanding. I did not use any LLM to write the solution.

→ Reply

rastogimudit02

13 months ago, hide # ^ |

arvindf232, can you provide your approach in a bit more detailed manner.

Also, I wonder if there existed an easier approach to solve this.

→ Reply

arvindf232

13 months ago, hide # ^ |

I made an inaccurate statement: one convolution is enough. Let S be the set of marked indices. Define array A to be $$$A[i]=1$$$ if $$$i$$$ is marked, $$$A[i]=0$$$ if not, A is indexed from $$$0$$$ through $$$n-1$$$. Define array B to be $$$B[i] = 0$$$ if $$$i=0$$$, or $$$\frac{1}{|i|}$$$. Indexed from $$$i=-n$$$ through $$$n$$$. ($$$B[0]$$$ can be anything, its value do not affect any answers)

Let $$$C$$$ be their convolution, which is defined to be $$$C[j] = \sum_{i}A[i]B[j-i]$$$, where the sum is taken over whenever both $$$A$$$ and $$$B$$$ are defined. This $$$C$$$ can be calculated with FFT in $$$O(n \log n)$$$.

Observe that $$$C[i]$$$ now stores the cost of any unmarked index $$$i$$$. ($$$C[i]$$$ for marked $$$i$$$ do not hold any meaning and can be ignored.)

→ Reply