cuom1999's blog

By cuom1999, history, 4 years ago, In English

We usually see problems like: Given an array $$$a_1, a_2, ..., a_n$$$. How many ways to partition the array into continuous groups in which each group satisfies a condition (e.g sum not exceeding a number, ...)? This kind of problem is usually done in linear or log-linear time with dynamic programming and prefix sum.

What about changing the statement to a circular array? Formally, given a circular array $$$a_1, a_2, ..., a_n$$$. How many ways to partition the circle into continuous groups in which each group satisfies a condition? Two ways are different if there are two indices belonging to the same group in one way, but different groups in the other way.

How to solve this problem in sub-quadratic time in general? I find even the simplest conditions like the below are quite hard.

a) Each group has sum $$$\leq M$$$, with a given $$$M$$$.

b) Each group has $$$gcd > 1$$$.

...

Solution for problem a or b are appreciated.

  • Vote: I like it
  • +64
  • Vote: I do not like it

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

For the sum $$$\leq M$$$, isnt it the same problem even if its a circular array, just with bigger instead of smaller?

We should find a contigous group inside the array with sum bigger than $$$arraysum-M$$$, then we take the rest of the elements and it will end up being a circular subarray.

Edit: sorry, I didn't understand the problem correctly, I thought it had to be split into just 2 groups.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I don't understand why b is not easy enough?

on linear array
on circular array
  • »
    »
    4 years ago, # ^ |
      Vote: I like it +13 Vote: I do not like it

    I don't think your last sentence is true.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Yeah, got your point.

      Error, for someone who's wondering

      But I think we can modify the approach this way?

      Let $$$dp[i], mx[i]$$$ stores the starting index of the last partition in greedy partitioning of $$$a[i...n]$$$, and the number of partitions respectively. Then we just need to check if for any $$$i$$$ in the first block $$$gcd(a[dp[i]+ 1...n], a[1...i- 1]) > 1$$$, and $$$mx[i] = k - 1$$$. Storing $$$dp[i], mx[i]$$$ is easy because we can track $$$j's$$$ for which $$$gcd(a[i..j])$$$ changes, and there are atmost $$$\log (MAX)$$$ of them for each $$$i$$$.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I think you misread the problem as I commented below.

»
4 years ago, # |
Rev. 2   Vote: I like it +16 Vote: I do not like it

There is actually a quite general way to handle such problems. Assume that if $$$[l, r]$$$ is a valid interval then $$$[l, r-1]$$$ is a valid interval as well (holds for many properties, in particular for both mentioned by you) and that you are able to determine for each position $$$x$$$ where is the furthest position $$$last(x)$$$ such that $$$[x, last(x))$$$ is a valid interval (doable as well for both of your properties). Then imagine you draw arrows from $$$x$$$ to $$$last(x)$$$ for each $$$x$$$. Assume that if you cut this circular array arbitrarily and use greedy algorithm following arrows then you will need $$$k$$$ arrows. Then answer is either $$$k$$$ or $$$k-1$$$. You can use now magic super simple algorithm to determine which is the case. Take $$$n$$$ steps with these arrows. If last $$$k-1$$$ steps covered distance of at least $$$n$$$ then answer is $$$k-1$$$, otherwise $$$k$$$. Proof left as an exercise (EDIT: From my proof posted below it follows you should first take n steps, and then look at following k-1, not sure if it works for the way described here. Or even simpler, first take n steps and then look at first moment you cover distance of at least n, even without determining k at any point)

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +8 Vote: I do not like it

    I think you misread the problem a little. My question is counting how many valid partitions, not finding the partition with the fewest groups.

    About the above approach (fewest groups), I don't get the part "Take $$$n$$$ steps with these arrows. If last $$$k−1$$$ steps covered distance of at least $$$n$$$ then answer is $$$k−1$$$" yet. It seems to be very interesting and have good time complexity. I used a sparse table for this before, btw.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 4   Vote: I like it +16 Vote: I do not like it

      Ah yeah, sorry. At first I read it correctly, but then I read above comments and forgot what you originally asked about :).

      But if I started talking about fewest groups let me explain that last part. You do

      int x = 1;
      for (int i = 1; i <= n; i++) {
        x = last[x];
      }
      int distance = 0;
      for (int i = 1; i <= k - 1; i++) {
        distance += last[x] - x;
        if (last[x] < x) {
          distance += n;
        }
        x = last[x];
      }
      if (distance >= n) {
        return k-1;
      } else {
        return k;
      }
      

      By the way I am quite sure nothing that includes at least reading the input works in sub-linear time as you mention ;). Did you mean sub-quadratic?

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Sorry, you are correct. I meant subquadratic (I was thinking about log-linear)

      • »
        »
        »
        »
        4 years ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it

        I don't get the proof part yet. Any hint or intuition behind it?

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +16 Vote: I do not like it

          Let $$$A$$$ be the answer (which is k or k-1 in terms of previous comment). Let $$$F$$$ (standing for "fast") be the set of starting places where you can start if you want to make a full lap in $$$A$$$ steps. Let us assume that $$$1 \not\in F$$$, cause otherwise it is pretty trivial. If at any point in time you are in $$$F$$$ (i.e. $$$x \in F$$$) you are moving "fast route" from this point on (note that $$$x \in F \Rightarrow last(x) \in F$$$). If you are not in $$$F$$$, then you are using "slow route".

          Key observation is that it is not possible that $$$x < y < last(y) < last(x)$$$ or in other words $$$x \le y \le last(x) \Rightarrow last^c(x) \le last^c(y) \le last^{c+1}(x)$$$ ($$$last^c(x)$$$ means applying last c times from $$$x$$$), or in other words "no route can overtake another route" (in these inequalities I assume that we do not use modulo n for last(x), we are always moving forward on infinite straight line). Fast route starting from a bit behind cannot overtake slow route/"the fastest thing fast route could do is to tie with slow route".

          Assume for a contradiction that you didn't reach $$$F$$$ after $$$n$$$ steps. It means that you visited some place twice, so there is a cycle consisting only of "slow positions". Let one of them be $$$s$$$ and let us take closest $$$f \in F$$$ behind $$$s$$$ (note we have $$$last(f) \ge s$$$. If you made $$$t$$$ laps before returning to $$$s$$$ then you needed to make at least $$$At+1$$$ steps. However fast route needed at most $$$At$$$ steps from $$$f$$$ to return to $$$f$$$ after $$$t$$$ laps (or it could have gone even further). But this one step we still have allows us to jump from $$$f$$$ to at least $$$s$$$. It is not possible that $$$last(f)^{At+1} > last(s)^{At+1}=s+nt$$$, but $$$last(f) \ge s$$$, so we end up in $$$s$$$ (but $$$F$$$ is closed under taking last), contradiction.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            Great solution. You claim that after $$$n$$$ jumps, we always end up in a $$$F$$$ position. Is it enough to start from there and jump until we cover the circle? So we don't need to calculate $$$k$$$ first.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it +8 Vote: I do not like it

              Yes indeed. I realized that code could be a bit simpler, but I needed it anyway for the proof :P

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by cuom1999 (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How do you do a) efficiently on a normal array? Are we assuming that all $$$a_i$$$ are non-negative?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I think segment tree works. To calculate $$$d[i]$$$, we need to sum up $$$dp[j]$$$ where $$$s[j] \geq s[i] - M$$$. May need compression first.

»
4 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Solution for the $$$gcd$$$ problem and small values of $$$M$$$:

Let's cut the circle and get an array. Then, we want to compute the number of partitions of the array + we can try to merge the very first and last segments of each partition (if the merged segment also satisfies the condition). It's easy to see that we won't miss anything. Thus, we want to count every partition of the array either once or twice, depending on whether we can merge.

Now we can notice that for the second problem there are only $$$O(\log{C})$$$ possible $$$gcd$$$ values for the first and last segments. We can compute the number of partitions with dynamic programming $$$dp[i,firstGcd,lastGcd]$$$. Transitions: either we start a new segment or add an element to the last one. Fortunately, we already know segment's $$$gcd$$$ and can compute the next one. To get the answer, we sum $$$dp[n,first,last]$$$ and multiply the states where $$$gcd(first,last)>1$$$ by two.

The complexity of this solution is $$$O(n\log^2{C})$$$: the number of states is clearly like this, transitions are $$$O(1)$$$ if we notice that $$$gcd$$$s do not depend on $$$first$$$. In the first problem, if $$$M$$$ is small, and it restricts the value of $$$|sum|$$$, we can solve the problem in the same way in $$$O(nM^2)$$$.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +10 Vote: I do not like it

    I don’t think you need $$$lastGcd$$$ in your implementation, $$$i$$$ completely encodes that. This shaves one $$$log(C)$$$ factor. You also have to make sure $$$firstGcd$$$ is not $$$1$$$, so that you don’t overcount (you can set it to be $$$0$$$, corresponding to starting a segment at position $$$0$$$, though).

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Fair enough! Then, we should find the longest segment with $$$gcd>1$$$ that ends in $$$i+1$$$. How can we do it in $$$O(\log{C})$$$?

      Handling $$$firstGcd$$$ is indeed important, but it's more about implementation details.

      • »
        »
        »
        »
        4 years ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it

        Well, if you iterate over $$$firstGcd$$$ and $$$i$$$ (after you compute dp), you need to just check if segment starting at $$$i$$$ has the potential of being the last segment, i.e., suffix gcd starting at i is coprime with $$$firstGcd$$$. Am I missing something?

        Edit: well, I guess what you are saying is how to compute dp? Two pointers and queue via two stacks would work. If not, then RMQ. Note that you need to compute longest segment only once, so logarithm from GCD gets added to (not multiplied with) the logarithm from extra state.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          The last step is not that interesting, I don't get how to make the dp computation fast. When you compute dp, you want to sum the values of dp on a segment $$$[k, i]$$$, such that $$$[k,i+1]$$$ has $$$gcd>1$$$. Then, to get rid of one $$$\log$$$ in complexity, you should find $$$k$$$ in $$$O(\log{C})$$$ and I don't get how to do it.

          Edit: queue via two stacks works, thanks! For RMQ, if we speak about segment trees, we need to visit $$$O(\log{n})$$$ nodes and compute $$$gcd$$$ in $$$O(\log{C})$$$ in each one, so it seems to be slower then needed.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            I think computing gcd of $$$K$$$ numbers is $$$O(K + log(C))$$$ so segment tree and RMQ should work as well. The argument for that would be that for each ‘extra’ iteration of the Euclid algorithm, the answer at least halves.

»
14 months ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

Problem on a similar idea — Cyclic Array. Can anyone explain the approach to this problem in "simple words"?

  • »
    »
    14 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The solution to your question has literally been answered above, if you bothered to read.

    • »
      »
      »
      14 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Yes, sir, I read Swistakk's answer but his proof after line "Assume for a contradiction that you didn't reach F after n...", I am not able to get.