Блог пользователя RP-1

Автор RP-1, 12 месяцев назад, По-английски

Thanks to Kapt for proofreading this blog.

Recently, I came across a couple of AtCoder problems on modular arithmetic. In short, they both involved sequences $$$\{ A, A+C, A+2C, \dots \}$$$ modulo $$$M$$$. The usual way to deal with these sequences is floor sums:

Problem (Floor Sum). Calculate $$$\displaystyle \sum_{k=0}^n \left\lfloor \frac{Ck+A}{M} \right\rfloor$$$.

This is solvable in $$$O(\log M)$$$ with a recursive Euclid-like algorithm.

If you want to learn more about floor sums, I recommend this blog by _timelord.

However, what I want to show you is how you can solve problems like that with a completely different idea. The basis of our reasoning is the following statement:

(For the rest of this blog, let $$$0 \le C, A \lt M$$$, $$$\gcd(C, M) = 1$$$, $$$n \lt M$$$, just to avoid hassle.)

Statement. The sequence $$$\{A, C + A, 2C + A, \dots, Cn + A\} \bmod M$$$ can be split into $$$O(\sqrt{M})$$$ disjoint arithmetic progressions (not necessarily contiguous).

Proof. Suppose either $$$C$$$ or $$$M - C$$$ is small. Then you can simply divide the sequence every time the progression overflows, with each segment being about $$$\frac{M}{C}$$$ elements long. Thus we split our sequence into $$$O(1 + \frac{nC}{M})$$$ arithmetic progressions.

Now let's consider the first $$$\sqrt{M} + 1$$$ members of our sequence. By the pigeonhole principle, some two of them, e.g. $$$(Ci + A) \bmod M$$$ and $$$(Cj + A) \bmod M$$$ ($$$i \lt j$$$), will differ by less than $$$\sqrt{M}$$$. Thus, for $$$d := j - i$$$, either $$$(Cd \bmod M)$$$ or $$$(M - Cd \bmod M)$$$ will be less than $$$\sqrt{M}$$$ (let's call this value $$$H$$$).

Consider the subsequence $$$\{A, Cd + A, 2Cd + A, \dots, Cd \cdot \left\lfloor \frac{n}{d} \right\rfloor + A\} \bmod M$$$. As shown above, we can split it into $$$O(1 + \frac{nH}{dM})$$$ arithmetic progressions.

Summing this result over all remainders modulo $$$d$$$, we get $$$O(d + \frac{nH}{M})$$$ arithmetic progressions. Since $$$d \le \sqrt{M}$$$ and $$$H \lt \sqrt{M}$$$, this boils down to $$$O(\sqrt{M})$$$.

The first question is surely 'what are we going to do with this thing'.

Problems

Problem 1. Calculate $$$\displaystyle \sum_{k=0}^n ((Ck + A) \bmod M)$$$.

This sum is the total sum of all arithmetic progressions. Sum of arithmetic progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ is

$$$(0 + 1 + \dots + n) \cdot H + (n + 1) \cdot x = \frac{n(n + 1)}{2} \cdot H + (n + 1) \cdot x$$$
Solution using floor sums

This also allows us to solve Floor Sum in $$$O(\sqrt{M})$$$. Hooray!

Problem 2. Calculate $$$\displaystyle \sum_{k=0}^n ((Ck + A) \bmod M)^2$$$.

Nothing changes except the formula for summing squares of elements of the arithmetic progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$:

$$$(0^2 + 1^2 + \dots + n^2) \cdot H^2 + (0 + 1 + \dots + n) \cdot 2Hx + (n + 1) \cdot x^2 =$$$
$$$= \frac{n(n+1)(2n+1)}{6} \cdot H^2 + \frac{n(n + 1)}{2} \cdot 2Hx + (n + 1) \cdot x^2$$$
Solution using floor sums

This is a real problem that I prepared for a local event, and I sincerely apologize to anyone who was forced to try to solve it.

Problem 3 (relevant part of Fizz Buzz Difference). Given $$$0 \lt R \lt M$$$, find minimum $$$k \gt 0$$$, s.t. $$$Ck \bmod M \lt R$$$.

Since the indices of elements of arithmetic progressions in original sequence are increasing (that is, progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ corresponds to indices $$$\{i, i + d, i + 2d, \dots, i + nd\}$$$), all we have to do is find the first element in each progression that satisfies the condition.

Solution using floor sums

Problem 4 (Sum of Min of Mod of Linear). Given $$$0 \le A_1 \le \dots \le A_m \lt M$$$, calculate $$$\displaystyle \sum_{k=0}^n \min_{1 \le i \le m} ((Ck + A_i) \bmod M)$$$.

If $$$(Ck \bmod M) \in [M - A_i; M - A_{i-1})$$$, then the value of minimum is $$$((Ck + A_i) \bmod M)$$$; note that this is just $$$(Ck \bmod M) + A_i$$$. For $$$i = 1$$$ it's a bit trickier: the value of minimum is

  • $$$(Ck \bmod M) + A_1$$$ if $$$(Ck \bmod M) \in [0; M - A_m)$$$
  • $$$(Ck \bmod M) + A_1 - M$$$ if $$$(Ck \bmod M) \in [M - A_1; M)$$$

Either way, calculating the sum of $$$(Ck \bmod M)$$$ is Problem 1, so what we need is to find the number of elements of the sequence that lie in each segment.

Suppose we consider each progression separately. If $$$H \gt 0$$$, then the progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ has

$$$\max\left(0, \min\left(n + 1, \left\lceil \frac{R - x}{H} \right\rceil \right)\right)$$$

elements less than $$$R$$$. Using this formula, we can find the number of elements in our progression that lie on each of the $$$m + 1$$$ subsegments. This gives us the solution in $$$O(m \sqrt{M})$$$. Unfortunately, this is not fast enough.

Note that all of our arithmetic progressions have the same step $$$H$$$, so we can solve the problem independently for each residue modulo $$$H$$$.

For example, consider elements of the form $$$Hk + r$$$. Each progression either doesn't have any of them or contains a contiguous segment of them. Since subsegments of $$$[0; M)$$$ also correspond to some segments, all we need is to add $$$1$$$ on some segments and then find sums on $$$m + 1$$$ segments. This is a simple scanline.

We need to do scanline $$$H$$$ times, but we only have $$$O(d + H)$$$ segments in total. If we make $$$d$$$ larger and $$$H$$$ smaller, we can sort all segments in $$$O(d \log d)$$$ and do scanlines in $$$O(mH)$$$.

To do this, recall the way we obtained $$$d$$$ in the first place. Suppose we instead look through the first $$$M^{2/3}$$$ members of our sequence. Then $$$d \le M^{2/3}$$$ and $$$H \le M^{1/3}$$$. Thus we can solve the problem in $$$O(M^{2/3} \log M + mM^{1/3})$$$.

Solution using floor sums

Takeaway

I guess there isn't much moral to this story. If you're a problemsetter, this blog should probably serve as a warning and a prompt to try to counter heuristics like this. If you're a contestant, this blog should, more than anything, be a reason to get acquainted with floor sums, but also to try to find some clever ways around the intended solution.

If you know any problems that can be solved with this technique, or if you want to tell me off for rambling about a well-known idea, feel free to comment below.

Anyway, stay safe until I'm back with yet another way of torturing myself!

  • Проголосовать: нравится
  • +161
  • Проголосовать: не нравится

»
12 месяцев назад, скрыть # |
Rev. 2  
Проголосовать: нравится +8 Проголосовать: не нравится

Is RMI 2024/2 — Octagon solvable with something like "sum of fifth powers of floors" (generalization of problem 2 in this blog)? At the moment, the fact that this problem might be solvable in such a way is a well-known meme in Italy.

  • »
    »
    12 месяцев назад, скрыть # ^ |
     
    Проголосовать: нравится +5 Проголосовать: не нравится

    Can you please elaborate on that? I'm pretty sure there are no floor sums there, since you're calculating everything modulo $$$10^9 + 7$$$.

    In Problem 2, the resulting sum is not supposed to be a remainder modulo $$$M$$$, since otherwise it would just be a known formula.

    • »
      »
      »
      12 месяцев назад, скрыть # ^ |
       
      Проголосовать: нравится +15 Проголосовать: не нравится

      The answer is $$$O(N^8)$$$, so I guess you could use bigint if necessary.

      The $$$8$$$ variables make an octagon if they satisfy some linear equations and inequalities. If I recall correctly, you can put two variables on two axes, get a convex polygon and triangulate it. For each point with integer coordinates, you have a formula for the number of valid tuples. It seemed that you needed floor sum to get the sum inside a triangle.

»
12 месяцев назад, скрыть # |
 
Проголосовать: нравится +10 Проголосовать: не нравится

should the bound be $$$O(\sqrt n)$$$ since we can set $$$d=\sqrt n$$$ and $$$h=\frac{M}{\sqrt n}$$$?

by the way, I think the Euclid-like algorithm is somehow like a extended version of your algorithm, based on the fact that the arithmetic progressions found are also periodic and we can find "arithmetic progressions of arithmetic progressions" recursively

  • »
    »
    12 месяцев назад, скрыть # ^ |
     
    Проголосовать: нравится 0 Проголосовать: не нравится

    Yes, I think you're right, but note that the $$$n \lt M$$$ condition is not really necessary in any of the problems.

    For example, in Problem 1, the contribution of the first $$$M$$$ elements is simply a sum of numbers from $$$0$$$ to $$$M - 1$$$, so we can just add it $$$\left\lfloor\frac{n}{M}\right\rfloor$$$ times and solve the problem for $$$n \bmod M$$$. So for the purposes of this blog, $$$O(\sqrt{M})$$$ makes more sense.

    Funny thing is, if $$$n \le 10^5$$$ and $$$C, A, M \le 10^k$$$, $$$k \le 10^5$$$ in Problem 1, then $$$O(\sqrt{n} k \log k)$$$ would be better than $$$O(k^2 \log k)$$$ from Euclid-like algorithm. Isn't that crazy?

»
12 месяцев назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится

Does anybody know where the term "sqrt heuristic" comes from? I've already seen/heard it multiple times. I always thought that the word "heuristic" refers to something that is not provable/proven but helps in practice. Like a heuristic function in the A* algorithm. However, there is nothing unproven about the square root decomposition, so I was always confused about the term "sqrt heuristic". When I saw the blog's name, I assumed it would talk about an algorithm with some bad time complexity in theory that works fast in practice and uses square roots for something.

  • »
    »
    12 месяцев назад, скрыть # ^ |
     
    Проголосовать: нравится 0 Проголосовать: не нравится

    Upon further research, it seems like this term mainly exists in the literature in the Russian language and not English. This seems to be the oldest example of the term "sqrt heuristic" being used. But in Russian, it still doesn't make any sense to me :)

»
5 недель назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится