RP-1's blog

By RP-1, 12 months ago, In English

Thanks to Kapt for proofreading this blog.

Recently, I came across a couple of AtCoder problems on modular arithmetic. In short, they both involved sequences $$$\{ A, A+C, A+2C, \dots \}$$$ modulo $$$M$$$. The usual way to deal with these sequences is floor sums:

Problem (Floor Sum). Calculate $$$\displaystyle \sum_{k=0}^n \left\lfloor \frac{Ck+A}{M} \right\rfloor$$$.

This is solvable in $$$O(\log M)$$$ with a recursive Euclid-like algorithm.

If you want to learn more about floor sums, I recommend this blog by _timelord.

However, what I want to show you is how you can solve problems like that with a completely different idea. The basis of our reasoning is the following statement:

(For the rest of this blog, let $$$0 \le C, A \lt M$$$, $$$\gcd(C, M) = 1$$$, $$$n \lt M$$$, just to avoid hassle.)

Statement. The sequence $$$\{A, C + A, 2C + A, \dots, Cn + A\} \bmod M$$$ can be split into $$$O(\sqrt{M})$$$ disjoint arithmetic progressions (not necessarily contiguous).

Proof. Suppose either $$$C$$$ or $$$M - C$$$ is small. Then you can simply divide the sequence every time the progression overflows, with each segment being about $$$\frac{M}{C}$$$ elements long. Thus we split our sequence into $$$O(1 + \frac{nC}{M})$$$ arithmetic progressions.

Now let's consider the first $$$\sqrt{M} + 1$$$ members of our sequence. By the pigeonhole principle, some two of them, e.g. $$$(Ci + A) \bmod M$$$ and $$$(Cj + A) \bmod M$$$ ($$$i \lt j$$$), will differ by less than $$$\sqrt{M}$$$. Thus, for $$$d := j - i$$$, either $$$(Cd \bmod M)$$$ or $$$(M - Cd \bmod M)$$$ will be less than $$$\sqrt{M}$$$ (let's call this value $$$H$$$).

Consider the subsequence $$$\{A, Cd + A, 2Cd + A, \dots, Cd \cdot \left\lfloor \frac{n}{d} \right\rfloor + A\} \bmod M$$$. As shown above, we can split it into $$$O(1 + \frac{nH}{dM})$$$ arithmetic progressions.

Summing this result over all remainders modulo $$$d$$$, we get $$$O(d + \frac{nH}{M})$$$ arithmetic progressions. Since $$$d \le \sqrt{M}$$$ and $$$H \lt \sqrt{M}$$$, this boils down to $$$O(\sqrt{M})$$$.

The first question is surely 'what are we going to do with this thing'.

Problems

Problem 1. Calculate $$$\displaystyle \sum_{k=0}^n ((Ck + A) \bmod M)$$$.

This sum is the total sum of all arithmetic progressions. Sum of arithmetic progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ is

$$$(0 + 1 + \dots + n) \cdot H + (n + 1) \cdot x = \frac{n(n + 1)}{2} \cdot H + (n + 1) \cdot x$$$
Solution using floor sums

This also allows us to solve Floor Sum in $$$O(\sqrt{M})$$$. Hooray!

Problem 2. Calculate $$$\displaystyle \sum_{k=0}^n ((Ck + A) \bmod M)^2$$$.

Nothing changes except the formula for summing squares of elements of the arithmetic progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$:

$$$(0^2 + 1^2 + \dots + n^2) \cdot H^2 + (0 + 1 + \dots + n) \cdot 2Hx + (n + 1) \cdot x^2 =$$$
$$$= \frac{n(n+1)(2n+1)}{6} \cdot H^2 + \frac{n(n + 1)}{2} \cdot 2Hx + (n + 1) \cdot x^2$$$
Solution using floor sums

This is a real problem that I prepared for a local event, and I sincerely apologize to anyone who was forced to try to solve it.

Problem 3 (relevant part of Fizz Buzz Difference). Given $$$0 \lt R \lt M$$$, find minimum $$$k \gt 0$$$, s.t. $$$Ck \bmod M \lt R$$$.

Since the indices of elements of arithmetic progressions in original sequence are increasing (that is, progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ corresponds to indices $$$\{i, i + d, i + 2d, \dots, i + nd\}$$$), all we have to do is find the first element in each progression that satisfies the condition.

Solution using floor sums

Problem 4 (Sum of Min of Mod of Linear). Given $$$0 \le A_1 \le \dots \le A_m \lt M$$$, calculate $$$\displaystyle \sum_{k=0}^n \min_{1 \le i \le m} ((Ck + A_i) \bmod M)$$$.

If $$$(Ck \bmod M) \in [M - A_i; M - A_{i-1})$$$, then the value of minimum is $$$((Ck + A_i) \bmod M)$$$; note that this is just $$$(Ck \bmod M) + A_i$$$. For $$$i = 1$$$ it's a bit trickier: the value of minimum is

  • $$$(Ck \bmod M) + A_1$$$ if $$$(Ck \bmod M) \in [0; M - A_m)$$$
  • $$$(Ck \bmod M) + A_1 - M$$$ if $$$(Ck \bmod M) \in [M - A_1; M)$$$

Either way, calculating the sum of $$$(Ck \bmod M)$$$ is Problem 1, so what we need is to find the number of elements of the sequence that lie in each segment.

Suppose we consider each progression separately. If $$$H \gt 0$$$, then the progression $$$\{ x, H + x, 2H + x, \dots, nH + x \}$$$ has

$$$\max\left(0, \min\left(n + 1, \left\lceil \frac{R - x}{H} \right\rceil \right)\right)$$$

elements less than $$$R$$$. Using this formula, we can find the number of elements in our progression that lie on each of the $$$m + 1$$$ subsegments. This gives us the solution in $$$O(m \sqrt{M})$$$. Unfortunately, this is not fast enough.

Note that all of our arithmetic progressions have the same step $$$H$$$, so we can solve the problem independently for each residue modulo $$$H$$$.

For example, consider elements of the form $$$Hk + r$$$. Each progression either doesn't have any of them or contains a contiguous segment of them. Since subsegments of $$$[0; M)$$$ also correspond to some segments, all we need is to add $$$1$$$ on some segments and then find sums on $$$m + 1$$$ segments. This is a simple scanline.

We need to do scanline $$$H$$$ times, but we only have $$$O(d + H)$$$ segments in total. If we make $$$d$$$ larger and $$$H$$$ smaller, we can sort all segments in $$$O(d \log d)$$$ and do scanlines in $$$O(mH)$$$.

To do this, recall the way we obtained $$$d$$$ in the first place. Suppose we instead look through the first $$$M^{2/3}$$$ members of our sequence. Then $$$d \le M^{2/3}$$$ and $$$H \le M^{1/3}$$$. Thus we can solve the problem in $$$O(M^{2/3} \log M + mM^{1/3})$$$.

Solution using floor sums

Takeaway

I guess there isn't much moral to this story. If you're a problemsetter, this blog should probably serve as a warning and a prompt to try to counter heuristics like this. If you're a contestant, this blog should, more than anything, be a reason to get acquainted with floor sums, but also to try to find some clever ways around the intended solution.

If you know any problems that can be solved with this technique, or if you want to tell me off for rambling about a well-known idea, feel free to comment below.

Anyway, stay safe until I'm back with yet another way of torturing myself!

Full text and comments »

  • Vote: I like it
  • +161
  • Vote: I do not like it

By RP-1, history, 16 months ago, In English
Disclaimer

Looking at the top-10 users and Recent Actions bar right now, two things become apparent:

  1. Since LGMs cannot change their rank to a higher one, more people in, say, top-100 have changed their rank to a lower one than to a higher one.
  2. Codeforces-wide, more people have changed their rank to a higher one than a lower one.

So there has to be an exact cut-off point, above which the same number of people have changed their rank to either side. I decided to do some scraping to find out exactly where it is.

Lo and behold:

Rank changes (leaderboard position)

Of the top 15k users, about tenth changed their rank to a higher one, and the same number to a lower one.

Now, I'm no physicist, but I do know how to draw a line. I honestly have no explanation as to why the blue graph is so perfectly linear. If anyone can explain this kind of behavior, please leave a comment below. ;)

Now let's scale the x-axis by rating:

Rank changes (rating)

Looks like people who have almost reached a higher rank are less likely to change it. Also, I think that the orange curve becomes linear over time? I don't know. I'm certainly in no position to try and do some advanced statistics on this data, so I won't.

Rank changes difference (rating)

What I find curious is that Experts are almost evenly split into two groups. The difference almost doesn't change until around 1800 and then plummets.

The exact cut-off point in question, according to my calculations, is 1579. I really wanted to shout out a user with this rating, but there's quite a few, so congrats to all of you!

Tell you what, have some more graphs.

Most popular rank changes

Rank changes by target rank

Keep in mind that I only fetched 40k users, so the histogram might look a little different in reality, but the trends are clear.

Cyan is easily the most popular before 2000 (looking at you, Um_nik), but then nutellas and grays skyrocket, and after 1500 LGMs take over. Purples and oranges are tied for third, but I hope you'll agree that they're the prettiest of all.

The code I used is available here.

Anyway, thanks for reading, and I wish you all the best for the New Year!

Full text and comments »

  • Vote: I like it
  • +354
  • Vote: I do not like it