How I understand Floor Sum

Правка en1, от prabowo, 2025-08-03 18:03:57

Hi everyone,

Today we are going to compute

$$$\begin{gather}\sum_{x=0}^{n-1} \left\lfloor \frac{ax + b}{m} \right\rfloor\end{gather}$$$

in $$$\mathcal{O}(\log(m))$$$.

This topic has been discussed elsewhere, but I find the proofs that are currently available online to be too artificial for me. So here is my attempt to describe how I understand this topic using illustrations, albeit in a less rigorous manner.

1. Floor Sum

Let us define

$$$\begin{gather}S(a, b, m, n) = \sum_{x=0}^{n-1} \left\lfloor \frac{ax + b}{m} \right\rfloor\end{gather}$$$

Here, we are going to assume that $$$a \lt m$$$ and $$$b \lt m$$$. If they are not, it is not hard to reduce them into $$$a \rightarrow a \bmod m$$$ and $$$b \rightarrow b \bmod m$$$, which I am not going to discuss here.

The sum above is equivalent to counting the number of lattice points below the curve $$$y = \frac{ax + b}{m}$$$ bounded by $$$0 \le x \lt n$$$. As an instance, the sum for $$$(a, b, m, n) = (3, 2, 5, 10)$$$ can be illustrated as follows.

The goal here is to find another interpretation of the curve above, in the hope that we might find something useful. We can see that the lattice points are also to the "right" of the curve. If we try to rotate them:

Now, we will try to relabel the rotated curve here so that it can fit to our definition of the function $$$S$$$.

Let $$$n' = \lfloor \frac{an + b}{m} \rfloor$$$. We notice that the curve covers $$$n'$$$ units of the $$$y$$$-axis. The curve at $$$y = n'$$$ is also $$$\frac{(an + b) \bmod m}{a}$$$ units away from the $$$y$$$-axis. Now, let's reflect the curve such it is pivoted at $$$y = n'$$$ instead.

Now this looks like something we can represent with $$$S$$$. In fact, we can see that the curve now has slope $$$\frac{m}{a}$$$. If we let $$$b' = (an + b) \bmod m$$$, the number of lattice points below the curve above is exactly

$$$\begin{gather}\sum_{y=0}^{n'-1} = \left\lfloor \frac{my + b'}{a} \right\rfloor\end{gather}$$$

To wrap, when $$$a \lt m$$$ and $$$b \lt m$$$, and letting $$$n' = \left\lfloor \frac{an + b}{m} \right\rfloor$$$ and $$$b' = (an + b) \bmod m$$$, we have:

$$$S(a, b, m, n) = S(m, b', a, n')$$$

This is good, because we just transformed $$$(a, m)$$$ into $$$(m, a)$$$, which in turn will be transformed into $$$(m \bmod a, a)$$$, then we can repeat the process. That is, $$$(a, m)$$$ will shrink like the Euclidean algorithm. We can stop when $$$S(a, b, m, n) = 0$$$, which will take $$$\mathcal O(\log m)$$$ steps.

2. Floor Sum Extension

In this section, we are going to compute the following:

$$$\begin{gather}T_1(a, b, m, n) = \sum_{x=0}^{n-1} \left( x \left\lfloor \frac{ax+b}{m} \right\rfloor \right)\end{gather}$$$

and

$$$\begin{gather}S_2(a, b, m, n) = \sum_{x=0}^{n-1} \left\lfloor \frac{ax+b}{m}\right\rfloor^2\end{gather}$$$

Those two sums are going to be interrelated. We are going to, again, assume that $$$a \lt m$$$ and $$$b \lt m$$$. The case otherwise is left to the reader as an exercise.

Our goal here is the same as before: to swap $$$a$$$ and $$$m$$$ in its subproblem, so that $$$(a, m)$$$ will shrink like the Euclidean algorithm. For convenience, from this section onwards, we are also going to denote $$$n' = \lfloor \frac{an + b}{m} \rfloor$$$ and $$$b' = (an + b) \bmod m$$$.

Let's start with visualising $$$T_1$$$:

Here, the points have different colours for different contributions. In the illustration above, the colour-contributions are: (blue, $$$1$$$), (yellow, $$$2$$$), (green, $$$3$$$), and so on.

Let's transform the curve as we did in section 1.

Notice that, for each $$$y$$$, the contribution shrinks from bottom to top. The sum of point-by-point contributions based on the transformed curve is:

$$$\begin{align*} T_1(a, b, m, n) = &\sum_{y=0}^{n'-1} \sum_{x=1}^{\lfloor \frac{my + b'}{a}\rfloor} (n - x) \\ =& \sum_{y=0}^{n'-1} \left(n \left\lfloor \frac{my + b'}{a} \right\rfloor - \frac{1}{2} \left\lfloor \frac{my + b'}{a} \right\rfloor - \frac{1}{2} \left\lfloor \frac{my + b'}{a} \right\rfloor^2\right) \\ =& \frac{1}{2} \left( (2n - 1) S(m, b', a, n') - S_2(m, b', a, n')\right) \end{align*}$$$

We have swapped $$$a$$$ and $$$m$$$ here, but now we have to compute $$$S_2$$$.

Let us now visualise $$$S_2$$$.

Here again, we are using different colours for different contributions. The contribution by the point at $$$(x, y)$$$ is $$$y^2 - (y-1)^2 = 2y - 1$$$. In this way, we can see that the sum of contributions for a fixed $$$x$$$ is exactly $$$\lfloor \frac{ax+b}{m}\rfloor^2$$$.

Transforming the curve above, we end up with

Let's sum the contributions based on this transformed curve.

$$$\begin{align*} S_2(a, b, m, n) = &\sum_{y=0}^{n'-1} \left( \left(2(n' - y) - 1 \right) \left\lfloor \frac{my + b'}{a} \right\rfloor \right)\\ =& \sum_{y=0}^{n'-1} \left((2n'-1) \left\lfloor \frac{my + b'}{a} \right\rfloor - 2y \left\lfloor \frac{my + b'}{a} \right\rfloor\right) \\ =& (2n' - 1) S(m, b', a, n') - 2 T_1(m, b', a, n') \end{align*}$$$

and we are done.

Note that there will be overlapping subproblems, so it is a good idea to memoise the sums too. Since we will solve for $$$\mathcal O (\log m)$$$ subproblems of $$$S$$$, $$$T_1$$$, and $$$S_2$$$, the computation of both $$$T_1$$$ and $$$S_2$$$ can be done in $$$\mathcal O (\log m)$$$.

3. Bringing in the power

We are going to further extend the functions $$$T_1$$$ and $$$S_2$$$ by trying to compute:

$$$\begin{gather}T_k(a, b, m, n) = \sum_{x=0}^{n-1} \left( x^k \left\lfloor \frac{ax+b}{m} \right\rfloor \right)\end{gather}$$$

and

$$$\begin{gather}S_k(a, b, m, n) = \sum_{x=0}^{n-1} \left\lfloor \frac{ax+b}{m}\right\rfloor^k\end{gather}$$$

The approach is exactly the same as the previous section, so you may refer to the same illustrations while reading this section, but the lattice points now have slightly more complex contributions as we are going to describe below.

It is also worth mentioning that $$$T_0 = S_1 = S$$$, so they can be evaluated like the floor sum from section 1 instead.

We start from $$$S_k$$$, since it is easier. The contribution by the point $$$(x, y)$$$ is $$$y^k - (y-1)^k$$$.

After the curve transformation,

$$$\begin{align*} S_k(a, b, m, n) = &\sum_{y=0}^{n'-1} \left( (n' - y)^k - (n' - y - 1)^k \right) \left\lfloor \frac{my + b'}{a} \right\rfloor\\ =& \sum_{y=0}^{n'-1} \sum_{i=0}^{k-1} \binom{k}{i} (-y)^i \left((n')^{k-i} - (n' - 1)^{k-i} \right) \left\lfloor \frac{my + b'}{a} \right\rfloor\\ =& \sum_{i=0}^k (-1)^i \binom{k}{i} \left((n')^{k-i} - (n' - 1)^{k-i} \right) T_i(m, b', a, n') \end{align*}$$$

so this single step takes $$$\mathcal O(k)$$$ to compute. Also note that we assume $$$0^0 = 1$$$ here.

For $$$T_k$$$, the contribution by the point $$$(x, y)$$$ is $$$x^k$$$. For this, we will define $$$P_d(n) = \sum_{x=0}^n x^d$$$. Using Faulhaber's formula,

$$$\begin{gather} P_d(n) = d! \sum_{i=0}^d \frac{B_i}{i!} \frac{n^{d+1-i}}{(d+1-i)!} \end{gather}$$$

where $$$B_i$$$ is the Bernoulli number whose generating function is $$$\frac{x}{e^x - 1}$$$.

The reason we are explicitly expanding the sum of powers will be more relevant at a later section. Back to computing $$$T_k$$$, after the curve transformation, we can then compute the contributions as follows.

$$$\begin{align*} T_k(a, b, m, n) &= \sum_{y=0}^{n'-1} \sum_{x=1}^{\lfloor \frac{my + b'}{a}\rfloor} (n - x)^k \\ &= \sum_{y=0}^{n'-1} \left( P_k(n) + \left\lfloor \frac{my + b'}{a}\right\rfloor^k - P_k\left( \left\lfloor \frac{my + b'}{a}\right\rfloor\right) \right) \\ &= n' P_k(n) + S_k(m, b', a, n') - k! \sum_{i=0}^k \frac{B_i}{i!} \frac{S_{k+1-i}(m, b', a, n')}{(k+1-i)!} \end{align*}$$$

The Bernoulli numbers can be precomputed, so this single step to compute $$$T_k$$$ also takes $$$\mathcal O(k)$$$ to compute.

Small Optimization

You may already notice that the formulas to compute $$$T_k$$$ and $$$S_k$$$ above involve convolutions. This means that, if we have already computed $$$T_i(m, b', a, n')$$$ and $$$S_i(m, b', a, n')$$$ (for all $$$0 \le i \le k$$$), then $$$T_i(a, b, m, n)$$$ and $$$S_i(a, b, m, n)$$$ (for all $$$0 \le i \le k$$$) can be computed in $$$\mathcal O(k \log k)$$$.

Back to computing $$$T_k$$$ and $$$S_k$$$, when you try to reduce for the case when $$$a \geq m$$$, you'll realise that you need to compute

$$$\begin{gather}\sum_{x=0}^{n-1} x^{k_1} \left\lfloor \frac{ax + b}{m} \right\rfloor^{k2}\end{gather}$$$

which brings us to the next section.

4. Combining Everything Together

We are going to compute:

$$$\begin{gather}S_{k_1, k_2}(a, b, m, n) = \sum_{x=0}^{n-1} x^{k_1} \left\lfloor \frac{ax + b}{m} \right\rfloor^{k2}\end{gather}$$$

We just have to combine the expansions that we have obtained from the previous section. One can realise that the sum we want is:

$$$ \begin{align*} S_{k_1, k_2}(a, b, m, n) &= \sum_{y=0}^{n'-1} ((n' - y)^{k_2} - (n' - y - 1)^{k_1}) \sum_{x=1}^{\lfloor \frac{my + b'}{a}\rfloor} (n - x)^{k_1} \\ &= \sum_{y=0}^{n'-1} \sum_{i=0}^{k_2-1} \binom{k}{i} (-y)^i \left((n')^{k_2-i} - (n' - 1)^{k_2-i} \right) \left( P_{k_1}(n) + \left\lfloor \frac{my + b'}{a}\right\rfloor^{k_1} - P_{k_1}\left( \left\lfloor \frac{my + b'}{a}\right\rfloor\right) \right) \\ &= \sum_{i=0}^{k_2-1} (-1)^i \binom{k_2}{i} ((n')^{k_2-i} - (n' - 1)^{k_2-i}) \sum_{y=0}^{n'-1} \left(y^i P_{k_1}(n) + y^i \left\lfloor \frac{my + b'}{a}\right\rfloor^{k_1} + y^i P_{k_1}\left( \left\lfloor \frac{my + b'}{a}\right\rfloor\right) \right) \\ &= \sum_{i=0}^{k_2-1} (-1)^i \binom{k_2}{i} ((n')^{k_2-i} - (n' - 1)^{k_2-i}) \left(P_i(n'-1) P_{k_1}(n) + S_{i,k_1}(m, b', a, n') + (k_1)! \sum_{j=0}^{k_1} \frac{B_j}{j!} \frac{S_{i,k_1+1-j}(m, b', a, n')}{(k_2 + 1 - j)!} \right) \end{align*} $$$

which can be computed in $$$\mathcal O(k_1 k_2)$$$, so the overall operations to compute $$$S_{k_1, k_2}$$$ can be done in $$$\mathcal O ((k_1 + k_2)^4 \log m)$$$.

By noticing the convolution in the equation, and applying the optimisation we briefly mentioned from the previous section, it is, in fact, possible to bring down the whole operations to $$$\mathcal O((k_1 + k_2)^2 \log(k_1 + k_2) \log m)$$$.

Notes

From Min_25's article, it seems that it is possible to compute $$$S_k$$$ and $$$T_k$$$ in $$$\mathcal{O}(k^2 \log k)$$$ by decomposing $$$\frac{a}{m}$$$ using the Stern-Brocot tree. However, I am not sure whether there is an optimisation we can apply for our specific approach here without the need to compute $$$S_{k1, k2}$$$.

References

Appendix

Case a >= m reduction
Case b >= m reduction
Sample implementation for S, T_1, S_2 in Python

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en1 Английский prabowo 2025-08-03 18:03:57 13958 Initial revision (published)