sibillalazzerini's blog

By sibillalazzerini, history, 22 months ago, In English

I have been exploring the idea of computing Binomial Coefficients $$$\binom{n}{k}\mod m$$$ where $$$n$$$,$$$k$$$ and $$$m$$$ are all 64 bit integers.ie $$$m$$$ is any integer rather than being a prime. I have found few solutions here where they first factorise $$$m$$$ into $$$p_1^{e_1}p_2^{e_2}...p_t^{e_t}$$$ and then proceed for each $$$p_i^k$$$ in a manner similar to this method on math.stackexchange To elborate:

  1. Express as $$$(\binom{n}{k}\mod p_1^{e_1})\mod p_2^{e_2})$$$ ... $$$\mod p_t^{e_t})$$$
  2. then strip off all the $$$ p_i $$$ and its epponents from n!, k! and (n-k)!
  3. and compute $$$netexpo_i =$$$ total exponents of $$$p_i$$$ in n! — total exponents of $$$p_i$$$ in (n-k)! — total exponents of $$$p_i$$$ in k!
  4. if $$$netexpo_i \ge$$$ total exponents of $$$p_i$$$ in $$$m!$$$ then answer is $$$0$$$, if not proceed
  5. You Repeat step 2-4 for all $$$p_i$$$ in $$$m$$$.
  6. and then result is $$$ n! * ModularInverse((n — k)!) * ModularInverse(k!) * (p_1)^{netexpo_1}* (p_2)^{netexpo_2}... (p_t)^{netexpo_t}$$$
  7. Details of this method for each $$$(\binom{n}{k}\mod p_1^{e_1})$$$ have been discussed on math.stackexchange (at least for cases when each $$$p_i$$$ is 32 bit integer).

However among the solutions on yosupo none of the ones that I checked scale for large 64 bit $$$m$$$. Just changing the data type won't fix as it overshoots memory even when the largest prime factor of $$$m$$$ is 32-bit. So there can be two parts to the problem:

  • m is 64-bit but all of its prime factors are 32-bit
  • m is 64-bit and one of its prime factor is 64-bit

None of the solutions submitted on yosupo addresses either. My understanding is if they are implementing the algorithm similar to the discussion on math.stackexchange just scaling up datatype should work for the first part (ie $$$m$$$ is 64-bit but all of its prime factors are 32-bit). But to my surprise I found even in those cases it overshoots memory(or perhaps my edited code was buggy).

Any suggestions on how we can modify the code to scale to 64 bit for each of $$$n,k,m $$$ assuming that the largest prime factor of $$$m$$$ is within 32-bit(ie the fist part) I should call it out upfront that I have already gone through https://mirror.codeforces.com/blog/entry/65178 , https://mirror.codeforces.com/blog/entry/12878 , https://mirror.codeforces.com/blog/entry/55311 and https://mirror.codeforces.com/blog/entry/53039 .

To summarise all the ideas:

  • $$$O(p^2)$$$ Time per Query, $$$O(1)$$$ Memory — initial answer by Aryaman Maithani on math.stackexchange
  • $$$O(p)$$$ Time per Query, $$$O(1)$$$ Memory — proved using polynomial that in a subgroup of size $$$p$$$ product of only the constant term remains and other terms vanish . ie $$$\prod_{i=pk+1}^{p(k+1)-1}(p^2+i) \mod p^2 = (p-1)! \mod p^2 $$$. — updated answer by Aryaman Maithani on math.stackexchange
  • $$$O(p)$$$ Time preprocessing, $$$O(p^{1/2})$$$ Time per Query, $$$O(p^{1/2})$$$ Memory — idea is to break each block of $$$p$$$ into sub-blocks of size $$$p^{1/2}$$$ each and then preprocess for each blocks of $$$O(p^{1/2})$$$. This idea works because $$$\prod_{i=1}^{t}(p^2+p+i) \mod p^2 = \prod_{i=1}^{t}(i) \mod p^2 $$$. The method is useful when we can't afford $$$O(p)$$$ memory like the following two methods.
  • $$$O(p)$$$ Time preprocessing, $$$O(log_{p}{n})$$$ Time per Query, $$$O(p)$$$ Memory — Similar to Granville's idea. Basically the above 3 are also implementation of Granville's but with lesser degree of memorization that what Granville proposed.
  • $$$O(p)$$$ Time preprocessing, $$$O(log_2{n})$$$ Time per Query, $$$O(p)$$$ Memory — Similar to Min25's idea, it uses as Stirling number of the first kind and Lagrange's interpolation to implement it. There are few solutions on yosupo use Interpolation/NTT/FFT or some kind of domain transformation, I would put them in this category. SPyofcode might help us understand.

Where $$$p$$$ is the largest prime factor of $$$m$$$ and assuming it has $$$e_{i}=2$$$ or lesser

  • Vote: I like it
  • +17
  • Vote: I do not like it

| Write comment?
»
22 months ago, # |
Rev. 2   Vote: I like it +3 Vote: I do not like it

AFAIK there is a Chinese problem with a constant integer $$$m \leq 10^{15}$$$ that can be solved under $$$O(\sqrt{m} \times polylog(m))$$$ using Lagrangian Interpolation, NTT, Extended Euclidean, Lucas, CRT, Primitive Root, Modular Inverse, (and Montgomery Multiplication, Fast Modular Multiplication for faster modulo operations) but it is rather complex. But it might be also possible to solve for $$$m \leq 10^{18}$$$ faster than $$$1$$$ second.

For a constant prime $$$m \leq 10^{12}$$$ you can also store $$$1e6$$$ elements of $$$0e6!, 1e6!, 2e6!, \ldots, 1e12!$$$ (under mod $$$m$$$) then you can solve with only Modular Inverse and normal Modular Arithmetic in $$$O(1e6)$$$