LorentzianExpanders's blog

By LorentzianExpanders, history, 15 months ago, In English

Hi everyone,

In this post, I'll share a method to compute modular inverses in $$$O(1)$$$ time under some $$$p^{\varepsilon}$$$-time precomputation.

Rather than simply computing the modular inverse, we aim to solve for a pair of integers $$$x, y$$$ such that $$$ax + by = \gcd(a, b)$$$. I.e., find the solution of the Bezout equation.

Observations

Observation. For two non-zero positive integers $$$a, b$$$ less than some bound $$$X$$$, there exists an invertible integer matrix $$$M \in \mathrm{GL}_2(\mathbb{Z})$$$ such that one component of the vector $$$M \begin{pmatrix}a\\b \end{pmatrix}$$$ is bounded by $$$X^{1/2}$$$, and the absolute values of the elements in $$$M$$$ are also bounded by $$$X^{1/2}$$$.

Proof. The pigeonhole principle is sufficient to prove this observation. However, there could be a more algorithmic proof by analyzing the execution of Euclid's algorithm. We omit the details of this proof here. $$$\square$$$

Corollary. Given two non-zero positive integers $$$a, b$$$ less than $$$X$$$, and a parameter $$$B \leq X$$$, there exists an invertible integer matrix $$$M \in \mathrm{GL}_2(\mathbb{Z})$$$ such that one component of the vector $$$M \begin{pmatrix}a\\b \end{pmatrix}$$$ is bounded by $$$3X/B^{1/2}$$$.

Proof. First, we define $$$a_0 = \lfloor aB/X \rfloor$$$ and $$$b_0 = \lfloor bB/X \rfloor$$$. Using the matrix $$$M$$$ from the previous observation, we can compute $$$M(a_0, b_0)$$$, where one of the components is bounded by $$$B^{1/2}$$$. The error between $$$(a,b)$$$ and $$$(a_0,b_0)$$$ is controlled by the difference $$$O(B/X)$$$. In total, the error is bounded by $$$3X/B^{1/2}$$$. $$$\square$$$

The core of the algorithm involves using this corollary to precompute relevant data and then quickly look up the results during the query phase. The following describes the two phases of the approach:

Algorithm

Processing Phase:
Choose a parameter $$$B$$$ and precompute the invertible matrices $$$M$$$ for all $$$a, b \lt B$$$, as well as the answers for $$$a, b \lt 3B$$$. This precomputation phase has a time complexity of $$$\tilde{O}(B^2)$$$, where the $$$\tilde{O}$$$ notation accounts for polylogarithmic factors.

Query Phase:
Once the preprocessing is done, we can perform the following steps to compute the result for any query:

  • For $$$\max(a, b) \geq 3B$$$, use the argument from the Corollary to reduce the larger of the two numbers to a magnitude of $$$X \cdot 3B^{-1/2}$$$. The matrix from the Observation can then be retrieved via a table lookup. Afterward, apply one step of the Euclidean algorithm to reduce the other number.

  • Continue this process until both $$$a, b \lt 3B$$$, at which point the result can be directly retrieved from the precomputed table.

This query phase runs in $$$O(\log X / \log B)$$$ time, providing an efficient way to compute the desired results.

Take $$$B = X^{O(1/k)}$$$, we could do $$$O(p^{1/k})$$$ time to do precomputation, and support the modular inversion in $$$O(k)$$$ time per query.

Take $$$B=(\log X)^{O(1)}$$$, we could compute modular inversion in $$$O(\log X / \log \log X)$$$ for a single query.

Remarks

We know that for the data structure problem, there are some tradeoffs between $$$O(n^\epsilon)$$$ time update and $$$O(1/\epsilon)$$$ time query. The approach described here can be thought of as the right analogy with the half-GCD algorithm. The key insight of the half-GCD method is recognizing that the results of the corollary can be computed recursively, leading to an $$$O(\ell \log^2 \ell)$$$ algorithm for Euclidean division in integer arithmetic or for polynomials over finite fields.

  • Vote: I like it
  • +116
  • Vote: I do not like it

»
15 months ago, hide # |
Rev. 2  
Vote: I like it 0 Vote: I do not like it
For $$$max(a,b)\geq 3B)$$$, reduce the larger of the two numbers by a factor of $$$X\cdot 3B^{-1/2}$$$.

I am not understanding this. I presume this steps uses the precomputed matrix and the previous lemma? How is the correct martix found? Are you computing some ranges for each matrirx where the lemma holds, and then find it using some 2D data structures (with horribly impractically large constant factors and log factors)?

  • »
    »
    15 months ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    Sorry for causing such confusion. I rewrote this part now, which is close to what you presumed. There is no need to maintain some data structure, it's just a 2D table lookup. By the argument of the Corollary, you just need to find the matrix stored in $$$(\lfloor aB/X\rfloor, \lfloor bB/X\rfloor)$$$.

»
15 months ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by LorentzianExpanders (previous revision, new revision, compare).

»
6 months ago, hide # |
 
Vote: I like it +28 Vote: I do not like it

I think "sublinear" would make more sense than "subpolynomial" (Precomputation).

  • »
    »
    6 months ago, hide # ^ |
     
    Vote: I like it +24 Vote: I do not like it

    Sublinear only tells that it's $$$o(n)$$$, but subpolynomial emphasizes that it's better than any polynomial ($$$n^c$$$).

    • »
      »
      »
      6 months ago, hide # ^ |
       
      Vote: I like it 0 Vote: I do not like it

      It seems that subpolynomial-time precomputation is mutually exclusive with constant-time query with this algorithm? If you choose any subpolynomial $$$B(p)$$$, query time becomes unbounded.