Half-GCD algorithm - Codeforces

#	User	Rating
1	tourist	4009
2	jiangly	3831
3	Radewoosh	3646
4	jqdai0815	3620
4	Benq	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	gamegame	3386
10	ksun48	3373

#	User	Contrib.
1	cry	164
1	maomao90	164
3	Um_nik	163
4	atcoder_official	160
5	-is-this-fft-	158
6	awoo	157
6	adamant	157
8	TheScrasse	154
8	nor	154
10	Dominater069	153

Hi everyone!

Today I'd like to finally talk about an algorithm to solve the following tasks in $$$O(n \log^2 n)$$$:

Compute the greatest common divisor of two polynomials $$$P(x)$$$ and $$$Q(x)$$$;
Given $$$f(x)$$$ and $$$h(x)$$$ find the multiplicative inverse of $$$f(x)$$$ modulo $$$h(x)$$$;
Given $$$F_0,F_1, \dots, F_m$$$, recover the minimum linear recurrence $$$F_n = a_1 F_{n-1} + \dots + a_d F_{n-d}$$$;
Given $$$P(x)$$$ and $$$Q(x)$$$, find $$$A(x)$$$ and $$$B(x)$$$ such that $$$P(x) A(x) + Q(x) B(x) = \gcd(P, Q)$$$;
Given $$$P(x)=(x-\lambda_1)\dots(x-\lambda_n)$$$ and $$$Q(x)=(x-\mu_1)\dots(x-\mu_m)$$$ compute their resultant.

More specifically, this allows to solve in $$$O(n \log^2 n)$$$ the following problems:

Library Checker — Find Linear Recurrence. You're given $$$F_0, \dots, F_{m}$$$. Find $$$a_1, \dots, a_d$$$ with minimum $$$d$$$ such that

$$$ F_n = \sum\limits_{k=1}^d a_k F_{n-k}. $$$

Library Checker — Inv of Polynomials. You're given $$$f(x)$$$ and $$$h(x)$$$. Compute $$$f^{-1}(x)$$$ modulo $$$h(x)$$$.

All tasks here are connected with the extended Euclidean algorithm and the procedure we're going to talk about is a way to compute it quickly. I recommend reading article on recovering minimum linear recurrence first, as it introduces some useful results and concepts. It is also highly recommended to familiarize yourself with the concept of continued fractions.

Euclidean algorithm and continued fractions

Assume that we want to compute the greatest common divisor of $$$A(x)$$$ and $$$B(x)$$$ for $$$\deg A \geq \deg B$$$. For this purpose, we compute

$$$ r_{i} = r_{i-2} \mod r_{i-1} = r_{i-2} - a_i r_{i-1}, $$$

starting with $$$r_{-2} = A(x)$$$ and $$$r_{-1} = B(x)$$$. The last element in the sequence is $$$r_k = 0$$$ for some $$$k$$$.

This sequence corresponds to the continued fraction

$$$ \frac{A(x)}{B(x)} = a_0(x) + \frac{1}{a_1(x) + \frac{1}{a_2(x)+\dots}} = [a_0(x); a_1(x), a_2(x),\dots] $$$

With this representation, the sequence of convergents $$$\frac{p_i(x)}{q_i(x)}$$$ is defined, such that

$$$ \frac{p_i(x)}{q_i(x)} = [a_0; a_1, \dots, a_i] = a_0 + \frac{1}{\dots + \frac{1}{a_i}}. $$$

The sequence of convergents adheres to the following recurrence:

$$$ \frac{p_i(x)}{q_i(x)} = \frac{a_i(x) p_{i-1}(x) + p_{i-2}(x)}{a_i(x) q_{i-1}(x) + q_{i-2}(x)}, $$$

starting with $$$\frac{p_{-2}}{q_{-2}} = \frac{0}{1}$$$ and $$$\frac{p_{-1}}{q_{-1}} = \frac{1}{0}$$$. Note that the following recurrence stands connecting the degrees of $$$q_i$$$, $$$p_i$$$ and $$$r_i$$$:

$$$ \deg q_{i} - \deg q_{i-1} = \deg p_{i} - \deg p_{i-1} = \deg r_{i-2} - \deg r_{i-1} = a_i. $$$

From this fact it follows that

$$$ \begin{cases} \deg p_i = \deg a_0 + \deg a_1 + \dots + \deg a_i,\\ \deg q_i = \deg a_1 + \dots + \deg a_i,\\ \deg r_i = \deg A - \deg a_0 - \dots - \deg a_{i+1} = \deg A - \deg p_{i+1}. \end{cases} $$$

For $$$\frac{A(x)}{B(x)} = [a_0; a_1, \dots, a_k]$$$, the sequence of convergents ends with $$$\frac{p_k(x)}{q_k(x)} = \frac{A(x)}{B(x)}$$$, such that $$$p_k$$$ and $$$q_k$$$ are coprime. Therefore,

$$$ \deg p_k = \deg \frac{A}{\gcd(A, B)} = \deg a_0 + \dots + \deg a_k. $$$

From this follows that the sequence $$$a_0(x),a_1(x),\dots,a_k(x)$$$ provides another linear-size representation of $$$\frac{A(x)}{B(x)}$$$.

It can be proven (see the previous article) that

$$$ A(x) q_i(x) - B(x) p_i(x) = (-1)^i r_i(x), $$$

in particular it means that $$$A(x) q_{i-1}(x) - B(x) p_{i-1}(x) = (-1)^{i-1} \gcd(A, B)$$$.

The key to execute the extended Euclidean algorithm in $$$O(n \log^2 n)$$$ is to be able to switch between the two representations.

Conversion of $$$[a_0(x); a_1(x), \dots, a_k(x)]$$$ to $$$p_k$$$, $$$q_k$$$ and $$$r_k$$$

The recurrence $$$p_{i} = p_{i-2} + a_i p_{i-1}$$$ can be written in matrix form as

$$$ \begin{pmatrix} a_i & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} p_{i-1} \\ p_{i-2} \end{pmatrix} = \begin{pmatrix} p_i \\ p_{i-1} \end{pmatrix}, $$$

therefore, the following formula stands:

$$$ \begin{pmatrix} a_i & 1 \\ 1 & 0 \end{pmatrix} \dots \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} p_i \\ p_{i-1} \end{pmatrix}. $$$

The product of $$$i+1$$$ matrices can be computed in $$$O(n \log^2 n)$$$ with divide and conquer algorithm for $$$n=\deg a_0 + \dots + \deg a_i$$$.

Using $$$\binom{0}{1}$$$ starting vector, one will get $$$\binom{q_i}{q_{i-1}}$$$, thus the following joint formula is also true:

$$$ \begin{pmatrix} a_i & 1 \\ 1 & 0 \end{pmatrix} \dots \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} p_i & q_i \\ p_{i-1} & q_{i-1} \end{pmatrix}. $$$

Knowing $$$p_i$$$ and $$$q_i$$$, one can compute $$$r_i$$$ from the formula above:

$$$ (-1)^i r_i = A(x) q_i(x) - B(x) p_i(x). $$$

Note that transposing left-hand side and right-hand side will also give us the following identity:

$$$ \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \dots \begin{pmatrix} a_i & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} p_i & p_{i-1} \\ q_i & q_{i-1} \end{pmatrix}. $$$

Conversion of $$$\frac{A(x)}{B(x)}$$$ to $$$[a_0(x); a_1(x), \dots, a_k(x)]$$$

Connection between convergents and remainders

From the continued fraction properties it follows that

$$$ \frac{A(x)}{B(x)} = [a_0; a_1, \dots, a_{i-1}, s_i], $$$

where $$$s_i = \frac{r_{i-2}}{r_{i-1}}$$$ is the so-called $$$i$$$-th complete quotient of $$$\frac{A}{B}$$$. In the matrix form this is denoted as

$$$ \begin{pmatrix} p_{i-1} & p_{i-2} \\ q_{i-1} & q_{i-2} \end{pmatrix}\begin{pmatrix} r_{i-2} \\ r_{i-1} \end{pmatrix} = \begin{pmatrix} A \\ B \end{pmatrix}. $$$

From continued fraction properties it also follows that $$$p_{i-1} q_{i-2} - q_{i-1} p_{i-2} = (-1)^{i}$$$, therefore

$$$ \begin{pmatrix} r_{i-2} \\ r_{i-1} \end{pmatrix} = (-1)^{i} \begin{pmatrix} q_{i-2} & -p_{i-2} \\ -q_{i-1} & p_{i-1} \end{pmatrix} \begin{pmatrix} A \\ B \end{pmatrix}. $$$

Knowing $$$p_{i-1}$$$, $$$p_{i-2}$$$, $$$q_{i-1}$$$, $$$q_{i-2}$$$, it is possible to to recover the remaining partial quotients as the partial expansion of $$$\frac{r_{i-2}}{r_{i-1}}$$$.

In the algorithm below we'll learn how to find a transform that significantly advances $$$\frac{A}{B}$$$ to $$$s_i$$$.

Half-GCD

Denoting $$$\frac{A(x)}{B(x)} = \frac{A_0(x) + x^t A_1(x)}{B_0(x) + x^t B_1(x)}$$$, let's look on what we can derive from the continued fraction expansion of $$$\frac{A_1(x)}{B_1(x)}$$$. Let

$$$ A_1 q_i' - B_1 p_i' = (-1)^i r_i', $$$

then

\begin{align} A q_i' — B p_i'& = (A_0 + x^t A_1) q_i' — (B_0 + x^t B_1)p_i' \newline &= (A_0 q_i' — B_0 p_i') + x^t (-1)^i r_i'. \end{align}

Thus if the degree of $$$x^t (-1) r_i'$$$ is greater than of $$$A_0 q_i' - B_0 p_i'$$$, the fraction $$$p_i'/q_i'$$$ is also the $$$i$$$-th convergent of $$$A(x)/B(x)$$$ itself and not only $$$A_1(x)/B_1(x)$$$. With this in mind, we can design a divide and conquer algorithm that will safely advance the computation of the continued fraction representation of $$$A(x)/B(x)$$$ by $$$i$$$ steps using the continued fraction representation of $$$A_1(x)/B_1(x)$$$.

Let $$$n = \deg A$$$ and $$$t = \lceil \frac{n}{2} \rceil$$$. We will demand an algorithm to find the transform

$$$ \begin{pmatrix} p_i & p_{i-1} \\ q_i & q_{i-1} \end{pmatrix} \begin{pmatrix} r_{i-1} \\ r_i \end{pmatrix} = \begin{pmatrix} A \\ B \end{pmatrix} $$$

such that $$$\deg r_{i-1} \geq t > \deg r_i$$$. To do this, we decompose $$$\frac{A}{B}=\frac{A_0 + x^t A_1}{B_0 + x^t B_1}$$$ and solve the problem recursively for $$$\frac{A_1}{B_1}$$$.

Let $$$R$$$ be the transform found by the recursive algorithm, now we should apply $$$R^{-1}$$$ to $$$\binom{A}{B}$$$ to get $$$\frac{r_{k-1}}{r_k}$$$.

If after that the condition above holds, we return $$$R$$$. Otherwise, we advance one step in continued fraction computation, from $$$\frac{r_{k-1}}{r_k}$$$ to

$$$ \frac{r_k}{r_{k+1}}=\frac{r_k}{r_{k-1} \mod r_k}=\frac{r_k}{r_{k-1} - a_{k+1} r_k}. $$$

Now we have a fraction $$$\frac{r_k}{r_{k+1}}$$$ and we need to continue the process till we get to $$$\frac{r_{i-1}}{r_i}$$$.

We need to further reduce the denominator degree from $$$\deg r_{k+1}$$$ to less than $$$t$$$. Thus, we need to reduce this value by $$$\deg r_{k+1} - t$$$.

For this purpose, we represent $$$\frac{r_k}{r_{k+1}}$$$ as $$$\frac{A'_0 + x^m A'_1}{B'_0 + x^m B'_1}$$$ and compute half-GCD again, but for $$$\frac{A'_1}{B'_1}$$$.

For this to reduce $$$\deg r_{k}$$$ to $$$t$$$, it must hold that

$$$ \deg A'_1 - \lceil \frac{\deg A'_1}{2} \rceil = \deg r_k - t = (\deg r_k - m) - \lceil \frac{\deg r_{k} - m}{2} \rceil, $$$

Thus we should pick $$$m = 2t - \deg r_{k}$$$.

If the transform returned by the second half-GCD call is $$$S$$$, the overall transform is of form

$$$ R \cdot \begin{pmatrix}a_{k+1} & 1 \\ 1 & 0\end{pmatrix} \cdot S. $$$

The algorithm above makes $$$2$$$ recursive calls on halved sizes and needs $$$O(n \log n)$$$ time before the second recursive call and before returning the answer. Therefore, the overall running time is

$$$ T(n) = 2T(\frac{n}{2})+O(n \log n) = O(n \log^2 n). $$$

Additionally to the linear transform matrix you can maintain a sequence $$$[a_0; a_1, \dots, a_k]$$$ itself. Then, once half_GCD is implemented, you can repeatedly apply it to $$$\frac{A}{B}$$$ advancing its computation. Each advance will reduce the current size of polynomials by at least a factor of $$$2$$$, hence the overall complexity will still be $$$O(n \log^2 n)$$$.

Implementation

I've implemented this algorithm for my polynomial library. For $$$\frac{A}{B} = [a_0; a_1, \dots, a_k]$$$, the function full_gcd return a pair of vector that contains polynomials $$$a_0,\dots, a_k$$$ and of the transform

$$$ \begin{pmatrix} p_k & p_{k-1} \\ q_k & q_{k-1} \end{pmatrix}. $$$

A code that implements the algorithm in the terms of my polynomial library looks like this:

Source code

// (ax+b) / (cx+d)
struct transform {
    poly a, b, c, d;
    transform(poly a, poly b = T(1), poly c = T(1), poly d = T(0)): a(a), b(b), c(c), d(d){}
    
    transform operator *(transform const& t) {
        return {
            a*t.a + b*t.c, a*t.b + b*t.d,
            c*t.a + d*t.c, c*t.b + d*t.d
        };
    }
    
    transform adj() {
        return transform(d, -b, -c, a);
    }
    
    auto apply(poly A, poly B) {
        return make_pair(a * A + b * B, c * A + d * B);
    }
};

template<typename Q>
static void concat(vector<Q> &a, vector<Q> const& b) {
    for(auto it: b) {
        a.push_back(it);
    }
}

// finds a transform that changes A/B to A'/B' such that
// deg B' is at least 2 times less than deg A
static pair<vector<poly>, transform> half_gcd(poly A, poly B) {
    assert(A.deg() >= B.deg());
    int m = (A.deg() + 1) / 2;
    if(B.deg() < m) {
        return {{}, {T(1), T(0), T(0), T(1)}};
    }
    auto [ar, Tr] = half_gcd(A.div_xk(m), B.div_xk(m));
    tie(A, B) = Tr.adj().apply(A, B);
    if(B.deg() < m) {
        return {ar, Tr};
    }
    auto [ai, R] = A.divmod(B);
    tie(A, B) = make_pair(B, R);
    int k = 2 * m - B.deg();
    auto [as, Ts] = half_gcd(A.div_xk(k), B.div_xk(k));
    concat(ar, {ai});
    concat(ar, as);
    return {ar, Tr * transform(ai) * Ts};
}
        
// return a transform that reduces A / B to gcd(A, B) / 0
static pair<vector<poly>, transform> full_gcd(poly A, poly B) {
    vector<poly> ak;
    vector<transform> trs;
    while(!B.is_zero()) {
        if(2 * B.deg() > A.deg()) {
            auto [a, Tr] = half_gcd(A, B);
            concat(ak, a);
            trs.push_back(Tr);
            tie(A, B) = trs.back().adj().apply(A, B);
        } else {
            auto [a, R] = A.divmod(B);
            ak.push_back(a);
            trs.emplace_back(a);
            tie(A, B) = make_pair(B, R);
        }
    }
    trs.emplace_back(T(1), T(0), T(0), T(1));
    while(trs.size() >= 2) {
        trs[trs.size() - 2] = trs[trs.size() - 2] * trs[trs.size() - 1];
        trs.pop_back();
    }
    return {ak, trs.back()};
}

From the formulas above, it holds that

$$$ A q_{k-1} - B p_{k-1} = (-1)^{k-1} \gcd(A, B). $$$

This formula allows to find $$$\gcd(A, B)$$$ and a multiplicative inverse of $$$A(x)$$$ modulo $$$B(x)$$$ when $$$\deg \gcd(A, B) = 0$$$ (submission).

Finally, to find the minimum linear recurrence in $$$O(n \log^2 n)$$$, you should compute $$$\frac{x^{n+1}}{F(x)} = [a_0; a_1, \dots]$$$ representation explicitly and find $$$k$$$ for which $$$\deg r_k < \deg p_k$$$, as described in the previous article. Then, compute and output $$$[a_0; a_1, \dots, a_k]$$$ (submission).

Rev.	By	When	Δ	Comment
en5	adamant	2022-04-14 03:09:06	119
en4	adamant	2022-04-14 02:52:52	1	typo
en3	adamant	2022-04-12 21:30:54	362
en2	adamant	2022-04-12 21:28:26	2336	source code
en1	adamant	2022-04-12 21:24:32	10530	Initial revision (published)