Translated by ChatGPT 5.5 Instant.
Barrett Reduction can usually solve the following problem faster:
For a fixed modulus $$$m$$$, repeatedly given parameters $$$x$$$, quickly compute the value of $$$x \bmod m$$$.
Direct modulo operations, such as x % m in C++, are usually relatively slow, so we consider constant-factor optimization. The Barrett Reduction algorithm first preprocesses an approximation of (\frac{1}{m}).
Choose a precision parameter $$$k$$$. On a 64-bit machine, $$$k=64$$$ is commonly used. Let $$$B=2^k$$$. Precompute
Then we have
So the quotient $$$q$$$ can be estimated as
Then the remainder is clearly
Since the quotient $$$q$$$ is only an estimate, the remainder $$$r$$$ also needs some correction. See the code implementation for the details.
struct Barrett {
unsigned int mod;
unsigned long long im;
inline Barrett(unsigned int m = 1) : mod(m), im((unsigned long long)(-1) / m + 1) {}
inline unsigned int reduce(unsigned long long x) {
unsigned long long q = (unsigned long long)((__uint128_t)x * im >> 64);
unsigned int r = x - q * mod;
if (r >= mod) r += mod;
return r;
}
inline unsigned int mul(unsigned int a, unsigned int b) {
return reduce((unsigned long long)a * b);
}
};
Usage:
Barrett bt(mod); // mod is the modulus
Calling bt.reduce(a) gives the value of a % mod, and calling bt.mul(a, b) gives the value of a * b % mod.
From the analysis above, we can see that Barrett Reduction does not require mod to be prime; it only requires the modulus to be a positive integer. However, this 64-bit ACL-style implementation relies on 32-bit unsigned overflow correction, so it usually requires$ 1 \le \text{mod} < 2^{31}$. Within this range, bt.reduce(x) can quickly compute x % mod, and bt.mul(a, b) can quickly compute a * b % mod.




