On today's POI training camp round I have learnt a nice technique that could possibly be useful in some number theory problems. I couldn't find any CF article on it, so I think it's fair enough to share it on my own.
Remark on used notation
In some sums I will use an Iverson notation
Problem: Squarefree Function
Let's define a Squarefree Function f(x) for any positive integer x as x divided by a greatest perfect square, that divides x.
For example: f(1)=1, f(2)=2, f(4)=1, f(6)=6, f(27)=3, f(54)=6, f(800)=2
Given an array a of n≤105 positive integers, where each ai≤105 compute sum
\begin{gather} \sum\limits_{1 \leq i,j \leq n}f(a_i\cdot a_j) \mod (10^9 + 7) \end{gather}
Technique: GCD Convolution
You might probably heard about a Sum Convolution. For two arrays b, and c, it is defined as an array d such that \begin{gather} d_k = \sum\limits_{i + j = k}b_i\cdot c_j \end{gather} If not, it's basically the same thing as a polynomial multiplication. If B(x) = b_0 + b_1x + b_2x^2 + ... + b_nx^n, and C(x) = c_0 + c_1x + c_2x^2 + ... + c_nx^n, then (B \cdot C)(x) = d_0 + d_1x + d_2x^2 + ... + d_{2n}x^{2n}
Let's define GCD Convolution by analogy
Definition
A GCD Convolution of two arrays b, and c, consisting of positive integers, is an array d such that \begin{gather} d_k = \sum\limits_{gcd(i,j) = k}b_i\cdot c_j \end{gather}
Algorithm to find GCD Convolution
Of course, we can compute it naively by iterating over all pairs of indicies. If b and c consists of n elements then the overall complexity would be O(n^2log(max(b) + max(c))). But it turns out, that we can do better.
Let's look at the sum of d_k values, with indicies divisible by some integer g, so that k = gm is satisfied for some integer m. \begin{gather} \sum\limits_{m=1}^{n/g}d_{gm} = \sum\limits_{m=1}^{n/g}\sum\limits_{gcd(i,j) = gm}b_i\cdot c_j = \sum\limits_{g | gcd(i,j)}b_i\cdot c_j \end{gather}
From the definition of gcd, we know that g | gcd(i,j) \Leftrightarrow g | i \wedge g | j \begin{gather} \sum\limits_{g | gcd(i,j)}b_i\cdot c_j = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; gcd(i,j)] = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; i][g \;|\; j] = \end{gather} \begin{gather} =\sum\limits_{i,j}\left(b_i[g \;|\; i]\right)\cdot \left(c_j[g \;|\; j]\right) = \left(\sum\limits_{g|i}b_i\right)\left(\sum\limits_{g|j}c_j\right) \end{gather}
We can define B_g = \sum_{m=1}^{n/g}b_{gm}, and C_g = \sum_{m=1}^{n/g}c_{gm}, and D_g = \sum_{m=1}^{n/g}d_{gm}. From above equation one could easily derive D_g = B_g\cdot C_g. Knowing that O(n + \frac{n}{2} + \frac{n}{3} + ...) = O(n\log n), arrays B and C can be computed directly from their definitions in O(n\log n).
Recovering a d_k values from D array is simple. All we need is just subtract all the summands of D_i except for the smallest. So, formally, we have \begin{gather} d_k = D_k - \sum\limits_{m=2}^{n/k}d_{km} \end{gather} Which can be computed using dynamic programming, starting from k = n.
So, the overall complexity of computing a GCD Convolution of two arrays of size n is O(n\log n).
Back to original problem
We can see, that \begin{gather} f(a_i\cdot a_j) = \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} \end{gather}
So, having an array w_{f(a_i)} = \sum\limits_if(a_i) all we need is just to compute a GCD Convolution of w with itself. Let's denote that convolution by d. Then, by definition \begin{gather} \sum\limits_{i,j :\;gcd(f(a_i), f(a_j)) = k} \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} = \frac{d_k}{k^2} \end{gather}
So answer to our problem is just a sum \begin{gather} \sum\limits_{k=1}^{max(f(a_i))}\frac{d_k}{k^2} \end{gather}
Assuming that we have computed f(a_i) values with sieve, if we denote A = max(a_i), then overall complexity of this solution is O(n + A\log A)
Practice problems
Actually, I don't have any. I will be glad if you share some problems in comments. All I have is just this: