On reducing the times of Moebius transform for subset convolution

№	Пользователь	Рейтинг
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

№	Пользователь	Вклад
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

Hi, Codeforces! I want to introduce a simple way to understand the subset convolution and a simple optimization from scratch.

Actually, I do not know this trick is well-known or just out-of-date.

If there are any mistakes, I am happy to fix them (if I am able to) and thanks for pointing out.

We know that subset convolution is equivalent to truncated multivariate polynomial multiplication like

$$$ A(x_1,\dots ,x_n)B(x_1,\dots ,x_n)\bmod (x_1^2,\dots ,x_n^2) $$$

simply written

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\mathbf{x}^2} $$$

we know that we could compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}\right)} $$$

by using the FFT-like way (Computing $$$A(\mathbf{x})\bmod{(\lbrace x_1,x_1-1\rbrace \times \cdots \times\lbrace x_n,x_n-1\rbrace)}$$$ and using Chinese remainder algorithm to restore the result. This is called the Zeta transform and the Moebius transform respectively). But the coefficients are somehow “mixed”, we can not tell the coefficients of $$$\mathbf{x}^2$$$ and $$$\mathbf{x}$$$. An idea is that we could introduce a new variable $$$t$$$ and computing $$$A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}t\right)}$$$, simply let $$$t\gets 0$$$ to get the result of subset convolution. Here is the main idea of the whole algorithm we may familar with:

First, we define $$$s$$$ for each term of $$$A(\mathbf{x}),B(\mathbf{x})$$$:

$$$ s\left(c x_1^{d_1}x_2^{d_2}\cdots x_n^{d_n}\right):=\sum_{k=1}^nd_k $$$

and

$$$ A_k(\mathbf{x})\text{ such that }s(y)=k\text{ for any term }y\text{ of }A_k(\mathbf{x}) $$$

then we compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}=\left(\sum_i A_i(\mathbf{x})\right)\left(\sum_i B_i(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)} $$$

For the $$$O(n^2)$$$ times of computation of form $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, we have an invariant that $$$s(y)+\deg_t(y)=i+j$$$ for any term $$$y$$$ of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$. Even we only do the computation of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x})}$$$, we could restore the result of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, and we do not use these coefficients that with variable $$$t$$$.

Another observation is that $$$\deg_t(A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)})\leq (i+j)/2$$$. For example, if we have $$$\left(\sum_{i+j=4}A_i(\mathbf{x})B_j(\mathbf{x})\right)+\left(\sum_{i+j=9}A_i(\mathbf{x})B_j(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, it is still possible for us to tell apart. With this simple trick, we could reduce the times of Moebius transform for about a half.

This is a submission for the optimized code. I did not implement it carefully.

upd: submission reinplemented.

Блог пользователя hly1204