On reducing the times of Moebius transform for subset convolution

Hi, Codeforces! I want to introduce a simple way to understand the subset convolution and a simple optimization from scratch.

Actually, I do not know this trick is well-known or just out-of-date.

If there are any mistakes, I am happy to fix them (if I am able to) and thanks for pointing out.

We know that subset convolution is equivalent to truncated multivariate polynomial multiplication like

$$$ A(x_1,\dots ,x_n)B(x_1,\dots ,x_n)\bmod (x_1^2,\dots ,x_n^2) $$$

simply written

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\mathbf{x}^2} $$$

we know that we could compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}\right)} $$$

by using the FFT-like way (Computing $$$A(\mathbf{x})\bmod{(\lbrace x_1,x_1-1\rbrace \times \cdots \times\lbrace x_n,x_n-1\rbrace)}$$$ and using Chinese remainder algorithm to restore the result. This is called the Zeta transform and the Moebius transform respectively). But the coefficients are somehow “mixed”, we can not tell the coefficients of $$$\mathbf{x}^2$$$ and $$$\mathbf{x}$$$. An idea is that we could introduce a new variable $$$t$$$ and computing $$$A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}t\right)}$$$, simply let $$$t\gets 0$$$ to get the result of subset convolution. Here is the main idea of the whole algorithm we may familar with:

First, we define $$$s$$$ for each term of $$$A(\mathbf{x}),B(\mathbf{x})$$$:

$$$ s\left(c x_1^{d_1}x_2^{d_2}\cdots x_n^{d_n}\right):=\sum_{k=1}^nd_k $$$

and

$$$ A_k(\mathbf{x})\text{ such that }s(y)=k\text{ for any term }y\text{ of }A_k(\mathbf{x}) $$$

then we compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}=\left(\sum_i A_i(\mathbf{x})\right)\left(\sum_i B_i(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)} $$$

For the $$$O(n^2)$$$ times of computation of form $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, we have an invariant that $$$s(y)+\deg_t(y)=i+j$$$ for any term $$$y$$$ of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$. Even we only do the computation of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x})}$$$, we could restore the result of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, and we do not use these coefficients that with variable $$$t$$$.

Another observation is that $$$\deg_t(A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)})\leq (i+j)/2$$$. For example, if we have $$$\left(\sum_{i+j=4}A_i(\mathbf{x})B_j(\mathbf{x})\right)+\left(\sum_{i+j=9}A_i(\mathbf{x})B_j(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, it is still possible for us to tell apart. With this simple trick, we could reduce the times of Moebius transform for about a half.

This is a submission for the optimizated code. I did not implement it carefully.

	Rev.	Lang.	By	When	Δ	Comment
	en2		hly1204	2024-02-27 11:01:33	79	reinplemented the code
	en1		hly1204	2024-02-26 19:37:25	2896	Initial revision (published)

Rev.

Lang.

When

Comment

en2

hly1204

2024-02-27 11:01:33

reinplemented the code

en1

hly1204

2024-02-26 19:37:25

2896

Initial revision (published)

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	turmax	3559
6	tourist	3541
7	strapple	3515
8	ksun48	3461
9	dXqwq	3436
10	Otomachi_Una	3413

#	User	Contrib.
1	Qingyu	157
2	adamant	153
3	Um_nik	147
4	Proof_by_QED	146
5	Dominater069	145
6	errorgorn	142
7	cry	139
8	YuukiS	135
9	TheScrasse	134
10	chromate00	133

History