This is the snippet I have for FFT. But I feel it is slow in many cases.
One example is http://mirror.codeforces.com/problemset/submission/958/41610928, which takes nearly 3.85 seconds to pass for n=200000.
Can somebody help me optimise it?
Thanks.
Since the modulus is small (1009) in this task, you don't need the `sqrt(MOD)` hack and can multiply the two polynomials directly (but don't forget to reduce the product modulo 1009 after each multiplication).

Yeah. In this case I did that to get AC. But how can I improve it in general?
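
On the small-modulus point, a rough bound shows why a plain floating-point FFT product is enough here. This is only a sketch: the helper name `max_product_coeff` and the constant `kExactDoubleLimit` are hypothetical, and n=200000 is taken from the submission mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Largest possible coefficient of a product of two polynomials that each have
// at most `terms` coefficients, all in the range [0, mod).
// (Hypothetical helper, for illustration only.)
std::int64_t max_product_coeff(std::int64_t terms, std::int64_t mod) {
    assert(terms > 0 && mod > 1);
    return terms * (mod - 1) * (mod - 1);
}

// A double represents every integer up to 2^53 exactly.
const std::int64_t kExactDoubleLimit = std::int64_t(1) << 53;
```

For 200000 terms and modulus 1009 the bound is 200000 * 1008 * 1008 = 203212800000, about 2 * 10^11, which is well below 2^53 (about 9 * 10^15). That leaves a margin for FFT rounding error, as long as the inputs are reduced modulo 1009 before every multiplication. With a modulus near 10^9 the same bound is around 2 * 10^23, which is why the `sqrt(MOD)` splitting is needed in that case.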
In the case of this problem, maybe I'm blind, but I see you using `fft_modulo` instead of `mul`, and `fft_modulo` definitely uses the `sqrt(MOD)` hack.

Regarding optimizing in the general case: I'm sure I've read many articles, but I can't find anything now (maybe because they are in Russian). But the standard trick is to precalculate all roots (`wlen` and `w` in your code); it gives a boost when `mul` is called many times. Precalculating the bit reversal (or calculating it in linear time) can give a small speedup.

Anyway, you can always look at the FFT implementations of top participants if you have enough time and patience.
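
The two tricks above (a table of roots instead of updating `w` by `wlen` inside the loop, and a linear-time bit-reversal table) could look roughly like the sketch below. It is not the code from the submission: the names `FftTables`, `make_tables`, `fft` and `multiply_mod` are made up for illustration, and `multiply_mod` assumes all input coefficients are already in [0, mod) with a small modulus such as 1009.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Precomputed data for transforms of size n (n must be a power of two).
struct FftTables {
    int n;
    std::vector<int> rev;   // rev[i] = i with its log2(n) bits reversed
    std::vector<cd> roots;  // roots[len + j] = exp(i * pi * j / len), len = 1, 2, 4, ..., n/2
};

FftTables make_tables(int n) {
    assert(n >= 1 && (n & (n - 1)) == 0);
    FftTables t{n, std::vector<int>(n, 0), std::vector<cd>(n)};
    // Linear-time bit reversal: rev[i] is derived from rev[i >> 1].
    for (int i = 1; i < n; ++i)
        t.rev[i] = (t.rev[i >> 1] >> 1) | ((i & 1) ? (n >> 1) : 0);
    // Each root is computed directly, so no error accumulates from repeated multiplication.
    const double PI = std::acos(-1.0);
    for (int len = 1; len < n; len <<= 1)
        for (int j = 0; j < len; ++j)
            t.roots[len + j] = std::polar(1.0, PI * j / len);
    return t;
}

// In-place iterative FFT; invert = true computes the inverse transform.
void fft(std::vector<cd>& a, const FftTables& t, bool invert) {
    const int n = t.n;
    assert(static_cast<int>(a.size()) == n);
    for (int i = 0; i < n; ++i)
        if (i < t.rev[i]) std::swap(a[i], a[t.rev[i]]);
    for (int len = 1; len < n; len <<= 1)
        for (int i = 0; i < n; i += 2 * len)
            for (int j = 0; j < len; ++j) {
                cd w = t.roots[len + j];  // table lookup replaces w *= wlen
                if (invert) w = std::conj(w);
                const cd u = a[i + j];
                const cd v = a[i + j + len] * w;
                a[i + j] = u + v;
                a[i + j + len] = u - v;
            }
    if (invert)
        for (cd& x : a) x /= static_cast<double>(n);
}

// Product of a and b with every coefficient reduced modulo mod.
// Assumes 0 <= a[i], b[i] < mod and a modulus small enough for double precision.
std::vector<int> multiply_mod(const std::vector<int>& a, const std::vector<int>& b, int mod) {
    if (a.empty() || b.empty()) return {};
    const int need = static_cast<int>(a.size() + b.size()) - 1;
    int n = 1;
    while (n < need) n <<= 1;
    const FftTables t = make_tables(n);  // shared by all three transforms below
    std::vector<cd> fa(n), fb(n);
    for (std::size_t i = 0; i < a.size(); ++i) fa[i] = cd(static_cast<double>(a[i]), 0.0);
    for (std::size_t i = 0; i < b.size(); ++i) fb[i] = cd(static_cast<double>(b[i]), 0.0);
    fft(fa, t, false);
    fft(fb, t, false);
    for (int i = 0; i < n; ++i) fa[i] *= fb[i];
    fft(fa, t, true);
    std::vector<int> res(need);
    for (int i = 0; i < need; ++i)
        res[i] = static_cast<int>(std::llround(fa[i].real()) % mod);
    return res;
}
```

For example, `multiply_mod({1, 2, 3}, {4, 5, 6}, 1009)` gives the coefficients of (1 + 2x + 3x^2)(4 + 5x + 6x^2). In this sketch the tables are rebuilt on every call to keep it short. The `roots[len + j]` layout does not depend on n, so a solution that multiplies many times could build the roots once for the largest size and reuse them.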
Oh sorry. I misunderstood your first comment.
And thanks for the pre-calculation tips.