Can we solve problem 1920C - Partitioning the Array faster than O(n*d(n)) ≈ O(n^(4/3))?
I'm getting interested in it.
# | User | Rating |
---|---|---|
1 | tourist | 3985 |
2 | jiangly | 3814 |
3 | jqdai0815 | 3682 |
4 | Benq | 3529 |
5 | orzdevinwang | 3526 |
6 | ksun48 | 3517 |
7 | Radewoosh | 3410 |
8 | hos.lyric | 3399 |
9 | ecnerwala | 3392 |
9 | Um_nik | 3392 |
# | User | Contrib. |
---|---|---|
1 | cry | 169 |
2 | maomao90 | 162 |
2 | Um_nik | 162 |
4 | atcoder_official | 161 |
5 | djm03178 | 158 |
6 | -is-this-fft- | 157 |
7 | adamant | 155 |
8 | awoo | 154 |
8 | Dominater069 | 154 |
10 | luogu_official | 150 |
Can we solve problem 1920C - Partitioning the Array faster than O(n*d(n)) ≈ O(n^(4/3))?
I'm getting interested in it.
Name |
---|
Of course.
How could we solve the problem in O(N * d(N))? For each divisor we need to calculate the GCD of the differences of each index [0..k-1] which would result in a final complexity of O(N log N d(N)) if I am not mistaken.
Taking the gcd (using the Euclidean algorithm) of elements in an array of size $$$n$$$ and max element $$$x$$$ takes $$$O(n + \log x)$$$ time.
Ah, my ignorance of basic Number Theory shows once again. Maybe I could have passed with much better time if I implemented the GCD function using the Euclidean algorithm instead of using __gcd, because I got close 1.5 seconds runtime in-con (which seems around N log N d(N)).
Upd: According to https://stackoverflow.com/questions/24981380/implementation-of-the-in-built-gcd-method-in-c, __gcd is already implemented using the Euclidean algorithm... Maybe the constants are blowing up runtime
Your solution is $$$O(n\log n\ d(n))$$$ since you're sorting all elements for every divisor.
You both are completely correct, it appears that I may have been a bit too drunk off of the buzzer-beater AC on C to properly analyze my solution.
The most recent gcd implementation (at least in the libstdc++ version I have) uses binary GCD, which doesn't enjoy this nice property. Binary GCD in general is pretty fast, though.
(Here's a reference for why the claim I mentioned earlier is true).
I went through your implementation, and it looked like you were doing some other operations and sorting too, which could have had an extra overhead.
For references, here are some links to my implementations with Euclidean GCD (300ms), inbuilt GCD (650ms), inbuilt GCD without pragmas (700ms). The fact that there is a slight speedup with pragmas might mean that the binary GCD part is getting sped up using specific BMI/BMI2 instructions.
btw
__gcd
is probably always euclideanstd::gcd
— 241431575 ~950 ms__gcd
: — 241568039 ~400 msI find this quite interesting, since
std::__gcd
is libstdc++'s own implementation of GCD, and a divergence betweenstd::gcd
andstd::__gcd
points to either forgetting to update it, or some other more important reason that they might have found.Can you explain how my most naive GCD implementation runs in 187ms? I can only imagine that comparing elements in different order
break
s faster in particular test suite.Thanks for the analysis, I have removed the mentioned overheads and now it indeed performs much better (259 ms).
Can you please explain how calculating d(n) is O(n^(1/3))?
https://mirror.codeforces.com/blog/entry/14463?#comment-710155
Thanks! I used this method to find divisors for C 241454777 (I am referring to printDivisor function). Can you tell me whether this implementation is a good implementation in this questions context or the method used in the editorial is better?
I think the problem is solvable in something like $$$n\cdot\log(n)^2$$$, more precisely, it's $$$\sigma(n)\cdot\log(n)\cdot\omega(n)$$$, where $$$\sigma(n)$$$ is the sum of divisors of $$$n$$$ and $$$\omega(n)$$$ is the number of prime factors of $$$n$$$.
First, we come up with a function $$$g(x,d) \ \forall \ d \ |\ n, 0\le x < d$$$ (there are $$$\sigma(n)$$$ such values we need to calculate), and we'll calculate these values in decreasing order of the divisors.
We define
Now, notice that if $d = p \cdot q$ for some prime factor $$$p$$$ of $$$d$$$, then
which takes $O(p)$ time to compute for a given $$$y$$$. Since there are $$$q$$$ such values, the total computation time is $$$O(p\cdot q) = O(d)$$$ So, when iterating over the divisors in decreasing order, after we've computed the values of $$$g(x,d)$$$ for all $$$0 \le x < d$$$, we'll be doing this computation for all prime factors of $$$d$$$, which makes the overall complexity $$$O(d\cdot \omega(d))$$$. Since we're iterating in descending order of divisors, when we reach a given divisor, we would have already computed its set of values, and we can do the similar thing for all its prime factors.
Total complexity hence turns out to be $$$\sum_{d \ | \ n} O(d \cdot \omega(d)) = \sigma(n) * \omega(n)$$$. I believe the complexity could even be brought further down if we are somehow smarter about which prime factors to consider, as one divisor would have been calculated multiple times in this approach.