### PeruvianCartel's blog

By PeruvianCartel, 2 months ago,

As you guys may have noticed, in a Div 4 contest on Codeforces, the system testing phase and queue usually takes like forever (a submission can get stuck in the queue for up to 30 minutes, and the system testing phase sometimes last literally like half a day). So why don't problem setters just... not do that?

What I mean is, why do author deliberately made both the time limit and the constraint so large? I'm actually really confused, since Div 4 is well known for its slow queue, so at some point someone should've thought about enforcing guidelines regarding setting the appropriate constraint for optimal judging speed in "crowded" contests, such as Div 3 and Div 4.

To really understand what I meant, let's take a look at this problem: 1980C — Sofia and the Lost Operations. This problem has a time limit of 2 seconds, as well as the maximum sum of $n, m$ over all testcase of $2 * 10^5$. You see, there's actually no reason to not half both the time limit and the constraint, since it's not like we are going to let $O(n * log (n))$ algorithm pass while failing $O(n ^ {1.5})$, $O(n ^ {1.75})$ ones. When you do that, inherently, nothing changed about the problem, and there are numerous merits in return, such as faster queue, faster judging time, happier participants, etc. So... what is stopping authors from setting the constraint to {$1$ second, $n \leq 10^5$}, or even as low as {$0.5$ second, $n \leq 5*10^4$}. I respect problem setters decision to set the constraint of their problem to a particular number, but I simply think there are things that can be done to make every party happy.

Again, this doesn't mean that every problem's constraint should be lowered. For example, this problem: 1985H2 — Maximize the Largest Component (Hard) works best with the current constraint, as going any lower would means a well implemented, simple brute-force with the complexity of $O(n * m * min(n,m))$ would be near indistinguishable to a intended, but badly implemented solution.

• +33

By PeruvianCartel, 8 months ago,

Hi guys. So, Centroid Decomposition, I never really paid attention to its runtime. That is until I was doing a problem (in an OI contest) that goes kinda like this:

Statement: Given a tree of $N$ nodes ($N \leq 4*10^5$) that are not colored. There are two types of queries:
- Color a node.
- Among the nodes that are colored, print the furthest distance between two colored node.
Time limit: 1 second.

This problem looks like the classic and beloved E — Xenia and Tree. Therefore, I just used Centroid Decomposition without giving it much thought. And to my surprise, I got FST because my Centroid Decomposition code was too slow. And I was really confused. You know, an $O(N * log N)$ algorithm getting TLE with $N = 10^6$ sounds unlikely already, let alone $N = 4*10^5$. Not only that, it took twice as long as I would have liked.

So... Why is that? Why is centroid decomposition so slow? I mean, it's literally find centroid of a tree, remove that node and do the same for all of the subtrees, ooga booga and you're done, so I really have no clue what is holding it back, and how do you optimize Centroid Decomposition (aka what is the best way to implement Centroid Decomposition for speed without touching black magic such as SIMD and pragma)? Here is my Centroid Decomposition implementation: 237313601

• +3

By PeruvianCartel, history, 8 months ago,

Hi guys. I was upsolving OI problems, when I stumbled upon this:

Statement: There are $n$ soldiers standing in a line, whose heights are distinct. A dude can be seen from the left side if there are no other dude to the left of him that is taller than him, and the same goes for the right side. Given $n$, $p$, $q$ ($n \leq 2000$), count how many way to arrange these dudes such that there are exactly $p$ dudes that can be seen from the left side, and $q$ dudes that can be seen from the right side. TL = $1$ second on a judge that runs as fast as Codeforces.

My idea: First we solve the case $q = 1$. For the sake of simplicity, we will assume the heights of these soldiers as a permutations of integers from $1$ to $n$. We can use dp: $f[i][j][k] =$ number of way to assign soldiers for the first $i$ position, such that the tallest soldiers that we assigned has the height $j$, and you can see $k$ dudes from the left side. From there it's just simple dp. For the general case, the answer is simply $\sum_{t = 1}^n (f[t-1][t-1][p-1] * f[n-t][n-t][q - 1] * C_{n-1} ^ {i-1})$. Sadly, this runs in $O(n^3)$, since our dp table has $n^3$ states.

I've discussed this problem with another dude who ranked Master on Codeforces, and we are proud to say that we can not cook, so any help would be appreciated. Thanks!

• +9

By PeruvianCartel, history, 8 months ago,

Hi guys. When I was doing a problem that basically requires a reasonably fast maximum flow template, one thought came across my mind: Is it possible to hack Ford-Fulkerson(DFS) + blocking flow + randomization? I know that using BFS to run Ford-Fulkerson is much safer, as there is a test generator (It's for MCMF, which is literally DFS Ford-Fulkerson) that make the basic DFS Ford-Fulkerson runs in $F$ iterations, resulting in the complexity of $O(EF)$ ($E$ is number of edges, $F$ is the maximum flow of the network).

However, I just really want to use DFS Ford-Fulkerson, since you can do blocking flow on it, which grants some pretty bullsh*t performance boost on average (and it's trivial to implement). So... is there any close upper-bound on the complexity of DFS Ford-Fulkerson given these optimization? And if my optimization sucked, is there a test generator to blow it up? (Obviously I can get godlike luck and find the maximum flow on the first iteration. Conversely, RNGesus could just walked my algorithm right into the worst case, so we are only counting the average case here)
• +17

By PeruvianCartel, history, 8 months ago,

Hi guys, in this problem: CSES: Maximum Building II, the constraint for $n$, $m$ was $1000$, which sounds fishy. Therefore, I tried the trivial $O(n^3)$ brute, and surprisingly it AC with the slowest testcase being 0.9s (I guess it's because the operations was so simple and predictable that the compiler and CPU just do some hocus pocus and speed it up like 10 times).
So... why did the author decide to set the constraint that low? You can solve that problem in $O(n^2)$, with good constant, so it's not like you have to be extremely generous when it comes to time limit to let these solutions pass, so what is the motive behind it? Was the $O(n^3)$ brute the intended algorithm? (which doesn't really make sense, since it's supposed to be a hard problem????)

• +20