Hi!
Yesterday I was solving problem 1743C - Save the Magazines, when I faced a strange verdict. My code was getting WA with C++20 (176908643), while everything was ok with C++17 (176909474, 176989803). I tried looking for anything in my code that may cause undefined behavior, nothing found. Also there were something more strange as well; Like these two:
Changing the order of two variables in line 43 caused different verdict (176965105).
Adding a single line of
cerr << 1;
, again,caused different verdict (176965552).
and many other strange behaviours...
Here's a smaller code which has different verdicts in GNU G++20 11.2.0 (64 bit, winlibs)
and GNU G++17 7.3.0
on Codeforces's custom test. You can check it out yourself, but as far as I know there's nothing special or causing undefined behaviour in this code:
UPD: Simpler code by pajenegod
UPD 2: Extremely simpler code by nor
UPD 3: I also reported the problem on Bugzilla (Link)
What's really happening ?
It's all about tree-loop-distribute-patterns
flag. You can learn about it here. But there's a problem in GCC 11.2
. While compiling a code like:
for (...) {
a[i] = 0;
b[i] = 0;
if (...) a[i] = a[i-1] + 1;
}
it generates something like:
memset(a, 0, sizeof a);
for (...) {
if (...) a[i] = a[i-1] + 1;
}
memset(b, 0, sizeof b);
And that's why the verdict differs just by changing the order of those two variables. As you can see here:
First there's a
memset
in line 85.Then the
for
will run.Finally in line 122 there's the last
memset
, which destroys everything!
It seems that this problem starts from GCC 10.1
, so GNU G++17 7.3.0
and GNU G++17 9.2.0
work correctly.
The default optimizition used by Codeforces for both C++20 and C++17 is O2
. The flag tree-loop-distribute-patterns
works automatically in O2
and later. So the code with O1
optimization will run correctly. In O3
and later, another flag, tree-vectorize
, has been included, which fixes this bug; So, again, if you use O3
or Ofast
it will run correctly.
tree-vectorize
-> AC 177021996O1
-> AC 176935274O2
-> WA 176934914O3
-> AC 176935180Ofast
-> AC 176935136O2
+tree-vectorize
-> AC 176984080O2
+no-tree-loop-distribute-patterns
-> AC 176984219O1
+tree-loop-distribute-patterns
-> WA 176984329O3
+no-tree-vectorize
-> WA 176998110Ofast
+no-tree-vectorize
-> WA 176998186
But there's still something strange. The code with O0
works ok (177008931), also the one with an extra flag tree-loop-distribute-patterns
wroks ok as well! (177008871)
Anyway! It's a serious bug, but it seems that O2
in GCC 12 has the flag tree-vectorize
. So currently, with this version of GCC which is the only available version for C++20 here, it isn't safe to use C++20 on Codeforces without extra flags (at least for me!). I highly recommend using C++17 instead of C++20 if possible, or using some extra flags like tree-vectorize
, I truly hope this one won't cause any other strange behaviour!
Thanks to Davoth, ymmparsa and specially AaParsa that almost everything I shared was with great help of them. I was just the poor one who faced the problem :')
As an advocate of some features on C++20, I hope this bug is fixed as soon as possible!
Is there report to bugzila? Not found in quick glance.
Wait, godbolt's GCC 11.2 gives correct output with flag, doesn't it?
https://godbolt.org/z/nGfb4Wfe6
This bug might be limited to MinGW/winlibs/some other Windows port of GCC, considering the fact Codeforces uses Windows for its judge servers.
The code in the link is a bit different from my first submission. Try the code with test 2 of the problem, or change lines 37 and 38 with
dp[0][i] = dp[1][i] = 0;
https://godbolt.org/z/95xMqazcM
Yes, got it. 10.1 correct, 10.2 incorrect.
https://www.diffchecker.com/diff — left is 10.1, right is 10.2, and there is extra memset
Was it reported to bugzila?
I've just reported the problem, Link
It's not limited to C++20. The reason it only happens with C++20 on Codeforces is that other versions of C++ use different versions of GCC on Codeforces.
AFAIK, Codeforces uses winlibs for C++20. Could it be that this bug is limited to winlibs, and we need to report to the maintainer of winlibs for it to be fixed?
No. It also happens on godbolt and my local Linux machine.
Here is a simpler example:
The printed value should clearly be 1000, but if you try in out in cf innvocation with g++20 it prints 0.
EDIT: An even more basic version can be found here
Yes, besides that the C++20 compiler on Codeforces has many more critical bugs. Many of them have already been written about in the post about the new version of the standard, but I haven't seen any mention of this one: If you want to use all gnu extensions, such as pbds, you need to remember some long inludes. But there is a very easy way around this. Instead of
#include<bits/stdc++.h>
, write#include<bits/extc++.h>
. Yes, it takes longer to compile, but it can still be precompiled. So,#include<bits/extc++.h>
does NOT work with C++20 on Codeforces, but it works with both C++17 compilers.UPD: C++20 gives this error:
Can't compile file: In file included from c:\programs\gcc11-64-winlibs\include\c++\11.2.0\x86_64-w64-mingw32\bits\extc++.h:81, from program.cpp:1: c:\programs\gcc11-64-winlibs\include\c++\11.2.0\ext\codecvt_specializations.h:40:10: fatal error: iconv.h: No such file or directory 40 | #include <iconv.h> | ^~~~~~~~~ compilation terminated.
Managed to do it without
std::string
, which makes it possible in pure C, which also has the bug. Also thanks to pajenegod for the simpler version.Ah nice. I realized that you can even trigger the bug just using an int array containing a single 0. No need to use strings. Something like a global
vector<int>
works too.And speaking of an int array containing a single 0, we know that B contains 0 initially, so we can use B itself instead (yeah, wtf)
that's not how pointers work. Also B is not guaranteed to contain 0 initially.
Are you sure about both of these claims?
You don't need any I/O to trigger this bug; all that matters is that there is some side effect. Since the return value of
main
is a side effect, you can get rid of all dependencies on any libraries. Also, the if is not necessary, so you get the following mind-boggling version:Even this works
Even this :/
That one looks like UB to me. You cannot dereference a nullpointer like that.
It is ub. My point is, it completely ignores
a[i] = *A + a[i-1]
. The compiled code is not different if I definedA
properly, except the definition ofA
.The idea behind UB is that you can't rely on the compiler for anything at that point -- it is the equivalent of the standard giving up on handling that case (and often for a good reason, like it being a case where you can't really define a behaviour (like dereferencing a nullptr), or one that is almost never needed and can be done in a different way, and not handling it will make it easier to better optimize the code). The compiler is free to do anything at that point; including making demons fly out of your nose. And it does ignore the nullptr dereference completely, not affecting the codegen much.
To summarize for people who just want to avoid weird bugs due to this issue, one possible solution is to use pragmas (the ones in the TL;DR should be fine).
I'd more like to just not to itemwise arrays and use memset/fill/rely on zero-initialization :)
Remember that a single pair of curly brackets can fill a whole array with zeroes on initialization ;)