fatemetmhr's blog

By fatemetmhr, 2 years ago, In English

Hi!

Yesterday I was solving problem 1743C - Save the Magazines, when I faced a strange verdict. My code was getting WA with C++20 (176908643), while everything was ok with C++17 (176909474, 176989803). I tried looking for anything in my code that may cause undefined behavior, nothing found. Also there were something more strange as well; Like these two:

  • Changing the order of two variables in line 43 caused different verdict (176965105).

  • Adding a single line of cerr << 1;, again,caused different verdict (176965552).

and many other strange behaviours...

Here's a smaller code which has different verdicts in GNU G++20 11.2.0 (64 bit, winlibs) and GNU G++17 7.3.0 on Codeforces's custom test. You can check it out yourself, but as far as I know there's nothing special or causing undefined behaviour in this code:

Code

UPD: Simpler code by pajenegod

UPD 2: Extremely simpler code by nor

UPD 3: I also reported the problem on Bugzilla (Link)

What's really happening ?

It's all about tree-loop-distribute-patterns flag. You can learn about it here. But there's a problem in GCC 11.2. While compiling a code like:

for (...) {
    a[i] = 0;
    b[i] = 0;
    if (...) a[i] = a[i-1] + 1;
}

it generates something like:

memset(a, 0, sizeof a);
for (...) {
    if (...) a[i] = a[i-1] + 1;
}
memset(b, 0, sizeof b);

And that's why the verdict differs just by changing the order of those two variables. As you can see here:

  • First there's a memset in line 85.

  • Then the for will run.

  • Finally in line 122 there's the last memset, which destroys everything!

It seems that this problem starts from GCC 10.1, so GNU G++17 7.3.0 and GNU G++17 9.2.0 work correctly.

The default optimizition used by Codeforces for both C++20 and C++17 is O2. The flag tree-loop-distribute-patterns works automatically in O2 and later. So the code with O1 optimization will run correctly. In O3 and later, another flag, tree-vectorize, has been included, which fixes this bug; So, again, if you use O3 or Ofast it will run correctly.

But there's still something strange. The code with O0 works ok (177008931), also the one with an extra flag tree-loop-distribute-patterns wroks ok as well! (177008871)

Anyway! It's a serious bug, but it seems that ‍O2 in GCC 12 has the flag tree-vectorize. So currently, with this version of GCC which is the only available version for C++20 here, it isn't safe to use C++20 on Codeforces without extra flags (at least for me!). I highly recommend using C++17 instead of C++20 if possible, or using some extra flags like tree-vectorize, I truly hope this one won't cause any other strange behaviour!

Thanks to Davoth, ymmparsa and specially AaParsa that almost everything I shared was with great help of them. I was just the poor one who faced the problem :')

  • Vote: I like it
  • +242
  • Vote: I do not like it

| Write comment?
»
2 years ago, # |
  Vote: I like it +42 Vote: I do not like it

As an advocate of some features on C++20, I hope this bug is fixed as soon as possible!

»
2 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Is there report to bugzila? Not found in quick glance.

»
2 years ago, # |
  Vote: I like it +10 Vote: I do not like it

Wait, godbolt's GCC 11.2 gives correct output with flag, doesn't it?
https://godbolt.org/z/nGfb4Wfe6

»
2 years ago, # |
  Vote: I like it +8 Vote: I do not like it

It's not limited to C++20. The reason it only happens with C++20 on Codeforces is that other versions of C++ use different versions of GCC on Codeforces.

  • »
    »
    2 years ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    AFAIK, Codeforces uses winlibs for C++20. Could it be that this bug is limited to winlibs, and we need to report to the maintainer of winlibs for it to be fixed?

    • »
      »
      »
      2 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      No. It also happens on godbolt and my local Linux machine.

»
2 years ago, # |
Rev. 3   Vote: I like it +81 Vote: I do not like it

Here is a simpler example:

#include<bits/stdc++.h>
using namespace std;

int A[4];
int B[4];

int main()
{
    string s = "1";
    
    int a;
    cin >> a; // Input any int here, doesn't matter which one

    A[0] = 1000;
    for(int i = 1; i < 4; ++i) {
        B[i] = 0;
        A[i] = 0;
        if(s[0])
            B[i] = 1;
        A[i] = A[i - 1];
    }
    
    cout << A[3];
}

The printed value should clearly be 1000, but if you try in out in cf innvocation with g++20 it prints 0.

EDIT: An even more basic version can be found here

»
2 years ago, # |
Rev. 2   Vote: I like it +14 Vote: I do not like it

Yes, besides that the C++20 compiler on Codeforces has many more critical bugs. Many of them have already been written about in the post about the new version of the standard, but I haven't seen any mention of this one: If you want to use all gnu extensions, such as pbds, you need to remember some long inludes. But there is a very easy way around this. Instead of #include<bits/stdc++.h>, write #include<bits/extc++.h>. Yes, it takes longer to compile, but it can still be precompiled. So, #include<bits/extc++.h> does NOT work with C++20 on Codeforces, but it works with both C++17 compilers.

UPD: C++20 gives this error:

Error
»
2 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Managed to do it without std::string, which makes it possible in pure C, which also has the bug. Also thanks to pajenegod for the simpler version.

Code
  • »
    »
    2 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Ah nice. I realized that you can even trigger the bug just using an int array containing a single 0. No need to use strings. Something like a global vector<int> works too.

    Code
    • »
      »
      »
      2 years ago, # ^ |
        Vote: I like it -11 Vote: I do not like it

      And speaking of an int array containing a single 0, we know that B contains 0 initially, so we can use B itself instead (yeah, wtf)

      code
      • »
        »
        »
        »
        2 years ago, # ^ |
          Vote: I like it -21 Vote: I do not like it

        that's not how pointers work. Also B is not guaranteed to contain 0 initially.

        • »
          »
          »
          »
          »
          2 years ago, # ^ |
            Vote: I like it +8 Vote: I do not like it

          Are you sure about both of these claims?

    • »
      »
      »
      2 years ago, # ^ |
        Vote: I like it +33 Vote: I do not like it

      You don't need any I/O to trigger this bug; all that matters is that there is some side effect. Since the return value of main is a side effect, you can get rid of all dependencies on any libraries. Also, the if is not necessary, so you get the following mind-boggling version:

      int a[4], b[4], *A = a;
      int main() {
          a[0] = 1;
          for (int i = 1; i < 4; ++i)
              b[i] = 0, a[i] = 0, a[i] = A[0], a[i] = a[i - 1];
          return a[3];
      }
      
      • »
        »
        »
        »
        2 years ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it
        int a[4], b[4], c = 0, *C = &c;
        int main() {
            a[0] = 1337;
            for (int i = 1; i < 4; ++i, ++*C) {
                b[i] = 0;
                a[i] = 0;
                a[i] = a[i - 1];
            }
            return a[3];
        }
        

        Even this works

        • »
          »
          »
          »
          »
          2 years ago, # ^ |
            Vote: I like it -8 Vote: I do not like it
          int a[4], b[4], *A;
          int main() {
              a[0] = 1;
              for (int i = 1; i < 4; ++i) {
                  b[i] = 0;
                  a[i] = 0;
                  a[i] = *A + a[i-1];
              }
              return a[3];
          }
          

          Even this :/

          • »
            »
            »
            »
            »
            »
            2 years ago, # ^ |
              Vote: I like it +8 Vote: I do not like it

            That one looks like UB to me. You cannot dereference a nullpointer like that.

            • »
              »
              »
              »
              »
              »
              »
              2 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              It is ub. My point is, it completely ignores a[i] = *A + a[i-1]. The compiled code is not different if I defined A properly, except the definition of A.

              • »
                »
                »
                »
                »
                »
                »
                »
                2 years ago, # ^ |
                  Vote: I like it +11 Vote: I do not like it

                The idea behind UB is that you can't rely on the compiler for anything at that point -- it is the equivalent of the standard giving up on handling that case (and often for a good reason, like it being a case where you can't really define a behaviour (like dereferencing a nullptr), or one that is almost never needed and can be done in a different way, and not handling it will make it easier to better optimize the code). The compiler is free to do anything at that point; including making demons fly out of your nose. And it does ignore the nullptr dereference completely, not affecting the codegen much.

»
2 years ago, # |
  Vote: I like it +24 Vote: I do not like it

To summarize for people who just want to avoid weird bugs due to this issue, one possible solution is to use pragmas (the ones in the TL;DR should be fine).

  • »
    »
    2 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I'd more like to just not to itemwise arrays and use memset/fill/rely on zero-initialization :)

    • »
      »
      »
      2 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      use memset/fill/rely on zero-initialization

      Remember that a single pair of curly brackets can fill a whole array with zeroes on initialization ;)