tatianyi's blog

By tatianyi, history, 11 days ago, In English

This time, Auroraa_ is solving 524E - Rooks and Rectangles. After some trial, he got

this code

and then received TLE: 260160406

He was surprised, so he tried it again with exact same code, with C++17 (32). You know what? The AC magic happened again: 260161644.

At first glance, we do think some code has triggered undefined behaviours like Access out of bounds. But Auroraa_ denied that, because all the data could be fit in int32_t.

Then came the dramatic scene: by simply changing the code vector<int>v1[N], v2[N]; to vector<long long>v1[N], v2[N];, the strange TLE problem disappeared: 260164936.

After that, my further investigation(260199113) pointed out that the block causes TLE is:

for (int i = 1; i <= q; i++) {
	int x1, y1, x2, y2;
	cin >> x1 >> y1 >> x2 >> y2;
	vec.push_back({ x1, y1, x2, y2, i });
}

It took over 2000ms(260199205) to write data into this vec. And only took about 100ms(260199447) to do the same thing if we just changed vector<int>v1[N], v2[N]; to vector<long long>v1[N], v2[N];. The weirdest thing is, there is no direct connection between v1,v2 and vec.


Recalling the process just described, coincidentally, this modification process is really similar to the blog Strange TLE by cin using GNU C++20 (64), and some weird subtle changes to (surprisingly) fix it.

And like this blog Slowdown bug affecting C++ (64 bit) on Codeforces, this time we encountered a Pandora's Box that used a different magic number to open. Although MikeMirzayanov has updated the CF system to

Windows Server 2022

, there is still something fishy here.

The new version of GCC and OS passed the "test" on You won't believe how this simple trick defeated the unexplained bug destroying every top LGM; also included other languages such as Rust. The script mentioned there does not work for Rust now (both in custom test and on my Windows 11 computer) though.

However, when I translate the script to C++:

#include <vector>
#include <chrono>
#include <cstdio>

int main() {
    constexpr size_t n = 30000;
    auto vec = std::vector<std::vector<int>>(n);
    for (size_t i = 10000; i < 20000; i++) {
        vec.resize(i);
        for (size_t j = 0; j < i; j++) {
            vec[j].reserve(7);
        }
        auto start = std::chrono::system_clock::now();
        for (size_t j = 0; j < 10000; j++) {
            vec[i - 1].reserve(7);
            vec[i - 1].shrink_to_fit();
        }
        auto elapsed = std::chrono::duration<double, std::milli>(std::chrono::system_clock::now() - start).count();
        if (elapsed > 40) {
            std::printf("%zu %gms\n", i, elapsed);
        }
    }
}

and also the bomb:

#include <vector>

int main() {
    constexpr size_t n = 30000;
    constexpr size_t evil = 17182;
    auto vec = std::vector<std::vector<int>>(n);
    vec.resize(evil);
    for (size_t i = 0; i < evil; i++) {
        vec[i].reserve(7);
    }
    for (size_t i = 0; i < size_t(1e6); i++) {
        vec[evil - 1].reserve(7);
        vec[evil - 1].shrink_to_fit();
    }
}

And it generated the output Used: 5514 ms, 0 KB in the custom test.

As usual, I changed size_t evil = 17182; to size_t evil = 17183;, and then Used: 93 ms, 0 KB!

The combined version that automatically finds the bomb.

UPD:

After some tests, I figured out and confirmed that even UCRT version of the GCC toolchain would cause the bug, with the evil constant 11188.

Then, is there any solution for CodeForces which runs on the Windows platform?

UPUPD:

Even MSVC would also encounter this if using the combined bomb-finder script aforementioned, but would not trigger if just using the first script and then using the bomb.

I used the same script aforementioned to test Clang-cl/MSC++, and every time I run the compiled code, its output is literally random (and the time elapsed for each "stuck" is very small). I then choose any output from the script and assign it to the bomb code, and the bug doesn't come out.

And maybe one solution is using Clang-cl/MSC++. However, these two both use MSSTL, which does not support long double more than 64-bit (which means long double is almost identical to double). What's more, they are using the std::_Signed128 class for 128-bit support and #include "__msvc_all_public_headers.hpp" for "precompile header", which is somewhat not accustomed for many competitive programmers (The Clang-cl supports VLAs, although this syntax is not in standard).

However, the GCC is not originally designed for Windows, and the current choices of MinGW are not perfect and even deadly for competitive programming purposes. Is there any solution to use GCC on the Windows platform, and not modifying the whole system too much, and not using WSL2 or something?

There it is! Another answer that CodeForces almost used in the past.

Called it "almost", is because months ago, the language list contained one "GNU G++17 9.2.0 (64 bit, MSYS2)" for 64-bit C++ language support. And the very solution is here, the MSYS2 project. However, it seems that CF used the mingw64/mingw-w64-x86_64-toolchain, not the MSYS2 itself.


What is MSYS2?

MSYS2 is a collection of tools and libraries providing you with an easy-to-use environment for building, installing and running native Windows software.

It consists of a command line terminal called mintty, bash, version control systems like git and subversion, tools like tar and awk and even build systems like autotools, all based on a modified version of Cygwin. Despite some of these central parts being based on Cygwin, the main focus of MSYS2 is to provide a build environment for native Windows software and the Cygwin-using parts are kept at a minimum. MSYS2 provides up-to-date native builds for GCC, mingw-w64, CPython, CMake, Meson, OpenSSL, FFmpeg, Rust, Ruby, just to name a few.

So, what about its result for the script and the bomb test with its native GCC?

The result is pretty neat! On my computer, the script outputs something only if I switch if (elapsed > 40) to if (elapsed > 1) showing it has some particular memory allocation procedure. What's more, none of its output could be used as a bomb. As I've learned, the program compiled with MSYS2 native GCC would act like it's in the Linux environment, and now its performance is sure as if testing the script in other Linux-based OJ such as AtCoder.

Therefore, here comes my thought of solutions: add Clang-cl/MSC++2022 as an alternative, and install MSYS2's native GCC (by using pacman -S gcc clang in msys2.exe, and it will install GCC 13.2 in %MSYS_HOME%/usr/bin) and Clang 11.0 (pacman -S clang which will be at the same folder as GCC. However, the version of Clang here is rather old). As for other languages, there are also so many of them available in the MSYS2 package repository, with an easy-to-use command shell for maintenance.

The old-stable MinGW(64) GCCs could still be there, as these operations are experimental.

  • Vote: I like it
  • +111
  • Vote: I do not like it

»
10 days ago, # |
  Vote: I like it +5 Vote: I do not like it

Auto comment: topic has been updated by tatianyi (previous revision, new revision, compare).

»
10 days ago, # |
  Vote: I like it 0 Vote: I do not like it

I also came across another problem with GNU G++20 while solving problem E in today's round

for some reason this submission here 260462201 gets RE verdict while this one here 260462193 (which is the same code but using GNU G++17) gets an AC verdict.

Seems to me like there still exists some bugs with this compiler.

  • »
    »
    10 days ago, # ^ |
    Rev. 3   Vote: I like it +10 Vote: I do not like it

    It's because your code contains out of bound access. Blaming the compiler for errors should be the last resort.

    • »
      »
      »
      10 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Why would it pass on one compiler but not the other?

      • »
        »
        »
        »
        10 days ago, # ^ |
          Vote: I like it +11 Vote: I do not like it

        Because the code is optimized differently, the pattern of memory access or memory layout will be different. Whether you get a runtime error heavily depends on those subtle details.