UPD: I've reduced the code size.
I've recently found that the following code produces wrong output.
#include <bitset>
#include <iostream>

const int N = 105;
std::bitset<N> ok[N][N];
int n = 5;

int main() {
  ok[2][2].set(2);
  for (int i = n; i; i--)
    for (int j = i; j <= n; j++) {
      ok[i][j] = ok[i][j] | ok[i + 1][j] | ok[i][j - 1];
    }
  std::cout << ok[2][5][2] << '\n';
  return 0;
}
Compiled with -O3 -mtune=skylake -march=skylake, the code outputs 0.
However, if you simulate the code by hand you will see that the correct answer should be 1: the bit set in ok[2][2] propagates along row 2 through the ok[i][j] |= ok[i][j - 1] step, so ok[2][5] must also have bit 2 set (see the sketch below).
Note that the compiler seems to generate incorrect SSE instructions.
Again, I believe this code is UB-free and does not rely on anything implementation-defined.
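Here is a minimal sketch of that simulation, replaying the same recurrence with a plain bool array instead of std::bitset so the vectorized bitset code path is not involved; at any optimization level it prints 1:

#include <iostream>

const int N = 105;
bool ok[N][N][N]; // ok[i][j][b] mirrors bit b of the original ok[i][j]
int n = 5;

int main() {
  ok[2][2][2] = true; // same as ok[2][2].set(2)
  for (int i = n; i; i--)
    for (int j = i; j <= n; j++)
      for (int b = 0; b < N; b++)
        ok[i][j][b] = ok[i][j][b] | ok[i + 1][j][b] | ok[i][j - 1][b];
  std::cout << ok[2][5][2] << '\n'; // prints 1: the bit propagates from (2,2) to (2,5)
  return 0;
}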

Further reduced:
I've submitted a bug. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116768
It seems that the trunk branch has been updated, and the bug is fixed. Thank you!
I bet AI couldn't do that
It's good with -O2. I am always scared to use -O3, and now I have an additional reason to keep being scared. While searching I found something related to -O3 and AVX that has apparently gone unaddressed for a long time (>10 years): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001. This may not be the same issue though, as I tried to adjust alignas and that didn't help.
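What I mean by adjusting alignas is roughly the following sketch (the 64-byte value is just an example I picked); as said above, it did not change the wrong output:

#include <bitset>
#include <iostream>

const int N = 105;
// Force the array onto a wider boundary (64 is only an example value),
// in case the miscompile depended on the array's alignment.
alignas(64) std::bitset<N> ok[N][N];
int n = 5;

int main() {
  ok[2][2].set(2);
  for (int i = n; i; i--)
    for (int j = i; j <= n; j++)
      ok[i][j] = ok[i][j] | ok[i + 1][j] | ok[i][j - 1];
  std::cout << ok[2][5][2] << '\n'; // still 0 under the affected -O3 build
  return 0;
}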