Xahandd's blog

By Xahandd, history, 7 years ago, In English

Recently i was trying to solve a problem ( Little Pony and Harmony Chest) and i encountered a weird thing.

At first i tried to solve it and i got TLE and then after some debugging(that came to no conclusion) i just tried to tinker with the code to see what happens and so i swapped two of my loop and MAGICALY the code got AC.

Has anyone encountered something like this before? and can anyone explain to me why?

these are the codes (you can compare them to see what is their difference):

AC

TLE

  • Vote: I like it
  • 0
  • Vote: I do not like it

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by Xahandd (previous revision, new revision, compare).

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by Xahandd (previous revision, new revision, compare).

»
7 years ago, # |
  Vote: I like it +50 Vote: I do not like it

Accessing memory in the correct order can make your code more cache-friendly and thus much faster. If you access many memory cells that are physically close to each other, the whole block can be kept in the cache.

As an example, try the following code:

const int N = 4096;
int A[N][N];

int main() {
    // initialization
    for (int a=0; a<N; ++a) for (int b=0; b<N; ++b) A[a][b] = rand();

    // iteration in the correct order
    int xor1 = 0;
    for (int a=0; a<N; ++a) for (int b=0; b<N; ++b) xor1 ^= A[a][b];
    cout << xor1 << endl;

    // iteration in the wrong order
    int xor2 = 0;
    for (int b=0; b<N; ++b) for (int a=0; a<N; ++a) xor2 ^= A[a][b];
    cout << xor2 << endl;
}

If you iterate in row major order, you are just reading the array from memory "from left to right". If you iterate in column major order, you are jumping around in memory and cache cannot help you.

On my machine, the second iteration takes cca. 40 times longer than the first one.

»
7 years ago, # |
  Vote: I like it -7 Vote: I do not like it

You used C++14 in the TLE solution, and C++11 in AC solution. There are some weird things in runtimes in C++14 vs C++11, pointed out here.