I was solving the Planets Queries I problem in CSES (a standard binary lifting problem).
After getting a few TLE on one of the test cases, I finally got AC, but it took 1.00 seconds to pass that test case, which is way too close. On checking other submissions with exact same logic, and investigating a bit more, I changed vector<vector<int>> declaration in my code to static int array declaration, and my code, with everything being exact same, worked in 0.60 seconds.
Usually there shouldn't be such a drastic difference. While solving codeforces problem, I have observed that vectors and static array take the same time, and sometimes vectors are faster.
Can anyone explain the reason behind this? Thanks.
Submission using vector<vector<int>> : https://cses.fi/paste/7d2b7a0b66df52a2ea28b6/
Submission using static int array : https://cses.fi/paste/6d8b9ef1a192caa1ea28f1/
Update: In my implementation, If I did vector<vector<int>> node(31, vector<int>>(n+1)) instead of vector<vector<int>> node(n+1, vector<int>>(31)), basically swap i and j, the same test case now passes in 0.55 seconds.
Basically in my original implementation, let's say I call dp[i][j], I was fixing j and iterating over i, which is horrible from cache locality POV. It doesn't help that vector allocates multiple heap blocks instead of one continous block. (There was a recent blog on this topic, refer that) Instead of this, fixing i and iterating over j, is much more cache friendly and makes a drastic difference in time.
It doesn't make a drastic difference in static arrays since unlike vector, they are not fragmented. (I am not exactly sure about the reason, will research a bit and then update)
For comparison, I did the same thing with my static array implementation, and now the static array implementation runs in 0.55 seconds as well. There isn't any time difference here.








Auto comment: topic has been updated by Argentum47 (previous revision, new revision, compare).
vector allocates many non contiguous heap blocks, causing extra pointer dereferences.But 2D array is one contiguous block.
Yeah it makes some difference but not much. I just updated the blog with the actual reason, and now both implementations take the same time.
Cache optimization is the key. The main difference here is that since vector allocated many non contiguous heap blocks, and in each call I was accessing different blocks, it was horrible from cache optimization point of view. (The point about non contiguous heap blocks made me realize the key mistake, thanks for your help! )
it takes some time for vector to allocate dynamic memory unlike for an array, its memory goes right onto the stack
Just read about it and tried a couple of things, This makes array slightly faster in practice but I don't think it is enough to make such a drastic difference?
every time you have a 2d matrix you should consider making it a vector(n*n) with pointer arithmetic. Same O(n^2) algo can run with drastically different constants considering allocations and cache locality
vectors usually reserve more space in the memory as they are dynamic arrays and their size can change.
when nesting vectors (vector<vector<vector<vector.......vector) always have the largest vector size at the very end, in your case it should be
vector<vector<int>> node(31, vector<int>(n + 1));.it is generally much better to create less vectors as there is an overhead to creating one
i also believe that swapping the order in your code as described above should make it nearly as fast as static arrays
This overhead is extremely minimal and at the magnitudes OP needs to test (n = 200000), it should account for around 10ms of time. Cache locality and cache misses due to iteration order are by far the more important factor here.
The essence of STL's operation is to first allocate a small portion of space. If subsequent added elements exceed this space, it will double the space incrementally — this means the space of STL is inherently not contiguous.
In contrast, the space for an array is allocated all at once, so the array's space is completely contiguous.
Think about it: is it easier to find an element in a pile of jigsaw pieces of varying sizes, or in an organized grid?
Therefore, the inherent space structure of vector determines it cannot be faster than an array. But, it is more flexible than an array.
Maybe we should use
std::vector<std::array<int, 20>>instead of 2D vector if it countsVectors are generally slower than the static arrays. That is all about the memory.
Hello Team,
I request a manual review of my submission for Round 2210, problem 2210C2. Submission ID: 368767062 Problem ID: 2210C2
I received a coincidence/plagiarism notice, but I wrote this solution myself. I did not copy, share, or intentionally leak my code.
Because the reported coincidence list is unusually large, I am concerned that my case may have been flagged incorrectly. Please review my submission manually. If there has been a mistake, kindly remove the flag. my English is not good but please try to understand my message
Thank you.