Why is vector much slower than static array in my binary lifting implementation?

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	turmax	3559
6	tourist	3541
7	strapple	3515
8	ksun48	3461
9	dXqwq	3436
10	Otomachi_Una	3413

#	User	Contrib.
1	Qingyu	157
2	adamant	153
3	Um_nik	147
4	Proof_by_QED	146
5	Dominater069	145
6	errorgorn	142
7	cry	139
8	YuukiS	135
9	TheScrasse	134
10	chromate00	133

I was solving the Planets Queries I problem in CSES (a standard binary lifting problem).

After getting a few TLE on one of the test cases, I finally got AC, but it took 1.00 seconds to pass that test case, which is way too close. On checking other submissions with exact same logic, and investigating a bit more, I changed vector<vector<int>> declaration in my code to static int array declaration, and my code, with everything being exact same, worked in 0.60 seconds.

Usually there shouldn't be such a drastic difference. While solving codeforces problem, I have observed that vectors and static array take the same time, and sometimes vectors are faster.

Can anyone explain the reason behind this? Thanks.

Submission using vector<vector<int>> : https://cses.fi/paste/7d2b7a0b66df52a2ea28b6/

Submission using static int array : https://cses.fi/paste/6d8b9ef1a192caa1ea28f1/

Update: In my implementation, I did vector<vector<int>> node(31, vector<int>>(n+1)) instead of vector<vector<int>> node(n+1, vector<int>>(31)), basically swap i and j, the same test case now passes in 0.55 seconds.

Basically in my original implementation, let's say I call dp[i][j], I was fixing j and iterating over i, which is horrible from cache locality POV. It doesn't help that vector allocates multiple heap blocks instead of one continous block. (There was a recent blog on this topic, refer that) Instead of this, fixing i and iterating over j, is much more cache friendly and makes a drastic difference in time.

It doesn't make a drastic difference in static arrays since unlike vector, they are not fragmented. (I am not exactly sure about the reason, will research a bit and then update)

For comparison, I did the same thing with my static array implementation, and now the static array implementation runs in 0.55 seconds as well. There isn't any time difference here.

Rev.	By	When	Δ	Comment
en10	Argentum47	2025-11-20 11:43:07	3	Tiny change: 'ntation, I did vec' -> 'ntation, If I did vec'
en9	Argentum47	2025-11-19 19:16:56	1017
en8	Argentum47	2025-11-19 18:20:07	0	(published)
en7	Argentum47	2025-11-19 18:19:40	2
en6	Argentum47	2025-11-19 18:19:03	21	Tiny change: 'ged vector<vector<int>> declarat' -> 'ged vector\<vector\<int\>\> declarat'
en5	Argentum47	2025-11-19 18:17:53	4	Tiny change: 'hanks.\n\nSubmissi' -> 'hanks.\n\n<int>\nSubmissi'
en4	Argentum47	2025-11-19 18:15:49	14	Tiny change: 'econds. \nUsually ' -> 'econds. \n\nUsually '
en3	Argentum47	2025-11-19 18:14:12	1	Tiny change: 'ea28b6/ \nSubmissi' -> 'ea28b6/ \n\nSubmissi'
en2	Argentum47	2025-11-19 18:13:19	16
en1	Argentum47	2025-11-19 18:12:48	986	Initial revision (saved to drafts)

History