Strange Comparison: vector<vector<int>> vs vector<pair<int,int>> ?

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

If we want to use a 2D dynamic array, some C++ user writes following 2D vector:

vector<vector<int>> a(n);

// add y to a[x]
a[x].push_back(y);

// walkthrough a[x]
for(auto &nx : a[x]){
  cout << nx << "\n";
}

But we can do the equivalent thing with handwritten linked list (when we know the total size) by using 1D vector<pair<int,int>>:

pair<int,int> empty={-1,-1};
vector<pair<int,int>> b(n+total_size,empty);
int add_point=n;

// add y to b[x]
if(b[x]!=empty){
  b[add_point]=b[x];
  b[x].second=add_point;
  add_point++;
}
b[x].first=y;

// walkthrough b[x]
int current=x;
while(current!=empty.second && b[currnt]!=empty){
  cout << b[current].first << "\n";
  current=b[current].second;
}

So, what's the benefit for the later implementation? The merit is the generation time of later one is very fast.

vector size = $$$10^5$$$, Total elements = $$$10^6$$$ (this means we need vector<vector<int>>(1e5) and the sum of the size of $$$10^5$$$ vectors is $$$10^6$$$)

	generate	walkthrough
vector<vector<int>>	49.04 ms	3.12 ms
vector<pair<int,int>>	12.78 ms	19.52 ms

vector size = $$$10^6$$$, Total elements = $$$10^6$$$

	generate	walkthrough
vector<vector<int>>	138.56 ms	10.92 ms
vector<pair<int,int>>	17.4 ms	7.8 ms

vector size = $$$10^6$$$, Total elements = $$$10^7$$$

	generate	walkthrough
vector<vector<int>>	1406.84 ms	32.86 ms
vector<pair<int,int>>	120.48 ms	437.28 ms

( on CF judge(C++20 64bit), x10 average, experiment code )

So, when the vector size is large and need small number of walkthrough, the vector<pair<int,int>> implementation have an advantage.
I use this implementation for 303C - Minimum Modular (because the package is old, current TL is 1s). 185222734

Comments (7)

Write comment?

nor

2 years ago, # |

+45

I've seen the latter implementation being used by a lot of Chinese competitors in graph problems (so I tend to call it Chinese adjacency list). It's used in the fastest solutions to a lot of problems on various judges (that involve graph traversals — since DFS is very bad for cache anyway, the vector approach loses its advantage).

One thing that is much faster than vector<vector<int>> is vector<basic_string<int>>, and this is because of short string optimization (SSO). However, due to how basic_string is implemented, if you write complex stuff like pair<int, int> instead of int, it will not compile/give UB. To fix that, you can just define a custom struct without any constructors/initial values (so stuff like template <class T, class U> struct Pair { T a; U b; }; works for "simple types" T and U that are either plain old datatypes or of this form recursively).

→ Reply

SPyofgame

2 years ago, # ^ |

Is it because fetching data from the adjacency list is usually (if not mostly) more cache-friendly for traversing through the graph?

Kyou_mo_kawaii

since DFS is very bad for cache anyway, the vector approach loses its advantage

I found https://mirror.codeforces.com/blog/entry/67001 pretty insightful on how to minimize cache misses

← Rev. 3 →

True, I have used that optimization myself once but this is the first time I have seen it mentioned on codeforces. Thanks for the reference! However it is not applicable to this problem so much, since it needs you to do a DFS anyway (so as to prepare for other potentially costly DFSes).

defnotmee

← Rev. 2 →

-8

Bro, this is scary. I swapped vector<vector> for vector<basic_string> in my template and got double speed on todays E.

without basic string (296ms): https://mirror.codeforces.com/contest/1774/submission/185685343

with basic string (186ms): https://mirror.codeforces.com/contest/1774/submission/185736069

This is insane

The difference is even more prominent if you make the IO faster. 185739871 with basic_string (109ms) and 185739827 with vector (202ms). This is mostly because for graphs where most vertices have a small degree, multiple allocations are bad.

oversolver

+16

12 years ago it was my typical graph implementation

physics0523's blog