When a 2D dynamic array is needed, many C++ users write the following 2D vector:
vector<vector<int>> a(n);
// add y to a[x]
a[x].push_back(y);
// walkthrough a[x]
for(auto &nx : a[x]){
    cout << nx << "\n";
}
But we can achieve the same thing with a handwritten linked list (when we know the total size in advance), using a 1D vector<pair<int,int>>:
pair<int,int> empty={-1,-1};
vector<pair<int,int>> b(n+total_size,empty);
int add_point=n;
// add y to b[x]
if(b[x]!=empty){
    b[add_point]=b[x];      // move the current head of list x into a fresh slot
    b[x].second=add_point;  // and link the head to it
    add_point++;
}
b[x].first=y;               // the new value lives at the head
// walkthrough b[x]
int current=x;
while(current!=-1 && b[current]!=empty){  // -1 (empty.second) marks the end of the list
    cout << b[current].first << "\n";
    current=b[current].second;
}
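For reference, the two snippets above can be wrapped into one self-contained sketch. The struct and method names here (FlatAdjList, add, collect) are my own, not from the original code; the logic is the same as above:

```cpp
#include <vector>
#include <utility>
using namespace std;

struct FlatAdjList {
    static constexpr pair<int,int> EMPTY{-1, -1};
    vector<pair<int,int>> b;  // n head slots followed by total_size spill slots
    int add_point;
    FlatAdjList(int n, int total_size) : b(n + total_size, EMPTY), add_point(n) {}
    // prepend y to list x
    void add(int x, int y) {
        if (b[x] != EMPTY) {
            b[add_point] = b[x];      // move the current head into a spill slot
            b[x].second = add_point;  // and link the head to it
            add_point++;
        }
        b[x].first = y;
    }
    // walk list x, collecting values (newest first, since add() prepends)
    vector<int> collect(int x) const {
        vector<int> out;
        int current = x;
        while (current != -1 && b[current] != EMPTY) {
            out.push_back(b[current].first);
            current = b[current].second;
        }
        return out;
    }
};
```

Note that elements come out in reverse insertion order, which rarely matters for adjacency lists.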
So, what's the benefit of the latter implementation? Its merit is that generation is very fast.
vector size = $$$10^5$$$, Total elements = $$$10^6$$$ (i.e. we need vector<vector<int>>(1e5), and the sizes of the $$$10^5$$$ inner vectors sum to $$$10^6$$$)
| | generate | walkthrough |
|---|---|---|
| vector<vector<int>> | 49.04 ms | 3.12 ms |
| vector<pair<int,int>> | 12.78 ms | 19.52 ms |
vector size = $$$10^6$$$, Total elements = $$$10^6$$$
| | generate | walkthrough |
|---|---|---|
| vector<vector<int>> | 138.56 ms | 10.92 ms |
| vector<pair<int,int>> | 17.4 ms | 7.8 ms |
vector size = $$$10^6$$$, Total elements = $$$10^7$$$
| | generate | walkthrough |
|---|---|---|
| vector<vector<int>> | 1406.84 ms | 32.86 ms |
| vector<pair<int,int>> | 120.48 ms | 437.28 ms |
(measured on the CF judge (C++20, 64-bit), averaged over 10 runs; experiment code)
So, when the vector size is large and only a small number of walkthroughs is needed, the vector<pair<int,int>> implementation has an advantage.
I used this implementation in 303C - Minimum Modular (the problem is from an old package, so the current TL is 1s): 185222734

I've seen the latter implementation being used by a lot of Chinese competitors in graph problems (so I tend to call it Chinese adjacency list). It's used in the fastest solutions to a lot of problems on various judges (that involve graph traversals — since DFS is very bad for cache anyway, the vector approach loses its advantage).
One thing that is much faster than vector<vector<int>> is vector<basic_string<int>>, and this is because of short string optimization (SSO). However, due to how basic_string is implemented, if you store complex types like pair<int, int> instead of int, it will not compile or will give UB. To fix that, you can define a custom struct without any constructors or initial values (so something like template <class T, class U> struct Pair { T a; U b; }; works for "simple types" T and U that are either plain old data types or of this form recursively).

Is it because fetching data from the adjacency list is usually (if not mostly) more cache-friendly for traversing through the graph?
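A minimal sketch of this trick, with the caveat that basic_string over non-character types is non-standard: it relies on libstdc++'s generic char_traits (deprecated on recent GCC, though it still compiles with a warning). The helper names (add_edge, add_weighted_edge) are illustrative, not from the comment:

```cpp
#include <string>
#include <vector>
using namespace std;

// Trivially-copyable replacement for std::pair, as suggested above:
// no constructors, no default member initializers, just plain members.
template <class T, class U> struct Pair { T a; U b; };

// Adjacency lists whose small buckets fit in the short-string buffer (SSO),
// avoiding a separate heap allocation for every low-degree vertex.
vector<basic_string<int>> adj;                    // unweighted edges
vector<basic_string<Pair<int, long long>>> wadj;  // weighted edges

void add_edge(int u, int v) {
    adj[u] += v;  // basic_string::operator+= appends a single element
}

void add_weighted_edge(int u, int v, long long w) {
    wadj[u] += Pair<int, long long>{v, w};
}
```

Reading a bucket works exactly like a vector: for (int v : adj[u]) { ... }.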
I found https://mirror.codeforces.com/blog/entry/67001 pretty insightful on how to minimize cache misses
True, I have used that optimization myself once but this is the first time I have seen it mentioned on codeforces. Thanks for the reference! However it is not applicable to this problem so much, since it needs you to do a DFS anyway (so as to prepare for other potentially costly DFSes).
Bro, this is scary. I swapped vector<vector> for vector<basic_string> in my template and got double the speed on today's E.
without basic string (296ms): https://mirror.codeforces.com/contest/1774/submission/185685343
with basic string (186ms): https://mirror.codeforces.com/contest/1774/submission/185736069
This is insane
The difference is even more prominent if you make the IO faster: 185739871 with basic_string (109ms) and 185739827 with vector (202ms). This is mostly because, for graphs where most vertices have a small degree, multiple allocations are bad.

12 years ago this was my typical graph implementation.