[Tutorial] Writing C++ like Python: tricks Python doesn't want you to know

→ Pay attention

Before contest
CodeTON Round 9 (Div. 1 + Div. 2, Rated, Prizes!)
2 days
Register now »

*has extra registration

→ Top rated

#	User	Rating
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

View all →

→ Find user

→ Recent actions

Detailed →

nor's blog

[Tutorial] Writing C++ like Python: tricks Python doesn't want you to know

By nor, 22 months ago, In English

https://nor-blog.codeberg.page/posts/2023-01-11-cpp-like-python/

+669

nor
22 months ago
39

Comments (38)

Show archived | Write comment?

C_plusplus

22 months ago, # |

Thanks this is very useful.

→ Reply

Errichto

22 months ago, # |

← Rev. 2 →

+16

Cool stuff. But don't use deque, it's slow.

I always use hull.end()[-2] in convex hull implementation. It's surprising that end(hull) works too.

→ Reply

nor

22 months ago, # ^ |

← Rev. 2 →

+80

For most purposes, using a deque is almost as fast as a vector according to some of my submissions.

There are only two issues that I know of that anything using a deque can face:

When you need to rely on the cache a lot: even though allocations are done in blocks to avoid this problem, in some extreme cases it might lead to some noticeable slowdown (about 10-30%). This might be an issue with vectorization in the future, but usual block sizes are large enough to avoid this issue to a large extent. Usually online judges don't have very tight limits that are able to differentiate between vector and deques in this scenario.
Allocating too many deques: If you do something like vector<deque<int>> a((int)1e6);, it is going to MLE (and if not MLE, TLE) because a deque allocates memory in chunks, and that chunk tends to be large. For example, using GCC on codeforces, it is 512 bytes, and it is quite bad to allocate a vector of $$$10^6$$$ empty deques in this case.

I think the second part is the reason behind the comment that deques are slow, but feel free to correct me if this issue showed up in some other context.

→ Reply

Kyou_mo_kawaii

22 months ago, # ^ |

Some python people avoid using deque for their BFS by doing something like:

        q = [src]
        seen = [0] * N
        seen[src] = 1
        for u in q:
            for v in graph[u]:
                if not seen[v]:
                    q.append(v)
                    seen[v] = 1

which translates to the following in c++:

        vector<int> q = {src};
        vector<int> seen(N, 0);
        seen[src] = 1;
        for (int i = 0; i < q.size(); i++) {
            int u = q[i];
            for (int v : graph[u]) {
                if (!seen[v]) {
                    q.push_back(v);
                    seen[v] = 1;
                }
            }
        }

→ Reply

chromate00

22 months ago, # ^ |

← Rev. 2 →

-16

I avoid using deque for my BFS by using something like this.

Warning - long code

#include<exception>
using namespace std;

//////// BEGIN STRUCT DEFINITION ////////
template<class T, size_t sz>
struct circular_buffer
{
	struct It
	{
		circular_buffer<T,sz>&par;
		size_t abspos;
		bool is_end;
		It&& operator++(int)
		{It pre=*this;abspos++;par.pop();return pre;}
		It& operator++()
		{abspos++;par.pop();return *this;}
		T& operator*()
		{return par.data[abspos%sz];}
		bool operator==(It r)
		{
			if(&r.par!=&par)return false;
			else if(is_end) return r.abspos==par.ed;
			else if(r.is_end) return abspos==par.ed;
			else return r.abspos==abspos;
		}
	};
	using iterator=It;
	using const_iterator=const It;
	T data[sz];
	size_t bg,ed;
	circular_buffer():bg(0),ed(0){}
	T& front()
	{return data[bg%sz];}
	void push(T&& val)
	{
		if(ed==bg+sz)throw out_of_range("Circular Buffer Full"s);
		else data[ed++%sz]=val;
	}
	void pop()
	{
		if(ed==bg)throw out_of_range("Circular Buffer Empty"s);
		else bg++;
	}
	It begin()
	{return It{*this,bg,false};}
	It end()
	{return It{*this,ed,true};}
	size_t size()
	{return ed-bg;}
	bool empty()
	{return bg==ed;}
	void swap(circular_buffer<T,sz>&other)
	{
		swap(data,other.data);
	}
};
//////// END STRUCT DEFINITION ////////

You can either use it like a normal std::queue, or plug it right into a range-based for loop. It's kinda crude, but it works

→ Reply

chromate00

22 months ago, # |

← Rev. 3 →

It is worth noting that and and or are actual keywords on C++ since C++98 or so. Also for BFS I like to use something like this. (probably meaningless rant: why do the container adapters not have emplace? this makes things a bit inconvenient compared to things that we have emplace for.)

int a,b;
queue<pair<int,int>>Q;
//push something...
while(!Q.empty()&&(tie(a,b)=Q.front(),Q.pop(),1))
{
  //do something...
}

→ Reply

Boring_Day

22 months ago, # ^ |

Is it possible to declare a,b under while?

→ Reply

nor

22 months ago, # ^ |

+15

for (int a, b; !Q.empty() && (tie(a, b) = Q.front(), Q.pop(), true);) {
  // do something
}

→ Reply

ko_osaga

22 months ago, # |

+76

Hot take: You generally don't need std::pair, and you never need std::tuple. Typically, they are just std::array<T, 2> or std::array<T, 3> expressed in a verbose way. std::pair can be useful if you need to store different types or do stuff with std::map. If you want to pack more than 2 fields in a different type, you need to define structs for basic readability.

(I learned this style from Savior-of-Cross's code, I think it makes sense and made my code cleaner, but I'm interested to hear any downside of this approach)

→ Reply

nor

22 months ago, # ^ |

← Rev. 5 →

+25

I agree with this, and it is definitely not a hot take in my opinion. In fact, in some competitive scenarios I've observed that using std::array helps with vectorization as well and std::pair and std::tuple have surprising inherent overheads compared to a toned down implementation like template <class T, class U> struct Pair { T first; U second; }; (so much so that some software companies ban their use in their codebases). I would recommend watching this for a good introduction to why this is the case. EDIT: just watched it again, and it seems an important reason in the competitive programming context wasn't covered. std::pair and std::tuple are not forced to be trivial types even if their members are, so this adds some non-zero overhead sometimes. Similarly, sometimes compilers aren't able to reason through all the abstraction that is provided by the suboptimal std::pair and std::tuple implementations, so that factors in sometimes too.

However, I have seen way too many people spamming std::pair and std::tuple everywhere, and it is quite annoying to look at (even more so to write) push_back(make_pair(...)) all the time. The main idea is that if you use std::tuple, you should at least make your code not-so-ugly.

There are still examples where you might need std::tuple: one reason to not completely abandon it in favor of simple structs is that you can't use ranges::elements_view with structs having different types, because std::get is defined only for them (and std::array and std::pair). Perhaps this can be fixed with better "accessing" functionality for structs, but I don't see how to do this.

→ Reply

lrvideckis

15 months ago, # ^ |

nit: you can't do:

	array<int, 2> x = {1, 2};
	int y, z;
	tie(y, z) = x;

works with pair, tuple, not struct

you also can't do:

	vector<array<int, 2>> v;
	v.emplace_back(1, 2);

works with pair, tuple, struct

→ Reply

timreizin

22 months ago, # ^ |

+10

How to use std::array if you want elements to be not of the same type?

→ Reply

chromate00

22 months ago, # ^ |

You don't. (I'm not saying you "definitely can't", but you would not appreciate finding out how)

→ Reply

nor

22 months ago, # ^ |

+34

It's not hard, you could use std::array<std::variant<T1, T2>, 2> (though this should not be used by any sane person who cares about performance).

→ Reply

nikgaevoy

22 months ago, # ^ |

How do you implement a scanline then? You usually have two types of events (something started and something ended) and you need to sort events firstly by the $$$x$$$ coordinate and then by the type. A tuple of $$$x$$$, type and parameters (usually tuple<int, bool, int>) does perfectly fine from my perspective. Do you define a structure with a custom comparator here as well?

→ Reply

nor

22 months ago, # ^ |

+16

I just use std::array<int, 3> for this case. In my opinion std::tuple is fine for quick and dirty operations (like sorting and so on) in cases where the data types aren't convertible or take too much memory, but for more complicated code (such as when you want to store some history on a stack and do divide and conquer on it, or even just DSU with rollbacks), I try to make things more readable and debuggable by making a different struct. As a rule of thumb, I keep my library implementations clean with meaningful variable names. Of course, your mileage may vary.

→ Reply

nikgaevoy

22 months ago, # ^ |

But this is not very clean even in comparison to the tuple approach. The second part is semantically bool (or even some enum, but you got the point). Also, what if points are in a floating-point type? array<double, 3>? I doubt any sane person would use double as a substitute for bool. Also, the parameters are not of the same type as $$$x$$$ quite often.

Besides that, I even think that always substituting tuple<int, int, int> with array<int, 3> is a bad idea. They are semantically different. The tuple means just the collection of three entities and the array means the collection of three entities of the same type. Just because three types happen to be the same it does not mean they are obliged to be the same.

And even defining a separate structure is not always the best solution, because it takes off some transparency of what is happening. For example, if I had to choose between a structure with fields $$$x, y, z$$$ that should mimic the tuple behaviour and an actual tuple, I would have prefer the latter. Obviously, here we all should forget for a second that in C++ specifically you have to write ugly get<2>(tuple) to access the third element, but in other a bit more modern languages it is usually done as tuple.2 and I fail to see much of a difference from not_tuple.z. Not to mention that structured bindings to your own custom types sometimes work not in the most trivial way.

→ Reply

nor

22 months ago, # ^ |

+11

About how morally incorrect using std::array is, I agree. It's just that std::array is easier to type and access. I definitely won't use this kind of a thing in "real-world" code, but I think it is convenient for competitive programming.

Regarding custom structs, I think I still have reservations about choosing std::tuple over them. For instance, if I had to write an implementation of some flow algorithm, I personally won't store the edge as a tuple (a couple of integers to maintain vertex/edge indices, and a couple of flow variables). I don't deny that they're very interoperable, and that they make a lot of sense when you are (only) thinking about what the data does. I think it's a matter of personal preference, though.

I didn't get your point about structured bindings not working in the most trivial way, could you elaborate? I almost always use a struct and bind variables in the order of declaration in the struct, is there any place where this fails?

→ Reply

nikgaevoy

22 months ago, # ^ |

Using structures for edges in graphs/flows are perfectly fine, I actually think that it is more correct way of doing it. But the difference is that here field names are usually .to, .from and .capacity, i.e. they are meaningful. My example was about the exactly opposite case.

I spent about 90 minutes trying to remember what was happening when I decided to be extra cautious with structured binding and, apparently, the point was that binding a tuple-like type and binding to data members are slightly different and you can accidentally switch from one to another. I have no clue how I did that (maybe by inheriting something?), and I failed to recreate an example.

→ Reply

ko_osaga

22 months ago, # ^ |

I rarely use bool whenever I can use int, so I don't have such use cases. If you want some symantical restriction, I think it's better to go with structs. In case of floating points — yes, sometimes that happen. I will define custom structs.

Everything here are inherently hacky. If you want industry-style code, you have to not use anything except structs.

→ Reply

nikgaevoy

22 months ago, # ^ |

I believe that even in industry-style code there are places for tuples, especially in public APIs (see examples above). In particular, STL functions love having pairs in their return types, tuples are not any worse. The reason why they are uncommon in STL is that there was no std::tuple until C++11. Another reason, obviously, is that functions that return more than two variables are uncommon in general, but it does not mean their existence inherently means bad design.

→ Reply

frozenca

22 months ago, # ^ |

To be fair, a recent trend in the C++ standard library API is to define a custom structure instead of std::pair or std::tuple and return it.

https://en.cppreference.com/w/cpp/algorithm/ranges/next_permutation

For example C++20 version of next_permutation returns a wrapped pair of (iterator, bool) and named its members as .in, .found for better readability

→ Reply

oversolver

22 months ago, # ^ |

How do you code Dijkstra?

→ Reply

spkumar9952

22 months ago, # |

Thanks! this is useful blog. can anyone tell me where I can download c++20 ?

→ Reply

snowmanonahoe

22 months ago, # ^ |

You should have it if your compiler is up-to-date. Make sure you don't have compiler arguments setting you to an older version (use -stdc++=20 on GCC/Clang, and /std:c++20 or /std:c++latest on MSVC).

→ Reply

OerTheFlood

22 months ago, # |

These C++ code is too erotic.

→ Reply

Thallium54

22 months ago, # |

Don't forget there's also reverse iterator so you can also use rbegin(a)[0] to access the last element.

→ Reply

nor

22 months ago, # ^ |

+24

That's true (and it makes more sense too)! However I almost always use end(a)[-1] instead of rbegin(a)[0] because:

It keeps things consistent across languages — Python and C++.
It has fewer characters.
I avoid using reverse iterators too much, since sometimes you can mess up by using ++ instead of -- and vice versa if you mix iterators.

→ Reply

rgnerdplayer

22 months ago, # ^ |

For the last element please use .back(), .front() also exist for accessing the first element but [0] is shorter anyway.

→ Reply

ageade

22 months ago, # |

+11

Is the AC server a discord server? If so could I get an invite ...

→ Reply

Adam_GS

22 months ago, # ^ |

+32

→ Reply

ageade

22 months ago, # ^ |

+27

→ Reply

Physics212303

22 months ago, # |

← Rev. 2 →

+19

It is risky to use deque. Last time someone used 10^6 deques in NOI in China, and got MLE.

→ Reply

battle_beast

22 months ago, # ^ |

Why? I think it's just a double-ended linked list

→ Reply

Physics212303

22 months ago, # ^ |

You can refer to this. It explains in detail why using many deques can cause MLE.

→ Reply

9xZER0

15 months ago, # |

Can someone explain how to use zip and enumerate from the spoiler snippet?

→ Reply

oToToT

15 months ago, # ^ |

Check this.

→ Reply

nor

15 months ago, # ^ |

← Rev. 2 →

You can also use auto&& [x, y] instead of auto [x, y] if you want references to x and y, while taking into account the case of enumerate.

Example

int main() {
    vector<int> a = {1, 2, 3};
    vector<double> b = {4, 5, 6};
    for (auto&& [x, y] : zip(a, b)) cout << x++ << ' ' << y++ << '\n';
    for (auto&& [i, x] : enumerate(a)) cout << i++ << ' ' << x++ << '\n';
    for (auto&& [x, y] : zip(a, b)) cout << x++ << ' ' << y++ << '\n';
    for (auto&& [i, x] : enumerate(a)) cout << i++ << ' ' << x++ << '\n';
    for (auto&& [x, y] : zip(vector{1, 2, 3}, array{3, 4, 5})) cout << x++ << ' ' << y++ << '\n';
    return 0;
}

→ Reply