Hey!
Consider the following solutions for the CSES problem Coin Combination I (Time Limit is 1s):
Solution with `constexpr` or `const` mod
Solution with non `const` mod
Note that using const
or constexpr
in the first solution is the same since we are initializing a global variable with a prvalue literal.
The first solution passes with Runtime ~0.58s in the worst test case, whereas the second solution gives a TLE outcome.
It seems that reading from compile time constants may be twice as fast as reading from arbitrary variables. Is there a particular reason for this?
EDIT: Big thanks to AkibAzmain and MattTheNub, I got the following answers:
- Compile time constants are inlined by the compiler which saves multiple machine register reading operations.
- When $$$mod$$$ is constant the modulo operator uses a technique called Barret Reduction to speed up computations.
It's because the compiler inlines the constants. It means that the constants are not stored in a variable at all when the program is running, instead all references to the constant is replaced directly by it's value. Since the CPU doesn't need to access the memory to get the value, it's faster.
Moreover, since the compiler knows that it's constant, so it can apply extra optimizations that doesn't work if you use a variable.
Make sense. I'm still surprised though that this creates such an overhead... Thank you!
I think there's something more involved here, like optimizations, CPU cache, etc, etc.
I think the reason it’s had such a large impact in this solution is because of the 3 lines or so of the main algorithm, the only line with anything more complex than just one addition or comparison is the line where you’re calling mod.
The machine code this creates is a bunch of calls where the value stored in mod has to be retrieved and put in a register then replaced with the next value. Which adds a good few instructions inside the loop.
Also constexpr allows lines to be precomputed in compile time. Whilst it obviously can’t just precompute every possible answer before it’s even given an input, it does allow the compiler to bake some information that could make it faster in run time.
Computing values modulo a constant is typically faster since the compiler can perform a Barrett reduction to compute the result using multiplication and bit shifts, which is faster than computing it with integer division. If the modulus is not constant, the compiler has to use regular, slow integer division.
Very interesting. Thank you for sharing! :D