yoshi_likes_e5's blog

By yoshi_likes_e5, history, 9 months ago, In English

When I was doing 1207F - Remainder Problem, I don't know why my program got TLE. Then this ONE trick cut 1 second off my program. You can see this video for more information: https://www.youtube.com/watch?v=ssDBqQ5f5_0. When the compiler compiles x/9, the code roughly converts to ((long long)954437177*x)>>33. Division is a much more expensive operation than multiplies and shifts. But what's the special constant 954437177? It's $$$\lceil \frac{2^{33}}{9} \rceil$$$. So we have this identity which is vaild for every $$$0\leq a < 2^{31}, 0 < x < 2^{31}$$$:

$$${\lfloor \frac{a}{x} \rfloor} ={ \lfloor{\frac{a*\lceil{2^{31+\lfloor{log_2(x)}\rfloor}/x}\rceil}{2^{31+\lfloor{log_2(x)}\rfloor}}}\rfloor}.$$$

You can clearly see how this speeds up 1207F: 244531668, 244540601. Note that gcc also does this trick on 64 bits.

After brute forcing with $$$a = 2^{31}-1$$$ I got this new, fixed identity. This should work for all $$$a$$$ (while the old one does not).

  • Vote: I like it
  • +46
  • Vote: I do not like it

»
9 months ago, # |
  Vote: I like it 0 Vote: I do not like it

very good

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

thank you

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Really cool trick! Sadly g++ doesn’t optimise divisions with numbers not known at compile-time this way.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

You can make GCC use this trick in just one line:

#pragma GCC unroll B where B is your block size!

With B = 700 it is 1000ms faster: 262433429, 262433450

Compile to assembly and you can see fixed point multiplication being used.

»
7 months ago, # |
Rev. 3   Vote: I like it +4 Vote: I do not like it