I know that counting bits with a precomputed table is the fastest way to go (faster than `__builtin_popcount()`).
For 32-bit integers we can compute `cnt[x>>16] + cnt[x&65535]` by precomputing counts up to `(1<<16)`.
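To make the 32-bit case concrete, here is a minimal sketch of what I mean. The names `build_cnt16` and `popcount32` are just illustrative, and storing the counts as `std::uint8_t` is my own assumption, chosen to keep the table small.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build cnt[i] = number of set bits in i, for every i below 2^16.
// Uses the recurrence popcount(i) = popcount(i >> 1) + (i & 1).
std::vector<std::uint8_t> build_cnt16() {
    std::vector<std::uint8_t> cnt(std::size_t{1} << 16); // zero-initialised
    for (std::size_t i = 1; i < cnt.size(); ++i)
        cnt[i] = static_cast<std::uint8_t>(cnt[i >> 1] + (i & 1));
    return cnt;
}

// Split x into its high and low 16-bit halves and add the two lookups.
int popcount32(std::uint32_t x, const std::vector<std::uint8_t>& cnt) {
    return cnt[x >> 16] + cnt[x & 65535];
}
```

The table is built once with `build_cnt16()` and then passed to every `popcount32` call.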
How can we do it for 64-bit integers by precomputing counts up to `(1<<22)`?
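One possible approach, as a sketch rather than a definitive answer: since 64 = 22 + 22 + 20, the value can be cut into three chunks that each index a table of size `(1<<22)`. The names `kBits`, `kMask`, `build_cnt22` and `popcount64` are hypothetical, and the `std::uint8_t` element type is an assumption (it makes the table 4 MiB).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kBits = 22;                                    // chunk width
constexpr std::uint64_t kMask = (std::uint64_t{1} << kBits) - 1;  // low 22 bits

// Build cnt[i] = number of set bits in i, for every i below 2^22.
std::vector<std::uint8_t> build_cnt22() {
    std::vector<std::uint8_t> cnt(std::size_t{1} << kBits); // zero-initialised
    for (std::size_t i = 1; i < cnt.size(); ++i)
        cnt[i] = static_cast<std::uint8_t>(cnt[i >> 1] + (i & 1));
    return cnt;
}

// Three lookups: bits 0..21, bits 22..43, and the remaining bits 44..63.
// The last chunk is only 20 bits wide, so it needs no mask.
int popcount64(std::uint64_t x, const std::vector<std::uint8_t>& cnt) {
    return cnt[x & kMask]
         + cnt[(x >> kBits) & kMask]
         + cnt[x >> (2 * kBits)];
}
```

Whether three lookups into a table this large actually beat `__builtin_popcountll` probably depends on cache behaviour and compiler flags, so it seems worth measuring on the target machine.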