ahcl's blog

By ahcl, history, 9 years ago, In English

Hello everyone!

Today, I will introduce a fastest solution for the problem: count number of 1 bits in a 63-bit integer X.

One basic solution for this problems:

int getbit(long long x,int k)
{
      return ((x>>k)&1);
}
int cal(long long x)
{
      int ans = 0;
      for(int i=0;i<=62;i++) ans += getbit(x,i);
      return ans;
}

This algorithm run with O(63) complexity. It is an acceptable solution but some cases you need a better solution. I will introduce for you an algorithm can run with O(3) complexity.

Firstly, we create an array F[1<<21], F[i] : number 1 bits of i; // example: F[5] = 2 (5="101") F[7] = 3 (7="111")

The main idea of this algorithm is separate that number into 3 blocks, each block have 63/3 = 21 bits;

For example with a small number: X = "101101", 3 blocks, each block have 2 bits.

  • Block_1 = "10" (equal 2 in decimal system)
  • Block_2 = "11" (equal 3 in decimal system)
  • Block_3 = "01" (equal 1 in decimal system)
  • Ans(X) = F[block_1] + F[Block_2] + F[Block_3] = F[2] + F[3] + F[1] = 1 + 2 + 1 = 4;

Now, how do you get value of Block 1, 2 and 3 with O(1) complexity? There is an easy method to do this.

Follow the steps:

  • Step1: Calculate value of Block3 by getting last 21 bits;
  • Step2: Shift to right X 21 bits;
  • Step3: Calculate value of Block 2 by getting last 21 bits;
  • Step4: Shift to right x 21 bits;
  • Step5: calculate value of Block1;

Here is my code:


#include <bits/stdc++.h> using namespace std; int F[1<<21]; void init() { F[1] = 1; for(int i=1;i<(1<<20);i++) { F[i*2] = F[i]; F[i*2+1] = F[i] + 1; } } int get21bit(long long x) { return (x & ((1<<21) -1) ); } int cal(long long x) { int ans = 0; for(int i=1;i<=3;i++) { int t = get21bit(x); ans += F[t]; x>>=21; } return ans; } int main() { init(); long long X; X = 1187340987109834710; // X = "‭0001000001111010010010000101011001000011111100010111111111010110‬"; cout<<cal(X); return 0; }

I hope it will be useful for you! Have a good day!

UPD1: In a C++ program, you can use function __builtin_popcountll() to solve this problem. It is really faster than my algorithm. So, I think you can apply my algorithm for another compiler.

  • Vote: I like it
  • -4
  • Vote: I do not like it

»
9 years ago, # |
  Vote: I like it +8 Vote: I do not like it

If you are working with C/C++ you can use the GCC function __builtin_popcountll. It has approximately the same performance of your code but you spend less time coding :) It also is implemented with a hardware instruction on newer Intel processors so you will get faster performance when compiled appropriately.

If you want something even faster for when when hardware instructions aren't available you can have a look at the snippet provided on this GCC bug report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041

Cheers! :)

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It is really faster and easier :3 Sorry for my lack of knowledge. Thanks you very much!

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      You're welcome :) Remember that we are all here to practise and learn from each other; there's no need to apologise. Cheers!