I know the standard sieve algorithm for finding primes in a range [0..upperbound]
Here is a standard implementation
for(int i=2;i<=N;i++)
if(prime[i])
for(int j=i*i;j<=N;j+=i) prime[j]=0;
But when memory is tight, here is the famous implementation by yarin
#define MAXSIEVE 100000000 // All prime numbers up to this
#define MAXSIEVEHALF (MAXSIEVE/2)
#define MAXSQRT 5000 // sqrt(MAXSIEVE)/2
char a[MAXSIEVE/16+2];
#define isprime(n) (a[(n)>>4]&(1<<(((n)>>1)&7))) // Works when n is odd
int i,j;
memset(a,255,sizeof(a));
a[0]=0xFE; // 254
for(i=1;i<MAXSQRT;i++)
if (a[i>>3]&(1<<(i&7)))
for(j=i+i+i+1;j<MAXSIEVEHALF;j+=i+i+1)
a[j>>3]&=~(1<<(j&7));
I understood that we are only storing odd numbers and char
can store 8 bits and we are storing it using j>>3
and unsetting using ~(1<<(j&7))
. When considering only odd numbers, you let index 1 be 3, index 2 be 5 etc.
But I didn't understand the strange loop incrementations and initialization in for(j=i+i+i+1;j<MAXSIEVEHALF;j+=i+i+1)
Any help would be greatly appreciated. Thank you in advance.
First thing to notice here is that the check
a[i>>3]&(1<<(i&7))
isn't about the primality of $$$i$$$, but the primality of $$$2i + 1$$$. So we can rewrite our code like this:Now let's do the substitutions $$$u = 2i + 1$$$ and $$$v = 2j + 1$$$.
Now this is quite standard. The only thing to notice here is that $$$u$$$ and $$$v$$$ both skip the even numbers, that's why you see
u += 2
andv += 2 * u
, contrary to the standard implementation.In case you're wondering about
a[0]
, it's set to 254 to exclude 1 from the primes as a special case.In fact I don't get why the author didn't write the code like I did. The code you give is basically unreadable, littered with crap like
<<<>>0&1<<4>>(n)
. You might be thinking that it is somehow slower to callisprime(u)
and triggering a division by 2, but modern compilers are really smart. They can do some really non-trivial optimizations. By the way the same goes for the "i++
is slower than++i
" crowd. At least if you use them as the increment in a simple for-loop, the compiler is smart enough to realize that you don't care about it's return value.Thank you -is-this-fft-, I mean it.
We can make it faster by initializing
v=u*u
that becomesj=2*i*i+2*i