There is such an implementation of the sieve of Eratosthenes:
Then replace the second line with bitset used;
For comparison, let 's take another linear sieve of Eratosthenes.
Running time table on compiler GNU G++20 11.2.0 (64 bit, winlibs) with #pragma GCC optimize("O3")
in all three solutions.
\begin{array}{|c|c|c|c|} \hline N & usual & linear & bitset \cr \hline 2e7 & 202ms & 140ms & 78ms \cr \hline 5e7 & 467ms & 374ms & 249ms \cr \hline 1e8 & 1138ms & 732ms & 686ms \cr \hline 2e8 & 2308ms & ML & 1669ms \cr \hline \end{array}
I guess it's because of the cache. The data takes up 8 times less memory and is cached better. The more cache there is on the computer, the more noticeable the acceleration. For example, I have a 3-fold acceleration on $$$N = 1e8$$$ compared to a conventional sieve.
Unfortunately, this is useless in practice, because almost always in the sieve we want to get some more information.