Blog entries - Codeforces

Imagine you need to iterate through many random permutations of a string and perform some quick check for each one. The check is so fast that the bottleneck in your solution becomes the shuffling. For simplicity, in the snippets, I've sped up the check to a no-op and added printing the final string.

Implementation with `random_shuffle`

#include <bits/stdc++.h>
using namespace std;

#define ITER 10000

int main() {
    string s; cin >> s;
    for (int i = 0; i < ITER; ++i) {
        random_shuffle(s.begin(), s.end());
        // do_something(s);
    }
    cout << s << '\n';
}

We'll benchmark this on a random [a-zA-Z0-9] string of length $$$2 \cdot 10^5$$$.

How to generate such a string on Unix?

tr -dc '[:alnum:]' < /dev/urandom | head -c 200000 > stroka.txt

Compile the code with g++ 14.2.1 using the standard -O2 flag common in most judging systems. On my laptop, the user CPU time was $$$26.78$$$ seconds.

Implementation with `shuffle`

Perhaps you've read the well-known article about random or noticed the compiler warning. Indeed, no one uses random_shuffle anymore, as there's a modern replacement. How will it perform in the benchmark?

#include <bits/stdc++.h>
using namespace std;

#define ITER 10000

int main() {
    string s; cin >> s;
    mt19937 rng(42); // reproducibility
    for (int i = 0; i < ITER; ++i) {
        shuffle(s.begin(), s.end(), rng);
        // do_something(s);
    }
    cout << s << '\n';
}

The user CPU time for this implementation under the same conditions was $$$7.73$$$ seconds. Cool! In this note, we won't dive into implementation details or analyze the reason for this speedup.

Implementation with `strfry`

Let's turn to C's legacy. The standard library contains some useful functions, like strtok, which have no modern C++ equivalents. And if you look under _GNU_SOURCE, you can find even more surprising things.

Suddenly, glibc enters the stage. For our task, strfry can help. This function takes a null-terminated string and shuffles it. The global state of the PRNG remains unchanged, and the permutation will vary between runs. Under the hood, the function uses getrandom in the standard linear swap algorithm. By the way, according to the C++11 standard, our string is null-terminated.

#include <bits/stdc++.h>
using namespace std;

#define ITER 10000

int main() {
    string s; cin >> s;
    for (int i = 0; i < ITER; ++i) {
        strfry(s.data());
        // do_something(s);
    }
    cout << s << '\n';
}

The user CPU time under the same conditions was $$$5.54$$$ seconds. Congratulations, we've found the winner!

Full text and comments »

inline long long mul(long long a, long long b) { long long res; asm( "mov %1, %%rax\n" "mov %2, %%rbx\n" "imul %%rbx\n" "mov $1000000014018503, %%rbx\n" "idiv %%rbx\n" "mov %%rdx, %0\n" :"=res"(res) :"a"(a), "b"(b) ); return res; }

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	dXqwq	3436
8	Radewoosh	3415
9	Otomachi_Una	3413
10	Um_nik	3376

#	User	Contrib.
1	Qingyu	158
2	adamant	152
3	Um_nik	146
4	Dominater069	144
5	errorgorn	141
6	cry	139
6	Proof_by_QED	139
8	YuukiS	135
9	chromate00	134
9	TheScrasse	134

Gornak40's blog

Implementation with random_shuffle

Implementation with shuffle

Implementation with strfry

Implementation with `random_shuffle`

Implementation with `shuffle`

Implementation with `strfry`