Let us start with 2014H - Robin Hood Archery. A common solution assigns each value a random integer in $$$[0,L)$$$ and XORs the values in the interval. If every value appears an even number of times, the result is $$$0$$$. Otherwise, the result is $$$0$$$ with probability at most $$$1/L$$$.
Inspired by the hack used on 282284449 with xorshift, I found that many other submissions can also be hacked. The pitfalls of XOR hashing with mt19937_64 do not seem to be widely known, so I wrote this blog.
If we view both the internal state and the outputs as vectors over $$$\mathbb{F}_2$$$, then the $$$i$$$-th output of mt19937_64 with initial state $$$x$$$ can be written as $$$o_i = BA^i x$$$, where $$$A$$$ is the state transition matrix and $$$B$$$ is the output matrix over $$$\mathbb{F}_2$$$.
Thus, to make the XOR zero for every seed, it is enough to find a nonzero polynomial $$$p(t)=\sum_i a_i t^i$$$ such that $$$p(A)=0$$$. Then
Such a polynomial can be the minimal polynomial of $$$A$$$, whose degree is $$$19937$$$ for mt19937_64. By observing the least significant bit of the first $$$2 \times 19937$$$ outputs, we can recover the recurrence of this bit sequence using Berlekamp–Massey. This recurrence divides the minimal polynomial. Since mt19937_64 is designed to have maximal period, the corresponding polynomial is primitive, hence irreducible. Thus any nonzero recovered recurrence must be the full minimal polynomial.
The following is the test case generator.
Using uniform_int_distribution(0, L) incorrectly can still cause the same issue. For example, the implementation is unsafe on common standard-library implementations when $$$L = 2^n-1$$$, because the distribution may degenerate into taking the low $$$n$$$ bits of rng(), which is still linear over $$$\mathbb{F}_2$$$.








