During testing of Codeforces Round 934, I found a very cursed (albeit unproven) solution to 1944B - Equal XOR and I thought it would be worth a separate blog, so here it is.

Before I explain the solution, I must give you a quick disclaimer; It is much harder than the intended solution and is very likely useless. If you would appreciate understanding it despite it being very useless, please do read further.

First, let us use an assumption which will be under the very basis of the solution. I will **not** prove it to you, but you will see that it is likely true.

- Let $$$X$$$ be an uniform random subsequence of $$$a$$$ with size $$$k$$$. Then, $$$X_1 \oplus X_2 \oplus \cdots \oplus X_k$$$ is
**almost**uniformly distributed across all possible values.

If this assumption is true, then we can get to a solution with $$$\mathcal{O}(n \sqrt{n})$$$ expected time complexity and $$$\mathcal{O}(n \sqrt{n}/w)$$$ expected space complexity.

Let us sample one random subsequence of length $$$2k$$$ from $$$a_1,a_2,\cdots,a_n$$$, and one from $$$a_{n+1},a_{n+2},\cdots,a_{2n}$$$. I claim that the probability such that the XOR of these two subsequences coincide is on the order of $$$\Omega(1/n)$$$, under our assumption above. Then, if we sample $$$x$$$ subsequences from each side, what will be the probability that at least one pair will coincide?

Now, there are $$$x^2$$$ pairs between the $$$2x$$$ subsequences picked. Intuitively, we see that this situation is very close to the birthday problem where we need an expected number of $$$O(\sqrt{n})$$$ people to find a collision. Though the pigeonhole principle does not apply here, the probability still works very similarly. The probability that we will have a collision in $$$n$$$ pairs converges to a constant which is $$$e^{-1} \approx 0.367879$$$, and the probability that we get none in $$$x^2$$$ is essentially $$$e^{-x^2/n}$$$ when $$$x^2>n$$$. When $$$x=3\sqrt{n}$$$ this is already less than $$$0.02$$$ percent.

So, we will get at least one collition w.h.p under $$$\mathcal{O}(\sqrt{n})$$$ samples. Each round takes $$$\mathcal{O}(n)$$$ with a trivial process. Thus the time complexity is expected $$$\mathcal{O}(n\sqrt{n})$$$. The final issue is space complexity where $$$\mathcal{O}(n\sqrt{n})$$$ can be tight in $$$256$$$ megabytes, while bitset fixes this issue. The space complexity is reduced to $$$\mathcal{O}(n\sqrt{n}/w)$$$ using a bitset. The solution is complete. The AC submission is here.

Now here is the catch. I did **not** prove the assumptions along the proof. So I am asking, can anyone prove the assumptions and thus the solution, or disprove the solution (thus finding a hack)? Please let me know in the comments if anyone can either prove or disprove this.

You can save up some space by generating and storing random seed before shuffle and reshuffling answer from known seed later.

When $$$a=[1,1,2,2,\ldots,2k+1,2k+1]$$$, the only two subsets with the same xor are $$$[1,1,\ldots,k,k]$$$ and $$$[k+2,k+2,\ldots,2k+1,2k+1]$$$.

Hack

Oh yes, I definitely did not think about that case. Didn't expect that it wasn't in the tests either.

omg bitset