Max XOR of a subarray in O(nlog(C))

Revision en14, by TunaMayoSandwich, 2024-07-27 19:55:49

As a pupil there are a lot of problems I come across that I can't figure out by myself, at least not in a contest's time. Most of the time there are great resources available online and on this very site to learn more and understand all the nuances implied in these problems, but I recently stumbled upon what is regarded as a "standard problem" and couldn't find satisfying explanations about it so I decided to write one myself hoping it'll be of some help to other beginners. I'm going to discuss the problem of finding the maximum bitwise XOR of a subarray in $$$O(n \cdot log(C))$$$ as it pertains to the problem 1847C - Vampiric Powers, anyone?. I'll try to be as thorough as possible in the first part and as formal as possible in the second, only skipping over basic concepts (what is a subarray, what is bitwise xor, etc).

Task

Given an array $$$a$$$, let's define $$$f(l, r)$$$ as the xor of elements of $$$a$$$ from index $$$l$$$ to $$$r$$$, where $$$0 \leq l < r < n$$$. Find $$$\text{max} f(l, r)$$$ in $$$O(n \cdot log(C))$$$ where $$$C$$$ is the greatest value in the array.

Solution

The idea is to use the fact that $$$f(0, r) = f(0, l-1) \oplus f(l, r) \implies f(l, r) = f(0, r) \oplus f(0, l-1)$$$. This means that one can traverse the array and maintain a variable $$$\text{pre_xor} = f(0, i)$$$ where $$$i$$$ is the current position, then all that remains to do is to find $$$l-1$$$ such that $$$\text{pre_xor} \oplus f(0, l-1)$$$ is maximised, this way one has found $$$\text{max} (f(0, i) \oplus f(0, l-1)) = \text{max} f(l, i)$$$ and solves the problem by iterating over all values of $$$i$$$ from left to right.

The ideal way to find $$$l-1$$$ is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.

struct trieNode{
	ll value;
	trieNode *arr[2];
};

trieNode *newNode(){
	trieNode *temp = new trieNode;
	temp->value = 0;
	temp->arr[0] = NULL;
	temp->arr[1] = NULL;
	return temp;
}

Start traversing $$$a$$$ from index 0 and insert $$$\text{pre_xor}$$$ into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing $$$\text{pre_xor}$$$. Append $$$\text{pre_xor}$$$ to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.

void insert(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
 
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[val] == NULL) temp->arr[val] = newNode();
		temp = temp->arr[val];
	}
 
	temp->value = pre_xor;
}

Given $$$\text{pre_xor}$$$ the ideal value to xor it with is one that has 0s where $$$\text{pre_xor}$$$ has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among $$$f(0, l-1)$$$ for all $$$0 \leq l \leq i$$$. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of $$$\text{pre_xor}$$$ is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal $$$f(0, l-1)$$$.

ll query(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
	
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[!val] != NULL) temp = temp->arr[!val];
		else if (temp->arr[val] != NULL) temp = temp->arr[val];
	}
	
	return pre_xor^(temp->value);
}

ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){
	if(n == -1) n = arr.size();
	if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());
	ll max_bits = 64 - __builtin_clzll(max_val | 1);
	
	trieNode *root = newNode();
	insert(root, 0, max_bits);
	ll result = 0, pre_xor =0;
 
	for (ll i=0; i<n; i++){
		pre_xor = pre_xor^arr[i];
		insert(root, pre_xor, max_bits);
		result = max(result, query(root, pre_xor, max_bits));
	}
	
	return result;
}

Why does this solve 1847C - Vampiric Powers, anyone?

It's not at all evident that problem 1847C tasks one to find the max xor of a subarray, but indeed it does, with one little adjustment. The function to optimise is still $$$f(l, r)$$$ but instead of $$$0 \leq l < r < n$$$, $$$0 \leq l \leq r < n$$$ stands where $$$f(i, i)$$$ is equal by definition to $$$a[i]$$$ for all $$$0 \leq i < n$$$. This means that if there's a value in the array which is not smaller than any xor will ever be then one needs to return that value. This is easily solved by inserting 0 into the trie at the beginning so that each number can be xored with 0 (that is, remain the same number) if there isn't anything better to xor it with. The piece of code posted above already does so.

If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is $$$f(l, n-1)$$$, but allow to do so multiple times. Let's show that each xor of the form $$$f(l, r)$$$ can be constructed from 2 xors of the form $$$f(l_{1}, n-1)$$$, $$$f(l_{2}, n-1)$$$. This will prove that the xor of any subarray is not bigger than the solution to the problem at hand.

It goes as simple as this $$$f(l, r) = f(l, n-1) \oplus f(r+1, n-1)$$$, but there's a little caveat: after having performed the first xor the length of array $$$a$$$ will have grown to n+1 so, formally, the process is described as follows:

$$$a[0, ..., n-1] \rightarrow a[0, ..., n]$$$ where $$$a[n] = f(l, n-1) \rightarrow a[0, ..., n+1]$$$ where $$$ a[n+1] = f(r+1, n) = f(r+1, n-1) \oplus a[n] = f(r+1, n-1) \oplus f(l, n-1) = f(l, r)$$$

To end the proof that the solution to problem 1847C is the max xor of a subarray it's still required to show that vampire powers cannot create a solution that's better than $$$\text{max} f(l, r)$$$, let's do it by contradiction. Suppose vampire powers can create a solution that's better than $$$\text{max} f(l, r)$$$, this means that after having appended to the array several elements and having expanded it to be like $$$a[0, ..., m]$$$ where $$$a[i] = f(l_{i}, i-1)$$$ for all $$$n \leq i \leq m$$$, $$$0 \leq l_{i} < i-1$$$ one finds himself able to create a solution $$$f(l, m) >$$$ max xor of a subarray.

$$$f(l_{1}, m) = f(l_{1}, m-1) \oplus a[m] = f(l_{1}, m-1) \oplus f(l_{2}, m-1)$$$. Let $$$l_{+} = max(l_{1}, l_{2})$$$ and $$$l_{-} = min(l_{1}, l_{2})$$$, then $$$f(l_{1}, m) = f(l_{-}, l_{+})$$$. Repeating this process eventually yields the equality $$$f(l_{1}, m) = f(l, r)$$$ where $$$0 \leq l < r < n$$$ which is absurd and concludes the proof and this post. Have a nice rest of the day.

Tags bitmask, xor, arrays

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en15 English TunaMayoSandwich 2024-07-27 20:02:31 5 Tiny change: 'leq l_{i} < i-1$ one ' -> 'leq l_{i} \leq i-1$ one '
en14 English TunaMayoSandwich 2024-07-27 19:55:49 49 Tiny change: ' f(l, r) \rightarrow f(0, r) \' -> ' f(l, r) \implies f(0, r) \'
en13 English TunaMayoSandwich 2024-07-26 21:58:08 66 Tiny change: 'eq m$, $0 leq l_{i} ' -> 'eq m$, $0 \leq l_{i} ' (published)
en12 English TunaMayoSandwich 2024-07-26 21:43:15 42
en11 English TunaMayoSandwich 2024-07-26 21:29:33 675 Tiny change: 'barray in O(nlogC) as it per' -> 'barray in $O(n \cdot logC)$ as it per' (saved to drafts)
en10 English TunaMayoSandwich 2024-01-01 03:59:17 2 Tiny change: ' all 0 <= I <= n-1. T' -> ' all 0 <= i <= n-1. T'
en9 English TunaMayoSandwich 2023-12-18 23:27:31 2 Tiny change: 'y case, than the vamp' -> 'y case, then the vamp'
en8 English TunaMayoSandwich 2023-12-18 20:31:13 2 Tiny change: 'of **a** form index l ' -> 'of **a** from index l '
en7 English TunaMayoSandwich 2023-12-17 23:01:47 2 Tiny change: 'st value it the array' -> 'st value in the array'
en6 English TunaMayoSandwich 2023-12-17 16:54:31 23 Tiny change: 'having expended it to' -> 'having expanded it to'
en5 English TunaMayoSandwich 2023-12-17 14:52:49 4 Tiny change: 'ray in O(n) as it pe' -> 'ray in O(nlogC) as it pe'
en4 English TunaMayoSandwich 2023-12-17 14:51:42 51
en3 English TunaMayoSandwich 2023-12-17 14:16:27 0 (published)
en2 English TunaMayoSandwich 2023-12-17 14:09:12 8183 Tiny change: 'ains to this problem [' -> 'ains to the problem ['
en1 English TunaMayoSandwich 2023-12-17 13:25:25 2842 Initial revision (saved to drafts)