TunaMayoSandwich's blog

By TunaMayoSandwich, history, 5 months ago, In English

As a pupil who has started training intensely not too long ago there are a lot of problems I come across that I can't figure out by myself, at least not in a contest's time. Most of the time there are great resources available online and on this very site to learn more and understand all the nuances implied in these problems, but I recently came across what is regarded as a "standard problem" and couldn't find satisfying explanations about it so I decided to write one myself hoping it'll be of some help to other beginners. I am going to discuss the problem of finding the maximum bitwise XOR of a subarray in O(nlogC) as it pertains to the problem 1847C - Vampiric Powers, anyone?. I'll try to be as thorough as possible in the first part and as formal as possible in the second, only skipping over basic concepts (what is a subarray, what is bitwise xor, etc). Hope you like it.

Task

Given an array a, let's define f(l, r) = xor of elements of a from index l to r, where 0 <= l < r <= n-1. Find max f(l,r) in O(nlogC) where C is the greatest value in the array.

Solution

The idea is to use the fact that f(0, r) = f(0, l-1) ^ f(l, r) -> f(0, r) ^ f(0, l-1) = f(l, r). This means that one can traverse the array and maintain a variable pre_xor = f(0, i) where i is the current position, then all that remains is to find l-1 such that pre_xor ^ f(0, l-1) is maximised, this way one has found max (f(0, i) ^ f(0, l-1)) = max f(l, i) and solves the problem by iterating over all values of i from left to right.

The ideal way to find l-1 is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.

struct trieNode{
	ll value;
	trieNode *arr[2];
};

trieNode *newNode(){
	trieNode *temp = new trieNode;
	temp->value = 0;
	temp->arr[0] = NULL;
	temp->arr[1] = NULL;
	return temp;
}

Start traversing a from index 0 and insert pre_xor into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing pre_xor. Append pre_xor to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.

void insert(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
 
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[val] == NULL) temp->arr[val] = newNode();
		temp = temp->arr[val];
	}
 
	temp->value = pre_xor;
}

Given pre_xor the ideal value to xor it with is one that has 0s where pre_xor has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among f(0, l-1) for all 0 <= l <= i. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of pre_xor is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal f(0, l-1).

ll query(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
	
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[!val] != NULL) temp = temp->arr[!val];
		else if (temp->arr[val] != NULL) temp = temp->arr[val];
	}
	
	return pre_xor^(temp->value);
}

ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){
	if(n == -1) n = arr.size();
	if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());
	ll max_bits = ceil(log2(max(1, max_val)));
	
	trieNode *root = newNode();
	insert(root, 0, max_bits);
	ll result = 0, pre_xor =0;
 
	for (ll i=0; i<n; i++){
		pre_xor = pre_xor^arr[i];
		insert(root, pre_xor, max_bits);
		result = max(result, query(root, pre_xor, max_bits));
	}
	
	return result;
}

Why does this solve 1847C - Vampiric Powers, anyone?

It's not at all evident that problem 1847C tasks one to find the max xor subarray, but indeed it does, with just one little adjustment. The function to optimise is no longer f(l, r) with 0 <= l < r <= n-1, but f(l, r) with 0 <= l <= r <= n-1 where f(i, i) is equal by definition to a[i] for all 0 <= i <= n-1. This means that if there's a value in the array which is not smaller than any xor will ever be then one needs to return that value. This is easily solved by inserting 0 into the trie at the beginning so that each number can be xored with 0 (that is, remain the same number) if there isn't anything better to xor it with. The piece of code posted above already does so.

If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is f(l, n-1), but allow to do so multiple times. Let's show that each xor of the form f(l, r) can be constructed from 2 xors of the form f(l1, n-1), f(l2, n-1). This will prove that the max xor subarray is not bigger than the solution to the problem at hand.

It goes as simple as this f(l, r) = f(l, n-1) ^ f(r+1, n-1), but there's a little caveat: after having performed the first xor the length of array a will have grown to n+1 so, formally, the process is described as follows:

a[0, ..., n-1] -> f(l, n-1) -> a[0, ..., n] where a[n] = f(l, n-1) -> f(r+1, n) = f(r+1, n-1) ^ a[n] = f(r+1, n-1) ^ f(l, n-1) = f(l, r)

To end the proof that the solution to problem 1847C is the max xor subarray it's still required to show that vampire powers cannot create a solution that's better than max f(l, r), let's do it by contradiction. Suppose vampire powers can create a solution that's better than max f(l, r), this means that after having appended to the array several elements and having expanded it to be like a[0, ..., m] where a[i] = f(li, i-1) for all n <= i <= m, 0 <= li < i-1 one finds himself able to create a solution f(l, m) > max xor subarray.

f(l1, m) = f(l1, m-1) ^ a[m] = f(l1, m-1) ^ f(l2, m-1). Let l+ = max(l1, l2) and l- = min(l1, l2), then f(l1, m) = f(l-, l+). Repeating this process eventually yields the equality f(l1, m) = f(l, r) where 0 <= l < r <= n-1 which is absurd and concludes the proof and my first post. Have a nice rest of the day.

Full text and comments »

  • Vote: I like it
  • +19
  • Vote: I do not like it