As a pupil there are a lot of problems I come across that I can't figure out by myself, at least not in a contest's time. Most of the time there are great resources available online and on this very site to learn more and understand all the nuances implied in these problems, but I recently stumbled upon what is regarded as a "standard problem" and couldn't find satisfying explanations about it so I decided to write one myself hoping it'll be of some help to other beginners. I'm going to discuss the problem of finding the maximum bitwise XOR of a subarray in $O(n \cdot log(C))$ as it pertains to the problem [problem:1847C]. I'll try to be as thorough as possible in the first part and as formal as possible in the second, only skipping over basic concepts (what is a subarray, what is bitwise xor, etc).↵
↵
#### Task↵
Given an array $a$, let's define $f(l, r)$ as the xor of elements of $a$ from index $l$ to $r$, where $0 \leq l < r < n$. Find $\text{max} f(l, r)$ in $O(n \cdot log(C))$ where $C$ is the greatest value in the array.↵
↵
#### Solution↵
The idea is to use the fact that $f(0, r) = f(0, l-1) \oplus f(l, r) \rightarrow f(0, r) \oplus f(0, l-1) = f(l, r)$. This means that one can traverse the array and maintain a variable $\text{pre_xor} = f(0, i)$ where $i$ is the current position, then all that remains to do is to find $l-1$ such that $\text{pre_xor} \oplus f(0, l-1)$ is maximised, this way one has found $\text{max} (f(0, i) \oplus f(0, l-1)) = \text{max} f(l, i)$ and solves the problem by iterating over all values of $i$ from left to right.↵
↵
The ideal way to find $l-1$ is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.↵
↵
~~~~~↵
struct trieNode{↵
ll value;↵
trieNode *arr[2];↵
};↵
↵
trieNode *newNode(){↵
trieNode *temp = new trieNode;↵
temp->value = 0;↵
temp->arr[0] = NULL;↵
temp->arr[1] = NULL;↵
return temp;↵
}↵
~~~~~↵
↵
Start traversing $a$ from index 0 and insert $\text{pre_xor}$ into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing $\text{pre_xor}$. Append $\text{pre_xor}$ to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.↵
↵
↵
~~~~~↵
void insert(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[val] == NULL) temp->arr[val] = newNode();↵
temp = temp->arr[val];↵
}↵
↵
temp->value = pre_xor;↵
}↵
~~~~~↵
↵
Given $\text{pre_xor}$ the ideal value to xor it with is one that has 0s where $\text{pre_xor}$ has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among $f(0, l-1)$ for all $0 \leq l \leq i$. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of $\text{pre_xor}$ is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal $f(0, l-1)$.↵
↵
~~~~~↵
ll query(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[!val] != NULL) temp = temp->arr[!val];↵
else if (temp->arr[val] != NULL) temp = temp->arr[val];↵
}↵
↵
return pre_xor^(temp->value);↵
}↵
↵
ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){↵
if(n == -1) n = arr.size();↵
if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());↵
ll max_bits = 64 - __builtin_clzll(max_val | 1);↵
↵
trieNode *root = newNode();↵
insert(root, 0, max_bits);↵
ll result = 0, pre_xor =0;↵
↵
for (ll i=0; i<n; i++){↵
pre_xor = pre_xor^arr[i];↵
insert(root, pre_xor, max_bits);↵
result = max(result, query(root, pre_xor, max_bits));↵
}↵
↵
return result;↵
}↵
~~~~~↵
↵
#### Why does this solve [problem:1847C]↵
It's not at all evident that problem 1847C tasks one to find the max xor subarray, but indeed it does, withjust one little adjustment. The function to optimise is still $f(l, r)$ but instead of $0 \leq l < r < n$, $0 \leq l \leq r < n$ stands where $f(i, i)$ is equal by definition to $a[i]$ for all $0 \leq i < n$. This means that if there's a value in the array which is not smaller than any xor will ever be then one needs to return that value. This is easily solved by inserting 0 into the trie at the beginning so that each number can be xored with 0 (that is, remain the same number) if there isn't anything better to xor it with. The piece of code posted above already does so.↵
↵
If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is $f(l, n-1)$, but allow to do so multiple times. Let's show that each xor of the form $f(l, r)$ can be constructed from 2 xors of the form $f(l_{1}, n-1)$, $f(l_{2}, n-1)$. This will prove that the max xor subarray is not bigger than the solution to the problem at hand.↵
↵
It goes as simple as this $f(l, r) = f(l, n-1) \oplus f(r+1, n-1)$, but there's a little caveat: after having performed the first xor the length of array $a$ will have grown to n+1 so, formally, the process is described as follows:↵
↵
$a[0, ..., n-1] \rightarrowf(l, n-1) \rightarrow a[0, ..., n]$ where $a[n] = f(l, n-1) \rightarrow a[0, ..., n+1]$ where $ a[n+1] = f(r+1, n) = f(r+1, n-1) \oplus a[n] = f(r+1, n-1) \oplus f(l, n-1) = f(l, r)$↵
↵
To end the proof that the solution to problem 1847C is the max xor of a subarray it's still required to show that vampire powers cannot create a solution that's better than $\text{max} f(l, r)$, let's do it by contradiction.↵
Suppose vampire powers can create a solution that's better than $\text{max} f(l, r)$, this means that after having appended to the array several elements and having expanded it to be like $a[0, ..., m]$ where $a[i] = f(l_{i}, i-1)$ for all $n \leq i \leq m$, $0 \leq l_{i} < i-1$ one finds himself able to create a solution $f(l, m) >$ max xor of a subarray.↵
↵
$f(l_{1}, m) = f(l_{1}, m-1) \oplus a[m] = f(l_{1}, m-1) \oplus f(l_{2}, m-1)$. Let $l_{+} = max(l_{1}, l_{2})$ and $l_{-} = min(l_{1}, l_{2})$, then $f(l_{1}, m) = f(l_{-}, l_{+})$. Repeating this process eventually yields the equality $f(l_{1}, m) = f(l, r)$ where $0 \leq l < r < n$ which is absurd and concludes the proof and this post. Have a nice rest of the day.
↵
#### Task↵
Given an array $a$, let's define $f(l, r)$ as the xor of elements of $a$ from index $l$ to $r$, where $0 \leq l < r < n$. Find $\text{max} f(l, r)$ in $O(n \cdot log(C))$ where $C$ is the greatest value in the array.↵
↵
#### Solution↵
The idea is to use the fact that $f(0, r) = f(0, l-1) \oplus f(l, r) \rightarrow f(0, r) \oplus f(0, l-1) = f(l, r)$. This means that one can traverse the array and maintain a variable $\text{pre_xor} = f(0, i)$ where $i$ is the current position, then all that remains to do is to find $l-1$ such that $\text{pre_xor} \oplus f(0, l-1)$ is maximised, this way one has found $\text{max} (f(0, i) \oplus f(0, l-1)) = \text{max} f(l, i)$ and solves the problem by iterating over all values of $i$ from left to right.↵
↵
The ideal way to find $l-1$ is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.↵
↵
~~~~~↵
struct trieNode{↵
ll value;↵
trieNode *arr[2];↵
};↵
↵
trieNode *newNode(){↵
trieNode *temp = new trieNode;↵
temp->value = 0;↵
temp->arr[0] = NULL;↵
temp->arr[1] = NULL;↵
return temp;↵
}↵
~~~~~↵
↵
Start traversing $a$ from index 0 and insert $\text{pre_xor}$ into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing $\text{pre_xor}$. Append $\text{pre_xor}$ to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.↵
↵
↵
~~~~~↵
void insert(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[val] == NULL) temp->arr[val] = newNode();↵
temp = temp->arr[val];↵
}↵
↵
temp->value = pre_xor;↵
}↵
~~~~~↵
↵
Given $\text{pre_xor}$ the ideal value to xor it with is one that has 0s where $\text{pre_xor}$ has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among $f(0, l-1)$ for all $0 \leq l \leq i$. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of $\text{pre_xor}$ is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal $f(0, l-1)$.↵
↵
~~~~~↵
ll query(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[!val] != NULL) temp = temp->arr[!val];↵
else if (temp->arr[val] != NULL) temp = temp->arr[val];↵
}↵
↵
return pre_xor^(temp->value);↵
}↵
↵
ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){↵
if(n == -1) n = arr.size();↵
if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());↵
ll max_bits = 64 - __builtin_clzll(max_val | 1);↵
↵
trieNode *root = newNode();↵
insert(root, 0, max_bits);↵
ll result = 0, pre_xor =0;↵
↵
for (ll i=0; i<n; i++){↵
pre_xor = pre_xor^arr[i];↵
insert(root, pre_xor, max_bits);↵
result = max(result, query(root, pre_xor, max_bits));↵
}↵
↵
return result;↵
}↵
~~~~~↵
↵
#### Why does this solve [problem:1847C]↵
It's not at all evident that problem 1847C tasks one to find the max xor subarray, but indeed it does, with
↵
If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is $f(l, n-1)$, but allow to do so multiple times. Let's show that each xor of the form $f(l, r)$ can be constructed from 2 xors of the form $f(l_{1}, n-1)$, $f(l_{2}, n-1)$. This will prove that the max xor subarray is not bigger than the solution to the problem at hand.↵
↵
It goes as simple as this $f(l, r) = f(l, n-1) \oplus f(r+1, n-1)$, but there's a little caveat: after having performed the first xor the length of array $a$ will have grown to n+1 so, formally, the process is described as follows:↵
↵
$a[0, ..., n-1] \rightarrow
↵
To end the proof that the solution to problem 1847C is the max xor of a subarray it's still required to show that vampire powers cannot create a solution that's better than $\text{max} f(l, r)$, let's do it by contradiction.↵
Suppose vampire powers can create a solution that's better than $\text{max} f(l, r)$, this means that after having appended to the array several elements and having expanded it to be like $a[0, ..., m]$ where $a[i] = f(l_{i}, i-1)$ for all $n \leq i \leq m$, $0 \leq l_{i} < i-1$ one finds himself able to create a solution $f(l, m) >$ max xor of a subarray.↵
↵
$f(l_{1}, m) = f(l_{1}, m-1) \oplus a[m] = f(l_{1}, m-1) \oplus f(l_{2}, m-1)$. Let $l_{+} = max(l_{1}, l_{2})$ and $l_{-} = min(l_{1}, l_{2})$, then $f(l_{1}, m) = f(l_{-}, l_{+})$. Repeating this process eventually yields the equality $f(l_{1}, m) = f(l, r)$ where $0 \leq l < r < n$ which is absurd and concludes the proof and this post. Have a nice rest of the day.