I stumbled upon a problem that can be reduced to the following: given 22-bit masks *a*_{1}, *a*_{2}, ..., *a*_{n} (*n* < 10^{6}), calculate d[m] = the number of i such that a[i] is a submask of m (i.e. a[i] & m == a[i]).

The jury solves it like this:

```
for (int i = 0; i < n; i++) {
d[a[i]]++;
}
for (int i = 0; i < 22; i++) {
for (int mask = 0; mask < (1 << 22); mask++) {
if ((mask & (1 << i)) == 0) {
d[mask | (1 << i)] += d[mask];
}
}
}
```

I came up with a similar idea while trying to solve the problem myself, but discarded it immediately because I felt like it ought to double-count some submasks. I seem to have managed to convince myself that it doesn't, but I still don't have an intuitive understanding for why this approach works. Is this some kind of a trick that I don't know? Or perhaps somebody could just explain the reasoning behind this solution to me?

Thanks!

I tried to explain this here: http://mirror.codeforces.com/blog/entry/10476 (problem Div1. E, section "Solution fount by other contestants").

In short, let's create a function f(left, right): calculate d[j] for every j with property that left <= j <= right considering only a[i] such as left <= a[i] <= right. Answer should be fount after doing f(0, 2^22 — 1).

When you calculate f(left, right), let med = (left + right) / 2. We do divide and conquer. Let's call f(left, med) and f(med + 1, right). Now we have to "merge" the results. In [left, med], all numbers are equal to those from [med + 1, right] after switching the MSB of [med + 1, right] from 1 to 0. Hence, for i in [med + 1, right], i — (med + 1) is a submask of i (and all submasks of i — (med + 1) are submasks of i). So, we need to add d[i] += d[i — (med + 1)], for i in [med + 1, right]. After this merge step, nothing is counted twice, because when we do d[i] += d[i — (med + 1)], we add only numbers from [left, med] range. As the current range is [med + 1, right], those numbers were not added before.

Now the recursion code and yours are equivalent. That is just the iterative implementation of the divide and conquer solution explained :) BTW, this seemed magic for me, too, until I've stopped thinking it as an iterative code and I've started thinking it as divide&conquer.

let us see where a[i] will be counted in d[]. before first iteration it is counted in d[a[i] | 0]

let i1, i2, i3 ... in be the unset bits in a[i] such that i1 < i2 < i3... after first iteration a[i] increases count of d[a[i] | (1 << i1)]. so it affects d[a[i] | (1 << i1)] and d[a[i] | 0] after second iteration d[a[i] | (1 << i1) | (1 << i2)], d[a[i] | (1 << i2)], d[a[i] | 0], d[a[i] | (1 << i1)].

At each iteration of i1, i2.. affected indices are doubled. so total indices where a[i] is counted is 2^n, and all of the indices are supermasks of a[i]. Since a[i] has 2^n supermasks, a[i] is counted only once at each supermask.

I now the blog is old but I want to add a useful link here (for anyone who needs it). This is called sum over subset dynamic programming.

This is just prefix sum on a higher-dimension array. Probably an array like a[5][5][5][5][5] might be easier to imagine how this works.