This post will be a tutorial on 313B - Ilya and Queries, as well as some general python I/O optimizations that provided me with enough of a speedup to have my solution finish in the allotted time.
Solution
As an example, consider the input string #..###. From it, we create arrays
S = [ # . . # # # ]W = [ 0 1 0 1 1 X ]=[ w1 w2 w3 w4 w5 w6 ]
where
Sis an array encoding the input string.W[i]= 1 ifS[i]==S[i+1]and 0 otherwise (`W[-1] is undefined).
Let f(l, r) be a function where f(l, r) = w_{l} + w_{l+1} + ... + w{r-1}. Clearly f corresponds to the quantity of interest in 313B - Ilya and Queries.
A naive solution is to compute f(l, r) iteratively (i.e. by looping from l to r-1). However, the runtime of this algorithm is O(nm) = O(n^2) as 0 <= m <= n. Hence, in the worst case we will do (10^5)^2 = 10^10 operations as 0 <= n <= 10^5, which it too large a number for the allotted time. Hence we need to do better.
One idea is to pre-compute f(l, r) for 1 <= l < r <= n. However, it turns out this table would have 10^10 elements in it, so it won't do.
It turns out that instead of pre-computing every possible f(l, r) pair, we can pre-compute a smaller number of values which we can use very quickly to compute any f(l, r) we want. Consider the following function F(r), where F(r) = w_1 + w_2 + ... + w_{r-1}. Notice thatf(l, r)=F(l)=F(r-1). To see, this considerf(2, 5)`:
`f(2, 5)` = `w_2` + `w_3` + `w_4`
= `w_1` + `w_2` + `w_3` + `w_4`
- `w_1`
= F(5)
- F(2)
Recall the goal of the problem is to compute the number of elements in the range [l, r) whose right neighbor is the same as itself. A naive solution is to compute the sum of corresponding elements in X (e.g. sum up X[l+1] + X[l+2] + ... + X[r]. However, this is O(n) for a single query as 0 <= l, r <= n. Also, just processing each query is O(n) as 0 <= m <= n, so the total runtime of this approach is O(n^2). Hence the total number of timesteps for this solution in the worst case is 10^10 as 1 <= n <= 10^5, which is too much computation for the allotted time.
To speed things up, we can use A. The solution is simply the expression A[r] — A[l]. To see why this is the case, consider the example [l=3, r=6). A[3] is the number of elements up to and including index 2 whose right neighbors are the same as themselves. A[5] is the same, up to index 4, which is the index we want to go up to.
[ 0 0 1 0 1 1 ]
|---| => A[3] = 1
|---------| => A[6] = 3
The reason this works is because the ith position of A is the answer to [l=0, r=i). To recover [l, r) for arbitrary l and r, we simply subtract A[l] from A[r].
I/O Optimizations
Even with this optimization, my python solution was still exceeding the time limit. After profiling my code, I deduced that the way I was handling I/O was the bottleneck. I replaced
for _ in range(m):
l, r = [int(num) for num in raw_input().split()]
print A[r] - A[l]
with
lines = sys.stdin.readlines()
inputs = [[int(num)-1 for num in line.split()] for line in lines]
outputs = [str(SA[r]-SA[l]) for l, r in inputs]
print '\n'.join(outputs)
Eliminating the repeated calls to raw_input() and print gave me a substantial speedup which was fast enough to have my solution finish within the allotted time!



