Before reading, please read the solutions to the first subtask (and hopefully identify the two solutions that are helpful in solving the second subtask).
Directly applying the most optimal DP solution to the first subtask, we have a solution with $$$\mathcal{O}(n^2)$$$ time complexity. However, it seems hard to improve further as even our optimal DP has $$$\mathcal{O}(n)$$$ states and it looks like that can't be helped with.
However, we somehow already have enough resources to improve it even further; there is a way to know all values $$$\operatorname{opt}(1,r),\operatorname{opt}(2,r),\ldots,\operatorname{opt}(r,r)$$$ without using $$$\mathcal{O}(n)$$$ states.
Recall the following:
- There will be only at most $$$2\log_2(a_r)$$$ different values of $$$\operatorname{opt}(i,r)$$$. (Direct consequence of 2.1.a)
- Our DP to compute $$$\operatorname{opt}(1,r)$$$ is monotonic, i.e. $$$\operatorname{opt}(i,r) \ge \operatorname{opt}(i+1,r)$$$. (= 1.b)
Using the above, we can infer the following.
- We can model another DP by swapping the places of cost and index, and this new DP will only have at most $$$C=128 \ge 2\log_2(a_r)$$$ states.
This "swapping cost and index" might sound like it's so unexpected, but surely you may have seen such a DP formulation if you have solved many DP problems. Namely, this technique is used in Knapsack problems when the cost is small and the capacity is huge.
So, how would we formulate this DP? Let us define a new function:
- $$$\operatorname{opt}'(c,r)$$$: minimum value of $$$l$$$ such that $$$\operatorname{opt}(l,r) \le c$$$ ($$$0 \le c \le C$$$, $$$1 \le r \le n$$$).
Then, we can see by 1.b that $$$\operatorname{opt}'(c+1,r) \le \operatorname{opt}'(c,r)$$$ for all $$$0 \le c$$$. Also, after we know $$$\operatorname{opt}'(c,r)$$$ and $$$\operatorname{opt}'(c+1,r)$$$, we can compute the range of $$$l$$$ such that $$$\operatorname{opt}(l,r) = c$$$, and it lets us find the sum of $$$\operatorname{opt}(l,r)$$$ over all $$$1 \le i \le r$$$ in $$$\mathcal{O}(C)$$$ time.
So let's think of how we will compute values of $$$\operatorname{opt}'(c,r)$$$ for one $$$r$$$ in $$$\mathcal{O}(C)$$$ time.
Let us first precompute the minimum $$$l$$$ such that $$$\mathtt{cost}([a_l,a_{l+1},\ldots,a_i]) \le c$$$ for all $$$i$$$ and $$$c$$$. By 2.2.a, it suffices to compute this for only $$$1 \le c \le 3$$$. We will denote this as $$$L(i,c)$$$. These values can be easily preprocessed using walk on segment tree in $$$\mathcal{O}(n \log n)$$$ time. This will come in handy for our new DP formulation.
In the formulation of our original $$$\mathcal{O}(n)$$$ DP for the first subtask, we required a pointer for each cost $$$1 \le c \le 3$$$ on one side. On the new formulation, however, we will need some transition directly for indices. This is where the values $$$L(i,c)$$$ comes in handy.
We can find the following correct DP transition:
- $$$\operatorname{opt}'(c,r)=r+1$$$ (when $$$c \le 0$$$)
- $$$\operatorname{opt}'(c,r)= \min_\limits{1 \le k \le 3} \left ({\min_\limits{\max(\operatorname{opt}'(c-k,r)-1,0) \le i \le r}{L(i,k)}}\right )$$$ (when $$$c \gt 0$$$)
The correctness of this DP transition is shown by 1.b and 2.2.a.
Because we have already preprocessed the values of $$$L(i,1)$$$, $$$L(i,2)$$$, $$$L(i,3)$$$ in $$$\mathcal{O}(n \log n)$$$ time, we can now construct a RMQ data structure for these values in $$$\mathcal{O}(n \log n)$$$ time for $$$\mathcal{O}(1)$$$ query time. This allows us to process all values of $$$\operatorname{opt}'(c,r)$$$ for one $$$r$$$ in $$$\mathcal{O}(C)$$$ time according to the transition described above.
Computing this for each $$$r$$$ independently, we have now computed the sum of $$$\operatorname{opt}(l,r)$$$ over all $$$1 \le l \le r \le n$$$ in $$$\mathcal{O}(n \log n + n \log a)$$$ time. The problem is solved.