thecortex's blog

By thecortex, history, 8 years ago, In English

How is problem 712D - Memory and Scores solved using Dynamic Programming?

What is the memoization array structure?

What is the base case?

What is the greedy choice equation?

Thanks in advance!

  • Vote: I like it
  • +2
  • Vote: I do not like it

| Write comment?
»
8 years ago, # |
  Vote: I like it +1 Vote: I do not like it

It's easy to come up with the solution of saving scores of two people and DP, but it'll need too much space.

We can save the differ of two people's score instead, then we just need O(t2k) space(or O(tk) using only 2 demension).

Each turn is independent, so round of A is  + [ - k, k], round of B is  - [ - k, k], they're the same, so the model can be changed into things like: initially you have value of a - b, 2t round, add [ - k.k] every round, how many possible games make value positive at the end.

In this way, the solution is O(t2k2) time because every state contributes to 2k + 1 state. So we shall optimize it. As it goes that , we can let , then dp[i][j] = dpn[i - 1][j + k] - dpn[i - 1][j - k - 1], we can update two arrays in O(1) time now, time is linear to space now, done.

»
8 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

My solution is probably the same as ned_chu's. I used two arrays for dp: one for the first player and the other one for second player. Size is a little more than 4 * 105 including the shift for negative indices. During calculation of DP, I used temporary array to calculate dp of next state in order to prevent recalculation of current state. This might be a little confusing, do let's start from the naive solution first.

The naive solution:
dpa[i][j] = Number of ways to add [k, k] i times to player a's score and end up with result j.
dpb[i][j] = Same idea, but with player b instead

Transition:


Base case:
dpa[0][a] = dpb[0][b] = 1

To calculate the final answer, we just do

The range for j = [ - 2kt, 2kt] (this might have a or b plus or minus, but you get the idea), which is around 4 * 105 if not greater. Multiply this by range of i = 2t = 200 and you get size of 8 * 107 and also multiply by 64 for long long. Finally you get 512 * 107 which definitely exceeds 512 MB memory limit. So, we have to create a more memory efficient solution. Notice that for each transition, we only need to use i - 1. We will exploit this fact and eliminate the first dimension (i) completely.

More optimal solution:
As state above, we will eliminate the first dimension of dpa[i][j] and dpb[i][j] completely. So I will become dpa[j] and dpb[j]. This is when we use our prefix sum trick. I think ned_chu has already mentioned this but I will mention it again just for completeness :).

Transitions:


Base case:

So now we can calculate answer for dpa and dpb in O(1). However, you need to be careful. If you were to do use to same array and update dpa[j] = dpa[j - 1] + dpa[j + k] - dpa[j - k - 1] you will run into a WA because you are duplicating. There are a couple ways to avoid this. One way is to store the next state in a separate tmp array: tmp[j] = tmp[j - 1] + dpa[j + k] - dpa[j - k - 1] and after you have iterated all j, you set dpa[j] = tmp[j] to update the answer. Another method is instead of using extra tmp array, you can make dpa and dpb have two dimensions like this: dpa[0, 1][j] and dpb[0, 1][j]. However, you need to swap every iteration of i you calculate. It really depends on your preference, and there may be other methods available that I don't know (I'm still cyan after all lol).

Using this method, the final answer with be