How is Problem D — Memory and Scores solved using Dynamic Programming?

thecortex's blog

By thecortex, 10 years ago, in English

How is problem 712D - Memory and Scores solved using Dynamic Programming?

What is the memoization array structure?

What is the base case?

What is the greedy choice equation?

Thanks in advance!

dp, dynamic programming


Comments (2)


ned_chu

10 years ago

The obvious DP stores both players' scores as the state, but that needs too much space.

We can store the difference of the two scores instead; then we only need O(t²k) space (or O(tk) if we keep just the two most recent rows).

Every turn is independent, so a round of A adds a value in [-k, k] to the difference and a round of B subtracts a value in [-k, k]. By symmetry these are the same, so the model becomes: initially you have the value a - b, there are 2t rounds, each round adds a value in [-k, k], and we count how many games end with a positive value.

Done naively, this is O(t²k²) time, because every state contributes to 2k + 1 states, so we need to optimize it. Since $dp[i][j] = \sum_{x=j-k}^{j+k} dp[i-1][x]$, we can let $dpn[i][j] = \sum_{x \le j} dp[i][x]$ be the prefix sums; then dp[i][j] = dpn[i - 1][j + k] - dpn[i - 1][j - k - 1]. Each entry of both arrays is now updated in O(1), so the time is linear in the number of states, and we are done.
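The difference model with prefix sums can be sketched in C++ roughly as follows. This is a minimal sketch, not ned_chu's actual code: the function name `countWinsByDifference`, the index shift by `reach`, the two rolling rows, and the modulus 10⁹ + 7 (taken from the problem statement) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (name and layout are assumptions, not from the comment).
// Counts games in which the first player finishes strictly ahead, mod 1e9+7.
long long countWinsByDifference(int a, int b, int k, int t) {
    const long long MOD = 1000000007LL;
    const int rounds = 2 * t;        // both players' turns merged into one sequence
    const int reach = rounds * k;    // largest possible total change of the difference
    const int size = 2 * reach + 1;  // change c in [-reach, reach] maps to index c + reach

    std::vector<long long> dp(size, 0), prefix(size, 0), next(size, 0);
    dp[reach] = 1;                   // zero rounds played: the change is 0

    for (int i = 1; i <= rounds; ++i) {
        // prefix[j] = dp[0] + ... + dp[j], the "dpn" row of the previous round.
        long long running = 0;
        for (int j = 0; j < size; ++j) {
            running = (running + dp[j]) % MOD;
            prefix[j] = running;
        }
        for (int j = 0; j < size; ++j) {
            int hi = std::min(j + k, size - 1);  // entries past the end are zero
            int lo = j - k - 1;                  // entries before the start are zero
            long long total = prefix[hi] - (lo >= 0 ? prefix[lo] : 0);
            next[j] = ((total % MOD) + MOD) % MOD;
        }
        dp.swap(next);
    }

    // Sum the ways for every final change c with (a - b) + c > 0.
    long long answer = 0;
    for (int j = 0; j < size; ++j) {
        int c = j - reach;
        if (a - b + c > 0) answer = (answer + dp[j]) % MOD;
    }
    return answer;
}
```

Only two rows are alive at any time, which is the O(tk) space variant mentioned above.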


szawinis

10 years ago

My solution is probably the same as ned_chu's. I used two arrays for dp: one for the first player and the other for the second player. The size is a little more than 4 * 10⁵ including the shift for negative indices. While calculating the DP, I used a temporary array to hold the next state, so that the current state is not overwritten while it is still being read. This might be a little confusing, so let's start from the naive solution first.

The naive solution:
dpa[i][j] = number of ways to add a value in [-k, k] i times to player a's score and end up with result j.
dpb[i][j] = same idea, but for player b.

Transition:
$dpa[i][j] = \sum_{x=j-k}^{j+k} dpa[i-1][x]$
$dpb[i][j] = \sum_{x=j-k}^{j+k} dpb[i-1][x]$

Base case:
dpa[0][a] = dpb[0][b] = 1

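To calculate the final answer, we sum over all pairs of final scores in which player a's score x is strictly greater than player b's score y: $\sum_{x} \sum_{y < x} dpa[t][x] \cdot dpb[t][y]$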
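For small inputs, the naive solution above can be written out directly. The following is only a sketch under assumptions: the name `countWinsNaive`, the shift by `minScore` to keep indices non-negative, and the modulus 10⁹ + 7 (from the problem statement) are not part of the original comment, and the full 2D tables are kept on purpose to mirror the definitions of dpa and dpb.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: a literal version of the naive dpa/dpb tables.
// Intended for small inputs only; time is O(t * range * k + range^2).
long long countWinsNaive(int a, int b, int k, int t) {
    const long long MOD = 1000000007LL;
    const int minScore = std::min(a, b) - k * t;  // lowest reachable score
    const int maxScore = std::max(a, b) + k * t;  // highest reachable score
    const int size = maxScore - minScore + 1;     // score s maps to index s - minScore

    // dp[i][j] = number of ways to reach score (j + minScore) after i turns.
    auto build = [&](int start) {
        std::vector<std::vector<long long>> dp(t + 1, std::vector<long long>(size, 0));
        dp[0][start - minScore] = 1;              // base case: no turns played yet
        for (int i = 1; i <= t; ++i)
            for (int j = 0; j < size; ++j)
                for (int x = std::max(0, j - k); x <= std::min(size - 1, j + k); ++x)
                    dp[i][j] = (dp[i][j] + dp[i - 1][x]) % MOD;
        return dp;
    };
    auto dpa = build(a);
    auto dpb = build(b);

    // Pair every final score x of player a with every strictly smaller score y of player b.
    long long answer = 0;
    for (int x = 0; x < size; ++x)
        for (int y = 0; y < x; ++y)
            answer = (answer + dpa[t][x] * dpb[t][y]) % MOD;
    return answer;
}
```

For example, `countWinsNaive(1, 2, 2, 1)` counts the 6 pairs of moves where a's final score beats b's, matching the first sample of the problem.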

The range for j is [-2kt, 2kt] (shifted by a or b, but you get the idea), which is around 4 * 10⁵ values if not more. Multiply this by the range of i = 2t = 200 and you get 8 * 10⁷ entries; multiply again by 64 bits for long long and you get 512 * 10⁷ bits, about 640 MB, which definitely exceeds the 512 MB memory limit. So we have to find a more memory-efficient solution. Notice that each transition only uses row i - 1. We will exploit this fact and eliminate the first dimension (i) completely.

More optimal solution:
As stated above, we will eliminate the first dimension of dpa[i][j] and dpb[i][j] completely, so they become dpa[j] and dpb[j]. This is where we use the prefix sum trick. I think ned_chu has already mentioned this, but I will mention it again just for completeness :).

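Transitions (dpa[j] and dpb[j] now hold prefix sums, i.e. the number of ways to end up with a score of at most j; primes mark the next state):
$dpa'[j] = dpa'[j-1] + dpa[j+k] - dpa[j-k-1]$
$dpb'[j] = dpb'[j-1] + dpb[j+k] - dpb[j-k-1]$

Base case:
$dpa[j] = 1$ for $j \ge a$, otherwise $0$
$dpb[j] = 1$ for $j \ge b$, otherwise $0$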

So now we can compute each entry of dpa and dpb in O(1). However, you need to be careful. If you use the same array and update dpa[j] = dpa[j - 1] + dpa[j + k] - dpa[j - k - 1] in place, you will get a WA, because dpa[j - k - 1] has already been overwritten with the next state while dpa[j + k] still holds the current one, so the two states get mixed. There are a couple of ways to avoid this. One way is to store the next state in a separate tmp array: tmp[j] = tmp[j - 1] + dpa[j + k] - dpa[j - k - 1], and after you have iterated over all j, you set dpa[j] = tmp[j]. Another method, instead of an extra tmp array, is to give dpa and dpb two rows, dpa[0..1][j] and dpb[0..1][j], and swap which row is current on every iteration of i. It really depends on your preference, and there may be other methods that I don't know (I'm still cyan after all lol).
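The tmp-array variant might look like the sketch below. It is an illustration under assumptions rather than szawinis's actual submission: the name `countWinsPrefix`, the helper lambda `at`, the shift by `minScore`, and the modulus 10⁹ + 7 (from the problem statement) are all made up for this example, and the arrays are assumed to store prefix sums (ways to finish with a score of at most j).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: one 1D prefix-sum array per player plus a tmp array.
long long countWinsPrefix(int a, int b, int k, int t) {
    const long long MOD = 1000000007LL;
    const int minScore = std::min(a, b) - k * t;  // lowest reachable score
    const int maxScore = std::max(a, b) + k * t;  // highest reachable score
    const int size = maxScore - minScore + 1;     // score s maps to index s - minScore

    // Reads a prefix-sum array safely: below the array the sum is 0,
    // above it the sum equals the last entry (the grand total).
    auto at = [&](const std::vector<long long>& p, int j) -> long long {
        if (j < 0) return 0;
        return p[std::min(j, size - 1)];
    };

    auto build = [&](int start) {
        std::vector<long long> dp(size, 0), tmp(size, 0);
        for (int j = start - minScore; j < size; ++j) dp[j] = 1;  // base case
        for (int i = 1; i <= t; ++i) {
            for (int j = 0; j < size; ++j) {
                long long prev = (j > 0) ? tmp[j - 1] : 0;  // next state, already written
                // dp still holds the current state, so the two never mix.
                tmp[j] = ((prev + at(dp, j + k) - at(dp, j - k - 1)) % MOD + MOD) % MOD;
            }
            dp.swap(tmp);  // tmp becomes the current state for the next turn
        }
        return dp;
    };
    auto pa = build(a), pb = build(b);

    long long answer = 0;
    for (int x = 1; x < size; ++x) {
        long long waysA = ((pa[x] - pa[x - 1]) % MOD + MOD) % MOD;  // a ends exactly at x
        answer = (answer + waysA * pb[x - 1]) % MOD;                // b ends strictly below x
    }
    return answer;
}
```

Swapping `dp` and `tmp` instead of copying is effectively the two-row method: the vector that is not current gets fully overwritten on the next turn.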