CP is surprisingly useless in real life, but that doesn't mean you can't use it to beat your friends at "casual" games.
Statement
In the app "2 Player Games", there is this game yazy played somewhat often in Singapore. The rules are as follows:
Each player has 11 turns. Each turn starts by you rolling five fair 6-sided die. Then, you have $$$2$$$ rerolls. In each reroll, you pick a subset of dice to "lock". The dice that are not "lock" will be rerolled. After using some of your rerolls, you pick a category out of the following $$$11$$$ not yet chosen, and your score is described below.
- $$$1 \times \text{number of dice with 1}$$$
- $$$2 \times \text{number of dice with 2}$$$
- $$$3 \times \text{number of dice with 3}$$$
- $$$4 \times \text{number of dice with 4}$$$
- $$$5 \times \text{number of dice with 5}$$$
- $$$6 \times \text{number of dice with 6}$$$
- Sum of dice rolls if some number $$$x$$$ appears in at least $$$3$$$ die, $$$0$$$ otherwise.
- Sum of dice rolls if some number $$$x$$$ appears in at least $$$4$$$ die, $$$0$$$ otherwise.
- $$$25$$$ if some number $$$x$$$ appears in $$$3$$$ die and another number $$$y$$$ appears in $$$2$$$ die, $$$0$$$ otherwise.
- $$$40$$$ if the die rolls are $$$1,2,3,4,5$$$ or $$$2,3,4,5,6$$$ in some order, $$$0$$$ otherwise.
- $$$50$$$ if some number $$$x$$$ appears in all $$$5$$$ die, $$$0$$$ otherwise.
The winner is the player with higher score.
Since I wanted to all $$$0$$$ of my friends at it, I decided to write a program to find the maximum expected value.
Note: Having the maximum expected value does not mean you win with maximum probability (a player who always gets a score of $$$2$$$ will beat a player with $$$99$$$ percent chance of having a score of $$$1$$$ and $$$1$$$ percent chance of having a score of $$$10^9$$$), but I believe trying to maximise the probability of winning is much harder.
Solution
Simulating all turns is too slow. Instead, we note that to find the maximum score from a given position, we only need to consider these:
- Which categories have not been chosen
- The current state of the dice
- The number of rerolls left
The current score of the player is irrelevant, as we can simply add it to the expected value found. For example, if the expected score of a player with $$$0$$$ current score, categories $$$1$$$ to $$$6$$$ not yet chosen, and $$$3$$$ rerolls left is $$$59$$$, the expected score of a player with $$$10$$$ current score, categories $$$1$$$ to $$$6$$$ not yet chosen, and $$$3$$$ rerolls left is $$$69$$$.
Therefore, this leads us to formulate a DP, where $$$dp[\text{mask}][\text{state}][\text{rerolls}]$$$ is the maximum expected score if the categories not chosen are represented by the bitmask $$$\text{mask}$$$, the state of the dice is represented by $$$\text{state}$$$, and $$$\text{rerolls}$$$ is the number of rerolls left.
This DP gives us $$$2^{11} \times 6^5 \times 3 = 4.7 \times 10^7$$$ states. This seems somewhat reasonable until we realise that we have over $$$42$$$ transitions to consider, which are:
- Picking $$$1$$$ of the $$$11$$$ categories
- Picking $$$1$$$ out of the $$$32$$$ subsets to lock, and rerolling
In the second case, we have to consider all possible dice combinations we may get when rerolling, which takes too long.
To speed up our DP, notice that there is essentially no difference between a dice combination of $$$62451$$$ and $$$12456$$$ (besides re-ordering). Thus, by stars and bars, the number of distinct dice combinations is $$$\binom{10}5 = 252$$$, giving us $$$2^{11} \times 252 \times 3 = 1.5 \times 10^6$$$ states, and a transition complexity of approximately $$$32 \times 252 = 8064$$$. At this point, I stopped optimising my code and wrote out my DP (which took 10min to run with the O3 flag), only to realise that me doing this is the reason why I struggle to make friends. Anyways, the maximum expected score is $$$165.76$$$.



