gnudgnaoh's blog

By gnudgnaoh, history, 5 years ago, In English

Prereq: this digit dp blog

Cut the flag dimension

Usually, whatever states you use in the recursive dp function, you will memoize it. And often you will have some thing like this

int memo[pos][...][...][low]

Where low is the flag that checks if the current number is already smaller than the considered number.

It is totally possible to subtract this dimension (half the memory needed) by manipulating it in the recursive function:

Example problem: Perfect Number

This is what "normal" code would look like:

normal code

And this is the optimized code:

optimized code

Basically this trick only store the number if the low flag is on, since low isn't necessary to be memoized because its only meaning is to set the limit for the current digit.

Full submission for the "normal" code: "normal" code __ Full submission for the optimized code: optimized code


Different ways to memset

"Normal" memset

You memset every time you dp. This would takes a huge amount of time if you have to call dp many time or the memory is large. Example problem: LIDS

Example slow code:

slow code

You can clearly see that memset is executed many times for all digits, this would have complexity $$$\mathcal{O}(T * digits * memsize)$$$ where $$$T$$$ is the number of testcases, and $$$digits$$$ is the number of digits (from 0 to 9 in this case), and $$$memsize$$$ is the size of memory.

This is extremely slow.

Improvement using "time"

Now instead of memset every time you dp, you can keep an additional array vis[pos][...][...] which will store the "time" that the value in mem[pos][...][...] is set.

code

This way, the complexity is better but this is still too slow for many problems.

memset only once

You might wonder, "but how? you are doing dp many times on many different numbers!". Well actually, we are doing dp on the digits.

You might notice that we're always doing dp from the most significant digit to the least, usually from left to right, the most significant digit will be at position $$$0$$$ and the least at position $$$length - 1$$$.

This way, the memory for each number is different, like number $$$100$$$ will have different memory from number $$$1234$$$ since they have different $$$length$$$ and other states.

However, what if we let the most significant digit to be at position $$$length - 1$$$ and the least at position $$$0$$$?

Now, every digits of every number line up, and you only need to memset once only.

Example solution of: Perfect Number

code

This is an extremely important optimization for digit dp

Other optimizations problem-wise

Check sum of digits divisibility

For a single number

If you want to check if the sum of digits of a number is divisible by $$$D$$$. Instead of storing the whole sum(could lead to MLE), you can store only the remainder of the sum when divided by $$$D$$$.

For many numbers

Example problem: WORKCHEF (highly recommended, you will need to use a lot of optimizations to AC)

For many numbers, instead of having a state for the remainder for each number, eg: dp[...][rem2][rem3][...] you can store the remainder of their LCM, eg: checking sum of digits divisible by 1, 2, 3, ... , 9 -> check divisibility by $$$LCM(1, 2, ..., 9) = 2520$$$.

For numbers with special properties

If you want to check divisibility by 5, the last digit need to be 0 or 5. For 10, the last digit obviously must be 0. ... There are also many properties for different numbers.

Another way of digit dp

From this stackoverflow question

This can be very handy when handling problems relating to the structure of the numbers, eg: Palindromic Numbers

Example code:

code

Feel free to share any tricks or anything that people should know when doing digit dp! If there is any mistakes or suggestions, please let me know.

  • Vote: I like it
  • +31
  • Vote: I do not like it

»
5 years ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by gnudgnaoh (previous revision, new revision, compare).

»
5 years ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by gnudgnaoh (previous revision, new revision, compare).

»
5 years ago, hide # |
 
Vote: I like it +20 Vote: I do not like it

thank you very much sir bubu, impressive blog

anyway, the "memset improvement using time" is a good technique on its own, and it can also be applied for some other techniques (one of the common is Kuhn's algorithm for maximum matching)

»
5 years ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

Also, there is kind of DP-Digit that only save partial part when the number of counting is dense and brute-forces when the counting is sparse. Some COCI problems must use this technique to AC

Also when DP-Digit on certain range with some property, you might find it possible to cut into a DP over a DP-Digit that might improve the calculation by reducing the re-calculating parts.

  • »
    »
    5 years ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    Can you provide the statement(or even better, links) for the COCI problems? And can you provide some examples about the DP over a digit DP part? Thanks!

»
5 years ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

In "LIDS" you don't need to memset every time. U can solve it taking into account current index, current digit, current lids only. [0.1s]

»
9 months ago, hide # |
Rev. 5  
Vote: I like it +5 Vote: I do not like it

I think that it’s easier to understand why reversing the digit order helps avoid repeated memset() if you try solving this example problem with just the smaller = true state. You will notice a pattern:

mem[pos] = mem[pos-1] + temp (temp is the result of choosing values for the pos-th digit.)

This becomes clearer when we solve the same example problem but with a = 1 and b = 999...9 (i.e., calculating the sum of digits for all numbers from 1 to n^10-1). In this case, the DP structure forms a neat recurrence as values from numbers with small number of digits can naturally contribute to numbers with larger number of digits.

This lets us reuse previously computed DP values when processing numbers with more digits. Thus, we only need to call memset() once at the beginning, instead of clearing the DP array every time we run DP on a new number. This can be very useful when approaching DP digit problems that use multitest (input has many tests).

Also, I think that the trick of eliminating smaller dimension from the DP array by storing only DP values with smaller = true can be understood by looking at how the recursive function calls itself. The recursive function with smaller = false is only called at most n + 1 times, where n is the number of digits. This is because each call with smaller = false can only lead to one more such call, which maintains the smaller = false state on a single path from pos = n-1 to pos = -1.

For a better understanding, look at the call tree of dp(pos, smaller) when dp(n-1, false) is called:

As shown, the smaller = false calls only occur along a single path (at most once per digit position, a total of n+1 times). Therefore, we don’t need to store their results, we can just compute them directly and only memoize the states where smaller = true!! This trick can halves DP array's memory allocated, which prevents getting MLE when solving problems with tight memory limit.

This is my optimized code for the example problem

By the way, your blog helped me in optimizing the DP digit code, thank you so much! (my English is bad, sorry^)

  • »
    »
    9 months ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    Yeah exactly. I also found a problem MYQ10, where repeated memsets will cause TLE. I did know the omitting low trick could save memory, but I didn't realize it actually avoid repeated memsets in multiple testcases until I read an editorial.