Required Substrings problem on CSES

#	User	Rating
1	tourist	3985
2	jiangly	3741
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3489
7	Radewoosh	3483
8	Kevin114514	3442
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
2	atcoder_official	162
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	nor	150

Can someone explain how to solve Required Substring problem in CSES : https://cses.fi/problemset/task/1112

Your task is to calculate the number of strings of length n having a given pattern of length m as their substring. All strings consist of characters A–Z.

My approach :

I was thinking of some dp approach and instead of counting the required quantity directly, I'd count the complement and subtract from $$$26^n$$$

But, I am stuck with this approach, I thought of the following recursion

$$$dp(i, j) = # of strings of length i not containing s, whose suffix of length i is a prefix of s (i.e. s_1\ldots s_j)$$$

$$$dp(i, j) = \sum_{j'\in S} dp(i-1, j') + dp(i-1, j-1)$$$, where $$$S$$$ = set of all $$$j'$$$ such that $$$s_1 \ldots s_{j'}$$$ has a prefix of length $$$j-1$$$ which is also its suffix

and then finally compute : $$$dp(i) = 26*dp(i-1)-dp(i-1,m-1)$$$ (supposed to count strings of length i not containing s)

but I cannot figure out how to instantiate the base case $$$dp(i, 0)$$$ for $$$i > m$$$ or $$$dp(i, 1)$$$

Any help is appreciated, thank you

Comments (1)

Show archived | Write comment?

felps4ntiago

10 months ago, # |

You could follow this dp idea of finding the complement and then the answer would be 26^n — complement. So you make a dp(i,j), meaning how many string from i to n doesn't have the pattern, knowing that the previous j-1 letters already matched with the pattern. The transition would be trying any of the 26 uppercase leetters as the new letter in the index i, but if j == m -1 (m is the size of the pattern), that means you can't choose the last letter of the pattern, cause that would result in us having the pattern on the string. When you reach i == n, that means you have a full string without the pattern, so you return 1. The difficulty lays on the fact that when pattern[j] doesn't match with the letter you chose to put on your string, you can't just say that j will be zero.

Example:

Imaging the pattern = ababc, and j = 4, meaning you already matched 'abab' and you know you can't choose 'c' cause that would make the whole pattern appear on the string, but if you choose 'a', now we have 'ababa', and the suffix 'aba' already matches with the three first letters of the pattern, so j will be 3. So you use the kmp algorithm to discover what would me the next j.

→ Reply

harsh-apcr's blog