UPD:Thanks to comment from arvindf232 and pajenegod,I rewrote parts of this post to improve clarity with the help of Chatgpt. I hope it reads more smoothly now.
Preface
In this blog, I want to share Border Theory, a useful and elegant insight about string borders that often appears in competitive programming. Although this idea deserves to be as well-known as the KMP algorithm, many people today aren’t aware of it or how to use it to speed up their solutions.
Main Theoretical Result of Border Theory
Main Result
For a string $$$S$$$, a border is a substring that is both a prefix of $$$S$$$ and a suffix of $$$S$$$.
For a string $$$S$$$, we define that:
$$$ \{b_i\}_{i=1}^n$$$is the sequence of all lengths of the border of $$$S$$$, sorted in increasing order.
According to the theory,it's garenteed that for all possible $$$S$$$, $$${B_i}_{i = 1}^n$$$ can be partitioned into $$$\operatorname{O}(\log |S|)$$$ continuous subsequences where each subsequence forms an arthmetic progressions.
Proof
First of all,if a string $$$T$$$ is a border of $$$S$$$ and $$$2|T| \ge |S|$$$, Then all positions where $$$T$$$ occurs in $$$S$$$ form an arthmetic progression.
proofSuppose three adjacent occurrences of $$$T$$$ had different gaps. As we found that there exists borders with the length of the gaps and the shorter border is also a border of the longer border, this would imply the existence of a shorter overlapping occurrence, which contradicts adjacency. Therefore all adjacent gaps must be equal.
Second, All border lengths satisfying $$$2 \cdot length \ge |S|$$$ also appear in an arithmetic sequence:
proofAssume that the sequence of border lengths satisfying $$$2\cdot length\ge∣S∣$$$ is not an arithmetic progression. Then, there must exist three consecutive border lengths, which we denote as $$$i+p$$$, $$$i$$$, and $$$i−q$$$, such that the common difference is not constant (i.e., $$$p\neq q$$$).
A fundamental property of borders is that a border of a border is also a border of the original string. Consequently, $$$p$$$ and $$$q$$$ must themselves be border lengths. Furthermore, the border of length $$$i$$$ has a border of length $$$i−p$$$, and the border of length $$$i+q$$$ has a border of length $$$i$$$ and is a border of $$$i + p$$$ if $$$p \gt q$$$. This implies that both $$$i−p$$$ and $$$i+q$$$ are also border lengths of $$$S$$$.
However, if $$$i−p$$$ is a border length, it must lie strictly between $$$i−q$$$ and $$$i$$$ if $$$p \lt q$$$. Similarly, if $$$i+q$$$ is a border length, it must lie strictly between $$$i$$$ and $$$i+p$$$ when $$$p \gt q$$$. This introduces a new border length between what were assumed to be consecutive borders, which is a contradiction.
Therefore, our initial assumption that $$$p\neq q$$$ must be false. We conclude that $$$p=q$$$, proving that the sequence of border lengths is indeed an arithmetic progression.
Using the two observations, we can show that for every two border length $$$x,y(x \lt y)$$$ do not lie in the same arithmetic subsequence, then $$$2x \lt y$$$, making the theory right.
Practical Use: The Series-Link (shlink) Trick
A practical formulation of this theory is the series link (often written shlink), which groups borders into arithmetic sequences so they can be processed efficiently.
We compute shlink[i] as the start of the next arithmetic progression. (Here link[i] is the usual border/failure link in KMP-style structures, len[i] is the length of the represented substring, and diff[i] is the differance of length between i and link[i].) Now $$$shlink_i$$$ can be calculated like this:
//define shlink[i]
//define link[i] as the maxinum border of i like fail pointers in KMP or PAM
//define len[i] as the length
//define diff[i] as the differance of length between i and link[i]
shlink[i] = link[i];
diff[i] = len[i] - len[link[i]];
if(diff[i] == diff[link[i]])
shlink[i] = shlink[link[i]];
When we tries to analyze on borders,we found that:

When analyzing borders through these links, we observe: within an arithmetic progression of border lengths for a prefix of length $$$i$$$, the difference between consecutive border values (shifted by $$$diff_i$$$) corresponds to a border of length $$$i - diff_i$$$ and the corresponding occurrence position $$$i - len_{shlink_i} - diff_i$$$. This allows us to jump across long chains of borders in limited steps.
One example
Link
Given up to $$$5$$$ test cases,each with a string $$$S(1 \le |S| \le 5 \times 10 ^ 5)$$$ and an integer $$$w(1 \le w \le 10 ^ {18})$$$, when we have a string $$$T$$$ we can construct a new string $$$Y$$$ in the two ways: - Append the full string $$$T$$$ and $$$S$$$ and form a string with length $$$|T| + |S|$$$; - If a suffix $$$A$$$ in $$$T$$$ is also a prefix of $$$S$$$, you can overlap them and construct a string of length $$$|T| - |A| + |S|$$$.
Find the number of distinct length of strings in the range $$$[1,w]$$$ that can result from the process.
solutionIn fact,if there's a way to model the problem:
For a fixed addition $$$x$$$,we can form a graph of $$$x$$$ nodes that each node represents a remainder modulo $$$x$$$, Each allowed addition corresponds to a directed edge between nodes. Our goal is to run a shortest paths algorithm to get the smallest reachable lengths for every modulo reminder.
However, the naive approach is too slow,as border lengths can be partitioned into $$$\operatorname{O}(\log |S|)$$$ distinct arthmetic progressions,we can just update the graph with one arthmetic progression at a time instead of processing each border together.
For each progression, we make a graph of $$$x$$$ verticles which $$$x$$$ is the smallest element in the arthmetic progression and process its corresponding cycles in the graph. In every cycle, we start from the node with the smallest current distance and update all nodes in that cycle using a monotonic queue, exactly like a sliding-window minimum problem. This allows us to update one full progression in $$$\operatorname{O}(x)$$$ time.
At first we start with a graph with $$$|S|$$$ nodes and only node $$$0$$$ with a distance $$$|S|$$$. Later, when we change the modulo $$$x$$$ to $$$x'$$$ to update in a now arthmetic progression,we can again start form the node with smallest distance in each cycle and update the graph in $$$\operatorname{O}(x + x')$$$.
Now we approached the algorithm and obtained the overall time complexy $$$\operatorname{O}(|S|\log |S|)$$$.
code#include<bits/stdc++.h>
using namespace std;
int t,n,bor[500009],diff[500009],dt[500009],shlink[500009],q[500009],head,tail;
long long w,dp1[500009],dp2[500009];
int mx[500009];
const long long INF = 1998244353999911659ll;
char c[500009];
int gcd(int x,int y){
return y == 0 ? x : gcd(y,x % y);
}
int main(){
scanf("%d",&t);
while(t--){
scanf("%d %lld",&n,&w);
scanf(" %s",c + 1);
bor[1] = 0;
for(int i = 2; i <= n; i ++){
int t = bor[i - 1];
while(t != 0 && c[i] != c[t + 1])
t = bor[t];
if(c[i] == c[t + 1])
t ++;
bor[i] = t;
}
for(int i = 1; i <= n; i ++){
diff[i] = i - bor[i];
shlink[i] = bor[i];
if(diff[i] == diff[bor[i]])
shlink[i] = shlink[bor[i]];
}
//for(int i = n; i > 0; i = bor[i])
// printf("%d\n",bor[i]);
//puts("OK");
dp1[0] = n;
for(int i = 1; i < n; i ++){
dp1[i] = INF;
}
int lst = n;
for(int i = bor[n]; i != 0; i = shlink[i]){
int nw = n - i;
int ls = (i - shlink[i]) / diff[i] - 1;
for(int j = 0; j < nw; j ++){
dp2[j] = INF;
}
int fp = gcd(n - i,lst);
for(int j = 0; j < fp; j ++){
mx[j] = n + 1;
dp2[n + 1] = INF;
}
for(int j = 0; j < lst; j ++){
dp2[dp1[j] % nw] = min(dp2[dp1[j] % nw],dp1[j]);
if(dp2[mx[dp1[j] % fp]] > dp2[dp1[j] % nw])
mx[dp1[j] % fp] = dp1[j] % nw;
}
for(int j = 0; j < fp; j ++){
long long mn = dp2[mx[j]];
if(mx[j] == n + 1)
continue;
for(int k = (mx[j] + lst) % nw; k != mx[j]; k = (k + lst) % nw){
dt[k] = -1;
mn = min(mn + lst,dp2[k]);
dp2[k] = mn;
}
}
// printf("%d\n",i);
fp = gcd(nw,diff[i]);
for(int j = 0; j < fp; j ++){
mx[j] = n + 1;
}
dp1[n + 1] = INF;
for(int j = 0; j < nw; j ++){
dt[j] = 0;
dp1[j] = dp2[j];
if(dp1[j] < dp1[mx[j % fp]])
mx[j % fp] = j;
}
for(int j = 0; j < fp; j ++){
head = 1,tail = 0;
for(int k = mx[j],l = 0; l < nw / fp; l ++,k = (k + diff[i]) % nw){
dt[k] = l;
while(head <= tail && l - dt[q[head]] > ls)
head ++;
if(head <= tail)
dp1[k] = min(dp1[k],dp1[q[head]] + 1ll * l * diff[i] - 1ll * dt[q[head]] * diff[i] + nw);
while(head <= tail && dp1[k] - 1ll * l * diff[i] <= dp1[q[tail]] - 1ll * dt[q[tail]] * diff[i])
tail --;
q[++tail] = k;
//printf("%d %lld\n",k,dp1[k]);
}
}
lst = nw;
}
long long ans = 0;
for(int i = 0; i < lst; i ++){
//printf("%d %lld\n",i,dp1[i]);
if(dp1[i] <= w)
ans = ans + (w - dp1[i]) / lst + 1;
}
printf("%lld\n",ans);
}
}
Extention to palindromic suffixes
A string $$$s$$$ is called a palindrome if it equals its reverse. A palindromic suffix of a string is a suffix that is also a palindrome.
As we found that The borders of palindormes are also the palindromic suffixes of a palindrome, then we can say that for a string $$$S$$$, the sequence of the lengths of all palindromic suffixes in increasing order(which can be computed using PAM) can be partitioned into $$$\operatorname{O}(\log |S|)$$$ continuous arthmetic progressions. We can also say that the sequence of the lengths of all even-length palindromic suffixes in increasing order can be partitioned into $$$\operatorname{O}(\log |S|)$$$ continuous arthmetic progressions.
Tasks
In 932G - Palindrome Partition we can construct a string $$$T$$$ by:
- $$$t_{2k - 1} = s_{k}$$$
- $$$t_{2k} = s_{n - k + 1}$$$
Then the task turns to the number of ways to divide $$$T$$$ into several even-length palindromes, it's solvable for dynamic planning as $$$dp_i$$$ is the answer when the string is the prefix of length $$$i$$$ in the string $$$T$$$, with series link to speed it up.
code#include<bits/stdc++.h>
using namespace std;
int n,b[1000009],tp[1000009],to[1000009][26],len[1000009],cnt,dp[1000009],f[1000009],shlink[1000009],diff[1000009];
char c[1000009],s[1000009];
const int mod = 1000000007;
int main(){
scanf(" %s",c);
n = strlen(c);
for(int i = 0; (i + 1) << 1 <= n; i ++){
s[i << 1 | 1] = c[i];
}
for(int i = n - 1; (i + 1) << 1 > n; i --){
s[(n - i) << 1] = c[i];
}
//printf("%s\n",s + 1);
for(int i = 2; i <= n; i ++){
int t = tp[i - 1];
while(t != 0 && s[i] != s[i - 1 - len[t]])
t = b[t];
if(s[i] == s[i - 1 - len[t]] && !to[t][s[i] - 'a']){
cnt ++;
//printf("%d\n",cnt);
to[t][s[i] - 'a'] = cnt;
len[cnt] = len[t] + 2;
int st = b[t];
while(st != 0 && s[i] != s[i - 1 - len[st]])
st = b[st];
if(s[i] == s[i - 1 - len[st]] && len[cnt] != 2)
st = to[st][s[i] - 'a'];
b[cnt] = st;
t = cnt;
}
else if(s[i] == s[i - 1 - len[t]])
t = to[t][s[i] - 'a'];
tp[i] = t;
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);
}
for(int i = 1; i <= cnt; i ++){
diff[i] = len[i] - len[b[i]];
shlink[i] = b[i];
if(diff[i] == diff[b[i]])
shlink[i] = shlink[b[i]];
}
dp[0] = 1;
for(int i = 2; i <= n; i += 2){
if(tp[i] == 0)
continue;
int p = tp[i];
//dp[i] = dp[i - tp[i]];
while(p != 0){
f[p] = ((shlink[p] != b[p]) * f[b[p]] + dp[i - len[shlink[p]] - diff[p]]) % mod;
dp[i] = (dp[i] + f[p]) % mod;
//printf("%d %d %d %d\n",len[p],len[shlink[p]] + diff[p],f[p],f[b[p]]);
p = shlink[p];
}
//printf("%d %d\n",i,dp[i]);
}
printf("%d\n",dp[n]);
}
These are similar to the solution to 906E - Reverses,which constructs a string $$$T$$$ in this way:
- $$$t_{2k - 1} = x_k$$$
- $$$t_{2k} = y_k$$$
Then it's solvable for dynamic programming to find a partition of the string into even-length palindromes, minimizing the total count while allowing palindromes of length $$$2$$$ but excluding them from the count.
code#include<bits/stdc++.h>
using namespace std;
int n,m,shlink[1000009],b[1000009],to[1000009][26],tp[1000009],len[1000009],diff[1000009],cnt;
char s[500009],t[500009],ct[500009];
int dp[1000009],las[1000009],mn[1000009],flas[1000009];
int main(){
scanf(" %s %s",s,t);
n = strlen(s) << 1;
for(int i = 0; s[i] != '\0'; i ++)
ct[i << 1 | 1] = s[i],ct[(i << 1) + 2] = t[i];
for(int i = 2; i <= n; i ++){
int o = tp[i - 1];
while(o != 0 && ct[i] != ct[i - len[o] - 1])
o = b[o];
if(ct[i] == ct[i - len[o] - 1]){
if(to[o][ct[i] - 'a'] == 0){
cnt ++;
to[o][ct[i] - 'a'] = cnt;
len[cnt] = len[o] + 2;
int u = b[o];
while(u != 0 && ct[i] != ct[i - len[u] - 1])
u = b[u];
if(to[u][ct[i] - 'a'] != cnt && ct[i] == ct[i - len[u] - 1])
b[cnt] = to[u][ct[i] - 'a'];
shlink[cnt] = b[cnt];
diff[cnt] = len[cnt] - len[b[cnt]];
if(diff[cnt] == diff[b[cnt]])
shlink[cnt] = shlink[b[cnt]];
}
o = to[o][ct[i] - 'a'];
}
tp[i] = o;
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);
}
mn[0] = (1 << 24);
for(int i = 2; i <= n; i += 2){
if(ct[i] == ct[i - 1]){
dp[i] = dp[i - 2];
las[i] = i - 2;
}
else{
dp[i] = (1 << 24);
}
int p = tp[i];
if(len[p] != 0){
do{
mn[p] = dp[i - len[shlink[p]] - diff[p]] + 1;
flas[p] = i - len[shlink[p]] - diff[p];
if(b[p] != shlink[p]){
mn[p] = min(mn[b[p]],mn[p]);
if(mn[p] == mn[b[p]])
flas[p] = flas[b[p]];
}
dp[i] = min(dp[i],mn[p]);
if(dp[i] == mn[p])
las[i] = flas[p];
p = shlink[p];
}while(len[p] != 0);
}
//printf("%d %d %d\n",i,dp[i],las[i]);
}
if((dp[n] >> 24) & 1)
puts("-1");
else{
printf("%d\n",dp[n]);
int o = n;
while(o != 0){
if(o - las[o] != 2)
printf("%d %d\n",(las[o] >> 1) + 1,o >> 1);
o = las[o];
}
}
}
Summary
Though bruteforce on borders is slow, we can approach the algorithm using border theory,that shows that border lengths cluster into structured arithmetic sequences bounded in number by $$$\operatorname{O}(\log |S|)$$$ to make it much faster.
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
%%%%
%%%
%%%%%%
%%%%%%
膜郑了
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
This is unreadable. Please use an LLM to translate
EDIT: That aside, after actually understanding what you meant (with an LLM) , this seemed exceptionally useful
What do you mean? Translate to Chinese or just do a summary? Are there any more details that have to be added? Thanks for reminding me to develop instead of something unresponsible.
Ensure that everything is correctly and clearly defined. Also there are lots of typos, grammatically awkward sentences. A little bit is fine but this time there are so many that it is very hard to understand what you meant.
If you have written it in another langauge, simply translate it. If you are far more comfortable writing in anotheer language, write in that langauge then translate. Or, ask LLM to fix up the entire post would be much better.
I've tried to develop once. Is the blog better now?
The short answer is, no.
I'm sorry to say this, but the level of the english language is honestly a mess. There are obvious things like the word "conclusion" being spelled "conclution". But the bigger issue is that the sentences are so weirdly written that I don't get what they are trying to say. For example, I still don't get what you mean by the word "bruteforce" in the title.
My advice would be, just as arvind recommended, to copy paste your text through some LLM (like chatgpt or deepseek etc...). That would take you like 1 minute, and it would probably do a pretty good job. And it would make your blog 100 times easier to read.
Thanks for you to remind me,I'm taking action now.
No, the blog is still incomprehensible. It doesn't look like much has changed. You still have conclution.
Here is an example where I asked chatgpt to fix the blog https://chatgpt.com/share/69469388-38b0-8012-ac00-334a11677e8a . Not only did it fix the language, it also fixed some issues in your proofs. Especially your first proof.
OK,updated again.
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
orz
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).
Auto comment: topic has been updated by OtterZ (previous revision, new revision, compare).