↵
Border Theory,which was not well-known enough,sh
↵
# Formal Border Theory↵
↵
## Conclution↵
↵
For a string $S$, There are a squence of strings:↵
↵
$$ \{b_i\}_{i=1}^n(b_1 = 0,b_i < b_{i + 1},b_n = |S|,S_{1,...,b_i} = S_{n - b_i + 1,...,n})$$↵
↵
it's garenteed that there exist a way to divide the sequence into $\operatorname{O}(\log |S|)$ continuous
↵
# Written before↵
↵
In this blog, I want to share Border Theory, a useful and elegant insight about string borders that often appears in competitive programming. Although this idea deserves to be as well-known as the KMP algorithm, many people today aren’t aware of it or how to use it to speed up their solutions.↵
↵
# Main Theoretical Result of Border Theory↵
↵
## Conclution↵
↵
For a string $S$, a border is a substring that is both a prefix of $S$ and a suffix of $S$.↵
↵
For a string $S$, we define that:↵
↵
$$ \{b_i\}_{i=1}^n$$↵
↵
is the sequence of all lengths of the border of $S$, sorted in increasing order.↵
↵
According to the theory,it's garenteed that for all possible $S$, ${B_i}_{i = 1}^n$ can be partitioned into $\operatorname{O}(\log |S|)$ continuous subsequences where each subsequence forms an arthmetic progressions.↵
↵
## Proof↵
↵
First of all,
↵
<spoiler summary="proof">↵
If there's pairs of adjacent
↵
- $p$ is the length of a border↵
- $q$ is the length of an another border↵
- The shorter border is a border of a larger border.↵
↵
Then we will found that if $p > q$, then we found that pair $(i - q,i)$ exists when $p > q$,or $(i,i + p)$ exists while $p < q$, which are not valid when we deem pairs $(i - p,i)$,$(i,i + q)$ as adjacent p
</spoiler>↵
↵
Second,
↵
<spoiler summary="proof">↵
if there's border lengths $i + p,i,i - q$ a
↵
- $p,q$ are the lengths of borders.↵
- $i - p$ is the length of a border of the border with length $i$.↵
- $i$ is the length of a border of the border with length $i + q$.↵
↵
We will find $i - q < i - p < i$ or $i < i + q < i + p$, making it
</spoiler>↵
↵
↵
## Task
↵
## Practicle way to use border theory:Shlink pointer↵
↵
As $shlink_i$ means to a node $i$,which node is the start of the next arthmetic sequence,can be calculated like this:↵
↵
↵
~~~~~↵
//define shlink[i]↵
//define link[i] as the maxinum border of i like fail pointers in KMP or PAM↵
//define len[i] as the length↵
//define diff[i] as the differance of length between i and link[i]↵
shlink[i] = link[i];↵
diff[i] = len[i] - len[link[i]];↵
if(diff[i] == diff[link[i]])↵
shlink[i] = shlink[link[i]];↵
~~~~~↵
↵
↵
When we tries to analyze on borders,we found that:↵
↵
↵
↵
In an arthmetic progression of the border of a prefix of length $i$ in $S$,the differance between to arthmetic progression and a similar one at $i - diff$ are a new border of length $i - diff_i$ and a new position that a border occurs as suffix $i - len_{shlink_i} - diff_i$. Which is really convenient to make use of Border Theory.↵
↵
## One example↵
↵
[Link](https://www.luogu.com.cn/problem/P4156)↵
↵
Given
- Append the full string $T$ and $S$ and form a string with length $|T| + |S|$
- If a suffix $A$ in $T$ is also
↵
Find the number of distinct length of strings in the range $[1,w]$
↵
<spoiler summary="solution">↵
In fact,if there's a way
↵
For a fixed addition $x$,we can form a graph of $x$ nodes
↵
But
↵
However, the naive a
↵
For each progression, we make a graph of $x$ verticles which $x$ is the smallest element in the arthmetic progression and process its corresponding cycles in the graph. In every cycle, we start from the node with the smallest
↵
At first we start with a graph with $|S|$ nodes and only node $0$ with a distance $|S|$. Later,
↵
</spoiler>↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int t,n,bor[500009],diff[500009],dt[500009],shlink[500009],q[500009],head,tail;↵
long long w,dp1[500009],dp2[500009];↵
int mx[500009];↵
const long long INF = 1998244353999911659ll;↵
char c[500009];↵
int gcd(int x,int y){↵
return y == 0 ? x : gcd(y,x % y);↵
}↵
int main(){↵
scanf("%d",&t);↵
while(t--){↵
scanf("%d %lld",&n,&w);↵
scanf(" %s",c + 1);↵
bor[1] = 0;↵
for(int i = 2; i <= n; i ++){↵
int t = bor[i - 1];↵
while(t != 0 && c[i] != c[t + 1])↵
t = bor[t];↵
if(c[i] == c[t + 1])↵
t ++;↵
bor[i] = t;↵
}↵
for(int i = 1; i <= n; i ++){↵
diff[i] = i - bor[i];↵
shlink[i] = bor[i];↵
if(diff[i] == diff[bor[i]])↵
shlink[i] = shlink[bor[i]];↵
}↵
//for(int i = n; i > 0; i = bor[i])↵
// printf("%d\n",bor[i]);↵
//puts("OK");↵
dp1[0] = n;↵
for(int i = 1; i < n; i ++){↵
dp1[i] = INF;↵
}↵
int lst = n;↵
for(int i = bor[n]; i != 0; i = shlink[i]){↵
int nw = n - i;↵
int ls = (i - shlink[i]) / diff[i] - 1;↵
for(int j = 0; j < nw; j ++){↵
dp2[j] = INF;↵
}↵
int fp = gcd(n - i,lst);↵
for(int j = 0; j < fp; j ++){↵
mx[j] = n + 1;↵
dp2[n + 1] = INF;↵
}↵
for(int j = 0; j < lst; j ++){↵
dp2[dp1[j] % nw] = min(dp2[dp1[j] % nw],dp1[j]);↵
if(dp2[mx[dp1[j] % fp]] > dp2[dp1[j] % nw])↵
mx[dp1[j] % fp] = dp1[j] % nw;↵
}↵
↵
for(int j = 0; j < fp; j ++){↵
long long mn = dp2[mx[j]];↵
if(mx[j] == n + 1)↵
continue;↵
for(int k = (mx[j] + lst) % nw; k != mx[j]; k = (k + lst) % nw){↵
dt[k] = -1;↵
mn = min(mn + lst,dp2[k]);↵
dp2[k] = mn;↵
}↵
}↵
// printf("%d\n",i);↵
fp = gcd(nw,diff[i]);↵
for(int j = 0; j < fp; j ++){↵
mx[j] = n + 1;↵
}↵
dp1[n + 1] = INF;↵
for(int j = 0; j < nw; j ++){↵
dt[j] = 0;↵
dp1[j] = dp2[j];↵
if(dp1[j] < dp1[mx[j % fp]])↵
mx[j % fp] = j;↵
}↵
for(int j = 0; j < fp; j ++){↵
head = 1,tail = 0;↵
for(int k = mx[j],l = 0; l < nw / fp; l ++,k = (k + diff[i]) % nw){↵
dt[k] = l;↵
while(head <= tail && l - dt[q[head]] > ls)↵
head ++;↵
if(head <= tail)↵
dp1[k] = min(dp1[k],dp1[q[head]] + 1ll * l * diff[i] - 1ll * dt[q[head]] * diff[i] + nw);↵
while(head <= tail && dp1[k] - 1ll * l * diff[i] <= dp1[q[tail]] - 1ll * dt[q[tail]] * diff[i])↵
tail --;↵
q[++tail] = k;↵
//printf("%d %lld\n",k,dp1[k]);↵
}↵
}↵
lst = nw;↵
}↵
long long ans = 0;↵
for(int i = 0; i < lst; i ++){↵
//printf("%d %lld\n",i,dp1[i]);↵
if(dp1[i] <= w)↵
ans = ans + (w - dp1[i]) / lst + 1;↵
}↵
printf("%lld\n",ans);↵
}↵
}↵
~~~~~↵
↵
</spoiler>↵
↵
# Extention to palindromic suffixes↵
↵
As we
↵
# Shlink pointer ↵
↵
## introduction↵
↵
As $shlink_i$ means to a node $i$,which node is the start of the next arthmetic sequence,can be calculated like this:↵
↵
↵
~~~~~↵
//define shlink[i]↵
//define link[i] as the maxinum border of i like fail pointers in KMP or PAM↵
//define len[i] as the length↵
//define diff[i] as the differance of length between i and link[i]↵
shlink[i] = link[i];↵
diff[i] = len[i] - len[link[i]];↵
if(diff[i] == diff[link[i]])↵
shlink[i] = shlink[link[i]];↵
~~~~~↵
↵
↵
Then we'll find:↵
↵
↵
↵
Which is a good way to make use of Border Theory.↵
↵
## Tasks↵
↵
In [problem:932G] we can make a string $T$ by:↵
↵
- $t_{2k - 1} = s_{k}$↵
- $t_{2k} = s_{n - k + 1}$↵
↵
Then we found that The task turns to the number of ways to divide $T$ into several even-length palindromes, we tries to solve it using dynamic planning as $dp_i$ is the answer of the prefix of length $i$, then we use shlink pointers to make it $\operatorname{O}(|S| \log |S|)$
## Tasks↵
↵
In [problem:932G] we can make a string $T$ by:↵
↵
- $t_{2k - 1} = s_{k}$↵
- $t_{2k} = s_{n - k + 1}$↵
↵
Then we found that The task turns to the number of ways to divide $T$ into several even-length palindromes, we tries to solve it using dynamic planning as $dp_i$ is the answer when the string is the prefix of length $i$ in the string $T$, then we use shlink pointers to speed it up.↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int n,b[1000009],tp[1000009],to[1000009][26],len[1000009],cnt,dp[1000009],f[1000009],shlink[1000009],diff[1000009];↵
char c[1000009],s[1000009];↵
const int mod = 1000000007;↵
int main(){↵
scanf(" %s",c);↵
n = strlen(c);↵
for(int i = 0; (i + 1) << 1 <= n; i ++){↵
s[i << 1 | 1] = c[i]; ↵
}↵
for(int i = n - 1; (i + 1) << 1 > n; i --){↵
s[(n - i) << 1] = c[i];↵
}↵
//printf("%s\n",s + 1);↵
for(int i = 2; i <= n; i ++){↵
int t = tp[i - 1];↵
while(t != 0 && s[i] != s[i - 1 - len[t]])↵
t = b[t];↵
if(s[i] == s[i - 1 - len[t]] && !to[t][s[i] - 'a']){↵
cnt ++;↵
//printf("%d\n",cnt);↵
to[t][s[i] - 'a'] = cnt;↵
len[cnt] = len[t] + 2;↵
int st = b[t];↵
while(st != 0 && s[i] != s[i - 1 - len[st]])↵
st = b[st];↵
if(s[i] == s[i - 1 - len[st]] && len[cnt] != 2)↵
st = to[st][s[i] - 'a'];↵
b[cnt] = st;↵
t = cnt;↵
}↵
else if(s[i] == s[i - 1 - len[t]])↵
t = to[t][s[i] - 'a'];↵
tp[i] = t;↵
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);↵
}↵
for(int i = 1; i <= cnt; i ++){↵
diff[i] = len[i] - len[b[i]];↵
shlink[i] = b[i];↵
if(diff[i] == diff[b[i]])↵
shlink[i] = shlink[b[i]];↵
}↵
dp[0] = 1;↵
for(int i = 2; i <= n; i += 2){↵
if(tp[i] == 0)↵
continue;↵
int p = tp[i];↵
//dp[i] = dp[i - tp[i]];↵
while(p != 0){↵
f[p] = ((shlink[p] != b[p]) * f[b[p]] + dp[i - len[shlink[p]] - diff[p]]) % mod;↵
dp[i] = (dp[i] + f[p]) % mod;↵
//printf("%d %d %d %d\n",len[p],len[shlink[p]] + diff[p],f[p],f[b[p]]);↵
p = shlink[p];↵
}↵
//printf("%d %d\n",i,dp[i]);↵
}↵
printf("%d\n",dp[n]);↵
}↵
~~~~~↵
↵
↵
</spoiler>↵
↵
Such solution is similar to [problem:906E],which turns:↵
↵
- $t_{2k - 1} = x_k$↵
- $t_{2k} = y_k$↵
↵
And use dynamic planning to get the divition to even-length palindromes with least palindromes not with the length of $2$.↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int n,m,shlink[1000009],b[1000009],to[1000009][26],tp[1000009],len[1000009],diff[1000009],cnt;↵
char s[500009],t[500009],ct[500009];↵
int dp[1000009],las[1000009],mn[1000009],flas[1000009];↵
int main(){↵
scanf(" %s %s",s,t);↵
n = strlen(s) << 1;↵
for(int i = 0; s[i] != '\0'; i ++)↵
ct[i << 1 | 1] = s[i],ct[(i << 1) + 2] = t[i];↵
for(int i = 2; i <= n; i ++){↵
int o = tp[i - 1];↵
while(o != 0 && ct[i] != ct[i - len[o] - 1])↵
o = b[o];↵
if(ct[i] == ct[i - len[o] - 1]){↵
if(to[o][ct[i] - 'a'] == 0){↵
cnt ++;↵
to[o][ct[i] - 'a'] = cnt;↵
len[cnt] = len[o] + 2;↵
int u = b[o];↵
while(u != 0 && ct[i] != ct[i - len[u] - 1])↵
u = b[u];↵
if(to[u][ct[i] - 'a'] != cnt && ct[i] == ct[i - len[u] - 1])↵
b[cnt] = to[u][ct[i] - 'a'];↵
shlink[cnt] = b[cnt];↵
diff[cnt] = len[cnt] - len[b[cnt]];↵
if(diff[cnt] == diff[b[cnt]])↵
shlink[cnt] = shlink[b[cnt]];↵
} ↵
o = to[o][ct[i] - 'a'];↵
}↵
tp[i] = o;↵
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);↵
}↵
mn[0] = (1 << 24);↵
for(int i = 2; i <= n; i += 2){↵
if(ct[i] == ct[i - 1]){↵
dp[i] = dp[i - 2];↵
las[i] = i - 2;↵
}↵
else{↵
dp[i] = (1 << 24);↵
}↵
int p = tp[i];↵
if(len[p] != 0){↵
do{↵
mn[p] = dp[i - len[shlink[p]] - diff[p]] + 1;↵
flas[p] = i - len[shlink[p]] - diff[p];↵
if(b[p] != shlink[p]){↵
mn[p] = min(mn[b[p]],mn[p]); ↵
if(mn[p] == mn[b[p]])↵
flas[p] = flas[b[p]];↵
}↵
dp[i] = min(dp[i],mn[p]);↵
if(dp[i] == mn[p])↵
las[i] = flas[p];↵
p = shlink[p];↵
}while(len[p] != 0);↵
}↵
//printf("%d %d %d\n",i,dp[i],las[i]);↵
}↵
if((dp[n] >> 24) & 1)↵
puts("-1");↵
else{↵
printf("%d\n",dp[n]);↵
int o = n;↵
while(o != 0){↵
if(o - las[o] != 2)↵
printf("%d %d\n",(las[o] >> 1) + 1,o >> 1);↵
o = las[o];↵
}↵
}↵
} ↵
~~~~~↵
↵
</spoiler>↵
↵
# Summary↵
↵




