Border Theory:There's a fast approach to bruteforce on borders

№	Пользователь	Рейтинг
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	turmax	3559
6	tourist	3541
7	strapple	3515
8	ksun48	3461
9	dXqwq	3436
10	Otomachi_Una	3413

№	Пользователь	Вклад
1	Qingyu	157
2	adamant	153
3	Um_nik	147
4	Proof_by_QED	146
5	Dominater069	145
6	errorgorn	142
7	cry	139
8	YuukiS	135
9	TheScrasse	134
10	chromate00	133

# Written before↵
↵
Border Theory,which was not well-known enough,shUPD:Thanks to comment from [user:arvindf232,2025-12-20] and [user:pajenegod,2025-12-20], the blog haves been ~~publicized together with KMP or similar algorithms.But nowaday~~developed with the help of Chatgpt. Hope that such a development will make this blots of people don't know such a theory exist,making them unable to solve several tasks easily or even can't get a fast solution.So it's time to complete the blog as what I will do on this.↵
↵
# Formal Border Theory↵
↵
## Conclution↵
↵
For a string $S$, There are a squence of strings:↵
↵
$$ \{b_i\}_{i=1}^n(b_1 = 0,b_i < b_{i + 1},b_n = |S|,S_{1,...,b_i} = S_{n - b_i + 1,...,n})$$↵
↵
it's garenteed that there exist a way to divide the sequence into $\operatorname{O}(\log |S|)$ continuousg more readable.↵
↵
# Written before↵
↵
In this blog, I want to share Border Theory, a useful and elegant insight about string borders that often appears in competitive programming. Although this idea deserves to be as well-known as the KMP algorithm, many people today aren’t aware of it or how to use it to speed up their solutions.↵
↵
# Main Theoretical Result of Border Theory↵
↵
## Conclution↵
↵
For a string $S$, a border is a substring that is both a prefix of $S$ and a suffix of $S$.↵
↵
For a string $S$, we define that:↵
↵
$$ \{b_i\}_{i=1}^n$$↵
↵
is the sequence of all lengths of the border of $S$, sorted in increasing order.↵
↵
According to the theory,it's garenteed that for all possible $S$, ${B_i}_{i = 1}^n$ can be partitioned into $\operatorname{O}(\log |S|)$ continuous subsequences where each subsequence forms an arthmetic progressions.↵
↵
## Proof↵
↵
First of all,~~we found that if there's a string $T$ a~~if there's a border $T$ of $S$ ~~and~~that $2|T| \ge |S|$, Then ~~the occurrence of $T$~~all the positions where $T$ occur in $S$ fo~~rms~~llow an arthmetic p~~rogressio~~attern.↵
↵
<spoiler summary="proof">↵
If there's pairs of adjacent ~~occurrence of $T$~~positions where $T$ occurs in $S$ $(i - p,i)$ and $(i,i + q)$ that $p \neq q$,we found that:↵
↵
- $p$ is the length of a border↵
- $q$ is the length of an another border↵
- The shorter border is a border of a larger border.↵
↵
Then we will found that if $p > q$, then we found that pair $(i - q,i)$ exists when $p > q$,or $(i,i + p)$ exists while $p < q$, which are not valid when we deem pairs $(i - p,i)$,$(i,i + q)$ as adjacent p~~airs~~ositions where $T$ occurs in $S$,so $p = q$ garenteed.↵
</spoiler>↵
↵
Second,~~We found that all lengths of border $T$ that $2|T| \ge |S|$ will form~~ All border lengths satisfying that inequality also appear in an arithmetic ~~progression~~sequence:↵
↵
<spoiler summary="proof">↵
if there's border lengths $i + p,i,i - q$ a~~s no border length $l$ that $i - q < l < i$ or $i < l < i + p$~~djacent in the sequence of all border lengths satisfying the inequality,we'll found:↵
↵
- $p,q$ are the lengths of borders.↵
- $i - p$ is the length of a border of the border with length $i$.↵
- $i$ is the length of a border of the border with length $i + q$.↵
↵
We will find $i - q < i - p < i$ or $i < i + q < i + p$, making it ~~incorrect~~wrong,so $p = q$↵
</spoiler>↵
↵
So there's a way that if $p < q,p,q \text{ are the length of borders}$ not in the same continuous arthmetic progression, $2p < q$ exist,So the conclution is right.↵
↵
## TaskUsing the two lemma we can proof that for every two border length $x,y(x < y)$ not in the same subsequence, $2x < y$ is garenteed, making the theory right.↵
↵
## Practicle way to use border theory:Shlink pointer↵
↵
As $shlink_i$ means to a node $i$,which node is the start of the next arthmetic sequence,can be calculated like this:↵
↵
↵
~~~~~↵
//define shlink[i]↵
//define link[i] as the maxinum border of i like fail pointers in KMP or PAM↵
//define len[i] as the length↵
//define diff[i] as the differance of length between i and link[i]↵
shlink[i] = link[i];↵
diff[i] = len[i] - len[link[i]];↵
if(diff[i] == diff[link[i]])↵
    shlink[i] = shlink[link[i]];↵
~~~~~↵
↵
↵
When we tries to analyze on borders,we found that:↵
↵
![ ](/predownloaded/bd/60/bd60bd38059c53ea3ad6d8cd1a400a629f856231.png)↵
↵
In an arthmetic progression of the border of a prefix of length $i$ in $S$,the differance between to arthmetic progression and a similar one at $i - diff$ are a new border of length $i - diff_i$ and a new position that a border occurs as suffix $i - len_{shlink_i} - diff_i$. Which is really convenient to make use of Border Theory.↵
↵
## One example↵
↵
[Link](https://www.luogu.com.cn/problem/P4156)↵
↵
Given ~~at most $5$ cases in a~~up to $5$ test cases,each ~~cases gives~~with a string $S(1 \le |S| \le 5 \times 10 ^ 5)$ and an integer $w(1 \le w \le 10 ^ {18})$, when we have a string $T$ we can ~~make the new string by~~construct a new string $Y$ in the two ways:↵
- Append the full string $T$ and $S$ and form a string with length $|T| + |S|$ ~~or i~~;↵
- If a suffix $A$ in $T$ is also ~~the~~a prefix of $S$, ~~replacement from $A$ to $S$ can be done as there's~~you can overlap them and construct a string of length $|T| - |A| + |S|$.↵
↵
Find the number of distinct length of strings in the range $[1,w]$ ~~can be formed~~that can result from the process.↵
↵
<spoiler summary="solution">↵
In fact,if there's a way ~~for an~~to model the problem:↵
↵
For a fixed addition $x$,we can form a graph of $x$ nodes ~~and deem other additions as edges to form a graph. We'll have~~that each node represents a remainder modulo $x$, Each allowed addition corresponds to a directed edge between nodes. Our goal is to run a shortest paths algorithm to get the smallest ~~lengths with garenteed~~reachable lengths for every moduleo result.↵
↵
Butminder.↵
↵
However, the naive a~~lgorithm~~pproach is too slow,as border~~s form~~ lengths can be partitioned into $\operatorname{O}(\log |S|)$ distinct arthmetic progressions,we ~~just start updating~~can just update the graph with anone arthmetic progression ~~for $\operatorname{O}(\log |S|)$ rounds. Every time we start~~at a time instead of processing each border together.↵
↵
For each progression, we make a graph of $x$ verticles which $x$ is the smallest element in the arthmetic progression and process its corresponding cycles in the graph. In every cycle, we start from the node with the smallest ~~node in each cycl~~current distance and ~~can~~ update all ~~the~~ nodes in theat cycle using a monoton~~ous~~ic queue~~(as~~, exactly like a sliding -window ~~problem)~~minimum problem. This allows us to update one full progression in $\operatorname{O}(x)$ time.↵
↵
At first we start with a graph with $|S|$ nodes and only node $0$ with a distance $|S|$. Later, Twhen we ~~may~~ change the modulo $x$ to $x'$~~,it's also easy to~~ to update in a now arthmetic progression,we can again start form the node with smallest distance in each cycle and update itthe graph in $\operatorname{O}(x + x')$.↵
↵
~~At last we get an~~Now we approached the algorithm and obtained the overall time complexy $\operatorname{O}(|S|\log |S|)$ ~~algorithm~~.↵
</spoiler>↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int t,n,bor[500009],diff[500009],dt[500009],shlink[500009],q[500009],head,tail;↵
long long w,dp1[500009],dp2[500009];↵
int mx[500009];↵
const long long INF = 1998244353999911659ll;↵
char c[500009];↵
int gcd(int x,int y){↵
return y == 0 ? x : gcd(y,x % y);↵
}↵
int main(){↵
scanf("%d",&t);↵
while(t--){↵
scanf("%d %lld",&n,&w);↵
scanf(" %s",c + 1);↵
bor[1] = 0;↵
for(int i = 2; i <= n; i ++){↵
int t = bor[i - 1];↵
while(t != 0 && c[i] != c[t + 1])↵
t = bor[t];↵
if(c[i] == c[t + 1])↵
t ++;↵
bor[i] = t;↵
}↵
for(int i = 1; i <= n; i ++){↵
diff[i] = i - bor[i];↵
shlink[i] = bor[i];↵
if(diff[i] == diff[bor[i]])↵
shlink[i] = shlink[bor[i]];↵
}↵
//for(int i = n; i > 0; i = bor[i])↵
// printf("%d\n",bor[i]);↵
//puts("OK");↵
dp1[0] = n;↵
for(int i = 1; i < n; i ++){↵
dp1[i] = INF;↵
}↵
int lst = n;↵
for(int i = bor[n]; i != 0; i = shlink[i]){↵
int nw = n - i;↵
int ls = (i - shlink[i]) / diff[i] - 1;↵
for(int j = 0; j < nw; j ++){↵
dp2[j] = INF;↵
}↵
int fp = gcd(n - i,lst);↵
for(int j = 0; j < fp; j ++){↵
mx[j] = n + 1;↵
dp2[n + 1] = INF;↵
}↵
for(int j = 0; j < lst; j ++){↵
dp2[dp1[j] % nw] = min(dp2[dp1[j] % nw],dp1[j]);↵
if(dp2[mx[dp1[j] % fp]] > dp2[dp1[j] % nw])↵
mx[dp1[j] % fp] = dp1[j] % nw;↵
}↵
↵
for(int j = 0; j < fp; j ++){↵
long long mn = dp2[mx[j]];↵
if(mx[j] == n + 1)↵
continue;↵
for(int k = (mx[j] + lst) % nw; k != mx[j]; k = (k + lst) % nw){↵
dt[k] = -1;↵
mn = min(mn + lst,dp2[k]);↵
dp2[k] = mn;↵
}↵
}↵
// printf("%d\n",i);↵
fp = gcd(nw,diff[i]);↵
for(int j = 0; j < fp; j ++){↵
mx[j] = n + 1;↵
}↵
dp1[n + 1] = INF;↵
for(int j = 0; j < nw; j ++){↵
dt[j] = 0;↵
dp1[j] = dp2[j];↵
if(dp1[j] < dp1[mx[j % fp]])↵
mx[j % fp] = j;↵
}↵
for(int j = 0; j < fp; j ++){↵
head = 1,tail = 0;↵
for(int k = mx[j],l = 0; l < nw / fp; l ++,k = (k + diff[i]) % nw){↵
dt[k] = l;↵
while(head <= tail && l - dt[q[head]] > ls)↵
head ++;↵
if(head <= tail)↵
dp1[k] = min(dp1[k],dp1[q[head]] + 1ll * l * diff[i] - 1ll * dt[q[head]] * diff[i] + nw);↵
while(head <= tail && dp1[k] - 1ll * l * diff[i] <= dp1[q[tail]] - 1ll * dt[q[tail]] * diff[i])↵
tail --;↵
q[++tail] = k;↵
//printf("%d %lld\n",k,dp1[k]);↵
}↵
}↵
lst = nw;↵
}↵
long long ans = 0;↵
for(int i = 0; i < lst; i ++){↵
//printf("%d %lld\n",i,dp1[i]);↵
if(dp1[i] <= w)↵
ans = ans + (w - dp1[i]) / lst + 1;↵
}↵
printf("%lld\n",ans);↵
}↵
}↵
~~~~~↵
↵
</spoiler>↵
↵
# Extention to palindromic suffixes↵
↵
As we'll found that The borders of palindormes are also the palindromic suffixes of a palindrome, then we can say that for a string $S$, the lengths of palindromic suffixes form $\operatorname{O}(\log |S|)$ continuous arthmetic progressions. We can also say that the lengths of even-length palindromic suffixes form $\operatorname{O}(\log |S|)$ continuous arthmetic progressions.↵
↵
# Shlink pointer ↵
↵
## introduction↵
↵
As $shlink_i$ means to a node $i$,which node is the start of the next arthmetic sequence,can be calculated like this:↵
↵
↵
~~~~~↵
//define shlink[i]↵
//define link[i] as the maxinum border of i like fail pointers in KMP or PAM↵
//define len[i] as the length↵
//define diff[i] as the differance of length between i and link[i]↵
shlink[i] = link[i];↵
diff[i] = len[i] - len[link[i]];↵
if(diff[i] == diff[link[i]])↵
    shlink[i] = shlink[link[i]];↵
~~~~~↵
↵
↵
Then we'll find:↵
↵
![ ](/predownloaded/bd/60/bd60bd38059c53ea3ad6d8cd1a400a629f856231.png)↵
↵
Which is a good way to make use of Border Theory.↵
↵
## Tasks↵
↵
In [problem:932G] we can make a string $T$ by:↵
↵
- $t_{2k - 1} = s_{k}$↵
- $t_{2k} = s_{n - k + 1}$↵
↵
Then we found that The task turns to the number of ways to divide $T$ into several even-length palindromes, we tries to solve it using dynamic planning as $dp_i$ is the answer of the prefix of length $i$, then we use shlink pointers to make it $\operatorname{O}(|S| \log |S|)$ found that The borders of palindormes are also the palindromic suffixes of a palindrome, then we can say that for a string $S$, the sequence of the lengths of all palindromic suffixes in increasing order can be partitioned into $\operatorname{O}(\log |S|)$ continuous arthmetic progressions. We can also say that the sequence of the lengths of all  even-length palindromic suffixes in increasing order can be partitioned into $\operatorname{O}(\log |S|)$ continuous arthmetic progressions.↵
## Tasks↵
↵
In [problem:932G] we can make a string $T$ by:↵
↵
- $t_{2k - 1} = s_{k}$↵
- $t_{2k} = s_{n - k + 1}$↵
↵
Then we found that The task turns to the number of ways to divide $T$ into several even-length palindromes, we tries to solve it using dynamic planning as $dp_i$ is the answer when the string is the prefix of length $i$ in the string $T$, then we use shlink pointers to speed it up.↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int n,b[1000009],tp[1000009],to[1000009][26],len[1000009],cnt,dp[1000009],f[1000009],shlink[1000009],diff[1000009];↵
char c[1000009],s[1000009];↵
const int mod = 1000000007;↵
int main(){↵
scanf(" %s",c);↵
n = strlen(c);↵
for(int i = 0; (i + 1) << 1 <= n; i ++){↵
s[i << 1 | 1] = c[i]; ↵
}↵
for(int i = n - 1; (i + 1) << 1 > n; i --){↵
s[(n - i) << 1] = c[i];↵
}↵
//printf("%s\n",s + 1);↵
for(int i = 2; i <= n; i ++){↵
int t = tp[i - 1];↵
while(t != 0 && s[i] != s[i - 1 - len[t]])↵
t = b[t];↵
if(s[i] == s[i - 1 - len[t]] && !to[t][s[i] - 'a']){↵
cnt ++;↵
//printf("%d\n",cnt);↵
to[t][s[i] - 'a'] = cnt;↵
len[cnt] = len[t] + 2;↵
int st = b[t];↵
while(st != 0 && s[i] != s[i - 1 - len[st]])↵
st = b[st];↵
if(s[i] == s[i - 1 - len[st]] && len[cnt] != 2)↵
st = to[st][s[i] - 'a'];↵
b[cnt] = st;↵
t = cnt;↵
}↵
else if(s[i] == s[i - 1 - len[t]])↵
t = to[t][s[i] - 'a'];↵
tp[i] = t;↵
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);↵
}↵
for(int i = 1; i <= cnt; i ++){↵
diff[i] = len[i] - len[b[i]];↵
shlink[i] = b[i];↵
if(diff[i] == diff[b[i]])↵
shlink[i] = shlink[b[i]];↵
}↵
dp[0] = 1;↵
for(int i = 2; i <= n; i += 2){↵
if(tp[i] == 0)↵
continue;↵
int p = tp[i];↵
//dp[i] = dp[i - tp[i]];↵
while(p != 0){↵
f[p] = ((shlink[p] != b[p]) * f[b[p]] + dp[i - len[shlink[p]] - diff[p]]) % mod;↵
dp[i] = (dp[i] + f[p]) % mod;↵
//printf("%d %d %d %d\n",len[p],len[shlink[p]] + diff[p],f[p],f[b[p]]);↵
p = shlink[p];↵
}↵
//printf("%d %d\n",i,dp[i]);↵
}↵
printf("%d\n",dp[n]);↵
}↵
~~~~~↵
↵
↵
</spoiler>↵
↵
Such solution is similar to [problem:906E],which turns:↵
↵
- $t_{2k - 1} = x_k$↵
- $t_{2k} = y_k$↵
↵
And use dynamic planning to get the divition to even-length palindromes with least palindromes not with the length of $2$.↵
↵
<spoiler summary="code">↵
↵
~~~~~↵
#include<bits/stdc++.h>↵
using namespace std;↵
int n,m,shlink[1000009],b[1000009],to[1000009][26],tp[1000009],len[1000009],diff[1000009],cnt;↵
char s[500009],t[500009],ct[500009];↵
int dp[1000009],las[1000009],mn[1000009],flas[1000009];↵
int main(){↵
scanf(" %s %s",s,t);↵
n = strlen(s) << 1;↵
for(int i = 0; s[i] != '\0'; i ++)↵
ct[i << 1 | 1] = s[i],ct[(i << 1) + 2] = t[i];↵
for(int i = 2; i <= n; i ++){↵
int o = tp[i - 1];↵
while(o != 0 && ct[i] != ct[i - len[o] - 1])↵
o = b[o];↵
if(ct[i] == ct[i - len[o] - 1]){↵
if(to[o][ct[i] - 'a'] == 0){↵
cnt ++;↵
to[o][ct[i] - 'a'] = cnt;↵
len[cnt] = len[o] + 2;↵
int u = b[o];↵
while(u != 0 && ct[i] != ct[i - len[u] - 1])↵
u = b[u];↵
if(to[u][ct[i] - 'a'] != cnt && ct[i] == ct[i - len[u] - 1])↵
b[cnt] = to[u][ct[i] - 'a'];↵
shlink[cnt] = b[cnt];↵
diff[cnt] = len[cnt] - len[b[cnt]];↵
if(diff[cnt] == diff[b[cnt]])↵
shlink[cnt] = shlink[b[cnt]];↵
} ↵
o = to[o][ct[i] - 'a'];↵
}↵
tp[i] = o;↵
//printf("%d %d %d\n",i,len[tp[i]],len[b[tp[i]]]);↵
}↵
mn[0] = (1 << 24);↵
for(int i = 2; i <= n; i += 2){↵
if(ct[i] == ct[i - 1]){↵
dp[i] = dp[i - 2];↵
las[i] = i - 2;↵
}↵
else{↵
dp[i] = (1 << 24);↵
}↵
int p = tp[i];↵
if(len[p] != 0){↵
do{↵
mn[p] = dp[i - len[shlink[p]] - diff[p]] + 1;↵
flas[p] = i - len[shlink[p]] - diff[p];↵
if(b[p] != shlink[p]){↵
mn[p] = min(mn[b[p]],mn[p]); ↵
if(mn[p] == mn[b[p]])↵
flas[p] = flas[b[p]];↵
}↵
dp[i] = min(dp[i],mn[p]);↵
if(dp[i] == mn[p])↵
las[i] = flas[p];↵
p = shlink[p];↵
}while(len[p] != 0);↵
}↵
//printf("%d %d %d\n",i,dp[i],las[i]);↵
}↵
if((dp[n] >> 24) & 1)↵
puts("-1");↵
else{↵
printf("%d\n",dp[n]);↵
int o = n;↵
while(o != 0){↵
if(o - las[o] != 2)↵
printf("%d %d\n",(las[o] >> 1) + 1,o >> 1);↵
o = las[o];↵
}↵
}↵
} ↵
~~~~~↵
↵
</spoiler>↵
↵
# Summary↵
↵
Sometimes we found that our solution is slow because we may have to bruteforce on borders. But when using the theory,we may found that the solution solvable and easy instead of other solutions that might be complexThough bruteforce on borders is slow, we can approach using border theory,that shows that border lengths cluster into structured arithmetic sequences, bounded in number by $\operatorname{O}(\log |S|)$, to make it much faster.

Rev.	Кто	Когда	Δ	Комментарий
en25	OtterZ	2025-12-22 11:32:10	84
en24	OtterZ	2025-12-22 05:32:17	157
en23	OtterZ	2025-12-22 04:20:06	992	Tiny change: 'so $p = q$\n</spoile' -> 'so $p = q$.\n</spoile'
en22	OtterZ	2025-12-21 11:03:05	35
en21	OtterZ	2025-12-21 10:22:19	6
en20	OtterZ	2025-12-21 10:21:10	200
en19	OtterZ	2025-12-21 10:17:25	27
en18	OtterZ	2025-12-21 08:05:21	171
en17	OtterZ	2025-12-21 07:59:36	13	Tiny change: 'now.\n\n# Written before\n\nIn th' -> 'now.\n\n# Preface\n\nIn th'
en16	OtterZ	2025-12-21 06:30:10	25	Tiny change: 'th $i + q$.\n\nAs a' -> 'th $i + q$,one of the border of $S$.\n\nAs a'
en15	OtterZ	2025-12-21 05:45:52	1413
en14	OtterZ	2025-12-20 06:19:26	2
en13	OtterZ	2025-12-20 04:57:35	4468	Tiny change: ' that for every possible $S$,${B_i}_{i ' -> ' that for all possible $S$, ${B_i}_{i '
en12	OtterZ	2025-12-19 09:05:15	265
en11	OtterZ	2025-12-19 08:45:48	475
en10	OtterZ	2025-12-19 08:29:45	6	Tiny change: 'eed modules.\n\nBut t' -> 'eed module result.\n\nBut t'
en9	OtterZ	2025-12-19 07:27:38	1	Tiny change: 'p,q \text{are the le' -> 'p,q \text{ are the le'
en8	OtterZ	2025-12-19 07:26:49	1	Tiny change: ' palindroms, we trie' -> ' palindromes, we trie'
en7	OtterZ	2025-12-19 05:19:03	1	Tiny change: 'n\n## Tasks\n\n[Link]' -> 'n\n## Task\n\n[Link]'
en6	OtterZ	2025-12-19 04:40:02	4594	(published)
en5	OtterZ	2025-12-19 04:20:15	65
en4	OtterZ	2025-12-19 04:19:26	770	Tiny change: 'ce,can be got like this' -> 'ce,can be calculated like this'
en3	OtterZ	2025-12-19 03:29:52	4119
en2	OtterZ	2025-12-18 16:53:34	6
en1	OtterZ	2025-12-18 16:48:23	2086	Initial revision (saved to drafts)

История