Idea: AndreyPavlov

Preparation: AndreyPavlov

Editorialist: AndreyPavlov

**Tutorial**

Note that there are two variants of which numbers to take to make their amount odd: $$$3$$$ odd number; $$$2$$$ even and $$$1$$$ odd. Let’s save all indices of even and odd numbers into two arrays, and check both cases.

**Implementation (Python)**

```
def solve():
n = int(input())
a = list(map(int, input().split()))
odd, even = [], []
for i in range(n):
if a[i] % 2 == 0:
even.append(i + 1)
else:
odd.append(i + 1)
if len(odd) >= 1 and len(even) >= 2:
print("YES")
print(odd[0], even[0], even[1])
elif len(odd) >= 3:
print("YES")
print(odd[0], odd[1], odd[2])
else:
print("NO")
t = int(input())
for test in range(t):
solve()
```

**Implementation (C++)**

```
#include <bits/stdc++.h>
using namespace std;
int main() {
int T;
cin >> T;
while (T--) {
int n;
cin >> n;
vector<int> odd, even;
for (int i = 1; i <= n; i++) {
int x;
cin >> x;
if (x % 2 == 0) {
even.push_back(i);
} else {
odd.push_back(i);
}
}
if (odd.size() >= 3) {
cout << "YES\n";
cout << odd[0] << " " << odd[1] << " " << odd[2] << '\n';
} else if (odd.size() >= 1 && even.size() >= 2) {
cout << "YES\n";
cout << odd[0] << " " << even[0] << " " << even[1] << '\n';
} else {
cout << "NO\n";
}
}
}
```

Idea: 74TrAkToR

Preparation: qualdoom

Editorialist: qualdoom

**Tutorial**

Let's note that it doesn't make sense for us to divide into more than $$$k = 2$$$ subsegments. Let's prove it.

Let us somehow split the array $$$a$$$ into $$$m > 2$$$ subsegments : $$$b_1, b_2, \ldots, b_m$$$. Note that $$$\gcd(b_1, b_2, \ldots, b_m) \le \gcd(b_1 + b_2, b_3, \ldots, b_m)$$$, since if $$$b_1$$$ and $$$b_2$$$ were multiples of $$$\gcd(b_1, b_2 , \ldots, b_m)$$$, so $$$b_1 + b_2$$$ is also a multiple of $$$\gcd(b_1, b_2, \ldots, b_m)$$$. This means that we can use $$$b_1 + b_2$$$ instead of $$$b_1$$$ and $$$b_2$$$, and the answer will not worsen, thus it is always beneficial to use no more than $$$k = 2$$$ subsegments.

How to find the answer? Let $$$s$$$ be the sum of the array $$$a$$$. Let's say $$$pref_i = {\sum_{j = 1}^{i} a_j}$$$, then the answer is $$$\max\limits_{1 \le i < n}(\gcd(pref_i, s - pref_i)$$$.

**Implementation (Python)**

```
from math import gcd
t = int(input())
for test in range(t):
n = int(input())
a = list(map(int, input().split()))
s = 0
for i in range(n):
s += a[i]
ans = 0
pref = 0
for i in range(n - 1):
pref += a[i]
ans = max(ans, gcd(pref, s - pref))
print(ans)
```

**Implementation (С++)**

```
#include <bits/stdc++.h>
using namespace std;
int main() {
ios::sync_with_stdio(false);
cin.tie(0), cout.tie(0);
int T = 1;
cin >> T;
while (T--) {
int n;
cin >> n;
vector<int> a(n);
for (int i = 0; i < n; i++) cin >> a[i];
long long s = accumulate(a.begin(), a.end(), 0ll), cur = 0;
long long ans = 1;
for (int i = 0; i < n - 1; i++) {
cur += a[i], s -= a[i];
ans = max(ans, __gcd(s, cur));
}
cout << ans << "\n";
}
}
```

Idea: AndreyPavlov

Preparation: qualdoom, AndreyPavlov

Editorialist: qualdoom, AndreyPavlov

**Tutorial**

There are two similar solutions to this problem, we will tell you both.

**Tutorial (AndreyPavlov)**

Subtract $$$1$$$. What can we say now about the number of units at the beginning of the binary notation of a number? There are exactly $$$cnt - cnt_w + 1$$$, where $$$cnt$$$ is the number of unit bits after subtracting the unit, and $$$cnt_w$$$ is — before subtracting. Now we subtract them, for this we need to subtract the number $$$2^{cnt - cnt_w + 1} - 1$$$ and continue the algorithm. But such an algorithm makes at worst $$$60$$$ requests. To save queries, note that we know the number of units after we remove the units from the beginning, therefore it is useless to make another request for this. Then at the same time, as we remove the units from the beginning, we will immediately subtract the unit. As a result, there will be absolutely no more than $$$cnt$$$ operations, where $$$cnt$$$ is the initial number of single bits in the record of the number $$$n$$$. This number does not exceed $$$O(log_2(n))$$$, in turn, $$$log_2(n)$$$ does not exceed 30, which fits into the restrictions.

**Tutorial (qualdoom)**

Let $$$ans$$$ be the desired number $$$n$$$, and $$$was$$$ be the initial number of bits in the number $$$n$$$. Let's subtract the powers of two : $$$2^0, 2^1, 2^2, ... 2^ k$$$, while $$$was$$$ it will not become 0. We will support the $$$Shift$$$ flag — whether there was a bit transfer to $$$n$$$ when subtracting any power of two. Suppose we subtracted $$$2^k$$$ at the $$$k$$$th step and the number of bits became equal to $$$cnt_{new}$$$, and before the subtraction was $$$cnt$$$, then consider two cases.

1) $$$cnt - cnt_{new} = 1$$$, then bit $$$k$$$ was included in $$$n$$$ at the $$$k$$$th step, and if $$$Shift = false$$$, then we add to $$$ans$$$ — $$$2^k$$$, since there was no bit transfer, which means bit k is also in the original number, and subtract from $$$was$$$ — $$$1$$$. If $$$Shift = true$$$, then we added this bit during previous operations, and it does not need to be taken into account.

2) $$$cnt - cnt_{new} \neq 1$$$, then we know that the number of bits has not decreased, also that in the number $$$n$$$ there was such a bit $$$m$$$ that $$$2^m > 2^k$$$, and at the same time the bit $$$m$$$ is in $$$n$$$. Moreover, $$$m - k - 1 = cnt_{new} - cnt$$$. So $$$m = k + 1 + cnt_{new} - cnt$$$. Let's add $$$2^m$$$ to the answer, subtract from $$$was$$$ — $$$1$$$ and assign the $$$Shift$$$ flag the value $$$true$$$, since there was a bit transfer.

Thus, we found the initial number $$$n$$$, which is equal to $$$ans$$$, and also made no more than $$$O(log_2(n))$$$ queries, since $$$k\le log_2(n)$$$.

Thus, the solution spends no more than 30 requests, which fits into the limitations of the task.

**Implementation (Python)**

```
t = int(input())
def ask (x):
print('-', x)
return int(input())
for i in range(t):
cur = int(input())
add = 0
ans = 0
while cur > 0:
x = ask(add + 1)
back = x - cur + 1
add = 2 ** back - 1
ans += 2 ** back
cur = x - back
print('!', ans)
```

**Implementation (С++)**

```
#include <iostream>
using namespace std;
int ask (int x) {
cout << "- " << x << endl;
if (x == -1)
exit(0);
cin >> x;
return x;
}
int main() {
int t;
cin >> t;
while (t--) {
int cnt;
cin >> cnt;
int n = 0;
int was = 0;
while (cnt > 0) {
n += 1;
int nw = ask(1 + was);
int back = nw - cnt + 1;
n += (1 << back) - 1;
was = (1 << back) - 1;
cnt = nw - back;
}
cout << "! " << n << endl;
}
}
```

1780E - Josuke and Complete Graph

Idea: IzhtskiyTimofey

Preparation: IzhtskiyTimofey

Editorialist: IzhtskiyTimofey

**Tutorial**

Let’s fix $$$g$$$ and check that the $$$g$$$ weight edge exists in $$$G'$$$. The first number, which is divided into $$$g$$$, starting with $$$L$$$ — $$$\lceil \frac{L}{g} \rceil \cdot g$$$, and the second — $$$(\lceil \frac{L}{g} \rceil + 1) \cdot g$$$, note that their $$$\gcd$$$ is $$$g$$$, so the edge between these vertices weighs $$$g$$$. If the second number is greater $$$R$$$, the edge with weight $$$g$$$ in the $$$G'$$$ doesn't exist, because on the segment from $$$L$$$ to $$$R$$$ at most one vertex, which is divided by $$$g$$$. That is, we should calculate the number of such $$$g$$$, which is $$$(\lceil \frac{L}{g} \rceil + 1) \cdot g \leq R$$$.

For $$$g \geq L$$$: $$$(\lceil \frac{L}{g} \rceil + 1) \cdot g = 2 \cdot g$$$. Get the upper limit on the $$$g \leq \lfloor \frac{R}{2} \rfloor$$$. That is, all $$$g$$$ on segment from $$$L$$$ to $$$\lfloor \frac{R}{2} \rfloor$$$ occur in the $$$G'$$$ as weight some edge. Add them to the answer.

Look at $$$g < L$$$.

Note that $$$\lceil \frac{L}{g} \rceil$$$ takes a $$$O(\sqrt{L})$$$ of different values. Let's fix some $$$f = \lceil \frac{L}{g} \rceil$$$. Note that $$$f$$$ corresponds to a consecutive segment $$$l \leq g \leq r$$$. Let's brute this segments in ascending order $$$f$$$. Then, if there is a left border $$$l$$$ of the segment, you can find $$$r$$$ either by binary search or by writing the formula. The next left border is $$$r + 1$$$. Then note, if $$$f$$$ is fixed, then $$$(f + 1) \cdot g \leq R$$$ is equivalent to $$$g \leq \lfloor \frac{R}{f + 1} \rfloor$$$. That is, with a fixed segment from $$$l$$$ to $$$r$$$, $$$g$$$ occurs in the $$$G'$$$ as weight some edge if $$$l \leq g \leq min(r, \lfloor \frac{R}{f + 1} \rfloor)$$$. Then brute all these segments and sum up of all good $$$g$$$.

Overall time complexity is $$$O(\sqrt{L})$$$ or $$$O(\sqrt{L} \cdot log(L))$$$.

**Implementation (Python)**

```
def solve():
L, R = map(int, input().split())
ans = max(0, R // 2 - L + 1)
left = 1
while left < L:
C = (L + left - 1) // left
right = (L + C - 2) // (C - 1) - 1
ans += max(0, min(right, R // (C + 1)) - left + 1)
left = right + 1
print(ans)
t = int(input())
for test in range(t):
solve()
```

**Implementation (С++)**

```
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
int main(){
ios::sync_with_stdio(false);
cin.tie(0), cout.tie(0);
int t;
cin >> t;
for (int test_case = 0; test_case < t; test_case++){
ll L, R;
cin >> L >> R;
ll ans = max(0ll, R / 2 - L + 1);
for (ll left = 1, right; left < L; left = right + 1){
ll C = (L + left - 1) / left;
right = (L + C - 2) / (C - 1) - 1;
ans += max(0ll, min(right, R / (C + 1)) - left + 1);
}
cout << ans << '\n';
}
}
```

Idea: AndreyPavlov

Preparation: AndreyPavlov

Editorialist: qualdoom

**Tutorial**

Let's sort the array and process the triples $$$i, j, k$$$, assuming that $$$i < j < k$$$ and $$$a_i < a_j < a_k$$$. Now if $$$\gcd(a_i, a_k) = 1$$$, then the number of ways to take the index $$$j$$$ is $$$k - i - 1$$$.

We will consider the answer to the problem for each $$$k$$$ from $$$1$$$ to $$$n$$$, assuming that $$$a_k$$$ is the maximum number in the triple. Now let $$$c$$$ be the number of numbers that are mutually prime with $$$a_k$$$ on the prefix from $$$1$$$ to $$$k - 1$$$, and $$$sum$$$ is the sum of their indices. Then you need to add $$$c\cdot i - sum - c$$$ to the answer.

It remains to find out the number of numbers that are mutually prime with $$$a_k$$$ and the sum of their indices. This can be done using the inclusion and exclusion method. Let $$$cnt_i$$$ be the number of numbers $$$a_j$$$ that are divisible by $$$i$$$, $$$s_i$$$ be the sum of the indices $$$j$$$ of numbers $$$a_j$$$ that are divisible by $$$i$$$. Let's look at the prime numbers $$$p_1, p_2, ..., p_m$$$ included in the factorization of the number $$$a_k$$$.

Then let $$$c$$$ initially be equal to the number of numbers on the prefix, and $$$sum$$$ to the sum of the indices on the prefix. Note that then we took into account the extra elements — numbers that are divisible by $$$p_1, p_2, ..., p_m$$$, since they will not be mutually simple with $$$a_k$$$, we subtract them from $$$c$$$ and $$$sum$$$. But having done this, we again took into account the extra elements that are multiples of the numbers of the form $$$p_i * p_j$$$, where $$$i \neq j$$$, add them back, etc. So we can iterate over the mask $$$mask$$$ of the primes $$$p_1, p_2, ..., p_m$$$. And depending on the parity of the bits in the mask, we will subtract or add elements that are multiples of $$$d$$$, where $$$d$$$ is the product of the primes included in $$$mask$$$. Having received $$$c$$$ and $$$sum$$$, we can recalculate the answer for the position $$$i$$$.

To move from position $$$i$$$ to position $$$i+1$$$, update the values of $$$cnt$$$ and $$$s$$$ by adding the element $$$a_{i-1}$$$ by iterating over the mask of the simple element $$$a_{i-1}$$$.

**Implementation (С++)**

```
#include "bits/stdc++.h"
using namespace std;
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;
#define sz(v) ((int)(v).size())
#define all(a) (a).begin(), (a).end()
#define rall(a) a.rbegin(), a.rend()
#define F first
#define S second
#define pb push_back
#define ppb pop_back
#define eb emplace_back
#define time ((double)clock() / (double)CLOCKS_PER_SEC)
using pii = pair<int, int>;
using ll = long long;
using int64 = long long;
using ld = double;
const ll infll = (ll) 1e18 + 27;
const ll inf = (ll) 1e9;
#define dbg(x) cout << #x << " = " << (x) << endl
template<class T>
using pq = priority_queue<T, vector<T>, less<T>>;
template<class T>
using pqr = priority_queue<T, vector<T>, greater<T>>;
template<typename T, typename T2>
istream &operator>>(istream &in, pair<T, T2> &b) {
in >> b.first >> b.second;
return in;
}
template<typename T, typename T2>
ostream &operator<<(ostream &out, const pair<T, T2> &b) {
out << "{" << b.first << ", " << b.second << "}";
return out;
}
template<typename T>
istream &operator>>(istream &in, vector<T> &b) {
for (auto &v : b) {
in >> v;
}
return in;
}
template<typename T>
ostream &operator<<(ostream &out, vector<T> &b) {
for (auto &v : b) {
out << v << ' ';
}
return out;
}
template<typename T>
ostream &operator<<(ostream &out, deque<T> &b) {
for (auto &v : b) {
out << v << ' ';
}
return out;
}
template<typename T>
void print(T x, string end = "\n") {
cout << x << end;
}
template<typename T1, typename T2>
bool chkmin(T1 &x, const T2 &y) { return x > y && (x = y, true); }
template<typename T1, typename T2>
bool chkmax(T1 &x, const T2 &y) { return x < y && (x = y, true); }
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());
const int N = 3e5 + 10;
ll s[N];
ll cnt[N];
ll d[N][20];
int ptr[N];
bool u[N];
ll Cnt = 0;
ll Sum = 0;
ll Ans = 0;
void Answer (int x, int pos) {
ll C = Cnt;
ll X = Sum;
int K = (1ll << ptr[x]);
for (int mask = 1; mask < K; mask++) {
ll k = 1;
for (int j = 0; j < ptr[x]; j++) {
if ((mask >> j) & 1) {
k *= d[x][j];
}
}
int bits = __builtin_popcount(mask);
int D = k;
if (bits % 2 == 1) {
C -= cnt[D];
X -= s[D];
} else {
C += cnt[D];
X += s[D];
}
}
Ans += C * pos - X;
}
void add (int x, int pos) {
Cnt += 1;
Sum += pos + 1;
auto v = d[x];
int K = (1ll << ptr[x]);
for (int mask = 1; mask < K; mask++) {
ll k = 1;
for (int j = 0; j < ptr[x]; j++) {
if ((mask >> j) & 1) {
k *= d[x][j];
}
}
int D = k;
s[D] += pos + 1;
cnt[D] += 1;
}
}
void solve() {
for (int i = 2; i < N; i++) {
if (!u[i]) {
for (int j = i; j < N; j += i) {
u[j] = true;
d[j][ptr[j]] = i;
ptr[j]++;
}
}
}
int n;
cin >> n;
vector<int> a(n);
cin >> a;
sort(all(a));
for (int i = 0; i < n; i++) {
Answer(a[i], i);
if (i > 0) {
add(a[i - 1], i - 1);
}
}
cout << Ans << "\n";
}
int32_t main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
cout.tie(nullptr);
solve();
return 0;
}
```

Idea: IzhtskiyTimofey

Preparation: IzhtskiyTimofey, AndreyPavlov

Editorialist: IzhtskiyTimofey, AndreyPavlov

**Tutorial**

This problem has several solutions using different suffix structures. We will tell two of them — using suffix array, and using suffix automaton.

**Suffix array (t4m0fey)**

Let's build suffix array $$$suf$$$ and array $$$lcp$$$ (largest common prefixes) on the string $$$s$$$. Fix some $$$k > 1$$$ и consider about all substring $$$s$$$ length $$$k$$$. Match $$$1$$$ to positions $$$i$$$ array $$$lcp$$$, such that $$$lcp_i \geq k$$$, and match $$$0$$$ to other positions. One in position $$$i$$$ means that the substrings length $$$k$$$ starting at $$$suf_i$$$ and $$$suf_{i + 1}$$$ are equal. Consider any block of units length $$$len$$$, then for $$$len + 1$$$ substrings length $$$k$$$ number of occurrences in $$$s$$$ — $$$len + 1$$$, then in order for the substrings in this block to be delicious, it is necessary that $$$len + 1$$$ divided by $$$k$$$.

Let's brute $$$k$$$ from $$$n$$$ to $$$2$$$ и sustain all sizes blocks. Then, when shift to a new $$$k$$$ should events — change $$$0$$$ to $$$1$$$, for all $$$i$$$, such that $$$lcp_i = k$$$. To do this you can sustain DSU (Disjoint set union). Then for each block size we know the number of blocks with this size. Then it is enough to consider all blocks of length $$$len$$$, such as $$$len + 1$$$ — divider $$$k$$$. It can be done explicitly, just brute $$$k$$$, $$$2 \cdot k$$$, ..., as $$$len + 1$$$. And this works in sum of harmonic series: $$${\sum_{k = 2}^{n} \lfloor \frac{n}{k} \rfloor}$$$ $$$= O(n \cdot log(n))$$$.

For $$$k = 1$$$, obviously, any substring length $$$1$$$ satisfies, so you can just add $$$n$$$ to the answer.

Overall time complexity is $$$O(n \cdot log(n))$$$.

**Suffix automaton (AndreyPavlov)**

The solution with the suffix automaton is as follows: let's build the suffix automaton itself, now we calculate for each vertex of the suffix automaton the dynamics $$$dp_v$$$ — this is the number of paths from the vertex $$$v$$$ to the terminal vertices. This dynamics means the number of occurrences of the substring corresponding to this vertex in the entire string. Let's introduce the function $$$f(v)$$$ — the length of the longest substring leading to the vertex $$$v$$$. We know that all substrings of length from $$$l_v$$$ to $$$r_v$$$ lead to the vertex $$$v$$$ of the suffix automaton — each once. Where $$$r_v = f(v)$$$ and $$$l_v = f(suff_v) + 1$$$, where $$$suff_v$$$ is the suffix link of $$$v$$$. Why is it so? All substrings of the form $$$s[x:k], s[x+1:k], ..., s[y:k]$$$ lead to the vertex $$$v$$$ of the suffix automaton, and there is a suffix link $$$suff_v$$$ to which lead all substrings of the form $$$s[x + 1:k], s[y + 2:k], ..., s[t: k]$$$.

In order to solve the problem, let's go through the $$$v$$$ vertex and look at the number of occurrences of any substring that leads to the $$$v$$$ — $$$dp_v$$$ vertex, then fix $$$c$$$ — the number of such $$$l_v \le x \le r_v$$$, that $$$dp_v$$$ is evenly divisible by $$$x$$$. Therefore, $$$dp_v \cdot c$$$ must be added to the answer. All divisors can be stored in $$$O(n \cdot log(n))$$$ and each time find the number of such $$$x$$$ by binsearch.

Asymptotics $$$O(n \cdot log(n))$$$

**Implementation (t4m0fey)**

```
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
vector<int> suffix_array(string s){
int n = s.size();
vector<int> a(n), c(n), h(n);
vector<pair<char, int>> t;
for (int i = 0; i < n; i++) t.push_back({s[i], i});
sort(t.begin(), t.end());
int cur = -1;
for (int i = 0; i < n; i++){
if (i == 0 || t[i].first != t[i - 1].first){
cur++;
h[cur] = i;
}
a[i] = t[i].second;
c[a[i]] = cur;
}
vector<int> a1(n), c1(n);
for (int len = 1; len < n; len *= 2){
for (int i = 0; i < n; i++){
int j = (n + a[i] - len) % n;
a1[h[c[j]]++] = j;
}
a = a1;
cur = -1;
for (int i = 0; i < n; i++){
if (i == 0 || c[a[i]] != c[a[i - 1]] || c[(a[i] + len) % n] != c[(a[i - 1] + len) % n]){
cur++;
h[cur] = i;
}
c1[a[i]] = cur;
}
c = c1;
}
return a;
}
vector<int> build_lcp(string s, vector<int> suf){
int n = s.size();
vector<int> rsuf(n), lcp(n);
for (int i = 0; i < n; i++) rsuf[suf[i]] = i;
int cur = 0;
for (int i = 0; i < n; i++){
int j = rsuf[i];
if (j != n - 1){
while (s[suf[j] + cur] == s[suf[j + 1] + cur]) cur++;
lcp[j] = cur;
}
if (cur > 0) cur--;
}
return lcp;
}
const int N = 1e6 + 1;
int r[N], l[N], us[N], cnt[N];
vector<int> pos[N];
int main() {
ios::sync_with_stdio(false), cin.tie(0);
int n;
cin >> n;
string s;
cin >> s;
s += '$';
vector<int> suf = suffix_array(s);
vector<int> lcp = build_lcp(s, suf);
for (int i = 0; i < lcp.size(); i++) {
pos[lcp[i]].push_back(i);
}
ll ans = n;
for (int k = n; k >= 2; k--) {
for (int i : pos[k]) {
// add i
us[i] = true;
cnt[1]++;
l[i] = r[i] = i;
if (i != 0 && us[i - 1]) {
cnt[i - l[i - 1]]--;
cnt[1]--;
cnt[i - l[i - 1] + 1]++;
l[i] = l[i - 1];
r[l[i - 1]] = i;
}
if (i + 1 < lcp.size() && us[i + 1]) {
cnt[r[i + 1] - i]--;
cnt[i - l[i] + 1]--;
cnt[r[i + 1] - l[i] + 1]++;
l[r[i + 1]] = l[i];
r[l[i]] = r[i + 1];
}
}
for (int x = k; x <= n; x += k) {
ans += 1ll * x * cnt[x - 1];
}
}
cout << ans << "\n";
}
```

**Implementation (AndreyPavlov)**

```
/* Includes */
#include <bits/stdc++.h>
#include <ext/pb_ds/tree_policy.hpp>
/* Using libraries */
using namespace std;
/* Defines */
#define fast ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0)
#define ld long double
#define pb push_back
#define vc vector
#define sz(a) (int)a.size()
#define forn(i, n) for (int i = 0; i < n; ++i)
#define pii pair <int, int>
#define vec pt
#define all(a) a.begin(), a.end()
const int K = 26;
const int N = 1e6 + 1;
struct node {
int next[K];
int suf = -1, pr = -1, dp = 0, len = 0, ch = -1;
node () {
forn (i, K) next[i] = -1;
}
};
int get (char c) {
if (c >= 'a')
return c - 'a';
return c - 'A';
}
int lst = 0, sz = 1;
node t[N * 2];
int used[N * 2];
int add (int a, int x) {
int b = sz++;
t[b].pr = a;
t[b].suf = 0;
t[b].ch = x;
for (; a != -1; a = t[a].suf) {
if (t[a].next[x] == -1) {
t[a].next[x] = b;
t[b].len = max(t[b].len, t[a].len + 1);
continue;
}
int c = t[a].next[x];
if (t[c].pr == a) {
t[b].suf = c;
break;
}
int d = sz++;
forn (i, K) t[d].next[i] = t[c].next[i];
t[d].suf = t[c].suf;
t[c].suf = t[b].suf = d;
t[d].pr = a;
t[d].ch = x;
for (; a != -1 && t[a].next[x] == c; a = t[a].suf) {
t[a].next[x] = d;
t[d].len = max(t[d].len, t[a].len + 1);
}
break;
}
return b;
}
void add (char c) {
lst = add(lst, get(c));
}
void dfs (int u) {
used[u] = 1;
for (int i = 0; i < K; ++i) {
if (t[u].next[i] == -1) continue;
int v = t[u].next[i];
if (!used[v])
dfs(v);
t[u].dp += t[v].dp;
}
}
vc <int> p[N], pr;
int dr[N];
vc <pii> d;
int l, r, cur = 0;
int cnt_log (int x, int y) {
int z = 1, res = 0;
while (y >= z) {
z *= x;
++res;
}
return res - 1;
}
void rec (int i, int x) {
if (i == sz(d)) {
cur += l <= x;
return;
}
rec(i + 1, x);
for (int j = 1; j <= d[i].second; ++j) {
x *= d[i].first;
if (x > r)
break;
rec(i + 1, x);
}
}
void solve () {
int n;
cin >> n;
string s;
cin >> s;
for (char c : s)
add(c);
for (int a = lst; a != -1; a = t[a].suf)
t[a].dp = 1;
dfs(0);
for (int i = 2; i <= n; ++i) {
if (dr[i] == 0) {
dr[i] = i;
pr.pb(i);
}
for (int j = 0; j < sz(pr) && pr[j] <= dr[i] && i * pr[j] <= n; ++j)
dr[i * pr[j]] = pr[j];
}
long long ans = 0;
forn (i, sz) {
if (t[i].len == 0) continue;
l = t[t[i].suf].len + 1;
r = t[i].len;
int x = t[i].dp;
d.clear();
while (x > 1) {
int y = dr[x];
if (d.empty() || d.back().first != y)
d.pb({y, 1});
else
d.back().second++;
x /= y;
}
rec(0, 1);
ans += t[i].dp * cur;
cur = 0;
}
cout << ans << '\n';
}
/* Starting and precalcing */
signed main() {
/* freopen("input.txt","r",stdin);freopen("output.txt","w",stdout); */
fast;
int t = 1;
// cin >> t;
while (t--) solve();
return 0;
}
```