**Hint 1**

Only add problems when they are needed.

**Tutorial**

**Solution**

```
//By: OIer rui_er
#include <bits/stdc++.h>
#define rep(x, y, z) for(int x = (y); x <= (z); ++x)
#define per(x, y, z) for(int x = (y); x >= (z); --x)
#define endl '\n'
using namespace std;
typedef long long ll;
const int N = 105;
int T, n, a[N], b[N];
int main() {
ios::sync_with_stdio(false);
cin.tie(0); cout.tie(0);
for(cin >> T; T; --T) {
cin >> n;
rep(i, 1, n) cin >> a[i];
rep(i, 1, n) cin >> b[i];
int diff = 0, ans = 0;
rep(i, 1, n) {
if(a[i - diff] > b[i]) {
++ans;
++diff;
}
}
cout << ans << endl;
}
return 0;
}
```

**Hint 1**

Is there anything that *never* / *always* changes after each operation?

**Hint 2**

The parity.

**Tutorial**

**Solution**

```
//By: OIer rui_er
#include <bits/stdc++.h>
#define rep(x, y, z) for(int x = (y); x <= (z); ++x)
#define per(x, y, z) for(int x = (y); x >= (z); --x)
#define endl '\n'
using namespace std;
typedef long long ll;
int T, n;
string s;
int main() {
ios::sync_with_stdio(false);
cin.tie(0); cout.tie(0);
for(cin >> T; T; --T) {
cin >> n >> s;
int cntU = 0;
for(char c : s) if(c == 'U') ++cntU;
if(cntU & 1) cout << "YES" << endl;
else cout << "NO" << endl;
}
return 0;
}
```

**Hint 1**

What's the answer if $$$a_1=a_2=\cdots=a_n$$$ and $$$k=0$$$?

**Hint 2**

What's the answer if $$$k=0$$$?

**Hint 3**

You've already known the $$$\mathcal O(k)$$$ solution. How to improve it?

**Tutorial**

**Solution**

```
#include<bits/stdc++.h>
using namespace std;
void solve()
{
int n;long long k;
cin>>n>>k;
vector<long long>a(n);
for(int x=0;x<n;x++)
cin>>a[x];
sort(a.begin(),a.end());
reverse(a.begin(),a.end());
long long lst=a.back(),cnt=1;
a.pop_back();
while(!a.empty()&&lst==a.back())a.pop_back(),cnt++;
while(!a.empty())
{
long long delta=a.back()-lst;
if(k<delta*cnt)break;
k-=delta*cnt;
lst=a.back();
while(!a.empty()&&lst==a.back())a.pop_back(),cnt++;
}
lst+=k/cnt;
k%=cnt;
cnt-=k;
cout<<lst*n-cnt+1<<endl;
}
main()
{
ios::sync_with_stdio(false),cin.tie(0);
int t;
cin>>t;
while(t--)solve();
}
```

1967B1 - Reverse Card (Easy Version)

**Hint 1**

Denote $$$\gcd(a,b)$$$ as $$$d$$$.

**Hint 2**

Did you notice that $$$b\mid a$$$? How to prove that?

**Tutorial**

**Solution**

```
#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
const int N=2000005;
int tc,n,m; ll ans;
inline void solve(){
cin>>n>>m; ans=0;
for(int i=1;i<=m;i++)
ans+=(n+i)/(1ll*i*i);
cout<<ans-1<<'\n';
}
int main(){
ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
cin>>tc; while(tc--) solve();
return 0;
}
```

1967B2 - Reverse Card (Hard Version)

**Hint 1**

Denote $$$\gcd(a,b)$$$ as $$$d$$$. Assume that $$$a=pd$$$ and $$$b=qd$$$.

**Hint 2**

$$$\gcd(p,q)=1$$$.

**Hint 3**

How large could $$$p$$$ and $$$q$$$ be?

**Tutorial**

**Solution**

```
#include <bits/stdc++.h>
using namespace std;
#define nl "\n"
#define nf endl
#define ll long long
#define pb push_back
#define _ << ' ' <<
#define INF (ll)1e18
#define mod 998244353
#define maxn 110
int main() {
ios::sync_with_stdio(0);
cin.tie(0);
#if !ONLINE_JUDGE && !EVAL
ifstream cin("input.txt");
ofstream cout("output.txt");
#endif
int t;
cin>>t;
while(t--)
{
ll n,m; cin >> n>>m;
ll sq = sqrt(n) + 2,sqm=sqrt(m)+2;
vector bad(sq + 1, vector<bool>(sqm+1, 0));
for (ll i = 2; i <= min(sq,sqm); i++) {
for (ll a = i; a <= sq; a += i) {
for (ll b = i; b <= sqm; b += i) {
bad[a][b] = true;
}
}
}
ll ans = 0;
for (ll a = 1; a * a <= n; a++) {
for (ll b = 1; b * b <= m; b++) {
if (bad[a][b]) continue;
ans += min(n/(a+b)/a,m/(a+b)/b);
}
}
cout << ans << nl;
}
return 0;
}
```

**Hint 1**

The height of a Fenwick Tree is $$$\mathcal O(\log n)$$$, so operations like enumerating ancestors of each vertex will be acceptable.

**Hint 2**

What's the coefficient of $$$a_u$$$ in each $$$b$$$ value of its ancestors?

**Tutorial**

**Solution**

```
//By: OIer rui_er
#include <bits/stdc++.h>
#define rep(x, y, z) for(int x = (y); x <= (z); ++x)
#define per(x, y, z) for(int x = (y); x >= (z); --x)
#define debug(format...) fprintf(stderr, format)
#define fileIO(s) do {freopen(s".in", "r", stdin); freopen(s".out", "w", stdout);} while(false)
#define endl '\n'
using namespace std;
typedef long long ll;
mt19937 rnd(std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now().time_since_epoch()).count());
int randint(int L, int R) {
uniform_int_distribution<int> dist(L, R);
return dist(rnd);
}
template<typename T> void chkmin(T& x, T y) {if(x > y) x = y;}
template<typename T> void chkmax(T& x, T y) {if(x < y) x = y;}
template<int mod>
inline unsigned int down(unsigned int x) {
return x >= mod ? x - mod : x;
}
template<int mod>
struct Modint {
unsigned int x;
Modint() = default;
Modint(unsigned int x) : x(x) {}
friend istream& operator>>(istream& in, Modint& a) {return in >> a.x;}
friend ostream& operator<<(ostream& out, Modint a) {return out << a.x;}
friend Modint operator+(Modint a, Modint b) {return down<mod>(a.x + b.x);}
friend Modint operator-(Modint a, Modint b) {return down<mod>(a.x - b.x + mod);}
friend Modint operator*(Modint a, Modint b) {return 1ULL * a.x * b.x % mod;}
friend Modint operator/(Modint a, Modint b) {return a * ~b;}
friend Modint operator^(Modint a, int b) {Modint ans = 1; for(; b; b >>= 1, a *= a) if(b & 1) ans *= a; return ans;}
friend Modint operator~(Modint a) {return a ^ (mod - 2);}
friend Modint operator-(Modint a) {return down<mod>(mod - a.x);}
friend Modint& operator+=(Modint& a, Modint b) {return a = a + b;}
friend Modint& operator-=(Modint& a, Modint b) {return a = a - b;}
friend Modint& operator*=(Modint& a, Modint b) {return a = a * b;}
friend Modint& operator/=(Modint& a, Modint b) {return a = a / b;}
friend Modint& operator^=(Modint& a, int b) {return a = a ^ b;}
friend Modint& operator++(Modint& a) {return a += 1;}
friend Modint operator++(Modint& a, int) {Modint x = a; a += 1; return x;}
friend Modint& operator--(Modint& a) {return a -= 1;}
friend Modint operator--(Modint& a, int) {Modint x = a; a -= 1; return x;}
friend bool operator==(Modint a, Modint b) {return a.x == b.x;}
friend bool operator!=(Modint a, Modint b) {return !(a == b);}
};
const int N = 1e6 + 100, mod = 998244353;
typedef Modint<mod> mint;
int T, n, k;
mint a[N], inv[N];
inline int lowbit(int x) {return x & -x;}
int main() {
ios::sync_with_stdio(false);
cin.tie(0); cout.tie(0);
inv[0] = inv[1] = 1;
rep(i, 2, N - 1) inv[i] = (mod - mod / i) * inv[mod % i];
for(cin >> T; T; --T) {
cin >> n >> k;
rep(i, 1, n) cin >> a[i];
rep(i, 1, n) {
mint mul = 1;
for(int u = i + lowbit(i), d = 1; u <= n; u += lowbit(u), ++d) {
mul *= (d + k - 1) * inv[d];
a[u] -= mul * a[i];
}
}
rep(i, 1, n) cout << a[i] << " \n"[i == n];
}
return 0;
}
```

1967D - Long Way to be Non-decreasing

**Hint 1**

Binary search on the answer of magics.

**Hint 2**

You may come up with many $$$\mathcal O(m\log m + n\log^2 m)$$$ solutions with heavy data structures. Unfortunately, none of them is helpful.

**Hint 3**

The key is to judge $$$\mathcal O(m\log n)$$$ times whether vertex $$$u$$$ is reachable from vertex $$$v$$$ in $$$k$$$ steps, instead of querying the minimal value or something else.

**Tutorial**

**Solution**

```
#include <bits/stdc++.h>
namespace FastIO {
template <typename T> inline T read() { T x = 0, w = 0; char ch = getchar(); while (ch < '0' || ch > '9') w |= (ch == '-'), ch = getchar(); while ('0' <= ch && ch <= '9') x = x * 10 + (ch ^ '0'), ch = getchar(); return w ? -x : x; }
template <typename T> inline void write(T x) { if (!x) return; write<T>(x / 10), putchar(x % 10 ^ '0'); }
template <typename T> inline void print(T x) { if (x < 0) putchar('-'), x = -x; else if (x == 0) putchar('0'); write<T>(x); }
template <typename T> inline void print(T x, char en) { if (x < 0) putchar('-'), x = -x; else if (x == 0) putchar('0'); write<T>(x), putchar(en); }
}; using namespace FastIO;
#define MAXM 1000001
int dep[MAXM], id[MAXM], dfn[MAXM], to[MAXM], sz[MAXM], tot = 0;
std::vector<int> ch[MAXM];
void dfs(int u) {
sz[u] = 1, dfn[u] = ++tot;
for (int v : ch[u]) {
dep[v] = dep[u] + 1, id[v] = id[u];
dfs(v), sz[u] += sz[v];
}
}
inline bool inSub(int u, int v) /* v \in u ? */ { return dfn[u] <= dfn[v] && dfn[v] < dfn[u] + sz[u]; }
constexpr int INF = 0x3f3f3f3f;
inline int query(int u, int v) /* u -> v */ {
if (u == v) return 0;
if (id[u] != id[v]) return INF;
int res = INF;
if (inSub(v, u)) res = dep[u] - dep[v];
if (inSub(v, to[id[u]])) res = std::min(dep[u] - dep[v] + dep[to[id[u]]] + 1, res);
// printf("query(%d, %d) = %d\n", u, v, res);
return res;
}
#define MAXN 1000001
int a[MAXN], N, M;
bool check(int val) {
// printf("check %d\n", val);
int lst = 1;
for (int i = 1; i <= N; ++i) {
while (lst <= M && query(a[i], lst) > val) ++lst;
if (lst > M) return false;
// printf("a[%d] = %d\n", i, lst);
}
return true;
}
namespace DSU {
int fa[MAXM];
void inis(int n) { for (int i = 1; i <= n; ++i) fa[i] = i; }
inline int find(int x) { return x == fa[x] ? x : fa[x] = find(fa[x]); }
inline bool merge(int x, int y) { if (find(x) == find(y)) return false; fa[fa[x]] = fa[y]; return true; }
}; using namespace DSU;
int main() {
int T = read<int>();
while (T--) {
N = read<int>(), M = read<int>(), inis(M);
for (int i = 1; i <= N; ++i) a[i] = read<int>();
for (int x = 1; x <= M; ++x) dep[x] = id[x] = dfn[x] = to[x] = sz[x] = 0, ch[x].clear();
tot = 0;
for (int i = 1, p; i <= M; ++i) {
p = read<int>();
if (merge(i, p)) ch[p].push_back(i); else to[i] = p;
}
for (int i = 1; i <= M; ++i) if (to[i] > 0) id[i] = i, dfs(i);
if (!check(M)) { puts("-1"); continue; }
int L = 0, R = M;
while (L < R) {
int mid = L + R >> 1;
if (check(mid)) R = mid; else L = mid + 1;
}
print<int>(R, '\n');
}
}
```

1967E1 - Again Counting Arrays (Easy Version)

**Hint 1**

Use the simplest way to judge if an $$$a$$$ is valid.

**Hint 2**

We've got a $$$\mathcal O(nm)$$$ solution. Our target time complexity is $$$\mathcal O(n\sqrt{n})$$$.

**Hint 3**

It can be boiled down to a grid path counting problem.

**Tutorial**

**Solution**

```
#include <bits/stdc++.h>
namespace FastIO {
template <typename T> inline T read() { T x = 0, w = 0; char ch = getchar(); while (ch < '0' || ch > '9') w |= (ch == '-'), ch = getchar(); while ('0' <= ch && ch <= '9') x = x * 10 + (ch ^ '0'), ch = getchar(); return w ? -x : x; }
template <typename T> inline void write(T x) { if (!x) return; write<T>(x / 10), putchar(x % 10 ^ '0'); }
template <typename T> inline void print(T x) { if (x < 0) putchar('-'), x = -x; else if (x == 0) putchar('0'); write<T>(x); }
template <typename T> inline void print(T x, char en) { if (x < 0) putchar('-'), x = -x; else if (x == 0) putchar('0'); write<T>(x), putchar(en); }
}; using namespace FastIO;
#define MAXN 2000001
namespace Maths {
constexpr int MOD = 998244353;
long long qpow(long long a, long long x) { long long ans = 1; while (x) (x & 1) && (ans = ans * a % MOD), a = a * a % MOD, x >>= 1; return ans; }
long long frac[MAXN << 1 | 1], prac[MAXN << 1 | 1];
inline void inis(int V = MAXN * 2) { frac[0] = prac[0] = 1; for (int i = 1; i <= V; ++i) frac[i] = frac[i - 1] * i % MOD; prac[V] = qpow(frac[V], MOD - 2); for (int i = V - 1; i; --i) prac[i] = prac[i + 1] * (i + 1) % MOD; }
inline long long C(int N, int M) { if (N < 0 || M < 0 || N < M) return 0; return frac[N] * prac[M] % MOD * prac[N - M] % MOD; }
}; using namespace Maths;
struct Point {
int x, y;
Point () {}
Point (int X, int Y) : x(X), y(Y) {}
inline void flip(int b) { x += b, y -= b, std::swap(x, y); }
inline int calc() { return C(x + y, y); }
} pA, pB;
inline void add(int& x, int y) { (x += y) >= MOD && (x -= MOD); }
inline void del(int& x, int y) { (x -= y) < 0 && (x += MOD); }
int calc(int p, int q, int b1, int b2) {
pA = pB = Point(p, q);
int ans = pA.calc();
while (pA.x >= 0 && pA.y >= 0)
pA.flip(b1), del(ans, pA.calc()), pA.flip(b2), add(ans, pA.calc());
while (pB.x >= 0 && pB.y >= 0)
pB.flip(b2), del(ans, pB.calc()), pB.flip(b1), add(ans, pB.calc());
return ans;
}
int dp[2][3001], powm[MAXN];
void solve() {
int N = read<int>(), M = read<int>(), b0 = read<int>();
if (b0 >= M) return (void)print<int>(qpow(M, N), '\n');
powm[0] = 1; for (int i = 1; i <= N; ++i) powm[i] = 1ll * M * powm[i - 1] % MOD;
if (1ll * M * M <= N) { // dp
for (int k = 0; k < M; ++k) dp[0][k] = (int)(k == b0), dp[1][k] = 0;
int ans = 1ll * dp[0][M - 1] * (M - 1) % MOD * powm[N - 1] % MOD;
for (int i = 1; i <= N; ++i) {
auto now = dp[i & 1], lst = dp[(i & 1) ^ 1];
now[0] = 0; for (int k = 0; k + 1 < M; ++k) now[k + 1] = 1ll * lst[k] * (M - 1) % MOD;
for (int k = 1; k < M; ++k) add(now[k - 1], lst[k]);
if (i < N) add(ans, 1ll * now[M - 1] * (M - 1) % MOD * powm[N - i - 1] % MOD);
}
for (int k = 0; k < M; ++k) add(ans, dp[N & 1][k]);
print<int>(ans, '\n');
} else { // reflective inclusion-exclusion
const int B1 = M - b0, B2 = -1 - b0; int ans = qpow(M, N);
for (int x = b0, y = 0, k = 1, p = b0; p < N; p += 2, ++x, ++y, k = 1ll * k * (M - 1) % MOD)
del(ans, 1ll * calc(x, y, B1, B2) * k % MOD * powm[N - p - 1] % MOD);
print<int>(ans, '\n');
}
}
int main() { int T = read<int>(); inis(); while (T--) solve(); return 0; }
```

1967E2 - Again Counting Arrays (Hard Version)

Thanks A_G for discovering that E2 is possible!

**Hint 1**

Solve E1 with $$$\mathcal O(n\sqrt{n})$$$ solution first (The $$$\mathcal O(n\log^2n)$$$ solution doesn't help much in E2).

**Hint 2**

For a single round of inclusion-exclusion, write down the form of the answer as simple as possible.

**Tutorial**

**Solution**

```
#include <bits/stdc++.h>
using namespace std;
const int MOD = 998244353;
const int N = 2e6+5;
int inv[N];
int inversemod(int p, int q) {
// assumes p > 0
// https://mirror.codeforces.com/blog/entry/23365
return (p > 1 ? q-1LL*inversemod(q%p, p)*q/p : 1);
}
void add(int& x, int y) {
x += y;
if (x >= MOD) x -= MOD;
}
void sub(int& x, int y) {
x -= y;
if (x < 0) x += MOD;
}
int solve(int n, int m, int b0) {
// let X = hit -1, Y = hit m
// f(S) = count of sequences that end up in [0, infty] while containing the pattern S
// g(S) = count of sequences that end up in [-infty, -1] while containing the pattern S
// we want f() - f(X) + f(YX) - f(XYX) + f(YXYX) - f(XYXYX) + ...
// + g(Y) - g(XY) + g(YXY) - g(XYXY) + g(YXYXY) - ...
vector<int> pw(n+1);
pw[0] = 1;
for (int i = 1; i <= n; i++) pw[i] = 1LL*pw[i-1]*(m-1) % MOD;
// final ans will be sum from i = 0 to n of (n choose i) a_i
vector<int> a(n+2);
auto work = [&] (int c, int pw_coeff, int sgn_x) -> bool {
// let RANGE = [-infty, -1] if sgn_x == -1 and [0, infty] if sgn_x == 1
// for all x in RANGE such that (n+x+c)/2 is between 0 and n inclusive,
// add pw_coeff*pw[(n+x-b0)/2] to a[(n+x+c)/2]
// return 0 to signal that we are out of bounds and should exit, otherwise 1
int l = 0;
if (sgn_x == 1) l = max(l, (n+c+1)>>1);
int r = n;
if (sgn_x == -1) r = min(r, (n+c-1)>>1);
if (l > r) return 0;
add(a[l], 1LL*pw_coeff*pw[l-(b0+c)/2] % MOD);
sub(a[r+1], 1LL*pw_coeff*pw[r+1-(b0+c)/2] % MOD);
return 1;
};
int ans = 0;
// f(k*YX)
// after reflection trick, end up in x + 2*(m+1)*k
for (int k = 0; work(2*(m+1)*k - b0, 1, 1); k++);
// f(X + k*YX)
// after reflection trick, end up in -2-x - 2*(m+1)*k
for (int k = 0; work(2*(m+1)*k+2+b0, MOD-1, 1); k++);
// g(Y + k*XY)
// after reflection trick, end up in 2*m-x + 2*(m+1)*k
for (int k = 0; work(-2*m -2*(m+1)*k + b0, 1, -1); k++);
// g(k*XY)
// after reflection trick, end up in x - 2*(m+1)*k
for (int k = 1; work(-2*(m+1)*k - b0, MOD-1, -1); k++);
for (int i = 1; i <= n; i++) {
add(a[i], 1LL*a[i-1]*(m-1) % MOD);
}
// do the binomial stuff without precalculated factorials because why not
int coeff = 1;
for (int i = 0; i <= n; i++) {
add(ans, 1LL * coeff * a[i] % MOD);
coeff = 1LL * coeff * (n-i) % MOD * inv[i+1] % MOD;
}
return ans;
}
int main () {
ios_base::sync_with_stdio(0); cin.tie(0);
inv[1] = 1;
for (int i = 2; i < N; i++) inv[i] = 1LL*(MOD-MOD/i)*inv[MOD % i] % MOD;
int T;
cin >> T;
while (T--) {
int n, m, b0;
cin >> n >> m >> b0;
if (b0 >= m) {
int ans = 1;
for (int i = 0; i < n; i++) ans = 1LL*ans*m % MOD;
cout << ans << '\n';
continue;
}
cout << solve(n, m, b0) << '\n';
}
}
```

**Hint 1**

How to maintain $$$\sum\min(nxt_i-pre_i,x)$$$? Try $$$\sum\min(nxt_i-i,x)+\min(i-pre_i,x)$$$.

**Hint 2**

To maintain $$$\sum\min(nxt_i-i,x)$$$, we can use chunking. Just +1 and $$$\operatorname{chkmin}$$$.

**Hint 3**

To finish it, consider what we do in segment-beats.

**Tutorial**

**Solution**

```
#include<bits/stdc++.h>
using namespace std;
constexpr int maxn=300010,maxq=100010,B=400;
int n,bn,a[maxn],b[maxn],idx[maxn],nxt[maxn],add[maxn/B+5],mxadd[maxn/B+5],mx[maxn/B+5],se[maxn/B+5],t[maxn];
long long ans[maxq];
vector<int>val[maxn/B+5],mxval[maxn/B+5],pre[maxn/B+5],mxpre[maxn/B+5],pos[maxn/B+5],ks[maxn];
void vAdd(int i)
{
for(;i<=n;i+=(i&-i)) t[i]++;
}
int nQuery(int i)
{
int s=0;
for(;i;i-=(i&-i)) s+=t[i];
return s;
}
void vWork()
{
int i,j,tot=0;
for(i=1;i<=n;i++) b[a[i]]=i;
for(i=1;i<=n;i++)
{
int p=b[i],bp=(p-1)/B+1;
val[bp].clear();
mxval[bp].clear();
auto radixsort=[](vector<int>&v)
{
if(v.empty()) return;
static int buc[1024],res[B+5];
auto tmp=minmax_element(v.begin(),v.end());
int mn=*tmp.first,rg=*tmp.second-mn;
if(!rg) return;
int lv=__lg(rg)/2+1,len=1<<lv,i;
memset(buc,0,len*4);
for(int &it:v) it-=mn,buc[it&(len-1)]++;
for(i=1;i<len;i++) buc[i]+=buc[i-1];
for(int it:v) res[--buc[it&(len-1)]]=it;
memset(buc,0,len*4);
for(int it:v) buc[it>>lv]++;
for(i=1;i<len;i++) buc[i]+=buc[i-1];
for(i=v.size()-1;i>=0;i--) v[--buc[res[i]>>lv]]=res[i]+mn;
};
auto getpre=[&](vector<int>&pre,const vector<int>&ori)
{
pre.resize(ori.size());
if(ori.empty()) return;
pre[0]=ori[0];
for(int i=1;i<(int)ori.size();i++) pre[i]=pre[i-1]+ori[i];
};
int lstmx=mx[bp];
vAdd(p);
idx[p]=nQuery(p);
mx[bp]=nxt[p]=n*2;
se[bp]=0;
auto it=pos[bp].begin();
for(;it<pos[bp].end();it++)
{
j=*it;
if(j>p) break;
nxt[j]+=add[bp];
if(nxt[j]>lstmx) nxt[j]+=mxadd[bp];
nxt[j]=min(nxt[j],idx[p]);
idx[j]+=add[bp];
if(nxt[j]>mx[bp]) se[bp]=mx[bp],mx[bp]=nxt[j];
else if(nxt[j]>se[bp]) se[bp]=nxt[j];
}
it=pos[bp].insert(it,p);
for(it++;it!=pos[bp].end();it++)
{
j=*it;
nxt[j]+=add[bp];
if(nxt[j]>lstmx) nxt[j]+=mxadd[bp];
nxt[j]++;
idx[j]+=add[bp]+1;
if(nxt[j]>mx[bp]) se[bp]=mx[bp],mx[bp]=nxt[j];
else if(nxt[j]>se[bp]) se[bp]=nxt[j];
}
for(int j:pos[bp])
{
if(nxt[j]==mx[bp]) mxval[bp].push_back(nxt[j]-idx[j]);
else val[bp].push_back(nxt[j]-idx[j]);
}
add[bp]=mxadd[bp]=0;
radixsort(val[bp]);
getpre(pre[bp],val[bp]);
radixsort(mxval[bp]);
getpre(mxpre[bp],mxval[bp]);
for(j=bp+1;j<=bn;j++) add[j]++,mx[j]++,se[j]++;
for(j=1;j<bp;j++)
{
if(mx[j]<=idx[p]) continue;
if(se[j]<idx[p])
{
mxadd[j]+=idx[p]-mx[j],mx[j]=idx[p];
continue;
}
val[j].clear();
mxval[j].clear();
lstmx=mx[j];
mx[j]=idx[p],se[j]=0;
for(int x:pos[j])
{
nxt[x]+=add[j];
idx[x]+=add[j];
if(nxt[x]>lstmx) nxt[x]+=mxadd[j];
if(nxt[x]>=idx[p])
{
nxt[x]=idx[p];
mxval[j].push_back(nxt[x]-idx[x]);
}
else
{
if(nxt[x]>se[j]) se[j]=nxt[x];
val[j].push_back(nxt[x]-idx[x]);
}
}
add[j]=mxadd[j]=0;
radixsort(val[j]);
getpre(pre[j],val[j]);
radixsort(mxval[j]);
getpre(mxpre[j],mxval[j]);
}
for(int ki:ks[i])
{
tot++;
for(j=1;j<=bn;j++)
{
auto it=lower_bound(val[j].begin(),val[j].end(),ki);
ans[tot]+=(val[j].end()-it)*ki;
if(it!=val[j].begin()) ans[tot]+=pre[j][it-val[j].begin()-1];
it=lower_bound(mxval[j].begin(),mxval[j].end(),ki-mxadd[j]);
ans[tot]+=(mxval[j].end()-it)*ki;
if(it!=mxval[j].begin()) ans[tot]+=(it-mxval[j].begin())*mxadd[j]+mxpre[j][it-mxval[j].begin()-1];
}
}
}
}
int main()
{
ios::sync_with_stdio(false),cin.tie(0);
int T;
cin>>T;
while(T--)
{
int i,ki,tot=0;
cin>>n;
bn=(n-1)/B+1;
for(i=1;i<=n;i++) cin>>a[i];
for(i=1;i<=n;i++)
{
cin>>ki;
ks[i].resize(ki);
for(int &it:ks[i])
{
cin>>it;
ans[++tot]=-(i+it-1);
}
}
vWork();
reverse(a+1,a+n+1);
for(int x=1;x<=n;x++)
t[x]=0;
for(i=1;i<=bn;i++) mx[i]=0,se[i]=0,val[i].clear(),mxval[i].clear(),pos[i].clear();
vWork();
for(int x=1;x<=n;x++)
t[x]=0;
for(i=1;i<=bn;i++) mx[i]=0,se[i]=0,val[i].clear(),mxval[i].clear(),pos[i].clear();
for(i=1;i<=tot;i++) cout<<ans[i]<<'\n';
tot=0;
}
return 0;
}
```

too math, downvoted...

I thought the problem was cool

not enough persistent segment tree...

it is mid night here and i had to do mAtHs

Auto comment: topic has been updated by rui_er (previous revision, new revision, compare).I don't think that this round deserves hate. div2A-D2 were all good. I say that as someone stuck on D2 for almost 2 hours. My final submission was rejected because of strict limits. However, the trick $$$p^2 < n$$$ can compensate for it. I find it fascinating, and I am glad to have seen it. The model code is much simpler than mine.

The downside is that all problems (the ones I attempted) had the same (math) tag. But, it is a common trend nowadays. We cannot do anything about it, so cope.

it had some very nice problems, definitely

Well, here is the main contribution list of the writers.

Please feel free to tell us which is you favourite problem and why, and even the problems you dislike too. Meanwhile, I would also like to thank our hard-working coordinator and testers. They together made it possible for us to hold this round.

If there is a chance, we hope to meet again with better problems.

D1 is not mine, it was accidentally invented by the tester newb_programmer while he was solving D2.

I liked 2F/1D. Too hard for me to solve during round but solution nicely unfolds in several steps

2F/1D is really cool. I would have been annoyed if hint 2 was the intended solution.

Btw typo in editorialIf it is possible for ai=w after k steps, w←w+1; Otherwise, i←i+1. If i gets to n+1 first, it's valid. If w gets to m+1 first, it's invalid.

Here, w←w+1 and i←i+1 are at incorrect positions.

orzorzorz

can anyone explain, why my code is WA on test 7 (problem B)

Inadequate char array length.

Thanks!

but why? its size is n

You need a

`\0`

(which is $$$0$$$ if you convert it to`int`

) to tell the language that a C-type string ends here. So you'll need extra space.ok thanks! i will never use char[], scanf, printf

That's a bit of an overreaction, isn't it?

I only used this for three days

Whoa did not expect the solution to B would be so simple

is it possible to solve 2C/1A using Priority Queue? Not in $$$O(k)$$$. I din't think of binary search and now my Pupil is on the line :sob:

I've tried, but its making solution too much complex bcz of such input: 1 1 1 2 2 2 3 3 3

Here 258928208

can you explain please?

Here curr is storing the minimum value all the elements can reach and rem stores the remaining operations after that minimum value is reached. Every element in priority queue I am comparing with previously reached value and calculating whether it possible to reach next top of priority queue value with remaining k.

Too much time wasted for proving the solution of B

but how did you prove it? that Alice wins if the number of U's have to be odd.

At every turn, the parity of the total number of 'U' cards flips. This is because once we flip a 'U' card, the number of 'U' cards decreases by 1. Now since it has two neighbours , it will be either a). DD -> UU b). UU -> DD c). UD -> DU

Either the number of U increases by 2, decreases by 2 or remains same. Since we already discarded a U card, the parity always flips. Also since the maximum number of U can be the current size of the string, we can guarantee the game will end at most after n turns. The winner of the game is the person who gets the string with only 1 U left. So we can conclude that whoever starts with odd number of U's will be the one who will get to the state where there is 1 U left, and hence will win the game.

Thanks a lot for the explanation!

Hey, I understand that the parity of the number of $$$U$$$'s always changes but that does not provide any rigorous proof that if the number of $$$U$$$'s is odd then Alice wins.

I struggled a bit to provide more rigorous proof for this thing.

(Proof by induction does not work since the number of $$$U$$$'s can increase).

I think the

mainobservation is that the number of $$$U$$$'s is bounded by the length of the string $$$n$$$. That means that there is a point in time where the number of $$$U$$$'s always decreases and can't increase any more.If you consider the number of $$$U$$$'s throughout operations it might look like this:

`3 -> 4 -> 5 -> 2 -> 3 -> 4 -> ... -> some bound -> start decreasing till 1`

I would assume that the proof would work for any number of increase/decrease values as long as the parity always flips, there is an upper bound and the game will always end.

The game always ends in at most $$$n$$$ moves and Alice can always make a move since the number of coins facing up is odd, so nonzero, which means that the game must end on Bob's turn.

Nice explanation. I write too complex code for this problem

Yeah,Mathematical problems are difficult to prove.I hate math.

weak pt. I just wrote the limit as the correct limit -1.

then I fst at 1A

me too. I set the limit of binary search to

`1e18`

and it overflowed.Could solve all the way to the problem D2. My math knowledge was useful in both task D1 & D2, however,

mathforces.Editorial is Little bit confusing for me.

Please can anyone verify that following approach was correct?

1)sort array use binary search, to make all all element atleast equal to X

This x will have range from min of array to max+k of array..

2)then ..make arrangement,1->n in this cycle and calculate the answer...using this x

You don't need binary search if you already sorted array.

258906139

I used binary search on k, I sorted array just ,

how will you arrange for this test case 2 3 2 how will you add the extra 2 ?

in div2D1 you do not have to count using math, you can naively run a for loop. The time complexity will be still $$$\mathcal{O}(n+m)$$$ for each test case. This is because, for each value of $$$m$$$ the loop will take $$$\mathcal{O}(\frac{n}{k^2})$$$ time, and $$$\sum_{k=1}^\infty{1/{k^2}}$$$ converges to $$$\pi^2/6$$$.

We can also notice that $$$a$$$ must be a multiple of $$$b$$$ and iterate $$$b$$$ from $$$1,2,..m$$$ and $$$a$$$ from $$$b,2b,...n$$$, then it turns into $$$O(n\log{n})$$$.

Unfortunately couldn't make the same trick work for Div2D2, do you have any idea how?

My solution for D2 uses same trick 258920365. Though it is O(n log^3(n))

Cool that it fits in time

Well, i am sure that it is less than O( nlog^3 n), but i don't know what it is exactly.

Can you elaborate on your approach?

Consider $$$g = gcd(a, b)$$$, than $$$(a + b) | b \cdot g$$$ is equivalent to $$$k \cdot (a + b) = b\cdot g$$$ (for some $$$k$$$). Let $$$a_0 = \frac{a}{g}$$$, $$$b_0 = \frac{b}{g}:$$$ $$$ \newline k \cdot a_0 \cdot g + k \cdot b_0 \cdot g = b_0 \cdot g^2 $$$

Then dividing by $$$b:$$$ $$$ \frac{k \cdot a_0}{b_0} + k = g$$$

Notice that $$$gcd(a_0, b_0) = 1$$$, but $$$\frac{k \cdot a_0}{b_0}$$$ is some integer, thus $$$b_0|k$$$. In other words $$$k = b_0 \cdot c$$$ (for some $$$c$$$).

So we get $$$c \cdot a_0 + c \cdot b_0 = g$$$, it means that $$$c|g$$$. Also $$$g|b$$$. Having $$$c, g, b$$$ we can get $$$a, a_0$$$. So I used that trick to iterate over every triple $$$(c, g, b)$$$(Also you need to check that $$$gcd(a_0, b_0) = 1$$$)

And a bit of optimization: notice that $$$b < \frac{g^2}{c}$$$ (otherwise $$$a$$$ won't be positive).

Thanks. I wasn't thinking about $$$k=b0⋅c$$$. So I didn't solve this problem.

can you explain me why

amust be a multiple ofb?in div2.D1 b.gcd(a,b) divides (a+b) hence b divides (a+b) , => (a+b) = k.b Then a = (k-1).b , this means b divides a

this is so easy man, now I got it.

My O(n log^3 n) solution has passed the Div1B2, can anyone hack it or give a tighter complexity?

258888693

your complexity is $$$O(answer * log(n))$$$ :) max answer is $$$11680996$$$

Thanks.

Is this explaination correct? "For most cases, gcd(u,i-u)=1, so we can assume the enumeration quantities approximate to the answer."

Nice maths training contest

Can anyone give me the reason why this got signed integer overflow 258927405. I have checked the code it and it is correct for other ranges,but fails here.

mid=1e17, and k-=mid*n.

Thanks i got it ,k can get too much negative to handle more subtraction

but how can i fix this?

break when k<0

As far as I'm concerned, E is a bit classical and similar to CF837F.

Thanks anyway to this round that made me GM.

A different solution to div1. A, div2. C using binary search.

The answer is completely binary searchable and there isn't anything wrong with it, but it doesn't match my intuition.

Let binary search on the number of

permutations you create, we can handle the remaining part manually with an O(n), in binary search, let's check if we can have $$$x$$$ full permutations created, to check it we can just calculate the need for that much permutation for each item and see if have more or less if we had less we just add theFull`mid - a[i]`

to the`want`

variable, it's just the matter to check if`want <= k`

.Obviously,

lis the answer, so you know that you can create`n*(l-1) + 1`

full permutations, (for x permutations, we need $$$n+x-1$$$ elements).Now let's handle the remaining value. we manually loop through all the elements and if any element had anything less than l, we just add that to the

`want`

variable, otherwise the number of permutations you can create gets increased by one(reader's exercise: Why? ).the code: 258927507

Had the exact same idea ... dont know where it went wrong ... help

I think the issue with your code is that the initial high value is very high and may lead to overflow (I'm not certain how because I don't use C++ that often). If you change it to 1e13, it works just fine.

It was (mid-a[i]) so it overflowed ... Damn I am stupid

Please tell me how the number of permutations is increased by one if any element is more than k pls. if k=0 and you have 2 3 2 where will you add the extra two

think about how you set up the first permutation if you have some extra from one

can you give a bit more hint about the construction of the answer? I am still not able to fully understand how it will always increase the number permutations by one

Oh I just noticed I miss typed in the original comment

By k I meant L sorry

2 3 1 2

OMG D2...

I got:

`a = dp >= (p + q)p`

`b = dq >= (p + q)q`

So, I decided to sum up them and get:

`a + b >= (p + q)^2`

Nice contest though

Loved the round, thank you! The best moment of the contest for me was when I started coding Persistent Segment Tree in Div.1 D (it wouldn't have worked probably) and realized that there's a cute solution with much less code! Though E1 looks like a pretty standart problem to be honest...

B2

https://qr.ae/pswyHC

For 1972C - Permutation Counting, I first misread the problem as to find the maximum possible answer for any $$$m \le n$$$, where the subarrays form permutations of $$$[1,m]$$$. 258892957

I'm not sure if that was the right approach for the misread verison, but after realizing my mistake, simply disabling the outer loop to handle only $$$m=n$$$ case made it AC. 258896138

Would've been great for me if the problem was actually about the misread one :)

Very good math problems and fast editorial. Thanks!

I am a newbie and I intend to reach to pupil quickly.What resources should I follow to reach it and I really feel annoyed with questions like B that was asked today.I tend to spend a lot of time on such problems and many a times contest gets ended and I am unable to come up with solution.Please guide me

Take a look at the

codeforces catalog. There people much smarter than me have posted the exact thing you are looking for.From resources to guides on how to improve and tutorials on specific topics and even stuff on mindset and attitude.

https://mirror.codeforces.com/catalog

I like Div1E, thanks, though I couldn't make it on time...(QAQ) Here is my solution.

We can rephrase the problem as a random walk:

If we had infinite number of steps instead of $$$n$$$, the problem would be classical. Let $$$f_i$$$ be the probability that, starting from $$$i$$$, it is eventually absorbed at $$$-1$$$ after infinite number of steps. We have $$$f_{b_0} = z_{-1} + \sum_{0\le i\le m-1} z_i f_i$$$. It is known that $$$f_i = \dfrac{(m-1)^{m-i}-1}{(m-1)^{m+1}-1}$$$ for $$$m \ne 2$$$ and $$$f_i = \dfrac{m-i}{m+1}$$$ for $$$m = 2$$$. (Caveat: What if $$$998244353$$$ divides the denominator? It turns out no such case exists within the constraints, luckily.)

To compute $$$z_i$$$ for $$$0 \le i \le m-1$$$, we can reduce it to certain path counting and use reflection method as in the editorial. It takes $$$O(n/m)$$$ time for each $$$i$$$, thus $$$O(n)$$$ overall. (Caveat: The sum of $$$m$$$ is not bounded! We should do this only for $$$b_0-n \le i \le b_0+n$$$.)

Lol, I thought about the model solution to E1, but also thought "Huh? quite weird $$$O(min(nm, \frac{n^2}{m})$$$ tradeoff solution with a ton of detailed math? I'm not coding that for 1750 points! There must be some much slicker way to do this!" and that turned out to be exactly the intended solution xD... But I managed to get D instead and I wouldn't be able to do both anyway

I wish I could become as good as you to even come up with that complexity of the problem.

Could anyone explain the output of this test case for the div2C problem?

10 9

1 9 1 2 1 5 7 5 3 3

The answer is 27 but I am only getting 26 as max.

[6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10]

Here is an optimal solution for that test case

then you can arrange them as follows and get 27 as the maximum score:

Can you please explain why this case has answer 3? I can only simulate the answer is maximum 2, no matter how much I try. Thank you.

1

6 0

1 2 1 2 1 1

You can arrange them like this:

then you get

`[1, 6], [2, 7], [3, 8]`

as valid subarraysThank you so much, brother.

I have made detailed video explanations for C and D1 Question Here are the links for anyone who wishes to watch

C : https://www.youtube.com/watch?v=ZP4HPYTWtZQ D1 : https://www.youtube.com/watch?v=cSKooXv7FKA

Why didn't you mention in the comment that it's not in English? To gain some unintended view?

Will make english versions as well next time Sorry for the incovininence

Why did you write

`I have made detailed video explanations for C and D1 Question Here are the links for anyone who wishes to watch`

instead of`मैंने सी और डी1 प्रश्न के लिए विस्तृत वीडियो स्पष्टीकरण बनाया है, जो कोई भी देखना चाहता है उसके लिए यहां लिंक हैं`

?What drove your decision to advertise the videos in english? =)

can anyone help me finding what's wrong with this approach in 2/C

Inside the binary search, you can break when

`tmp < 0`

(clearly). Turns out, you "must" break otherwise some negative integer overflow will occur and unintentionally make tmp positive even though it should DEFINITELY be negative once`tmp < 0`

occurred. I got AC on your submission with this change. 258943287yeah i just found the problem with my code

NVM it was actually

signed integer overflow!:(We can also solve Div1C in $$$O(n \log^{2} n )$$$ by maintaining the generating function $$$g_{i}(x)$$$ for every position $$$1 \leq i \leq n$$$, where $$$[x^{n}]g_{i}(x)$$$ denotes the value of $$$f^{n}(a)[i]$$$ where $$$a$$$ is the array we are trying to find and $$$f$$$ is the "fenwick function" from the statement. We maintain it by processing $$$i$$$'s in the ascending order of their lowest bit and summing up the generating functions. The result can be extracted as $$$a[i] = [x^{k}]g_{i}(x)$$$.

I enjoyed div1C a lot, thank you! I came up with a solution different than the one mentioned in the editorial (using matrices).

SolutionSuppose we are currently at index $$$i$$$ and we know its value, we have to calculate its contribution to other values that follow it (there are at most $$$\log_2(n)$$$ values). Let's arrange values from smallest index to largest index, assuming they're $$$a, b, c, d, $$$($$$a$$$ $$$is$$$ $$$val_i), \cdots$$$, then we will do the following $$$k$$$ times:

and so on. As we see, it's easy to calculate using matrices with $O$( $$$log(n)^3$$$ $$$log(k))$$$ time.

So the total complexity is $$$O(nlog(n)$$$ $$$+$$$ $$$log(n)^3 log(k))$$$ time.

And here is my submission : 258943002

If i understood correctly, your solution is actually the same as the editorial solution, because if im not mistaken, the part that you calculate with matrices is exactly the same binomial coefficient mentioned in the editorial(if we solve the equations that you mentioned, to get an explicit formula).

Yeah, that's right. But the good thing is that I don't have to solve any equation. Just let matrices do it!

Yup, I agree.

I still need to practice,my math is very poor...

How to find more problems like B which require you to make a very minute observation like it was parity of number of 'U's removed here for every possible type of move, at times it is first and last element....

Is there any way of improving in such problems which don't follow of flow of logic and just break as soon as we make an observation ... ?

I was able to solve D1 and C both but couldn't come up with B and it happens quite a lot ..

A brutal solution for 1C: Let $$$T$$$ be the linear operator that does the "Fenwick tree" transform, one can verify that $$$(T-I)^{k}=0$$$ for $$$k = \log n$$$. So to solve the equation $$$T^m v = a$$$, just compute the coefficient of $$$x^{-m} \bmod (x-1) = \sum_{i<k} c_i x^i$$$, thus we have $$$v = \sum_{i<k} c_i T^i a$$$. Since $$$T$$$ can be applied to a vector in linear time, this has time complexity $$$O(n\log n)$$$.

I'm eager to grasp your solution; could you recommend any prerequisite resources to aid my understanding? Thanks in advance.

Solution này dùng ma trận của ánh xạ tuyến tính á. Nếu bạn học Đại Số Tuyến Tính ở chương trình ĐH rồi thì sẽ biết cách dùng này.

cam on bro

Just a minor typo, the mod should be $$$(x-1)^k$$$, not $$$(x -1)$$$, right?

Yes, thanks for pointing that out

I have yet to understand why the linear operator $$$T$$$, which is a $$$n\times n$$$ matrix, can be applied to a vector in linear time? That is, when we multiply a $$$n\times n$$$ matrix with $$$n\times 1$$$ matrix, the time complexity should be $$$O(n^2)$$$, right? Or have I left out something important?

I mean for the specific $$$T$$$ in this problem, it holds

Thank you very much. This problem is hard, even when you have a good understanding of Linear Algebra and found that $$$(T-I)^n = 0$$$ due to the Cayley-Hamilton theorem, that will just not enough, let alone proving that $$$T-I$$$ is nilpotent. You need to find the nilpotent degree of $$$T-I$$$, that is the smallest value of k such that $$$(T-I)^k = 0$$$ and come up with $$$k \leq \log n$$$ is not that easy at all. Also an inexperienced coder like me can stumbled on my dumb question above :)))

After reading your solution, I had difficulty in computing coefficient $$$c_i$$$ so I have changed my approach to the following:

Instead of computing $$$c_i$$$, I used Taylor Series here:

&\equiv 1 + -\frac{m}{1!}(x-1) + \ldots +(-1)^{k-1}\frac{m(m+1)\ldots(m+k-2)}{(k-1)!}(x-1)^{k-1} \mod (x-1)^k \end{aligned} $$$

So we have

We can set $$$k=20$$$ because $$$\log n \leq 20$$$. But if $$$m$$$ is large, then it's likely that we will get integer overflow even when using long long because the last coefficient can be up to $$$\displaystyle\frac{(10^9)^{20}}{20!}$$$ . So we have to compute these coefficient according to their modulo. That triggers another problem, we have to compute the inverse of $$$i$$$ modulo $$$998244353$$$ for all $$$i=\overline{1,19}$$$, since $$$998244353$$$ is a prime number, we can write a program to get all of them easily, that is

And I got accepted. Here is my solution

259730338

TOOOOOO much math... But the problem attached seem good imo Ive also stuck at D1/D2 almost 2h, Im not good at algebra... Each problem was so coooool but the set was tooooooo biased...

I managed to do both D1 and D2 from first using brute force on a smaller number and then finding a pattern to optimize it.

Yes I've tried to do as similar method. Maybe I almost reached to answer but failed to implement it due to limited time :( my skill issue lmao

what is | in D1 and D2 stand for?

It seems that there are many alternative solutions to E2. Looking through the submissions, here are some that I don't understand (and they look quite different from the editorial):

258928492 258921598 258925888 258917151

jiangly maroonrk maspy potato167 Would any of you mind explaining how your solution works? :)

Let $$$path_{[0,M)}(b,i)$$$ denote the number of Dyck path from $$$(0,b)$$$ to $$$(i,0)$$$, which lies within the range $$$0 \leq y < M$$$. Then, we need to compute $$$\sum_i f[i] \cdot path_{[0,M)}(b,i)$$$ for some coefficients $$$f[i]$$$.

By the reflection principle, $$$path_{[0,M)}(b,i)$$$ can be represented as $$$\sum_{b'} (\pm 1) \cdot path(b',i)$$$, where $$$path(b',i)$$$ has no restriction to the range of $$$y$$$.

Therefore, we need to compute $$$\sum_b (0 \text{ or } \pm 1) \cdot \sum_i f_i \cdot path(b,i)$$$. So, the problem is reduced to computing $$$\sum_i f_i \cdot path(b,i)$$$ for each $$$b$$$. In other words, computing the composition $$$\sum f_i(x+x^{-1})^i = f(x+x^{-1})$$$.

In this case, we can show that $$$f_i$$$ forms a geometric sequence (from $$$b_0$$$ to $$$N-1$$$), and $$$f(x)$$$ takes the form of $$$(\text{a sparse polynomial of degree N+1}) / 1-rx^2$$$.

Therefore, $$$x^{N-1}f(x+x^{-1})$$$ is a polynomial of degree $$$2N+2$$$ divided by a polynomial of degree $$$4$$$, and this can be computed in $$$O(N)$$$ time.

By the way, the composition $$$f(x+x^{-1})$$$ can be computed in $$$O(N\log N)$$$ time (https://mirror.codeforces.com/blog/entry/126124). I studied it today.

My solution is almost the same as maspy's solution. The only difference is the way to compute $$$g_b=\sum_{i} f_i \cdot path(b,i)$$$. I didn't use polynomial things and directly calculated the recurrence relations for $$$g_b$$$ by considering paths.

Would you mind elaborating?

Let $$$g_b=\sum_{0 \leq x, 0 \leq y, x+y \leq K, x-y=b} w^{x+y} {x+y \choose x}$$$. Then consider $$$g_b-w(g_{b+1}+g_{b-1})$$$. It can be calculated in $$$O(1)$$$ time.

Can anyone explain more clearly the solution of problem E div1? I do not really understand.

Update rank too late !

The solution of 2D2/1B2 is pretty amazing: I couldn't believe that we can delete

qof $$$(p+q)\mid qd$$$ in such an easy proof!All in all, I think this contest is better than my last contest Educational Codeforces Round 141 (Rated for Div. 2). Thinking deeply should be the biggest feature of codeforces, that's why I prefer this round. Though in

mathforces, a high quality contest is needy.Can you please explain how we get p<d?

Thank you!

Can you please explain the deletion part?

k=gcd(a,b) p=a/k,q=b/k -> gcd(p,q)=1

gcd(p+q,q)=gcd((p+q)-q,q)=gcd(p,q)=1

so q is useless in this fraction, and you can delete it.

Should it be "If it is possible for $$$a_i = w$$$ after $$$k$$$ steps, $$$i \gets i + 1$$$.; Otherwise, $$$w \gets w + 1$$$"?

I think so

C is one of the best and most satisfying problems I've ever seen, and D1 and D2 were also really nice as well. Thanks to authors for making a great contest!

C is good but C's pretest is one of the worst pretest ever.

can someone tell the time complexity of the code used for calculating if a pair is coprime in the solution of div-2 d2.

Seems to be $$$O(\frac{\sqrt{n} }{2} \cdot \frac{\sqrt{m} }{2} +\frac{\sqrt{n} }{3} \cdot \frac{\sqrt{m} }{3}+\frac{\sqrt{n} }{4} \cdot \frac{\sqrt{m} }{4}+\cdots)=O(\sqrt{nm}\cdot \sum _{i=2}\frac{1}{i^2})=O(\sqrt{nm}\cdot (\frac{\pi ^2}{6} -1))= O(\sqrt{nm} ) $$$

Anyway, after contest I just tried add an extra judgement of

`if (gcd(a, b) == 1)`

, and it didn't raise the time complexity too much and got AC as well.I don't understand the DIV 2 C tutorial at all. Is it just me?

I cannot understand why so many people upvoted the annoucement and editorial.

Can someone please explain div2 C. I was able to figure out that we need to binary search for the number of full segments of [1..n] and answer would be at-least n*(full-1) + 1. But how to handle the remaining elements which can be appended to the prefix and suffix of these full segments.

So once you have got the number of full segments let them be=temp. Now, traverse the input array and make all elements (which are less than temp) equal to temp. and reduce k accordingly. Now if still k>0 then traverse array again and increase all elements equal to temp by 1(why 1? because greedily we just want more no. of elements greater than temp). let no. of elements to be added be num=0. And finally traverse array again and if a[i]>temp do num++. seems a bit complicated lol but you can go through the code 258983345

I did brute force , and it worked for D1 , i just checked for each a and b , checked their gcd and put it in the formula , just kept increasing a by b in each iteration , that made the complexity (b)*(a/b) which is a , and it worked, didnt understand the answer given here tho :/ my submission link : https://mirror.codeforces.com/contest/1972/submission/258899009

in problem

2D2/1B2, how $$$gcd(p + q, q) = gcd(p, q) = 1$$$ helps in droping the term $$$q$$$ in $$$(p+q) ∣ (qd)$$$I think this is done to find the max values of p and q

since (p+q) can't divide q it doesn't contribute to find the max of p and q such that they can be used to find the solutions

hence it is removed .

might help

We also know that p≥1,q≥1,so p<d=ap≤np and thus p2<nwhy should p<d?? (div 2 D2)

If p>=d, then p+q>d, then d obviously can't be a multiple of p+q.

thanks!

div 2 D2:

Can anyone explain why in div1C the coefficient of a_u in b_v will be the mentioned value?

I still don't understand why 6th testcase in Div2 C is 32. I currently can only find 31 permutations.

Buy cards so that the array becomes

`7 6 4 7 6 4 4 4 4`

. One of the possible sequences:`1 2 4 5 3 6 7 8 9 (repeat 4 times) 1 2 4 5`

. Number of permutations =`9 * 4 + 4 - 8 = 32`

oh shit! What have I done? I thought Alice would win if both the number of facing up coins and the number of total coins were odd. I submitted the code using this logic and received a penalty. Had I knew the number of facing of coins being odd is enough for Alice to win the game, it would have been accepted.

My solution to 1967A — Permutation Counting using check max_element > (sum_element + k) / (number_of_elements)

https://mirror.codeforces.com/contest/1967/submission/258991320

For div 2 D1 why kd≤⌊n/d⌋+1 ?

In question E div2 Fenwick Tree. The editorial states that it's easy to prove that the coefficient is

Can someone walk me through it because I kept trying on my own and couldn't come up with it...

In the contest I try generating the fenwick tree and try to come up with the coefficient from the result

0 [100 0 0 0 0 0 0 0]

1 [100 100 0 100 0 0 0 100]

2 [100 200 0 300 0 0 0 400]

3 [100 300 0 600 0 0 0 1000]

4 [100 400 0 1000 0 0 0 2000]

you can get a coefficient matrix like:

where row_i is k, and column_j is delta_d

consider a_1 in b_1 to b_8

let c_u_k denote the coefficient of a_1 in b_u in f^k(a), then c_8_k = c_1_(k-1)+c_2_(k-1)+c_4_(k-1)+c_8_(k-1), note that c_4_k = c_1_(k-1)+c_2_(k-1)+c_4_(k-1), so we have c_8_k = c_4_k + c_8_(k-1), which is a classic formulate for combination number.

I understand what you're saying and I actually came up with something similar:

for i % 2 == 0: b(k)[i] = a[i] + b(k)[i/2] + b(k-1)[i]

for i % 2 == 1: b(k)[i] = a[i]

but I guess I'm a bit weak in combinatorics as I still don't know how to draw the general rule. I think I have to work on that.

But thanks a lot for explaining it <3

don't take all elements at once， take 8 for example， a_4 a_6 a_7 in b_8 are the same（ dep=1 ）， and then a2 a3 a5（ dep=2， and a2 a3 in b4 is dep=1， a5 in b6 is dep=1） ， then a1 in b8（ dep=3， a1 in b2 is dep=1， a1 in a4 is dep=2）， the rule is about dep and k

Oooooh, I totally get it now. Thanks a lot <3

Dealing with the fenwick tree as an actual tree that's a first for me, nice problem indeed.

Can someone explain the problem I have?

I use Code::Blocks for competitive programming and yesterday I submitted the code (258902555) with error 'out of bounds', but in Code::Blocks it worked correctly unlike system showed that there is the wrong answer on one of the tests of first pretest, so I was confused and decided to rewrite code on Python.

Why didn't it show that there is the error and, moreover, worked without error and I got different answers in Code::Blocks and testing system?

In the tutorial for D2 :-

Can anyone please explain how we got (p+q) | d from (p+q) | qd?

My understanding:

b * gcd(a, b) = (a + b) * k

From here we can say that a needs to be a multiple of b hence a = b*x

So, b * gcd(a, b) = (bx + b) * k

Hence as per the editorial p = x and q = 1 but then why aren't we considering q = 1 in (p+q) | d which will eventually look like (p + 1) | d?

p and q are co-prime if p+q|qd if there is a common divisor between p+q and q lets call it s so now we have s*x=p+q s*y=q; subtracting them we get s*(x-y)=p which means that s divides p, but s already divides q hence it is a contradiction to the fact that p and q are co prime hence there is no such divisor hence p+q|qd----> p+q|q

can someone explain me how stars and bars is used in problem Div1 C. .

"We know q=1 because gcd(p,q)=1." how can we say that doesnt this mean p and q both are coprimes?

p=(kd−1)q implies that gcd(p, q) >= q (maybe even gcd(p, q) = q), then q = gcd(p, q) = 1.

If q = 2, then p = (kd — 1) * 2, so gcd(p, q) = 2.

For 1967C — Fenwick Tree, I have a function that computes the inverse factorials under a modulo MOD, but I'm getting different values than the solution above?

It becomes wrong after inv_fact[3] = 166374059, compared to their calculate where inv[3] = 332748118. I've used the function above for many problems, couldn't imagine it is wrong.

I see my mistake, you are not dividing by the inv_fact[d] everytime, and the inv[d] is actually not the inverse factorial, it is 1 / d.

How to solve E1 using NTT?

For the $$$dp_{i,j}$$$ as mentioned, could understand that $$$dp_{i-1} \times (x^{-1}+(m-1)x^1) = dp_i$$$, but the transition has limit at

`-1`

and`m`

.Does it use "circular convolution" trick (on CDQ processing) just like this AT problem?

You may solve it using a D&C. When you need $$$a$$$ transforms, the $$$[a,m-a]$$$ part is easy by using a direct convolution, so you need just deal with the first $$$a$$$ terms and the last $$$a$$$ terms, and that's small.

Can you elaborate on how to deal with first/last $$$a$$$ terms? I can't figure out how to deal with it in $$$\tilde O(a)$$$ time.

finally figure out the details.

supposed that it only has the lower limit at

`Y=-1`

, and do not has the upper limit, we can use the following D&C to solve it:* supposed that we want to calculate $$$dp_R$$$, where $$$dp_L$$$ has been finished (which was used as the input for $$$dp_R$$$);

* let $$$len=R-L$$$, if the size of input is not greater than $$$len$$$, we just recurse it: by calculating the middle column first, and then use the middle row to calculate the right part...

* otherwise $$$[dp_{L,len},dp_{L,sz-1}]$$$ can do a direct convolution (add the result to the last column), and the remain part (with length not gerater than $$$len$$$) just do sth as the above.

when it has lower limit

`Y=-1`

and upper limit`Y=M`

, we can do some similarity:* when the input size not greater than $$$2len$$$, we just rucurse it;

* otherwise use $$$[dp_{L,len},dp_{L,sz-1-len}]$$$ to do a direct convolution;

* and the two remain parts are exactly on the situation: it has only one lower/upper limit.

here is my implemention.

Cool, I thought that algorithm can only handle the case where lower bound exist. Thank you!

I did a bit of analysis on it, and it seems such algorithm can handle counting non-decreasing path with limited degree of transition polynomial $$$f$$$ (let say its degree is $$$|f|$$$) with limited adjacent difference of lower bound/upper bound (let's say $$$L_n, U_n$$$ is respective lower/upper limit sequence and $$$max(max_{i = 2}^{n}(L_i - L_{i - 1}), max_{i = 2}^{n}(U_i - U_{i - 1})) = D$$$).

Since after we transit the "middle part" at some $$$(l, r)$$$ using direct convolution, the following called would only need to handle one bound, and this part would cost $$$O((n + m)\lg (n + m))$$$

Then for the following part handle only one bound (let's say lower part, the analysis for upper is similar), when recurse from $$$(l, r)$$$ to $$$(mid, r)$$$ (for $$$(l, mid)$$$ its degree would be no more than the $$$(mid, r)$$$ part), it would have to handle a polynomial of degree $$$(L_r - L_l) + |f|(mid - l) \leq (r - l)(D + |f|)$$$ and do at most one convolution on it, so this part would cost $$$O(n(D + |f|)\lg (n(D + |f|))\lg n)$$$

So the complexity of this algorithm is $$$O(n(D + |f|)\lg (n(D + |f|))\lg n + (n + m)\lg (n + m))$$$. And it would be fast enough as long as $$$m$$$ and $$$n(D + |f|)$$$ are small enough.

Yo guys, I am currently reading the editorial for D1. I understood everything up until this last section. I understand why we are enumerating d from 1 to m (since d is essentially just b since q is 1). I am not sure how p becomes floor(n / d). Since p * d = a, don't we have to also enumerate all a's up to n? Why are we just greedily using n (the bound for a) over all d's?

$$$a=pd=(kd-1)d$$$, so we enumerate $$$a$$$ by enumerating $$$k$$$. $$$k$$$ can be any positive integer from 1 to $$$\lfloor(\lfloor n/d\rfloor+1)/d \rfloor$$$ (unless $$$d=1$$$, when 1 must be omitted), so we add $$$\lfloor(\lfloor n/d\rfloor+1)/d \rfloor$$$ to the count.

In D2, Why does $$$gcd(p, q) = 1$$$ implies $$$q = 1$$$?

It doesn't in D2. In D1, it's because $$$q|p$$$.

For E1/E2, if you work in the ring of power series modulo $$$x^{2(m+1)}-1$$$, then the number of failing $$$a$$$-sequences is the constant coefficient in

Using this together with fast exponentiation-type algorithms and FFT convolution, you can get an $$$O(m \log m \log n)$$$ method, although it doesn't seem to be faster than the editorial method unless $$$n/m$$$ is quite large.

I could not comprehend the solution for Div1C (Fenwick Tree). Could someone please elaborate on the solution, or suggest any resources that could aid in my understanding?

P.S.: I am familiar with the basic Fenwick Tree

To calculate $$$\textrm{coefficient}\cdot a_u$$$ in $$$b_v$$$, think about how many ways $$$a_u$$$ can reach $$$b_v$$$. Say, $$$a_5$$$ can reach into $$$b_{16}$$$ via $$$b_6, b_8, b_{16}$$$. Now, the number $$$3$$$ came from the depth difference of $$$5$$$ and $$$16$$$ being $$$3$$$. And you want to cover this $$$3$$$ distance using $$$k$$$ steps, where in each step you either propagate forward or stay in place.

If you don't know stars and bars method, a short explanation is that $$$n$$$ things can be divided into $$$m$$$ portions (where a portion is allowed to have $$$0$$$ things in it) by inserting $$$m-1$$$ separators, thus the number of ways become $$$\binom{n+m-1}{n}$$$ (treat the separators also as "things", then make any valid arrangement).

Now, if you piece it together, using the example, $$$a_5$$$ can reach $$$a_{16}$$$ by first staying in $$$5^{th}$$$ place or propagating, in fact even staying in place for $$$k-2$$$ steps and propagating twice, or any other way. Every different choice represents a coefficient of $$$a_5$$$in $$$b_{16}$$$. This number of ways should be $$$\binom{3+k-1}{3}$$$, using $$$k-1$$$ separators between $$$3$$$ propagation actions. Replace $$$3$$$ with $$$\Delta d$$$ and you got the formula.

In div2/ D2, How this formula enumerate each (p,q), in the solution of editorial? min(n/(a+b)/a,m/(a+b)/b);

From the editorial you see that p + q | d, so d/(p+q) must be an integer. Furthermore, p and q must be coprime and less than sqrt(n) and sqrt(m) respectively.

For each valid couple (p, q) how many values of d satisfy that d/(p+q) must be an integer? d = a / p <= n / p and d = b / q <= m / q. Therefore d can assume values from 1 to min(n/p, m/q). How many of these are multiple of p + q? floor(min(n/p,m/q)/(p+q)).

In 1967B1 — Reverse Card (Easy Version) Tutorial, I got till the point where count of values of a is floor(n/d), but after that why the answer is floor((floor(n/d)+1))/d). Can anyone explain?

The condition $$$b⋅gcd(a,b)│a+b$$$ is equivalent to (p+1)=kd so (p + 1)/d must be an integer. Since q=1, d=b=gcd(a, b), therefore d has the same range as b: from 1 to m.

For each possible value of d, we can count how many couples (a, b) satisfy the condition (p + 1)/d=k. Since p=a/d<=floor(n/d), p + 1 <= floor(n/d)+1. So (p + 1)/d can assume values from 1 to (floor(n/d)+1)/d. How many are them? (floor(n/d)+1)/d.

We need to subtract one at the end because for p = 0 and d = 1 we get (p + 1)/d = 1 but that's not ok because p must be greater than 0.

div1C can be also considered in a linear algrebra perspective. Notice each time of BIT is simply a linear operation. And it is so spartial that we can apply some optimization on it. For each index $$$x$$$, thinking the chain $$$b_1 = x, b_2 = b_1+(b_1 \text{&} -b_1), b_3 = b_2 + (b_2 \text{&} -b_2) ...$$$ (which length at most $$$O(\log n)$$$) chain.

Define

If contribution of index $$$x$$$ ($$$a_x$$$) for location at each $$$b_j$$$ ($$$a_{b_j}$$$) after $$$T$$$ round is $$$v^T = (v^T_1, v^T_2, ..., v^T_{\log n})$$$, then after one more round, it is: $$$v^{T+1} = A v^T$$$. That is, the contribution of $$$a_x$$$ to each $$$b_j$$$ after $$$K$$$ op is $$$A^K \begin{pmatrix}1 \ 0 \ 0 \ ... \ 0\end{pmatrix}$$$.

$$$A^K$$$ can be simply calculated with pre compute $$$A^1, A^2, A^4, ..., A^{2^{30}}$$$ in $$$O((\log m)(\log n)^3)$$$, and evaluate in $$$O(\log K (\log n)^2)$$$ (segmv is $$$O((\log n)^2)$$$).

Then for each index $$$b_j$$$ except first one, just have $$$b_j \gets b_j - v_j^K \times a_x$$$ as the contribution, at the end it will ends with desired answer, which is $$$O(n \log n)$$$.

Total complexity is $$$O((\log m) (\log n)^3 + n\log n)$$$

Hi , I had some trouble understanding your solution , more precisely : how did you manage to make the sizes of the A and v logn , I think thats something you did with the chain optimization you mentioned but I couldnt understand that either :/ Can you explain that more elaborately.

Thanks in advance

Because for each index $$$x$$$, it only has at most $$$n$$$ parents in a BIT. Therefore, we only need to care about those $$$\log n$$$ ancestors instead of all of the nodes.

We are calculating the contribution of $$$a_x$$$ (before all operations) to the $$$b_j, 1 \leq j \leq n$$$ (after all operations) separately. Due to the good property of a tree, if $$$y$$$ is ancestor of $$$x$$$, $$$z$$$ is ancestor of $$$y$$$, then $$$z$$$ is an ancestor of $$$x$$$. So we could imaging the contribution of $$$a_x$$$ kind of only "floating up" due to the operations. So the only ones contributed by $$$a_x$$$ is all its ancestors, where there is $$$\log n$$$ of them.

I'm not sure to understand the tutorial for 1967A — Permutation Counting because of this edge case:

Assume you had:

n = 6 and k = 0

with cards: 1 1 1 2 2 2 3 3 4 4 4 5 5 6 6 6 (that is 3 3 2 3 2 3)

then the array we will make is: 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 ; plus one 4 that we have to use. Without the 4, there are

10permutations to this array.However, I don't know where to place the last 4 to generate the answer

11, the answer the algorithm would give this problem: cnt = 2 (for 3 and 5) lst = 2 (because 3 and 5 both have 2) ans = n*lst - cnt + 1 = 6*2 - 2 + 1 =11Why is the answer 11?

Because you can place the cards as 1 2 4 6 3 5 1 2 4 6 3 5 1 2 4 6.

I spent 3 Hrs figuring this sh*t out. Thank you, your comment saved me.

1967B1 — Reverse Card (Easy Version)

What if I set

`p = 1`

, How should I proceed from`qd | (p + q)`

?I can't come up with a useful bound for k if I set p = 1. Could you please help me how we can solve this considering p = 1? My understanding was that because gcd(p , q) = 1, so we can set either q = 1 or p = 1.

For problem Long Way to be Non-decreasing there is a simpler way to group each element without using DSU

https://mirror.codeforces.com/contest/1972/submission/[submission:264738791]

can anyone tell me why this solution is giving some wrong answers

.

.