On behalf of the organizing team, we extend our deepest gratitude to everyone who participated in CodeRed Prelims 2026. Witnessing your dedication and fierce competitive spirit on the leaderboard validated the immense effort our team invested behind the scenes.
Our problem-setters spent countless hours crafting and balancing these challenges. We sincerely hope the problemset provided a rewarding and engaging experience for the IIIT Allahabad community and our global participants. Below, you will find the official hints, detailed editorials, and author solutions. Happy upsolving!
A: Ashutosh's Triad
Author: sigma__1
The phrase "minimize the maximum salary" is a classic indicator that we should use Binary Search on the answer.
Since $$$X$$$ is square-free, the maximum number of distinct prime factors $$$m$$$ is very small ($$$m \le 15$$$). Think about representing each wizard's magic number as a bitmask of these prime factors.
To efficiently count the valid triplets for a particular maximum salary, you can use Sum Over Subsets (SOS) DP along with the Principle of Inclusion-Exclusion (PIE).
We need to select exactly 3 wizards such that the LCM of their magic numbers is a multiple of $$$X$$$. Mathematically, $$$\text{LCM}(A_{w1}, A_{w2}, A_{w3}) \equiv 0 \pmod X$$$. Since we need to minimize the maximum salary, we binary search the target maximum salary in the range $$$[1, 10^9]$$$. For a candidate salary mid, we only consider wizards where $$$S_i \le \text{mid}$$$.
First, we factorize $$$X$$$. Since $$$X$$$ is square-free and up to $$$10^{18}$$$, it has at most $$$15$$$ distinct prime factors. Trial division up to $$$\sqrt{X} \approx 10^9$$$ is too slow, so we only trial-divide up to $$$10^6$$$. Let the remaining unfactored part be rem; it is either $$$1$$$, a prime, or a product of two distinct primes, each larger than $$$10^6$$$. If $$$\text{rem} \gt 1$$$, we iterate through the array and compute $$$g = \gcd(A_i, \text{rem})$$$: any $$$g$$$ strictly between $$$1$$$ and $$$\text{rem}$$$ splits rem into two distinct prime factors. If no $$$A_i$$$ splits it, we may safely treat rem as a single factor, since no magic number distinguishes its two primes.
Once we have the $$$m$$$ distinct prime factors of $$$X$$$, we represent each wizard's magic number $$$A_i$$$ as a bitmask of length $$$m$$$. The $$$j$$$-th bit is set to $$$1$$$ if $$$A_i$$$ is divisible by the $$$j$$$-th prime factor of $$$X$$$. The divisibility condition translates to the bitwise OR of their masks covering all bits of $$$X$$$.
To check if a valid triplet exists among the permitted wizards for our binary search mid, counting naively takes $$$O(n^3)$$$. Instead, we use Sum Over Subsets (SOS) DP and the Principle of Inclusion-Exclusion (PIE). Let FullMask $$$= (1 \ll m) - 1$$$. We build a frequency array $$$F$$$ denoting how many permitted wizards have exactly that mask. Using SOS DP, we transform this array so that $$$F[\text{mask}]$$$ represents the number of wizards whose mask is a subset of mask.
Using PIE, the total number of valid triplets whose combined mask equals FullMask is: $$$\text{Triplets} = \sum_{\text{mask} \subseteq \text{FullMask}} (-1)^{\text{popcount}(\text{FullMask} \oplus \text{mask})} \binom{F[\text{mask}]}{3}$$$
If $$$\text{Triplets} \gt 0$$$, it means at least one valid combination of three wizards exists, so we update high = mid - 1. Otherwise, we update low = mid + 1.
Time Complexity: $$$O(\min(\sqrt{X}, 10^6) + n \log X + \log(\max S) \cdot (n + m 2^m))$$$ — the first term is the factorization, the second the gcd computations, and the third the binary search with one SOS + PIE check per step.
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

struct Wizard {
    int s;
    ll a;
    int mask;
};

ll gcd(ll a, ll b) {
    return b == 0 ? a : gcd(b, a % b);
}

// C(n, 3): number of unordered triples
ll nCr3(ll n) {
    return n * (n - 1) * (n - 2) / 6;
}

// Does a valid triple exist among wizards with salary <= mid_salary?
bool check(ll mid_salary, int m, const vector<Wizard>& wizards) {
    int full_mask = (1 << m) - 1;
    vector<ll> C(1 << m, 0);
    for (const auto& w : wizards) {
        if (w.s <= mid_salary) C[w.mask]++;
    }
    // SOS DP: C[mask] becomes the count of wizards whose mask is a subset of mask
    for (int i = 0; i < m; i++) {
        for (int mask = 0; mask <= full_mask; mask++) {
            if (mask & (1 << i)) C[mask] += C[mask ^ (1 << i)];
        }
    }
    // PIE over the missing bits
    ll valid_triplets = 0;
    for (int mask = 0; mask <= full_mask; mask++) {
        int missing_bits = __builtin_popcount(full_mask ^ mask);
        if (missing_bits % 2 == 1) valid_triplets -= nCr3(C[mask]);
        else valid_triplets += nCr3(C[mask]);
    }
    return valid_triplets > 0;
}

void solve() {
    int n;
    ll x;
    if (!(cin >> n >> x)) return;
    vector<Wizard> wizards(n);
    for (int i = 0; i < n; i++) {
        cin >> wizards[i].s >> wizards[i].a;
        wizards[i].a = gcd(wizards[i].a, x);  // only factors shared with X matter
        wizards[i].mask = 0;
    }
    // Factorize X: trial division up to 10^6, then split the remainder via gcds
    vector<ll> factors;
    ll rem = x;
    for (ll i = 2; i * i <= rem && i <= 1000000; i++) {
        if (rem % i == 0) {
            factors.push_back(i);
            rem /= i;
        }
    }
    if (rem > 1) {
        bool split = false;
        for (int i = 0; i < n; i++) {
            ll g = gcd(wizards[i].a, rem);
            if (g > 1 && g < rem) {
                factors.push_back(g);
                factors.push_back(rem / g);
                split = true;
                break;
            }
        }
        if (!split) {
            // No wizard separates the two large primes: treat rem as one factor
            factors.push_back(rem);
        }
    }
    int m = factors.size();
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            if (wizards[i].a % factors[j] == 0) {
                wizards[i].mask |= (1 << j);
            }
        }
    }
    // Binary search on the maximum salary
    ll low = 1, high = 1e9, ans = -1;
    while (low <= high) {
        ll mid = low + (high - low) / 2;
        if (check(mid, m, wizards)) {
            ans = mid;
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    cout << ans << "\n";
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int t = 1;
    while (t--) {
        solve();
    }
    return 0;
}
B: Vardaan's Segment Tree Problem
Author: DysfunctionalDegenerate
The grid rotates, but the required $$$H \times W$$$ safe zone remains fixed relative to the screen. Think about what happens to the grid dimensions and cell mappings after different numbers of 90° rotations.
Notice that after 0 or 2 rotations, the grid behaves like the original $$$N \times M$$$ grid. After 1 or 3 rotations, it behaves like an $$$M \times N$$$ grid. You only need to maintain two states to track the grid.
To efficiently count the safe zones, reduce the 2D problem to a 1D problem by fixing a strip of $$$H$$$ rows. Use bitmasks to track columns and calculate the contribution of consecutive zero-runs in $$$O(1)$$$ time.
1. Key Observations: Rotation Trick (Core Geometry)

The grid rotates, but the required rectangle remains $$$H \times W$$$ in the current orientation. An important observation is that:

* After $$$0$$$ or $$$2$$$ rotations, the grid behaves like the original $$$N \times M$$$ grid.
* After $$$1$$$ or $$$3$$$ rotations, the grid behaves like an $$$M \times N$$$ grid.
Thus, we only need to maintain two states:

* $$$st_0$$$: original orientation
* $$$st_1$$$: rotated orientation
The final answer for any query simply depends on the parity of the current total rotations.
2. Coordinate Mapping

When a meteor is toggled in the current orientation, we must map it back to the original grid to update our states. Let $$$(r,c)$$$ be the current position (0-indexed). Based on the number of rotations:

* rot = 0: $$$(x,y) = (r,c)$$$
* rot = 1: $$$(x,y) = (N-1-c, r)$$$
* rot = 2: $$$(x,y) = (N-1-r, M-1-c)$$$
* rot = 3: $$$(x,y) = (c, M-1-r)$$$
Then, we update $$$st_0$$$ at $$$(x,y)$$$ and update $$$st_1$$$ at $$$(y, N-1-x)$$$.
3. Reducing 2D to 1D

Fix a horizontal strip of exactly $$$H$$$ consecutive rows. For each column in this strip, we mark it as $$$1$$$ if any meteor exists in that column within the strip, and $$$0$$$ otherwise. The problem now reduces to: count the number of subarrays of length $$$W$$$ consisting strictly of $$$0$$$s. If a continuous run of zeros has length $$$L$$$, it contributes: $$$\text{gain}(L) = \max(0, L - W + 1)$$$
4. Efficient Updates

Each meteor toggle affects only the strips containing its row. Since a strip is $$$H$$$ rows tall, a single toggle affects at most $$$H$$$ strips.
For each strip, we maintain a bitmask of its columns using 64-bit integers. We can quickly find the nearest $$$1$$$ to the left or right using bit manipulation built-ins like __builtin_clzll and __builtin_ctzll.
Delta Update Mechanism: Let $$$p$$$ be the position of the previous $$$1$$$ and $$$n$$$ be the position of the next $$$1$$$. Before splitting a zero-run by inserting a new meteor at position $$$c$$$: $$$L = n - p - 1$$$
After inserting the meteor, the run splits into two: $$$L_1 = c - p - 1$$$ $$$L_2 = n - c - 1$$$
The change in the number of safe zones (Delta) is: $$$\Delta = \text{gain}(L_1) + \text{gain}(L_2) - \text{gain}(L)$$$
We simply reverse the signs when a meteor disappears (merging two zero-runs back into one).
5. Complexity

Each meteor toggle updates at most $$$H$$$ strips. Using bitwise operations, each update inside a strip takes $$$O(1)$$$ time.

* Time Complexity: $$$O(Q \cdot H)$$$
* Space Complexity: $$$O(N \cdot M)$$$ to store the states and configurations.
#include <bits/stdc++.h>
using namespace std;

static inline int msb32(unsigned x) {
    return 31 - __builtin_clz(x);
}
static inline int msb64(unsigned long long x) {
    return 63 - __builtin_clzll(x);
}
static inline int lsb64(unsigned long long x) {
    return __builtin_ctzll(x);
}

// One orientation of the grid: for every strip of H consecutive rows, keep a
// bitmask of blocked columns and maintain the count of safe zones incrementally.
struct State {
    int R, C, H, W;
    int rows;                          // number of strips = R - H + 1
    vector<int> lo, hi;                // range of strips containing a given row
    vector<int> cnt;                   // meteors per (strip, column)
    vector<unsigned long long> bits;   // column bitmask per strip (64-bit words)
    vector<unsigned short> wmask;      // which words of a strip are non-zero
    vector<int> gain;                  // gain[L] = max(0, L - W + 1)
    long long ans = 0;

    State(int R_, int C_, int H_, int W_) {
        R = R_; C = C_; H = H_; W = W_;
        rows = R - H + 1;
        lo.resize(R);
        hi.resize(R);
        for (int r = 0; r < R; ++r) {
            lo[r] = max(0, r - H + 1);
            hi[r] = min(r, rows - 1);
        }
        cnt.assign(rows * C, 0);
        bits.assign(rows * 8, 0ULL);
        wmask.assign(rows, 0);
        gain.assign(C + 1, 0);
        for (int i = 0; i <= C; i++)
            gain[i] = max(0, i - W + 1);
        ans = 1LL * max(0, rows) * max(0, C - W + 1);
    }

    int prevOne(int s, int c) const {
        const auto &row = bits;
        int w = c >> 6, b = c & 63;
        unsigned long long cur = row[s * 8 + w];
        if (b > 0) {
            unsigned long long m = cur & ((1ULL << b) - 1);
            if (m) return (w << 6) + msb64(m);
        }
        unsigned short lower = wmask[s] & ((1u << w) - 1);
        if (lower) {
            int ww = msb32(lower);
            return (ww << 6) + msb64(row[s * 8 + ww]);
        }
        return -1;
    }

    int nextOne(int s, int c) const {
        const auto &row = bits;
        int w = c >> 6, b = c & 63;
        unsigned long long cur = row[s * 8 + w];
        if (b < 63) {
            unsigned long long m = cur & (~0ULL << (b + 1));
            if (m) return (w << 6) + lsb64(m);
        }
        unsigned short higher = wmask[s] & ~((1u << (w + 1)) - 1);
        if (higher) {
            int ww = __builtin_ctz(higher);
            return (ww << 6) + lsb64(row[s * 8 + ww]);
        }
        return C;
    }

    // Delta update: split a zero-run (add) or merge two runs (remove)
    void flipBit(int s, int c, bool add) {
        int p = prevOne(s, c);
        int n = nextOne(s, c);
        int L = n - p - 1;
        int L1 = c - p - 1;
        int L2 = n - c - 1;
        if (add)
            ans += gain[L1] + gain[L2] - gain[L];
        else
            ans += gain[L] - gain[L1] - gain[L2];
        int w = c >> 6, b = c & 63;
        unsigned long long &val = bits[s * 8 + w];
        if (add) val |= (1ULL << b);
        else val &= ~(1ULL << b);
        if (val) wmask[s] |= (1 << w);
        else wmask[s] &= ~(1 << w);
    }

    void toggle(int r, int c, bool add) {
        for (int s = lo[r]; s <= hi[r]; s++) {
            int &x = cnt[s * C + c];
            int before = x;
            x += add ? 1 : -1;
            if (add && before == 0) flipBit(s, c, true);
            if (!add && before == 1) flipBit(s, c, false);
        }
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    int N, M, H, W, Q;
    cin >> N >> M >> H >> W >> Q;
    State st0(N, M, H, W);   // original orientation
    State st1(M, N, H, W);   // rotated orientation
    vector<int> alive(N * M, 0);
    int rot = 0;
    // Map current-orientation coordinates back to the original grid
    auto toOrig = [&](int r, int c) {
        if (rot == 0) return make_pair(r, c);
        if (rot == 1) return make_pair(N - 1 - c, r);
        if (rot == 2) return make_pair(N - 1 - r, M - 1 - c);
        return make_pair(c, M - 1 - r);
    };
    while (Q--) {
        int t; cin >> t;
        if (t == 1) {
            int r, c; cin >> r >> c;
            r--; c--;
            auto [x, y] = toOrig(r, c);
            int id = x * M + y;
            bool add = !alive[id];
            alive[id] ^= 1;
            st0.toggle(x, y, add);
            st1.toggle(y, N - 1 - x, add);
        } else {
            rot = (rot + 1) & 3;
        }
        cout << ((rot & 1) ? st1.ans : st0.ans) << '\n';
    }
}
C: Pranav's Strict Interviews
Author: goyalnikhil744
A candidate $$$i$$$ passes if placed in a slot $$$t$$$ such that $$$t \le A_i$$$. Since the maximum slot is $$$N-1$$$, the effective passing range is $$$[0, \min(A_i, N-1)]$$$. What happens if you sort the candidates by this upper bound?
By sorting, the valid slot ranges become nested. This allows a dynamic programming approach without tracking exactly which slots are taken. Try to define $$$dp[i][j]$$$ as the number of ways to choose exactly $$$j$$$ candidates from the first $$$i$$$ candidates and assign them to unique passing slots.
The DP gives you the ways to choose at least $$$j$$$ specific candidates to pass. Use the Principle of Inclusion-Exclusion (Binomial Inversion) to find the number of schedules where exactly $$$K$$$ candidates pass.
1. Key Observation

A candidate $$$i$$$ with score $$$A_i$$$ passes if placed in slot $$$t$$$ such that $$$t \le A_i$$$. Since slots are $$$0 \dots N-1$$$, the effective passing range for candidate $$$i$$$ is $$$[0, R_i]$$$, where: $$$R_i = \min(A_i, N - 1)$$$
If we sort candidates such that $$$R_1 \le R_2 \le \dots \le R_N$$$, the passing ranges become nested. This allows us to use dynamic programming without tracking specific slot indices.
2. Dynamic Programming

Let $$$dp[i][j]$$$ be the number of ways to choose exactly $$$j$$$ candidates from the first $$$i$$$ candidates and assign them to unique passing slots. When considering candidate $$$i$$$:

* Case 1 (Fail): Candidate $$$i$$$ is not assigned a passing slot. Ways: $$$dp[i-1][j]$$$.
* Case 2 (Pass): Candidate $$$i$$$ is assigned to one of their $$$R_i + 1$$$ passing slots. Since $$$j-1$$$ candidates were already assigned to passing slots (all of which are $$$\le R_i$$$ due to sorting), there are $$$(R_i + 1) - (j - 1)$$$ available slots remaining.
Transition: $$$dp[i][j] = dp[i-1][j] + dp[i-1][j-1] \cdot \max\left(0, (R_i + 1) - (j - 1)\right)$$$
3. Principle of Inclusion-Exclusion (PIE)

The DP state $$$dp[N][j]$$$ counts ways to pick $$$j$$$ passing candidates. To form a full permutation, the remaining $$$N-j$$$ candidates are placed in the remaining $$$N-j$$$ slots in $$$(N-j)!$$$ ways. Let $$$G(j)$$$ be the number of schedules where at least $$$j$$$ specific candidates pass: $$$G(j) = dp[N][j] \cdot (N-j)!$$$
Using PIE (specifically the Binomial Inversion formula), the number of schedules with exactly $$$K$$$ passes is: $$$\text{Exact}(K) = \sum_{j=K}^{N} (-1)^{j-K} \binom{j}{K} G(j)$$$
4. Complexity

* Time Complexity: $$$O(N^2)$$$ to compute the DP table.
* Space Complexity: $$$O(N)$$$ using a 1D rolling array for the DP states.
#include <bits/stdc++.h>
using namespace std;
const int MOD = 998244353;

long long power(long long base, long long exp) {
    long long res = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp % 2 == 1) res = (res * base) % MOD;
        base = (base * base) % MOD;
        exp /= 2;
    }
    return res;
}

long long modInverse(long long n) { return power(n, MOD - 2); }

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int n, k;
    if (!(cin >> n >> k)) return 0;
    vector<int> r(n);
    for (int i = 0; i < n; i++) {
        int a;
        cin >> a;
        r[i] = min(a, n - 1);
    }
    // CRITICAL: Sorting allows the nested subset logic to work
    sort(r.begin(), r.end());
    // 1D Rolling Array for DP
    vector<long long> dp(n + 1, 0);
    dp[0] = 1;
    for (int i = 1; i <= n; i++) {
        // Iterate backwards to prevent overwriting dp[j-1] before it's used
        for (int j = i; j >= 1; j--) {
            long long options = max(0LL, 1LL * r[i - 1] + 1 - (j - 1));
            dp[j] = (dp[j] + dp[j - 1] * options) % MOD;
        }
    }
    // Precompute Factorials for PIE
    vector<long long> fact(n + 1, 1), invFact(n + 1, 1);
    for (int i = 1; i <= n; i++) fact[i] = (fact[i - 1] * i) % MOD;
    invFact[n] = modInverse(fact[n]);
    for (int i = n - 1; i >= 0; i--) invFact[i] = (invFact[i + 1] * (i + 1)) % MOD;
    auto nCr = [&](int n, int r) {
        if (r < 0 || r > n) return 0LL;
        return fact[n] * invFact[r] % MOD * invFact[n - r] % MOD;
    };
    // Principle of Inclusion-Exclusion
    long long ans = 0;
    for (int j = k; j <= n; j++) {
        long long g_j = (dp[j] * fact[n - j]) % MOD;
        long long term = (nCr(j, k) * g_j) % MOD;
        if ((j - k) % 2 == 1) ans = (ans - term + MOD) % MOD;
        else ans = (ans + term) % MOD;
    }
    cout << ans << "\n";
    return 0;
}
D: Ritesh's Grid Challenge
Author: rn_das_2004
Analyze the maximum number of -1 elements you can place in a single row to keep the sum strictly positive. Then, calculate the minimum number of -1 elements needed on a diagonal to make its sum strictly negative.
Based on the bounds from Hint 1, check the small cases. Why are $$$n=1$$$, $$$2$$$, and $$$4$$$ impossible?
For $$$n \ge 5$$$, a simple greedy construction works. Try placing -1 elements on the entire main diagonal and anti-diagonal, and 1 elements everywhere else. Does this violate the row sum condition?
1. Observation: Bounding the Negative Elements

Let's first analyze the maximum number of -1 elements we can place in any given row. For a row of length $$$n$$$ to have a strictly positive sum, the number of 1 elements must be strictly greater than the number of -1 elements. Let $$$x$$$ be the number of -1 elements in a row. We must have: $$$(n - x) - x \gt 0 \iff x \lt \frac{n}{2}$$$
Thus, the maximum number of -1 elements allowed in any row is $$$\lfloor \frac{n-1}{2} \rfloor$$$.
Now, let's look at the diagonal constraints. Both the main diagonal and the anti-diagonal are of length $$$n$$$ and must have a strictly negative sum. This means each diagonal must contain strictly more -1 elements than 1 elements. Therefore, each diagonal needs at least $$$\lfloor \frac{n}{2} \rfloor + 1$$$ elements set to -1.
2. Case Analysis
Impossible Cases: $$$n \in \{1, 2, 4\}$$$

* $$$n=1$$$: We need the row sum $$$ \gt 0$$$ (so the single element is 1) and the diagonal sum $$$ \lt 0$$$ (so the single element is -1). This is a contradiction.
* $$$n=2$$$: The maximum number of -1 elements allowed per row is $$$\lfloor 1/2 \rfloor = 0$$$. Since we cannot place any negative ones, the diagonals will solely consist of 1s, making a negative sum impossible.
* $$$n=4$$$: The maximum number of -1 elements allowed per row is $$$\lfloor 3/2 \rfloor = 1$$$. Across 4 rows, the grid can contain at most $$$4 \times 1 = 4$$$ negative ones in total. However, each diagonal requires at least $$$\lfloor 4/2 \rfloor + 1 = 3$$$ negative ones. Since $$$n$$$ is even, the two diagonals do not intersect. Thus, we need at least $$$3 + 3 = 6$$$ negative ones in the grid to satisfy the diagonal constraints. Since $$$6 \gt 4$$$, no such grid can exist.
General Case: $$$n \ge 5$$$ For $$$n \ge 5$$$, a simple greedy construction works perfectly. We can populate both the main diagonal and the anti-diagonal entirely with -1 elements, and fill the rest of the grid with 1 elements.
Let's verify that this does not violate the row sum condition. Since we place -1 elements only on the diagonals, each row contains at most 2 negative ones (exactly 2 in every row if $$$n$$$ is even, and 1 in the middle row if $$$n$$$ is odd). The row sum is therefore at least $$$(n - 2) \cdot 1 + 2 \cdot (-1) = n - 4$$$. Since $$$n \ge 5$$$, the minimum row sum across any row is $$$5 - 4 = 1 \gt 0$$$. Both diagonals contain entirely -1 elements, so their respective sums are $$$-n \lt 0$$$. This fully satisfies all constraints.
Edge Case: $$$n = 3$$$ For $$$n=3$$$, the maximum number of -1 elements per row is $$$\lfloor 2/2 \rfloor = 1$$$. The total number of -1 elements in the grid cannot exceed 3. Each diagonal requires at least $$$\lfloor 3/2 \rfloor + 1 = 2$$$ negative ones. Since the diagonals intersect at the center, they can share a -1 at coordinate $$$(2,2)$$$.
We can construct a valid grid by placing exactly one -1 in each row such that the diagonals receive the required negative weight:

-1 1 1

1 -1 1

-1 1 1
- Row sums: $$$-1+1+1 = 1 \gt 0$$$ (valid for all 3 rows).
- Main diagonal: $$$-1 - 1 + 1 = -1 \lt 0$$$ (valid).
- Anti-diagonal: $$$1 - 1 - 1 = -1 \lt 0$$$ (valid).
Complexity

* Time Complexity: $$$O(n^2)$$$ to iterate and print the grid.
* Space Complexity: $$$O(1)$$$ auxiliary space if printed directly, or $$$O(n^2)$$$ if stored in a 2D array.
#include <bits/stdc++.h>
using namespace std;

void solve() {
    int n;
    cin >> n;
    if (n == 1 || n == 2 || n == 4) {
        cout << 0 << "\n";
        return;
    }
    cout << 1 << "\n";
    if (n == 3) {
        cout << "-1 1 1\n";
        cout << "1 -1 1\n";
        cout << "-1 1 1\n";
        return;
    }
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (i == j || i + j == n - 1) {
                cout << -1 << " ";
            } else {
                cout << 1 << " ";
            }
        }
        cout << "\n";
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int t = 1;
    while (t--) {
        solve();
    }
    return 0;
}
E: The Great Sibling Grid War
Author: DysfunctionalDegenerate
Simulating the game move by move and checking the entire board takes $$$O(M \cdot N^2)$$$, which will result in a Time Limit Exceeded (TLE) verdict. Ask yourself: does the state of winning ever reverse?
Since pieces are never removed, the winning condition is strictly monotonic (once a square is formed, it stays formed). This allows you to use Binary Search on the move number instead of a linear scan.
For the check(mid) function in your Binary Search, you need to verify if any $$$K \times K$$$ square is fully occupied in $$$O(N^2)$$$ time. Use 2D Prefix Sums to achieve $$$O(1)$$$ sum queries for any subgrid.
1. The Brute Force Approach (and why it fails)

The most naive way to solve this is to simulate the game move by move. After each of the $$$M$$$ moves, you could scan the entire $$$N \times N$$$ board to see if a $$$K \times K$$$ square has been formed.
Scanning the board takes $$$O(N^2)$$$ time. Doing this for every move results in an overall time complexity of $$$O(M \cdot N^2)$$$. Given the constraints ($$$M \le 2 \cdot 10^5$$$ and $$$N \le 1000$$$), $$$M \cdot N^2$$$ approaches $$$2 \cdot 10^{11}$$$ operations, which will result in a clear Time Limit Exceeded (TLE).
2. The Intuition: Monotonicity and Binary Search

To optimize, we must ask: does the state of winning ever reverse? No. Because pieces are never removed from the board, if Vardaan completes a $$$K \times K$$$ square at move $$$X$$$, that square will still exist at move $$$X+1$$$.
Because this condition is strictly monotonic (False, False, False... True, True, True), we can apply Binary Search on the Answer. Instead of checking move $$$1$$$, then move $$$2$$$, we can binary search the exact move number (from $$$1$$$ to $$$M$$$) where the game transitions from "No Winner" to "Winner". This reduces our outer loop from $$$M$$$ to $$$\log_2(M)$$$.
3. The Implementation: 2D Prefix Sums

For our Binary Search to be efficient, our check(mid) function needs to be fast. If we are checking the state of the board at move mid:

* We place the first mid moves onto an empty $$$N \times N$$$ grid.
* We must rapidly check if any $$$K \times K$$$ subgrid is entirely filled by player 1 or player 2.
We achieve this using 2D Prefix Sums. We maintain two prefix sum arrays—one for Vardaan and one for Vishwas.
The prefix sum at pref[i][j] stores the total number of a player's symbols in the rectangle from $$$(1,1)$$$ to $$$(i,j)$$$. It is built using the formula: $$$\text{pref}[i][j] = \text{pref}[i-1][j] + \text{pref}[i][j-1] - \text{pref}[i-1][j-1] + \text{cell}(i,j)$$$
Once the prefix array is built, the number of symbols in any $$$K \times K$$$ subgrid whose bottom-right corner is at $$$(i, j)$$$ can be calculated in $$$O(1)$$$ time: $$$\text{Sum} = \text{pref}[i][j] - \text{pref}[i-K][j] - \text{pref}[i][j-K] + \text{pref}[i-K][j-K]$$$
If this $$$\text{Sum}$$$ equals $$$K^2$$$, that player has won!
Complexity

* Time Complexity: $$$O(N^2 \log M)$$$
* Space Complexity: $$$O(N^2)$$$ to store the grid and prefix sum arrays.
#include <bits/stdc++.h>
using namespace std;

int N, K, M;
vector<pair<int, int>> moves;

// Function to check if a KxK grid exists after exactly 'mid' moves
bool check(int mid, int& winner) {
    // Reconstruct the grid for the first 'mid' moves.
    // 0 = empty, 1 = Vardaan, 2 = Vishwas
    vector<vector<int>> grid(N + 1, vector<int>(N + 1, 0));
    for (int i = 0; i < mid; ++i) {
        if (i % 2 == 0) {
            grid[moves[i].first][moves[i].second] = 1; // Vardaan's turn (Even indices: 0, 2, 4...)
        } else {
            grid[moves[i].first][moves[i].second] = 2; // Vishwas's turn (Odd indices: 1, 3, 5...)
        }
    }
    // Initialize 2D Prefix sum arrays for both players
    vector<vector<int>> prefV(N + 1, vector<int>(N + 1, 0));
    vector<vector<int>> prefW(N + 1, vector<int>(N + 1, 0));
    // Build the 2D Prefix Sum matrices
    for (int i = 1; i <= N; ++i) {
        for (int j = 1; j <= N; ++j) {
            // Formula includes inclusion-exclusion principle to avoid double counting
            prefV[i][j] = prefV[i-1][j] + prefV[i][j-1] - prefV[i-1][j-1] + (grid[i][j] == 1 ? 1 : 0);
            prefW[i][j] = prefW[i-1][j] + prefW[i][j-1] - prefW[i-1][j-1] + (grid[i][j] == 2 ? 1 : 0);
        }
    }
    int target = K * K; // The area required to win
    bool v_wins = false;
    bool w_wins = false;
    // Slide a KxK window across the prefix sum arrays
    // We start at i=K, j=K because a KxK square requires at least K rows and K columns
    for (int i = K; i <= N; ++i) {
        for (int j = K; j <= N; ++j) {
            // O(1) query to check the sum of the current KxK window
            int sumV = prefV[i][j] - prefV[i-K][j] - prefV[i][j-K] + prefV[i-K][j-K];
            int sumW = prefW[i][j] - prefW[i-K][j] - prefW[i][j-K] + prefW[i-K][j-K];
            if (sumV == target) v_wins = true;
            if (sumW == target) w_wins = true;
        }
    }
    // Resolve the winner
    // Edge case: If both technically have a winning board at 'mid',
    // the person who just played the 'mid'-th move is the true winner.
    if (v_wins && w_wins) {
        winner = (mid % 2 != 0) ? 1 : 2;
        return true;
    } else if (v_wins) {
        winner = 1;
        return true;
    } else if (w_wins) {
        winner = 2;
        return true;
    }
    // No one has a KxK square yet
    return false;
}

int main() {
    // Fast I/O
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    if (!(cin >> N >> K >> M)) return 0;
    moves.resize(M);
    for (int i = 0; i < M; ++i) {
        cin >> moves[i].first >> moves[i].second;
    }
    int low = 1, high = M;
    int ans_move = -1;
    int ans_winner = -1;
    // Binary Search on the Answer space [1, M]
    while (low <= high) {
        int mid = low + (high - low) / 2;
        int winner = -1;
        // If a winner exists at 'mid' moves, record the answer and search LOWER (earlier) moves
        if (check(mid, winner)) {
            ans_move = mid;
            ans_winner = winner;
            high = mid - 1;
        } else {
            // If no winner yet, we need MORE moves. Search HIGHER.
            low = mid + 1;
        }
    }
    // Output formatting
    if (ans_move != -1) {
        if (ans_winner == 1) cout << "Vardaan " << ans_move << "\n";
        else cout << "Vishwas " << ans_move << "\n";
    } else {
        cout << "Draw\n";
    }
    return 0;
}
int main() {
// Fast I/O
ios_base::sync_with_stdio(false);
cin.tie(NULL);
if (!(cin >> N >> K >> M)) return 0;
moves.resize(M);
for (int i = 0; i < M; ++i) {
cin >> moves[i].first >> moves[i].second;
}
int low = 1, high = M;
int ans_move = -1;
int ans_winner = -1;
// Binary Search on the Answer space [1, M]
while (low <= high) {
int mid = low + (high - low) / 2;
int winner = -1;
// If a winner exists at 'mid' moves, record the answer and search LOWER (earlier) moves
if (check(mid, winner)) {
ans_move = mid;
ans_winner = winner;
high = mid - 1;
} else {
// If no winner yet, we need MORE moves. Search HIGHER.
low = mid + 1;
}
}
// Output formatting
if (ans_move != -1) {
if (ans_winner == 1) cout << "Vardaan " << ans_move << "\n";
else cout << "Vishwas " << ans_move << "\n";
} else {
cout << "Draw\n";
}
return 0;
}
F: Siddhartha's Dream
Author: sigma__1
Notice that $$$B = N - A$$$ implies $$$A + B = N$$$. Try to relate addition with XOR and AND operations using the fundamental bitwise identity: $$$A + B = (A \oplus B) + 2(A \text{ AND } B)$$$. Let $$$X = A \oplus B$$$ and $$$Y = A \text{ AND } B$$$.
Instead of iterating over all possible values of $$$A$$$, iterate over valid values of $$$X$$$. For a fixed valid $$$X$$$ and $$$Y$$$, how many original pairs $$$(A, B)$$$ exist? Think about the number of choices for each bit.
To compute the sum of $$$X^K$$$, use Digit DP from the Least Significant Bit (LSB) to the Most Significant Bit (MSB). Since you need the sum of $$$X^K$$$, maintain a DP state that tracks all powers up to $$$K$$$ and use Binomial Expansion for the transitions.
1. The Core Identity Reduction

The problem requires us to calculate $$$\sum_{A=0}^{N}(A \oplus (N-A))^{K} \pmod{998244353}$$$. Let $$$B = N - A$$$. This guarantees that $$$A + B = N$$$ is always satisfied. The main expression simplifies to $$$A \oplus B$$$.
We can use the fundamental bitwise identity that relates addition, XOR, and AND operations: $$$A + B = (A \oplus B) + 2 \cdot (A \text{ AND } B)$$$
Let us introduce two new variables: $$$X = A \oplus B$$$ and $$$Y = A \text{ AND } B$$$.
Substituting these into our conditions gives us a new system of constraints:

1. $$$X + 2Y = N$$$
2. $$$X \text{ AND } Y = 0$$$ (this holds because if the $$$i$$$-th bit of $$$X$$$ is 1, the $$$i$$$-th bits of $$$A$$$ and $$$B$$$ differ, so their AND is 0 at that position).
2. The Combinatorial Argument

Instead of iterating over all possible values of $$$A$$$, we can iterate over all valid values of $$$X$$$. But for a fixed valid $$$X$$$ and $$$Y$$$, how many original pairs $$$(A, B)$$$ exist?
Let's analyze this bit by bit:

* Case 1 (The $$$i$$$-th bit of $$$X$$$ is 0): This implies the $$$i$$$-th bits of $$$A$$$ and $$$B$$$ are identical. The exact bit (0 or 1) is strictly determined by the $$$i$$$-th bit of $$$Y$$$. Thus, $$$A$$$ has exactly 1 choice for this bit position.
* Case 2 (The $$$i$$$-th bit of $$$X$$$ is 1): This implies the $$$i$$$-th bits of $$$A$$$ and $$$B$$$ are different. There are 2 valid choices here: either (A's bit is 1, B's bit is 0) or (A's bit is 0, B's bit is 1).
Conclusion: For every valid integer $$$X$$$, there exist exactly $$$2^{\text{popcount}(X)}$$$ valid choices for $$$A$$$. Our problem is now completely transformed into finding: $$$\sum_{\substack{X + 2Y = N \\ X \text{ AND } Y = 0}} 2^{\text{popcount}(X)} \cdot X^{K} \pmod{998244353}$$$
3. Digit DP with Binomial Expansion

Given $$$N \le 10^{18}$$$, we can build $$$X$$$ bit by bit from the Least Significant Bit (LSB) to the Most Significant Bit (MSB). Since we need the sum of $$$X^K$$$, we must maintain a Dynamic Programming state that tracks all powers up to $$$K$$$.
We define our DP state as: dp[bit_index][carry][power] where power ranges from 0 to $$$K$$$, storing the sum of $$$X^{\text{power}}$$$ for the prefix built so far.
When we process the $$$i$$$-th bit and decide to set $$$x_i \in \{0, 1\}$$$, the numerical value added to our current $$$X$$$ is $$$V = x_i \cdot 2^i$$$. The new power $$$(X + V)^r$$$ can be expanded using the Binomial Theorem: $$$(X + V)^r = \sum_{j=0}^{r} \binom{r}{j} X^{j} V^{r-j}$$$
4. DP Transitions

At each bit position $$$i$$$, we iterate through all valid choices of $$$x_i \in \{0, 1\}$$$ and $$$y_i \in \{0, 1\}$$$, subject to the following rules:

* $$$x_i \text{ AND } y_i = 0$$$
* $$$x_i + 2y_i + \text{carry} = \text{bit}_i(N) + 2 \cdot \text{next\_carry}$$$
If we choose $$$x_i = 1$$$, we must multiply the number of ways by 2 for this transition, because the weight $$$2^{\text{popcount}(X)}$$$ increases by a factor of 2. For $$$x_i = 0$$$, the multiplier is 1. We apply the binomial expansion to update all powers $$$r$$$ from $$$j$$$ to $$$K$$$ in our DP table.
5. Complexity Analysis

* Time Complexity: There are $$$\approx 60$$$ bits (since $$$N \le 10^{18}$$$). The carry can be at most 1. For each bit and each carry, we iterate over $$$K$$$ powers, and for each power $$$j$$$, we iterate over $$$r$$$ up to $$$K$$$ to apply the binomial expansion. This nested iteration takes $$$O(K^2)$$$. Total Time Complexity: $$$O(\log N \cdot K^2)$$$. Given $$$K \le 100$$$, this is $$$60 \cdot 10^4 \approx 6 \times 10^5$$$ operations, which easily runs within the 1.0-second time limit.
* Space Complexity: We need to store dp[62][2][105]. This uses negligible memory, bounded by $$$O(\log N \cdot K)$$$.
#include <iostream>
#include <vector>
using namespace std;

long long N;
int K;
const int MOD = 998244353;
long long C[105][105];
long long dp[62][2][105];

void precompute() {
    for (int i = 0; i <= 100; i++) {
        C[i][0] = 1;
        for (int j = 1; j <= i; j++) {
            C[i][j] = (C[i - 1][j - 1] + C[i - 1][j]) % MOD;
        }
    }
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    if (!(cin >> N >> K)) return 0;
    precompute();
    // Base case: at bit 0, carry 0, sum of X^0 is 1
    dp[0][0][0] = 1;
    for (int i = 0; i < 60; i++) {
        int n_bit = (N >> i) & 1;
        long long val_v = (1LL << i) % MOD; // current bit value 2^i
        // Precompute powers of val_v
        vector<long long> v_pow(K + 1);
        v_pow[0] = 1;
        for (int j = 1; j <= K; j++) {
            v_pow[j] = (v_pow[j - 1] * val_v) % MOD;
        }
        for (int c = 0; c < 2; c++) {
            for (int x = 0; x < 2; x++) {
                for (int y = 0; y < 2; y++) {
                    if (x & y) continue; // Condition: X AND Y = 0
                    int total = x + 2 * y + c;
                    if ((total & 1) != n_bit) continue; // Bit match check
                    int next_c = total >> 1;
                    // Transition using Binomial Expansion
                    for (int j = 0; j <= K; j++) {
                        if (dp[i][c][j] == 0) continue;
                        long long ways = (x == 1) ? 2 : 1;
                        long long base_val = (dp[i][c][j] * ways) % MOD;
                        for (int r = j; r <= K; r++) {
                            if (x == 0 && r > j) continue;
                            long long term = (C[r][j] * v_pow[r - j]) % MOD;
                            long long add = (base_val * term) % MOD;
                            dp[i + 1][next_c][r] = (dp[i + 1][next_c][r] + add) % MOD;
                        }
                    }
                }
            }
        }
    }
    // Final answer
    cout << dp[60][0][K] << "\n";
    return 0;
}
G: Shifting Roots of CKC
Author: sigma__1
Treat each token independently. Since a token can move $$$1$$$ or $$$2$$$ edges closer to the root, this is exactly equivalent to a Nim pile of size $$$D$$$ where you can remove at most $$$2$$$ stones. What does the Sprague-Grundy theorem say about this?
The SG value of a token at distance $$$D$$$ is $$$D \pmod 3$$$. The global XOR sum must be $$$0$$$ for Bob to win. How can you simplify checking if the XOR sum of many 0, 1, and 2s is zero?
$$$1$$$ and $$$2$$$ don't share active bits in binary. Thus, the XOR sum is $$$0$$$ if and only if the number of tokens at distance $$$1 \pmod 3$$$ is even, and the number of tokens at distance $$$2 \pmod 3$$$ is even. Use Tree Re-rooting DP (In/Out DP) to count these parities for all roots in $$$O(N)$$$.
1. Game Theory Reduction (Sprague-Grundy)

The game allows a player to move a token towards the root by $$$1$$$ or $$$2$$$ edges. From the perspective of any single token, the game is entirely independent of the others. A token at a distance $$$D$$$ from the root is perfectly equivalent to a Nim pile of size $$$D$$$, where a player can remove at most $$$2$$$ stones.
According to the Sprague-Grundy theorem, the SG (Grundy) value of a single token at distance $$$D$$$ is:

$$$SG(D) = D \bmod 3$$$

The global state of the game is determined by the XOR sum of the SG values of all $$$M$$$ tokens. Bob wins if and only if the global XOR sum is $$$0$$$:

$$$\bigoplus_{i=1}^{M} \left( D_i \bmod 3 \right) = 0$$$
2. The Bitwise Simplification

Evaluating this XOR sum naively for every root takes $$$O(N \cdot M)$$$ time, which is too slow. However, observe the possible SG values: $$$0$$$, $$$1$$$, and $$$2$$$. In binary representation:

- $$$1 = 01_2$$$
- $$$2 = 10_2$$$
Since $$$1$$$ and $$$2$$$ do not share any active bits, they cannot interfere with each other during an XOR operation. This means the XOR sum evaluates to $$$0$$$ if and only if:

1. The total number of tokens at distance $$$1 \pmod 3$$$ is even.
2. The total number of tokens at distance $$$2 \pmod 3$$$ is even.
Tokens at distance $$$0 \pmod 3$$$ contribute $$$0$$$ to the XOR sum and do not alter the parity. Therefore, we only need to track the parity (even/odd) of token counts at distances $$$0, 1,$$$ and $$$2 \pmod 3$$$ from the root.
3. Re-rooting DP (In/Out DP)

To efficiently compute these parities for all $$$N$$$ possible roots, we use the Re-rooting DP technique on trees. We maintain two states for every node $$$u$$$:

- $$$dp[u][c]$$$: the parity of tokens inside $$$u$$$'s subtree whose distance from $$$u$$$ is $$$c \pmod 3$$$.
- $$$out[u][c]$$$: the parity of tokens outside $$$u$$$'s subtree whose distance from $$$u$$$ is $$$c \pmod 3$$$.
Transitions (The Shift): Whenever we traverse an edge (either moving from a child to a parent or vice versa), the distance increases by $$$1$$$. Consequently, tokens that were at distance $$$c \pmod 3$$$ will now be at $$$(c+1) \pmod 3$$$. For example, inside the dfs_in function, each child's states are absorbed with a cyclic shift: $$$dp[u][(c+1) \bmod 3] \mathrel{\oplus}= dp[v][c]$$$ for every child $$$v$$$ of $$$u$$$.
The dfs_out function utilizes a similar shift to compute the outside contributions in $$$O(1)$$$ time per edge by subtracting (via XOR) the current child's contribution from the total.
4. Complexity Analysis

The solution requires two standard depth-first searches over the tree. The state transitions are $$$O(1)$$$ since the modulo states are strictly bounded to size $$$3$$$. Thus, the total time complexity is $$$O(N)$$$. The space complexity is $$$O(N)$$$ to store the DP arrays and the adjacency list.
#include <iostream>
#include <vector>
#include <string>
using namespace std;
const int MAXN = 200005;
vector<int> adj[MAXN];
int tokens[MAXN];
int dp[MAXN][3];
int out[MAXN][3];
void dfs_in(int u, int p) {
dp[u][0] = tokens[u] % 2;
dp[u][1] = 0;
dp[u][2] = 0;
for (int v : adj[u]) {
if (v == p) continue;
dfs_in(v, u);
dp[u][0] ^= dp[v][2];
dp[u][1] ^= dp[v][0];
dp[u][2] ^= dp[v][1];
}
}
void dfs_out(int u, int p) {
int total[3];
total[0] = dp[u][0] ^ out[u][0];
total[1] = dp[u][1] ^ out[u][1];
total[2] = dp[u][2] ^ out[u][2];
for (int v : adj[u]) {
if (v == p) continue;
int rem[3];
rem[0] = total[0] ^ dp[v][2];
rem[1] = total[1] ^ dp[v][0];
rem[2] = total[2] ^ dp[v][1];
out[v][0] = rem[2];
out[v][1] = rem[0];
out[v][2] = rem[1];
dfs_out(v, u);
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int N, M;
if (!(cin >> N >> M)) return 0;
for (int i = 0; i < M; i++) {
int pos;
cin >> pos;
tokens[pos]++;
}
for (int i = 0; i < N - 1; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
out[1][0] = out[1][1] = out[1][2] = 0;
dfs_in(1, 0);
dfs_out(1, 0);
string result = "";
for (int i = 1; i <= N; i++) {
int total_1 = dp[i][1] ^ out[i][1];
int total_2 = dp[i][2] ^ out[i][2];
if (total_1 == 0 && total_2 == 0) {
result += "B";
} else {
result += "A";
}
}
cout << result << "\n";
return 0;
}
H: Hypothetical C++ Compiler
Author: sigma__1
How can we model the combination of types into a magic_tuple where cyclic shifts are identical? Think about necklaces and the Pólya Enumeration Theorem (PET).
Let $$$A(x)$$$ be the ordinary generating function for the number of unique types. Set up a functional equation for $$$A(x)$$$ using PET, ensuring you carefully handle the constraint that a valid tuple must have at least 2 arguments ($$$M \ge 2$$$).
The resulting self-referential equation can be simplified using logarithmic series. To find the coefficients up to $$$N$$$, use Newton's Method for Formal Power Series (FPS). You'll need the Number Theoretic Transform (NTT) to handle polynomial multiplications efficiently.
1. Problem Understanding and Intuition

The problem asks us to count the number of unique valid types of weight exactly $$$N$$$. A type can be either a basic type (weight $$$1$$$) or a magic_tuple containing at least $$$2$$$ other valid types. The crucial constraint is that two magic_tuples are considered identical if their arguments are cyclic shifts of each other.
Let $$$A(x) = \sum_{n=1}^{\infty} a_n x^n$$$ be the Ordinary Generating Function (OGF) where the coefficient $$$a_n$$$ represents the number of unique valid types of exact weight $$$n$$$. Our base types contribute $$$Kx$$$ to this function.
2. Necklace Counting via Pólya Enumeration Theorem

A magic_tuple with $$$M$$$ arguments can be modeled as a necklace of length $$$M$$$ formed by elements chosen from our generating function $$$A(x)$$$. According to the Pólya Enumeration Theorem (PET) over the cyclic group $$$\mathbb{Z}_M$$$, the generating function for unique necklaces of length $$$M$$$ is:

$$$\frac{1}{M} \sum_{d \mid M} \phi(d) \, A(x^d)^{M/d}$$$

where $$$\phi(d)$$$ is Euler's totient function. Since a valid magic_tuple must have at least $$$2$$$ arguments ($$$M \ge 2$$$), our self-referential recursive equation becomes:

$$$A(x) = Kx + \sum_{M=2}^{\infty} \frac{1}{M} \sum_{d \mid M} \phi(d) \, A(x^d)^{M/d}$$$
3. Mathematical Simplification (The Calculus Trick)

Working with an infinite sum of divisors is computationally heavy. We can reorder the summation by letting $$$M = c \cdot d$$$, where $$$d$$$ is the divisor and $$$c$$$ is the multiplier:

$$$\sum_{M=2}^{\infty} \frac{1}{M} \sum_{d \mid M} \phi(d) \, A(x^d)^{M/d} = \sum_{d=1}^{\infty} \frac{\phi(d)}{d} \sum_{\substack{c \ge 1 \\ cd \ge 2}} \frac{A(x^d)^c}{c}$$$

We split this into two cases to strictly enforce the $$$M \ge 2$$$ constraint:
- Case 1 ($$$d=1$$$): Since $$$M = c \cdot d \ge 2$$$, the multiplier $$$c$$$ must start from $$$2$$$. Using the standard Maclaurin series identity $$$-\ln(1-z) = \sum_{i=1}^{\infty} \frac{z^i}{i}$$$, we get:

$$$\sum_{c=2}^{\infty} \frac{A(x)^c}{c} = -\ln(1 - A(x)) - A(x)$$$
- Case 2 ($$$d \ge 2$$$): Since $$$d \ge 2$$$, any $$$c \ge 1$$$ guarantees $$$M = cd \ge 2$$$. Thus, the summation over $$$c$$$ forms a complete logarithmic series:

$$$\sum_{c=1}^{\infty} \frac{A(x^d)^c}{c} = -\ln(1 - A(x^d))$$$
Combining both cases gives us a functional equation:

$$$A(x) = Kx - \ln(1 - A(x)) - A(x) + \sum_{d=2}^{\infty} \frac{\phi(d)}{d} \left( -\ln(1 - A(x^d)) \right)$$$

By moving everything to one side, we define a function $$$G(A)$$$ that we need to root:

$$$G(A) = 2A(x) + \ln(1 - A(x)) - R(x) = 0$$$

where $$$R(x)$$$ is the known component representing all higher-order terms: $$$R(x) = Kx + \sum_{d=2}^{\infty} \frac{\phi(d)}{d} \left( -\ln(1 - A(x^d)) \right)$$$.
4. Implementation Details and Newton's Method

To find the polynomial $$$A(x)$$$ such that $$$G(A) \equiv 0 \pmod{x^{N+1}}$$$, we utilize the Newton-Raphson iteration generalized for Formal Power Series (FPS). The iterative update rule is:

$$$A_{\text{new}} = A - \frac{G(A)}{G'(A)}$$$

Computing the formal derivative $$$G'(A)$$$ with respect to $$$A(x)$$$:

$$$G'(A) = 2 - \frac{1}{1 - A} = \frac{1 - 2A}{1 - A}$$$

Substituting this back gives our core iterative formula:

$$$A_{\text{new}} = A - G(A) \cdot \frac{1 - A}{1 - 2A}$$$
Step-by-Step Implementation Approach:

1. Precomputation: First, precompute the Euler totient function $$$\phi(i)$$$ up to size $$$2N$$$ using a sieve. Similarly, precompute modular inverses for the FPS integral operations.
2. Iterative Doubling: We build the polynomial $$$A(x)$$$ by doubling its resolved length $$$L$$$ at each step ($$$L = 2 \to 4 \to 8 \dots$$$ until $$$L \gt N$$$).
3. Computing $$$R(x)$$$ online: The crucial realization that makes this solvable is that $$$R(x)$$$ modulo $$$x^L$$$ only depends on $$$A(x^d)$$$ for $$$d \ge 2$$$. Therefore, to compute $$$R(x)$$$ up to degree $$$L$$$, we only need the coefficients of $$$A(x)$$$ up to degree $$$\lfloor L/2 \rfloor$$$. Since we just computed $$$A(x)$$$ up to degree $$$L/2$$$ in the previous iteration, $$$R(x)$$$ can be evaluated safely!
4. NTT Mechanics: At each doubling step of size $$$L$$$, we perform FPS Logarithm (to find $$$\ln(1-A)$$$) and FPS Inversion (to find $$$(1-2A)^{-1}$$$). All these polynomial multiplications must be handled using the Number Theoretic Transform (NTT) to avoid $$$O(N^2)$$$ bottlenecks. We pad our arrays to the nearest power of $$$2$$$ to satisfy the NTT requirements.
5. Time and Space Complexity Analysis
Time Complexity: $$$O(N \log N)$$$

- Precomputations: Computing the sieve for $$$\phi$$$ takes $$$O(N \log \log N)$$$. Precomputing the inverse array takes $$$O(N)$$$. Both are strictly bounded by $$$O(N \log N)$$$.
- FPS Operations: Let $$$T(L)$$$ be the time taken for a single Newton iteration resolving up to degree $$$L$$$. Inside the loop, we perform polynomial multiplication, inversion, and derivation/integration (for the log function). Using NTT, multiplying two polynomials of degree $$$L$$$ takes $$$O(L \log L)$$$. Thus, $$$T(L) = O(L \log L)$$$.
- Total Loop Time: Because we double the length at each step, the total time is the sum of a geometric progression:

$$$\sum_{i=1}^{\log N} O(2^i \log 2^i) = O(N \log N)$$$

Since the terms form a geometric series, the total work is dominated by the final iteration step, keeping the asymptotic bound at $$$O(N \log N)$$$.
Space Complexity: $$$O(N)$$$

- The polynomial arrays $$$A(x)$$$, $$$R(x)$$$, and the intermediate vectors used inside the NTT functions (derivatives, inverses, and temporary multiplication pads) scale at most to the nearest power of $$$2$$$ greater than $$$2N$$$ (to prevent circular convolution overlap).
- Thus, the maximum memory allocated scales linearly with the input size $$$N$$$, resulting in $$$O(N)$$$ space complexity, which comfortably fits within standard 256MB constraints.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
const int MOD = 998244353;
const int G = 3;
int power(int base, long long exp) {
int res = 1;
base %= MOD;
while (exp > 0) {
if (exp % 2 == 1) res = (1LL * res * base) % MOD;
base = (1LL * base * base) % MOD;
exp /= 2;
}
return res;
}
int inverse(int n) { return power(n, MOD - 2); }
namespace FPS {
void ntt(vector<int>& a, bool invert) {
int n = a.size();
for (int i = 1, j = 0; i < n; i++) {
int bit = n >> 1;
for (; j & bit; bit >>= 1) j ^= bit;
j ^= bit;
if (i < j) swap(a[i], a[j]);
}
for (int len = 2; len <= n; len <<= 1) {
int wlen = power(G, (MOD - 1) / len);
if (invert) wlen = inverse(wlen);
for (int i = 0; i < n; i += len) {
int w = 1;
for (int j = 0; j < len / 2; j++) {
int u = a[i + j];
int v = (1LL * a[i + j + len / 2] * w) % MOD;
a[i + j] = (u + v < MOD ? u + v : u + v - MOD);
a[i + j + len / 2] = (u - v >= 0 ? u - v : u - v + MOD);
w = (1LL * w * wlen) % MOD;
}
}
}
if (invert) {
int n_inv = inverse(n);
for (int &x : a) x = (1LL * x * n_inv) % MOD;
}
}
vector<int> multiply(vector<int> const& a, vector<int> const& b, int deg = -1) {
if(a.empty() || b.empty()) return {};
vector<int> fa(a.begin(), a.end()), fb(b.begin(), b.end());
int n = 1;
while (n < fa.size() + fb.size()) n <<= 1;
fa.resize(n); fb.resize(n);
ntt(fa, false); ntt(fb, false);
for (int i = 0; i < n; i++) fa[i] = (1LL * fa[i] * fb[i]) % MOD;
ntt(fa, true);
if (deg != -1) fa.resize(deg);
return fa;
}
vector<int> inv(vector<int> a, int deg) {
if (deg == 1) return {inverse(a[0])};
vector<int> res = inv(a, (deg + 1) / 2);
int n = 1;
while (n < deg * 2) n <<= 1;
vector<int> fa(a.begin(), a.begin() + min((int)a.size(), deg));
fa.resize(n);
vector<int> fres = res;
fres.resize(n);
ntt(fa, false); ntt(fres, false);
for (int i = 0; i < n; ++i) {
fres[i] = (1LL * fres[i] * (2LL - 1LL * fa[i] * fres[i] % MOD + MOD)) % MOD;
}
ntt(fres, true);
fres.resize(deg);
return fres;
}
vector<int> deriv(vector<int> a) {
vector<int> res(max(1, (int)a.size() - 1), 0);
for (int i = 1; i < a.size(); ++i) res[i - 1] = (1LL * a[i] * i) % MOD;
return res;
}
vector<int> integr(vector<int> a, const vector<int>& inv_arr) {
vector<int> res(a.size() + 1, 0);
for (int i = 0; i < a.size(); ++i) {
res[i + 1] = (1LL * a[i] * inv_arr[i + 1]) % MOD;
}
return res;
}
vector<int> ln(vector<int> a, int deg, const vector<int>& inv_arr) {
vector<int> derivative = deriv(a);
vector<int> inverse_poly = inv(a, deg);
vector<int> res = integr(multiply(derivative, inverse_poly, deg - 1), inv_arr);
res.resize(deg, 0);
return res;
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int N;
long long K_input;
if (!(cin >> N >> K_input)) return 0;
int K = K_input % MOD;
if (N == 1) {
cout << K << "\n";
return 0;
}
int next_pow2 = 1;
while (next_pow2 <= N) next_pow2 <<= 1;
int max_len = 2 * next_pow2 + 5;
vector<int> phi(max_len);
for (int i = 0; i < max_len; ++i) phi[i] = i;
for (int i = 2; i < max_len; ++i) {
if (phi[i] == i) {
for (int j = i; j < max_len; j += i) {
phi[j] -= phi[j] / i;
}
}
}
vector<int> inv_arr(max_len, 1);
for (int i = 2; i < max_len; ++i) {
inv_arr[i] = 1LL * (MOD - MOD / i) * inv_arr[MOD % i] % MOD;
}
int len = 2;
vector<int> A = {0, K};
while (len <= N) {
int L = len * 2;
A.resize(L, 0);
vector<int> one_minus_A(len, 0);
one_minus_A[0] = 1;
for (int i = 1; i < len; ++i) one_minus_A[i] = (MOD - A[i]) % MOD;
vector<int> Q = FPS::ln(one_minus_A, len, inv_arr);
for (int &x : Q) x = (MOD - x) % MOD;
vector<int> R(L, 0);
R[1] = K;
for (int d = 2; d < L; ++d) {
int max_i = (L - 1) / d;
long long term = (1LL * phi[d] * inv_arr[d]) % MOD;
for (int i = 1; i <= max_i; ++i) {
if (i < Q.size()) {
R[i * d] = (R[i * d] + term * Q[i]) % MOD;
}
}
}
vector<int> one_minus_A_L(L, 0);
one_minus_A_L[0] = 1;
for (int i = 1; i < L; ++i) one_minus_A_L[i] = (MOD - A[i]) % MOD;
vector<int> ln_term = FPS::ln(one_minus_A_L, L, inv_arr);
vector<int> G_poly(L, 0);
for (int i = 0; i < L; ++i) {
G_poly[i] = (2LL * A[i] + ln_term[i] - R[i]) % MOD;
if (G_poly[i] < 0) G_poly[i] += MOD;
}
vector<int> one_minus_2A_L(L, 0);
one_minus_2A_L[0] = 1;
for (int i = 1; i < L; ++i) {
one_minus_2A_L[i] = (MOD - 2LL * A[i] % MOD) % MOD;
if (one_minus_2A_L[i] < 0) one_minus_2A_L[i] += MOD;
}
vector<int> inv_2A = FPS::inv(one_minus_2A_L, L);
vector<int> M = FPS::multiply(one_minus_A_L, inv_2A, L);
vector<int> shift = FPS::multiply(G_poly, M, L);
for (int i = 0; i < L; ++i) {
A[i] = (A[i] - shift[i] + MOD) % MOD;
}
len = L;
}
cout << A[N] << "\n";
return 0;
}
I: The Harmonious String
Author: sigma__1
A string is harmonious if every distinct character appears the exact same number of times. Let $$$C$$$ be the number of distinct characters in the string. Since the total length is $$$N$$$, what must be the relationship between $$$C$$$ and $$$N$$$?
$$$C$$$ must be a divisor of $$$N$$$, and since there are only 26 lowercase letters, $$$1 \le C \le 26$$$. For a fixed $$$C$$$, the target frequency for each character is $$$F = N/C$$$. How can you choose which characters to keep to minimize operations?
To minimize the number of replacements, greedily pick the $$$C$$$ most frequent characters from the original string. Keep up to $$$F$$$ occurrences of each chosen character and replace all extra or unchosen characters.
1. The Core Observation (Search Space Reduction)

A string is called harmonious if every distinct character present in it appears the exact same number of times.
Let $$$C$$$ be the number of distinct characters in the final string, and $$$F$$$ be the frequency of each of these characters. Since the total length of the string is $$$N$$$, the following relation must hold:

$$$C \cdot F = N$$$
This immediately implies that $$$C$$$ must be a divisor of $$$N$$$. Furthermore, since there are only $$$26$$$ lowercase English letters, $$$C$$$ is strictly bounded: $$$1 \le C \le 26$$$. Therefore, our search space is extremely small: we only need to iterate over all values of $$$C \in [1, 26]$$$ such that $$$N \bmod C = 0$$$.
2. Greedy Choice (Cost Minimization)

For a fixed valid $$$C$$$ (which also fixes the target frequency $$$F = N/C$$$), we need to choose exactly $$$C$$$ characters from the alphabet to form our final string.
To minimize the number of replacement operations, we must maximize the number of characters we keep from the original string.

- We count the frequencies of all characters in the original string $$$S$$$ and sort them in descending order.
- It is always optimal to greedily pick the $$$C$$$ most frequent characters.
- For a chosen character with an original frequency of $$$count_i$$$, we can keep at most $$$F$$$ instances of it. Thus, the number of characters we save for this letter is $$$\min(count_i, F)$$$.
The total replacement cost for a specific $$$C$$$ will be:

$$$Cost_C = N - \sum_{i=1}^{C} \min(count_i, F)$$$
We compute this cost for all valid $$$C$$$ and select the one that yields the minimum $$$Cost_C$$$. Let's call these optimal values $$$best_C$$$ and $$$best_F$$$.
3. Construction Logic

Once we have $$$best_C$$$ and $$$best_F$$$, we identify the specific $$$best_C$$$ characters we decided to keep.

1. We do a first pass over the original string $$$S$$$. If the current character is one of our chosen characters and we have kept strictly fewer than $$$best_F$$$ instances of it so far, we keep it and increment our kept counter for this character.
2. If the character is not in our chosen set, or we have already kept $$$best_F$$$ instances of it, we mark its index as a "blank" that needs to be replaced.
3. We do a second pass over our chosen characters. For any character that still has fewer than $$$best_F$$$ instances kept, we place it into the available "blank" indices until its frequency exactly reaches $$$best_F$$$.
4. Complexity Analysis

- Time Complexity: Finding the frequencies takes $$$O(N)$$$. Iterating over the $$$26$$$ possible values of $$$C$$$ and sorting a $$$26$$$-element array takes $$$O(26 \log 26)$$$, which is effectively $$$O(1)$$$. The final construction takes another $$$O(N)$$$ pass. Thus, the overall time complexity is $$$O(N)$$$ per test case, comfortably fitting the $$$\sum N \le 10^5$$$ constraint.
- Space Complexity: $$$O(N)$$$ to store the answer string and the indices of the replaced characters. The frequency arrays take $$$O(26)$$$ space.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
using namespace std;
void solve() {
int N;
cin >> N;
string S;
cin >> S;
vector<int> count(26, 0);
for (char c : S) {
count[c - 'a']++;
}
vector<pair<int, char>> freq;
for (int i = 0; i < 26; i++) {
freq.push_back({count[i], (char)(i + 'a')});
}
sort(freq.rbegin(), freq.rend());
int min_ops = N + 1;
int best_C = -1;
for (int C = 1; C <= 26; C++) {
if (N % C == 0) {
int F = N / C;
int current_keep = 0;
for (int i = 0; i < C; i++) {
current_keep += min(freq[i].first, F);
}
int ops = N - current_keep;
if (ops < min_ops) {
min_ops = ops;
best_C = C;
}
}
}
int best_F = N / best_C;
vector<int> target_count(26, 0);
for (int i = 0; i < best_C; i++) {
target_count[freq[i].second - 'a'] = best_F;
}
string ans = S;
vector<int> current_count(26, 0);
vector<int> replace_indices;
for (int i = 0; i < N; i++) {
int c = S[i] - 'a';
if (target_count[c] > 0 && current_count[c] < target_count[c]) {
current_count[c]++;
} else {
replace_indices.push_back(i);
}
}
int ptr = 0;
for (int i = 0; i < 26; i++) {
while (current_count[i] < target_count[i]) {
ans[replace_indices[ptr++]] = (char)(i + 'a');
current_count[i]++;
}
}
cout << min_ops << "\n" << ans << "\n";
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int T;
if (cin >> T) {
while (T--) solve();
}
return 0;
}
J: Sankalp's Two-Tone Subsequences
Author: sigma__1
To form a valid subsequence of length $$$K$$$ with exactly $$$2$$$ distinct elements $$$X$$$ and $$$Y$$$, you must pick $$$i$$$ elements of $$$X$$$ and $$$K-i$$$ elements of $$$Y$$$. How can you compute the total ways without a loop? Think about Vandermonde's Convolution.
Iterating over all pairs of distinct elements will give a Time Limit Exceeded (TLE) verdict. Notice that the number of ways depends only on the frequencies of the elements, not their actual values.
What is the maximum number of distinct frequencies an array of size $$$N$$$ can have? In the worst case, it is tightly bounded by $$$O(\sqrt{N})$$$. Group elements by their frequencies and iterate only over pairs of unique frequencies to easily pass within the time limit.
1. The Core Idea

We need to choose a subsequence of length $$$K$$$ that contains exactly $$$2$$$ distinct elements. Let these elements be $$$X$$$ and $$$Y$$$ with frequencies $$$f_x$$$ and $$$f_y$$$ respectively in the array.
To form a valid subsequence, we must pick $$$i$$$ elements of $$$X$$$ and $$$K-i$$$ elements of $$$Y$$$, where $$$1 \le i \le K-1$$$. The total number of ways to do this for a specific pair $$$(X, Y)$$$ is:

$$$\sum_{i=1}^{K-1} \binom{f_x}{i} \binom{f_y}{K-i}$$$
2. Observation 1: Vandermonde's Convolution

Computing the summation directly with a loop would be too slow. We can optimize this using Vandermonde's Convolution, which states that picking $$$K$$$ items from two groups of sizes $$$f_x$$$ and $$$f_y$$$ is the same as picking $$$K$$$ items from their union:

$$$\sum_{i=0}^{K} \binom{f_x}{i} \binom{f_y}{K-i} = \binom{f_x + f_y}{K}$$$
Since our valid summation strictly excludes $$$i=0$$$ (picking only $$$Y$$$) and $$$i=K$$$ (picking only $$$X$$$), we simply subtract these two invalid cases:

$$$\sum_{i=1}^{K-1} \binom{f_x}{i} \binom{f_y}{K-i} = \binom{f_x + f_y}{K} - \binom{f_x}{K} - \binom{f_y}{K}$$$
This allows us to compute the number of ways for any valid pair $$$(X, Y)$$$ in $$$O(1)$$$ time using precomputed factorials.
3. Observation 2: Bounding Distinct Frequencies

If we iterate over all unique pairs of elements, the worst-case time complexity would be $$$O(N^2)$$$ (e.g., when all elements in the array are distinct). This will lead to a Time Limit Exceeded (TLE) verdict.
The key observation is that the number of ways depends only on the frequencies of the elements, not their actual values. What is the maximum number of distinct frequencies $$$M$$$ an array of size $$$N$$$ can have?
In the worst case, the frequencies would be as tightly packed as possible: $$$1, 2, 3, \dots, M$$$. The sum of these frequencies cannot exceed $$$N$$$:

$$$\frac{M(M+1)}{2} \le N \implies M = O(\sqrt{N})$$$
For $$$N = 200{,}000$$$, the maximum number of distinct frequencies is $$$M \le 631$$$ (since $$$631 \cdot 632 / 2 = 199{,}396 \le N$$$ while $$$632 \cdot 633 / 2 \gt N$$$). Thus, we can group elements by their frequencies and iterate only over pairs of unique frequencies. This reduces our iterations to at most $$$O(M^2) \approx O(N)$$$.
4. Final Calculation

Let $$$C_u$$$ be the number of distinct elements having frequency exactly $$$f_u$$$. We iterate over the unique frequencies.
- Case 1: Pairing elements with the same frequency $$$f_u$$$. The number of such pairs we can form is $$$\frac{C_u(C_u - 1)}{2}$$$. The number of valid subsequences per pair is:

$$$\binom{2f_u}{K} - 2\binom{f_u}{K}$$$
- Case 2: Pairing elements with different frequencies $$$f_u$$$ and $$$f_v$$$. The number of such pairs we can form is $$$C_u \times C_v$$$. The number of valid subsequences per pair is:

$$$\binom{f_u + f_v}{K} - \binom{f_u}{K} - \binom{f_v}{K}$$$
We calculate the ways for both cases, multiply by the number of pairs, and accumulate them to our total sum modulo $$$10^9+7$$$.
#include <bits/stdc++.h>
using namespace std;
const int MOD = 1e9 + 7;
const int MAXN = 200005;
long long fact[MAXN], inv[MAXN];
long long power(long long base, long long exp) {
long long res = 1;
base %= MOD;
while (exp > 0) {
if (exp % 2 == 1) res = (res * base) % MOD;
base = (base * base) % MOD;
exp /= 2;
}
return res;
}
long long modInverse(long long n) {
return power(n, MOD - 2);
}
void precompute() {
fact[0] = 1;
inv[0] = 1;
for (int i = 1; i < MAXN; i++) {
fact[i] = (fact[i - 1] * i) % MOD;
}
inv[MAXN - 1] = modInverse(fact[MAXN - 1]);
for (int i = MAXN - 2; i >= 1; i--) {
inv[i] = (inv[i + 1] * (i + 1)) % MOD;
}
}
long long nCr(int n, int r) {
if (r < 0 || r > n) return 0;
return fact[n] * inv[r] % MOD * inv[n - r] % MOD;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
precompute();
int n, k;
if (!(cin >> n >> k)) return 0;
map<int, int> freq;
for (int i = 0; i < n; i++) {
int x;
cin >> x;
freq[x]++;
}
// count_of_freq[f] = C_f
map<int, int> count_of_freq;
for (auto const& [val, f] : freq) {
count_of_freq[f]++;
}
vector<int> unique_freqs;
for (auto const& [f, c] : count_of_freq) {
unique_freqs.push_back(f);
}
long long total_ans = 0;
for (int i = 0; i < unique_freqs.size(); i++) {
int f_u = unique_freqs[i];
long long C_u = count_of_freq[f_u];
// Pairing two elements that have the same frequency
if (C_u >= 2) {
long long ways = (nCr(2 * f_u, k) - 2LL * nCr(f_u, k) % MOD + MOD) % MOD;
long long pairs = (C_u * (C_u - 1) / 2LL) % MOD;
total_ans = (total_ans + ways * pairs % MOD) % MOD;
}
// Pairing two elements with different frequencies
for (int j = i + 1; j < unique_freqs.size(); j++) {
int f_v = unique_freqs[j];
long long C_v = count_of_freq[f_v];
long long ways = (nCr(f_u + f_v, k) - nCr(f_u, k) - nCr(f_v, k)) % MOD;
ways = (ways + 2LL * MOD) % MOD; // Handle negative values after modulo
long long pairs = (C_u * C_v) % MOD;
total_ans = (total_ans + ways * pairs % MOD) % MOD;
}
}
cout << total_ans << "\n";
return 0;
}