Hello Codeforces! Today I want to share some thoughts on the magic of symmetry and linearity of expectation. We have all encountered counting problems that seem completely impossible to calculate directly. But there is a beautiful trick: if we use symmetry to convert a pure counting problem into an expected value problem, we unlock the ability to bypass massive formulas and solve the problem elegantly.
Before we begin, make sure you are familiar with the basics of:
Expected values (Obviously)
Indicator variables: An indicator variable is a simple binary variable that equals 1 if a specific event occurs, and 0 if it does not. The most useful property of an indicator variable is that its expected value is exactly equal to the probability of the event happening ($$$E[I] = P(I=1)$$$). This makes them the perfect atomic building blocks for counting complex occurrences.
Linearity of expectation: the expected value of a sum of random variables is always equal to the sum of their individual expected values, $$$E[X + Y] = E[X] + E[Y]$$$, this property is that it holds perfectly true even if the variables are dependent on each other.
Let us jump right into the topic by looking at the following example.
Counting Inversions in all possible permutations
Given a positive integer $$$n$$$, define $$$f(p)$$$ to be the count of inversions in the permutation $$$p$$$. Count the sum of $$$f(p)$$$ over all possible permutations of size $$$n$$$, $$$\sum_{p \in S_n} f(p)$$$.
Let us discuss the direct solution first. If we try to tackle this purely through combinatorics, looking at entire permutations all at once is too complex. Instead, we use the contribution to the sum technique. We focus on a specific pair of positions $$$(i, j)$$$ where $$$i \lt j$$$, and ask in how many permutations $$$p_i \gt p_j$$$. First, we need to choose the two actual numbers that will go into positions $$$i$$$ and $$$j$$$. There are $$$\binom{n}{2}$$$ ways to pick these two values from the set of numbers from 1 to $$$n$$$. Once we pick these two numbers, there is only one valid way to place them to form an inversion where the larger number must go to position $$$i$$$, and the smaller number must go to position $$$j$$$. We still have $$$n-2$$$ remaining numbers, and they can be arranged in the remaining $$$n-2$$$ positions in exactly $$$(n-2)!$$$ ways. So, for any fixed pair of positions $$$(i, j)$$$, exactly $$$\binom{n}{2} \times (n-2)! = \frac{n!}{2}$$$ permutations have an inversion at these two positions. Now, how many pairs of positions $$$(i, j)$$$ are there in total? There are $$$\binom{n}{2}$$$ pairs of positions. We multiply the number of position pairs by the number of inversions per pair to get our final sum $$$\frac{n! \cdot n(n-1)}{4}$$$.
Now, let us discuss the other approach, instead of counting the total sum directly, we can find the expected number of inversions in a single, uniformly random permutation, and then just multiply that expectation by the total number of permutations, $$$n!$$$. Note that this is 100% correct since if we try to choose a random permutation then every permutation is equal likely to be our choice, which is the symmetry. Let $$$X$$$ be the total number of inversions. We can break $$$X$$$ down into a sum of tiny indicator variables: let $$$X_{i,j}$$$ be a variable that equals 1 if $$$p_i \gt p_j$$$, and 0 otherwise
$$$X = \sum_{1 \le i \lt j \le n} X_{i,j}$$$. By Linearity of Expectation, the expected value of the sum is the sum of the expected values:
$$$E[X] = \sum_{1 \le i \lt j \le n} E[X_{i,j}]$$$Since these are the only two possibilities, the probability of $$$p_i \gt p_j$$$ is exactly $$$\frac{1}{2}$$$. Therefore, the expected value of any single indicator variable is just $$$\frac{1}{2}$$$. Thus, $$$E[X] = \frac{n(n-1)}{2} \times \frac{1}{2} = \frac{n(n-1)}{4}$$$. Therefore, the answer is $$$n! \times \frac{n(n-1)}{4}$$$.
Now let's see another harder example.
In a nutshell, you are given a string $$$s$$$ of length $$$n$$$ consisting of lowercase Latin letters $$$a$$$, $$$b$$$ and $$$c$$$ and question marks $$$?$$$. You need to replace every $$$?$$$ with one of the three letters. Your task is to count the total number of subsequences $$$abc$$$ in all resulting strings.
Hint 1Read the title of the Blog. There exists symmetry in our problem since every final string is equal likely to be the final string.
Hint 2Assume every $$$?$$$ is replaced by $$$a$$$, $$$b$$$, or $$$c$$$ uniformly at random, each with probability $$$\frac{1}{3}$$$. By Linearity of Expectation, we can iterate over every index $$$i$$$ and ask: What is the expected number of $$$abc$$$ subsequences where $$$s[i]$$$ acts as the middle character $$$b$$$?
SolutionLet us count the Expected value of having $$$abc$$$ as a subsequence and then multiply by $$$3^{\text{count of '?'}}$$$. For a fixed middle index $$$i$$$, the expected number of valid 'abc's passing through it is the product of three independent probabilities/expectations
The Left Side ($$$E[A_{left}]$$$): Every fixed 'a' to the left contributes exactly 1. Every '?' to the left has a $$$\frac{1}{3}$$$ chance of becoming an 'a'.
$$$E[A_{left}] = (\text{count of 'a'}) + \frac{1}{3} \times (\text{count of '?'})$$$The Right Side ($$$E[C_{right}]$$$): Similarly, every fixed 'c' to the right contributes 1, and every '?' contributes $$$\frac{1}{3}$$$.
$$$E[C_{right}] = (\text{count of 'c'}) + \frac{1}{3} \times (\text{count of '?'})$$$The Middle ($$$P(s[i] == \text{'b'})$$$): If $$$s[i]$$$ is 'b', this probability is 1. If $$$s[i]$$$ is '?', the probability is $$$\frac{1}{3}$$$. If it's 'a' or 'c', it's 0.
Because the choices for the '?' characters on the left, the right, and the middle are strictly independent, the expected value of their product is the product of their expected values:
$$$E[\text{subsequences where } i \text{ is the middle element}] = E[A_{left}] \times E[C_{right}] \times P(s[i] == \text{'b'})$$$ Code#include<bits/stdc++.h>
#pragma GCC optimize("Ofast, unroll-loops")
#ifdef ONLINE_JUDGE
#define debug(...) ((void)0)
#else
#include "debug.h"
#endif
using namespace std;
using ll = long long;
#define endl '\n'
template <int MOD_> struct modnum {
static constexpr int MOD = MOD_;
static_assert(MOD_ > 0, "MOD must be positive");
private:
static int minv(int a, int m) {
a %= m;
assert(a);
return a == 1 ? 1 : int(m - ll(minv(m, a)) * ll(m) / a);
}
public:
int v;
modnum() : v(0) {}
modnum(ll v_) : v(int(v_ % MOD)) { if (v < 0) v += MOD; }
explicit operator int() const { return v; }
friend std::ostream& operator << (std::ostream& out, const modnum& n) { return out << int(n); }
friend std::istream& operator >> (std::istream& in, modnum& n) { ll v_; in >> v_; n = modnum(v_); return in; }
friend bool operator == (const modnum& a, const modnum& b) { return a.v == b.v; }
friend bool operator != (const modnum& a, const modnum& b) { return a.v != b.v; }
modnum inv() const { modnum res; res.v = minv(v, MOD); return res; }
friend modnum inv(const modnum& m) { return m.inv(); }
modnum neg() const { modnum res; res.v = v ? MOD-v : 0; return res; }
friend modnum neg(const modnum& m) { return m.neg(); }
modnum operator- () const { return neg(); }
modnum operator+ () const { return modnum(*this); }
modnum& operator ++ () { v ++; if (v == MOD) v = 0; return *this; }
modnum& operator -- () { if (v == 0) v = MOD; v --; return *this; }
modnum& operator += (const modnum& o) { v += o.v; if (v >= MOD) v -= MOD; return *this; }
modnum& operator -= (const modnum& o) { v -= o.v; if (v < 0) v += MOD; return *this; }
modnum& operator *= (const modnum& o) { v = int(ll(v) * ll(o.v) % MOD); return *this; }
modnum& operator /= (const modnum& o) { return *this *= o.inv(); }
friend modnum operator ++ (modnum& a, int) { modnum r = a; ++a; return r; }
friend modnum operator -- (modnum& a, int) { modnum r = a; --a; return r; }
friend modnum operator + (const modnum& a, const modnum& b) { return modnum(a) += b; }
friend modnum operator - (const modnum& a, const modnum& b) { return modnum(a) -= b; }
friend modnum operator * (const modnum& a, const modnum& b) { return modnum(a) *= b; }
friend modnum operator / (const modnum& a, const modnum& b) { return modnum(a) /= b; }
};
const int MOD = 1e9 + 7;
using mint = modnum<MOD>;
template <typename T> mint binpow(T aa, long long b) {
assert(b >= 0);
mint a = aa;
mint r = 1; while (b) { if (b & 1) r *= a; b >>= 1; a *= a; } return r;
}
template <int N>
struct counting{
mint fact[N], inv[N], invfact[N];
counting() {
fact[0] = fact[1] = invfact[0] = invfact[1] = inv[1] = 1;
for (int i = 2; i < N; i++) {
fact[i] = mint(i) * fact[i - 1];
inv[i] = mint(0) - mint(MOD / i) * inv[MOD % i];
invfact[i] = invfact[i - 1] * inv[i];
}
}
mint choose(int n, int r) {
if (n < 0 || r < 0 || n < r) return mint(0);
return fact[n] * invfact[n - r] * invfact[r];
}
mint stars_and_bars(int n, int k){
return choose(n + k - 1, k - 1);
}
mint P(int n, int r){
if (n < 0 || r < 0 || r > n) return mint(0);
return fact[n] * invfact[n - r];
}
mint catalan(int n){
return fact[2 * n] * invfact[n] * invfact[n + 1];
}
};
const int N = 1e6 + 5;
// counting<N> comb;
// DO NOT FORGET TO CHANGE THE MOD
void SOLVE() {
int n; cin >> n;
string s; cin >> s;
mint answer = 0;
mint A = 0;
mint C = count(begin(s), end(s), 'c');
mint markL = 0, markR = count(begin(s), end(s), '?');
mint INV = mint(3).inv();
for(int i = 0; i < n; i++){
if(s[i] == 'c') C--;
else if(s[i] == '?') markR--;
if(s[i] == 'b' || s[i] == '?') {
answer += (A + INV * markL) * (C + INV * markR) * (s[i] == '?' ? INV : mint(1));
}
if(s[i] == 'a') A++;
else if(s[i] == '?') markL++;
}
cout << answer * binpow(3, (int)markL);
}
int main() {
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr); std::cout.tie(nullptr);
// int o_o = 1; cin >> o_o; while(o_o --)
SOLVE(); return 0;
}
You have an airplane with $$$n$$$ seats in a single row and $$$m$$$ passengers $$$m \leq n$$$. Every passenger independently chooses an assigned seat from $$$1$$$ to $$$n$$$ and a walking direction either facing the front or facing the back of the plane. When it is their turn to board, they go straight to their assigned seat. If it is empty, they sit. If it is occupied, they keep walking in their chosen direction to take the very next available seat. Your goal is to find the total number of initial choices assigned seat and direction for all $$$m$$$ passengers where every single person successfully finds a seat without walking off either edge of the airplane.
Try to solve it first without symmetry of expectations, I don't know how but just try. Now, for our solution
Hint 1If you try to calculate the probabilities directly, you will quickly realize that the seats are not equal. A person aiming for seat $$$1$$$ and walking left falls off the plane immediately if it's occupied, while a person walking right has $$$n-1$$$ backup options. The edges of the array completely destroy any natural symmetry. But what if there were no edges? How can we make every single position perfectly symmetric to all the others?
Hint 2To remove the edges, we need to turn our linear airplane into a circle. Imagine we add exactly one dummy seat at the very end of the plane, giving us $$$n+1$$$ seats. Then, we connect this dummy seat back to the first seat. Now, passengers walk in a continuous loop. No one ever gets angry or falls off because they will just keep circling until they find an empty seat. In this circular universe, a boarding configuration is valid for our original straight airplane if and only if the dummy seat at $$$n+1$$$ remains completely empty. If someone had to sit in the dummy seat, it means they walked off the edge of the real plane.
SolutionRead the two hints first.
There are now $$$n+1$$$ seats. Each of the $$$m$$$ passengers independently chooses one of these $$$n+1$$$ seats and one of $$$2$$$ directions. This gives us a total of $$$(2(n+1))^m$$$ possible starting configurations. Since there are $$$n+1$$$ total seats and $$$m$$$ passengers will eventually sit down, there will always be exactly $$$n+1-m$$$ empty seats left over. I think if you reach this point of the blog I can left this part as an exercise for the reader of calculating the probability of having the seat $$$n+1$$$ empty.
Spoiler$$$P(\text{seat } n+1 \text{ is empty}) = \frac{n+1-m}{n+1}$$$ To find the answer to the original problem, we just take the total number of configurations in our circular universe and multiply it by the probability that the configuration is actually valid, $$$(2(n+1))^m \times \frac{n+1-m}{n+1}$$$.
I hope this blog helped in exploring the magic of expected values and symmetry for you. If you know of any other cool problems share it with us in the comments.