High Expectations from Expectations


In this blog, I aim to demonstrate how expectations can be used effectively to solve problems whose solutions would otherwise look very clumsy.


Theory

Let’s start with the basics.

1. Linearity of Expectation

For any finite set of random variables $$$X_1, X_2, \dots, X_N$$$, which need not be independent, we have

$$$ \mathbb{E}\left[\sum_{i=1}^{N} X_i \right] = \sum_{i=1}^{N} \mathbb{E}[X_i] $$$

That’s it. Simple, cute, and effective. This one line is usually enough to solve most probability problems.
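
For instance, if $$$X$$$ is the number of heads in $$$n$$$ fair coin flips and $$$I_j$$$ indicates whether flip $$$j$$$ comes up heads, then

$$$ \mathbb{E}[X] = \mathbb{E}\left[\sum_{j=1}^{n} I_j\right] = \sum_{j=1}^{n} \mathbb{E}[I_j] = \frac{n}{2} $$$

and the derivation never uses independence of the flips; linearity only needs each marginal expectation $$$\mathbb{E}[I_j] = 1/2$$$.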

Convinced it’s simple? Let’s see it in action.

Example 1: Game on Tree

Given a rooted tree with $$$n$$$ nodes, a game is played by Momiji where in each step, a node is chosen uniformly at random from the set of remaining nodes. The chosen node and its entire subtree are then removed from the tree. The process repeats until all nodes have been deleted. The goal is to calculate the expected number of operations required to finish the game.


In this problem, the expected number of moves equals the expected number of nodes that Momiji picks directly, since every move picks exactly one node.

Define $$$I_i$$$ as an indicator variable for whether the $$$i$$$-th node was removed directly by Momiji (and not as part of some ancestor's subtree).

We are interested in finding:

$$$ \mathbb{E}\left[\sum_{i=1}^{n} I_i\right] = \sum_{i=1}^{n} \mathbb{E}[I_i] = \sum_{i=1}^{n} P(I_i = 1) $$$

(The expectation of an indicator random variable equals the probability that it equals 1)

A node is picked directly by Momiji only if none of its ancestors has been picked earlier; otherwise it would already have been deleted.
Equivalently, among node $$$i$$$ and all of its ancestors, node $$$i$$$ must be chosen first. Since at every step each remaining node is equally likely to be chosen, every ordering of this set is equally likely, so this is the same as putting these nodes in a bag and drawing node $$$i$$$ first.
Hence the probability is:

$$$ P(I_i = 1) = \frac{1}{1 + \text{number of ancestors of node } i} = \frac{1}{\text{depth}_i} $$$

where $$$\text{depth}_i$$$ is the depth of node $$$i$$$ with the root at depth $$$1$$$.

Therefore, the expected number of moves is:

$$$ \sum_{i=1}^{n} \frac{1}{\text{depth}_i} $$$
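
A minimal sketch of this solution, assuming the tree is rooted at node $$$1$$$, the edges are given as $$$n-1$$$ pairs, and a floating-point answer is expected:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    scanf("%d", &n);
    vector<vector<int>> adj(n + 1);
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        scanf("%d %d", &u, &v);
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
    // BFS from the root to compute the depth of every node (root has depth 1).
    vector<int> depth(n + 1, 0);
    queue<int> q;
    depth[1] = 1;
    q.push(1);
    double ans = 0;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        ans += 1.0 / depth[u];  // node u is picked directly with probability 1 / depth[u]
        for (int v : adj[u]) {
            if (!depth[v]) {
                depth[v] = depth[u] + 1;
                q.push(v);
            }
        }
    }
    printf("%.10f\n", ans);
    return 0;
}
```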

Example 2: Random Knockout Tournament

A tournament consists of $$$n$$$ equally skilled players. In each round, two players are randomly drawn to play a match; the loser is eliminated, and the winner proceeds. This continues until only one player remains. What is the probability that two specific players, $$$i$$$ and $$$j$$$, will face each other at some point during the tournament?


Due to the symmetry of the tournament (all players are equally skilled and pairings are random), the probability that any two players $$$i$$$ and $$$j$$$ face each other must be the same for every possible pair.

Let this constant probability be $$$P$$$.

Let $$$X_{i,j}$$$ be an indicator random variable for the pair $$$\{i, j\}$$$:

$$$X_{i,j} = 1$$$ if player $$$i$$$ and player $$$j$$$ play a match.

$$$X_{i,j} = 0$$$ otherwise.

The expected value of this indicator is:

$$$E[X_{i,j}] = P(X_{i,j} = 1) = P$$$

In any knockout tournament with $$$n$$$ players, exactly $$$n-1$$$ players must be eliminated to produce one winner. Since each match eliminates exactly one player, there are exactly $$$n-1$$$ matches in total.

The total number of matches $$$M$$$ can also be expressed as the sum of all indicator variables for all possible pairs:

$$$M = \sum_{1 \le i \lt j \le n} X_{i,j}$$$

Taking the expectation of the total number of matches:

$$$E[M] = E\left[ \sum_{1 \le i \lt j \le n} X_{i,j} \right]$$$

Since $$$M$$$ is always $$$n-1$$$, $$$E[M] = n-1$$$. By linearity of expectation:

$$$n-1 = \sum_{1 \le i \lt j \le n} E[X_{i,j}]$$$

There are $$$\binom{n}{2}$$$ pairs in the summation. Substituting $$$E[X_{i,j}] = P$$$:

$$$n-1 = \binom{n}{2} \cdot P$$$
$$$P = \frac{2}{n}$$$
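
As a quick sanity check (this is not part of any solution, just a verification sketch), a small Monte Carlo simulation of the tournament should produce an estimate close to $$$2/n$$$:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    const int n = 8;            // number of players (any n >= 2)
    const int trials = 200000;  // number of simulated tournaments
    mt19937 rng(12345);

    int met = 0;
    for (int t = 0; t < trials; t++) {
        vector<int> alive(n);
        iota(alive.begin(), alive.end(), 1);  // players 1..n
        bool meet = false;
        while (alive.size() > 1) {
            // draw two distinct players uniformly at random for the next match
            shuffle(alive.begin(), alive.end(), rng);
            int a = alive[alive.size() - 1];
            int b = alive[alive.size() - 2];
            if ((a == 1 && b == 2) || (a == 2 && b == 1)) meet = true;
            // both players are equally skilled, so the winner is a fair coin flip;
            // keep the winner and drop the loser
            alive[alive.size() - 2] = (rng() & 1) ? a : b;
            alive.pop_back();
        }
        if (meet) met++;
    }
    printf("estimated P = %.4f, predicted 2/n = %.4f\n",
           (double)met / trials, 2.0 / n);
    return 0;
}
```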

Linearity of Expectation in Combinatorics

In counting problems over a complete set of configurations, instead of summing over every individual case, we can treat the set as a probability space: compute the expected value of the quantity for one uniformly random configuration, then multiply by the total number of configurations to get the desired sum. This transformation lets us bypass messy combinatorics entirely.
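
Formally, if $$$S$$$ is the set of configurations and $$$h(s)$$$ is the quantity we want to sum, then for a configuration $$$X$$$ drawn uniformly at random from $$$S$$$,

$$$ \sum_{s \in S} h(s) = |S| \cdot \mathbb{E}[h(X)] $$$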

Problem 1: Count the Blocks

Consider all integers from $$$0$$$ to $$$10^n - 1$$$, each written with exactly $$$n$$$ digits using leading zeros ($$$1 \le n \le 2 \cdot 10^5$$$).

A block is a maximal consecutive sequence of the same digit.
For each length $$$i$$$ from $$$1$$$ to $$$n$$$, count the total number of blocks of length exactly $$$i$$$ across all these $$$n$$$-digit strings.


While this problem is often approached with direct counting, it becomes much cleaner using Linearity of Expectation.

Instead of counting blocks across all $$$10^n$$$ strings, we find the expected number of blocks of length $$$k$$$ in a single, uniformly random string of length $$$n$$$, then multiply by $$$10^n$$$.

Let $$$X$$$ be the number of blocks of length exactly $$$k$$$. We want to find:

$$$10^n \times \mathbb{E}[X]$$$

Define $$$I_i$$$ as an indicator random variable where $$$I_i = 1$$$ if a block of length $$$k$$$ starts at position $$$i$$$, and $$$I_i = 0$$$ otherwise.

For a block of length $$$k$$$ to start at position $$$i$$$, two conditions must be met:

  1. Internal Consistency: All $$$k$$$ digits from $$$i$$$ to $$$i+k-1$$$ must be the same. (Probability: $$$10 \times (1/10)^k = 1/10^{k-1}$$$)

  2. Maximality: Any digit immediately adjacent to this sequence must be different to ensure the block doesn't "stretch" longer than $$$k$$$.

Case 1: The block is at an end ($$$i = 1$$$ or $$$i = n-k+1$$$)

There is exactly one neighbor. To ensure the block is exactly length $$$k$$$, that neighbor must be one of the 9 other digits.

$$$P(I_{\text{end}} = 1) = \left( \frac{1}{10^{k-1}} \right) \times \frac{9}{10} = \frac{9}{10^k}$$$

(Note: If $$$k=n$$$, there are no neighbors, so the probability is simply $$$1/10^{n-1}$$$.)

Case 2: The block is in the middle ($$$1 \lt i \lt n-k+1$$$)

There are two neighbors (left and right). Both must be different from the block's digits.

$$$P(I_{\text{mid}} = 1) = \left( \frac{1}{10^{k-1}} \right) \times \left( \frac{9}{10} \right)^2 = \frac{81}{10^{k+1}}$$$

Now, to get the total count, we multiply the expected value by $$$10^n$$$:

  1. If $$$k = n$$$: Only 1 possible position.
$$$\text{Count} = 10^n \times \frac{1}{10^{n-1}} = 10$$$
  2. If $$$k \lt n$$$:
  • Ends: 2 positions ($$$i=1$$$ and $$$i=n-k+1$$$).
  • Middle: $$$(n-k+1) - 2 = n-k-1$$$ positions.
$$$\text{Total Count} = 10^n \times \left( 2 \cdot \frac{9}{10^k} + \max(0, n-k-1) \cdot \frac{81}{10^{k+1}} \right)$$$
$$$ \implies \text{Total Count} = 10^{n-k-1} \times \left( 180 + \max(0, n-k-1) \cdot 81 \right)$$$
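
A short implementation sketch of this formula, assuming the answers are requested modulo $$$998244353$$$ (adjust MOD if the statement uses a different modulus):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 998244353;  // assumed modulus; change if the statement differs

// fast modular exponentiation: base^exp mod m
long long power(long long base, long long exp, long long m) {
    long long result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

int main() {
    int n;
    scanf("%d", &n);
    for (int k = 1; k <= n; k++) {
        long long ans;
        if (k == n) {
            ans = 10;  // the whole string is one block: 10 such strings
        } else {
            long long mid = n - k - 1;  // number of "middle" starting positions
            ans = power(10, n - k - 1, MOD) * ((180 + 81 * mid) % MOD) % MOD;
        }
        printf("%lld ", ans);
    }
    printf("\n");
    return 0;
}
```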

Check my submission here.


Problem 2: Max History

Given an array $$$a$$$ of $$$n$$$ integers, let $$$f(a)$$$ be the sum of elements that are greater than or equal to all preceding elements. Calculate the total sum of $$$f$$$ over all $$$n!$$$ permutations of $$$a$$$, modulo $$$10^9 + 7$$$.


This is a beautiful problem, but the editorial’s solution looks terrifying. Let us instead try to solve it using probability theory.

We are asked to find the sum of $$$f(a)$$$ over all permutations of the array. This is equivalent to finding

$$$ n! \times \mathbb{E}[f(a)] $$$

since the expected value of $$$f(a)$$$ over a uniformly random permutation is exactly the average of $$$f$$$ over all $$$n!$$$ permutations.


Now, let’s analyze $$$f(a)$$$.
Define $$$I_i$$$ as an indicator random variable for element $$$a_i$$$:

  • $$$I_i = 1$$$ if $$$a_i$$$ contributes to $$$f(a)$$$ (i.e., it is greater than all elements before it),
  • $$$I_i = 0$$$ otherwise.

Then

$$$ f(a) = \sum_{i=1}^{n} I_i \cdot a_i $$$

Taking expectation on both sides and using linearity:

$$$ \mathbb{E}[f(a)] = \sum_{i=1}^{n} a_i \cdot \mathbb{E}[I_i] = \sum_{i=1}^{n} a_i \cdot P(I_i = 1) $$$

Now, what is $$$P(I_i = 1)$$$?
It is simply the probability that $$$a_i$$$ is greater than all the elements before it in the permutation. In other words, it is the probability that $$$a_i$$$ occurs before all the elements greater than or equal to it.

Let $$$g_i$$$ be the number of elements greater than or equal to $$$a_i$$$, including $$$a_i$$$ itself.

Then $$$a_i$$$ should occur first in these $$$g_i$$$ elements. This happens with probability $$$\frac{1}{g_i}$$$.

So

$$$ P(I_i = 1) = \frac{1}{g_i} $$$

Therefore

$$$ \mathbb{E}[f(a)] = \sum_{i=1}^{n} a_i \cdot \frac{1}{g_i} $$$

and the final answer (sum over all permutations) is

$$$ n! \times \sum_{i=1}^{n} \frac{a_i}{g_i} $$$
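
A sketch of this formula in code; it follows the derivation above and, like it, ignores the edge case mentioned below. The input format ($$$n$$$ followed by the $$$n$$$ array values) is an assumption:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007;

// fast modular exponentiation: base^exp mod m
long long power(long long base, long long exp, long long m) {
    long long result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// modular inverse via Fermat's little theorem (MOD is prime)
long long inv(long long x) { return power(x, MOD - 2, MOD); }

int main() {
    int n;
    scanf("%d", &n);
    vector<long long> a(n);
    for (auto &x : a) scanf("%lld", &x);

    vector<long long> sorted_a(a);
    sort(sorted_a.begin(), sorted_a.end());

    long long sum = 0;  // sum of a_i / g_i (mod p)
    for (int i = 0; i < n; i++) {
        // g_i = number of elements >= a[i], including a[i] itself
        long long g = n - (lower_bound(sorted_a.begin(), sorted_a.end(), a[i]) - sorted_a.begin());
        sum = (sum + a[i] % MOD * inv(g)) % MOD;
    }

    long long fact = 1;  // n!
    for (int i = 2; i <= n; i++) fact = fact * i % MOD;

    printf("%lld\n", fact * sum % MOD);
    return 0;
}
```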

We ignore the edge case here, as it is not relevant to this blog. Check my submission here.

