Hi everyone!
Today I want to write about the Gale-Ryser Theorem and some of its applications.
Gale-Ryser Theorem
We have an array of $$$n$$$ non-negative integers $$$a_1, a_2, \ldots, a_n$$$ and $$$m$$$ positive integers $$$b_1, b_2, \ldots, b_m$$$, each at most $$$n$$$. The array $$$b$$$ is a sequence of operations: in the $$$i$$$-th operation we need to decrease $$$b_i$$$ positions by $$$1$$$, formally, pick $$$b_i$$$ distinct indices $$$j_1, j_2, \ldots, j_{b_i}$$$ and decrease $$$a_{j_p}$$$ by $$$1$$$ for $$$1 \le p \le b_i$$$. We want to know whether it's possible to perform all operations so that every $$$a_i \ge 0$$$ at the end.
Without loss of generality $$$b_1 \ge b_2 \ge \ldots \ge b_m$$$. The Theorem says that it's possible iff $$$\sum \limits_{i=1}^n \min(a_i, k) \ge \sum \limits_{j=1}^k b_j$$$ for all $$$1 \le k \le m$$$.
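To make the condition concrete, here is a small Python sketch of a direct $$$O(nm)$$$ check (the function name is mine, not a standard one):

```python
def gale_ryser_feasible(a, b):
    """Check the Gale-Ryser condition:
    sum(min(a_i, k)) >= b_1 + ... + b_k for every k = 1..m."""
    b = sorted(b, reverse=True)      # the theorem assumes b non-increasing
    pref = 0
    for k in range(1, len(b) + 1):
        pref += b[k - 1]             # sum of the k largest operations
        if sum(min(x, k) for x in a) < pref:
            return False
    return True

# e.g. gale_ryser_feasible([2, 0], [1, 1]) is True (decrease a_1 twice),
# while gale_ryser_feasible([1, 1], [2, 1]) is False (fails at k = 2).
```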
Necessity

If we can perform all operations keeping every $$$a_i$$$ non-negative, then the sum of $$$a_i$$$ is at least the sum of $$$b_j$$$ ($$$\sum a \ge \sum b$$$). Moreover, among any $$$k$$$ operations each $$$a_i$$$ is decreased at most $$$k$$$ times, so it contributes at most $$$\min(a_i, k)$$$, and this sum must still be at least the sum of those $$$b_j$$$. The requirement that $$$b$$$ is non-increasing is here because we only need to verify the condition for the maximal set of $$$k$$$ operations: if the inequality holds for the $$$k$$$ largest operations, it holds for every other set of $$$k$$$ operations.
Sufficiency

Proof by induction

I'll do induction on $$$m$$$. The base case $$$m = 1$$$ is trivial. For the step, I decrease the $$$b_1$$$ largest positions and show that all the inequalities still hold for the new array $$$a'$$$ and the operations $$$b_2, b_3, \ldots, b_m$$$.
Let's verify every inequality for $$$2 \le k \le m$$$ (after the first operation, the inequality for $$$k$$$ in the original problem becomes the one for the $$$k - 1$$$ operations $$$b_2, \ldots, b_k$$$ in the new problem). Suppose $$$a$$$ is also non-increasing; then after we do the first operation the inequality looks like this:
$$$\sum \limits_{i=1}^{b_1} \min(a_i - 1, k - 1) + \sum \limits_{i=b_1 + 1}^{n} \min(a_i, k - 1) \ge \sum \limits_{j=2}^k b_j$$$
$$$\sum \limits_{i=1}^{b_1} \min(a_i, k) - b_1 + \sum \limits_{i=b_1 + 1}^{n} \min(a_i, k - 1) \ge \sum \limits_{j=2}^k b_j$$$
$$$\sum \limits_{i=1}^{b_1} \min(a_i, k) + \sum \limits_{i=b_1 + 1}^{n} \min(a_i, k - 1) \ge \sum \limits_{j=1}^k b_j$$$
First case: $$$a_{b_1 + 1} \lt k$$$. Then $$$\min(a_i, k - 1) = \min(a_i, k) = a_i$$$ for all $$$i \gt b_1$$$, so the left-hand side equals $$$\sum \limits_{i=1}^{n} \min(a_i, k)$$$ and the inequality is exactly the original one.
Second case: $$$a_{b_1 + 1} \ge k$$$. Then $$$a_{b_1} \ge k$$$ as well, so $$$\sum \limits_{i=1}^{b_1} \min(k, a_i) = k \cdot b_1 \ge \sum \limits_{j=1}^k b_j$$$, since $$$b_1$$$ is the maximum value and the sum has $$$k$$$ terms; the remaining sum is non-negative.
So the first operation preserves all the conditions and the induction step is complete.
Proof by mincut

We can phrase this task as a max-flow problem. We have two parts $$$A$$$ (one vertex per element of $$$a$$$) and $$$B$$$ (one vertex per operation). For $$$v \in A$$$ we draw an edge $$$S \rightarrow v$$$ with capacity $$$a_v$$$. For $$$u \in B$$$ we draw an edge $$$u \rightarrow T$$$ with capacity $$$b_u$$$. For every $$$v \in A, u \in B$$$ we draw an edge $$$v \rightarrow u$$$ with capacity $$$1$$$.

The max-flow cannot be greater than $$$\sum b$$$; we want a sufficient condition for it to equal $$$\sum b$$$. By the max-flow min-cut theorem (Ford-Fulkerson), max-flow = min-cut. Consider an $$$S$$$-$$$T$$$ cut. Split $$$A$$$ and $$$B$$$ by the side of the cut into $$$AS, AT, BS, BT$$$. Then $$$cut = \sum \limits_{v \in AT} a_v + \sum \limits_{u \in BS} b_u + |AS| \cdot |BT|$$$. Let's fix the size of $$$BT$$$ and call it $$$k$$$. Among all cuts with $$$|BT| = k$$$ we want to find the minimal one and compare it to $$$\sum b$$$. Since the terms that depend on $$$A$$$ don't depend on the choice of $$$BS$$$ and $$$BT$$$ once $$$|BT|$$$ is fixed, we can put the $$$m - k$$$ minimum elements of $$$b$$$ into $$$BS$$$ and the $$$k$$$ maximum elements into $$$BT$$$. Then each $$$a_i$$$ is either placed in $$$AT$$$, adding $$$a_i$$$ to the cut, or in $$$AS$$$, adding $$$k$$$ (one unit edge to each vertex of $$$BT$$$); so essentially it contributes $$$\min(k, a_i)$$$ to the mincut.
So, $$$mincut_k = \sum \limits_{i=1}^{n} \min(k, a_i) + \sum \limits_{j=k+1}^{m} b_j$$$. Since we want the min-cut to be at least $$$\sum b$$$, we require $$$mincut_k \ge \sum b$$$ for every $$$k$$$:
$$$\sum \limits_{i=1}^n \min(k, a_i) + \sum \limits_{j=k+1}^{m} b_j \ge \sum \limits_{j=1}^m b_j$$$
$$$\sum \limits_{i=1}^n \min(k, a_i) \ge \sum \limits_{j=1}^k b_j$$$
The proof by induction also suggests a strategy for the construction, which turns out to be the first greedy that comes to mind: always select the $$$b_i$$$ maximum positions. Now you know the proof behind this greedy. While it doesn't follow from this proof, it's actually not necessary to sort $$$b$$$ to run the greedy. This will also be proven later in the blog; for now just remember that we can process the operations in any order we want.
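For illustration, here is a sketch of that greedy using a max-heap; it is a generic helper I wrote for this post, not code from any particular problem:

```python
import heapq

def apply_greedy(a, b):
    """Greedy from the proof: each operation decrements the b_i
    currently largest elements. Returns the final array, or None
    if some element would have to go below zero."""
    # Python's heapq is a min-heap, so store negated values.
    heap = [(-x, i) for i, x in enumerate(a)]
    heapq.heapify(heap)
    res = list(a)
    for need in b:
        # pop the `need` largest elements at once, so the indices are distinct
        picked = [heapq.heappop(heap) for _ in range(need)]
        for _, i in picked:
            res[i] -= 1
            if res[i] < 0:
                return None          # the instance is infeasible
            heapq.heappush(heap, (-res[i], i))
    return res
```

For example, `apply_greedy([2, 0], [1, 1])` returns `[0, 0]`, while `apply_greedy([1, 1], [2, 1])` returns `None`.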
An important special case: all $$$b_i = t$$$. That is, $$$t$$$ positions are decreased $$$m$$$ times. Then it is only necessary to check that $$$\sum \limits_{i=1}^n \min(m, a_i) \ge mt$$$.
Proof

Consider the function $$$f(k) = \sum \limits_{i=1}^n \min(k, a_i) - kt$$$. We want to find its minimum over $$$0 \le k \le m$$$ and check whether it drops below $$$0$$$.
$$$f(k + 1) - f(k) = \sum \limits_{i=1}^n \left(\min(k + 1, a_i) - \min(k, a_i)\right) - t$$$. The sum of differences of minimums is the number of $$$a_i \gt k$$$, which is non-increasing in $$$k$$$, so $$$f$$$ is a concave function. That means the minimum over $$$[0, m]$$$ is attained at $$$f(0)$$$ or $$$f(m)$$$, and since $$$f(0) = 0$$$ we only need to check $$$f(m)$$$.
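In this special case the whole feasibility check collapses to one line; a minimal sketch (the function name is mine):

```python
def feasible_equal_ops(a, m, t):
    """m operations, each decreasing t distinct positions by 1;
    by the concavity argument only k = m needs to be checked."""
    return sum(min(m, x) for x in a) >= m * t
```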
Problem from AtCoder ABC which uses the Theorem
Another example is 1774B - Coloring. Statement: we have $$$n$$$ cells and $$$m$$$ colors, each cell must be colored. For each color $$$i$$$ there must be exactly $$$a_i$$$ cells painted with that color ($$$\sum a = n$$$). Also, no window of $$$k$$$ consecutive cells may contain two cells of the same color.
Solution (yes, an editorial for a div2B)

For each window of size $$$k$$$ we choose $$$k$$$ different colors and decrease their frequencies by $$$1$$$. That's exactly the process from the Theorem. The last window has size $$$n \mod k$$$; there we put all the remaining colors. So we have operations $$$b_1 = b_2 = \ldots = b_{\lfloor \frac{n}{k} \rfloor} = k$$$, $$$b_{\lfloor \frac{n}{k} \rfloor + 1} = n \mod k$$$. From the previous fact there are only $$$2$$$ conditions we need to check: for the first $$$\lfloor \frac{n}{k} \rfloor$$$ operations and for all of them.
$$$\sum \limits_{i=1}^m \min(\lfloor \frac{n}{k} \rfloor, a_i) \ge n - n \mod k$$$
$$$\sum \limits_{i=1}^m \min(\lfloor \frac{n}{k} \rfloor + 1, a_i) \ge n$$$
The first condition says that $$$\sum \limits_{i=1}^m \left(a_i - \min(\lfloor \frac{n}{k} \rfloor, a_i)\right) \le n \mod k$$$, i.e. there are at most $$$n \mod k$$$ values in $$$a$$$ greater than $$$\lfloor \frac{n}{k} \rfloor$$$.
The second condition together with the fact that $$$\sum a = n$$$ means that $$$\max(a) \le \lfloor \frac{n}{k} \rfloor + 1$$$.
Lastly, how do we select the colors for each cell? For the first window we just pick the $$$n \mod k$$$ most frequent colors, and then for each window of size $$$k$$$ we first pick the colors and then, going from left to right, place any picked color that can legally be put. It works because of the pigeonhole principle.
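Putting the two simplified conditions together, a sketch of the feasibility check for this problem (the function name is mine):

```python
def coloring_possible(n, k, a):
    """Frequencies a (sum = n); no k consecutive cells
    may repeat a color."""
    q, r = divmod(n, k)                   # q full windows, last window of size r
    too_big = sum(1 for x in a if x > q)  # colors used more than floor(n/k) times
    return max(a) <= q + 1 and too_big <= r
```

For instance, `coloring_possible(4, 2, [2, 2])` is `True` (ABAB), while `coloring_possible(4, 2, [3, 1])` is `False`.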
The same idea is used in 1893D - Colorful Constructive.
There is a symmetrical case to this problem. Suppose on each operation we decrease **at most** $$$b_i$$$ elements by $$$1$$$, but the goal now is to get all $$$a_i = 0$$$. Just swap $$$a$$$ and $$$b$$$ and we get the same problem as before! Originally each $$$b_j$$$ required $$$b_j$$$ positions in $$$a$$$ and each position $$$a_i$$$ could be picked at most $$$a_i$$$ times. And here each $$$a_i$$$ must be picked exactly $$$a_i$$$ times (we need to pick exactly $$$a_i$$$ positions in $$$b$$$) and each $$$b_j$$$ can be picked at most $$$b_j$$$ times.
Here are a couple of special cases, which I've seen.
All operations are $$$2$$$: we want to find the minimal number of operations needed to make all $$$a_i = 0$$$. I run into this one from time to time, but right now I can't recall a specific problem. For the proofs assume that $$$a$$$ is non-increasing.
Solution

We have $$$\sum \limits_{j=1}^m \min(k, b_j) \ge \sum \limits_{i=1}^k a_i$$$ for every $$$k$$$.
$$$m \cdot \min(k, 2) \ge \sum \limits_{i=1}^k a_i$$$.
It suffices to check $$$k = 1$$$ and $$$k = n$$$ to find the minimal $$$m$$$: for $$$k \ge 2$$$ the left side is the constant $$$2m$$$, so among those constraints $$$k = n$$$ is the strongest.
$$$k = 1$$$. $$$m \ge a_1$$$
$$$k = n$$$. $$$2m \ge \sum a \Rightarrow m \ge \lceil \frac{\sum a}{2} \rceil$$$
So we get that $$$m = \max(a_1, \lceil \frac{\sum a}{2} \rceil)$$$.
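This formula translates directly into code; a minimal sketch (the function name is mine):

```python
def min_ops_pairs(a):
    """Each operation decreases at most 2 elements by 1;
    minimal number of operations = max(max(a), ceil(sum(a) / 2))."""
    return max(max(a), (sum(a) + 1) // 2)  # (s + 1) // 2 is ceil(s / 2)
```

For example, `min_ops_pairs([1, 1, 1])` is `2` and `min_ops_pairs([3, 1])` is `3`.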
All operations are $$$n - 1$$$: we want to find the minimal number of operations needed to make all $$$a_i = 0$$$. This can be used in 2181G - Greta's Game.
Solution

$$$m \cdot \min(k, n - 1) \ge \sum \limits_{i=1}^k a_i$$$.
$$$k \le n - 1$$$: $$$m \cdot k \ge \sum \limits_{i=1}^k a_i$$$. Here $$$k = 1$$$ gives $$$m \ge a_1$$$, which implies all the rest: $$$km \ge k \cdot a_1 \ge \sum \limits_{i=1}^k a_i$$$.
$$$k = n$$$: $$$m \cdot (n - 1) \ge \sum a \Rightarrow m \ge \lceil \frac{\sum a}{n - 1} \rceil$$$.
So $$$m = \max(a_1, \lceil \frac{\sum a}{n - 1} \rceil)$$$.
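A sketch for this case as well (the function name is mine); note the denominator is $$$n - 1$$$ because $$$\min(n, n - 1) = n - 1$$$ in the $$$k = n$$$ condition:

```python
def min_ops_all_but_one(a):
    """Each operation decreases at most n - 1 of the n elements by 1;
    minimal number of operations = max(max(a), ceil(sum(a) / (n - 1)))."""
    n = len(a)
    return max(max(a), -(-sum(a) // (n - 1)))  # -(-s // d) is ceil(s / d)
```

For example, `min_ops_all_but_one([1, 1, 1])` is `2`: one operation can't touch all three elements.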
Application of this idea in Meta Hacker Cup
Lastly, there is a problem from the Open Olympiad 24-25 XIX Open Olympiad in Informatics - Final Stage, Day 2 (Unrated, Online Mirror, IOI rules), which motivated me to understand this theorem. I will only state facts from it without providing proofs.
This greedy gives the lexicographically largest sorted array among all arrays achievable with the operations. The proof also doesn't use the fact that the operations are performed in non-increasing order.
If all of the operations are equal ($$$m$$$ operations, each decreasing $$$k$$$ positions), it is possible to find any prefix sum of the resulting array in $$$O(\log m \cdot {}$$$(time to get certain sums)$$$)$$$.