So, I participated in Codeforces Round 1089 (Div. 2) recently and solved ABC1, and when I moved on to C2 I thought to myself "wow, that's a big difficulty jump!"
After the contest was over, I wondered whether there was a way to put a number on this, and after a bit of thinking I came up with the following process:
- Take the solve counts of each problem (the green numbers at the bottom of the Standings page). Call it $$$\mathrm{solves}$$$.
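- Say that there are $$$n$$$ problems. Define the array $$$\mathrm{ratios}\left[i\right] = \frac{\mathrm{solves}\left[i\right]}{\mathrm{solves}\left[i+1\right]}$$$ for each pair of consecutive problems $$$i$$$ and $$$i+1$$$, so $$$\mathrm{ratios}$$$ has $$$n-1$$$ entries.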
- The imbalance is calculated as $$$\mathrm{stdev}\left(\mathrm{ln}\left(\mathrm{ratios}\right)\right)$$$.
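The three steps above can be sketched in a few lines of Python. This is only a minimal sketch, not the code from the repo: the function name `imbalance` is made up for illustration, it assumes every problem has at least one solve (otherwise the logarithm is undefined), and it assumes the *sample* standard deviation (`statistics.stdev`); swapping in `statistics.pstdev` would give the population version instead.

```python
import math
import statistics


def imbalance(solves):
    """Standard deviation of the log-ratios of consecutive solve counts.

    Assumptions (not confirmed by the post): sample standard deviation,
    and every problem has a positive solve count.
    """
    if len(solves) < 3:
        # Two ratios are the minimum needed for a sample standard deviation.
        raise ValueError("need at least three problems")
    if any(count <= 0 for count in solves):
        raise ValueError("every problem needs at least one solve")
    # ratios[i] = solves[i] / solves[i + 1], taken over consecutive problems.
    log_ratios = [math.log(a / b) for a, b in zip(solves, solves[1:])]
    return statistics.stdev(log_ratios)
```

Calling `imbalance` on a list like `[800, 400, 200, 100]`, where each problem has exactly half the solves of the previous one, gives zero, since all the log-ratios are equal.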
I then tested this method on some recent contests. After playing around with it, I decided the function was good enough to use, so I set out to build a CSV containing each contest's ID, start time (in UTC), title, and measured imbalance.
The script first checks an existing CSV file for any data already computed, then, for each contest not in the CSV file:
- Query the Codeforces API for standings information.
- For each participant, if their score is greater than zero for problem $$$i$$$, count that as an accepted solution; doing this for every problem gives the $$$\mathrm{solves}$$$ array.
- Use the contest information returned by the Codeforces API method `contest.standings` to get the contest start time (as a Unix timestamp in UTC) and title.
- Use the $$$\mathrm{solves}$$$ array we computed to calculate imbalance as given above.
- For any contest, if the API gives status code 400 (e.g. contest 1597) or returns empty standings (e.g. contest 399), write that to a file named `status400.txt` so that these contests can be skipped on the next run.
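For illustration, the fetching and counting steps might look roughly like the sketch below. It is not the code from the repo: `fetch_standings` and `summarize` are hypothetical names, the request uses only the `contestId` parameter, and error handling is reduced to the HTTP 400 case mentioned above. The field names (`contest`, `problems`, `rows`, `problemResults`, `points`, `name`, `startTimeSeconds`) are the ones returned by `contest.standings`.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timezone

API_URL = "https://codeforces.com/api/contest.standings"


def fetch_standings(contest_id):
    """Return the `result` object of contest.standings, or None on HTTP 400."""
    query = urllib.parse.urlencode({"contestId": contest_id})
    try:
        with urllib.request.urlopen(f"{API_URL}?{query}") as response:
            payload = json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 400:
            return None  # the caller could record this ID in status400.txt
        raise
    return payload["result"] if payload.get("status") == "OK" else None


def summarize(result):
    """Extract (start time in UTC, title, solves) from a standings result."""
    contest = result["contest"]
    start = datetime.fromtimestamp(contest["startTimeSeconds"], tz=timezone.utc)
    solves = [0] * len(result["problems"])
    for row in result["rows"]:
        for i, problem_result in enumerate(row["problemResults"]):
            # A score greater than zero counts as an accepted solution.
            if problem_result["points"] > 0:
                solves[i] += 1
    return start, contest["name"], solves
```

Keeping the network call separate from the counting makes it easy to check `summarize` against a small hand-written dictionary without hitting the API. With empty standings, `solves` comes back as all zeros, which is one way to detect the second skip case.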
You can find the source code, computed CSV, and status400.txt in this GitHub repo.
Now, what is imbalance actually measuring? I like to think of it as an estimate of how rough the difficulty jumps between problems are. For example, if until some problem each problem is slightly harder than the last but the next problem is much harder, this makes the imbalance high. On the other hand, if the difficulties progress smoothly the imbalance will be low. The difficulties of the problems are estimated by accepted counts.
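Here is a small made-up example (the numbers are not from any real contest). With $$$\mathrm{solves} = \left[1000, 500, 250, 125\right]$$$, every ratio is $$$2$$$, so every log-ratio is $$$\ln 2$$$ and the imbalance is $$$0$$$. With $$$\mathrm{solves} = \left[1000, 500, 250, 5\right]$$$, the ratios are $$$\left[2, 2, 50\right]$$$ and the log-ratios are roughly $$$\left[0.69, 0.69, 3.91\right]$$$. Their standard deviation is $$$\frac{\ln 25}{\sqrt{3}} \approx 1.86$$$ if you use the sample formula, or $$$\frac{\sqrt{2}\,\ln 25}{3} \approx 1.52$$$ if you use the population formula. Either way, the single big jump at the last problem is what drives the number up.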
It's time for some statistics. Out of the last 10 rated contests, the highest imbalance belongs to Codeforces Round 1089 (Div. 2) at around 1.341. The rated contest with the highest imbalance that is also "ordered" by difficulty is Codeforces Round 421 (Div. 2); its imbalance is that high because the author solution failed on a hack and most of the accepted solutions for C FSTed (failed system tests). The rated contest with the lowest imbalance is Codeforces Beta Round 1.
That's it for now, goodbye! Also, here's a reminder: please don't go harassing the contests with high imbalance.







