DeepSeek and I analyzed 1600 Codeforces contests and 8000+ problems: some interesting patterns
Please take a look at the results: https://ayushgirigoswami.github.io/codeforces_analysis_report/
A few weeks ago I got curious about something most of us probably notice intuitively but rarely measure properly:
- Is Codeforces getting harder over time?
- Which topics are becoming more common?
- Are some problems overrated or underrated?
- How different are Educational rounds from regular rounds?
So I wrote a Python script that pulls contest/problem data from the CF API, analyzes ratings + tags, and generates interactive visualizations.
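The fetching step boils down to a single call to the public `problemset.problems` endpoint. A minimal sketch (the function names are mine, not necessarily what the actual script uses):

```python
import requests

API_URL = "https://codeforces.com/api/problemset.problems"

def fetch_problems():
    # One GET returns every problemset problem plus per-problem statistics.
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()["result"]["problems"]

def rated_only(problems):
    # Problems without a "rating" field are unrated and excluded from the stats.
    return [p for p in problems if "rating" in p]
```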
Report
https://ayushgirigoswami.github.io/codeforces_analysis_report/
Source code
https://github.com/Ayushgirigoswami
Dataset
The analysis includes:
- 1600 contests
- 8354 rated problems
- Div.1 / Div.2 / Div.3 / Div.4 / Div.1+2
- 2011 → present
Average problem rating across the dataset: 1793
Some interesting results
1) Div.1+Div.2 rounds have the widest difficulty spread
| Division | Avg Rating | Median | Std Dev |
|---|---|---|---|
| Div.1 | 2358 | 2400 | 714 |
| Div.2 | 1630 | 1600 | 652 |
| Div.1+2 | 2096 | 2100 | 927 |
| Div.3 | 1429 | 1400 | 513 |
| Div.4 | 1213 | 1100 | 412 |
Div.1+2 rounds have by far the largest standard deviation.
Makes sense in hindsight: these rounds combine easy entry problems with very high-end G/H problems, so the spread becomes huge.
Also interesting: even Div.1 A problems average around 1537, which is already harder than many Div.2 mid-problems.
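The per-division table above is essentially one groupby. A sketch with pandas (toy data standing in for the real dataset):

```python
import pandas as pd

# Toy stand-in for the real dataset: one row per rated problem
problems = pd.DataFrame([
    {"division": "Div.1", "rating": 2400},
    {"division": "Div.2", "rating": 800},
    {"division": "Div.2", "rating": 1600},
])

# mean / median / std per division, mirroring the columns in the table above
stats = problems.groupby("division")["rating"].agg(["mean", "median", "std"])
print(stats)
```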
2) Biggest difficulty jumps are usually B→C and C→D
Average Div.2 ratings by position:
- A ≈ 903
- B ≈ 1203
- C ≈ 1552
- D ≈ 1932
- E ≈ 2300
- F ≈ 2614
The largest jumps are:
- B → C : +349
- C → D : +380
This matches what many contestants experience during contests: B is often straightforward, while C/D is where the real problem solving starts.
For Div.1+2 rounds, the jump near the end becomes even more extreme:
- F ≈ 2697
- G ≈ 3102
- H ≈ 3160
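Computing the jumps from the position averages is one zip away; using the Div.2 numbers above:

```python
# Average Div.2 rating per position, taken from the numbers above
avg_by_pos = {"A": 903, "B": 1203, "C": 1552, "D": 1932, "E": 2300, "F": 2614}

positions = list(avg_by_pos)
# Rating gained between each consecutive pair of positions
jumps = {f"{a}->{b}": avg_by_pos[b] - avg_by_pos[a]
         for a, b in zip(positions, positions[1:])}
print(jumps)  # the largest gap is C->D
```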
3) Topic trends over time
Increasing frequency
- greedy
- math
- constructive algorithms
- data structures
- binary search
- dp
- trees
- bitmasks
- interactive
Decreasing frequency
- implementation
- geometry
The increase in interactive problems during the last few years was especially noticeable.
Geometry also appears much less frequently than it did in older rounds.
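How the trend detection works isn't shown above; one simple approach is to normalize tag counts by the number of problems in each year and compare shares across years. A sketch (my own method, not necessarily the script's):

```python
from collections import Counter, defaultdict

def tag_share_by_year(problems):
    # problems: iterable of {"year": int, "tags": [str, ...]}
    tag_counts = defaultdict(Counter)
    totals = Counter()
    for p in problems:
        tag_counts[p["year"]].update(p["tags"])
        totals[p["year"]] += 1
    # Normalize by problems per year so busier years don't dominate the trend.
    return {year: {tag: n / totals[year] for tag, n in counts.items()}
            for year, counts in tag_counts.items()}
```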
4) Most common tags
| Tag | Total |
|---|---|
| greedy | 2885 |
| math | 2805 |
| implementation | 2407 |
| dp | 1980 |
| constructive algorithms | 1677 |
| brute force | 1644 |
| data structures | 1620 |
| binary search | 1022 |
Some observations:
- DP is disproportionately common in Div.1.
- Implementation dominates Div.2/3 but drops heavily in Div.1.
- Graph-related problems appear much more frequently than I expected.
5) Educational rounds vs regular rounds
This part surprised me.
| Type | Overall Avg | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|
| Educational | 1769 | 873 | 1118 | 1465 | 1842 | 2225 | 2628 |
| Regular | 1767 | 1050 | 1344 | 1714 | 2088 | 2417 | 2525 |
Overall average difficulty is almost identical.
But position by position, Educational rounds are consistently easier from A→E, while their F problems are actually harder on average.
6) Problems whose ratings seem unusual
Based on solve counts, some problems appear easier or harder than their assigned ratings suggest.
Easier than expected
- 1264F — Beautiful Fibonacci Problem (rated 3500, but solved by 1000+ users)
Harder than expected
- 2190F — Xor Product
- 2066F — Curse
- 1967F — Next and Prev
- 949F — Astronomy
These had surprisingly low solve counts relative to their ratings.
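The post doesn't spell out how over/underrated problems are flagged; a simple sketch is to fit log(solve count) against rating and look at the residuals (my own method with toy data, the real script may differ):

```python
import numpy as np

def anomaly_scores(ratings, solve_counts):
    # Solve counts fall roughly exponentially with rating, so fit in log space.
    ratings = np.asarray(ratings, dtype=float)
    log_solves = np.log1p(np.asarray(solve_counts, dtype=float))
    slope, intercept = np.polyfit(ratings, log_solves, 1)
    residuals = log_solves - (slope * ratings + intercept)
    # Positive score: more solves than the rating predicts (easier than expected);
    # negative score: fewer solves than predicted (harder than expected).
    return residuals / residuals.std()
```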
7) Contest “symmetry”
I also tried measuring how balanced contest difficulty curves are.
Average symmetry score: 0.454 / 1
Interpretation:
- > 0.7 → balanced progression
- < 0.4 → heavily front-loaded
Most CF contests lean slightly front-loaded: easy opening problems followed by a sharp wall.
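The exact definition of the symmetry score isn't given here; one plausible reading (my sketch, not necessarily the script's formula) compares how much difficulty is gained in the first half of a contest vs the second half:

```python
def symmetry_score(ratings):
    # ratings: problem ratings in contest order (A, B, C, ...)
    gaps = [b - a for a, b in zip(ratings, ratings[1:])]
    half = len(gaps) // 2
    first, second = sum(gaps[:half]), sum(gaps[half:])
    # 1.0 -> difficulty grows evenly across the contest; low values -> most
    # of the climb happens in one half (e.g. a sharp wall after easy openers)
    hi, lo = max(first, second), min(first, second)
    return lo / hi if hi > 0 else 1.0
```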
Running the script
Requirements:

```shell
pip install requests pandas numpy plotly matplotlib tqdm
```

Run:

```shell
python deep.py
```
The script:
- Fetches contests/problems from the CF API
- Performs statistical analysis
- Generates interactive Plotly graphs
- Detects trends/anomalies
Fetching everything takes around 15–20 minutes because of API rate limiting.
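For reference, the throttling can be as simple as spacing requests out (the ~2-second interval here is the commonly cited CF limit, an assumption on my part):

```python
import time

def throttled(calls, min_interval=2.1):
    # Run a sequence of zero-argument callables, leaving at least
    # min_interval seconds between consecutive calls.
    last = float("-inf")
    for call in calls:
        wait = min_interval - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield call()
```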
One thing I still want to analyze is solve timing during contests (for example, when most users solve C/D problems), but that would require many more contest.status API calls.
If anyone has ideas for additional analyses, suggestions are welcome :)