DeepSeek and I analyzed 1600 Codeforces contests and 8000+ problems: some interesting patterns
Please take a look at the results: https://ayushgirigoswami.github.io/codeforces_analysis_report/
A few weeks ago I got curious about something most of us probably notice intuitively but rarely measure properly:
- Is Codeforces getting harder over time?
- Which topics are becoming more common?
- Are some problems overrated or underrated?
- How different are Educational rounds from regular rounds?
So I wrote a Python script that pulls contest/problem data from the CF API, analyzes ratings + tags, and generates interactive visualizations.
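The fetching step boils down to a single call to the public `problemset.problems` endpoint. A minimal sketch (the function names are mine, not necessarily what the actual script uses):

```python
import requests

API_URL = "https://codeforces.com/api/problemset.problems"

def fetch_problems():
    # One GET returns every problemset problem plus per-problem statistics.
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()["result"]["problems"]

def rated_only(problems):
    # Problems without a "rating" field are unrated and excluded from the stats.
    return [p for p in problems if "rating" in p]
```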
Report
https://ayushgirigoswami.github.io/codeforces_analysis_report/
Source code
https://github.com/Ayushgirigoswami
Dataset
The analysis includes:
- 1600 contests
- 8354 rated problems
- Div.1 / Div.2 / Div.3 / Div.4 / Div.1+2
- 2011 → present
Average problem rating across the dataset: 1793
Some interesting results
1) Div.1+Div.2 rounds have the widest difficulty spread
| Division | Avg Rating | Median | Std Dev |
|---|---|---|---|
| Div.1 | 2358 | 2400 | 714 |
| Div.2 | 1630 | 1600 | 652 |
| Div.1+2 | 2096 | 2100 | 927 |
| Div.3 | 1429 | 1400 | 513 |
| Div.4 | 1213 | 1100 | 412 |
Div.1+2 rounds have by far the largest standard deviation.
Makes sense in hindsight: these rounds combine easy entry problems with very high-end G/H problems, so the spread becomes huge.
Also interesting: even Div.1 A problems average around 1537, which is already harder than many Div.2 mid-problems.
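The per-division table above is essentially one groupby. A sketch with pandas (toy data standing in for the real dataset):

```python
import pandas as pd

# Toy stand-in for the real dataset: one row per rated problem
problems = pd.DataFrame([
    {"division": "Div.1", "rating": 2400},
    {"division": "Div.2", "rating": 800},
    {"division": "Div.2", "rating": 1600},
])

# mean / median / std per division, mirroring the columns in the table above
stats = problems.groupby("division")["rating"].agg(["mean", "median", "std"])
print(stats)
```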
2) Biggest difficulty jumps are usually B→C and C→D
Average Div.2 ratings by position:
- A ≈ 903
- B ≈ 1203
- C ≈ 1552
- D ≈ 1932
- E ≈ 2300
- F ≈ 2614
The largest jumps are:
- B → C : +349
- C → D : +380
This matches what many contestants experience during contests: B is often straightforward, while C/D is where the real problem solving starts.
For Div.1+2 rounds, the jump near the end becomes even more extreme:
- F ≈ 2697
- G ≈ 3102
- H ≈ 3160
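Computing the jumps from the position averages is one zip away; using the Div.2 numbers above:

```python
# Average Div.2 rating per position, taken from the numbers above
avg_by_pos = {"A": 903, "B": 1203, "C": 1552, "D": 1932, "E": 2300, "F": 2614}

positions = list(avg_by_pos)
# Rating gained between each consecutive pair of positions
jumps = {f"{a}->{b}": avg_by_pos[b] - avg_by_pos[a]
         for a, b in zip(positions, positions[1:])}
print(jumps)  # the largest gap is C->D
```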
3) Topic trends over time
Increasing frequency
- greedy
- math
- constructive algorithms
- data structures
- binary search
- dp
- trees
- bitmasks
- interactive
Decreasing frequency
- implementation
- geometry
The increase in interactive problems during the last few years was especially noticeable.
Geometry also appears much less frequently than it did in older rounds.
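How the trend detection works isn't shown above; one simple approach is to normalize tag counts by the number of problems in each year and compare shares across years. A sketch (my own method, not necessarily the script's):

```python
from collections import Counter, defaultdict

def tag_share_by_year(problems):
    # problems: iterable of {"year": int, "tags": [str, ...]}
    tag_counts = defaultdict(Counter)
    totals = Counter()
    for p in problems:
        tag_counts[p["year"]].update(p["tags"])
        totals[p["year"]] += 1
    # Normalize by problems per year so busier years don't dominate the trend.
    return {year: {tag: n / totals[year] for tag, n in counts.items()}
            for year, counts in tag_counts.items()}
```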
4) Most common tags
| Tag | Total |
|---|---|
| greedy | 2885 |
| math | 2805 |
| implementation | 2407 |
| dp | 1980 |
| constructive algorithms | 1677 |
| brute force | 1644 |
| data structures | 1620 |
| binary search | 1022 |
Some observations:
- DP is disproportionately common in Div.1.
- Implementation dominates Div.2/3 but drops heavily in Div.1.
- Graph-related problems appear much more frequently than I expected.
5) Educational rounds vs regular rounds
This part surprised me.
| Type | Overall Avg | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|
| Educational | 1769 | 873 | 1118 | 1465 | 1842 | 2225 | 2628 |
| Regular | 1767 | 1050 | 1344 | 1714 | 2088 | 2417 | 2525 |
Overall average difficulty is almost identical.
But position by position, Educational rounds are consistently easier from A→E, while their F problems are actually harder on average.
6) Problems whose ratings seem unusual
Based on solve counts, some problems appear easier or harder than their assigned ratings suggest.
Easier than expected
- 1264F — Beautiful Fibonacci Problem (rated 3500, but solved by 1000+ users)
Harder than expected
- 2190F — Xor Product
- 2066F — Curse
- 1967F — Next and Prev
- 949F — Astronomy
These had surprisingly low solve counts relative to their ratings.
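The post doesn't spell out how over/underrated problems are flagged; a simple sketch is to fit log(solve count) against rating and look at the residuals (my own method with toy data, the real script may differ):

```python
import numpy as np

def anomaly_scores(ratings, solve_counts):
    # Solve counts fall roughly exponentially with rating, so fit in log space.
    ratings = np.asarray(ratings, dtype=float)
    log_solves = np.log1p(np.asarray(solve_counts, dtype=float))
    slope, intercept = np.polyfit(ratings, log_solves, 1)
    residuals = log_solves - (slope * ratings + intercept)
    # Positive score: more solves than the rating predicts (easier than expected);
    # negative score: fewer solves than predicted (harder than expected).
    return residuals / residuals.std()
```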
7) Contest “symmetry”
I also tried measuring how balanced contest difficulty curves are.
Average symmetry score: 0.454 / 1
Interpretation:
- > 0.7 → balanced progression
- < 0.4 → heavily front-loaded
Most CF contests lean slightly front-loaded: easy opening problems followed by a sharp wall.
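The exact definition of the symmetry score isn't given here; one plausible reading (my sketch, not necessarily the script's formula) compares how much difficulty is gained in the first half of a contest vs the second half:

```python
def symmetry_score(ratings):
    # ratings: problem ratings in contest order (A, B, C, ...)
    gaps = [b - a for a, b in zip(ratings, ratings[1:])]
    half = len(gaps) // 2
    first, second = sum(gaps[:half]), sum(gaps[half:])
    # 1.0 -> difficulty grows evenly across the contest; low values -> most
    # of the climb happens in one half (e.g. a sharp wall after easy openers)
    hi, lo = max(first, second), min(first, second)
    return lo / hi if hi > 0 else 1.0
```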
Running the script
Requirements:

```shell
pip install requests pandas numpy plotly matplotlib tqdm
```

Run:

```shell
python deep.py
```
The script:
- Fetches contests/problems from the CF API
- Performs statistical analysis
- Generates interactive Plotly graphs
- Detects trends/anomalies
Fetching everything takes around 15–20 minutes because of API rate limiting.
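For reference, the throttling can be as simple as spacing requests out (the ~2-second interval here is the commonly cited CF limit, an assumption on my part):

```python
import time

def throttled(calls, min_interval=2.1):
    # Run a sequence of zero-argument callables, leaving at least
    # min_interval seconds between consecutive calls.
    last = float("-inf")
    for call in calls:
        wait = min_interval - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield call()
```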
One thing I still want to analyze is solve timing during contests (for example, when most users solve C/D problems), but that would require many more contest.status API calls.
If anyone has ideas for additional analyses, suggestions are welcome :)