How to Interpret Score Distribution

In the comments section of my most recent round, there have been a lot of misconceptions about interpreting score distributions. This blog explains how to read them correctly from a problemsetter's standpoint.

The biggest misconception I saw was that a problem's score is almost always equal to its difficulty. This is not true. Do you really think 1852A - Ntarsis' Set is a rated 500 problem?

The correct way to interpret a score distribution is to look at the gaps between adjacent problems. I will give suggestions on interpreting point differences in a Div. 2 round. Div. 1 scoring may be different because the difficulty curve is different.

Usually, a $$$250$$$ point gap means the two problems are relatively similar in difficulty but with a slight difficulty increment. A $$$500$$$ point gap represents the standard gap between adjacent problems that you'd expect. I think any gap $$$750$$$ or above means there is a decently large difficulty discrepancy between the two problems. I will use Codeforces Round 965 (Div. 2) as an example.
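The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name `describe_gaps`, the labels, and the example scores are made up, and treating every gap below $$$500$$$ as "slight" is my assumption, since the text only names $$$250$$$, $$$500$$$, and $$$750$$$ or above.

```python
def describe_gaps(scores):
    """Label the gap between each pair of adjacent problem scores.

    Thresholds follow the rule of thumb in the text; how to treat
    values between the named gaps is an assumption of this sketch.
    """
    labels = []
    for easier, harder in zip(scores, scores[1:]):
        gap = harder - easier
        if gap >= 750:
            label = "large jump"
        elif gap >= 500:
            label = "standard step"
        else:
            label = "slight step"
        labels.append((gap, label))
    return labels


# Hypothetical distribution, not taken from any real round.
example_scores = [500, 750, 1250, 2000]
for gap, label in describe_gaps(example_scores):
    print(gap, label)
```

The point of the sketch is that only the differences are inspected; the absolute score of a problem never enters the classification.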

Problems A and B are $$$250$$$ apart, so they are expected to be similar in difficulty, with B being slightly more challenging. B and C are $$$500$$$ apart, which means that the difficulty difference between B and C is expected to be larger than that between A and B. In hindsight, I probably should've assigned a $$$750$$$ point difference. This is also why you shouldn't rely on the score distribution to decide whether to attempt a problem, as the assigned difference could be higher or lower than the actual gap. Again, there is a $$$250$$$ gap between C and D, so they are expected to be relatively close in difficulty, but D should still be harder than C; the same goes for D and E1.

Problems with subtasks usually appear as $$$(x + y)$$$ in the announcement blog, with $$$x$$$ being the score assigned to the easy version and $$$y$$$ the score for the hard version. In most rounds, this means that if the easy version were a standalone problem, it would be assigned $$$x$$$ points, and if only the hard version were proposed, it would be worth $$$x+y$$$ points. However, I don't believe that this should always be the case in the future, which is why this blog is also a suggestion to future problemsetters.

Originally, E1 was going to be assigned $$$2000$$$ points and E2 $$$1000$$$ points. Then one of our testers, Dominater069, said "I wrote 10 mins of ds for E1 for $$$2000$$$ points... And then take 30 mins just for $$$1000$$$ points?" I thought about this and realized he has a point. There have been a lot of instances where the hard version is a lot harder than the easy version, but isn't worth a significant number of points on its own. Therefore, I think setters should weigh the hard version like a standalone problem, and decrease its value based on how much of the key observation the easy version gives away.
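One way to picture this suggestion is the sketch below. It is not a formula anyone actually uses: `hard_version_points`, the `overlap_percent` estimate, the linear discount, and rounding to a multiple of $$$250$$$ are all assumptions made up for illustration.

```python
def hard_version_points(standalone_points, overlap_percent):
    """Suggest a score for the hard version of a subtask problem.

    standalone_points: what the hard version would be worth on its own.
    overlap_percent:   hypothetical estimate (0 to 100) of how much of the
                       key observation the easy version already gives away.
    """
    if not 0 <= overlap_percent <= 100:
        raise ValueError("overlap_percent must be between 0 and 100")
    # Linear discount (an assumption), done in integers to avoid float noise.
    discounted = standalone_points * (100 - overlap_percent) // 100
    # Round to the nearest multiple of 250, halves rounding up.
    return (discounted + 125) // 250 * 250


# Hypothetical numbers: a hard version worth 3000 on its own, with the
# easy version giving away about 40% of the idea.
print(hard_version_points(3000, 40))
```

With this kind of weighting, a hard version that shares little with the easy version keeps most of its standalone value instead of being capped at a small $$$y$$$.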

Lastly, score distributions aren't always accurate. They are usually decided only by the setters, the coordinator, and a small subset of the testers. You shouldn't get scared if you see a large score gap between problems. Most of the time, we setters are also surprised by how you codeforcers perform.

Comments (7)


Dominater069

20 months ago

I also wanted to make a blog about how almost every hard version problem on CF gets fewer points than it should (but I suppose it's necessary).

The last Div. 1 is another excellent example... D2 is probably harder than E yet is still worth only 1250, and F2 is also only 1250 (going from D1 to D2 and from F1 to F2 is harder than the individual tasks, for sure).


akane646

Score distribution with subtask problems... I'm also one of those who've been struggling with it.

The current scoring system gives a huge advantage to those who dive straight into the hard version and submit the same code to the easy version, but not the other way around. (I will take Codeforces Round 961 (Div. 2) and Codeforces Round 964 (Div. 4) as examples.)

About Codeforces Round 965 (Div. 2): it's so cursed with hate. I can feel it too, because I was really desperate and doubting my abilities. I could not solve it in time and was forced to give up when the contest was over. I checked the score of C again and the score gap between B and C (then again, I think the score distribution is pretty normal-ish). Then I thought, "Why can't I solve it? Why is it so hard? It's only a C problem!!"

In hindsight, I wouldn't have been so desperate if the score gap between B and C had been something like 750 to 1000, so that I could feel the fear of facing the big boss instead of doubting myself. So I'd blame the score distribution. When the gap is that big, it also shows that another, better Div. 2 C problem could have been put in between to fill it.

I shared my thoughts from the perspective of a pupil. I'm still grinding hard for rating and skills, but I'm not really desperate for it. I've been feeling the joy of it recently, so of course no hate comments even if things go a bit awry... Cope a bit and move on.

20 months ago

But hey, it's very nice of you, cry, to actually write up constructive ways to interpret the score distribution instead of just saying method A or method B is wrong and walking away.

Hope to see your contest next time in the future. I love it.

djm03178

+10

Here's a very old blog of mine, and I think I still hold this opinion. I think the score difference between problems shouldn't just be "$$$250$$$ for a small, $$$500$$$ for a medium, and $$$750$$$ for a large difficulty gap", but should also generally increase when the score itself is high. For example, a $$$500$$$-point problem and a $$$750$$$-point problem already differ by a decent amount, because solving a $$$750$$$-point problem slowly (like an hour late) will still be better than solving a $$$500$$$-point problem fast. But solving a $$$3500$$$-point problem slowly can easily be worse than solving a $$$3000$$$-point problem fast, so this $$$500$$$ score gap isn't really that huge.

Hey, that's a very good point. I'm with you also
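The arithmetic behind djm03178's comment can be checked with a small sketch. It assumes the commonly cited rule for a 2-hour Codeforces round: a problem worth $$$X$$$ points loses $$$X/250$$$ points per minute, each wrong submission costs $$$50$$$, and the score never drops below $$$30\%$$$ of $$$X$$$. The function name `score_at` and the integer rounding are my own simplifications, so treat the results as approximate.

```python
def score_at(max_points, minute, wrong_submissions=0):
    """Approximate score for solving a problem at the given minute.

    Assumes a 2-hour round: X/250 points lost per minute, 50 points per
    wrong submission, and a floor of 30% of the maximum. The real judge
    may round differently.
    """
    decayed = max_points - max_points * minute // 250 - 50 * wrong_submissions
    floor = 3 * max_points // 10
    return max(floor, decayed)


# Low scores: a slow 750 still beats an instant 500.
print(score_at(750, 60), "vs", score_at(500, 0))
# High scores: a slow 3500 loses to an instant 3000.
print(score_at(3500, 60), "vs", score_at(3000, 0))
```

Under these assumptions the decay per minute grows with the problem's value, which is why the same $$$500$$$-point gap matters less near the top of the distribution.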

electric_boogaloo


"How to interpret weather forecasts":

If the forecast says that rain is likely, it means that it's likely to rain, although it's not a guarantee that it will happen.

7etuPr0mK_X-VPA.8-ER1SYJ

For problems with subtasks, I also want to suggest that the gap between the easy version and the hard version should not be very big. It is unnecessary to have a *3000 problem with a *2200 subtask, for example. I think it's good for the subtasks to have different approaches. For example, an intended dp solution for subtask 1 and greedy for subtask 2. It's not recommended to just remove data structures/FFT to split a problem into two subtasks.

cry's blog