Who am I? I took up ACM ICPC in 2006 as a student, reached the semi-finals 5 times, trained the teams of a provincial university, ran several camps for schoolchildren from zero level upwards, and repeatedly helped organize regional competitions (ContestSfedu, aka the Championship of the South of Russia, aka the Opencup GP of Azov Sea). In recent years I have been focused on my career, so I no longer compete actively, but from time to time I still help run competitions in Taganrog.
Why am I speaking out? More and more often I catch myself thinking that competitive programming is not keeping up with a rapidly changing world, and it is becoming harder to find arguments for students to pursue it. With schoolchildren it is probably easier, but that is only a matter of time. Do the people who organize ICPC, school olympiads, etc. understand this problem? I don't know, since I'm not actively involved in the community. It is likely that everything I say below has already been discussed many times here or in the corridors of ITMO, and may already be implemented on some platform unknown to me (I did not do any research). But what if it hasn't?
**What problems do I see in competitive programming:**
Problems of polynomial complexity are finite. Every year it becomes harder to come up with original problems, especially for competitions with lower-level participants. Competitive programming is becoming more and more like chess: because the set of problems is finite, in the foreseeable future AI will handle it better than humans, and in human competitions the decisive factor will not be knowledge but the ability to extract meaning from a confusing statement and apply standard knowledge (and even libraries) under stress and time pressure. Online contests risk being dominated by cheaters. The outside world (employers, sponsors, etc.) will treat the problems being solved with ever-growing skepticism, since in real life the degree of uncertainty is an order of magnitude higher. I will add that because juries want to accept solutions in different languages, optimizations in the spirit of "speed it up 2x" or "shave off the logarithm" are practically meaningless in competitive programming today, even though in real life they have a noticeable economic effect.
Preparing high-quality problems is very expensive. Preparing a problem, even with a streamlined process, takes a huge amount of time, and therefore money. This primarily concerns test coverage, because for an organizer there is nothing worse than accepting a knowingly wrong or non-optimal solution. It is becoming harder and harder to hold local offline competitions: there are no resources to prepare original problems, while reused old ones kill objectivity and lead to a worse experience than high-quality online contests. With no local competitions, diversity shrinks, social bonds between participants do not form, and participants cannot really try themselves as authors until they reach the top level, without being hated for insufficiently polished problems.
All attempts to expand the class of problems run into the problem of objective evaluation. The jury cannot rank one heuristic above another, because there exists a set of tests on which the result would be the opposite, and the participants do not know the tests or their distribution. The only successful extension of the classical problem set has been interactive problems, but this class is also very limited in size and imposes the same requirements on the completeness of the solution and the tests.
What would be great to achieve:
Contests should be interesting. Applying standard algorithms over and over is not very interesting, so the influence of intuition and creativity needs to grow. Sponsors should care that the problems meet today's requirements; ideally they could easily bring a real problem of theirs to a contest and get new solutions and tests from the collective mind. Spectators should be engaged: the contest lasts a reasonable time, the result stays unpredictable until the very end, and ideally it can also be visualized.
There should be many contests, and holding them should not require large preparation costs. Ideally, the format should allow any provincial university or school to prepare an original competition in a week or two with the help of teachers or students of modest level.
Contests must be objective. The influence of the jury on the result should be minimized. Participants must be on an equal footing and compete primarily with each other. The random factor should be as small as possible.
What I suggest:
Remove all restrictions on the problems used at contests. NP-complete problems, problems without exact solutions, complex games, etc.: all of it can be used, because this is a huge amount of new knowledge and skills, no less useful for participants than what CP uses now.
Do not require the jury to prepare complete test sets, model solutions, or various incorrect solutions in a bunch of languages. Ideally, a statement, samples to pin down the format, and the simplest pretests to cut off obviously broken submissions (checking format compliance rather than correctness) should be enough.
During the contest, participants can upload a solution and a generator (or interactor). The solution is required; the generator is optional. When a new solution appears, it is run on all tests produced by the participants' generators. Its results are compared with the other participants' results according to some criterion, and points are redistributed. When a participant submits a new generator, their old tests are cancelled along with the points earned on them, and all solutions are run on the new tests, which again redistributes the points.
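The two submission events above can be sketched as a toy judging loop. This is only an illustration of the proposed flow, not any real judge's API; all names (`Contest`, `submit_solution`, `submit_generator`) are hypothetical, and a "solution" is modeled as a callable that maps a test to a score.

```python
class Contest:
    """Toy model of the proposed judging flow (all names hypothetical).

    - A new solution is run on every test currently in the system.
    - A new generator from a participant cancels that participant's old
      tests (and the results earned on them), and every solution is
      re-run on the new tests.
    """
    def __init__(self):
        self.solutions = {}  # participant -> callable(test) -> score
        self.tests = {}      # owner -> list of tests from the owner's generator
        self.results = {}    # (test_owner, test_index, participant) -> score

    def submit_solution(self, participant, solution):
        self.solutions[participant] = solution
        # run the new solution on all existing tests
        for owner, tests in self.tests.items():
            for i, t in enumerate(tests):
                self.results[(owner, i, participant)] = solution(t)

    def submit_generator(self, owner, generator):
        # cancel the owner's old tests along with the results on them
        self.results = {k: v for k, v in self.results.items() if k[0] != owner}
        self.tests[owner] = list(generator())
        # re-run every solution on the owner's new tests
        for i, t in enumerate(self.tests[owner]):
            for p, sol in self.solutions.items():
                self.results[(owner, i, p)] = sol(t)
```

Point redistribution would then be recomputed from `results` after every event, which is where the server-load concerns raised in the comments come from.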
That's all :) In my opinion, it is precisely such a simple scheme that can work, just as the ACM ICPC scheme once did, where a plus is awarded regardless of the difficulty of the problem.
How to count results?
The scheme most similar to ACM ICPC is to give every problem the same price in points and distribute those points proportionally among the participants. For concreteness, suppose 10 participants solved a problem (that is, their solutions passed the jury's pretests, which exist to cut off complete trash and save system resources), and only 5 of them submitted generators, each producing 20 tests. Then there are 100 tests in the system, on each of which we run all 10 solutions, i.e. each test is worth 1 percent. Suppose on one test a participant found a better solution than everyone else: he receives 1% for it. On another test all 10 participants found the same answer: they get 0.1% each. If two participants beat the other 8, they receive 0.5% each, and so on. Obviously, the points always sum to 100%. Thanks to the mix of large and small tests, weaker participants can still claim points until the very end of the competition. Degenerate cases like "one participant passed the pretests and no one uploaded a generator" do not break the scoring either: he simply gets 100% for the problem.
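The percentage arithmetic above can be written down in a few lines. A minimal sketch, assuming a hypothetical `results` structure mapping each test to each participant's score on it, with higher meaning better:

```python
from collections import defaultdict

def distribute_points(results):
    """Split each test's equal share of 100% among the participants whose
    solution achieved the best result on that test (ties split the share).

    results: dict test_id -> dict participant -> score (higher is better).
    Returns: dict participant -> total percentage for the problem.
    """
    totals = defaultdict(float)
    if not results:
        return dict(totals)
    share_per_test = 100.0 / len(results)  # every test is worth the same
    for scores in results.values():
        best = max(scores.values())
        winners = [p for p, s in scores.items() if s == best]
        for p in winners:
            totals[p] += share_per_test / len(winners)
    return dict(totals)
```

With 100 tests, a sole winner on a test gets 1%, a ten-way tie gives 0.1% each, and the totals always sum to 100%, matching the example in the text.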
Now a few details on why I think this scheme allows an objective result to be computed for any problem:
To begin with, apply the scheme to existing ACM problems that have an exact solution. Obviously, the participants will be ranked much as they are in regular contests, except when some mistakes are not covered by any tests. Well... that becomes part of tactics: is it worth spending time on a generator to take points from rivals, or better to solve other problems? But now no one can accuse the jury of incomplete test coverage, and each team will want a person with strong testing skills (this format would be most interesting in team competitions).
Now apply the scheme to some NP-complete problem or a problem without an exact solution. Obviously, the system's output will be a pile of tests against different heuristics, and the ranking will be partly random. But it will no longer be possible to accuse the jury of favoring a particular heuristic, because you yourself could have written a generator that breaks your rivals. Will this pile of heuristics have any value? Certainly! Taken together, they form a program that covers them all, within a reasonable execution time. The higher the level of the contestants, the better both the solutions and the tests will be.
Now apply the scheme to some game, for example checkers. On one side there is a solution (interactor) making moves for white, on the other an opponent's solution making moves for black. Obviously, running everyone against everyone produces a completely objective results table, regardless of whether the competition is among junior students or top participants. And although in neither case will the result be comparable to specialized engines, that does not spoil the contest, and to prepare it you only need to describe the rules of the game and the data format and write a checker that validates moves.
A few small nuances that will complement the overall picture:
A solution run repeatedly on a fixed set of tests (and likewise a generator) must produce the same result, i.e. the participant must use a fixed seed for any randomization. Otherwise the results become unreproducible, and therefore not objective.
Participants must be pushed to create not only large tests in their generators but also small ones. The easiest way is to require the generator to produce all its tests at once and to describe the criteria for them in the statement: for example, require that some of them be small.
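A generator obeying both requirements (a fixed seed for reproducibility, and the whole test set produced at once with a mandated share of small tests) might look like this sketch. The exact split and limits are invented for illustration, not taken from any real statement:

```python
import random

def generate_tests(seed=12345):
    """Produce the complete test set in one deterministic run.

    Assumed statement requirement: at least half of the tests are
    'small' (n <= 10); the rest may go up to the full size limit.
    """
    rng = random.Random(seed)  # fixed seed => identical tests on every rerun
    tests = []
    for _ in range(10):        # small tests keep weaker solutions in the game
        n = rng.randint(1, 10)
        tests.append([rng.randint(1, 100) for _ in range(n)])
    for _ in range(10):        # large tests stress the heuristics
        n = rng.randint(10_000, 100_000)
        tests.append([rng.randint(1, 10**9) for _ in range(n)])
    return tests
```

Because the seed is fixed, the judge can rerun the generator at any time and get byte-identical tests, which is exactly the reproducibility requirement above.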
So that the scoreboard can be shown live and participants do not overwhelm the judging system with endless retesting, the number of generator submissions, or the time between them, should be limited, at least until the scoreboard is frozen. After the freeze, testing can obviously be selective (for example, only the last attempts).
Online contests must use room splitting, otherwise the number of tests will be too large. The final result can then be computed among the top participants from the rooms. For offline events, each participant generating 10-20 tests will not be a big problem, because the number of problems per contest will also shrink due to the need to write generators.
It is necessary to think about how to account for submission time, and whether it should be accounted for at all. Obviously, under such rules participants can never completely stop working on a problem: at any moment tests may appear on which their solution loses effectiveness, or an idea may appear for breaking other people's solutions. Most likely a huge number of tactics for spending contest time will emerge, which will definitely add to the entertainment.
Whether to show tests to participants, whether to give bonuses for the "first solution" or a "good generator", with what precision to compare results, whether to account for time and memory, whether to limit multithreading, whether to require the sources to fit in one file or to allow executing anything in an isolated container: all of these are minor details that do not change the overall picture.
What is required to get started?
Codeforces would need to be extended, and given the existing infrastructure of hacks, educational rounds, Polygon, etc., the work does not look that big. It is quite possible that the existing API already allows such a contest to be held, and everything comes down to visualizing the results.
In general, if you like the idea, upvote and join the discussion: maybe it will interest other organizers. Without mass adoption it is of no interest, which is why I am writing here. If the idea is old, share a link to an implementation; I am especially interested in the ability to hold my own contests, trainings, etc. Clearly there are hundreds of contests in the world with inexact solutions, but I am talking about reforming classical competitive programming, where traditions are long established.
You're pushing a lot of the burden of preparation onto contestants... and onto CF. There's a reason why contests with non-exact solutions tend to run for much longer — it's a potentially unlimited amount of work for potentially no payoff.
You forgot to add: if you have the time. Besides, the concern was never about accusing the jury of preferring some heuristic, but about the lack of a clear specification that would make such concerns irrelevant. What if some specific heuristic passes because nobody made tests against it? What if there are all kinds of tests, but very few of a particular kind? The jury will have to assign them weights (equal weights also count), and that introduces the bias you want to avoid. If there are 2 tests that give 1% to one solution and 1 test that gives 1% to a different solution, is it fair to get more points just because there are more tests? Will the jury manually filter that? More bias!
We've already had problems without exact general solutions, but with an exact (condition-based not score-based) checker + guarantees like data being generated randomly. They have hacks disabled. I think that's a much better approach.
However, you're free to make a site hosting such contests yourself to try your idea in practice. There's an alarming dropoff in the number of CF-style programming contest sites recently, definitely a gap you could fill.
The idea is to put competitors in equal conditions, give them simple rules and... watch what happens. It is the same idea as ICPC in terms of simplicity. When many people start solving difficult problems and making tests for them within the very short time of a contest, we don't know what will happen, but the system will surely balance itself. When CP started, no one could imagine that competitors would solve some problems in 5-10 minutes. I remember the days when giving a suffix array task meant 0 accepted attempts on the scoreboard, and now it's a standard school task. Surely one day the same epic improvement of skill will happen in difficult problems and test-making: 5 minutes from the start of the contest and you'll see that top teams have already posted randomized solutions and tests better than some professionals would make in a week.
This is an english translation of my russian comment I made a few minutes ago, sorry for the duplicate, didn't see this blog has an english translation.
In my opinion, expanding the list of topics would not help at all: it would simply mean somewhat more of the so-called "standard knowledge", that's all.
It seems to me that the optimization of removing a logarithm comes in handy quite often. And generally speaking, competitive programming is primarily a competition in coming up with algorithms, not in writing them. I see nothing wrong with not forcing participants to optimize the constant factor of a solution. This is especially true at school olympiads: schoolchildren may simply not have the knowledge to optimize a solution, say, by improving cache locality or applying processor-level optimizations.
And how do you propose to solve this problem? The lack of local competitions is a problem of organization, not of the olympiad format. As for less experienced participants not being able to try themselves as problem authors: this is absolutely normal. It is difficult (not impossible, but difficult) to come up with a problem if you are not experienced enough in solving them. It is like composing a chess study when you are bad at chess.
At the Belarusian olympiads, for example, there are problems with open tests: 10 tests are given, and one must generate the best possible answer using various heuristics and approximate algorithms.
I can propose contests consisting only of ad-hoc and constructive problems. There will definitely be no standard algorithms.
And how do you achieve this? The jury's main influence on the result is the choice of the problem set, and the topics and difficulty of the problems in that set are the key influence on the result.
Author, have you really not seen NP-complete problems at contests? They've been around for a long time. Regarding heuristics: for one local Belarusian contest I set a problem where the traveling salesman had to be solved with the so-called tree algorithm (not directly, of course: through a series of steps the problem reduced to it). As is well known, this algorithm produces an answer at most twice the optimum. So how did I get around the inexactness? Very simply: I guaranteed that every test is an unweighted graph with a Hamiltonian path. Did the problem become worse? Of course not, and it fit into the normal grading system.
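The tree algorithm mentioned here is presumably the classic MST-based 2-approximation for metric TSP: build a minimum spanning tree, then visit the vertices in preorder; the triangle inequality bounds the walk's cost by twice the optimum. A sketch under that assumption, taking a symmetric distance matrix:

```python
import heapq

def tsp_tree_approximation(dist):
    """2-approximation for metric TSP: the preorder walk of a minimum
    spanning tree costs at most twice the optimal tour.
    dist: symmetric matrix with dist[i][j] = distance, triangle
    inequality assumed to hold."""
    n = len(dist)
    # Prim's algorithm for the MST, rooted at vertex 0
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    heap = [(0, 0, 0)]  # (edge weight, vertex, parent)
    while heap:
        w, v, p = heapq.heappop(heap)
        if in_tree[v]:
            continue
        in_tree[v] = True
        if v != 0:
            children[p].append(v)
        for u in range(n):
            if not in_tree[u]:
                heapq.heappush(heap, (dist[v][u], u, v))
    # preorder (depth-first) walk of the MST gives the tour order
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour
```

With the guarantee from the comment (an unweighted graph that has a Hamiltonian path), the reduced instance becomes exact enough to grade with a standard checker.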
So, it is proposed to have a not-very-long contest with super-unusual problems, and on top of having little time per problem, participants must also... generate tests?! Well, I see a number of problems here, of course. One of the first: strong participants have no incentive to generate tests, while weak participants are unlikely to do it well. Another: sometimes generating strong tests is no easier than solving the problem, and if nobody does it, a problem that had weak tests will keep them.
"It is necessary to force", yet the generator is optional...
So, judging from the text, you propose the following: turn competitive programming into a sort of Huawei challenge, where problems (heuristic and high-quality!) can be brought by anyone, even inexperienced setters, tests are not prepared in advance, and the participants themselves must make tests during the competition, and on top of that, after each new test all solutions are retested (which would put an incredible load on the servers if done continuously). As a format it is quite interesting, but it can hardly be a serious competition. If you really want, you can hold an olympiad closer to real-world tasks: compute something multi-threaded, for example, or develop a server, and so on. But first, that would definitely not be for schoolchildren, and second, it would hardly require knowledge of algorithms.
In general, I think your idea could work, of course. You could hold several olympiads in this format, and frankly, it would be incredibly interesting. But I don't at all want all competitions to become like that.
Too many objections boil down to "don't change anything". I'm not a hater of classical competitive programming, so you don't need to explain the benefits of the current format to me. I spent 15+ years as a contestant/coach/author and I love it :) But I see big problems with it, especially in the mass market. Many students now don't want to compete after finishing school, because they are sure they've already gotten all the benefits from CP and it's better to try something else. My students prefer not to start competing at all, because they have no chance of global success without a school background and almost no local competitions around. Don't you see that Russian teams win almost all World Finals? Is that because they are the best of the best, or because nowadays many great students in other countries prefer another path for their self-improvement? Without big changes, CP will collapse into a closed community of fans who solve the same problems year after year, and people outside will completely lose interest in it, because around CP there will be more agile and lively communities.