whynesspower's blog

By whynesspower, history, 14 months ago, In English

The primary concern of this post:

Prevent bots from scraping data from Codeforces, since the scraped data will make AI tools more powerful and harm Codeforces in the long term.

OpenAI has just started offering a service for customising large language models on your own data.

Why is this a problem now?

Codeforces has a very rich database of community-driven problems (approx. 10,000). You can now easily feed a large amount of Codeforces data to ChatGPT and make it permanently learn the material, which will enhance its existing ability to solve algorithmic questions. Codeforces has a large set of both problems and their respective tutorials.

(My opinion) Chances are that when ChatGPT was being created, it was already fed Codeforces data once, which is why the model can write code that solves Codeforces questions. But it was not custom-trained SPECIFICALLY for this, which is now possible.

Anyone in the world can now scrape all of Codeforces (a relatively easy task) and needs just $200 in GPT credits to custom-train a model and build a new service or product that can solve even the most difficult Codeforces problems in no time.

How is LeetCode fighting this?

  1. (Just last Saturday) They implemented Cloudflare's anti-scraping protection on their website, which makes it very difficult to scrape data with automated scripts built on tools such as Selenium or Beautiful Soup.

I propose:

  1. Adding a service to avoid data scraping.
  2. Adding CAPTCHAs wherever possible.


By whynesspower, history, 14 months ago, In English

If you are a problem setter: reverse-test your questions to check whether they can be solved by any LLM.

Especially ChatGPT with GPT-3.5 Turbo (the default free model as of Oct 2023).

It's unfortunate but true: over the past year, standard competitive programming has taken a blow because of malpractice, namely heavy cheating and plagiarism.

If you have a high-rated profile, it might not have affected you (not yet), but for entry-level ranked people (like myself) the ranking system is getting more and more unjust.

I propose:

  1. A guideline for problem setters which requires that every question be thoroughly reverse-checked: if ChatGPT is able to solve it, rephrase the question until it no longer can.

  2. Systems must be developed to check whether submitted code has been copied from an LLM.


By whynesspower, history, 15 months ago, In English

Domain Sharding: This technique involves spreading your assets across multiple subdomains of your main domain (e.g., assets1.example.com, assets2.example.com). Be cautious with domain sharding, as it can also increase DNS lookup times if not managed properly.
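To make the idea concrete, here is a minimal sketch of how assets could be assigned to shards. It is only an illustration: the `SHARDS` hostnames and the `shard_url` helper are hypothetical names I made up, and I have no idea whether Codeforces does anything like this. The one design point worth noting is that the mapping is deterministic, so the same file always comes from the same subdomain and browser caching keeps working.

```python
import zlib

# Hypothetical shard hostnames, used purely for illustration.
SHARDS = ["assets1.example.com", "assets2.example.com"]


def shard_url(path: str, shards=SHARDS) -> str:
    """Return a URL for `path` on a shard chosen deterministically from the path."""
    # Normalise first so "/css/site.css" and "css/site.css" map to the same shard.
    normalized = path.lstrip("/")
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(normalized.encode("utf-8")) % len(shards)
    return f"https://{shards[index]}/{normalized}"


if __name__ == "__main__":
    for asset in ["css/site.css", "js/app.js", "img/logo.png"]:
        print(shard_url(asset))
```

Every extra shard costs an extra DNS lookup and connection setup, which is the trade-off mentioned above, so the shard list is usually kept short.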

During high-traffic contests, I have seen Codeforces suggest using different domains; sometimes it even redirects to domains other than codeforces.com.

Just curious, why is domain sharding used on Codeforces?

My simple guess is that maybe they don't want to vertically scale their servers, and hence they are scaling horizontally by distributing the load across totally different domains. But it could also be because they want to get around the browser's limit on the number of simultaneous connections to a particular website. I am just curious.

Thanks & Regards
