[URGENT] Codeforces needs anti-scraping

Revision en6, by whynesspower, 2023-10-30 05:19:15

The primary concern of this post:

Prevent bots from scraping data from Codeforces, as the scraped data will make AI tools more powerful and harm Codeforces in the long term.

OpenAI just started providing a service to customise Large Language Models on your own custom data.

Why is this a problem now?

Codeforces has a very rich database of community-driven questions (approx. 10,000). You can now easily feed a large amount of Codeforces data to ChatGPT and make it permanently learn the material, which will enhance its existing ability to solve algorithmic questions. Codeforces has a large set of both questions and their respective tutorials.

(My opinion) Chances are that when ChatGPT was being created, it was already fed Codeforces data once, which lets the model write code that can solve Codeforces questions. But it was not custom-trained SPECIFICALLY for this, which is now possible.

Anyone in the world can now scrape the entire Codeforces site (a relatively easy task) and needs just $200 in GPT credits to custom-train a model and build a new service or product that can solve even the most difficult Codeforces problems in no time.

How is LeetCode fighting this?

  1. (Just last Saturday) They implemented Cloudflare's anti-scraping on their website, which makes it very difficult to scrape data with automated scripts built on tools like Selenium or Beautiful Soup.

I propose:

  1. Adding a service to avoid data scraping.
  2. Adding CAPTCHA wherever possible.
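
To make the first proposal a bit more concrete, here is a minimal sketch of one common building block of anti-scraping services: a per-client sliding-window rate limiter. This is only an illustration, not how Codeforces, LeetCode, or Cloudflare actually work. The class name `SlidingWindowLimiter`, the `client_id` key (for example an IP address or a session id), and the limits used are all assumptions made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `max_requests` per client
    within any `window_seconds` interval."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client_id -> request timestamps

    def allow(self, client_id):
        """Return True if this request is within the limit, else False."""
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests: block or show a CAPTCHA
        hits.append(now)
        return True


if __name__ == "__main__":
    # Assumed limit for illustration: 3 page loads per 10 seconds per client.
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
    for i in range(5):
        print(i, limiter.allow("203.0.113.7"))
```

A rejected request does not have to be dropped outright; it could instead trigger the CAPTCHA from the second proposal, so that ordinary users who browse quickly are slowed down rather than locked out.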

Tags: chatgpt, admin

History

Revisions

Rev.  Language  Who           When                 Δ     Comment
en6   English   whynesspower  2023-10-30 05:19:15  135
en5   English   whynesspower  2023-10-30 05:17:29  4
en4   English   whynesspower  2023-10-29 08:30:03  49
en3   English   whynesspower  2023-10-29 08:27:02  226
en2   English   whynesspower  2023-10-29 08:26:34  33    Tiny change: '1. Adding CloudFlare or similar service t' -> '1. Adding a service t'
en1   English   whynesspower  2023-10-29 08:24:50  1729  Initial revision (published)