SHADOWEEN's blog

By SHADOWEEN, history, 6 weeks ago, In English

I usually don't like writing negative blogs, but after seeing some of the comments on recent posts, I feel we need to talk about the direction our community is heading.

It started when I noticed user monkey_is_back replying to me with an offensive picture. Curious, I decided to check his other comments to see if it was a one-off mistake. Unfortunately, I found a pattern of pure toxicity and racism that I think crosses every line a competitive programmer should respect:

  1. Targeting a newcomer with racism: "Hello, bloody indian. I am from your father country."
  2. Abusing someone who posted a solution to a problem in the hope that it would help others: "Stop posting this shit you bloody sun of a beach"
  3. Insulting another user directly in a joke thread: "Because alice is a hoe like GHOUS1425."
  4. Using racial slurs while mocking someone's achievement: "Yo N1GGA, how you managed to get -ve rating bruh? Big congo btw"

We are all here to learn, compete, and grow. Comments like these don't just break the rules; they actively discourage beginners and create a hostile environment for everyone. It’s the opposite of sportsmanship.

I urge monkey_is_back to reflect on this and stop. I also want to ask the community and the admins (MikeMirzayanov, Vladosiya): What can we do to prevent this?

Should there be stricter penalties for first-time hate speech? Is there a better way to report such users quickly? I believe we can all agree this is unacceptable, but I’d love to hear your thoughts on how to fix it.

UPD: The same user has returned with a new handle: doantunglamdatrang, and is now posting even more offensive comments. This pattern of toxicity continues.

Solutions (from comments)

  • IP-based bans: Instead of just banning accounts, consider IP bans after a certain number of reports (e.g., $n$ reports) to prevent users from simply creating new accounts.
  • Keyword-based harassment detection: Implement a simple filter that flags comments containing common offensive words or racial slurs.
  • NLP-based harassment detection: Use a more advanced natural language processing model that can understand context and detect harassment even without explicit keywords.
  • Comment restrictions for new/unrated users: Prevent unrated users with negative contribution from commenting until they have participated in at least one or two rated rounds. Exceptions could be made for special "announcement only" accounts like MidnightCodeCup, JetBrains, etc.
  • Comment restrictions for low-contribution users: Comments from low-contribution users would first be reviewed by moderators and only then posted.
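
To make the keyword-based suggestion above more concrete, here is a minimal sketch of what such a filter could look like. Everything in it is an assumption for illustration: the `BLOCKLIST` entries are harmless placeholders, `LEET_MAP` covers only a few digit and symbol substitutions, and `flag_comment` is a hypothetical helper, not anything Codeforces actually runs. The normalization step matters because offensive words are often written with digits in place of letters to dodge simple filters.

```python
import re

# Placeholder blocklist (assumption): a real filter would use a maintained list.
BLOCKLIST = {"idiot", "stupid"}

# Undo a few common character substitutions, e.g. "1" -> "i", "0" -> "o".
LEET_MAP = str.maketrans(
    {"1": "i", "0": "o", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase the text and map look-alike characters back to letters."""
    return text.lower().translate(LEET_MAP)


def flag_comment(text: str, blocklist=BLOCKLIST) -> list[str]:
    """Return the sorted blocklisted words found in the comment (empty if none)."""
    words = re.findall(r"[a-z]+", normalize(text))
    return sorted(set(words) & set(blocklist))
```

A comment with a non-empty result would only be flagged for moderator review, not removed automatically, since a plain word list cannot understand context and will produce false positives.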


By SHADOWEEN, history, 6 weeks ago, In English

Hi everyone,

I'm working on a personal RAG-based search tool to quickly find problems, editorials, and relevant blog posts (for example: "problems on DP with bitmask that have a detailed explanation").

I know how to scrape the data manually, but honestly, I'd really prefer not to. Why?

  • I don't want to put unnecessary load on Codeforces servers.
  • I'm afraid of getting banned even with polite delays.
  • It feels wasteful to scrape the same data that someone might already have collected.

So I'm wondering — does a dataset of Codeforces problems, editorial content, and educational blog posts already exist? Something like an archive or a collection that the community maintains?

If not — what would be the most respectful way to collect this kind of data without causing any trouble? (I'm okay with waiting, using APIs where possible, or joining some existing effort.)
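
To show what I mean by polite API usage, here is a minimal sketch. It assumes the public Codeforces API method `problemset.problems` with its `tags` parameter and the usual `status` / `result` / `comment` response fields. The 2-second spacing in `MIN_INTERVAL_SECONDS` is my own assumption, so check the API help page for the actual limit. `PoliteClient`, `build_url`, and `http_get_json` are hypothetical names for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

API_BASE = "https://codeforces.com/api"
MIN_INTERVAL_SECONDS = 2.0  # assumption: verify against the official API help


def build_url(method, **params):
    """Build the request URL for an API method with optional query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{API_BASE}/{method}" + (f"?{query}" if query else "")


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


class PoliteClient:
    """Spaces consecutive API calls at least `min_interval` seconds apart."""

    def __init__(self, fetch=http_get_json, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def call(self, method, **params):
        # Wait out the remainder of the interval since the previous request.
        if self._last is not None:
            wait = self._min_interval - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        payload = self._fetch(build_url(method, **params))
        if payload.get("status") != "OK":
            raise RuntimeError(payload.get("comment", "API call failed"))
        return payload["result"]
```

Usage would be something like `PoliteClient().call("problemset.problems", tags="dp;bitmasks")`, with the result cached locally so the same request is never repeated. As far as I can tell, this only covers problem metadata, so it does not answer the question about editorials and blog posts.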

Thanks in advance!
