vn4k's blog

By vn4k, history, 2 months ago, translation, In English

Hello, Codeforces.

I've participated in a few rounds and noticed that there are too many cheaters. Right now cheater detection is community-driven, and only a few of the cheaters get caught.

Idea

I’m proposing Codeforces Anti‑Cheat (CFAC) – an automated flagging system that runs after each contest and flags likely cheaters using:

NLP-model-based checking of submissions (and maybe replacements)

Timing-based detection: if a gray participant solves a Div. 2 E in 3 minutes, it's suspicious
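To make the timing idea concrete, here is a minimal sketch (not the actual CFAC code). It assumes we already have the solve times of similarly rated participants on the same problem; the function name `timing_suspicion`, the use of the median, and the [0, 1] output range are all my assumptions for illustration.

```python
from statistics import median


def timing_suspicion(solve_minutes, peer_minutes):
    """Hypothetical timing signal: 0.0 = nothing unusual, close to 1.0 = far faster than peers.

    solve_minutes: how long this participant took on the problem.
    peer_minutes:  solve times of similarly rated participants on the same problem.
    """
    if not peer_minutes:
        return 0.0  # no peers to compare against, so no evidence either way
    typical = median(peer_minutes)
    if typical <= 0:
        return 0.0
    ratio = solve_minutes / typical
    # Being slower than the median is not evidence of cheating, so clamp at 0.
    return max(0.0, min(1.0, 1.0 - ratio))


# A gray participant solving in 3 minutes what peers solve in about 55.
print(timing_suspicion(3, [40, 55, 60]))
```

A fast solve alone should never be treated as proof: it is only one signal and would have to be combined with others.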

All of these metrics are combined into a suspicion score matrix, where score[u][p] is a value normalized to [-1, 1]:

— -1 if participant $$$u$$$ is definitely not cheating on problem $$$p$$$;

— 1 if participant $$$u$$$ is definitely cheating on problem $$$p$$$.
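One possible way to build such a matrix is a weighted average of the per-signal scores. This is only a sketch under my own assumptions: the signal names (`nlp`, `timing`), the weights, and the helpers `combine` and `build_matrix` are hypothetical, and every signal is assumed to be already scaled to [-1, 1].

```python
def combine(signals, weights):
    """Weighted average of per-signal scores, clipped to [-1, 1]."""
    total = sum(weights[name] for name in signals)
    if total == 0:
        return 0.0  # no usable signal: stay neutral
    value = sum(weights[name] * signals[name] for name in signals) / total
    return max(-1.0, min(1.0, value))


def build_matrix(per_signal, weights):
    """per_signal maps (user, problem) -> {signal name: score in [-1, 1]}."""
    score = {}
    for (u, p), signals in per_signal.items():
        score.setdefault(u, {})[p] = combine(signals, weights)
    return score


# Hypothetical weights and data, for illustration only.
weights = {"nlp": 3.0, "timing": 1.0}
per_signal = {
    ("alice", "E"): {"nlp": 0.8, "timing": 0.4},
    ("bob", "A"): {"nlp": -1.0, "timing": -1.0},
}
score = build_matrix(per_signal, weights)
print(score["alice"]["E"], score["bob"]["A"])
```

A weighted average keeps the result inside [-1, 1] as long as the inputs are, so the clipping only guards against out-of-range inputs.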

Need help

I need help with:

— collecting labelled data of cheaters' code

— final testing of the anti-cheat system

My review of my NLP-based model

It works pretty well, but it can only detect heavily LLM-generated submissions like these:

Submission 1
Submission 2

Why isn't it working well?

  • because my AI-generated samples were very easy to detect

  • because some LLM-ish traits can be too difficult to detect using only CodeBERT-generated embeddings

As a solution, I will start everything from scratch so that my model detects more AI hallmarks that are hard to see through embeddings.

Updates

  • Created the cfac repo on GitHub
  • Rewrote the post text without AI, in response to the hate comments about AI slop and to pilliamw's blog post
  • Major update: (finally) trained a model for classifying cheaters vs. non-cheaters (changes not pushed to the repo yet)
