Enough Is Enough: A Concrete Plan to Tackle Cheating on Codeforces

Hello, Codeforces.

I've participated in a few rounds and noticed that there are **too many cheaters**.
Right now cheater detection is community-driven, and only a few cheaters get caught.

###Idea
I'm proposing Codeforces Anti-Cheat (CFAC): an automated flagging system that runs after each contest and automatically detects cheaters using:

- **NLP-model-based submission (and maybe replacement) checking**

- **Timing-based detection: if a gray solves a Div. 2 E in 3 minutes, it's suspicious**

All of these metrics are combined into a suspicion score matrix, where score[u][p] is a value normalized to [-1, 1]:

- -1 means participant $u$ is certainly not cheating on problem $p$;

- 1 means participant $u$ is certainly cheating on problem $p$.
↵
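One way the two signals could be folded into score[u][p] is a weighted average of per-signal values that each already lie in [-1, 1]. The sketch below is only an illustration under assumptions: the `Solve` record, the `expected_minutes` baseline, the linear timing mapping, and the equal weights are all hypothetical and are not taken from the CFAC repo.

```python
from dataclasses import dataclass


@dataclass
class Solve:
    user: str
    problem: str
    minutes: float           # minutes from contest start to the accepted submission
    expected_minutes: float  # hypothetical baseline for this rating band and problem
    model_prob: float        # hypothetical classifier output in [0, 1]


def clamp(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))


def timing_signal(minutes: float, expected_minutes: float) -> float:
    # Assumed linear mapping: instant solve -> 1, half the baseline -> 0,
    # at or beyond the baseline -> -1.
    return clamp(1.0 - 2.0 * minutes / expected_minutes)


def model_signal(prob: float) -> float:
    # Rescale a probability in [0, 1] to [-1, 1].
    return clamp(2.0 * prob - 1.0)


def build_score_matrix(solves, w_model: float = 0.5, w_time: float = 0.5):
    """Return score[user][problem] in [-1, 1] as a weighted average of both signals."""
    total = w_model + w_time
    score = {}
    for s in solves:
        combined = (
            w_model * model_signal(s.model_prob)
            + w_time * timing_signal(s.minutes, s.expected_minutes)
        ) / total
        score.setdefault(s.user, {})[s.problem] = combined
    return score


if __name__ == "__main__":
    solves = [
        Solve("gray_user", "E", minutes=3, expected_minutes=90, model_prob=0.9),
        Solve("other_user", "A", minutes=12, expected_minutes=10, model_prob=0.1),
    ]
    print(build_score_matrix(solves))
```

With equal weights, the gray who gets E accepted in 3 minutes against a 90-minute baseline lands at about 0.87, while the slow solve with a low model probability lands at about -0.9. A real system would probably need calibrated weights rather than a plain average.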
### Need help
I need help with:

- collecting labelled examples of cheaters' code

- final testing of the anti-cheat system

### My review of my NLP-based model
It works pretty well, but it can only detect submissions that are obviously LLM-written, like these:

<spoiler summary="Submission 1">
```
import sys

def solve() -> None:
    it = iter(sys.stdin.read().strip().split())
    t = int(next(it))
    out_lines = []
    for _ in range(t):
        n = int(next(it))
        q = int(next(it))
        a = [int(next(it)) for _ in range(n)]
        b = [int(next(it)) for _ in range(n)]
        # c[i] = max(a[i], b[i])
        c = [max(ai, bi) for ai, bi in zip(a, b)]
        # suffix maxima M[i] = max_{j>=i} c[j]
        M = [0] * n
        M[-1] = c[-1]
        for i in range(n-2, -1, -1):
            M[i] = max(c[i], M[i+1])
        # prefix sums of M
        pref = [0] * (n + 1)
        for i in range(n):
            pref[i+1] = pref[i] + M[i]
        # answer queries
        ans = []
        for __ in range(q):
            l = int(next(it))
            r = int(next(it))
            ans.append(str(pref[r] - pref[l-1]))
        out_lines.append(" ".join(ans))
    sys.stdout.write("\n".join(out_lines))

if __name__ == "__main__":
    solve()
```
</spoiler>

<spoiler summary="Submission 2">
```
import sys

# Function to calculate the sum of digits of a number
def get_digit_sum(n):
    s = 0
    while n > 0:
        s += n % 10
        n //= 10
    return s

def solve():
    # Read all input from standard input
    input_data = sys.stdin.read().split()

    if not input_data:
        return

    iterator = iter(input_data)
    try:
        # First token is the number of test cases
        t = int(next(iterator))
    except StopIteration:
        return

    results = []

    for _ in range(t):
        try:
            x = int(next(iterator))
        except StopIteration:
            break

        count = 0
        # We are looking for y such that y - d(y) = x.
        # This can be rewritten as y = x + d(y).
        # Let s = d(y). Then y = x + s.
        # We need to check if d(x + s) == s.
        # Since x <= 10^9, y is roughly 10^9.
        # The maximum sum of digits for a number <= 10^9 + 100 is 81 (for 999,999,999).
        # Thus, s will not exceed 90. We iterate s from 1 to 100 to be safe.

        for s in range(1, 100):
            y = x + s
            if get_digit_sum(y) == s:
                count += 1

        results.append(str(count))

    # Print all results separated by newlines
    print('\n'.join(results))

if __name__ == '__main__':
    solve()
```
</spoiler>
↵
**Why isn't it working well?**

- because my AI-generated training samples were very easy to detect

- because some LLM-style traits are too difficult to detect using only CodeBERT-generated embeddings

As a solution, I will start everything from scratch so that my model detects more AI landmarks that are hard to see through embeddings.
↵
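For the landmarks that embeddings miss, one option is to compute a few explicit style features per submission and feed them to the classifier alongside the embeddings. Below is a rough sketch for Python submissions using only the standard library. The feature set (comment density, annotated functions, `__main__` guard) is an assumption chosen for illustration, based on what stands out in the two submissions shown earlier; it is not the feature set CFAC actually uses.

```python
from __future__ import annotations

import ast
import io
import tokenize


def style_features(source: str) -> dict[str, float]:
    """Hypothetical hand-picked style features for one Python submission."""
    tree = ast.parse(source)
    non_blank = [ln for ln in source.splitlines() if ln.strip()]

    # Use the tokenizer so that '#' inside string literals is not counted.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    n_comments = sum(1 for tok in tokens if tok.type == tokenize.COMMENT)

    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    annotated = [
        f for f in funcs
        if f.returns is not None or any(a.annotation is not None for a in f.args.args)
    ]

    # Top-level `if __name__ == ...:` guard.
    has_main_guard = any(
        isinstance(n, ast.If)
        and isinstance(n.test, ast.Compare)
        and isinstance(n.test.left, ast.Name)
        and n.test.left.id == "__name__"
        for n in tree.body
    )

    return {
        # Comment tokens (full-line and inline) per non-blank line.
        "comment_ratio": n_comments / len(non_blank) if non_blank else 0.0,
        # Share of functions with a return or parameter annotation.
        "annotated_func_ratio": len(annotated) / len(funcs) if funcs else 0.0,
        "has_main_guard": float(has_main_guard),
    }


if __name__ == "__main__":
    demo = "def f(x):\n    return x  # identity\n"
    print(style_features(demo))
```

None of these features proves anything alone, since plenty of honest participants write tidy, commented code. They would only make sense as extra inputs next to the timing signal and the embedding-based model.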
###Updates
- Created the cfac [repo on GitHub](https://github.com/vn4ka/cfac)
- Rewrote the post text without AI, addressing the hate comments about AI slop and the [user:pilliamw,2026-03-24] blog post
- **Major update**: (finally) trained a model for classifying cheaters vs. non-cheaters (changes not pushed to the repo yet)

Posted by vn4k on 2026-03-22; last revised 2026-03-26.
