Hello Codeforces!
I want to share an open-source project I've been building: a multi-agent AI pipeline that generates complete, Polygon-ready competitive programming problems from scratch — statement, validator, checker, solutions (ACC, TLE, WA), test generator, and interactor — all from a single command.
→ GitHub: https://github.com/7oSkaaa/polygon-problems-generator
The Problem
Setting a proper problem for a contest takes a lot of time, even when you know the idea:
- Writing a clean LaTeX statement with the right style
- Writing a bulletproof testlib.h validator
- Choosing or writing the right checker
- Writing an accepted solution, a TLE brute force, and a WA solution for stress testing
- Writing a test generator with a complete FreeMarker script
- Reviewing everything against Polygon conventions
That's hours of work per problem. I wanted to automate all of it.
How It Works
You describe the problem idea once. A pipeline of specialised sub-agents — each with its own isolated context and strict instructions — produces every file needed to upload the problem to Polygon.
Each agent owns exactly one concern:
| Agent | Produces |
|---|---|
| statement-agent | statement.tex: Polygon-ready LaTeX statement |
| statement-agent | tutorial.tex: Polygon-ready LaTeX editorial |
| validator-agent | validator.cpp: testlib.h input validator |
| checker-agent | checker.cpp: standard or custom checker |
| interactor-agent | interactor.cpp: interactor for interactive problems |
| solutions-agent | acc.cpp, acc_java.java, brute.cpp, wa.cpp |
| generator-agent | generator.cpp: testlib.h generator + FreeMarker script |
| reviewer-agent | Full review, blocks on any FAIL verdict |
The reviewer agent is the gatekeeper. It checks every component against all guidelines and re-triggers generation for anything that fails, looping until all verdicts are PASS.
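The gatekeeper loop can be pictured roughly as follows. This is a minimal sketch, not the repo's implementation: `review_until_pass`, the `generate` and `review` callables, and the `max_rounds` safety cap are all hypothetical names introduced for illustration (the description above loops until every verdict is PASS and mentions no cap).

```python
from typing import Callable, Dict


def review_until_pass(
    components: Dict[str, str],
    generate: Callable[[str], str],
    review: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Dict[str, str]:
    """Regenerate every component whose verdict is FAIL until all are PASS.

    components maps a component name (e.g. "validator") to its content.
    generate(name) returns fresh content for that component.
    review(name, content) returns "PASS" or "FAIL".
    max_rounds is an assumed cap on regeneration rounds.
    """
    current = dict(components)
    for round_no in range(max_rounds + 1):
        # Review everything, including components regenerated last round.
        failed = [name for name, content in current.items()
                  if review(name, content) == "FAIL"]
        if not failed:
            return current
        if round_no == max_rounds:
            break
        # Only the failing components are regenerated; the rest are kept.
        for name in failed:
            current[name] = generate(name)
    raise RuntimeError("some components still FAIL after max_rounds")
```

Components that already pass are left untouched, so one bad validator does not force the statement or the solutions to be rewritten.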
Setup
Prerequisites: Git, Python 3, and any AI coding tool that supports agent/skill definitions (Claude Code, Cursor, GitHub Copilot, Windsurf, OpenAI Codex, or Google Antigravity).
# 1. Fork the repo on GitHub, then clone your fork
git clone https://github.com/<your-username>/polygon-problems-generator.git
cd polygon-problems-generator
# 2. Activate the pre-commit hook (one-time per machine)
git config core.hooksPath .githooks
# 3. Always pull before generating — new rules are pushed regularly
git pull origin main
The pre-commit hook runs sync-ai-configs.py automatically before every commit, keeping configs for all supported AI tools in sync with .claude/agents/ as the single source of truth.
Generating a Problem
Open the repo in your AI coding tool and run:
/generate-problem
name: <snake_case_identifier>
statement: <one or two sentences describing what to compute>
solution: <intended algorithm / approach>
constraints: <full constraint block>
multitest: <yes / no> (optional, default: yes)
interactive: <yes / no> (optional, default: no)
sample tests:
Input: ...
Output: ...
All parameters explained:
| Parameter | Required | Default | Description |
|---|---|---|---|
| name | yes | - | Snake_case identifier. Converted to a readable title automatically: carrot_sum → Carrot Sum |
| statement | yes | - | One or two sentences describing what the solver must compute |
| solution | yes | - | The intended algorithmic idea / approach |
| constraints | yes | - | Full constraint block, e.g. 1 ≤ t ≤ 10^4, 1 ≤ n ≤ 10^5 |
| multitest | no | yes | Whether the problem has multiple test cases per file |
| interactive | no | no | Whether the problem is interactive and needs an interactor |
| sample tests | yes | - | At least one sample input/output pair. For interactive problems: a full interaction example |
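The name-to-title conversion in the first row can be sketched in a few lines. `title_from_name` is a hypothetical helper; the exact rule the pipeline applies (for example, how it treats short words such as "of") is not stated here, so this only reproduces the carrot_sum → Carrot Sum behaviour.

```python
def title_from_name(name: str) -> str:
    """Convert a snake_case identifier into a readable title."""
    # Drop empty pieces so a doubled or trailing underscore adds no extra space.
    words = [word for word in name.split("_") if word]
    return " ".join(word.capitalize() for word in words)
```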
Example (standard problem):
/generate-problem
name: carrot_sum
statement: Count integers in [L, R] whose digit sum is prime
and the number is divisible by the sum of its digits.
solution: Digit DP — precompute suffix-count tables for each
prime digit-sum up to 162, then process queries offline
in O(len × 10) per prime.
constraints: 1 ≤ t ≤ 10^4, 1 ≤ L ≤ R ≤ 10^18
sample tests:
Input:
3
1 10
12 12
1 100
Output:
4
1
10
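To sanity-check a sample like this by hand, a tiny brute force is enough. The sketch below plays the same role as brute.cpp (slow but obviously correct); it is not the generated file, and the function names are made up for illustration. It only works for small ranges, nowhere near R = 10^18.

```python
def is_prime(k: int) -> bool:
    """Trial division; digit sums are tiny, so this is plenty."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def digit_sum(x: int) -> int:
    return sum(int(ch) for ch in str(x))


def count_brute(lo: int, hi: int) -> int:
    """Count x in [lo, hi] whose digit sum is prime and divides x."""
    total = 0
    for x in range(lo, hi + 1):
        s = digit_sum(x)
        if is_prime(s) and x % s == 0:
            total += 1
    return total


if __name__ == "__main__":
    # The three sample queries above; prints 4, 1, 10.
    for lo, hi in [(1, 10), (12, 12), (1, 100)]:
        print(count_brute(lo, hi))
```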
Example (interactive problem):
/generate-problem
name: find_the_number
statement: There is a hidden integer x in [1, n]. Find it using
at most ceil(log2(n)) queries of the form "? v",
to which the judge responds "<", "=", or ">".
solution: Binary search on [1, n].
constraints: 1 ≤ t ≤ 100, 1 ≤ n ≤ 10^9
multitest: yes
interactive: yes
sample tests:
> ? 500000000
< <
> ? 250000000
> >
> ? 375000000
< <
> ! 312500000
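The query budget in this example is tight, so it is worth checking that plain binary search fits. Below is a small offline simulation, assuming that "<" means x < v and ">" means x > v (which matches the sample interaction). `make_judge` and `find_number` are hypothetical names; a real submission would talk to the interactor over stdin/stdout instead of calling a function.

```python
from typing import Callable, Tuple


def make_judge(hidden: int) -> Callable[[int], str]:
    """Return a judge that compares the hidden number x with a queried v."""
    def respond(v: int) -> str:
        if hidden < v:
            return "<"
        if hidden > v:
            return ">"
        return "="
    return respond


def find_number(n: int, ask: Callable[[int], str]) -> Tuple[int, int]:
    """Binary search on [1, n]; returns (answer, number of queries used)."""
    lo, hi = 1, n
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        reply = ask(mid)
        queries += 1
        if reply == "=":
            return mid, queries
        if reply == "<":
            hi = mid - 1   # x is strictly below mid
        else:
            lo = mid + 1   # x is strictly above mid
    # One candidate left: it can be reported without spending a query.
    return lo, queries
```

Each query leaves at most half of the current range (rounded down), so the range shrinks to a single candidate within ceil(log2(n)) queries, and the final "!" answer costs nothing.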
The Pipeline — 12 Steps
After you run /generate-problem, the pipeline runs fully automatically:
Step 1 → Create problem folder from templates
Step 2 → Generate LaTeX statement
Step 3 → Generate LaTeX tutorial (editorial)
Step 4 → Generate testlib.h validator
Step 5 → Recommend or generate checker
Step 5b → (interactive only) Generate interactor
Step 6 → Suggest algorithmic approaches
Step 7 → Generate ACC solution — C++ + Java
Step 8 → Generate TLE solution (intentionally slow)
Step 9 → Generate WA solution (intentionally buggy)
Step 10 → Generate test generator + FreeMarker script
Step 11 → Full review — re-generates any FAIL component
The agent stops and asks you only if a required parameter is missing. Otherwise it runs all the way to the end.
What Gets Generated
Every problem lands in problems/<name>/ with this structure:
problems/carrot_sum/
├── statement/
│ ├── statement.tex ← Polygon-ready LaTeX statement
│ └── tutorial.tex ← Polygon-ready LaTeX editorial
├── solutions/
│ ├── acc.cpp ← Correct solution, C++ (ACC)
│ ├── acc_java.java ← Correct solution, Java (ACC)
│ ├── brute.cpp ← Intentionally slow (TLE)
│ └── wa.cpp ← Intentionally wrong (WA)
├── generators/
│ └── generator.cpp ← testlib.h generator + FreeMarker script
├── validator.cpp
├── checker.cpp
└── interactor.cpp ← only for interactive problems
All files are upload-ready for Codeforces Polygon.
Generator Rules Worth Highlighting
The generator agent enforces a rule I find particularly useful: the FreeMarker script must include at least one test case where each variable is at its minimum value and at least one where it is at its maximum value. Boundary coverage is not optional.
The generator accepts CLI flags like -n and -k for exact-value tests so the script can reach these boundaries deterministically.
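A boundary rule like this is easy to check mechanically. The sketch below assumes script lines of the form `generator -n 1 -k 1 > $` and a table of (min, max) bounds per flag; `missing_boundaries` is a hypothetical helper, not the reviewer agent's actual check, and it only looks at plain lines (values produced by FreeMarker loops or `${...}` expressions are ignored).

```python
import shlex
from typing import Dict, List, Set, Tuple


def missing_boundaries(
    script_lines: List[str],
    bounds: Dict[str, Tuple[int, int]],
) -> List[Tuple[str, int]]:
    """Return (flag, value) boundary pairs that no script line passes."""
    seen: Dict[str, Set[int]] = {flag: set() for flag in bounds}
    for line in script_lines:
        # Keep only the command part, before the "> $" redirection.
        tokens = shlex.split(line.split(">")[0])
        for flag, value in zip(tokens, tokens[1:]):
            if flag not in seen:
                continue
            try:
                seen[flag].add(int(value))
            except ValueError:
                continue  # not a literal integer, e.g. ${i}
    missing = []
    for flag, (low, high) in bounds.items():
        for wanted in sorted({low, high}):
            if wanted not in seen[flag]:
                missing.append((flag, wanted))
    return missing
```

For example, with bounds `{"-n": (1, 100000), "-k": (1, 50)}` and a script that never passes `-k 50`, the function returns `[("-k", 50)]`, which is exactly the kind of gap the rule is meant to catch.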
Interactive Problem Support
The interactor-agent generates a complete testlib.h interactor using the correct stream model:
- Reads test secrets from inf (the test input file)
- Reads participant output from ouf (never from cin)
- Sends responses to the participant via cout followed by cout.flush()
- Issues verdicts with quitf(_ok, ...) / quitf(_wa, ...) / quitf(_pe, ...) / quitf(_fail, ...)
A full interactor guide (streams, flushing, query limits, multi-test, Polygon setup, common mistakes) lives in tutorials/interactor.md in the repo and is read by the agent at runtime.
Multi-Tool Support
The agent definitions live in .claude/agents/ as the single source of truth. A pre-commit hook auto-syncs them to every major AI coding tool on every commit:
.cursor/rules/ → Cursor
AGENTS.md → OpenAI Codex
.github/copilot-instructions.md → GitHub Copilot
.windsurfrules → Windsurf
.agents/skills/ → Google Antigravity
You update one file. Every tool stays in sync. No drift.
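For readers curious what a one-source sync can look like, here is a deliberately simplified sketch. It is not the repo's sync-ai-configs.py: it assumes the agent definitions are `.md` files, writes their concatenation to the single-file targets, and copies them unchanged into the directory targets. The real per-tool formats differ and are ignored here, and `sync_configs` is a hypothetical name.

```python
from pathlib import Path
from typing import List

# Target paths taken from the list above; how each tool expects its
# content to be formatted is NOT modelled in this sketch.
SINGLE_FILE_TARGETS = ["AGENTS.md", ".github/copilot-instructions.md", ".windsurfrules"]
DIRECTORY_TARGETS = [".cursor/rules", ".agents/skills"]


def sync_configs(repo_root: Path) -> List[Path]:
    """Mirror .claude/agents/*.md into every tool-specific location."""
    source_dir = repo_root / ".claude" / "agents"
    agent_files = sorted(source_dir.glob("*.md"))
    combined = "\n\n".join(
        f.read_text(encoding="utf-8").strip() for f in agent_files
    ) + "\n"

    written: List[Path] = []
    for rel in SINGLE_FILE_TARGETS:
        target = repo_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(combined, encoding="utf-8")
        written.append(target)
    for rel in DIRECTORY_TARGETS:
        target_dir = repo_root / rel
        target_dir.mkdir(parents=True, exist_ok=True)
        for f in agent_files:
            target = target_dir / f.name
            target.write_text(f.read_text(encoding="utf-8"), encoding="utf-8")
            written.append(target)
    return written
```

Because every target is overwritten from the same source on each run, a manual edit to a derived file is simply lost at the next commit, which is what prevents drift.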
Tutorials Included
The tutorials/ folder contains writing guides that agents read at runtime. Each tutorial teaches the agent the full conventions for that component:
- statement.md: LaTeX formatting, legend style, I/O section conventions
- validator.md: testlib.h validator patterns, whitespace/newline checks
- checker.md: when to use standard checkers, readAns paradigm for custom ones
- generator.md: rnd.next(), rnd.partition(), FreeMarker scripts, boundary coverage
- interactor.md: stream model, flushing, verdict functions, Polygon setup
Repository
→ https://github.com/7oSkaaa/polygon-problems-generator
To use it:
1. Fork the repo
2. Clone your fork
3. Run git config core.hooksPath .githooks
4. Pull before each session; new agent rules are pushed regularly
5. Open in your AI coding tool and run /generate-problem
Follow me on GitHub for updates: https://github.com/7oSkaaa
I would love to hear feedback from problem setters who try it. If something breaks or the output quality is poor for a particular problem type, open an issue, and I'll fix it.

it's really helpful thanks for your efforts
I will use it myself, really helpful <3
orz
Great work, Thanks for your efforts
I will try it ^_^, Great Work 7oSkaaa