In this post, we will use OpenAI Codex, the recent GPT-3-based product first introduced in Evaluating Large Language Models Trained on Code (Chen et al., 2021), to solve some Codeforces problems.
We use Codeforces Round 739 (Div. 3) because the current version of the Codex model was released before this round took place. Since the model has been trained on GitHub data, it may have memorized solutions to older problems.
Solving 1560A - Dislike of Threes from the statement only
This is a simple problem, and the statement kind of explains what we should implement. Let's just give the plaintext statement to Codex:
Disclaimers
It produces correct code on the second try; the first C++ submission got WA on test 2. The problem text has been cleaned up a bit, and "end in the digit 3" was changed to "have the last digit 3", because for some reason Codex interprets "end in the digit 3" as "contain the digit 3".
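For reference, the task itself is small: find the k-th positive integer that is neither divisible by 3 nor has 3 as its last digit. The actual submissions were Codex-generated C++, but the logic can be sketched in Python by direct enumeration (k is small enough for this to be fast):

```python
import sys

def kth_liked(k: int) -> int:
    # Walk the positive integers, skipping those divisible by 3
    # or ending in the digit 3, until we have seen k "liked" numbers.
    n = 0
    while k > 0:
        n += 1
        if n % 3 != 0 and n % 10 != 3:
            k -= 1
    return n

def main() -> None:
    data = sys.stdin.read().split()
    t = int(data[0])
    print("\n".join(str(kth_liked(int(x))) for x in data[1:1 + t]))

if __name__ == "__main__":
    main()
```

This is just an illustrative reimplementation of the statement, not the code Codex produced.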
Solving 1560B - Who's Opposite? from the problem statement and the editorial
Here we have no hope for a solution straight from the problem statement, as it requires some thinking. However, if you give it the editorial, it can produce working code on the first try:
No formatting is required: just copy and paste the statement and the editorial from the Codeforces website. No inductive bias in user input was needed at all!
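For context, the editorial's idea (as I understand it) is that if persons a and b face each other, the circle must contain n = 2·|a − b| people, x sits opposite x ± n/2, and any label exceeding n is impossible. A Python sketch of that logic (not the Codex output itself):

```python
def opposite(a: int, b: int, c: int) -> int:
    # a and b face each other, so the circle has n = 2 * |a - b| people.
    n = 2 * abs(a - b)
    # All three labels must fit on such a circle; otherwise no answer.
    if max(a, b, c) > n:
        return -1
    half = n // 2
    # The person opposite c is exactly half the circle away.
    return c + half if c <= half else c - half
```

For example, `opposite(1, 5, 2)` gives a circle of 8 people, where person 2 faces person 6.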
What you can do if this is cool or frightening to you
It is a net good to the world if everyone has access to models such as Codex or GPT-3.
If you are a machine learning researcher, or maybe just a competent person in general, try to get involved in EleutherAI, Hugging Face, or something similar. If you want to work on important things, language models are, in my opinion, one of the most critical areas right now.
Bonus: What you can do if you own a competitive programming site
The advice below is my personal opinion.
- Try to get a tester with Codex access for future rounds. I predict it could become part of the cheating issue as early as 2022.
- Think about what the rules should say about using tools of this kind.
- Prepare for the effects of competitive programming becoming less relevant to job interviews. If your financing model relies on job interview preparation, you will need to change it in a few years. (Codeforces and AtCoder seem to be in the best shape here.)
FAQ
Can language models solve harder problems?
Language models are a new and poorly understood object. Here is a gross oversimplification of the current conventional wisdom:
GPT-3 is a language model. Its main purpose is translating between different representations of the same data. It can translate English to German, it can be used to describe images, and with Codex it can translate textual commands to code. Codex can solve the Div3 A from scratch because the solution is just an implementation of the statement text, but it needs the editorial to solve the Div3 B.
As of 2021, language models are not intended for solving algorithmic puzzles or producing mathematical proofs. It is likely we will have a much more powerful Codeforces solver by, say, 2026, once research manages to combine proof search methods with the GPT-3 architecture. If I manage to read enough papers on this, I might write more on my academic blog.
I can't see the demo.
If you can't access imgur, go to this link.
How to get access to Codex?
Have some academic credentials, and fill in the waitlist form at OpenAI's website with an interesting research proposal. Please do not use Codex for bad purposes such as cheating or spam.
How to integrate Codex with vim? I got access but I can only use it in the Playground.
Use vim_codex with your OpenAI API key.
What are the Codex parameters in the demo?
I used the OpenAI API from Python as follows:
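Something like the sketch below, using the legacy (2021-era) `openai.Completion.create` interface with `davinci-codex` as the engine name. The parameter values here are approximate reconstructions from memory, not the exact demo settings:

```python
# Approximate parameters for the demo; values are reconstructions, not exact.
PARAMS = {
    "engine": "davinci-codex",  # Codex engine name in the legacy API
    "temperature": 0.0,         # greedy decoding for reproducible code
    "max_tokens": 1024,         # enough room for a full solution
    "stop": None,               # let the model decide where to stop
}

def complete(prompt: str) -> str:
    """Send a problem statement (plus editorial) to Codex, return the completion."""
    import openai  # legacy 0.x bindings; requires openai.api_key to be set
    response = openai.Completion.create(prompt=prompt, **PARAMS)
    return response["choices"][0]["text"]
```

The prompt is just the pasted statement (and editorial, for the B problem), with no extra formatting.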