I am a very irregular Codeforces visitor these days, but yesterday I saw the post [About Problem Coincidences](https://mirror.codeforces.com/blog/entry/112709) by [user:MikeMirzayanov,2023-02-20] and wondered about a solution along the lines of the problem database [suggested](https://mirror.codeforces.com/blog/entry/95956) some time ago by [user:Um_nik,2023-02-20]. Many of the comments there propose defining problem similarity in terms of solutions, which seems reasonable, but others countered that they don't want to have to write a whole solution just to find out whether a similar problem exists. Given the current state of AI, could something like the following work now or in the near future?

1. For every problem statement in the CF, AtCoder, etc. universe, use AlphaCode or the editorial to get a solution.
2. Embed each of these solutions into a vector space. AFAIK this is straightforward to do.

Then, when someone wants to see the problems most similar to their proposed statement, they either write a solution themselves or just write a statement, in which case the database has AlphaCode (or similar) generate a solution behind the scenes. Either way, the solution is embedded using the same method as in step 2, and the database is queried for the nearest vectors in the space. Admittedly, AlphaCode can't solve all problems, but maybe it at least writes something that "approximates" a correct solution well enough that the resulting vector is close to the "correct" one?
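To make the embed-and-query step concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `embed` is a toy bag-of-tokens stand-in for whatever learned code embedding would actually be used, the `DATABASE` entries are made-up snippets rather than real problems, and a real system would presumably replace the linear scan with a proper nearest-neighbor index.

```python
import math
import re
from collections import Counter


def embed(solution_source: str) -> Counter:
    """Toy stand-in for a learned code embedding (assumption):
    a sparse bag-of-tokens vector over identifiers, numbers and symbols."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", solution_source.lower())
    return Counter(tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_source: str, database: dict, k: int = 3) -> list:
    """Return up to k (problem_id, similarity) pairs, most similar first.

    Linear scan over every stored solution; ties are broken by problem id.
    """
    query_vec = embed(query_source)
    scored = [(cosine(query_vec, embed(src)), pid) for pid, src in database.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(pid, score) for score, pid in scored[:k]]


# Hypothetical mini database: problem id -> one stored solution snippet.
DATABASE = {
    "prefix-sums": "for i in range(n): pre[i + 1] = pre[i] + a[i]",
    "binary-search": "while lo < hi: mid = (lo + hi) // 2",
    "dsu": "def find(x): return x if parent[x] == x else find(parent[x])",
}

if __name__ == "__main__":
    # A hand-written or generated solution for the proposed problem.
    proposed = "pre[i + 1] = pre[i] + a[i]"
    for problem_id, score in most_similar(proposed, DATABASE, k=2):
        print(f"{problem_id}: {score:.3f}")
```

Whether a generated but incorrect solution still lands near the "correct" vector depends entirely on the embedding; a token-count vector like this one only captures surface similarity, so it says nothing about that question.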
↵
Of course, AlphaCode itself is not available to the public, but probably it or something similar will be soon. Thoughts?