Finally, semantic search for competitive programming problems

TLE's blog

By TLE, history, 13 months ago, In English

Hello Codeforces,

It has been a long while, but this project closes the long-standing open problem proposed by Umnik 2021. You can try it here (discontinued, see the new link below) while supplies last. So far I have only imported problems from Codeforces and BZOJ (the defunct Chinese OJ), but adding other OJs should be easy as long as their statements have been crawled (PRs?). Cheers!

Update (8 months later): We finally got an update! In the new version I collected and uploaded most of vjudge (which means 160k problems!). It also got a shiny new domain http://yuantiji.ac. Enjoy! :D

+1200

Comments (27)

Mo_Huzaifa

13 months ago, # |

Nice initiative

gin_spirit

13 months ago, # |

+13

I don't understand anything. But it looks like something useful, so I upvoted it for you.

ARMINIUS

13 months ago, # |

Nice job!

oToToT

13 months ago, # |

Out of curiosity: how bad would the search results be if we didn't use ChatGPT to simplify the problem?

kpw29

13 months ago, # |

-29

Truly amazing work. You should write a paper about it or something.

I'm slightly worried about the consequences for competitive programming... you should probably block usage during contests as an anti-cheating measure. Otherwise you'll lose your credits pretty quickly :)

GusterGoose27

13 months ago, # |

+133

With regard to cheating concerns, this may actually reduce cheating incidents by making it easier for authors to find repeated problems.

TwentyOneHundredOrBust

13 months ago, # |

+37

This is neat, but then won't the training data annotators know your next problem when you plug it into OpenAI?

TLE

13 months ago, # ^ |

+41

I'm using their paid API (same functionality as ChatGPT, but not free), so in theory the inputs should not be used for training :|

TwentyOneHundredOrBust

13 months ago, # ^ |

I guess even then someone working at OpenAI who really, really wants to cheat on a contest could do it, but that's probably not going to happen. How well does it work? I tried plugging in this year's FHC 3B, which seems very similar to 1870E, but the results it gives don't seem closely related to it.

Lyrically

13 months ago, # ^ |

However, I plugged CF1793F into it, but it showed nothing even close to CF765F. What could be the issue here? I guess it's because of the problem background, but how do we deal with that?

Update: I also plugged in this year's CSP-S problem 2, but it still isn't showing CF1223F. Even after paraphrasing it to "Given an array of integers, we want to count the number of non-empty continuous subarrays that can be reduced to an empty array by repeatedly removing adjacent identical elements.", CF1223F appears nowhere on the list.

TLE

13 months ago, # ^ |

+21

Yeah, the system is still imperfect. We should probably experiment with a better prompt to remove the backgrounds (you can find the current prompts here).

For your second example, it seems doable with a bit of luck...

Lyrically

13 months ago, # ^ |

Yeah, automatically removing the background and actually "formalizing" the statement would be a great feature, and would help a lot, I guess :)

Rewinding

13 months ago, # |

+19

Amazing work!

AprLsity

13 months ago, # |

Amazing work!

lis05

13 months ago, # |

+15

Honestly, that is impressive. I wonder if the same thing could be done with the editorials (so that people can find applications of different ideas and algorithms).

ankeshgupta007

13 months ago, # |

Is the link still live? It's not working for me.

Misa-Misa

13 months ago, # |

+12

Wow, what wonderful work. Ask Um_nik for 1000 dollars.

avighnakc

12 months ago, # |

Lol, this could be used in today's contest to figure out the solution for D.

huikang

11 months ago, # |

+14

I have implemented this project on Poe, so that the cost of calling ChatGPT will be borne by the platform (i.e. other subscribers).

(Disclaimer: I currently work for Quora / Poe).

Sample query: https://poe.com/huikang/1512928000278451

After reading the code, someone could try some of these:

Using better LLMs to summarize (e.g. GPT-4)
Other retrieval indexes (predicted topics, keywords)
Use an LLM to rerank the retrieved problems
Instead of a decimal similarity value, provide an LLM-generated summary of whether the two problems are the same
Craft some evaluation benchmarks
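
To make the retrieval and benchmark ideas above concrete, here is a minimal sketch. It is not the project's code: `embed` is a toy bag-of-words stand-in for a real embedding model, and the corpus, the problem ids (`P1` to `P3`), and the query pairs are all made up for illustration. The benchmark is plain recall@k over pairs of (query text, id of the problem it should find).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(query, corpus, k):
    """Return the ids of the k corpus statements most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda pid: cosine(q, embed(corpus[pid])), reverse=True)
    return ranked[:k]


def recall_at_k(pairs, corpus, k):
    """Fraction of (query, expected_id) pairs whose expected id is in the top k."""
    hits = sum(expected in top_k(query, corpus, k) for query, expected in pairs)
    return hits / len(pairs)


# Hypothetical simplified statements, invented for this example.
corpus = {
    "P1": "count subarrays that become empty after repeatedly removing adjacent equal elements",
    "P2": "find the shortest path between two vertices in a weighted graph",
    "P3": "for each query find the minimum absolute difference of two elements in a range",
}

# Hypothetical known-duplicate pairs: (paraphrased query, id it should retrieve).
pairs = [
    ("number of subarrays reducible to empty by deleting adjacent identical elements", "P1"),
    ("minimum difference between two elements inside a query range", "P3"),
]

print(recall_at_k(pairs, corpus, k=1))
```

With a set of known duplicate pairs like this, swapping in a different summarizer, index, or reranker becomes a matter of comparing one number, which is roughly what a benchmark for this tool would need.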

Edit (October 2024): The tool no longer works.

steinum

6 weeks ago, # ^ |

entropy07

11 months ago, # |

One idea: use LLMs to read the source code of a problem's top solutions and write a summary of it, then use that summary together with the problem statement for searching.