Large language models have recently become more advanced and capable. We can use them to draft an email, make a travel plan, explain a complex topic, summarize an article, and so on. Now you are given a chance to develop your own mini language model. In this problem, you will develop a model that summarizes a programming problem statement into a sentence of not more than 25 words.
Before the problem statement is given to your model, it will be preprocessed in this order:
Therefore, the processed problem statement will be a sequence of words separated by single spaces. A word consists of not more than 20 lowercase letters and digits.
To measure the performance of your model, we will compare your summary against the summary generated by Gemini 1.5 Pro using the prompt below. The temperature is set to 0 to minimize the randomness.
Summarize the following statement of a problem that appeared in a competitive programming contest.
given an integer n determine whether it is a multiple of 3
Summarize the statement using a sentence. The sentence should contain at least 20 and not more than 25 words. Output all words in lowercase without any punctuation.
Although large language models will try to follow your instructions as much as possible, it is not guaranteed that the generated summary is between 20 and 25 words (inclusive). Actually, large language models are pretty bad at counting.
Assume that Gemini's summary has $$$k$$$ words. We count the pairs of identical words between Gemini's summary and your summary. For example, if Gemini's summary is "find the number of prime numbers between 1 and 100" and your summary is "how many prime numbers are there in the first 100 natural numbers", the number of pairs of identical words is 5: "the", "prime", "numbers" (counted twice, because it appears twice in your summary), and "100". In general, if a word appears $$$x$$$ times in Gemini's summary and $$$y$$$ times in your summary, it contributes $$$xy$$$ pairs. If the number of pairs of identical words is at least $$$k/3$$$, your summary (program output) will be accepted.
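The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the judge's actual checker; the function names `count_identical_pairs` and `is_accepted` are hypothetical.

```python
from collections import Counter


def count_identical_pairs(reference: str, candidate: str) -> int:
    """Count pairs of identical words between two space-separated summaries.

    A word appearing x times in `reference` and y times in `candidate`
    contributes x * y pairs.
    """
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Counter returns 0 for missing words, so unmatched words contribute nothing.
    return sum(x * cand_counts[word] for word, x in ref_counts.items())


def is_accepted(reference: str, candidate: str) -> bool:
    """Return True if the number of pairs is at least k / 3, where k is the
    number of words in `reference` (assumed here to be Gemini's summary)."""
    k = len(reference.split())
    # pairs >= k / 3 is equivalent to 3 * pairs >= k, which avoids floats.
    return 3 * count_identical_pairs(reference, candidate) >= k
```

For the example in the text, `count_identical_pairs("find the number of prime numbers between 1 and 100", "how many prime numbers are there in the first 100 natural numbers")` evaluates to 5, and since $$$k = 10$$$ the summary would be accepted under this sketch.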
There are 10 test cases. The first 2 cases are the sample cases provided below. The other 8 cases use problem statements from the 1st to 8th La Salle—Pui Ching Programming Challenge (1 per year).
The first line contains an integer $$$N$$$, the number of words in the problem statement. ($$$10 \le N \le 300$$$)
The next line contains $$$N$$$ words. The words are separated by a single space. Each word consists of 1 to 20 (inclusive) lowercase letters and digits.
In the first line, output an integer $$$s$$$, the number of words in your generated summary. ($$$1 \le s \le 25$$$)
The next line should contain $$$s$$$ words. Similar to the input format, the words should be separated by a single space. Each word should consist of 1 to 20 (inclusive) lowercase letters and digits.
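As a minimal sketch of the input and output formats above, the following Python reads a test case from a string and emits a correctly formatted answer. The "summary" here is just the first 25 words of the statement, a placeholder heuristic that is an assumption for illustration and is not claimed to pass the tests. The names `solve` and `is_valid_word` are hypothetical.

```python
import re

# A word is 1 to 20 lowercase letters and digits, as stated in the format.
WORD_PATTERN = re.compile(r"[a-z0-9]{1,20}")


def is_valid_word(word: str) -> bool:
    """Check that a word matches the format required for input and output."""
    return WORD_PATTERN.fullmatch(word) is not None


def solve(input_text: str, limit: int = 25) -> str:
    """Parse one test case and return the two output lines as a single string."""
    tokens = input_text.split()
    n = int(tokens[0])           # first line: number of words N
    words = tokens[1:1 + n]      # second line: the N words of the statement
    # Placeholder heuristic (an assumption): keep the first `limit` words.
    summary = [w for w in words if is_valid_word(w)][:limit]
    return f"{len(summary)}\n{' '.join(summary)}"
```

In a submission, one might call `print(solve(sys.stdin.read()))` after `import sys`. A stronger approach would replace the placeholder heuristic with a word selection that is more likely to overlap with Gemini's summary.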
12
given an integer n determine whether it is a multiple of 3
8
check if n is a multiple of 3
36
everyone love pizza being hungry you just ordered a pizza for dinner the pizza s diameter is d inches and it is divided into n slices your task is to compute the area of one slice
11
a pizza is sliced and find the area of a slice
In the first example, Gemini's summary is "given an integer n the task is to determine if n is divisible by three". The number of pairs of identical words is 5.
In the second example, Gemini's summary is "given a pizza with diameter d inches and n slices calculate the area of a single slice". The number of pairs of identical words is 10.
Here we provide an extra example. This is not one of the test cases.
Processed problem statement: the la salle pui ching programming challenge is being intruded by gpt many submissions from different contestants are suspected to be written by the gpt model so it is believed that gpt has hacked into the judge and submitted source codes that are written by itself in different contestants name to recover from the damage caused by gpt all of the submissions need to be checked and those that are suspected to be written by gpt need to be removed luckily since the output length of the gpt model is limited if the source code it writes exceeds 500 characters only the first 500 characters will be kept including spaces and end of line characters and the rest of the output will be replaced by the sentence as an ai model my output is limited to 500 characters followed by an end of line character to reduce the number of submissions to be manually checked as a contestant please write a program to check if the given source code is suspected to be written by gpt and exceeds its output length
Gemini's summary: a programming contest needs help identifying and removing submissions suspected to be generated by a language model with a 500 character limit