Translated by GPT-4 with some adjustments. Original post: 「正义可以迟来但不能缺席」:关于 NXIST 的一些新证据 ("Justice may come late, but it must not be absent": Some New Evidence about NXIST)
This article presents a logically complete chain of evidence, drawn entirely from public internet resources, regarding the "suspected cheating" incident at the ICPC Yinchuan Station and ICPC Shenyang Station in 2021. We found the suspected GitHub account from two years ago (NaokiLH, since renamed to https://github.com/brokenTarget) of Lan Hao, a member of the TS 1 team from Ningxia Institute of Science and Technology (NXIST). By mining and analyzing the commit records of his algorithm competition repo, we obtained direct evidence that at least 4 problems from the 2021 Yinchuan regional contest set and at least 6 problems (including scrapped problems) from the 2021 Shenyang regional contest set were leaked to him roughly a week or more before the respective competitions. This substantial amount of new public evidence indicates that the TS 1 team did cheat and that the students concerned were heavily involved.
This is the first direct evidence related to the incident after several years of discussion. This article is based on the mining and analysis of publicly available online information by @lucas110550 and @曾耀辉, and none of the evidence provided involves any infringement or violation of relevant regulations. @陈靖邦 handled overall coordination and review. We welcome reports and scrutiny from everyone.
Because the vast majority of the evidence comes from the commit history of the GitHub repo of NaokiLH (suspected account of Lan Hao), and the person involved may delete it after this article is published, we strongly suggest that everyone fork the corresponding repo to keep a permanent record.
https://github.com/NaokiLH/algorithm_trans
UPD: The original repo has been deleted; those interested can use the personal backup repo instead:
https://github.com/NXIST-backup/algorithm_trans
Background Information
How to evaluate the ICPC Yinchuan Competition in 2021?
https://weibo.com/u/7535856183
Main Content
Recently, NXIST announced the hosting of the 2023 Silk Road China Invitational.
Upon learning of this, I was not only shocked but also deeply saddened: What is the purpose of doing such things?
So, on a leisurely afternoon, I began to search the internet for information about the award-winning team members, namely the TS 1 team members: Lan Hao, Ni Binqi, and Zhou Jianing. In an inconspicuous corner of GitHub, I found a homework submission repository for "Geek University's Python Advanced Training Camp — 1st Term" with the same name as one of the parties involved:
Week08 Homework Link Collection · Issue #52 · Python001-class01/Python001-class01
The information submitted by user upupqi contains the name of one of the parties involved, Ni Binqi. This led me to the user upupqi's profile directly:
Of course, we can't directly conclude that this is the person in question (after all, there are many people with the same name). After some investigation (such as his algorithm competition repository, confirming that he is also an algorithm competition participant), we obtained a very strong piece of evidence (and the source of this article): his mutual follower NaokiLH, an ID that is suspected to point to another party involved, Lan Hao (LH).
In an early issue raised by NaokiLH's account, there is a screenshot of his computer interface, where we can find a "Lan Hao 45418016" compressed file, which preliminarily confirms that the owner of this account is also named Lan Hao.
Now that we had the GitHub accounts of the two parties involved, curiosity drove me to dig through their GitHub repos for anything interesting. The first conclusions were: 1. Neither of them is very proficient with GitHub (upupqi does not know how to inherit repositories, and NaokiLH's commits are messy and do not follow conventions, which deserves criticism). 2. Neither has a high level of algorithmic skill (before May 2021 both of their algorithm repos covered only basic topics, upupqi was still in the AcWing training camp in August 2021, and most of their Codeforces VPs are only at the Div. 2 A/B level), so it is hard to imagine how such a team could win a gold medal at a regional contest.
What really gave birth to this article was NaokiLH's algorithm competition repo:
https://github.com/NaokiLH/algorithm_trans
It seems quite normal, nothing strange.
Going directly to the commit records during May 2021, I found some interesting things:
The commit messages are careless, which deserves criticism. Let's first look at the commit "3123131" at the bottom, made on May 10, 2021: 3123131 · NaokiLH/algorithm_trans@03efcf1
In it, NaokiLH created a new folder called "yinchuan" and uploaded code for problems B, G, and I:
Then in the commit "423423" on May 13, 2021: 423423 · NaokiLH/algorithm_trans@76bd49e
NaokiLH uploaded the code for problem K:
Let's carefully compare these four pieces of code with the official problems of the 2020 Yinchuan contest:
Ref:
NaokiLH's B vs Yinchuan B 104022B - The Great Wall
NaokiLH's G vs Yinchuan G 104022G - Photograph
NaokiLH's I vs Yinchuan I 104022I - The Answer!
NaokiLH's K vs Yinchuan K 104022K - Browser Games
We can see that, leaving aside whether these four pieces of code are correct, their input and output formats, and even some variable names, match the problem statements exactly. When submitted (interested readers can verify this themselves), two of the four pieces of code pass only the sample cases, while the other two cannot even pass the sample cases.
So when did the official 2020 Yinchuan contest take place? May 16, 2021.
In other words, on May 10 and May 13 (within the week before the competition), NaokiLH (suspected account of Lan Hao) already had enough information to write initial code for problems B, G, I, and K, which were to appear in the official competition on May 16. The input and output matched the problem statements, and some of the code could already pass the sample cases. We can reasonably suspect that the problem statements were leaked about a week before the competition, and that Lan Hao, as a party involved, knew about it and was heavily involved.
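The gap between these commits and the contest can be checked with simple date arithmetic. The snippet below is only a minimal sketch: the dates and short commit hashes are the ones quoted above, while the names `CONTEST_DAY`, `COMMITS`, and `days_before_contest` are made up for illustration.

```python
from datetime import date

# Dates as quoted in this article; the variable names are illustrative only.
CONTEST_DAY = date(2021, 5, 16)
COMMITS = {
    "03efcf1": date(2021, 5, 10),  # adds B, G, and I under "yinchuan"
    "76bd49e": date(2021, 5, 13),  # adds K
}


def days_before_contest(commit_day, contest_day=CONTEST_DAY):
    """Return how many days a commit precedes the contest (negative if later)."""
    return (contest_day - commit_day).days


lead_days = {sha: days_before_contest(day) for sha, day in COMMITS.items()}
for sha, lead in lead_days.items():
    print(f"{sha}: {lead} day(s) before the contest")
```

Under these dates the two commits precede the contest by 6 and 3 days respectively.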
In the subsequent official competition, he passed problems A, B, E, G, J, and K, of which B, G, and K are among the highly suspicious problems identified above. Code overlap on problems B and J has already been pointed out in Dai@NeverLand's post "PDS Plagiarism Detection System Example: Plagiarism Demonstration of a Certain Competition".
On May 22, 2021 (one week after the competition), the uploaded code was deleted by NaokiLH with a recorded commit: 321312 · NaokiLH/algorithm_trans@80c1103
After May 22, everything returned to normal. NaokiLH began learning KMP and participating in AcWing training.
Reviewing this in depth, let's reconstruct the likely situation at the time along the timeline:
Early May, NaokiLH obtains the leaked problem statements, which include at least Problems B, G, I, and K. However, the leak only contains problem statements and examples, not solutions or standard inputs and outputs.
May 10, NaokiLH, through research or with help from others, writes the code for Problems B, G, and I. However, given his skill level, he cannot guarantee the correctness of these three pieces of code. NaokiLH thinks for a while and decides to upload the code to GitHub as a backup.
May 13, NaokiLH completes the code for Problem K and uploads it to GitHub as a backup.
May 16, the Yinchuan Regional Competition officially begins. The TS 1 team tries (perhaps?) to submit the pre-written code without success. They then obtain passing code from other teams through some means provided by the organizer and submit it to get AC (Accepted), ultimately winning the gold medal. Public controversy starts to build.
May 22, NaokiLH deletes the code for Problems B, G, I, and K from the GitHub repo.
Bonus Content
2020 Shenyang Regional Contest: How Does the TS 1 Team Prove Itself?
Background Information
Translation: We have already told quailty that we will participate in Shenyang.
Translation: Now, the competitions have all come to an end, and they have returned to their college life, doing the same things they have always done, over and over again. "There's nothing to be proud of" is the phrase that appears most often in their conversations. While others are still immersed in their last victory, they have already started preparing for the next competition. (From the NXIST public account)
Explanation of the Leak of the 2020-2021 Shenyang Contest Scrapped Problems
Video at 1 minute 29 seconds: The competition time for the Shenyang station was postponed from the original May 23 to July 18.
Content
May 21, One week after the end of the Yinchuan competition, NaokiLH makes a new round of commits: 88888 · NaokiLH/algorithm_trans@7e35b60. They create a new directory under the original repo called ICPC/shenyang and upload A.cpp. The next day, May 22, NaokiLH creates another directory called blue_book/sh and moves the original A.cpp from ICPC/shenyang to this new directory: 321312 · NaokiLH/algorithm_trans@80c1103.
May 23, NaokiLH uploads B.cpp to the blue_book/sh directory: 4324324 · NaokiLH/algorithm_trans@1f8b5a2
May 24, NaokiLH uploads F.cpp and H.cpp to the blue_book/sh directory: 3123123 · NaokiLH/algorithm_trans@d765bf7
By June 11, NaokiLH had finished modifying the code in the blue_book/sh directory. Here is the final version of the directory at that time (including code for Problems A, B, F, and H).
It is easy to see that the code for Problems A, B, F, and H does not match the problem statements of the Shenyang Regional Contest, so the trail seems to go cold. What went wrong? As it turns out, this is closely related to the Shenyang Regional Contest's scrapped problem incident (see earlier references):
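The contents of blue_book/sh can be cross-checked by replaying the commits listed above. The following is a minimal sketch, not a tool that reads the actual repo: the `HISTORY` table is transcribed by hand from the dates, hashes, and file names in this section, the `replay` helper is hypothetical, and only the Shenyang-related files are modelled.

```python
# Each entry: (date, short SHA, operations), transcribed from this section.
# ("add", path) creates a file; ("move", old, new) relocates it.
HISTORY = [
    ("2021-05-21", "7e35b60", [("add", "ICPC/shenyang/A.cpp")]),
    ("2021-05-22", "80c1103",
     [("move", "ICPC/shenyang/A.cpp", "blue_book/sh/A.cpp")]),
    ("2021-05-23", "1f8b5a2", [("add", "blue_book/sh/B.cpp")]),
    ("2021-05-24", "d765bf7",
     [("add", "blue_book/sh/F.cpp"), ("add", "blue_book/sh/H.cpp")]),
]


def replay(history):
    """Return {path: date the file first appeared} after applying all commits."""
    first_seen = {}
    for day, _sha, ops in sorted(history):  # ISO dates sort chronologically
        for op in ops:
            if op[0] == "add":
                first_seen[op[1]] = day
            elif op[0] == "move":
                # A moved file keeps the date of its original upload.
                first_seen[op[2]] = first_seen.pop(op[1])
    return first_seen


tree = replay(HISTORY)
for path in sorted(tree):
    print(path, "first uploaded on", tree[path])
```

Replaying the four commits leaves A.cpp, B.cpp, F.cpp, and H.cpp under blue_book/sh, with A.cpp keeping its original upload date of May 21.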
A.cpp actually corresponds to a problem called "jailbreak" from the scrapped Shenyang Regional Contest set. As of the time of writing, this problem has not been publicly used. Because so much time has passed, the PDF of the scrapped problems has been lost, so here we provide only the relevant information and a preview of the problem statement:
The problem was prepared on the Polygon platform, and its last edit was made at 2021-05-16 11:30:53 (UTC).
We can see that the input format of this code is completely consistent with the original problem, but the output does not match: the problem requires the output to be "yes" or "no", while the code outputs "YES" or "NO" (with an additional line of information). This actually corresponds to subsequent modifications to the problem, although the code still cannot pass the problem:
B.cpp corresponds to Problem H of the Shenyang Regional Contest, 103202H - The Boomsday Project. The input and output formats are completely consistent, and the code passes the sample cases. However, because the data range was adjusted in subsequent versions, it cannot pass all the data. Interested readers can compare for themselves.
F.cpp corresponds to Problem J of the Shenyang Regional Contest, 103202J - Descent of Dragons. The input and output formats are completely consistent. Besides the sample cases, this code passes only some of the data.
H.cpp corresponds to 2021 NewCoder Summer Camp Training 8: F. Robots. This problem was once one of the scrapped problems of the Shenyang Regional Contest and was later used in the 2021 NewCoder Summer Camp Training 8, held on August 9, 2021. Before that, it had not been publicly used. Because it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:
We can see that the input format of this code is completely consistent with the original problem, but the output does not match: the problem requires the output to be "yes" or "no", while the code outputs "Y" or "N". Interestingly, this coincides with subsequent modifications to the problem (shown below). After aligning the output formats, the code also passes some of the official data besides the sample cases.
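"Aligning the output" here simply means mapping the tokens the code prints onto the tokens the checked version of the problem expects before comparing. The snippet below is only a sketch of that idea under assumed token pairs; `TOKEN_MAP`, `normalize`, and `same_verdicts` are hypothetical helpers, not the official checker, and the sample strings are made up.

```python
# Hypothetical mapping from printed tokens to the expected yes/no tokens.
TOKEN_MAP = {"y": "yes", "n": "no", "yes": "yes", "no": "no"}


def normalize(output):
    """Lower-case each whitespace-separated token and map it to yes/no."""
    return [TOKEN_MAP.get(tok.lower(), tok.lower()) for tok in output.split()]


def same_verdicts(program_output, expected_output):
    """Compare two outputs after normalizing their answer tokens."""
    return normalize(program_output) == normalize(expected_output)


# Made-up outputs for illustration, not real contest data.
print(same_verdicts("Y\nN\nY\n", "yes\nno\nyes\n"))
print(same_verdicts("Y\nN\n", "yes\nyes\n"))
```

With such a mapping, differences in letter case or abbreviation no longer count as mismatches, so only genuinely wrong answers are reported.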
A brief recap:
From May 21, 2021, to June 11, 2021, none of NaokiLH's commits related to the Shenyang station could have been produced unless the problems had been leaked. The committed code corresponds to some of the problems from the Shenyang Regional Contest (July 2021) or to some of the problems scrapped from the official contest. Some of the scrapped problems first appeared only in the August 2021 NewCoder multi-school contest, and some, like Jailbreak, have not appeared to this day.
After June 11, NaokiLH continued to study LeetCode and regularly checked in with AcWing. Until the Shenyang Regional Contest offline competition on July 18, no other suspicious commits appeared.
On July 18, the 2020 ICPC Shenyang Regional Contest offline competition officially began. Team TS 1 ultimately won a silver medal. A second wave of public controversy began.