Hi Codeforces, I was curious about how this platform works, so I implemented my own version from scratch. It has all the essentials of an online judge, and I also added extra features such as collaborative coding and group voice chat for up to 10 users.
Users can scrape problems from Codeforces or create their own problems.
This is how it looks:
You can check it out here
and feel free to ask anything. It was all fueled by curiosity :)
Update: I recently pushed another version. It includes a complete refactor of the backend and improves several things, such as the handling of concurrent submissions, and problem parsing has also become much faster with caching. Do check it out! :)
Update 1: I'm planning to add more features to the system, feel free to give any recommendations.
So Cool!! One of the best projects I have ever seen!!
If you are using the free tier on AWS to host your website, keep in mind that Amazon won't notify you when you cross the free usage limit on EC2 or other instances.
It once happened to me and I got a bill of $1000 (which is a lot of money in my country).
You have built a nice website btw. Very impressive.
They have not hosted it. It's on localhost
He mentioned an EC2 instance somewhere. I read it on his GitHub page.
bro is hosting it from his pc?
looks dumb
How do you get the full test case data for a problem? AFAIK, there is no convenient way (yet) to get it.
I don't. The scraper only scrapes the data on the problem page (you can check out the scraper here); the main test cases need to be added manually.
I'm planning to add parsing of the main test cases in the future though (at least the ones which are accessible).
you can also write a generator which generates some testcases
you can get the answers for those testcases by running them against anyone's accepted submission on cf using scraping ig
I had a similar idea but was daunted by the effort to reward ratio for implementing that.
well in the future if you decide to make problems, u will need a generator to make the testcases, right? Of course it isn't possible to manually enter up to 1e5 numbers
and making a generator is only 1 day's hard work, then u can just adjust the constraints, press run and then sit back and relax. all testcases are made automatically
There seem to be too many corner cases, would you like to contribute to it?
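A minimal sketch of the kind of generator described above, assuming a made-up input format (one line with n, then one line with n integers). The function name, the limits and the format are illustrative only and are not taken from the project.

```python
import random


def generate_case(n_max: int, value_max: int, seed: int) -> str:
    """Build one random test case: a line with n, then a line with n integers."""
    rng = random.Random(seed)  # seeded, so the same case can be regenerated later
    n = rng.randint(1, n_max)
    values = [rng.randint(1, value_max) for _ in range(n)]
    return f"{n}\n{' '.join(map(str, values))}\n"


if __name__ == "__main__":
    # Adjust the constraints here, then write each case to its own file.
    for index in range(3):
        case = generate_case(n_max=10**5, value_max=10**9, seed=index)
        with open(f"test_{index}.in", "w") as handle:
            handle.write(case)
```

Seeding each case separately keeps the test set reproducible, which helps when a particular case exposes a bug and has to be replayed.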
What if someone submits a malicious code to corrupt the backend?
Apologies for the late reply, I was caught up in something. Every compilation request triggers the creation of a Docker container running Alpine Linux (one of the most lightweight distros out there, with the container requiring no more than 8 MB). Other than that, each container has nothing in it but the compiler and the code to be executed. It compiles, sends out the output and gets destroyed, all within about 2 seconds. So even if someone sends in code that intends to modify the backend files, they can't, because the container does not have access to them.
Even if they try to do the C++ equivalent of "sudo rm -rf /*", it won't matter, because each container is created fresh on demand and has nothing to do with the last thing it compiled. You should read more about Docker.
tldr: The containers are an isolated runtime environment which only has access to the submitted code. They can't even access the internet, so one can cause any mayhem they want in there and it won't matter.
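As a rough illustration of the setup described above, here is a sketch that assembles a `docker run` command for a throwaway container with no network access. It is not the project's actual configuration: the function name, the `/sandbox` mount point, the `run.sh` script name and the resource limits are all assumptions. Only the Docker CLI flags themselves (`--rm`, `--network none`, `--memory`, `--pids-limit`, `--cap-drop`, `--volume`) are real.

```python
from typing import List


def build_run_command(code_dir: str, image: str = "alpine",
                      memory: str = "256m", max_pids: int = 64) -> List[str]:
    """Return the argument list for a one-shot, network-less container."""
    return [
        "docker", "run",
        "--rm",                              # destroy the container when it exits
        "--network", "none",                 # no internet access from inside
        "--memory", memory,                  # cap memory usage (assumed limit)
        "--pids-limit", str(max_pids),       # blunt fork bombs (assumed limit)
        "--cap-drop", "ALL",                 # drop Linux capabilities
        "--volume", f"{code_dir}:/sandbox:ro",  # submission mounted read-only
        image,
        "sh", "/sandbox/run.sh",             # hypothetical entry script
    ]


if __name__ == "__main__":
    print(" ".join(build_run_command("/tmp/submission-1")))
```

The list can be passed to `subprocess.run` on the host. Mounting the submission read-only and dropping capabilities are extra precautions added in this sketch, not something the thread says the project does.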
While Docker containers are definitely far more secure than they used to be a few years ago, I still wouldn't consider it a replacement for a proper sandbox when executing untrusted code.
Upon more research, it seems that kernel-level exploits might still be an issue, as the kernel is shared with the main backend. At the same time my risk profile isn't that high; I think keeping the systems updated should protect against most well-known vulnerabilities, and kernel exploits seem more like a black swan scenario.
I'll still consider adding complete virtualization though at some point in the future.
Can you provide more resources to learn about docker
You can check this out
we can identify those words and not run the code in the first place maybe
This won't be reliable, as languages keep changing and future versions may introduce new ways to do the same thing, so doing this would be like putting band-aids on a water leak.
how do you handle security aspects
hey, you can check my above reply to the comment if you're wondering about malicious code injection attacks, here are some other things:
As shown in the architecture in the README, each compilation request to the backend has a rate limiter attached: each IP address is limited to 100 submissions per 15 minutes. And like I said in the previous reply, the submitted code never touches the actual backend, only a Docker container.
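To make the per-IP limit concrete, here is a minimal in-memory sliding-window limiter using the numbers from the comment (100 requests per 15 minutes). The class name and the in-memory storage are assumptions for illustration; the thread does not say how the project's limiter is implemented.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key (IP)."""

    def __init__(self, limit: int = 100, window_seconds: float = 15 * 60):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        timestamps = self.hits[ip]
        # Forget requests that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True
```

A request handler would call `allow(client_ip)` before queueing a submission and reject the request when it returns `False`.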
Each submission also has a time limit of 2 seconds. If it doesn't finish executing within that time, the container is terminated and a TLE status is sent, so that it doesn't keep running indefinitely.
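The time limit can be sketched with a plain subprocess standing in for the container. This is only an approximation of the behaviour described above: the function name and the "OK"/"RE" status strings are hypothetical, and the real system terminates a Docker container rather than a local process.

```python
import subprocess
import sys
from typing import List, Tuple


def run_with_time_limit(command: List[str], stdin_text: str,
                        time_limit: float = 2.0) -> Tuple[str, str]:
    """Run `command`, returning (status, stdout); status is OK, RE or TLE."""
    try:
        result = subprocess.run(
            command,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=time_limit,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "TLE", ""
    if result.returncode != 0:
        return "RE", result.stdout
    return "OK", result.stdout


if __name__ == "__main__":
    # An infinite loop is cut off after the limit instead of running forever.
    print(run_with_time_limit([sys.executable, "-c", "while True: pass"], "", 0.5))
```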
Basically I'm running this script inside the docker container to evaluate submissions.
Feel free to ask anything :)
:))
:(( , on second thought, someone can also modify the expected output to match their output, I'll patch this soon
He's talking about remote code execution: if you haven't isolated runtime access, the code will be able to run arbitrary functions in the OS, maybe exploiting your servers for free data storage, for instance.
No he's not. Have you read any of the replies I wrote to the above comments?
Hey there, I recently shipped a new version of the submissions system here. Can you please try to hack this one as well? I tried to cover all the sneaky ways and implement proper sandboxing.
Some highlights:
:))
I think now you need someone more experienced to break into this system. :)
Only one thing comes to mind: compiling probably isn't really that safe. At the very least you should timeout it (since you can write pretty much arbitrarily complex preprocessor macros), but I'm not sure, maybe it is possible to do worse things with it (especially since you are compiling as root).
If you want to look into sandboxing even more, here is a repo on how IOI does it. There's also a bunch of papers linked in the README about other considerations they took into account.
What does this thing do?
Hey there, it gets a random string of length 69 from urandom.
The script then sets this as the root password so that no one can get into the root account.
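For illustration, the same idea in Python: a 69-character password drawn from the operating system's random source (the `secrets` module reads from the same source as `os.urandom`). The function name and the alphanumeric alphabet are assumptions; the project's script is a shell script and is not reproduced here.

```python
import secrets
import string


def random_password(length: int = 69) -> str:
    """Return a random alphanumeric string backed by the OS entropy source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # The result would then be set as the root password and never stored.
    print(len(random_password()))
```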
By the way, if you are using diff to compare outputs, at least use diff --ignore-trailing-space --strip-trailing-cr. (That is, if you want to keep things simple and not introduce custom checkers.)
Also --brief in order not to make big files for no reason. But it's best not to use diff anyway for the quadratic complexity.
nobody asked
Which tool did you use to create that high-level system design?
You can do the same with LucidChart!
I used draw.io
Auto comment: topic has been updated by nubskr (previous revision, new revision, compare).
Nice one. I was also trying to build something like this. While doing so I heard about Linux Containers. They are lightweight and faster to boot than Docker, but harder to implement. You can refer to https://youtu.be/SD4KgwdjmdI?si=DVDdtuBRFFh5z0YX.
Correct me if I am wrong. Also, please share the resources you used as reference.
Hey, I checked it out and it seems to work in a similar way to mine, but better.
For starters, it also uses Docker as a base and then runs a custom script inside the container to execute code. Mine and the one used there are similar in various ways (the latter being a lot more mature), but mine also determines the verdict inside the container, which was a stupid idea and easily exploitable, as someone mentioned above.
One fix that I'm planning to add is to make the container just emit the program's output and then compare that with the expected output outside the container. That way the container won't need access to the verdict or the expected output file, making it essentially bulletproof against those attacks.
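A small sketch of what such a host-side comparison could look like. It ignores trailing spaces, carriage returns and trailing blank lines, roughly what `diff --ignore-trailing-space --strip-trailing-cr` tolerates, and runs in time linear in the output size. The function names and the "AC"/"WA" labels are assumptions, not the project's code.

```python
from typing import List


def normalize(text: str) -> List[str]:
    """Split into lines, dropping trailing whitespace and trailing blank lines."""
    lines = [line.rstrip() for line in text.splitlines()]
    while lines and lines[-1] == "":
        lines.pop()
    return lines


def verdict(actual: str, expected: str) -> str:
    """Compare on the host, so the expected output never enters the container."""
    return "AC" if normalize(actual) == normalize(expected) else "WA"


if __name__ == "__main__":
    print(verdict("1 2 \r\n3\n\n", "1 2\n3\n"))
```

Because only the program's raw output leaves the container, a submission has no expected-output file to read or overwrite.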
You could host it on vercel if you want free hosting.
Hi, the backend uses the Docker API, Redis and a file system, which can't be hosted on just any host; this needs a proper VM.
Damn bro nice one. BTW which year are u in?
he's chinese
Ah yes, the Chinese year.
This is probably more reliable and secure than actual codeforces :icant:
Looks cool
Ok
Ok
When are you adding support for other languages? I would like to contribute if possible.
Hi, apologies for the late reply. At this point the only thing needed to add a language is its compiler, so do let me know if you have any lightweight Docker images for the language you need.
Also, you can contribute here :)
for security, you can see judge0
Doesn't GitHub already have something called Codespaces?
and i am batman
Kudos on the completion of the project !!!
I saw that you have been working on this since last year's October, demmmmm!!!!
UPD: My bad I just saw you completed it 5 months ago but still this is very impressive