
How we designed a code evaluation pipeline using Judge0 with Golang, Echo & Redis.

From Pit Lane to Production: How we designed our backend to survive the CookOff Weekend | Part two


Race Day Madness: The Submission Pipeline and Judge0 Pit Stops.

Welcome back to the pit lane.
Previously, we designed the backend and confidently assumed the hard part was over. It wasn't. Because buried deep inside this system lived a group of routes so important that if they broke, the entire application would fall over like Senna's Imola crash. This part of the series is dedicated to those routes, how they work, and why they exist.

So, buckle up and let's dive into part two of

"From Pit-Lane to Production".

How Submissions Were Processed and Sent to Judge0

Imagine pressing Run on our platform. It felt like starting a fresh engine just to check whether it fires up or immediately bursts into flames. Behind that tiny button was a pit crew of backend logic that took your code, wrapped it in digital bubble wrap, and launched it straight to Judge0 at full speed.

Run

Run was basically Free Practice 1. No championship points. No pressure. Just vibes.
You clicked it, the backend fetched the sample test case from the database, built a shiny JSON payload that included your source code, test input, language ID, and expected output, and shipped it to Judge0. Judge0 then compared the output and returned a verdict. Could be Accepted. Could be Wrong Answer. Could be a spectacular Runtime Error fireball. The backend tossed that result straight back to the frontend, giving users instant joy or instant despair.

Run Custom

Run Custom was FP2 with experimental setups. The backend skipped stored test cases and used whatever weird input the user typed. Same payload, minus expected output, because at this point, nobody really cared what the car was supposed to do. Judge0 ran it, sent the raw output back, and users debugged until their sanity evaporated.

Submit

And then came Race Day.
This endpoint did not care about feelings. It checked JWT auth, made sure the user was not banned, verified that the round was active, and only then started the full submission routine like a formation lap.

The backend fetched every test case using a simple for loop. For each one, it created a Judge0 payload. Every payload was collected into a batch. Those batches were sent to Judge0 in one angry POST request. Judge0 replied with a bunch of tokens. Each token represented a test case. These tokens were dumped into Redis, and from this point on, the backend just prayed Judge0 behaved.

Redis

Redis was our FIA race control.
It tracked tokens like a clipboard guy with too much caffeine. Each token was saved as token:tokenID, mapping to the submission and test case IDs. Redis also kept a set per submission to track how many tokens were still pending. Every time a callback was processed, the token was deleted. When the set hit zero, we knew the submission was complete. Very elegant. Very illegal if used in Monaco.

Asynq and the Worker System

Once Judge0 sent back the results, they weren’t processed instantly on the same thread. That would have been too civilized. Instead, we threw the payload into an Asynq queue, which let dedicated workers handle the heavy lifting in the background.

The InitQueue function spun up an Asynq server and client, connecting to Redis as the job broker. Each callback from Judge0 created a new task of type submission:process. The worker, sitting patiently (or not), picked up these tasks and started unwrapping the payloads.

Inside ProcessJudge0CallbackTask, the worker parsed the JSON payload, fetched the submission and test case IDs from Redis, and then mapped Judge0's numeric status codes into readable verdicts, from "success" and "wrong answer" to the ever-popular "runtime error". It then logged the runtime, memory, and status into the database using SQLC queries.

To keep things clean, the worker also deleted the processed token from Redis. It then checked whether any tokens remained for that submission. If none did, it called UpdateSubmission, which aggregated all the test case results, counted passed and failed cases, and updated the user’s total score.

After this chaotic drive, every submission found its destiny. Some roared to the top of the leaderboard like Verstappen on a dry track. Others retired with a spectacular "Compilation Error" on the first corner.

The Bigger Picture

When you zoom out, the whole system looked like a perfectly organized Grand Prix weekend.

Redis was basically the race director shouting orders from the control box. It kept track of every token and every pending test case.

Asynq was the pit crew on Red Bull. The moment a token was picked up from the queue, the workers leaped on it like mechanics changing tyres in two seconds, except instead of tyres, they were wrestling with JSON payloads, status codes, and database queries.

Judge0 was the engine supplier that could either deliver championship-level performance or explode in a ball of errors without warning.

And the backend itself? That was our Toto. Too many responsibilities, too little sleep, and constantly pretending everything was under control. One eye on the scoreboard, one eye on Redis, quietly praying to the DigitalOcean gods.

In the end, the submission pipeline didn't explode, and there were no surprise DNFs. The Ferrari-inspired engineers had built a backend that did its job, lap after lap, quietly and reliably. And honestly, it might be the fastest thing we built all season.

Coming up next: a deep dive into how the clock and the admin routes held the entire system together.