Ruslan Kovalev velavokr

Analysis of the code behind the agent that took first place in the BitGN PAC1 blind run

(translation of the original Telegram post here)

So, Operation Pangolin took first place in the blind run on the Accuracy Leaderboard (tied with codex-on-rails).

What is under the hood? It is not so much a chatbot agent as a compact programmable analyst with a strict checklist and a REPL loop.

The core is written in TypeScript. It calls Anthropic Claude (Sonnet for debugging, Opus for the competition). Notably, the LLM is not given a large set of tools, only a single one: execute_code. The LLM generates Python code, and that code gets access to the runtime tools through the Workspace class, as well as to memory (the scratchpad) and a dictionary of variables. The results of each run are passed back to Claude. The loop repeats until the code submits an answer through ws.answer(scratchpad, verify) and that answer passes the built-in verification.
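To make the loop concrete, here is a minimal Python sketch of the single-tool pattern described above. It is not the agent's actual code: the real core is TypeScript and calls Claude, whereas here a scripted function stands in for the model. Only the names execute_code, Workspace, scratchpad and ws.answer(scratchpad, verify) come from the post. Everything else (run_agent, scripted_model, the shape of verify, the "answer" key in the scratchpad) is an assumption made for illustration.

```python
import contextlib
import io


class Workspace:
    """Hypothetical stand-in for the runtime tools exposed to generated code."""

    def __init__(self):
        self.final_answer = None

    def answer(self, scratchpad, verify):
        # Assumption: verify is a callable that checks the scratchpad.
        # The answer is accepted only if that check passes.
        if not verify(scratchpad):
            raise ValueError("verification failed")
        self.final_answer = scratchpad["answer"]


def execute_code(code, namespace):
    """The single tool: run generated Python, return what it printed or raised."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as exc:
        # Errors are returned as text so the model can react to them.
        buffer.write(f"{type(exc).__name__}: {exc}")
    return buffer.getvalue()


def run_agent(generate_code, max_steps=5):
    """Alternate between code generation and execution until an answer is accepted."""
    ws = Workspace()
    # State that persists between steps: tools, memory, and variables.
    namespace = {"ws": ws, "scratchpad": {}, "variables": {}}
    observation = ""
    for _ in range(max_steps):
        code = generate_code(observation)
        observation = execute_code(code, namespace)
        if ws.final_answer is not None:
            return ws.final_answer
    return None


def scripted_model(observation):
    """Stands in for the LLM call: returns code based on the last observation."""
    if observation == "":
        return "variables['total'] = 2 + 3\nprint(variables['total'])"
    return (
        "scratchpad['answer'] = variables['total']\n"
        "ws.answer(scratchpad, lambda s: s['answer'] == 5)"
    )


result = run_agent(scripted_model)
print(result)
```

In this sketch the first step computes a value and prints it, the printed output becomes the next observation, and the second step submits the answer. A failed verification raises an error that is returned to the model as text, so the loop continues instead of terminating with an unverified answer.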

The solution works very

	#include <assert.h>
	#include <fcntl.h>
	#include <malloc.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	#include <sys/stat.h>

	#include <zdict.h>