Most MCP servers just wrap CRUD JSON APIs into tools — I did it too with scim-mcp and garmin-mcp-app. It works, until you realize a tool call dumps 50KB+ into context.
MCP isn't dead — but we need to design MCP tools with the context window in mind.
That's what code mode does. The LLM writes a small script, the server runs it in a sandbox against the raw data, and only the script's compact output enters context.
Inspired by Cloudflare's Code Mode, but using a local sandboxed runtime instead of a remote one — no external dependencies, isolated from filesystem and network by default.
Works best with well-known APIs (SCIM, Kubernetes, GitHub, Stripe, Slack, AWS) because LLMs already know the schemas — they write the extraction script in one shot.
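To make the savings concrete, here's a hypothetical before/after in Python: a tiny SCIM-style user list standing in for a 50KB+ response, and the kind of extraction script the LLM would write. The payload and field names are invented for illustration.

```python
import json

# Hypothetical SCIM-style response: in practice this is 50KB+ of users,
# each with schemas, metadata, and nested attributes.
raw_response = json.dumps({
    "totalResults": 2,
    "Resources": [
        {"id": "u1", "userName": "ada", "active": True,
         "emails": [{"value": "ada@example.com", "primary": True}],
         "meta": {"created": "2024-01-01T00:00:00Z", "resourceType": "User"}},
        {"id": "u2", "userName": "grace", "active": False,
         "emails": [{"value": "grace@example.com", "primary": True}],
         "meta": {"created": "2024-02-01T00:00:00Z", "resourceType": "User"}},
    ],
})

# The script the LLM writes: extract only what the task actually needs.
data = json.loads(raw_response)
active = [u["userName"] for u in data["Resources"] if u["active"]]
summary = json.dumps({"activeUsers": active, "total": data["totalResults"]})

print(summary)  # only this line enters the context window
print(len(raw_response), "->", len(summary), "bytes")
```

Only `summary` reaches the model; the raw response never leaves the server.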
Copy-paste this into any AI agent inside your MCP server project:
```
Add a "code mode" tool to this MCP server. Code mode lets the LLM write a processing
script that runs against large API responses in a sandboxed runtime — only the script's
stdout enters context instead of the full response.

Steps:
1. Read the codebase. Identify which tools return large responses.
2. Pick a sandbox isolated from filesystem and network by default:
   - TypeScript/JS: `quickjs-emscripten`
   - Python: `RestrictedPython`
   - Go: `goja`
   - Rust: `boa_engine`
3. Create an executor that injects `DATA` (the raw response as a string) into the
   sandbox, runs the script, and captures stdout.
4. Create a code mode MCP tool accepting `command`, `code`, and an optional `language`.
5. Create a benchmark comparing before/after context sizes across realistic scenarios.

Walk me through your plan before implementing. Confirm each step.
```
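The executor in step 3 can be sketched in a few lines. This uses Python's `exec` with stripped-down builtins as a simplified stand-in for `RestrictedPython` (which compiles and validates the code first); `run_code_mode` and `SAFE_BUILTINS` are my names, and this sketch is not a production-grade sandbox.

```python
import io
import json
import contextlib

# Minimal whitelist of builtins exposed to the LLM-written script.
SAFE_BUILTINS = {
    "len": len, "range": range, "print": print, "sorted": sorted,
    "min": min, "max": max, "sum": sum, "str": str, "int": int,
    "float": float, "list": list, "dict": dict, "set": set,
}

def run_code_mode(code: str, raw_response: str) -> str:
    """Run an LLM-written script against DATA and return only its stdout."""
    env = {"__builtins__": SAFE_BUILTINS, "DATA": raw_response, "json": json}
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, env)  # RestrictedPython would compile_restricted() first
    return buf.getvalue()

# Usage: the tool handler passes the large response plus the model's script.
script = 'print(len(json.loads(DATA)["items"]))'
out = run_code_mode(script, json.dumps({"items": [1, 2, 3]}))
```

The MCP tool handler returns `out` as the tool result, so the context only ever sees the script's stdout.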
If you prefer an interactive planning experience with detailed sandbox comparisons and benchmark templates, install the full agent skill:
https://github.com/chenhunghan/code-mode-skill