@johnlindquist
Created December 10, 2025 06:33

Code-Mode Migration: Architecture Comparison

This document outlines the architectural shift from the current "Lootbox/RPC" model to the new "Code-Mode/UTCP" paradigm.

1. High-Level Architecture

Before: Sequential RPC (Lootbox/Standard MCP)

The LLM acts as a "Router". It decides on one tool, waits for the result, then decides the next step. Every single step incurs a round-trip latency cost and a token cost, because the model re-reads the growing history before each call.

sequenceDiagram
    participant LLM
    participant Server
    participant Tool

    LLM->>Server: Call Tool A (args)
    Server->>Tool: Execute A
    Tool-->>Server: Result A
    Server-->>LLM: Result A (Text)
    Note over LLM: "Thinking..." (Context Window Fill)
    LLM->>Server: Call Tool B (args)
    Server->>Tool: Execute B
    Tool-->>Server: Result B
    Server-->>LLM: Result B (Text)
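
Concretely, the orchestration behind this diagram is a loop: call the model, execute the single tool it picked, append the result to the history, repeat. The sketch below is illustrative only, assuming an OpenAI-style function-calling shape; callModel, executeTool, and the message types are placeholder names, not Lootbox's actual API.

// Minimal sketch of the sequential RPC loop (assumed message shapes,
// not Lootbox's real wire format).
type ToolCall = { name: string; args: Record<string, unknown> };
type Message =
  | { role: "user" | "assistant" | "tool"; content: string }
  | { role: "assistant"; toolCall: ToolCall };

async function runSequential(
  history: Message[],
  callModel: (h: Message[]) => Promise<Message>,   // one inference per step
  executeTool: (c: ToolCall) => Promise<string>,   // one round-trip per step
): Promise<string> {
  while (true) {
    // The full (growing) history is re-sent on every iteration.
    const reply = await callModel(history);
    history.push(reply);

    if (!("toolCall" in reply)) {
      return reply.content; // Plain-text reply: the model is done.
    }

    // The tool result is serialized back into the context as text.
    const result = await executeTool(reply.toolCall);
    history.push({ role: "tool", content: result });
  }
}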

After: Batched Execution (Code-Mode)

The LLM acts as a "Programmer". It writes a complete script to solve the problem. The server executes this script in a sandbox, with tools available as native functions.

sequenceDiagram
    participant LLM
    participant Sandbox
    participant Tool

    LLM->>Sandbox: Send Script (TypeScript)
    Note over Sandbox: Execute Script
    Sandbox->>Tool: Call Tool A
    Tool-->>Sandbox: Result A (Object)
    Note over Sandbox: Logic / Loop / Filter
    Sandbox->>Tool: Call Tool B
    Tool-->>Sandbox: Result B (Object)
    Sandbox-->>LLM: Final Result (JSON)
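
One way to picture the server side of this flow: receive the script, expose the tools on a cm object, run the script, and send back only its return value. The sketch below is an assumption-heavy illustration; a real deployment would run the script in a proper isolate or sandbox rather than via AsyncFunction, and the cm shape shown is not the actual UTCP runtime.

// Illustrative only: binds tools as native async functions and executes
// the LLM's script body. Not a real sandbox; isolation is assumed to
// happen elsewhere (VM, isolate, container, etc.).
type ToolFn = (args: Record<string, unknown>) => Promise<unknown>;

const AsyncFunction = Object.getPrototypeOf(async function () {})
  .constructor as new (...params: string[]) => (cm: unknown) => Promise<unknown>;

async function runScript(
  script: string,                  // script body written by the LLM
  tools: Record<string, ToolFn>,   // tools injected as native functions
): Promise<unknown> {
  const cm = { tools };
  // Wrap the script so top-level `await` and `return` work and `cm` is in scope.
  const run = new AsyncFunction("cm", script);
  // Only the final return value travels back to the LLM; intermediate
  // objects stay inside the execution environment.
  return run(cm);
}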

2. Code Example: "Find and Analyze Errors"

Scenario: Search logs for errors and summarize them.

Before (Current State)

Requires 3+ LLM turns; a rough transcript sketch follows the list below.

  1. User: "Find errors in logs."
  2. LLM: Calls search_logs("error")
  3. Tool: Returns list of 50 log lines.
  4. LLM: Reads 50 lines... "Okay, I see these. I'll summarize the database ones."
  5. LLM: Calls summarize("db connection failed...")

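For contrast, those turns land in the conversation roughly like this. The message shape below is an OpenAI-style assumption, not Lootbox's exact format, and the placeholders stand in for real data.

// Rough transcript of the turns above (assumed shape, placeholder data).
const transcript = [
  { role: "user", content: "Find errors in logs." },
  // Turn 1: the model picks a tool and stops to wait.
  { role: "assistant", toolCall: { name: "search_logs", args: { query: "error" } } },
  // All 50 raw log lines enter the context window as text and are
  // tokenized and re-read on the next inference.
  { role: "tool", content: "...50 raw log lines..." },
  // Turn 2: the model reads them, then issues the next call.
  { role: "assistant", toolCall: { name: "summarize", args: { content: "db connection failed..." } } },
  { role: "tool", content: "...summary text..." },
  // Turn 3: the model finally answers the user.
  { role: "assistant", content: "Here is a summary of the database errors: ..." },
];
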
After (New Code-Mode State)

Requires 1 LLM turn.

The LLM writes and executes this once:

// The LLM writes this script:
const logs = await cm.tools.search_logs({ query: "error" });

// Filter locally in the sandbox (no token cost, arbitrary logic)
const dbErrors = logs.filter(l => l.includes("database"));

if (dbErrors.length > 0) {
    // Pass data directly to next tool
    const summary = await cm.ai.summarize({ content: dbErrors.join("\n") });
    return summary;
} else {
    return "No database errors found.";
}
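
For the script above to stay this simple, the sandbox has to expose the tools with TypeScript types. A hypothetical declaration for the cm surface used above might look like the following; the interface names and return types are assumptions, and in practice they would be generated from each tool's schema.

// Hypothetical typings for the `cm` object; names and shapes are assumed.
interface SearchLogsArgs { query: string }
interface SummarizeArgs { content: string }

declare const cm: {
  tools: {
    // Each tool schema surfaces as an ordinary typed async function.
    search_logs(args: SearchLogsArgs): Promise<string[]>;
  };
  ai: {
    summarize(args: SummarizeArgs): Promise<string>;
  };
};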

3. Key Differences Table

| Feature | Current (RPC/Lootbox) | New (Code-Mode) |
| --- | --- | --- |
| Logic Location | Inside the LLM (Prompt Engineering) | Inside the Script (Code) |
| Data Processing | String manipulation via LLM | Native JS Arrays/Objects |
| Context Usage | High (intermediate data is tokenized) | Low (only the final result is returned) |
| Latency | High (sum of all tool RTTs + inference) | Low (single inference + fast execution) |
| Tool Integration | Complex (JSON Schema parsing) | Simple (Native TS Interfaces) |
| CLI Tools | Wrapped individually | Imported as global objects (e.g., cass.search) |

4. Why this matters for cass & beads

Your existing tools (cass, beads) wrap powerful CLIs.

  • Currently: To use cass, the LLM must read and interpret the CLI's raw text output.
  • New Way: The LLM receives typed objects from cass and can map, reduce, or filter them with standard TypeScript before presenting anything to you (see the sketch below).
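
A sketch of what that could look like. The cass.search signature and result fields here are illustrative assumptions, not the real CLI wrapper's API.

// Illustrative only: assumed result shape and search signature for cass.
interface CassHit {
  file: string;
  line: number;
  text: string;
  score: number;
}

declare const cass: {
  search(query: string): Promise<CassHit[]>;
};

// Shape the typed results in the script; only the short final list
// ever reaches the model's context window.
const hits = await cass.search("connection retry");
const topFiles = [...new Set(
  hits
    .filter(h => h.score > 0.8)
    .sort((a, b) => b.score - a.score)
    .map(h => h.file),
)].slice(0, 5);

return topFiles;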