@clonn
Last active January 29, 2026 05:16
clawdbot system flow analytics

Clawdbot System Architecture Analysis

This document gives a detailed breakdown of Clawdbot's Prompt Flow, System Prompt, framework, tool system, overall flow, and its definition of task completion.


1. Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                        User Message Input                       │
│  (WhatsApp / Telegram / Discord / Slack / Signal / Web...)      │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Channel Adapter                         │
│  - Message format normalization                                 │
│  - Pairing / Allowlist verification                             │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Message Routing                         │
│  - Parse channel + account + peer                               │
│  - Determine the target Agent ID                                │
│  - Generate the Session Key                                     │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Auto-Reply Pipeline                       │
│  - Media Understanding (image/audio/video)                      │
│  - Link Understanding (URL content extraction)                  │
│  - Command Authorization                                        │
│  - Directive Extraction (/think, /verbose, etc.)                │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Pi Embedded Agent Runner                     │
│  - Build the System Prompt                                      │
│  - Load Tools + Skills                                          │
│  - Call the LLM (Anthropic/OpenAI/Bedrock...)                   │
│  - Handle Tool Calls                                            │
│  - Auth Profile Rotation + Failover                             │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Response Delivery                        │
│  - Format output (Markdown/Plain)                               │
│  - Return to the user via the original channel                  │
└─────────────────────────────────────────────────────────────────┘

2. System Prompt Structure in Detail

The System Prompt is assembled dynamically by the buildAgentSystemPrompt() function in src/agents/system-prompt.ts.

2.1 Core Identity Declaration

You are a personal assistant running inside Clawdbot.

2.2 Tooling Block

Lists every available tool with a brief description:

| Tool | Description |
| --- | --- |
| read | Read file contents |
| write | Create or overwrite files |
| edit | Make precise edits to files |
| apply_patch | Apply multi-file patches |
| grep | Search file contents for patterns |
| find | Find files by glob pattern |
| ls | List directory contents |
| exec | Run shell commands (pty available for TTY-required CLIs) |
| process | Manage background exec sessions |
| web_search | Search the web (Brave API) |
| web_fetch | Fetch and extract readable content from a URL |
| browser | Control web browser |
| canvas | Present/eval/snapshot the Canvas |
| nodes | List/describe/notify/camera/screen on paired nodes |
| cron | Manage cron jobs and wake events (use for reminders) |
| message | Send messages and channel actions |
| gateway | Restart, apply config, or run updates on Clawdbot |
| agents_list | List agent ids allowed for sessions_spawn |
| sessions_list | List other sessions (incl. sub-agents) |
| sessions_history | Fetch history for another session/sub-agent |
| sessions_send | Send a message to another session/sub-agent |
| sessions_spawn | Spawn a sub-agent session |
| session_status | Show status card (usage + time + settings) |
| image | Analyze an image with the configured image model |

2.3 Skills Block

## Skills (mandatory)
Before replying: scan <available_skills> <description> entries.
- If exactly one skill clearly applies: read its SKILL.md at <location> with `read`, then follow it.
- If multiple could apply: choose the most specific one, then read/follow it.
- If none clearly apply: do not read any SKILL.md.
Constraints: never read more than one skill up front; only read after selecting.

2.4 Memory Recall Block

## Memory Recall
Before answering anything about prior work, decisions, dates, people, preferences, or todos:
run memory_search on MEMORY.md + memory/*.md; then use memory_get to pull only the needed lines.
If low confidence after search, say you checked.

2.5 Messaging Block

## Messaging
- Reply in current session → automatically routes to the source channel
- Cross-session messaging → use sessions_send(sessionKey, message)
- Never use exec/curl for provider messaging; Clawdbot handles all routing internally.

### message tool
- Use `message` for proactive sends + channel actions (polls, reactions, etc.).
- For `action=send`, include `to` and `message`.
- If multiple channels are configured, pass `channel` (whatsapp|telegram|discord|...).
- If you use `message` to deliver your user-visible reply, respond with ONLY: [[silent]]

2.6 Runtime Block

Runtime information is injected dynamically:

Runtime: agent=default | host=my-mac | repo=/path/to/repo | os=darwin (arm64) |
         node=22.x | model=claude-opus-4-5 | default_model=claude-opus-4-5 |
         channel=telegram | capabilities=inlineButtons | thinking=off
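As a rough illustration of how a line like this could be assembled, the sketch below joins key=value pairs with " | ". The RuntimeInfo shape, the RUNTIME_KEYS list, and the formatRuntimeLine name are assumptions made for this example (covering only a subset of the keys shown above), not Clawdbot's actual code.

```typescript
// Hypothetical shape: field names mirror some of the keys in the Runtime line above.
interface RuntimeInfo {
  agent: string;
  host: string;
  os: string;
  model: string;
  channel: string;
  thinking: string;
}

// Fixed key order so the rendered line is stable between runs.
const RUNTIME_KEYS: (keyof RuntimeInfo)[] = [
  "agent", "host", "os", "model", "channel", "thinking",
];

// Join key=value pairs with " | ", skipping keys whose value is empty.
function formatRuntimeLine(info: RuntimeInfo): string {
  const parts = RUNTIME_KEYS
    .filter((key) => info[key] !== "")
    .map((key) => `${key}=${info[key]}`);
  return `Runtime: ${parts.join(" | ")}`;
}
```

Keeping the key order fixed means the same configuration always renders the same line, which makes prompt diffs easier to read.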

2.7 Silent Replies

## Silent Replies
When you have nothing to say, respond with ONLY: [[silent]]

Rules:
- It must be your ENTIRE message - nothing else
- Never append it to an actual response
- Never wrap it in markdown or code blocks
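These rules suggest that the receiving side only needs an exact-match check. The sketch below shows one way such a check might look; the isSilentReply name and the decision to ignore surrounding whitespace are assumptions, since the document does not show how Clawdbot detects the token.

```typescript
const SILENT_TOKEN = "[[silent]]";

// True only when the whole reply (ignoring surrounding whitespace) is the token.
// A token appended to real text, or wrapped in backticks, does not count.
function isSilentReply(reply: string): boolean {
  return reply.trim() === SILENT_TOKEN;
}
```

An exact match, rather than a substring search, is what makes the "never append it to an actual response" rule matter: a reply that merely contains the token would be delivered as ordinary text.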

2.8 Heartbeats

## Heartbeats
Heartbeat prompt: (configured)
If you receive a heartbeat poll (a user message matching the heartbeat prompt above),
and there is nothing that needs attention, reply exactly: HEARTBEAT_OK

2.9 Project Context

Context files from the workspace are loaded dynamically:

  • SOUL.md - persona/tone settings
  • TOOLS.md - tool usage guide
  • MEMORY.md - memory store
  • Other custom context files

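3. Message Processing Flow

3.1 Message Entry

Location: each channel's message-handler.ts

// Using Telegram as an example
// src/telegram/bot-message-dispatch.ts

1. The bot receives a message
2. Parse the message content into a MsgContext structure
3. Check Pairing/Allowlist permissions
4. Route to the correct Agent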

3.2 Route Resolution

Location: src/routing/resolve-route.ts

export function resolveAgentRoute(input: ResolveAgentRouteInput): ResolvedAgentRoute {
  // 1. Normalize the input
  const channel = normalizeToken(input.channel);
  const accountId = normalizeAccountId(input.accountId);
  const peer = input.peer ? { kind: input.peer.kind, id: normalizeId(input.peer.id) } : null;

  // 2. Look up the binding configuration
  const bindings = listBindings(input.cfg).filter((binding) => {
    if (!matchesChannel(binding.match, channel)) return false;
    return matchesAccountId(binding.match?.accountId, accountId);
  });

  // 3. Match in priority order
  //    Priority: peer > guild > team > account > channel > default
  if (peer) {
    const peerMatch = bindings.find((b) => matchesPeer(b.match, peer));
    if (peerMatch) return choose(peerMatch.agentId, "binding.peer");
  }

  // guildId is assumed to be derived from the input; that step is elided in this excerpt
  if (guildId) {
    const guildMatch = bindings.find((b) => matchesGuild(b.match, guildId));
    if (guildMatch) return choose(guildMatch.agentId, "binding.guild");
  }

  // ... other matching logic

  // 4. Build the Session Key (falling back to the default agent)
  return choose(resolveDefaultAgentId(input.cfg), "default");
}

Session Key structure:

agent-{agentId}:{channel}:{peerKind}:{peerId}
Example: agent-default:telegram:dm:123456789
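To make the key layout concrete, here is a minimal sketch that builds and parses keys of this shape. The SessionKeyParts type and both function names are hypothetical, and the parser assumes that agentId, channel, and peerKind never contain a colon, which the document does not state.

```typescript
interface SessionKeyParts {
  agentId: string;
  channel: string;
  peerKind: string; // e.g. "dm"; the full set of kinds is not listed in this document
  peerId: string;
}

// Assemble a key in the agent-{agentId}:{channel}:{peerKind}:{peerId} layout.
function buildSessionKey(p: SessionKeyParts): string {
  return `agent-${p.agentId}:${p.channel}:${p.peerKind}:${p.peerId}`;
}

// Split a key back into its parts; returns null when the layout does not match.
// Only the last field (peerId) is allowed to contain ":" in this sketch.
function parseSessionKey(key: string): SessionKeyParts | null {
  const match = /^agent-([^:]+):([^:]+):([^:]+):(.+)$/.exec(key);
  if (!match) return null;
  return {
    agentId: match[1],
    channel: match[2],
    peerKind: match[3],
    peerId: match[4],
  };
}
```

Because the agent id comes first, every session belonging to one agent shares a common key prefix.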

3.3 Media Understanding

Location: src/media-understanding/apply.ts

export async function applyMediaUnderstanding(params) {
  // Processing order: image → audio → video
  const CAPABILITY_ORDER: MediaUnderstandingCapability[] = ["image", "audio", "video"];

  const tasks = CAPABILITY_ORDER.map((capability) => async () => {
    return await runCapability({
      capability,           // "image" | "audio" | "video"
      cfg,                  // configuration
      ctx,                  // message context
      attachments,          // media attachments
      activeModel,          // model currently in use
    });
  });

  const results = await runWithConcurrency(tasks, concurrency);

  // Result kinds:
  // - image → image.description (image description)
  // - audio → audio.transcription (speech-to-text)
  // - video → video.description (video description)

  // Inject the understanding results into ctx.Body
  if (outputs.length > 0) {
    ctx.Body = formatMediaUnderstandingBody({ body: ctx.Body, outputs });

    // Audio gets special handling: set the Transcript
    const audioOutputs = outputs.filter((output) => output.kind === "audio.transcription");
    if (audioOutputs.length > 0) {
      ctx.Transcript = formatAudioTranscripts(audioOutputs);
    }
  }

  return { outputs, decisions, appliedImage, appliedAudio, appliedVideo };
}

3.4 Agent Execution

Location: src/agents/pi-embedded-runner/run.ts

export async function runEmbeddedPiAgent(params): Promise<EmbeddedPiRunResult> {
  // 1. Resolve the Session Lane (concurrency control)
  const sessionLane = resolveSessionLane(params.sessionKey);
  const globalLane = resolveGlobalLane(params.lane);

  // 2. Resolve the model configuration
  const provider = params.provider ?? DEFAULT_PROVIDER;  // "anthropic"
  const modelId = params.model ?? DEFAULT_MODEL;          // "claude-opus-4-5"
  const { model, authStorage, modelRegistry } = resolveModel(provider, modelId, agentDir);

  // 3. Check the Context Window
  const ctxInfo = resolveContextWindowInfo({ cfg, provider, modelId });
  const ctxGuard = evaluateContextWindowGuard({ info: ctxInfo });
  if (ctxGuard.shouldBlock) {
    throw new FailoverError("Model context window too small");
  }

  // 4. Set up Auth Profile rotation
  const profileOrder = resolveAuthProfileOrder({ cfg, store: authStore, provider });

  // 5. Run the main loop
  while (true) {
    attemptedThinking.add(thinkLevel);

    // Run one attempt
    const attempt = await runEmbeddedAttempt({
      sessionId, sessionKey, messageChannel,
      config, prompt, images, tools,
      provider, modelId, model,
      thinkLevel, verboseLevel, reasoningLevel,
      // ... other parameters
    });

    const { aborted, promptError, timedOut, lastAssistant } = attempt;

    // Error handling
    if (promptError && !aborted) {
      // Context Overflow → automatic compaction
      if (isContextOverflowError(errorText) && !overflowCompactionAttempted) {
        const compactResult = await compactEmbeddedPiSessionDirect({ ... });
        if (compactResult.compacted) continue;
      }

      // Auth/Rate Limit Error → rotate the Profile
      if (isFailoverErrorMessage(errorText) && await advanceAuthProfile()) {
        continue;
      }

      // Thinking Level not supported → downgrade
      const fallbackThinking = pickFallbackThinkingLevel({ message: errorText });
      if (fallbackThinking) {
        thinkLevel = fallbackThinking;
        continue;
      }
    }

    // Success path
    const payloads = buildEmbeddedRunPayloads({ ... });

    // Mark the Profile as good
    if (lastProfileId) {
      await markAuthProfileGood({ store: authStore, profileId: lastProfileId });
    }

    return { payloads, meta: { durationMs, agentMeta, aborted } };
  }
}
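The excerpt above leans on helpers that are not shown, so the recovery order can be hard to follow. The sketch below is a simplified, synchronous model of the same three recovery steps (compact once, rotate the auth profile, downgrade the thinking level). Every name in it (RunState, AttemptResult, runWithRecovery, the error labels, the "high"/"low"/"off" levels) is invented for illustration and is not the real runner API.

```typescript
type AttemptError = "context_overflow" | "auth" | "rate_limit" | "thinking_unsupported";

interface AttemptResult {
  text?: string;
  error?: AttemptError;
}

interface RunState {
  profiles: string[];               // auth profiles in rotation order
  profileIndex: number;             // profile used by the next attempt
  thinkLevel: "high" | "low" | "off";
  compacted: boolean;               // compaction is attempted at most once
}

// Retry an attempt until it succeeds or no recovery step applies.
function runWithRecovery(
  state: RunState,
  attempt: (s: RunState) => AttemptResult,
  compact: () => boolean,
): string {
  while (true) {
    const result = attempt(state);
    if (result.error === undefined) return result.text ?? "";

    // Context overflow: try compaction once, then retry if it helped.
    if (result.error === "context_overflow" && !state.compacted) {
      state.compacted = true;
      if (compact()) continue;
    }
    // Auth or rate-limit failure: move to the next profile, if any remain.
    if ((result.error === "auth" || result.error === "rate_limit")
        && state.profileIndex + 1 < state.profiles.length) {
      state.profileIndex += 1;
      continue;
    }
    // Unsupported thinking level: step down one level and retry.
    if (result.error === "thinking_unsupported" && state.thinkLevel !== "off") {
      state.thinkLevel = state.thinkLevel === "high" ? "low" : "off";
      continue;
    }
    throw new Error(`run failed: ${result.error}`);
  }
}
```

Each recovery step consumes a finite resource (one compaction, a bounded list of profiles, a bounded ladder of thinking levels), so the loop in this sketch always terminates.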

4. Tool Selection

4.1 Tool Definition Structure

Location: src/agents/tools/*.ts

// Example: image-tool.ts
export function createImageTool(options): AnyAgentTool | null {
  return {
    label: "Image",
    name: "image",
    description: "Analyze an image with the configured image model. " +
                 "Provide a prompt and image path or URL.",
    parameters: Type.Object({
      prompt: Type.Optional(Type.String()),
      image: Type.String(),
      model: Type.Optional(Type.String()),
      maxBytesMb: Type.Optional(Type.Number()),
    }),
    execute: async (toolCallId, args) => {
      // 1. Parse the arguments
      const imageRaw = args.image;
      const prompt = args.prompt ?? "Describe the image.";

      // 2. Load the image
      const media = await loadWebMedia(resolvedPath, maxBytes);

      // 3. Run the analysis
      const result = await runImagePrompt({
        prompt, base64, mimeType,
        imageModelConfig,
      });

      // 4. Return the result
      return {
        content: [{ type: "text", text: result.text }],
        details: { model: `${result.provider}/${result.model}` },
      };
    }
  };
}

4.2 Tool Availability Configuration

Tools are listed in the System Prompt, and the LLM chooses among them based on their descriptions:

// src/agents/system-prompt.ts

const coreToolSummaries: Record<string, string> = {
  read: "Read file contents",
  write: "Create or overwrite files",
  edit: "Make precise edits to files",
  exec: "Run shell commands (pty available for TTY-required CLIs)",
  browser: "Control web browser",
  image: "Analyze an image with the configured image model",
  cron: "Manage cron jobs and wake events (use for reminders)",
  message: "Send messages and channel actions",
  // ...
};

// Tools are listed in this order
const toolOrder = [
  "read", "write", "edit", "apply_patch",
  "grep", "find", "ls",
  "exec", "process",
  "web_search", "web_fetch",
  "browser", "canvas", "nodes", "cron",
  "message", "gateway",
  "agents_list", "sessions_list", "sessions_history",
  "sessions_send", "session_status", "image",
];

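4.3 LLM Tool Selection Logic

The LLM selects a tool based on the following factors:

  1. User intent parsing: analyze the natural-language request
  2. Tool description matching: find the best-fitting tool
  3. Parameter extraction: pull the required parameters from context
  4. Conversation history: take earlier operations into account

User: "Help me analyze this image"
       ↓
LLM reasoning: the user wants an image analyzed → the image tool fits best
       ↓
Tool Call: { name: "image", params: { image: "...", prompt: "Analyze this image" } }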
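Once the LLM has emitted a tool call like the one above, the runtime still has to map the name to an implementation. The following sketch shows a plain name-keyed lookup; the Tool, ToolCall, and ToolRegistry types and the dispatchToolCall function are hypothetical and much simpler than the AnyAgentTool structure shown in 4.1.

```typescript
interface ToolCall {
  name: string;
  params: Record<string, unknown>;
}

interface Tool {
  name: string;
  description: string;
  execute: (params: Record<string, unknown>) => string;
}

type ToolRegistry = Record<string, Tool>;

// Look up the requested tool by name and run it with the extracted parameters.
function dispatchToolCall(registry: ToolRegistry, call: ToolCall): string {
  // hasOwnProperty guards against inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(registry, call.name)) {
    return `Unknown tool: ${call.name}`;
  }
  return registry[call.name].execute(call.params);
}
```

Returning an error string instead of throwing lets the caller feed the failure back to the model as an ordinary tool result; whether Clawdbot does this is not stated in the document.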

5. Multimedia Processing in Detail

5.1 Image Processing

Location: src/agents/tools/image-tool.ts

// Supported input sources
// - Local file path: /path/to/image.jpg
// - URL: https://example.com/image.jpg
// - data: URL: data:image/png;base64,...
// - file:// URL: file:///path/to/image.jpg

async execute(toolCallId, args) {
  const imageRaw = args.image;

  // 1. Determine the input type
  const isHttpUrl = /^https?:\/\//i.test(imageRaw);
  const isDataUrl = /^data:/i.test(imageRaw);
  const isFileUrl = /^file:/i.test(imageRaw);

  // 2. Load the image (how resolvedPath and maxBytes are derived is elided in this excerpt)
  const media = isDataUrl
    ? decodeDataUrl(imageRaw)
    : await loadWebMedia(resolvedPath, maxBytes);

  // 3. Convert to base64
  const base64 = media.buffer.toString("base64");
  const mimeType = media.mimeType ?? "image/png";

  // 4. Call the Vision Model
  const result = await runImagePrompt({
    cfg, agentDir, imageModelConfig,
    prompt, base64, mimeType,
  });

  // 5. Return the analysis result
  return {
    content: [{ type: "text", text: result.text }],
    details: { model: `${result.provider}/${result.model}` },
  };
}

5.2 Audio Processing

Location: src/media-understanding/

// Audio attachment → speech-to-text
// Supported: whisper, deepgram, assembly-ai, etc.

const audioOutputs = outputs.filter(o => o.kind === "audio.transcription");

if (audioOutputs.length > 0) {
  // Format the transcribed text
  const transcript = formatAudioTranscripts(audioOutputs);

  // Inject it into the context
  ctx.Transcript = transcript;
  ctx.CommandBody = transcript;
  ctx.RawBody = transcript;
}

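5.3 Video Processing

// Video processing flow:
// 1. Extract keyframes
// 2. Run image analysis on the keyframes
// 3. Optional: transcribe the audio track
// 4. Merge the results into a video description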

const videoOutputs = outputs.filter(o => o.kind === "video.description");

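6. Definition of Task Completion

6.1 Successful Completion

Location: src/agents/pi-embedded-runner/run.ts

// A task is considered "complete" when the run returns a result like this:

return {
  payloads: [
    { text: "Reply content...", isError: false }
  ],
  meta: {
    durationMs: Date.now() - started,   // execution time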
    agentMeta: {
      sessionId: "session-xxx",
      provider: "anthropic",
      model: "claude-opus-4-5",
      usage: {
        inputTokens: 1234,
        outputTokens: 567,
      },
    },
    aborted: false,                      // not aborted
    systemPromptReport: { ... },
  },
};

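6.2 Special Completion States

Silent Reply

// When the LLM replies with "[[silent]]"
// the system sends no message to the user
// Common cases:
// - The message was already sent through the message tool
// - No response is needed

Heartbeat

// When the LLM replies with "HEARTBEAT_OK"
// the system is running normally and nothing needs attention
// Used for periodic health checks

Tool Call Pending

// stopReason === "tool_calls"
// means the LLM is requesting a tool execution
// The system runs the tool and then continues the conversation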

meta: {
  stopReason: "tool_calls",
  pendingToolCalls: [
    { id: "call_xxx", name: "exec", arguments: "..." }
  ],
}
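The "run the tool, then continue" cycle can be pictured as a loop that keeps calling the model until the stop reason is something other than "tool_calls". The sketch below is a synchronous toy version: the ModelTurn and ToolResult shapes, the "stop" reason, and the maxTurns guard are assumptions for illustration, and only stopReason, pendingToolCalls, and the id/name/arguments fields come from the snippet above.

```typescript
interface PendingToolCall {
  id: string;
  name: string;
  arguments: string;
}

interface ModelTurn {
  stopReason: "tool_calls" | "stop"; // "stop" is an assumed label for a final answer
  text: string;
  pendingToolCalls: PendingToolCall[];
}

interface ToolResult {
  toolCallId: string;
  output: string;
}

// Call the model, execute any requested tools, and feed the results back
// until the model produces a final answer or maxTurns is exhausted.
function runToolLoop(
  callModel: (results: ToolResult[]) => ModelTurn,
  executeTool: (call: PendingToolCall) => string,
  maxTurns: number,
): string {
  let results: ToolResult[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = callModel(results);
    if (reply.stopReason !== "tool_calls") return reply.text;
    results = reply.pendingToolCalls.map((call) => ({
      toolCallId: call.id,
      output: executeTool(call),
    }));
  }
  throw new Error("tool loop exceeded maxTurns");
}
```

The turn limit is a defensive choice in this sketch so that a model which keeps requesting tools cannot loop forever.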

6.3 Error States

Context Overflow

// Triggered when the prompt is too large
// The system attempts automatic compaction

return {
  payloads: [{
    text: "Context overflow: prompt too large for the model. " +
          "Try again with less input or a larger-context model.",
    isError: true,
  }],
  meta: {
    error: { kind: "context_overflow", message: errorText },
  },
};

Role Ordering Error

// Message ordering conflict
return {
  payloads: [{
    text: "Message ordering conflict - please try again. " +
          "If this persists, use /new to start a fresh session.",
    isError: true,
  }],
  meta: {
    error: { kind: "role_ordering", message: errorText },
  },
};

Auth Failure

// Triggers the Failover mechanism
// Rotates to the next Auth Profile

throw new FailoverError(message, {
  reason: "auth",
  provider,
  model: modelId,
  status: 401,
});

Rate Limit

// Triggers account rotation
// Switches automatically to the next available account

await markAuthProfileFailure({
  store: authStore,
  profileId: lastProfileId,
  reason: "rate_limit",
});

const rotated = await advanceAuthProfile();
if (rotated) continue;  // retry

throw new FailoverError("LLM request rate limited.", {
  reason: "rate_limit",
  provider,
  model: modelId,
});

7. Complete Message Processing Example

Scenario: a user sends "Help me analyze this image" plus a .jpg attachment via Telegram

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. [Channel Adapter] - src/telegram/bot-message-dispatch.ts
   ├─ Receive the Telegram message
   ├─ Parse the message content
   │   ├─ text: "Help me analyze this image"
   │   └─ photo: [FileId]
   ├─ Download the attachment → MediaAttachment[]
   └─ Build the MsgContext

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

2. [Routing] - src/routing/resolve-route.ts
   ├─ Input:
   │   ├─ channel: "telegram"
   │   ├─ accountId: "bot_123"
   │   └─ peer: { kind: "dm", id: "user_456" }
   ├─ Look up bindings
   └─ Output:
       ├─ agentId: "default"
       ├─ sessionKey: "agent-default:telegram:dm:user_456"
       └─ matchedBy: "default"

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

3. [Media Understanding] - src/media-understanding/apply.ts
   ├─ Detect the image attachment
   ├─ Check whether activeModel supports vision
   │   ├─ If supported: the image is injected into the prompt automatically
   │   └─ If not supported: use imageModel to generate a description
   ├─ Run the Vision Model
   └─ Inject the result into ctx.Body:
       "[Image: a landscape photo showing...]
        Help me analyze this image"

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

4. [Agent Runner] - src/agents/pi-embedded-runner/run.ts
   ├─ Load Session history
   ├─ Build the System Prompt
   │   ├─ Core identity
   │   ├─ Tool list (including the image tool)
   │   ├─ Skills
   │   ├─ Runtime information
   │   └─ Project Context
   ├─ Prepare messages:
   │   └─ [user]: "[Image: ...] Help me analyze this image"
   └─ Call the Claude API

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

5. [LLM Decision]
   ├─ Analyze user intent: analyze the image
   ├─ Check context: the image description is already in the prompt
   └─ Decide:
       ├─ Case A: image already injected automatically → reply with the analysis directly
       └─ Case B: more detailed analysis needed → call the image tool

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

6. [Response] - Return path
   ├─ The LLM generates the reply text
   ├─ Format it (Markdown for Telegram)
   ├─ Send via the Telegram Bot API
   └─ Update Session history

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Result: the user receives the image analysis

8. Key Configuration

8.1 Main Configuration File

Location: ~/.clawdbot/config.json

{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-5",
      "imageModel": {
        "primary": "openai/gpt-5-mini",
        "fallbacks": ["anthropic/claude-opus-4-5"]
      },
      "thinkLevel": "off",
      "workspaceDir": "~/clawdbot-workspace",
      "userTimezone": "Asia/Taipei",
      "mediaMaxMb": 20,
      "heartbeat": {
        "model": "anthropic/claude-haiku-3"
      }
    },
    "list": [
      { "id": "default", "name": "Default Agent" }
    ]
  },

  "tools": {
    "media": {
      "image": {
        "enabled": true,
        "provider": "openai"
      },
      "audio": {
        "enabled": true,
        "transcription": true,
        "provider": "whisper"
      },
      "video": {
        "enabled": true
      }
    },
    "browser": {
      "enabled": true,
      "headless": true
    }
  },

  "session": {
    "dmScope": "per-peer",
    "identityLinks": {},
    "typingIntervalSeconds": 6
  },

  "routing": {
    "bindings": [
      {
        "match": { "channel": "telegram", "accountId": "*" },
        "agentId": "default"
      },
      {
        "match": { "channel": "discord", "guildId": "123456" },
        "agentId": "discord-agent"
      }
    ]
  },

  "channels": {
    "telegram": {
      "enabled": true,
      "token": "BOT_TOKEN"
    },
    "discord": {
      "enabled": true,
      "token": "BOT_TOKEN"
    }
  }
}

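8.2 Session Scope Options

| Option | Description |
| --- | --- |
| main | All DMs share a single session |
| per-peer | Each conversation peer gets its own session |
| per-channel-peer | Each channel + peer combination gets its own session |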
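One way to read these three options is as different grouping rules for incoming DMs. The sketch below returns a bucket label per scope; the dmSessionBucket name and the label strings are purely illustrative and are not Clawdbot's real Session Key format. It also assumes that "per-peer" ignores the channel, which is an interpretation of the descriptions above rather than something the document states outright.

```typescript
type DmScope = "main" | "per-peer" | "per-channel-peer";

// Return a label identifying which session a DM falls into under each scope.
// The label format is hypothetical and used only to compare groupings.
function dmSessionBucket(scope: DmScope, channel: string, peerId: string): string {
  switch (scope) {
    case "main":
      return "main";
    case "per-peer":
      return `peer:${peerId}`;
    case "per-channel-peer":
      return `${channel}:peer:${peerId}`;
  }
}
```

Under this reading, the same peer writing on Telegram and on Discord shares one session with "per-peer" but gets two separate sessions with "per-channel-peer".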

8.3 Prompt Mode

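| Mode | Description | Used for |
| --- | --- | --- |
| full | Complete System Prompt | Main Agent |
| minimal | Reduced version (Tooling + Workspace + Runtime) | Sub-agents |
| none | Basic identity line only | Special purposes |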
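The three modes can be thought of as selecting which sections follow the identity line. The sketch below models that selection; the section names are taken from this document, but the exact set and order used in "full" mode, and the sectionsForMode and buildPrompt helpers, are assumptions rather than the behavior of buildAgentSystemPrompt().

```typescript
type PromptMode = "full" | "minimal" | "none";

const IDENTITY_LINE = "You are a personal assistant running inside Clawdbot.";

// Assumed section list for "full" mode; the real set and order may differ.
const FULL_SECTIONS = [
  "Tooling", "Skills", "Memory Recall", "Messaging", "Workspace",
  "Runtime", "Silent Replies", "Heartbeats", "Project Context",
];
const MINIMAL_SECTIONS = ["Tooling", "Workspace", "Runtime"];

// Pick the sections that follow the identity line for a given mode.
function sectionsForMode(mode: PromptMode): string[] {
  if (mode === "none") return [];
  return mode === "minimal" ? MINIMAL_SECTIONS.slice() : FULL_SECTIONS.slice();
}

// Assemble the prompt: identity line first, then each rendered section.
function buildPrompt(mode: PromptMode, render: (section: string) => string): string {
  return [IDENTITY_LINE].concat(sectionsForMode(mode).map(render)).join("\n\n");
}
```

A sub-agent built with "minimal" would then see only the Tooling, Workspace, and Runtime sections after the identity line, which keeps its prompt short.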

9. Core Module Reference

| Function | File Location |
| --- | --- |
| System Prompt construction | src/agents/system-prompt.ts |
| Agent execution main loop | src/agents/pi-embedded-runner/run.ts |
| Message routing | src/routing/resolve-route.ts |
| Media understanding | src/media-understanding/apply.ts |
| Auto-Reply entry point | src/auto-reply/reply/get-reply.ts |
| Image tool | src/agents/tools/image-tool.ts |
| Browser tool | src/agents/tools/browser-tool.ts |
| Message tool | src/agents/tools/message-tool.ts |
| Cron tool | src/agents/tools/cron-tool.ts |
| Skills loading | src/agents/skills/workspace.ts |
| Auth Profile management | src/agents/auth-profiles.ts |
| Config loading | src/config/config.ts |

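10. Summary

Clawdbot is a modular, multi-channel AI Agent platform with the following characteristics:

  1. Multi-channel support: 15+ messaging platforms (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, etc.)

  2. Dynamic System Prompt: assembled dynamically from configuration, tools, Skills, and Runtime information

  3. Intelligent tool selection: the LLM picks a suitable tool automatically based on tool descriptions and user intent

  4. Multimedia processing: full support for understanding and processing images, audio, and video

  5. Failover mechanism: automatically handles Auth failures, Rate Limits, and Context Overflow

  6. Session isolation: flexible session scope configuration

  7. Extensibility: functionality can be extended through the Skills and Plugins systems


Document generated: 2026-01-27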

Clawdbot System Architecture Analysis

This document provides a detailed analysis of the Clawdbot Prompt Flow, System Prompt, frameworks, tool systems, overall processes, and the definition of task completion.


1. Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                        User Message Input                         │
│  (WhatsApp / Telegram / Discord / Slack / Signal / Web...)     │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Channel Adapter                                │
│  - Message format standardization                                 │
│  - Pairing / Allowlist verification                               │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Message Routing                                │
│  - Parse channel + account + peer                                 │
│  - Determine target Agent ID                                      │
│  - Generate Session Key                                           │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Auto-Reply Pipeline                            │
│  - Media Understanding (Image/Audio/Video)                        │
│  - Link Understanding (URL extraction)                             │
│  - Command Authorization                                          │
│  - Directive Extraction (/think, /verbose, etc.)                  │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Pi Embedded Agent Runner                       │
│  - Construct System Prompt                                        │
│  - Load Tools + Skills                                            │
│  - Call LLM (Anthropic/OpenAI/Bedrock...)                         │
│  - Handle Tool Calls                                              │
│  - Auth Profile Rotation + Failover                                │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Response Delivery                              │
│  - Format output (Markdown/Plain)                                 │
│  - Return to user via original channel                            │
└─────────────────────────────────────────────────────────────────┘


2. System Prompt Structure

The System Prompt is dynamically assembled by the buildAgentSystemPrompt() function in src/agents/system-prompt.ts.

2.1 Core Identity

You are a personal assistant running inside Clawdbot.

2.2 Tooling Block

Lists all available tools and brief descriptions:

| Tool Name | Description |
| --- | --- |
| read / write | File content operations |
| edit / apply_patch | Precise file editing and multi-file patching |
| grep / find / ls | File system search and navigation |
| exec / process | Shell command execution and background session management |
| web_search / web_fetch | Internet search (Brave API) and URL content extraction |
| browser | Control web browser |
| cron | Manage cron jobs and reminders |
| message | Proactive messaging and channel actions (polls, reactions) |
| sessions_* | Manage sub-agents (list, history, send, spawn) |
| image | Analyze images with vision models |

2.3 Skills (Mandatory)

Agents are instructed to scan <available_skills> before replying. They must read the specific SKILL.md via the read tool only after selecting the most relevant skill.

2.4 Memory Recall

Before answering about prior work, preferences, or todos, the agent runs memory_search on MEMORY.md and related files to ensure context continuity.

2.5 Runtime Information

The current environment is injected dynamically:

Runtime: agent=default | host=my-mac | model=claude-3-5-sonnet | channel=telegram | thinking=off


3. Message Processing Flow

3.1 Route Resolution (src/routing/resolve-route.ts)

The system identifies the target agent based on a priority hierarchy: Peer > Guild > Team > Account > Channel > Default.

  • Session Key Format: agent-{agentId}:{channel}:{peerKind}:{peerId}
  • Example: agent-default:telegram:dm:123456789

3.2 Media Understanding (src/media-understanding/apply.ts)

Processes attachments in order: Image → Audio → Video.

  • Audio: Transcribed to text and injected into ctx.Transcript.
  • Video: Extracts keyframes for image analysis and merges them into a description.
  • Injection: Results are formatted as [Image: description...] and prepended to the user's message body.
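The injection step can be sketched as a small formatting function. In the example below, only the [Image: ...] label comes from this document; the [Audio: ...] and [Video: ...] labels, the MediaOutput type, and the formatBody name are assumptions, and this is not the real formatMediaUnderstandingBody().

```typescript
type MediaKind = "image.description" | "audio.transcription" | "video.description";

interface MediaOutput {
  kind: MediaKind;
  text: string;
}

// Label per result kind; only "Image" is confirmed by the document.
const MEDIA_LABELS: Record<MediaKind, string> = {
  "image.description": "Image",
  "audio.transcription": "Audio",
  "video.description": "Video",
};

// Prepend one bracketed line per media result to the user's message body.
function formatBody(body: string, outputs: MediaOutput[]): string {
  const lines = outputs.map((o) => `[${MEDIA_LABELS[o.kind]}: ${o.text}]`);
  return lines.concat(body ? [body] : []).join("\n");
}
```

Placing the descriptions before the user's text means a model without vision support still sees what the attachment contains before it reads the request.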

3.3 Agent Execution (src/agents/pi-embedded-runner/run.ts)

The runner manages the lifecycle of an LLM request:

  1. Session Lane: Controls concurrency.
  2. Context Guard: Blocks the run (raising a FailoverError) when the resolved model's context window is too small.
  3. Auth Rotation: If a provider returns a Rate Limit or Auth error, the system automatically advances to the next available Auth Profile.
  4. Compaction: If a context overflow occurs, the system attempts to compress the session history and retries.

4. Tool Selection Mechanism

The LLM selects tools based on:

  1. Intent Analysis: Parsing the natural language request.
  2. Description Matching: Finding the tool whose description fits the task.
  3. Parameter Extraction: Pulling required arguments from the conversation context.

5. Definition of Task Completion

5.1 Successful Completion

A task is "complete" when the LLM provides a final response payload and aborted is false. The returned metadata includes token usage and duration.

5.2 Special States

  • Silent Reply: If the LLM responds with [[silent]], no message is sent to the user (usually because the message tool was already used or no response is needed).
  • Heartbeat: Responding with HEARTBEAT_OK confirms system health during automated polls.
  • Tool Call Pending: The run pauses when stopReason === "tool_calls", waiting for execution results to be fed back into the next iteration.

5.3 Error Handling & Failover

  • Context Overflow: Triggers automatic compaction.
  • Rate Limit: Marks the current Auth Profile as "failed" and rotates to a new credential.
  • Role Ordering: Triggers a suggestion for the user to use /new to reset the session.

6. Key Configuration Components (config.json)

  • Agents: Defines default models, thinkLevel, and workspaceDir.
  • Tools: Toggles for media processing, the browser tool, and specific providers (e.g., Whisper for audio).
  • Session Scope:
      • main: One session for all DMs.
      • per-peer: Unique session for each user.
      • per-channel-peer: Unique session for each user-channel combination.

7. Summary

Clawdbot is a modular, multi-channel AI Agent platform characterized by:

  • Multi-channel support: 15+ platforms.
  • Dynamic Prompts: Context-aware assembly.
  • Resiliency: Robust failover for auth, rate limits, and context limits.
  • Extensibility: Expandable via the Skills and Plugins system.

Document Date: 2026-01-27
