This document dissects Clawdbot's prompt flow, system prompt, framework, tool system, overall pipeline, and what counts as task completion.
```
┌─────────────────────────────────────────────────────────────────┐
│                       User message input                        │
│    (WhatsApp / Telegram / Discord / Slack / Signal / Web...)    │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Channel Adapter                         │
│   - Normalize message format                                    │
│   - Pairing / allowlist verification                            │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Message Routing                         │
│   - Parse channel + account + peer                              │
│   - Resolve the target agent ID                                 │
│   - Build the session key                                       │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Auto-Reply Pipeline                       │
│   - Media understanding (image/audio/video)                     │
│   - Link understanding (URL content extraction)                 │
│   - Command authorization                                       │
│   - Directive extraction (/think, /verbose, ...)                │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Pi Embedded Agent Runner                     │
│   - Build the system prompt                                     │
│   - Load tools + skills                                         │
│   - Call the LLM (Anthropic/OpenAI/Bedrock...)                  │
│   - Handle tool calls                                           │
│   - Auth profile rotation + failover                            │
└───────────────────────────┬─────────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Response Delivery                        │
│   - Format output (Markdown/plain)                              │
│   - Send back to the user on the source channel                 │
└─────────────────────────────────────────────────────────────────┘
```
The system prompt is assembled dynamically by the `buildAgentSystemPrompt()` function in `src/agents/system-prompt.ts`. It opens with the core identity line:

```
You are a personal assistant running inside Clawdbot.
```

It then lists every available tool with a brief description:
| Tool | Description |
|---|---|
| `read` | Read file contents |
| `write` | Create or overwrite files |
| `edit` | Make precise edits to files |
| `apply_patch` | Apply multi-file patches |
| `grep` | Search file contents for patterns |
| `find` | Find files by glob pattern |
| `ls` | List directory contents |
| `exec` | Run shell commands (pty available for TTY-required CLIs) |
| `process` | Manage background exec sessions |
| `web_search` | Search the web (Brave API) |
| `web_fetch` | Fetch and extract readable content from a URL |
| `browser` | Control web browser |
| `canvas` | Present/eval/snapshot the Canvas |
| `nodes` | List/describe/notify/camera/screen on paired nodes |
| `cron` | Manage cron jobs and wake events (use for reminders) |
| `message` | Send messages and channel actions |
| `gateway` | Restart, apply config, or run updates on Clawdbot |
| `agents_list` | List agent ids allowed for sessions_spawn |
| `sessions_list` | List other sessions (incl. sub-agents) |
| `sessions_history` | Fetch history for another session/sub-agent |
| `sessions_send` | Send a message to another session/sub-agent |
| `sessions_spawn` | Spawn a sub-agent session |
| `session_status` | Show status card (usage + time + settings) |
| `image` | Analyze an image with the configured image model |
The prompt then includes behavioral sections:

```
## Skills (mandatory)
Before replying: scan <available_skills> <description> entries.
- If exactly one skill clearly applies: read its SKILL.md at <location> with `read`, then follow it.
- If multiple could apply: choose the most specific one, then read/follow it.
- If none clearly apply: do not read any SKILL.md.
Constraints: never read more than one skill up front; only read after selecting.

## Memory Recall
Before answering anything about prior work, decisions, dates, people, preferences, or todos:
run memory_search on MEMORY.md + memory/*.md; then use memory_get to pull only the needed lines.
If low confidence after search, say you checked.

## Messaging
- Reply in current session → automatically routes to the source channel
- Cross-session messaging → use sessions_send(sessionKey, message)
- Never use exec/curl for provider messaging; Clawdbot handles all routing internally.

### message tool
- Use `message` for proactive sends + channel actions (polls, reactions, etc.).
- For `action=send`, include `to` and `message`.
- If multiple channels are configured, pass `channel` (whatsapp|telegram|discord|...).
- If you use `message` to deliver your user-visible reply, respond with ONLY: [[silent]]
```

Runtime information is injected dynamically:

```
Runtime: agent=default | host=my-mac | repo=/path/to/repo | os=darwin (arm64) |
node=22.x | model=claude-opus-4-5 | default_model=claude-opus-4-5 |
channel=telegram | capabilities=inlineButtons | thinking=off
```

```
## Silent Replies
When you have nothing to say, respond with ONLY: [[silent]]
Rules:
- It must be your ENTIRE message - nothing else
- Never append it to an actual response
- Never wrap it in markdown or code blocks

## Heartbeats
Heartbeat prompt: (configured)
If you receive a heartbeat poll (a user message matching the heartbeat prompt above),
and there is nothing that needs attention, reply exactly: HEARTBEAT_OK
```

Context files from the workspace are loaded dynamically:
- `SOUL.md` - persona/tone settings
- `TOOLS.md` - tool usage guide
- `MEMORY.md` - memory storage
- other custom context files
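The assembly steps above can be sketched in a few lines. This is an illustrative sketch only: the `ContextFile` shape, the argument list, and the section ordering are assumptions, not the actual `buildAgentSystemPrompt()` signature.

```typescript
// Hypothetical shapes for illustration; not the real Clawdbot types.
interface ContextFile {
  name: string;    // e.g. "SOUL.md"
  content: string;
}

function buildSystemPromptSketch(
  identity: string,
  toolSummaries: Record<string, string>,
  runtimeLine: string,
  contextFiles: ContextFile[],
): string {
  // Render the tool list as "- name: description" lines.
  const toolSection = Object.entries(toolSummaries)
    .map(([name, desc]) => `- ${name}: ${desc}`)
    .join("\n");
  // Append each context file under its own heading.
  const contextSection = contextFiles
    .map((f) => `## ${f.name}\n${f.content}`)
    .join("\n\n");
  // Join the sections, dropping any that are empty.
  return [identity, "## Tools", toolSection, runtimeLine, contextSection]
    .filter((s) => s.length > 0)
    .join("\n\n");
}

const prompt = buildSystemPromptSketch(
  "You are a personal assistant running inside Clawdbot.",
  { read: "Read file contents", exec: "Run shell commands" },
  "Runtime: agent=default | channel=telegram",
  [{ name: "SOUL.md", content: "Keep replies concise." }],
);
console.log(prompt);
```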
Location: each channel's message-handler.ts (for Telegram, `src/telegram/bot-message-dispatch.ts`)

1. The bot receives a message
2. The message content is parsed into a `MsgContext` structure
3. Pairing/allowlist permissions are checked
4. The message is routed to the correct agent

Location: `src/routing/resolve-route.ts`
```typescript
export function resolveAgentRoute(input: ResolveAgentRouteInput): ResolvedAgentRoute {
  // 1. Normalize the input
  const channel = normalizeToken(input.channel);
  const accountId = normalizeAccountId(input.accountId);
  const peer = input.peer ? { kind: input.peer.kind, id: normalizeId(input.peer.id) } : null;

  // 2. Look up binding configuration
  const bindings = listBindings(input.cfg).filter((binding) => {
    if (!matchesChannel(binding.match, channel)) return false;
    return matchesAccountId(binding.match?.accountId, accountId);
  });

  // 3. Match in priority order
  //    Priority: peer > guild > team > account > channel > default
  if (peer) {
    const peerMatch = bindings.find((b) => matchesPeer(b.match, peer));
    if (peerMatch) return choose(peerMatch.agentId, "binding.peer");
  }
  if (guildId) {
    const guildMatch = bindings.find((b) => matchesGuild(b.match, guildId));
    if (guildMatch) return choose(guildMatch.agentId, "binding.guild");
  }
  // ... other matching logic

  // 4. Fall back to the default agent
  return choose(resolveDefaultAgentId(input.cfg), "default");
}
```

Session key structure:

```
agent-{agentId}:{channel}:{peerKind}:{peerId}
```

For example: `agent-default:telegram:dm:123456789`
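The key format can be reproduced with a small helper (a sketch; the real construction happens inside the routing layer):

```typescript
// Build a session key from routing output, following the format documented above.
type PeerKind = "dm" | "group" | "channel";

function buildSessionKey(
  agentId: string,
  channel: string,
  peerKind: PeerKind,
  peerId: string,
): string {
  return `agent-${agentId}:${channel}:${peerKind}:${peerId}`;
}

console.log(buildSessionKey("default", "telegram", "dm", "123456789"));
// → agent-default:telegram:dm:123456789
```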
Location: `src/media-understanding/apply.ts`

```typescript
export async function applyMediaUnderstanding(params) {
  // Processing order: image → audio → video
  const CAPABILITY_ORDER: MediaUnderstandingCapability[] = ["image", "audio", "video"];
  const tasks = CAPABILITY_ORDER.map((capability) => async () => {
    return await runCapability({
      capability,   // "image" | "audio" | "video"
      cfg,          // configuration
      ctx,          // message context
      attachments,  // media attachments
      activeModel,  // currently active model
    });
  });
  const results = await runWithConcurrency(tasks, concurrency);

  // Result kinds:
  // - image → image.description (image description)
  // - audio → audio.transcription (speech-to-text)
  // - video → video.description (video description)

  // Inject the understanding results into ctx.Body
  if (outputs.length > 0) {
    ctx.Body = formatMediaUnderstandingBody({ body: ctx.Body, outputs });
    // Audio special case: set the transcript
    const audioOutputs = outputs.filter((output) => output.kind === "audio.transcription");
    if (audioOutputs.length > 0) {
      ctx.Transcript = formatAudioTranscripts(audioOutputs);
    }
  }
  return { outputs, decisions, appliedImage, appliedAudio, appliedVideo };
}
```

Location: `src/agents/pi-embedded-runner/run.ts`
```typescript
export async function runEmbeddedPiAgent(params): Promise<EmbeddedPiRunResult> {
  // 1. Resolve session lanes (concurrency control)
  const sessionLane = resolveSessionLane(params.sessionKey);
  const globalLane = resolveGlobalLane(params.lane);

  // 2. Resolve the model configuration
  const provider = params.provider ?? DEFAULT_PROVIDER; // "anthropic"
  const modelId = params.model ?? DEFAULT_MODEL; // "claude-opus-4-5"
  const { model, authStorage, modelRegistry } = resolveModel(provider, modelId, agentDir);

  // 3. Check the context window
  const ctxInfo = resolveContextWindowInfo({ cfg, provider, modelId });
  const ctxGuard = evaluateContextWindowGuard({ info: ctxInfo });
  if (ctxGuard.shouldBlock) {
    throw new FailoverError("Model context window too small");
  }

  // 4. Set up auth profile rotation
  const profileOrder = resolveAuthProfileOrder({ cfg, store: authStore, provider });

  // 5. Run the main loop
  while (true) {
    attemptedThinking.add(thinkLevel);
    // Run one attempt
    const attempt = await runEmbeddedAttempt({
      sessionId, sessionKey, messageChannel,
      config, prompt, images, tools,
      provider, modelId, model,
      thinkLevel, verboseLevel, reasoningLevel,
      // ... other parameters
    });
    const { aborted, promptError, timedOut, lastAssistant } = attempt;

    // Error handling
    if (promptError && !aborted) {
      // Context overflow → automatic compaction
      if (isContextOverflowError(errorText) && !overflowCompactionAttempted) {
        const compactResult = await compactEmbeddedPiSessionDirect({ ... });
        if (compactResult.compacted) continue;
      }
      // Auth/rate-limit error → rotate to the next profile
      if (isFailoverErrorMessage(errorText) && await advanceAuthProfile()) {
        continue;
      }
      // Unsupported thinking level → downgrade
      const fallbackThinking = pickFallbackThinkingLevel({ message: errorText });
      if (fallbackThinking) {
        thinkLevel = fallbackThinking;
        continue;
      }
    }

    // Success path
    const payloads = buildEmbeddedRunPayloads({ ... });
    // Mark the profile as good
    if (lastProfileId) {
      await markAuthProfileGood({ store: authStore, profileId: lastProfileId });
    }
    return { payloads, meta: { durationMs, agentMeta, aborted } };
  }
}
```

Location: `src/agents/tools/*.ts`
```typescript
// Example: image-tool.ts
export function createImageTool(options): AnyAgentTool | null {
  return {
    label: "Image",
    name: "image",
    description: "Analyze an image with the configured image model. " +
      "Provide a prompt and image path or URL.",
    parameters: Type.Object({
      prompt: Type.Optional(Type.String()),
      image: Type.String(),
      model: Type.Optional(Type.String()),
      maxBytesMb: Type.Optional(Type.Number()),
    }),
    execute: async (toolCallId, args) => {
      // 1. Parse arguments
      const imageRaw = args.image;
      const prompt = args.prompt ?? "Describe the image.";
      // 2. Load the image
      const media = await loadWebMedia(resolvedPath, maxBytes);
      // 3. Run the analysis
      const result = await runImagePrompt({
        prompt, base64, mimeType,
        imageModelConfig,
      });
      // 4. Return the result
      return {
        content: [{ type: "text", text: result.text }],
        details: { model: `${result.provider}/${result.model}` },
      };
    }
  };
}
```

Tools are listed in the system prompt, and the LLM chooses among them based on their descriptions:
```typescript
// src/agents/system-prompt.ts
const coreToolSummaries: Record<string, string> = {
  read: "Read file contents",
  write: "Create or overwrite files",
  edit: "Make precise edits to files",
  exec: "Run shell commands (pty available for TTY-required CLIs)",
  browser: "Control web browser",
  image: "Analyze an image with the configured image model",
  cron: "Manage cron jobs and wake events (use for reminders)",
  message: "Send messages and channel actions",
  // ...
};

// Tools are listed in this order
const toolOrder = [
  "read", "write", "edit", "apply_patch",
  "grep", "find", "ls",
  "exec", "process",
  "web_search", "web_fetch",
  "browser", "canvas", "nodes", "cron",
  "message", "gateway",
  "agents_list", "sessions_list", "sessions_history",
  "sessions_send", "session_status", "image",
];
```

The LLM selects a tool based on the following factors:
- User intent parsing: analyze the natural-language request
- Tool description matching: find the best-fitting tool
- Parameter extraction: pull the required parameters from context
- Conversation history: take previous actions into account

```
User: "Analyze this image for me"
        ↓
LLM reasoning: the user wants an image analyzed → the image tool fits best
        ↓
Tool call: { name: "image", params: { image: "...", prompt: "Analyze this image" } }
```
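Once the LLM emits a tool call, the runner dispatches it by name. The dispatch step can be sketched as a lookup from tool name to handler (all shapes here are illustrative, not the actual Clawdbot tool interface):

```typescript
// Hypothetical shapes for illustration only.
interface ToolCall {
  name: string;
  params: Record<string, unknown>;
}
type ToolFn = (params: Record<string, unknown>) => string;

// Registry of tool handlers, keyed by tool name.
const tools: Record<string, ToolFn> = {
  image: (p) => `analyzed ${p.image} with prompt "${p.prompt}"`,
};

// Look up the requested tool and execute it; unknown names are an error.
function dispatch(call: ToolCall): string {
  const tool = tools[call.name];
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  return tool(call.params);
}

const result = dispatch({
  name: "image",
  params: { image: "photo.jpg", prompt: "describe" },
});
console.log(result);
```

In the real runner, the tool result is appended to the conversation and the LLM is called again, which is how multi-step tool use proceeds.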
Location: `src/agents/tools/image-tool.ts`

```typescript
// Supported input sources
// - Local file path: /path/to/image.jpg
// - URL: https://example.com/image.jpg
// - data: URL: data:image/png;base64,...
// - file:// URL: file:///path/to/image.jpg
async execute(toolCallId, args) {
  const imageRaw = args.image;
  // 1. Determine the input type
  const isHttpUrl = /^https?:\/\//i.test(imageRaw);
  const isDataUrl = /^data:/i.test(imageRaw);
  const isFileUrl = /^file:/i.test(imageRaw);
  // 2. Load the image
  const media = isDataUrl
    ? decodeDataUrl(imageRaw)
    : await loadWebMedia(resolvedPath, maxBytes);
  // 3. Convert to base64
  const base64 = media.buffer.toString("base64");
  const mimeType = media.mimeType ?? "image/png";
  // 4. Call the vision model
  const result = await runImagePrompt({
    cfg, agentDir, imageModelConfig,
    prompt, base64, mimeType,
  });
  // 5. Return the analysis result
  return {
    content: [{ type: "text", text: result.text }],
    details: { model: `${result.provider}/${result.model}` },
  };
}
```

Location: `src/media-understanding/`
```typescript
// Audio attachments → speech-to-text
// Supported: whisper, deepgram, assembly-ai, etc.
const audioOutputs = outputs.filter(o => o.kind === "audio.transcription");
if (audioOutputs.length > 0) {
  // Format the transcription text
  const transcript = formatAudioTranscripts(audioOutputs);
  // Inject it into the context
  ctx.Transcript = transcript;
  ctx.CommandBody = transcript;
  ctx.RawBody = transcript;
}
```

```typescript
// Video processing pipeline:
// 1. Extract keyframes
// 2. Run image analysis on the keyframes
// 3. Optional: transcribe the audio track
// 4. Merge into a video description
const videoOutputs = outputs.filter(o => o.kind === "video.description");
```

Location: `src/agents/pi-embedded-runner/run.ts`
```typescript
// Conditions under which a task counts as "complete":
return {
  payloads: [
    { text: "reply content...", isError: false }
  ],
  meta: {
    durationMs: Date.now() - started, // execution time
    agentMeta: {
      sessionId: "session-xxx",
      provider: "anthropic",
      model: "claude-opus-4-5",
      usage: {
        inputTokens: 1234,
        outputTokens: 567,
      },
    },
    aborted: false, // not interrupted
    systemPromptReport: { ... },
  },
};
```

```typescript
// When the LLM replies "[[silent]]"
// the system sends nothing to the user.
// Common cases:
// - The message was already delivered via the message tool
// - No response is needed
```

```typescript
// When the LLM replies "HEARTBEAT_OK"
// the system is operating normally and nothing needs handling.
// Used for periodic health checks.
```

```typescript
// stopReason === "tool_calls"
// means the LLM requested tool execution;
// the system runs the tools and then continues the conversation.
meta: {
  stopReason: "tool_calls",
  pendingToolCalls: [
    { id: "call_xxx", name: "exec", arguments: "..." }
  ],
}
```

```typescript
// Triggered when the prompt is too large;
// the system attempts automatic compaction first.
return {
  payloads: [{
    text: "Context overflow: prompt too large for the model. " +
      "Try again with less input or a larger-context model.",
    isError: true,
  }],
  meta: {
    error: { kind: "context_overflow", message: errorText },
  },
};
```

```typescript
// Message ordering conflict
return {
  payloads: [{
    text: "Message ordering conflict - please try again. " +
      "If this persists, use /new to start a fresh session.",
    isError: true,
  }],
  meta: {
    error: { kind: "role_ordering", message: errorText },
  },
};
```

```typescript
// Triggers the failover mechanism:
// rotate to the next auth profile
throw new FailoverError(message, {
  reason: "auth",
  provider,
  model: modelId,
  status: 401,
});
```

```typescript
// Triggers account rotation:
// automatically switch to the next available account
await markAuthProfileFailure({
  store: authStore,
  profileId: lastProfileId,
  reason: "rate_limit",
});
const rotated = await advanceAuthProfile();
if (rotated) continue; // retry
throw new FailoverError("LLM request rate limited.", {
  reason: "rate_limit",
  provider,
  model: modelId,
});
```

Scenario: the user sends "Analyze this image for me" plus a .jpg attachment via Telegram
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. [Channel Adapter] - src/telegram/bot-message-dispatch.ts
   ├─ Receive the Telegram message
   ├─ Parse the message content
   │   ├─ text: "Analyze this image for me"
   │   └─ photo: [FileId]
   ├─ Download the attachment → MediaAttachment[]
   └─ Build the MsgContext
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2. [Routing] - src/routing/resolve-route.ts
   ├─ Input:
   │   ├─ channel: "telegram"
   │   ├─ accountId: "bot_123"
   │   └─ peer: { kind: "dm", id: "user_456" }
   ├─ Look up bindings
   └─ Output:
       ├─ agentId: "default"
       ├─ sessionKey: "agent-default:telegram:dm:user_456"
       └─ matchedBy: "default"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
3. [Media Understanding] - src/media-understanding/apply.ts
   ├─ Detect the image attachment
   ├─ Check whether activeModel supports vision
   │   ├─ If yes: the image is injected into the prompt automatically
   │   └─ If no: use imageModel to generate a description
   ├─ Run the vision model
   └─ Inject the result into ctx.Body:
       "[Image: a landscape photo showing...]
        Analyze this image for me"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
4. [Agent Runner] - src/agents/pi-embedded-runner/run.ts
   ├─ Load the session history
   ├─ Build the system prompt
   │   ├─ Core identity
   │   ├─ Tool list (including the image tool)
   │   ├─ Skills
   │   ├─ Runtime info
   │   └─ Project context
   ├─ Prepare messages:
   │   └─ [user]: "[Image: ...] Analyze this image for me"
   └─ Call the Claude API
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
5. [LLM Decision]
   ├─ Parse the user's intent: analyze an image
   ├─ Check the context: the image description is already in the prompt
   └─ Decide:
       ├─ Case A: image already injected → reply with the analysis directly
       └─ Case B: more detail needed → call the image tool
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
6. [Response] - return path
   ├─ LLM generates the reply text
   ├─ Format it (Markdown for Telegram)
   ├─ Send via the Telegram Bot API
   └─ Update the session history
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Result: the user receives the image analysis
```
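The six stages above can be condensed into a single sketch. All names here are hypothetical; media understanding, the LLM call, and channel delivery are stubbed out, and real Clawdbot code would be asynchronous throughout.

```typescript
// Illustrative only: a synchronous condensation of the pipeline stages.
interface Incoming {
  channel: string;
  peerId: string;
  text: string;
}

// Stand-in for the model call; the real runner calls a provider API.
function stubLlm(sessionKey: string, userText: string): string {
  return `(${sessionKey}) reply to: ${userText}`;
}

function handleMessage(msg: Incoming): string {
  // 1-2. Channel adapter + routing: normalize input and derive the session key.
  const sessionKey = `agent-default:${msg.channel}:dm:${msg.peerId}`;
  // 3. Media understanding would rewrite msg.text here (omitted).
  // 4-5. Agent runner: build the prompt and call the model.
  const reply = stubLlm(sessionKey, msg.text);
  // 6. Response delivery: format and send on the source channel (omitted).
  return reply;
}

console.log(handleMessage({ channel: "telegram", peerId: "user_456", text: "hi" }));
```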
Location: `~/.clawdbot/config.json`

```json
{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-5",
      "imageModel": {
        "primary": "openai/gpt-5-mini",
        "fallbacks": ["anthropic/claude-opus-4-5"]
      },
      "thinkLevel": "off",
      "workspaceDir": "~/clawdbot-workspace",
      "userTimezone": "Asia/Taipei",
      "mediaMaxMb": 20,
      "heartbeat": {
        "model": "anthropic/claude-haiku-3"
      }
    },
    "list": [
      { "id": "default", "name": "Default Agent" }
    ]
  },
  "tools": {
    "media": {
      "image": {
        "enabled": true,
        "provider": "openai"
      },
      "audio": {
        "enabled": true,
        "transcription": true,
        "provider": "whisper"
      },
      "video": {
        "enabled": true
      }
    },
    "browser": {
      "enabled": true,
      "headless": true
    }
  },
  "session": {
    "dmScope": "per-peer",
    "identityLinks": {},
    "typingIntervalSeconds": 6
  },
  "routing": {
    "bindings": [
      {
        "match": { "channel": "telegram", "accountId": "*" },
        "agentId": "default"
      },
      {
        "match": { "channel": "discord", "guildId": "123456" },
        "agentId": "discord-agent"
      }
    ]
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "token": "BOT_TOKEN"
    },
    "discord": {
      "enabled": true,
      "token": "BOT_TOKEN"
    }
  }
}
```

`dmScope` options:

| Option | Description |
|---|---|
| `main` | All DMs share a single session |
| `per-peer` | One independent session per conversation peer |
| `per-channel-peer` | One independent session per channel + peer combination |
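The effect of each scope can be illustrated with a small helper. The exact key shapes below are illustrative assumptions; only the general pattern (broader scope → shorter key → more sharing) is the point.

```typescript
// Illustrative only: how dmScope could widen or narrow DM session sharing.
type DmScope = "main" | "per-peer" | "per-channel-peer";

function dmSessionKey(
  scope: DmScope,
  agentId: string,
  channel: string,
  peerId: string,
): string {
  // One shared session for every DM, regardless of sender or channel.
  if (scope === "main") return `agent-${agentId}:main`;
  // One session per contact, shared across channels.
  if (scope === "per-peer") return `agent-${agentId}:dm:${peerId}`;
  // One session per channel + contact combination.
  return `agent-${agentId}:${channel}:dm:${peerId}`;
}
```

For example, with `per-peer`, the same contact messaging from Telegram and Discord would land in one session; with `per-channel-peer`, they would get two.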
| Mode | Description | Use |
|---|---|---|
| `full` | Complete system prompt | Main agent |
| `minimal` | Trimmed version (tooling + workspace + runtime) | Sub-agents |
| `none` | Identity line only | Special purposes |
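One way to picture the three modes is as progressively larger section lists. This is illustrative only; the actual section names and selection logic live in `src/agents/system-prompt.ts` and are assumptions here.

```typescript
// Illustrative only: map a prompt mode to the sections it would include.
type PromptMode = "full" | "minimal" | "none";

function sectionsFor(mode: PromptMode): string[] {
  const identity = ["identity"];
  if (mode === "none") return identity; // identity line only
  const minimal = [...identity, "tooling", "workspace", "runtime"];
  if (mode === "minimal") return minimal; // trimmed set for sub-agents
  // "full": everything, for the main agent
  return [...minimal, "skills", "memory", "messaging", "context-files"];
}
```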
| Feature | File |
|---|---|
| System prompt construction | `src/agents/system-prompt.ts` |
| Agent execution main loop | `src/agents/pi-embedded-runner/run.ts` |
| Message routing | `src/routing/resolve-route.ts` |
| Media understanding | `src/media-understanding/apply.ts` |
| Auto-reply entry point | `src/auto-reply/reply/get-reply.ts` |
| Image tool | `src/agents/tools/image-tool.ts` |
| Browser tool | `src/agents/tools/browser-tool.ts` |
| Message tool | `src/agents/tools/message-tool.ts` |
| Cron tool | `src/agents/tools/cron-tool.ts` |
| Skills loading | `src/agents/skills/workspace.ts` |
| Auth profile management | `src/agents/auth-profiles.ts` |
| Config loading | `src/config/config.ts` |
Clawdbot is a modular, multi-channel AI agent platform with the following characteristics:

- Multi-channel support: 15+ messaging platforms (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, and more)
- Dynamic system prompt: assembled from configuration, tools, skills, and runtime information
- Intelligent tool selection: the LLM picks the right tool from tool descriptions and user intent
- Multimedia handling: full understanding of images, audio, and video
- Failover mechanisms: automatic handling of auth failures, rate limits, and context overflow
- Session isolation: flexible session scope configuration
- Extensibility: functionality extended through the Skills and Plugins systems

Document generated: 2026-01-27