Critical Analysis: Claude Code System Prompt

An engineering review of the Claude Code system prompt against prompt engineering best practices, Anthropic's own published research, and structural quality heuristics.

Methodology: Analysis grounded in Anthropic's prompt engineering docs, context engineering guide, tool design guide, the "Lost in the Middle" phenomenon (Liu et al., 2024), and the QA heuristics framework (SOLID/complexity/security applied to prompt architecture).


Executive Summary

The Claude Code system prompt is a sophisticated, production-grade piece of context engineering. It solves real problems — cache efficiency, safety guardrails, tool routing, memory persistence — with genuine architectural thoughtfulness. However, its organic growth has produced a prompt that violates several of Anthropic's own published best practices: it is heavily negative-instruction-biased, suffers from lost-in-the-middle vulnerability on its most critical safety content, contains redundant instructions that fragment attention, and has a structural tension between its ant/external variants that leaks complexity into every conditional path.

Estimated token count: ~8,000-12,000 tokens (static sections only), expanding to ~15,000-20,000+ with dynamic sections, memory, tool descriptions, and MCP instructions. This is large enough that attention degradation is a real concern, not a theoretical one.


Where It Works Well

1. Cache-Aware Architecture (CRITICAL STRENGTH)

Severity: COMMENDATION

The SYSTEM_PROMPT_DYNAMIC_BOUNDARY split is genuinely excellent engineering. Static behavioral instructions are separated from per-session dynamic content (environment, MCP, memory), enabling the API's prompt caching to avoid re-processing ~8K tokens on every turn. The systemPromptSection() memoization layer adds a second cache tier for dynamic sections that don't change between turns.

This directly follows Anthropic's cache guidance: "Place breakpoints on stable content only. Placing cache_control on content that changes per request defeats caching entirely." The prompt's architecture is designed around this constraint, not retrofitted.
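The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual Claude Code implementation: `memoizedSection` and `buildSystemBlocks` are hypothetical names, and the block shape assumes the Messages API convention of text blocks carrying an optional `cache_control` of type `ephemeral`.

```typescript
// Hypothetical sketch: only the static/dynamic split mirrors the reviewed prompt.
type CacheControl = { type: "ephemeral" };
type SystemBlock = { type: "text"; text: string; cache_control?: CacheControl };

// Second cache tier: compute a dynamic section once, reuse it on later turns.
const sectionCache = new Map<string, string>();

function memoizedSection(name: string, compute: () => string): string {
  const cached = sectionCache.get(name);
  if (cached !== undefined) return cached;
  const value = compute();
  sectionCache.set(name, value);
  return value;
}

function buildSystemBlocks(staticSections: string[], dynamicSections: string[]): SystemBlock[] {
  const blocks: SystemBlock[] = [];
  if (staticSections.length > 0) {
    // One breakpoint at the end of the stable prefix, so it can be cached.
    blocks.push({
      type: "text",
      text: staticSections.join("\n\n"),
      cache_control: { type: "ephemeral" },
    });
  }
  if (dynamicSections.length > 0) {
    // Per-session content goes after the boundary and carries no breakpoint.
    blocks.push({ type: "text", text: dynamicSections.join("\n\n") });
  }
  return blocks;
}
```

The point of the design is that nothing session-specific ever lands before the breakpoint, so the cached prefix stays byte-identical across turns.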

2. Safety Guardrails — Depth and Specificity

Severity: COMMENDATION

The "Executing actions with care" section is a standout. Rather than vague "be careful" instructions, it provides:

  • A concrete decision framework (reversibility + blast radius)
  • Enumerated examples organized by risk category
  • Explicit override semantics ("A user approving an action once does NOT mean they approve it in all contexts")
  • Anti-pattern callouts ("do not use destructive actions as a shortcut")

This follows Anthropic's own recommended pattern from their safeguards documentation almost verbatim. The "measure twice, cut once" closing is effective anchoring.
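To make the reversibility + blast radius framework concrete, here is one way it could be expressed as a predicate. The prompt states this as prose judgment, not code; the `BlastRadius` categories and the threshold below are assumptions made for illustration.

```typescript
// Hypothetical encoding of the "reversibility + blast radius" decision framework.
type BlastRadius = "local" | "shared" | "external";

interface ProposedAction {
  description: string;
  reversible: boolean;      // can the effect be undone cheaply?
  blastRadius: BlastRadius; // who or what is affected beyond the working copy?
}

// Assumed rule: ask first if the action cannot be undone or reaches past the local workspace.
function requiresConfirmation(action: ProposedAction): boolean {
  return !action.reversible || action.blastRadius !== "local";
}

const editFile: ProposedAction = {
  description: "edit a tracked source file",
  reversible: true,
  blastRadius: "local",
};
const forcePush: ProposedAction = {
  description: "git push --force to a shared branch",
  reversible: false,
  blastRadius: "shared",
};

console.log(requiresConfirmation(editFile));  // false
console.log(requiresConfirmation(forcePush)); // true
```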

3. Memory System Design

Severity: COMMENDATION

The memory taxonomy (user/feedback/project/reference) is well-designed:

  • Each type has description, when_to_save, how_to_use, and examples — matching Anthropic's recommendation to use structured XML tags for complex taxonomies
  • The "What NOT to save" section prevents a known failure mode (memory pollution with derivable information)
  • The "Before recommending from memory" section addresses staleness — a subtle but important failure mode
  • The body_structure field for feedback/project types ("lead with the rule, then Why, then How to apply") creates a consistent retrieval format

The two-step save process (file + MEMORY.md index) is a pragmatic solution to the "how do you search unstructured memory" problem.
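The per-type fields listed above map naturally onto a small data structure. The sketch below is illustrative only: the field names come from the review, but the `MemoryTypeSpec` interface, the renderer, and the placeholder strings are assumptions, and XML escaping is omitted for brevity.

```typescript
// Hypothetical model of one memory type and its XML rendering.
type MemoryTypeName = "user" | "feedback" | "project" | "reference";

interface MemoryTypeSpec {
  name: MemoryTypeName;
  description: string;
  when_to_save: string;
  how_to_use: string;
  examples: string[];
  body_structure?: string; // per the review, only feedback/project types carry this
}

function renderMemoryType(spec: MemoryTypeSpec): string {
  const fields: Array<[string, string | undefined]> = [
    ["name", spec.name],
    ["description", spec.description],
    ["when_to_save", spec.when_to_save],
    ["how_to_use", spec.how_to_use],
    ["body_structure", spec.body_structure],
  ];
  const body = fields
    .filter(([, value]) => value !== undefined)
    .map(([tag, value]) => `  <${tag}>${value}</${tag}>`);
  // Each example gets its own wrapper tag.
  const examples = spec.examples.map((example) => `  <example>${example}</example>`);
  return ["<type>", ...body, ...examples, "</type>"].join("\n");
}

// Placeholder content, not text from the real prompt.
const feedbackType: MemoryTypeSpec = {
  name: "feedback",
  description: "placeholder description",
  when_to_save: "placeholder trigger",
  how_to_use: "placeholder usage note",
  examples: ["placeholder example"],
  body_structure: "lead with the rule, then Why, then How to apply",
};
```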

4. Tool Routing Instructions

Severity: COMMENDATION

The Bash tool description's "prefer dedicated tools" section is well-structured: a clear rule ("Do NOT use Bash when a dedicated tool exists"), followed by a mapping table (Read instead of cat, Edit instead of sed, etc.), followed by the reasoning ("better user experience and easier to review"). This matches Anthropic's tool design guidance: "When writing tool descriptions, think of how you would describe your tool to a new hire."

5. Git Safety Protocol

Severity: COMMENDATION

The commit/PR instructions are a masterclass in procedural prompting:

  • Numbered steps with explicit parallelism annotations ("Run the following in parallel")
  • HEREDOC examples for correct formatting
  • Safety invariants called out with "NEVER" and "CRITICAL" markers
  • The attribution line (Co-Authored-By) elegantly embedded in the workflow

6. Ant vs. External Variant Strategy

Severity: COMMENDATION (design intent)

The ant-only sections represent a mature internal feedback loop. Items like "Report outcomes faithfully" and "Before reporting a task complete, verify it actually works" address real observed failure modes (false-claims, premature completion claims). The comment annotations (@[MODEL LAUNCH], PR references) show these are eval-driven, not guesswork.



Where It Falls Short

1. Massive Negative Instruction Bias

Severity: CRITICAL
Dimension: Cognitive Readability / Instruction Design
Location: Doing Tasks, Tool Routing, Git Safety, Output Efficiency

Anthropic's own docs state: "Tell Claude what to do, not what not to do." Instead of "Do not use markdown," they recommend "Your response should be composed of smoothly flowing prose paragraphs."

The Claude Code prompt violates this systematically. A count of negative instructions in the core sections:

| Section | "Don't/Never/Do NOT/Avoid" count |
| --- | --- |
| Doing Tasks | 14 |
| Using Your Tools | 3 |
| Git Safety Protocol | 9 |
| Output Efficiency | 5 |
| Tone and Style | 3 |
| Agent Tool | 5 |
| Total | ~39 |

Examples of the pattern:

  • "Don't add features, refactor code, or make 'improvements' beyond what was asked"
  • "Don't add error handling, fallbacks, or validation for scenarios that can't happen"
  • "Don't create helpers, utilities, or abstractions for one-time operations"
  • "Do not restate what the user said"
  • "NEVER run destructive git commands"
  • "NEVER skip hooks"
  • "NEVER commit changes unless the user explicitly asks"
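A count like the one in the table can be approximated with a simple keyword scan. The sketch below is a heuristic under stated assumptions: it counts marker words, not whole instructions, so it will not exactly reproduce a manual tally, and the function names are hypothetical.

```typescript
// Heuristic: count negative-instruction markers in a prompt section.
// Matches "don't" (straight or curly apostrophe), "do not", "never", "avoid", case-insensitively.
const NEGATIVE_MARKERS = /\b(?:don['’]t|do not|never|avoid)\b/gi;

function countNegativeInstructions(sectionText: string): number {
  const matches = sectionText.match(NEGATIVE_MARKERS);
  return matches === null ? 0 : matches.length;
}

// Tally every section of a prompt, keyed by section name.
function tallyNegatives(sections: Record<string, string>): Record<string, number> {
  const result: Record<string, number> = {};
  for (const name of Object.keys(sections)) {
    result[name] = countNegativeInstructions(sections[name] ?? "");
  }
  return result;
}

const sample =
  "Don't add features. NEVER skip hooks. Do NOT restate what the user said. Avoid unnecessary sleep.";
console.log(countNegativeInstructions(sample)); // 4
```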

Why this matters: Negative instructions spend tokens on what the model should NOT do without anchoring what it SHOULD do. Anthropic's guidance quoted above favors positive framing: "keep changes minimal and scoped to the request" states the target behavior directly, whereas "Don't add features, refactor code, or make improvements beyond what was asked" leaves the model to infer the desired behavior from the prohibition, which is the harder task.

Mandated refactor: Rewrite the top 10 highest-impact negative instructions as positive directives. Example:

  • Before: "Don't add features, refactor code, or make 'improvements' beyond what was asked."
  • After: "Keep changes scoped precisely to the user's request. A bug fix touches only the bug; a feature adds only what was specified."

2. Lost-in-the-Middle Vulnerability

Severity: CRITICAL
Dimension: Architectural Alignment / Performance
Location: Prompt assembly order (prompts.ts:560-576)

The "Lost in the Middle" phenomenon (Liu et al., 2024) shows that accuracy can drop sharply, by more than 20% in the worst cases the paper reports, when critical information sits in the middle of long contexts. Anthropic's own long-context tips acknowledge this as "context rot."

The current assembly order places the prompt's most important behavioral instructions in the exact middle:

Position 1-2: Identity + System (top — high recall)
Position 3-5: Doing Tasks + Actions + Tools (MIDDLE — degraded recall)
Position 6-7: Tone + Output Efficiency (MIDDLE — degraded recall)
--- DYNAMIC BOUNDARY ---
Position 8+: Session guidance, Memory, Environment (end — high recall)

The "Doing Tasks" section — which contains the core coding philosophy (YAGNI, minimal changes, security) — sits in positions 3-5 of a 15-20 element array. This is the dead zone. The "Executing actions with care" section, arguably the most safety-critical behavioral instruction, is at position 4.

Meanwhile, the environment section (cwd, platform, model ID) — purely informational, zero behavioral weight — occupies the high-recall end position.

Mandated refactor: Move the highest-priority behavioral instructions (safety guardrails, coding philosophy) to either the very beginning (position 1-2, after identity) or the very end (just before the dynamic boundary). Move purely informational sections (environment, model metadata) to the middle where recall degradation is acceptable.
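One way to sketch the proposed reordering is to sort sections by priority and deal them alternately to the front and the back, so the least critical content ends up in the middle. The priorities below are hypothetical, and a real assembly step would also have to pin the identity section first and keep dynamic sections after the cache boundary; this sketch ignores both constraints.

```typescript
// Sketch: place the highest-priority sections at the edges of the prompt.
interface PromptSection {
  name: string;
  priority: number; // higher means more behaviorally critical (assumed scale)
}

function orderForRecall(sections: PromptSection[]): PromptSection[] {
  const sorted = [...sections].sort((a, b) => b.priority - a.priority);
  const front: PromptSection[] = [];
  const back: PromptSection[] = [];
  sorted.forEach((section, index) => {
    // Alternate: 1st to the front, 2nd to the very end, 3rd after the 1st, and so on.
    if (index % 2 === 0) front.push(section);
    else back.unshift(section);
  });
  return [...front, ...back];
}

const ordered = orderForRecall([
  { name: "Environment", priority: 1 },
  { name: "Tone", priority: 2 },
  { name: "Tools", priority: 3 },
  { name: "Doing Tasks", priority: 4 },
  { name: "Executing actions with care", priority: 5 },
]);
console.log(ordered.map((section) => section.name).join(" | "));
// Executing actions with care | Tools | Environment | Tone | Doing Tasks
```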

3. Redundant and Contradictory Instructions

Severity: WARNING
Dimension: Structural Modularity / DRY
Location: Multiple sections

Several instructions appear in near-identical form across multiple sections, wasting tokens and fragmenting the model's attention:

Redundancy: "Be concise" — stated in at least 4 places:

  1. Tone and Style: "Your responses should be short and concise."
  2. Output Efficiency: "Go straight to the point... Be extra concise."
  3. Output Efficiency: "If you can say it in one sentence, don't use three."
  4. Proactive Mode: "Keep your text output brief and high-level... If you can say it in one sentence, don't use three."

Redundancy: "Don't use Bash for file operations" — stated in:

  1. System prompt "Using your tools" section
  2. Bash tool description header
  3. Bash tool "Instructions" section (the mapping table)

Redundancy: "Never skip hooks" — stated in:

  1. Git Safety Protocol: "NEVER skip hooks (--no-verify, --no-gpg-sign, etc)"
  2. Bash tool git subitem: "Never skip hooks (--no-verify) or bypass signing"
  3. Ant-only Git operations: "NEVER skip hooks (--no-verify, --no-gpg-sign, etc)"

Contradiction: Conciseness vs. explanatory depth. The external prompt says "Be extra concise" and "If you can say it in one sentence, don't use three," but also "When explaining, include only what is necessary for the user to understand." The ant prompt pulls the other way: "Err on the side of more explanation." Because the two variants target different audiences, this is not a contradiction in intent. But the ant-only text is conditionally injected at the same prompt position, so an ant build can carry both the conciseness directives and the "more explanation" directive, with nothing in the prompt telling the model how to weigh them; the build flag that selects the variant is invisible to it.

Mandated refactor: Deduplicate instructions. Each behavioral directive should appear exactly once, in the section with the highest positional recall for that directive's priority level. Cross-references should use "As stated in [section]" rather than restating.
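A first pass at finding such duplicates could be automated. The sketch below is deliberately simple and its names are hypothetical: it only catches instructions that are identical after normalization, so near-duplicates such as the Bash tool's shorter "Never skip hooks" variant would still need fuzzy matching or manual review.

```typescript
// Sketch: report instructions that appear verbatim (after normalization) in more than one section.
function normalizeInstruction(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation, keep hyphens for flags like --no-verify
    .replace(/\s+/g, " ")
    .trim();
}

function findDuplicateInstructions(sections: Record<string, string[]>): Map<string, string[]> {
  const seenIn = new Map<string, string[]>();
  for (const sectionName of Object.keys(sections)) {
    for (const instruction of sections[sectionName] ?? []) {
      const key = normalizeInstruction(instruction);
      const where = seenIn.get(key) ?? [];
      if (where.indexOf(sectionName) === -1) where.push(sectionName);
      seenIn.set(key, where);
    }
  }
  // Keep only instructions found in two or more sections.
  const duplicates = new Map<string, string[]>();
  seenIn.forEach((where, key) => {
    if (where.length > 1) duplicates.set(key, where);
  });
  return duplicates;
}

const duplicates = findDuplicateInstructions({
  "Git Safety Protocol": ["NEVER skip hooks (--no-verify, --no-gpg-sign, etc)"],
  "Ant-only Git operations": ["NEVER skip hooks (--no-verify, --no-gpg-sign, etc)"],
  "Tone and Style": ["Your responses should be short and concise."],
});
console.log(duplicates.size); // 1
```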

4. Tool Description Bloat

Severity: WARNING
Dimension: Performance & Complexity / Architectural Alignment
Location: BashTool/prompt.ts, AgentTool/prompt.ts

Anthropic's tool design guide warns: "One of the most common failure modes we see is bloated tool sets." The same principle applies to tool descriptions.

The Bash tool description alone contains:

  • The tool's core function (~50 tokens)
  • A "prefer dedicated tools" section (~200 tokens) — duplicated from the system prompt
  • Detailed instructions for issuing multiple commands (~150 tokens)
  • Git command guidelines (~100 tokens)
  • Sleep avoidance guidelines (~150 tokens)
  • Sandbox configuration (~300+ tokens)
  • Full commit workflow (~500 tokens)
  • Full PR creation workflow (~400 tokens)
  • Other common operations (~50 tokens)

Total: ~1,900+ tokens for a single tool description. The commit and PR workflows are essentially full sub-prompts embedded inside a tool description. This works against the "tools > system > messages" cache hierarchy: tool definitions sit at the front of the cached prefix, so any change to the commit workflow invalidates the tool schema cache and every cached segment that follows it.

The Agent tool description is similarly bloated (~800+ tokens) with examples, "when NOT to use" lists, and a full prompt-writing tutorial.

Mandated refactor: Extract the commit and PR workflows from the Bash tool description into dedicated tools or skills. The Bash tool description should describe what Bash does and its constraints, not contain 900 tokens of git workflow instructions. This would also improve cache hit rates since the tool schema would change less frequently.
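The arithmetic behind these figures is easy to check. The sketch below simply sums the per-component estimates listed above (all of them the review's approximations, not measured token counts) and shows what extracting the two workflows would leave behind.

```typescript
// Per-component token estimates for the Bash tool description, as listed in the review.
const bashDescriptionTokens: Record<string, number> = {
  "core function": 50,
  "prefer dedicated tools": 200,
  "multiple commands": 150,
  "git command guidelines": 100,
  "sleep avoidance": 150,
  "sandbox configuration": 300,
  "commit workflow": 500,
  "PR creation workflow": 400,
  "other common operations": 50,
};

function sumTokens(parts: Record<string, number>, keys: string[] = Object.keys(parts)): number {
  return keys.reduce((total, key) => total + (parts[key] ?? 0), 0);
}

const totalTokens = sumTokens(bashDescriptionTokens);
const workflowTokens = sumTokens(bashDescriptionTokens, ["commit workflow", "PR creation workflow"]);
const remainingTokens = totalTokens - workflowTokens;

console.log(totalTokens);     // 1900
console.log(workflowTokens);  // 900
console.log(remainingTokens); // 1000
```

On these estimates the two workflows account for a little under half of the description (900 of 1,900 tokens).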

5. Absence of Positive Examples for Core Behaviors

Severity: WARNING
Dimension: Testability / Cognitive Readability
Location: Doing Tasks, Actions, Output Efficiency

Anthropic recommends 3-5 diverse examples in <example> tags for complex behaviors. The memory system follows this perfectly — each type has 2 worked examples. But the core coding instructions have zero examples:

  • "Don't add features beyond what was asked" — no example of what "scoped to the request" looks like
  • "If an approach fails, diagnose why before switching tactics" — no example of the diagnostic process
  • "Be careful not to introduce security vulnerabilities" — no example of catching and fixing one
  • "Carefully consider the reversibility and blast radius" — no example of the decision process

The git commit and PR sections have examples, but they're for formatting (HEREDOC syntax), not for judgment calls.

Mandated refactor: Add 2-3 <example> blocks to the "Doing Tasks" section showing the model correctly scoping a change, and 1-2 to the "Actions" section showing the reversibility decision process. These would be far more effective than the current wall of prohibitions.
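As an illustration of what such a block might look like, the sketch below renders a user/assistant pair inside <example> tags. The example content is invented for this sketch and does not come from the Claude Code prompt; `renderExample` is a hypothetical helper.

```typescript
// Sketch: render a worked judgment-call example in <example> tags.
interface WorkedExample {
  user: string;
  assistant: string;
}

function renderExample(example: WorkedExample): string {
  return [
    "<example>",
    `user: ${example.user}`,
    `assistant: ${example.assistant}`,
    "</example>",
  ].join("\n");
}

// Hypothetical content showing a correctly scoped change.
const scopedFix: WorkedExample = {
  user: "Fix the off-by-one error in paginate().",
  assistant:
    "Changed the loop bound in paginate() so the last page is no longer skipped. " +
    "The function also lacks input validation, but I left that untouched because it is outside the request.",
};

console.log(renderExample(scopedFix));
```

An example like this demonstrates the desired behavior (fix only the bug, mention but do not act on adjacent issues) rather than prohibiting the undesired one.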

6. No Priority Hierarchy Among Instructions

Severity: WARNING
Dimension: Cognitive Readability / Functional Correctness
Location: All sections

When instructions conflict (as they inevitably do in edge cases), the model has no guidance on which takes precedence. Examples of real conflicts:

  1. "Keep changes scoped" vs. "Be careful not to introduce security vulnerabilities" — what if fixing a vulnerability requires changes beyond the user's request?
  2. "Check with the user before proceeding" (Actions) vs. "Escalate to the user with AskUserQuestion only when you're genuinely stuck" (Doing Tasks) — which threshold applies?
  3. "NEVER commit changes unless the user explicitly asks" vs. proactive mode's "Make code changes. Commit when you reach a good stopping point" — the proactive section is feature-gated, but the git safety protocol is in the always-present Bash tool description.

The prompt never establishes: "When instructions conflict, prioritize safety > user explicit instructions > coding philosophy > style preferences."

Mandated refactor: Add a 2-3 line priority hierarchy at the top of the prompt, after identity. Example: "When guidelines conflict: (1) safety and reversibility always win, (2) explicit user instructions override defaults, (3) correctness over style."
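The proposed ordering (safety > explicit user instructions > coding philosophy > style) amounts to a total order over instruction tiers. In the prompt this would be prose, not code; the sketch below only illustrates the ordering, and the tier names and `resolveConflict` helper are hypothetical.

```typescript
// Sketch: resolve a conflict between two directives by tier, earliest tier wins.
const PRIORITY_ORDER = ["safety", "user_instruction", "coding_philosophy", "style"] as const;
type Tier = (typeof PRIORITY_ORDER)[number];

interface Directive {
  tier: Tier;
  text: string;
}

function resolveConflict(a: Directive, b: Directive): Directive {
  // Lower index means higher priority; ties go to the first argument.
  return PRIORITY_ORDER.indexOf(a.tier) <= PRIORITY_ORDER.indexOf(b.tier) ? a : b;
}

const keepScoped: Directive = { tier: "coding_philosophy", text: "Keep changes scoped" };
const noVulnerabilities: Directive = {
  tier: "safety",
  text: "Be careful not to introduce security vulnerabilities",
};

console.log(resolveConflict(keepScoped, noVulnerabilities).text);
// Be careful not to introduce security vulnerabilities
```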

7. System Prompt Self-Promotion

Severity: WARNING
Dimension: Documentation & Intent / YAGNI
Location: Environment section

The environment section contains three lines of product marketing:

  • "The most recent Claude model family is Claude 4.5/4.6. Model IDs — Opus 4.6: 'claude-opus-4-6', Sonnet 4.6: 'claude-sonnet-4-6', Haiku 4.5: 'claude-haiku-4-5-20251001'. When building AI applications, default to the latest and most capable Claude models."
  • "Claude Code is available as a CLI in the terminal, desktop app (Mac/Windows), web app (claude.ai/code), and IDE extensions (VS Code, JetBrains)."
  • "Fast mode for Claude Code uses the same Claude Opus 4.6 model with faster output. It does NOT switch to a different model. It can be toggled with /fast."

These consume ~150 tokens on every single API call to tell the model to recommend Anthropic products. This is legitimate for a product (users asking "what model am I using?" or "how do I use Claude in VS Code?" need answers), but it violates the "minimal set of information" principle. The model IDs in particular are a cache-busting liability — every model launch requires updating three hardcoded strings in the static prompt prefix.

Mandated refactor: Move model ID table and product availability to a dynamic section or an on-demand skill. The model already knows its own identity from the API metadata.

8. Conditional Complexity Explosion

Severity: WARNING
Dimension: Structural Modularity / Testability
Location: prompts.ts throughout

The prompt has at least 10 independent conditional axes:

  1. USER_TYPE === 'ant' (ant vs. external)
  2. feature('PROACTIVE') / feature('KAIROS') (proactive mode)
  3. feature('CACHED_MICROCOMPACT') (function result clearing)
  4. feature('TOKEN_BUDGET') (token budget mode)
  5. feature('KAIROS_BRIEF') (brief mode)
  6. feature('EXPERIMENTAL_SKILL_SEARCH') (skill search)
  7. isUndercover() (undercover mode)
  8. isForkSubagentEnabled() (fork vs. traditional agents)
  9. hasEmbeddedSearchTools() (ant-native search)
  10. isReplModeEnabled() (REPL mode)

This produces a theoretical 2^10 = 1,024 prompt variants. In practice, many combinations are impossible (feature flags are correlated), but the actual number of distinct prompts in production is likely in the dozens.

The testing problem: Each variant can have different behavioral characteristics. The "Doing Tasks" section alone has 4 conditional blocks. The Agent tool description has 3 conditional branches. There is no evidence in the codebase of systematic testing across prompt variants — the eval files test individual sections, not combinatorial interactions.

Mandated refactor: Reduce conditional axes by consolidating related flags (KAIROS/KAIROS_BRIEF/PROACTIVE could be a single enum). Document the intended prompt variants as a matrix and ensure eval coverage for at least the high-traffic combinations.
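The size of the variant space, and the effect of merging flags, can be computed directly. The sketch below treats each axis from the list above as independent, which gives the same theoretical upper bound the review uses. The merged `assistantMode` enum and its three values are hypothetical.

```typescript
// Sketch: count theoretical prompt variants as the product of axis sizes.
interface Axis {
  name: string;
  values: string[];
}

const booleanAxis = (name: string): Axis => ({ name, values: ["off", "on"] });

function countVariants(axes: Axis[]): number {
  return axes.reduce((count, axis) => count * axis.values.length, 1);
}

const currentAxes: Axis[] = [
  "USER_TYPE === 'ant'", "PROACTIVE/KAIROS", "CACHED_MICROCOMPACT", "TOKEN_BUDGET",
  "KAIROS_BRIEF", "EXPERIMENTAL_SKILL_SEARCH", "isUndercover",
  "isForkSubagentEnabled", "hasEmbeddedSearchTools", "isReplModeEnabled",
].map(booleanAxis);

// Assumed consolidation: replace the two KAIROS-related boolean axes with one three-valued enum.
const mergedNames = new Set(["PROACTIVE/KAIROS", "KAIROS_BRIEF"]);
const consolidatedAxes: Axis[] = [
  ...currentAxes.filter((axis) => !mergedNames.has(axis.name)),
  { name: "assistantMode", values: ["default", "proactive", "brief"] },
];

console.log(countVariants(currentAxes));      // 1024
console.log(countVariants(consolidatedAxes)); // 768
```

Merging flags only trims the bound modestly (1,024 to 768 here). The larger gain comes from documenting which combinations are actually reachable and covering those in evals.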

9. XML vs. Markdown Inconsistency

Severity: NITPICK
Dimension: Cognitive Readability
Location: Memory section vs. all other sections

Anthropic recommends XML tags for structured data and Markdown for prose. The prompt mostly follows this, with one inconsistency:

  • Memory types use <types>/<type>/<name>/<description> XML tags (correct)
  • Git examples use <example> tags (correct)
  • The rest of the prompt uses Markdown headers and bullet points (correct)
  • But the memory <examples> blocks contain unstructured user:/assistant: pairs without XML wrapping for each example (inconsistent — should be <example> tags matching the Agent tool's pattern)

10. The "IMPORTANT" Inflation Problem

Severity: NITPICK
Dimension: Cognitive Readability
Location: Throughout

The word "IMPORTANT" appears as a callout prefix in:

  1. Cyber risk instruction (2x)
  2. Output efficiency section
  3. Bash tool: prefer dedicated tools over Bash
  4. Bash tool: never use -uall
  5. Bash tool: never use -i flag
  6. Bash tool: do not use --no-edit
  7. Git safety: never commit without asking
  8. PR creation: follow steps carefully
  9. Scratchpad: always use scratchpad directory
  10. Skill tool: only use for listed skills

When everything is IMPORTANT, nothing is. The model cannot distinguish between "IMPORTANT: don't use git add -i" (minor UX issue) and "IMPORTANT: Assist with authorized security testing" (safety-critical boundary). The Claude 4.x docs specifically warn that "CRITICAL: You MUST use this tool when..." causes overtriggering in newer models and should be dialed back.

Mandated refactor: Reserve IMPORTANT/CRITICAL/NEVER for the top 3-5 truly safety-critical instructions. Downgrade the rest to regular prose. The git -i flag restriction does not merit the same severity marker as the security boundary.


Structural Metrics

| Metric | Value | Assessment |
| --- | --- | --- |
| Estimated static token count | ~8,000-12,000 | Above the "attention degradation starts" threshold for mid-context content |
| Estimated total token count (with tools + dynamic) | ~15,000-25,000 | Significant context budget consumed before the user says anything |
| Negative instructions (Don't/Never/Avoid) | ~39 | High; Anthropic recommends positive framing |
| Redundant instruction pairs | ~8 | Moderate; each wastes ~50-100 tokens |
| Conditional axes | 10 | High; produces unmaintainable variant space |
| Worked examples in core sections | 0 (excluding git HEREDOC formatting) | Below Anthropic's recommended 3-5 |
| Worked examples in memory section | 8 | Excellent |
| IMPORTANT/CRITICAL markers | ~10 | Diluted; should be 3-5 |
| Sections with priority conflicts | 3 identified | No resolution mechanism provided |

Recommendations Summary (Ranked by Impact)

| Priority | Issue | Effort | Impact |
| --- | --- | --- | --- |
| 1 | Rewrite top negative instructions as positive directives | Medium | High: improves instruction following across all tasks |
| 2 | Reorder sections for lost-in-the-middle mitigation | Low | High: safety/coding instructions get better recall |
| 3 | Add worked examples to Doing Tasks and Actions | Medium | High: examples are the single most effective prompt technique |
| 4 | Add explicit priority hierarchy for conflicting instructions | Low | Medium: prevents edge-case confusion |
| 5 | Deduplicate redundant instructions | Low | Medium: saves ~400-500 tokens, reduces attention fragmentation |
| 6 | Extract git workflows from Bash tool description | Medium | Medium: improves cache hit rate, reduces tool description bloat |
| 7 | Consolidate conditional flags | High | Medium: reduces variant space, improves testability |
| 8 | Reserve IMPORTANT markers for safety-critical items only | Low | Low-Medium: prevents severity inflation |
| 9 | Move product marketing to dynamic/on-demand section | Low | Low: saves ~150 tokens, reduces cache-bust risk |
| 10 | Standardize XML tag usage in examples | Low | Low: consistency improvement |

