jj-ai: AI Attribution Tracking for Jujutsu VCS

Version: 0.1.1-draft
Status: Design Specification
Date: 2026-02-03

Executive Summary

jj-ai bridges Jujutsu's rewrite-heavy workflow with git-ai's commit-SHA-keyed attribution system. By storing attributions keyed by jj's stable change IDs and materializing to refs/notes/ai at export time, we achieve:

Attribution that survives rebases, amends, squashes, and splits
Full git-ai v3.0.0 and v4.0.0 compatibility for Git ecosystem interop
First-class support for agentic primitives: plans, subagents, forked threads
No dependency on hooks or jj internals

1. Problem Statement

1.1 The Fundamental Conflict

git-ai stores AI authorship attribution in Git notes keyed by commit SHA:

refs/notes/ai: <commit-sha> → attestation + metadata

Jujutsu rewrites commits constantly. A single logical change may cycle through dozens of SHAs during normal workflow:

jj new → jj describe → jj rebase → jj squash  # 4 different SHAs, same change

Every rewrite invalidates the SHA key, orphaning the attribution.

1.2 jj's Advantage: Stable Change IDs

jj assigns each change a change ID (16 random bytes, displayed in a "reverse hex" alphabet that uses the letters z through k in place of 0-9a-f) that persists across all rewrites. This is exactly the stable identifier that attribution needs.

1.3 Integration Constraints

Constraint	Implication
jj ignores `refs/notes/*` during import/export	Must manually bridge notes to/from Git
jj has no hook system	Cannot intercept rewrites automatically
jj has no external subcommand discovery	`jj-ai` won't auto-invoke as `jj ai`
git-ai keys by commit SHA	Must resolve change-id → SHA at sync time

2. Architecture

2.1 Data Flow

┌─────────────────────────────────────────────────────────────────┐
│                    Coding Agent (Cursor, etc.)                   │
│  jj-ai attach -r @ --tool=cursor --conversation-id=abc-123      │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                 jj-ai Local Store (canonical)                    │
│  .jj/ai/notes/<change-id>/<commit-id>.json                      │
│  • Keyed by change-id (stable across rewrites)                  │
│  • Contains attestation + git-ai v3.0.0 metadata                │
│  • Producer/version tracking for safe pruning                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │ jj-ai sync --to-git
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Git Notes (materialized)                      │
│  refs/notes/ai                                                  │
│  • Keyed by commit SHA (resolved at sync time)                  │
│  • git-ai v3.0.0 compatible format                              │
│  • Pushable to GitHub/GitLab                                    │
└─────────────────────────────────────────────────────────────────┘

2.2 Storage Location

.jj/
├── repo/
│   └── store/
│       └── extra/          # jj internals (DO NOT USE)
└── ai/                     # jj-ai storage (separate from jj internals)
    ├── config.toml         # jj-ai configuration
    ├── .lock                # advisory lock file
    ├── notes/              # attribution records (by change-id, then commit-id)
    │   └── <change-id>/
    │       ├── <commit-id-1>.json   # variant 1
    │       ├── <commit-id-2>.json   # variant 2 (for divergent changes)
    │       └── ...
    └── sessions/           # optional: cached session metadata
        └── <session-hash>.json

Storage model: Attribution is keyed by (change-id, commit-id) to support:

Normal case: one commit per change → one file
Divergent changes: multiple commits share change-id → multiple files
Each file represents attribution for a specific commit snapshot

Rationale: Storing under .jj/ai/ (not .jj/repo/store/extra/) avoids coupling to jj's internal storage format while remaining repo-local and gitignored.

3. Storage Format

3.1 Attribution Record

Each .jj/ai/notes/<change-id>/<commit-id>.json file contains:

{
  "schema_version": "jj-ai/0.1.0",
  "change_id": "kpqvuntorskozwnu",
  "recorded_at": "2026-02-03T10:30:00Z",
  "recorded_commit_id": "abc123def456...",
  
  "attestation": {
    "files": [
      {
        "path": "src/main.rs",
        "ranges": [
          { "session": "a1b2c3d4e5f67890", "lines": "1-10,15-20" },
          { "session": "0987654321fedcba", "lines": "25,30-35" }
        ]
      }
    ]
  },
  
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": {
        "tool": "cursor",
        "model": "claude-sonnet-4-20250514",
        "conversation_id": "6ef2299e-..."
      },
      "human_author": "Developer <[email protected]>",
      "messages": [],
      "total_additions": 50,
      "total_deletions": 10,
      "accepted_lines": 45,
      "overridden_lines": 5
    }
  },
  
  "content_anchors": [
    {
      "path": "src/main.rs",
      "snippet_hash": "sha256:...",
      "context_hash": "sha256:..."
    }
  ]
}

3.2 Session Hash Generation

Per git-ai v3.0.0 spec:

session_hash = lowercase_hex(SHA-256(utf8("{tool}:{conversation_id}")))[0:16]

Requirements:

tool: Exact string as provided (case-sensitive, no normalization)
conversation_id: Exact string as provided (UTF-8 encoded)
Compute SHA-256 over the UTF-8 byte string
Take first 16 characters of the lowercase hexadecimal digest
Result MUST contain only characters [0-9a-f]
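The last two requirements can be checked mechanically. The following is a minimal sketch of such a check; the function name `is_valid_session_hash` is hypothetical and not part of git-ai or jj-ai. It validates the shape of a hash only and does not compute the SHA-256 digest itself.

```rust
/// Returns true if `s` has the shape required above:
/// exactly 16 characters, each a lowercase hex digit.
/// (Hypothetical helper; it does not recompute the digest.)
fn is_valid_session_hash(s: &str) -> bool {
    s.len() == 16 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```

A reader of refs/notes/ai could apply such a check before trusting a key in the prompts map.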

Example (the digest shown is a placeholder, not the real SHA-256 of the input):

input: "cursor:6ef2299e-abc-123"
SHA-256(utf8(input)) = "a1b2c3d4e5f67890..."
session_hash = "a1b2c3d4e5f67890"

3.3 Git Notes Export Format

When syncing to refs/notes/ai, render git-ai v3.0.0 compatible format:

src/main.rs
  a1b2c3d4e5f67890 1-10,15-20
  0987654321fedcba 25,30-35
---
{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "7734793b756b3921c88db5375a8c156e9532447b",
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "cursor", "model": "claude-sonnet-4-20250514", "id": "6ef2299e-..." },
      "human_author": "Developer <[email protected]>",
      "messages": [],
      "total_additions": 50,
      "total_deletions": 10,
      "accepted_lines": 45,
      "overridden_lines": 5
    }
  },
  "extensions": {
    "jj-ai": {
      "producer": "jj-ai/0.1.0",
      "jj_change_id": "kpqvuntorskozwnu",
      "stale": false
    }
  }
}

Attestation section formatting:

File paths sorted lexicographically
Paths with spaces/tabs/newlines MUST be double-quoted
Within each file: entries sorted by session hash, then by range start
Line ranges normalized: sorted, merged adjacent, no duplicates
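The range normalization rule above (sorted, adjacent ranges merged, no duplicates) could be implemented roughly as follows. This is a sketch under stated assumptions: the helper names `parse_ranges`, `normalize`, and `render` are hypothetical, ranges are inclusive and 1-based, and "adjacent" is read as "directly touching" (1-5 followed by 6-10).

```rust
/// Parse a range list such as "15-20,1-10,12" into inclusive (start, end) pairs.
/// Returns None on malformed input. (Hypothetical helper.)
fn parse_ranges(s: &str) -> Option<Vec<(u32, u32)>> {
    let mut out = Vec::new();
    for part in s.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let (start, end): (u32, u32) = match part.split_once('-') {
            Some((a, b)) => (a.trim().parse().ok()?, b.trim().parse().ok()?),
            None => {
                let n: u32 = part.parse().ok()?;
                (n, n)
            }
        };
        if start == 0 || start > end {
            return None;
        }
        out.push((start, end));
    }
    Some(out)
}

/// Sort, merge overlapping or directly adjacent ranges, and drop duplicates.
fn normalize(mut ranges: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    ranges.sort_unstable();
    let mut merged: Vec<(u32, u32)> = Vec::new();
    for (start, end) in ranges {
        if let Some(last) = merged.last_mut() {
            // Overlapping or touching (e.g. 1-5 and 6-9): extend the previous range.
            if start <= last.1.saturating_add(1) {
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}

/// Render ranges back to the "1-10,15-20" form; a single line renders as "25".
fn render(ranges: &[(u32, u32)]) -> String {
    ranges
        .iter()
        .map(|&(s, e)| if s == e { s.to_string() } else { format!("{s}-{e}") })
        .collect::<Vec<_>>()
        .join(",")
}
```

Normalizing before rendering keeps the exported attestation section byte-stable across repeated syncs, which matters for the note conflict detection in section 5.4.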

Extension fields (namespaced under extensions.jj-ai):

producer: Identifies jj-ai as the source (enables safe pruning)
jj_change_id: Back-reference for import
stale: Indicates attribution may be outdated

3.4 Storage Atomicity and Locking

Requirements:

All writes to .jj/ai/notes/ MUST be atomic (write to temp file → fsync → rename)
Implementations SHOULD use file locking to prevent concurrent attach races
Lock file: .jj/ai/.lock with advisory locking
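The atomic-write requirement (temp file, fsync, rename) might look like the sketch below, using only the Rust standard library. The function name `write_atomic` and the temp-file naming scheme are assumptions for illustration. Advisory locking and fsync of the parent directory are deliberately left out.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `dest` atomically: write a temp file in the same
/// directory, fsync it, then rename it over the destination.
fn write_atomic(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    let dir = dest.parent().unwrap_or_else(|| Path::new("."));
    let file_name = dest
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "no file name"))?;
    // Hypothetical temp-name scheme: ".<name>.tmp.<pid>" next to the target,
    // so the rename never crosses a filesystem boundary.
    let mut tmp_name = std::ffi::OsString::from(".");
    tmp_name.push(file_name);
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp = dir.join(tmp_name);

    let result = (|| -> io::Result<()> {
        let mut f = File::create(&tmp)?;
        f.write_all(bytes)?;
        f.sync_all()?; // flush file data to disk before it becomes visible
        fs::rename(&tmp, dest)
    })();
    if result.is_err() {
        let _ = fs::remove_file(&tmp); // best-effort cleanup of the temp file
    }
    result
}
```

Because the rename replaces the destination in one step, a concurrent reader sees either the old record or the new one, never a partially written file.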

Concurrent write behavior:

If lock cannot be acquired within 5 seconds, fail with error
On crash/timeout, stale locks older than 60 seconds MAY be broken

4. CLI Design

4.1 Binary Name

Standalone binary: jj-ai

Since jj doesn't support external subcommand discovery, users invoke directly:

jj-ai <command> [options]

Recommended shell alias:

alias jja='jj-ai'

4.2 Commands

`jj-ai attach`

Record AI attribution for a change.

jj-ai attach [OPTIONS]

Options:
  -r, --revision <REV>       Target revision [default: @]
  -t, --tool <TOOL>          AI tool name (cursor, copilot, claude-code, etc.)
  -m, --model <MODEL>        Model identifier
  -c, --conversation-id <ID> Conversation/session identifier
  -f, --file <PATH>          Specific file(s) to attribute [repeatable]
  -l, --lines <RANGE>        Line ranges (e.g., "1-10,15-20")
      --transcript-stdin     Read JSON transcript from stdin
      --human-author <SIG>   Override human author signature
      --from-checkpoint      Compute line ranges from the diff since the last checkpoint

Examples:

# Attach attribution to current change
jj-ai attach -r @ --tool=cursor --model=claude-sonnet-4-20250514 --conversation-id=abc-123

# Attribute specific file/lines
jj-ai attach -r @ --tool=cursor -c abc-123 --file src/main.rs --lines 1-50

# With transcript
echo '{"messages": [...]}' | jj-ai attach -r @ --tool=cursor -c abc-123 --transcript-stdin

`jj-ai show`

Display attribution for a change.

jj-ai show [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --format <FMT>      Output format: pretty, json, git-ai [default: pretty]

`jj-ai blame`

AI-enhanced blame output. Leverages jj file annotate internally.

jj-ai blame <PATH> [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --porcelain         Machine-readable output

Output:

src/main.rs:
  1-10   [AI: cursor/claude-sonnet-4-20250514] a1b2c3d4  "Implement auth flow"
  11-14  [Human]                               e5f67890
  15-20  [AI: cursor/claude-sonnet-4-20250514] a1b2c3d4  "Implement auth flow"

Implementation: Calls jj file annotate with a JSON-emitting template, then enriches each line's change-id with AI attribution from .jj/ai/notes/.

Native jj Template Integration

For users who prefer native jj file annotate, jj-ai provides a helper that pre-computes AI labels:

# Generate a template alias with AI data baked in
jj-ai template-gen --revset @ > /tmp/ai-annotate.toml

# Use with jj directly
jj file annotate src/main.rs --config-file /tmp/ai-annotate.toml \
  -T 'ai_annotate'

The generated template uses if() with change_id().short(8) comparisons (short() without an argument returns 12 characters, which would never equal an 8-character prefix):

[template-aliases]
'ai_annotate' = '''
if(commit.change_id().short(8) == "kpqvunto",
  label("ai cursor", line_number ++ " [AI:cursor] " ++ content),
  if(commit.change_id().short(8) == "xyzwklmn",
    label("ai copilot", line_number ++ " [AI:copilot] " ++ content),
    line_number ++ " " ++ content
  )
)
'''

Limitation: Generated template is static; re-run template-gen after new attributions.

`jj-ai sync`

Synchronize attributions between jj-ai store and Git notes.

jj-ai sync [OPTIONS]

Options:
      --to-git              Export to refs/notes/ai
      --from-git            Import from refs/notes/ai
      --revs <REVSET>       Scope [default: stack(@)]
      --all-reachable       Sync all reachable commits
      --strict              Fail on any stale attribution
      --recompute           Attempt to remap stale line ranges
      --prune               Remove orphaned notes
      --prune-dry-run       Show what would be pruned
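      --merge               Merge with an existing note instead of failing (see 5.4)
      --notes-ref <REF>     Notes ref to read or write [default: refs/notes/ai]
      --force               Allow overwriting non-jj-ai notes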
      --dry-run             Show what would be synced
      --json                JSON output for scripting

Examples:

# Standard workflow
jj git export && jj-ai sync --to-git

# CI: strict mode, full repo
jj-ai sync --to-git --all-reachable --strict

# Preview sync
jj-ai sync --to-git --dry-run

# Import from collaborator
git fetch origin refs/notes/ai:refs/notes/ai
jj git import
jj-ai sync --from-git

`jj-ai stats`

Display AI contribution statistics.

jj-ai stats [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope [default: @]
      --by-tool           Group by AI tool
      --by-file           Group by file
      --by-author         Group by human author

`jj-ai move`

Transfer attribution between changes (for splits/squashes).

jj-ai move --from <REV> --to <REV> [OPTIONS]

Options:
      --file <PATH>       Only move attribution for specific file(s)
      --lines <RANGE>     Only move specific line ranges
      --merge             Merge with existing attribution on target
      --remap             Attempt to remap line numbers via diff

`jj-ai checkpoint`

Record a checkpoint for precise attribution diffing.

jj-ai checkpoint [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --type <TYPE>       Checkpoint type: human, ai-start, ai-end

Checkpoints enable precise attribution by marking boundaries:

jj-ai checkpoint --type=human     # Mark current state as human baseline
# ... AI makes changes ...
jj-ai checkpoint --type=ai-end    # Mark AI changes complete
jj-ai attach -r @ --from-checkpoint  # Auto-compute ranges from diff

`jj-ai template-gen`

Generate jj template aliases for AI-aware annotation.

jj-ai template-gen [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope for template data [default: ancestors(@, 20)]
  -o, --output <PATH>     Output file [default: stdout]
      --format <FMT>      Output format: toml, json [default: toml]

Example:

jj-ai template-gen -r 'trunk()..@' > /tmp/ai.toml
jj file annotate src/main.rs --config-file /tmp/ai.toml -T ai_annotate

`jj-ai diff`

Calculate AI contribution statistics on-demand (replaces stored stats in v4).

jj-ai diff [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope [default: @]
      --by-session        Break down by session
      --by-file           Break down by file
      --by-tool           Break down by AI tool
      --json              JSON output

Output:

$ jj-ai diff -r 'trunk()..@'
AI Contribution Summary (15 commits):
  Total lines:     1,234
  AI-authored:       890 (72%)
  Human-authored:    344 (28%)
  
By tool:
  cursor:     650 lines (53%)
  copilot:    240 lines (19%)

5. Sync Behavior

5.1 Scope

Flag	Revset	Use Case
(default)	`stack(@)`	Day-to-day development
`--revs <R>`	User-specified	Custom scope
`--all-reachable`	`all()`	CI, releases

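Definition of stack(@): All ancestors of @ until reaching immutable commits or trunk. stack() is a jj-ai shorthand, not a built-in jj revset function; it corresponds roughly to the jj revset ::@ ~ immutable().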

5.2 Stale Attribution Handling

Attribution becomes "stale" when the change's content has been modified since recording.

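Detection: Compare recorded_commit_id with the change's current commit ID. If they differ, check the content anchors, since a rebase or description edit changes the commit ID without necessarily changing the attributed content.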

Default behavior: Warn and sync with stale: true marker.

$ jj-ai sync --to-git
warning: attribution for kpqvunto is stale (content changed since recording)
  → syncing with stale marker
Synced 3 attributions to refs/notes/ai

Strict mode (--strict): Fail if any attribution is stale.

$ jj-ai sync --to-git --strict
error: attribution for kpqvunto is stale
  recorded against: abc123def456
  current commit:   789abc...
  hint: use --recompute to attempt range remapping, or re-attach attribution

Recompute mode (--recompute): Attempt to remap line ranges using content anchors.

Algorithm:

For each stale file attribution, search for snippet_hash in current file
If unique match found, compute line offset and adjust ranges
If no match or ambiguous, keep stale marker
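The recompute steps above can be sketched as follows. One simplification is made loudly: the spec matches on snippet_hash (SHA-256), whereas this sketch compares the snippet's raw lines directly, because the Rust standard library has no SHA-256. The names `find_offset` and `shift` are hypothetical.

```rust
/// Try to relocate an attributed block after the file changed.
/// `snippet` is the originally attributed text (one entry per line) and
/// `old_start` is its 1-based start line at recording time.
/// Returns the line offset to apply to the recorded ranges, or None if the
/// snippet is missing or appears more than once (attribution stays stale).
fn find_offset(current: &[&str], snippet: &[&str], old_start: u32) -> Option<i64> {
    if snippet.is_empty() || snippet.len() > current.len() {
        return None;
    }
    let mut hits = current
        .windows(snippet.len())
        .enumerate()
        .filter(|(_, w)| *w == snippet)
        .map(|(i, _)| i as i64 + 1); // 1-based line number of the match
    let first = hits.next()?;
    if hits.next().is_some() {
        return None; // ambiguous: more than one match
    }
    Some(first - i64::from(old_start))
}

/// Shift inclusive 1-based ranges by `offset`.
/// Returns None if any shifted line would fall outside 1..=u32::MAX.
fn shift(ranges: &[(u32, u32)], offset: i64) -> Option<Vec<(u32, u32)>> {
    ranges
        .iter()
        .map(|&(s, e)| {
            let ns = i64::from(s) + offset;
            let ne = i64::from(e) + offset;
            if ns < 1 || ne > i64::from(u32::MAX) {
                return None;
            }
            Some((ns as u32, ne as u32))
        })
        .collect()
}
```

Returning None in the ambiguous case matches the rule that an unresolvable attribution keeps its stale marker instead of being guessed at.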

5.3 Prune Behavior

Default: Never prune. Sync is non-destructive.

Prune semantics are orphan-based, NOT scope-based. This prevents accidental deletion of valid notes outside the current working set.

--prune: Remove Git notes where the annotated commit is unreachable from any Git ref (branches, tags, HEAD). This mirrors git notes prune semantics.

Additional requirements for pruning:

Commit must be unreachable (not just outside sync scope)
Note must have been produced by jj-ai (extensions.jj-ai.producer present)

--prune --force: Also remove notes without jj-ai producer marker (use with caution).

--prune-dry-run: List what would be removed without deleting.

$ jj-ai sync --to-git --prune-dry-run
Would prune 2 notes (commits unreachable from any ref):
  abc123... (unreachable, producer: jj-ai/0.1.0)
  def456... (unreachable, producer: jj-ai/0.1.0)
Skipping 1 note (not produced by jj-ai):
  789abc... (unreachable, producer: unknown)
Preserving 5 notes (commits still reachable)

WARNING: --prune based on revset scope (e.g., stack(@)) is explicitly NOT supported, as it would delete valid notes for commits on other branches or trunk.
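The prune rules in this section reduce to a small decision function. The sketch below is illustrative; the enum `PruneAction` and the function `classify` are hypothetical names, and reachability is assumed to have been computed elsewhere.

```rust
#[derive(Debug, PartialEq)]
enum PruneAction {
    Prune,    // unreachable and eligible for removal
    Skip,     // unreachable, but not produced by jj-ai and --force not given
    Preserve, // commit is still reachable from some Git ref
}

/// `reachable`: the annotated commit is reachable from any Git ref.
/// `producer`: value of extensions.jj-ai.producer, if present in the note.
/// `force`: whether --force was passed together with --prune.
fn classify(reachable: bool, producer: Option<&str>, force: bool) -> PruneAction {
    if reachable {
        PruneAction::Preserve
    } else if producer.is_some() || force {
        PruneAction::Prune
    } else {
        PruneAction::Skip
    }
}
```

Note that the sync scope never appears as an input, which is the point of orphan-based pruning.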

5.4 Conflict Handling

When syncing to Git and a note already exists with different content:

Default: Refuse to overwrite, show diff.

$ jj-ai sync --to-git
error: note conflict for commit abc123...
  existing note differs from jj-ai attribution
  hint: use --merge to merge with the existing note, --force to overwrite, or --from-git to import existing

--merge: Merge the local jj-ai record with the existing Git note. No common base is consulted, so this is a two-way merge:

Parse both notes (local jj-ai and existing git note)
Merge prompts maps: Union by session hash. Equal hashes identify the same tool and conversation, so entries are not expected to conflict
Merge attestation ranges:
- For non-overlapping ranges: union
- For overlapping ranges with same session: keep as-is
- For overlapping ranges with different sessions: later recorded_at wins, log warning
Merge extensions: Local extensions.jj-ai overwrites remote
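The range rules above can be illustrated at line granularity. In this sketch each side is a map from line number to session hash for a single file; the function name `merge_lines` is hypothetical, and the boolean `local_is_newer` stands in for comparing the two recorded_at timestamps.

```rust
use std::collections::BTreeMap;

/// Merge per-line session ownership for one file.
/// Keys are 1-based line numbers, values are session hashes.
/// Returns the merged map and the number of lines where the two sides
/// named different sessions.
fn merge_lines(
    local: &BTreeMap<u32, String>,
    remote: &BTreeMap<u32, String>,
    local_is_newer: bool,
) -> (BTreeMap<u32, String>, usize) {
    let mut merged = remote.clone();
    let mut conflicts = 0;
    for (line, session) in local {
        let existing = merged.get(line).cloned();
        match existing {
            // Same session on both sides: keep as-is.
            Some(ref e) if e == session => {}
            // Different sessions: the side with the later recorded_at wins.
            Some(_) => {
                conflicts += 1;
                if local_is_newer {
                    merged.insert(*line, session.clone());
                }
            }
            // Non-overlapping: union.
            None => {
                merged.insert(*line, session.clone());
            }
        }
    }
    (merged, conflicts)
}
```

The merged map would then be collapsed back into normalized ranges per session before rendering the attestation section.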

$ jj-ai sync --to-git --merge
Merged note for commit abc123...
  added 2 sessions from remote
  resolved 1 range conflict (local wins: newer timestamp)

--force: Overwrite existing note entirely (local wins, remote discarded).

5.5 Notes Collaboration Workflow

For teams sharing AI attributions via refs/notes/ai:

# Fetch remote notes
git fetch origin refs/notes/ai:refs/notes/ai-remote

# Import into jj-ai (creates local attribution files)
jj-ai sync --from-git --notes-ref=refs/notes/ai-remote

# ... do local work, attach attributions ...

# Export with merge
jj git export
jj-ai sync --to-git --merge

# Push notes
git push origin refs/notes/ai

Merge conflicts during git fetch: If refs/notes/ai has diverged, git will refuse to fast-forward. Use:

git fetch origin refs/notes/ai:refs/notes/ai-remote
git notes --ref=ai merge refs/notes/ai-remote
# Or let jj-ai handle it:
jj-ai sync --from-git --notes-ref=refs/notes/ai-remote --merge

6. Complex Rewrite Handling

6.1 Squash

When changes A and B squash into C:

C retains A's change-id (jj behavior)
Attribution from both A and B merge into C's record
Line ranges require remapping based on diff hunks

fn handle_squash(
    target_change: &ChangeId,   // C (keeps A's change-id)
    target_commit: &CommitId,   // New commit after squash
    absorbed_change: &ChangeId, // B (absorbed into C)
    absorbed_commit: &CommitId, // B's commit before squash
    repo: &Repository,
    store: &mut AttributionStore,
) -> Result<()> {
    let absorbed_attr = store.get(absorbed_change, absorbed_commit)?;
    let target_attr = store.get_or_create(target_change, target_commit)?;
    
    // Compute how B's content maps into C
    let diff = repo.diff(absorbed_commit, target_commit)?;
    
    // Merge file attributions with line remapping
    for file_attr in absorbed_attr.attestation.files {
        let remapped = remap_ranges_via_diff(&file_attr.ranges, &diff, &file_attr.path)?;
        target_attr.merge_file(FileAttribution {
            path: file_attr.path,
            ranges: remapped,
        })?;
    }
    
    // Union prompt metadata
    target_attr.prompts.extend(absorbed_attr.prompts);
    
    // Remove absorbed record
    store.remove(absorbed_change, absorbed_commit)?;
    
    Ok(())
}

Range conflict resolution (when same lines attributed to different sessions):

Later recorded_at timestamp wins
Warning logged for conflicts
Original session preserved in prompts for audit trail

6.2 Split

When change A splits into B and C:

B keeps A's change-id and C gets a new one (jj behavior); both get new commit IDs
Attribution distributes based on diff/hunk mapping, not file existence
Original A's record is removed after distribution

Distribution algorithm:

Compute the diff from A's parent to each target (B, C)
For each attributed range in A, determine which target's diff contains those lines
Remap line numbers based on hunk offsets in each target
If a range spans hunks going to different targets, split the range

fn handle_split(
    source: &ChangeId,
    source_commit: &CommitId,
    targets: &[(ChangeId, CommitId)],
    repo: &Repository,
    store: &mut AttributionStore,
) -> Result<()> {
    let source_attr = store.get(source, source_commit)?;
    let parent = repo.get_parent(source_commit)?;
    
    for (target_change_id, target_commit_id) in targets {
        let mut attr = AttributionData::new(target_change_id.clone(), target_commit_id.clone());
        
        // Compute hunks that landed in this target
        let diff = repo.diff(&parent, target_commit_id)?;
        
        for file_attr in &source_attr.attestation.files {
            if let Some(file_diff) = diff.get(&file_attr.path) {
                // Intersect attributed ranges with hunks in this target
                let remapped = remap_ranges_to_hunks(&file_attr.ranges, file_diff);
                if !remapped.is_empty() {
                    attr.attestation.files.push(FileAttribution {
                        path: file_attr.path.clone(),
                        ranges: remapped,
                    });
                }
            }
        }
        
        // Include only referenced sessions
        let used_sessions = attr.referenced_sessions();
        attr.prompts = source_attr.prompts
            .iter()
            .filter(|(k, _)| used_sessions.contains(k))
            // Map iteration yields (&K, &V) pairs, so clone each side explicitly
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect();
        
        if !attr.attestation.files.is_empty() {
            store.insert(attr)?;
        }
    }
    
    store.remove(source, source_commit)?;
    Ok(())
}

fn remap_ranges_to_hunks(
    ranges: &[Range],
    diff: &FileDiff,
) -> Vec<Range> {
    // For each range, find overlapping hunks and compute new line numbers
    // based on the hunk's position in the target file.
    // Returns an empty vec if there is no overlap.
    todo!()
}
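One possible shape for the hunk intersection is sketched below. The spec does not define FileDiff, so the sketch substitutes a hypothetical `HunkMap` type that records where a run of source lines landed in the target. Ranges are inclusive and 1-based; the function name `remap` is likewise an assumption.

```rust
/// One contiguous run of lines carried from the source commit into a target:
/// `len` lines starting at `src_start` now start at `dst_start`.
/// (Hypothetical type standing in for the undefined FileDiff.)
struct HunkMap {
    src_start: u32,
    len: u32,
    dst_start: u32,
}

/// Intersect attributed ranges with the hunks that landed in a target and
/// translate them to the target's line numbers. A range that spans several
/// hunks is split; a range with no overlap contributes nothing.
fn remap(ranges: &[(u32, u32)], hunks: &[HunkMap]) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    for &(start, end) in ranges {
        for h in hunks {
            if h.len == 0 {
                continue;
            }
            let h_end = h.src_start + h.len - 1;
            let lo = start.max(h.src_start);
            let hi = end.min(h_end);
            if lo <= hi {
                // Offsets within the hunk carry over to the target unchanged.
                out.push((
                    h.dst_start + (lo - h.src_start),
                    h.dst_start + (hi - h.src_start),
                ));
            }
        }
    }
    out.sort_unstable();
    out
}
```

Running the same source ranges against each target's hunk list yields the per-target distribution: lines that landed in B appear only in B's result, and likewise for C.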

Fallback for complex splits: If hunk mapping cannot be reliably computed (e.g., significant refactoring), mark attribution as stale and emit a warning suggesting manual jj-ai move or re-attach.

6.3 Divergent Changes

When a change diverges (concurrent edits creating multiple visible commits with the same change-id), each variant has its own attribution record.

Storage model: Each divergent commit gets its own file:

.jj/ai/notes/<change-id>/
  ├── <commit-id-variant-1>.json
  └── <commit-id-variant-2>.json

Behavior:

jj-ai attach -r @ writes to the specific commit's file
jj-ai show -r @ shows attribution for that specific commit
jj-ai sync --to-git exports each variant to its respective commit SHA in refs/notes/ai

Resolution when divergence resolves (user picks one variant or merges):

If one variant is abandoned, its attribution file is removed
If variants merge, attributions are merged using these rules:
1. Session union: All prompt records from both variants are preserved
2. Range conflict: For overlapping ranges with different sessions, the variant with later recorded_at wins
3. Stale marking: Merged attribution is marked stale: true if content changed during merge

Sync behavior for divergent changes:

$ jj-ai sync --to-git
warning: change kpqvunto is divergent (2 variants)
  → syncing each variant to its commit SHA
  variant abc123... → refs/notes/ai
  variant def456... → refs/notes/ai

Error case: If jj-ai attach is called on a change-id without specifying which commit (and multiple visible commits exist), it MUST error:

$ jj-ai attach -r kpqvunto ...
error: change kpqvunto is divergent with 2 visible commits
  hint: specify exact commit with -r <commit-id> or resolve divergence first

7. Integration Points

7.1 Coding Agent Integration

Agents (Cursor, Claude Code, Copilot) should call jj-ai after making changes:

# Pre-edit checkpoint (optional, for precise diffing)
jj-ai checkpoint --type=human -r @

# ... agent makes changes ...

# Post-edit: attach attribution
jj-ai attach -r @ \
  --tool=cursor \
  --model=claude-sonnet-4-20250514 \
  --conversation-id="$CONVERSATION_ID" \
  --transcript-stdin < transcript.json

7.2 CI/CD Integration

# GitHub Actions example
- name: Sync AI attributions
  run: |
    jj git export
    jj-ai sync --to-git --all-reachable --strict
    git push origin refs/notes/ai

7.3 Pre-push Hook (Git side)

If using Git for pushing, add a pre-push hook:

#!/bin/bash
# .git/hooks/pre-push
# Abort the push if either step fails.
jj git export || exit 1
jj-ai sync --to-git --strict || exit 1

8. Git-ai v4.0.0 Compatibility

This section describes jj-ai alignment with the git-ai v4.0.0 working proposal, which adds first-class support for agentic primitives: plans, subagents, forked threads, and content-addressable prompt storage.

8.1 Schema Version

When targeting v4 compatibility:

{
  "schema_version": "authorship/4.0.0",
  ...
}

jj-ai MUST support both v3 and v4 formats, auto-detecting via schema_version.
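Auto-detection can key off the major version in the schema_version string. A minimal sketch follows; the enum `Schema` and the function `detect_schema` are hypothetical names, and treating any 3.x or 4.x version as the corresponding format is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum Schema {
    V3,
    V4,
}

/// Map a `schema_version` string such as "authorship/3.0.0" to a format.
/// Returns None for unknown prefixes or unsupported major versions.
fn detect_schema(schema_version: &str) -> Option<Schema> {
    let version = schema_version.strip_prefix("authorship/")?;
    match version.split('.').next()? {
        "3" => Some(Schema::V3),
        "4" => Some(Schema::V4),
        _ => None,
    }
}
```

Returning None for anything else lets the caller refuse to interpret a note it does not understand instead of misreading it.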

8.2 Enhanced Attestation Section

v4 adds message index tracking to pinpoint which message in a conversation caused a change.

IMPORTANT: To maintain v3 parser compatibility, the attestation section wire format MUST NOT change. v4 metadata (message indices, flags) are stored in the JSON metadata section only.

Wire format (unchanged from v3):

src/main.rs
  a1b2c3d4e5f67890 1-10,15-20,25-30
  0987654321fedcba 40-50

v4 metadata in JSON section:

{
  "schema_version": "authorship/4.0.0",
  "attestation_meta": {
    "src/main.rs": {
      "a1b2c3d4e5f67890": {
        "1-10,15-20": { "message_idx": 5 },
        "25-30": { "message_idx": 7 }
      },
      "0987654321fedcba": {
        "40-50": { "message_idx": 3, "overridden": true }
      }
    }
  }
}

jj-ai local storage format (richer than wire format):

{
  "attestation": {
    "files": [
      {
        "path": "src/main.rs",
        "ranges": [
          { "session": "a1b2c3d4e5f67890", "message_idx": 5, "lines": "1-10,15-20" },
          { "session": "a1b2c3d4e5f67890", "message_idx": 7, "lines": "25-30" },
          { "session": "0987654321fedcba", "message_idx": 3, "lines": "40-50", "overridden": true }
        ]
      }
    ]
  }
}

Flags (stored in attestation_meta, not attestation section):

overridden: true - lines originally AI-authored, subsequently modified by human
removed: true - lines deleted (tracked for audit purposes)

8.3 Enhanced Message Types

v4 expands message types beyond user/assistant/tool_use:

Type	Description
`human`	Message from the human (replaces `user`)
`ai`	Response from the AI (replaces `assistant`)
`thinking`	AI reasoning/chain-of-thought (e.g., Claude's extended thinking)
`subagent`	Sub-agent invocation with nested session
`edit`	Normalized file edit operation
`tool`	Generic tool call
`plan`	AI-generated plan (may evolve across messages)

Example message array:

{
  "messages": [
    { "type": "human", "text": "Add authentication to the API", "timestamp": "..." },
    { "type": "thinking", "text": "I need to consider JWT vs session-based...", "timestamp": "..." },
    { "type": "plan", "plan_id": "plan-001", "content": { "steps": [...] }, "timestamp": "..." },
    { "type": "ai", "text": "I'll implement JWT-based auth...", "timestamp": "..." },
    { "type": "subagent", "session_id": "c3d4e5f678901234", "task": "Research JWT libraries", "timestamp": "..." },
    { "type": "edit", "path": "src/auth.rs", "diff": "...", "timestamp": "..." },
    { "type": "tool", "name": "bash", "input": { "command": "cargo test" }, "timestamp": "..." }
  ]
}

8.4 Subagent Support

Subagents are nested AI sessions spawned by a parent session. v4 adds:

{
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "claude-code", "model": "claude-sonnet-4-20250514", "id": "..." },
      "parent_session_id": null,
      "messages": [...]
    },
    "c3d4e5f678901234": {
      "agent_id": { "tool": "claude-code", "model": "claude-sonnet-4-20250514", "id": "..." },
      "parent_session_id": "a1b2c3d4e5f67890",
      "task": "Research JWT libraries for Rust",
      "messages": [...]
    }
  }
}

jj-ai tracks subagent lineage via parent_session_id. Attribution flows to the root session for summary statistics.
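Rolling attribution up to the root session amounts to following parent_session_id until it is null. The sketch below assumes the lineage has been loaded into a map from session hash to optional parent; the function name `root_session` is hypothetical, and the cycle guard is an added safety measure that the spec does not mention.

```rust
use std::collections::{HashMap, HashSet};

/// `parents` maps a session hash to its `parent_session_id` (None for roots).
/// Returns the root session for `session`, or None if the session is unknown,
/// a parent is missing from the map, or the chain contains a cycle.
fn root_session<'a>(
    parents: &'a HashMap<String, Option<String>>,
    session: &'a str,
) -> Option<&'a str> {
    let mut seen = HashSet::new();
    let mut current = session;
    loop {
        if !seen.insert(current) {
            return None; // already visited: malformed lineage
        }
        match parents.get(current)? {
            Some(parent) => current = parent.as_str(),
            None => return Some(current),
        }
    }
}
```

With the example above, the subagent session c3d4e5f678901234 resolves to a1b2c3d4e5f67890, so its lines count toward the parent in summary statistics.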

8.5 Plan Tracking

Plans are first-class objects that may evolve across messages:

{
  "plans": {
    "plan-001": {
      "created_at": "2026-02-03T10:30:00Z",
      "revisions": [
        {
          "revision": 1,
          "message_idx": 2,
          "content": {
            "goal": "Add JWT authentication",
            "steps": [
              { "id": "1", "description": "Add jsonwebtoken dependency", "status": "completed" },
              { "id": "2", "description": "Create auth middleware", "status": "in_progress" },
              { "id": "3", "description": "Add token refresh endpoint", "status": "pending" }
            ]
          }
        },
        {
          "revision": 2,
          "message_idx": 8,
          "content": {
            "goal": "Add JWT authentication",
            "steps": [
              { "id": "1", "description": "Add jsonwebtoken dependency", "status": "completed" },
              { "id": "2", "description": "Create auth middleware", "status": "completed" },
              { "id": "3", "description": "Add token refresh endpoint", "status": "completed" },
              { "id": "4", "description": "Add rate limiting", "status": "pending" }
            ]
          }
        }
      ]
    }
  }
}

8.6 Content-Addressable Prompt Store

For sensitive transcripts that shouldn't live in Git history, v4 defines a protocol for external storage:

{
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "cursor", "model": "claude-sonnet-4-20250514", "id": "..." },
      "messages": { "$ref": "cas://sha256:abc123.../messages" },
      "summary": "User requested auth implementation. AI created JWT middleware."
    }
  }
}

The $ref field indicates content stored externally. Supported schemes:

cas://sha256:<hash>/<path> - Content-addressable store (implementation-defined)
file://<path> - Local file reference (RESTRICTED - see Security)
https://<url> - Remote HTTPS endpoint (RESTRICTED - see Security)

Security restrictions for $ref resolution (see also §11.3):

Scheme	Default	Flag to enable
`cas://`	Allowed (local CAS only)	N/A
`file://`	BLOCKED	`--allow-file-refs` (restricted to repo)
`https://`	BLOCKED	`--allow-network-refs`
`http://`	BLOCKED	Not supported (no flag)
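The table above translates into a scheme check along these lines. This is a sketch only: the struct `RefPolicy` and the function `ref_allowed` are hypothetical names, and the "restricted to repo" path check for file:// is not implemented here.

```rust
/// Flags corresponding to --allow-file-refs and --allow-network-refs.
#[derive(Default)]
struct RefPolicy {
    allow_file_refs: bool,
    allow_network_refs: bool,
}

/// Decide whether a `$ref` target may be resolved, based on its scheme only.
fn ref_allowed(target: &str, policy: &RefPolicy) -> bool {
    match target.split_once("://").map(|(scheme, _)| scheme) {
        Some("cas") => true,
        Some("file") => policy.allow_file_refs,
        Some("https") => policy.allow_network_refs,
        // http:// and anything unrecognized stay blocked, with no override.
        _ => false,
    }
}
```

Defaulting every flag to false means a note fetched from an untrusted remote cannot cause file reads or network requests unless the user opts in.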

jj-ai stores the reference in .jj/ai/notes/ and resolves on read. For jj-ai sync --to-git:

Default: Inline messages if small (<10KB), else use $ref with summary
--inline-transcripts: Always inline (may bloat notes)
--strip-transcripts: Only include summary, drop messages entirely
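The export choice above can be expressed as a small function. Two assumptions are made here that the spec does not state: "10KB" is read as 10 * 1024 bytes, and --strip-transcripts takes precedence if both flags are given. The names `TranscriptExport` and `transcript_export` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum TranscriptExport {
    Inline,         // messages embedded in the note
    RefWithSummary, // $ref pointer plus summary
    SummaryOnly,    // messages dropped entirely
}

/// Assumed threshold: "10KB" interpreted as 10 * 1024 bytes.
const INLINE_LIMIT_BYTES: usize = 10 * 1024;

fn transcript_export(
    size_bytes: usize,
    inline_transcripts: bool,
    strip_transcripts: bool,
) -> TranscriptExport {
    if strip_transcripts {
        // Assumption: stripping wins if both flags are set.
        TranscriptExport::SummaryOnly
    } else if inline_transcripts || size_bytes < INLINE_LIMIT_BYTES {
        TranscriptExport::Inline
    } else {
        TranscriptExport::RefWithSummary
    }
}
```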

8.7 Removed Fields

v4 removes per-session statistics that can be computed on-demand via jj-ai diff or git-ai diff:

~~total_additions~~
~~total_deletions~~
~~accepted_lines~~
~~overridden_lines~~

Computing these values on demand avoids the stale counters that stored statistics turn into once a change is rewritten.

8.8 Forked Threads

When a conversation branches (e.g., user backtracks and tries a different approach), v4 tracks thread lineage:

{
  "prompts": {
    "thread-main-abc": {
      "agent_id": { ... },
      "forked_from": null,
      "fork_point_message_idx": null,
      "messages": [...]
    },
    "thread-alt-xyz": {
      "agent_id": { ... },
      "forked_from": "thread-main-abc",
      "fork_point_message_idx": 5,
      "messages": [...]
    }
  }
}

This enables tools to visualize conversation evolution and understand why code changed direction.

8.9 Backwards Compatibility

Wire format compatibility is paramount. The attestation section grammar MUST remain v3-compatible.

Reading	Writing
jj-ai MUST read v3 and v4	jj-ai SHOULD write v3-compatible wire format by default
v3 `user` → v4 `human`	v4 extras stored in `attestation_meta` JSON only
v3 `assistant` → v4 `ai`	Stats fields omitted in v4
v3 `tool_use` → v4 `tool`	Message index defaults to `null` if unknown
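The reading-side type mapping in the table is a direct lookup. A minimal sketch, with the hypothetical function name `v3_to_v4_message_type`; passing unknown types through unchanged is an assumption.

```rust
/// Translate a v3 message type to its v4 name, per the table above.
/// Types that are already v4 names, or unknown, pass through unchanged.
fn v3_to_v4_message_type(t: &str) -> &str {
    match t {
        "user" => "human",
        "assistant" => "ai",
        "tool_use" => "tool",
        other => other,
    }
}
```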

Version negotiation:

--schema-version=3: Pure v3 output (no attestation_meta)
--schema-version=4: v3 attestation section + v4 JSON metadata (default when v4 data present)
Auto-detect on read via schema_version field

8.10 v4 CLI Additions

New flags for v4 features:

jj-ai attach [OPTIONS]
      --plan <JSON>           Attach plan JSON
      --parent-session <ID>   Mark as subagent of parent session
      --message-idx <N>       Associate with specific message index
      --cas-store <URL>       Store transcript in content-addressable store

jj-ai show [OPTIONS]
      --expand-refs           Resolve $ref pointers to inline content
      --plans                 Show plan evolution

jj-ai diff [OPTIONS]
      -r, --revset <REVSET>   Calculate AI contribution stats on-demand
      --by-session            Break down by session

9. Upstream jj Tracking

This section tracks relevant upstream jj issues and design work that may impact jj-ai.

9.1 Active Upstream Issues

Issue	Title	Status	Impact on jj-ai
#8166	design: Add support for per-revision metadata	🟢 Open	High - Could replace `.jj/ai/notes/`
#6664	FR: key/value store per change and commit	🟢 Open	High - Broader metadata use cases
#4766	FR: Make `jj annotate` extendable with blame layers	🟢 Open	Medium - Could simplify `jj-ai blame`
#8395	FR: Add support for jj notes	🔴 Closed	Duplicate of #8166
#4706	FR: Transfer change ids via Git remote	🔴 Closed	Solved via `git.write-change-id-header`

9.2 Per-Revision Metadata Design (Issue #8166)

A design document is in progress with these key points:

Semantics:

Metadata attaches to either change ID (survives rewrites) or commit hash (pinned)
Change-id attachment is exactly what jj-ai needs
Squash merges metadata; split follows change-id

Proposed CLI:

jj metaedit --set KEY=VALUE -r REVISION
jj metaedit --unset KEY -r REVISION

Scope explicitly excludes:

❌ Syndication to remotes (no git notes export)
❌ Topics formalization (separate effort)

Maintainer comments (martinvonz):

"We can probably still map the metadata to Git notes. We should be able to have a notes ref like refs/notes/jj/metadata..."

"Attaching notes to change ids may not work using Git notes because Git notes can only be attached to existing objects (and we don't model changes as Git objects)."

This confirms jj-ai's approach: store locally by change-id, materialize to git notes on sync.

9.3 Community Interest in AI Attribution

From @roninjin10 in #8166:

"I believe attaching prompts as metadata to changes is extremely powerful for JJ users doing agentic coding. Imagine being able to 'prompt rebase' when a model comes out rerunning all your prompts with a stronger model."

This suggests that at least some jj users are already thinking about AI attribution use cases.

9.4 Historical Context

Issue #7 (2021) explains why jj moved away from git notes internally:

libgit2 doesn't do sharding → slow with many commits
Replaced with custom format for performance
Exchange via git notes was deprioritized

This is consistent with jj-ai's design choice to use a separate storage layer rather than git notes as primary storage.

9.5 Potential Migration Path

If upstream implements per-revision metadata (#8166):

Current jj-ai storage:

.jj/ai/notes/<change-id>.json

Potential future:

jj metaedit --set ai.attribution='<json>' -r @

jj-ai could migrate to use native metadata API while maintaining the sync layer for git-ai compatibility. The jj-ai sync --to-git command would still be necessary since upstream explicitly scoped out remote syndication.

9.6 Recommended Upstream Proposals

Features that would benefit jj-ai (not yet proposed upstream):

Opt-in notes import/export

jj git export --include-notes=refs/notes/ai
jj git import --include-notes=refs/notes/ai

Template external data access

- Allow templates to read from external sources
- Would enable native jj file annotate -T 'ai_attribution()'

Operation triggers

- Post-operation callbacks for automation

10. Future Considerations

10.1 Revset Extensions

If jj adds symbol resolver extension support for external tools:

# .jj/repo/config.toml
[revset-aliases]
'ai()' = 'jj-ai:attributed()'
'ai(tool)' = 'jj-ai:attributed(tool)'
'human_only()' = 'all() ~ ai()'

Enabling queries like:

jj log -r 'ai("cursor")'
jj log -r 'trunk()..@ & human_only()'

11. Security Considerations

11.1 Sensitive Data

Transcripts: May contain sensitive prompts/responses. Store hashes only by default; full transcripts opt-in.
API keys: Never store in attribution records.
User identities: Respect Git's user.email settings.

11.2 Tampering

Git notes are mutable and not cryptographically signed by default. For high-assurance environments:

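Sign notes: git notes has no built-in signing option (git notes add does not accept --gpg-sign), so signatures must be produced out of band, for example as a detached GPG signature over the note content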
Verify on import: jj-ai sync --from-git --verify-signatures

11.3 External Reference Resolution (`$ref`)

Threat model: A malicious collaborator could push notes containing $ref URIs that:

Exfiltrate local files: file:///etc/passwd
Probe internal networks (SSRF): https://internal-metadata-service/...
Leak information via DNS/request timing

Default restrictions:

Operation	Behavior
`$ref` with `file://`	BLOCKED by default
`$ref` with `https://`	BLOCKED by default
`$ref` with `http://`	Always blocked (not supported)
`$ref` with `cas://`	Allowed (resolves to local `.jj/ai/cas/` only)

Enabling external refs (explicit opt-in required):

# Allow file:// refs, restricted to repo directory
jj-ai show --expand-refs --allow-file-refs

# Allow network refs (use with caution)
jj-ai show --expand-refs --allow-network-refs

file:// restrictions when enabled:

Path must be under repository root or .jj/ai/
Symlinks are not followed outside allowed directories
Absolute paths outside repo are rejected

Config for persistent settings:

# .jj/ai/config.toml
[security]
allow_file_refs = false      # default
allow_network_refs = false   # default
allowed_cas_hosts = []       # for remote CAS, e.g. ["cas.example.com"]
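
The default restrictions and the file:// containment rule might be checked along these lines. This is a sketch under stated assumptions: ref_allowed is a hypothetical function, paths are POSIX-style, unknown schemes are treated as blocked, and remote CAS hosts (allowed_cas_hosts) are not modeled.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def ref_allowed(ref, repo_root, allow_file_refs=False, allow_network_refs=False):
    """Decide whether a $ref URI may be resolved under the restrictions above."""
    parsed = urlparse(ref)
    scheme = parsed.scheme
    if scheme == "cas":
        return True  # resolves to the local .jj/ai/cas/ store only
    if scheme == "http":
        return False  # always blocked, even with network refs enabled
    if scheme == "https":
        return allow_network_refs
    if scheme == "file":
        if not allow_file_refs:
            return False
        # resolve() follows symlinks, so a link pointing outside the
        # repository is rejected by the containment check below.
        root = Path(repo_root).resolve()
        target = Path(unquote(parsed.path)).resolve()
        return target == root or root in target.parents
    return False  # assumption: anything else is blocked


repo = "/srv/repo"  # hypothetical repository root
print(ref_allowed("file:///etc/passwd", repo))
print(ref_allowed("file:///etc/passwd", repo, allow_file_refs=True))
print(ref_allowed("https://internal-metadata-service/", repo))
print(ref_allowed("cas://abc123", repo))
```

Because .jj/ai/ normally sits under the repository root, a single containment check covers both allowed locations in this sketch.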

11.4 Transcript and Context Storage

thinking messages (chain-of-thought):

High privacy risk: may contain sensitive reasoning, rejected approaches
Default: stripped from export unless --include-thinking
Warning emitted if thinking messages present and not explicitly handled

content_anchors:

Previous spec stored raw context_before/context_after strings
Risk: Leaks proprietary code into git notes
New default: Store only context_hash (SHA-256 of context)
--store-context-plaintext: Opt-in to store raw context (not recommended)

Recommended defaults for jj-ai sync --to-git:

--strip-transcripts      # Only summary, no full messages
--strip-thinking         # Remove chain-of-thought (default)
--hash-context           # Store context hashes, not plaintext (default)
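
A rough sketch of what the two default transformations could look like before export. The helper names are hypothetical, and two details are assumptions not fixed by this section: the hash input is context_before, a NUL byte, then context_after, encoded as UTF-8; and a message's kind is stored under a "type" key.

```python
import hashlib


def hash_context(anchor):
    """Replace raw context strings in a content anchor with one SHA-256 digest."""
    before = anchor.get("context_before", "")
    after = anchor.get("context_after", "")
    # Assumption: before + NUL + after, UTF-8 encoded, is the hash input.
    digest = hashlib.sha256((before + "\0" + after).encode("utf-8")).hexdigest()
    cleaned = {
        key: value
        for key, value in anchor.items()
        if key not in ("context_before", "context_after")
    }
    cleaned["context_hash"] = digest
    return cleaned


def strip_thinking(messages):
    """Drop chain-of-thought messages; everything else is kept in order."""
    # Assumption: thinking messages are marked with type == "thinking".
    return [m for m in messages if m.get("type") != "thinking"]


anchor = {"line": 14, "context_before": "impl AuthService {", "context_after": "}"}
messages = [{"type": "human"}, {"type": "thinking"}, {"type": "ai"}]

print(sorted(hash_context(anchor)))
print(strip_thinking(messages))
```

The NUL separator keeps ("ab", "c") and ("a", "bc") from hashing to the same value; any unambiguous framing would serve the same purpose.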

12. Appendix

A. Example Output

A.1 Native `jj file annotate` (baseline)

$ jj file annotate src/auth.rs

kpqvunto [email protected] 2026-02-03 10:30  1: use std::collections::HashMap;
kpqvunto [email protected] 2026-02-03 10:30  2: use jsonwebtoken::{encode, Header};
kpqvunto [email protected] 2026-02-03 10:30  3: 
kpqvunto [email protected] 2026-02-03 10:30  4: pub struct AuthService {
kpqvunto [email protected] 2026-02-03 10:30  5:     secret: String,
kpqvunto [email protected] 2026-02-03 10:30  6:     tokens: HashMap<String, Token>,
kpqvunto [email protected] 2026-02-03 10:30  7: }
kpqvunto [email protected] 2026-02-03 10:30  8: 
xyzw1234 [email protected] 2026-02-03 11:15  9: impl AuthService {
xyzw1234 [email protected] 2026-02-03 11:15 10:     pub fn new(secret: &str) -> Self {
xyzw1234 [email protected] 2026-02-03 11:15 11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234 [email protected] 2026-02-03 11:15 12:     }
mlkn9876 [email protected] 2026-02-03 14:22 13: 
mlkn9876 [email protected] 2026-02-03 14:22 14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 [email protected] 2026-02-03 14:22 15:         self.tokens.contains_key(token)
mlkn9876 [email protected] 2026-02-03 14:22 16:     }
xyzw1234 [email protected] 2026-02-03 11:15 17: }

A.2 `jj-ai blame` (enriched)

$ jj-ai blame src/auth.rs

kpqvunto 2026-02-03 [AI cursor/sonnet-4]  1: use std::collections::HashMap;
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  2: use jsonwebtoken::{encode, Header};
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  3: 
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  4: pub struct AuthService {
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  5:     secret: String,
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  6:     tokens: HashMap<String, Token>,
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  7: }
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  8: 
xyzw1234 2026-02-03 [Human]               9: impl AuthService {
xyzw1234 2026-02-03 [Human]              10:     pub fn new(secret: &str) -> Self {
xyzw1234 2026-02-03 [Human]              11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234 2026-02-03 [Human]              12:     }
mlkn9876 2026-02-03 [AI copilot/gpt-4]   13: 
mlkn9876 2026-02-03 [AI copilot/gpt-4]   14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 2026-02-03 [AI copilot/gpt-4]   15:         self.tokens.contains_key(token)
mlkn9876 2026-02-03 [AI copilot/gpt-4]   16:     }
xyzw1234 2026-02-03 [Human]              17: }

A.3 Native `jj file annotate` with generated template

$ jj-ai template-gen -r 'ancestors(@, 10)' > /tmp/ai.toml
$ jj file annotate src/auth.rs --config-file /tmp/ai.toml -T 'ai_annotate'

kpqvunto [AI:cursor]   1: use std::collections::HashMap;
kpqvunto [AI:cursor]   2: use jsonwebtoken::{encode, Header};
kpqvunto [AI:cursor]   3: 
kpqvunto [AI:cursor]   4: pub struct AuthService {
kpqvunto [AI:cursor]   5:     secret: String,
kpqvunto [AI:cursor]   6:     tokens: HashMap<String, Token>,
kpqvunto [AI:cursor]   7: }
kpqvunto [AI:cursor]   8: 
xyzw1234              9: impl AuthService {
xyzw1234             10:     pub fn new(secret: &str) -> Self {
xyzw1234             11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234             12:     }
mlkn9876 [AI:copilot] 13: 
mlkn9876 [AI:copilot] 14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 [AI:copilot] 15:         self.tokens.contains_key(token)
mlkn9876 [AI:copilot] 16:     }
xyzw1234             17: }

Native template is less detailed (no model) but works with pure jj.

A.4 Porcelain output (for tooling)

$ jj-ai blame src/auth.rs --porcelain

{"line":1,"change_id":"kpqvunto","ai":{"tool":"cursor","model":"claude-sonnet-4-20250514","session":"a1b2c3d4e5f67890"}}
{"line":2,"change_id":"kpqvunto","ai":{"tool":"cursor","model":"claude-sonnet-4-20250514","session":"a1b2c3d4e5f67890"}}
...
{"line":9,"change_id":"xyzw1234","ai":null}
{"line":10,"change_id":"xyzw1234","ai":null}
...
{"line":14,"change_id":"mlkn9876","ai":{"tool":"copilot","model":"gpt-4","session":"0987654321fedcba"}}
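
Since the porcelain format is one JSON object per line, downstream tooling can consume it with a few lines of code. The sketch below is illustrative: summarize is a hypothetical helper, and it only relies on the line, change_id, and ai fields shown in the sample above.

```python
import json
from collections import Counter


def summarize(porcelain_text):
    """Count blamed lines per tool; lines whose "ai" field is null count as "human"."""
    counts = Counter()
    for raw in porcelain_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between records
        record = json.loads(raw)
        ai = record["ai"]
        counts[ai["tool"] if ai else "human"] += 1
    return counts


sample = """
{"line":1,"change_id":"kpqvunto","ai":{"tool":"cursor","model":"claude-sonnet-4-20250514","session":"a1b2c3d4e5f67890"}}
{"line":9,"change_id":"xyzw1234","ai":null}
{"line":14,"change_id":"mlkn9876","ai":{"tool":"copilot","model":"gpt-4","session":"0987654321fedcba"}}
"""

print(dict(summarize(sample)))
```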

A.5 Generated template file

# Generated by jj-ai template-gen
# Scope: ancestors(@, 10)
# Generated: 2026-02-03T15:00:00Z

[template-aliases]
'ai_annotate' = '''
separate(" ",
  commit.change_id().shortest(8),
  if(commit.change_id().short(8) == "kpqvunto",
    label("ai", "[AI:cursor]"),
    if(commit.change_id().short(8) == "mlkn9876",
      label("ai", "[AI:copilot]"),
      pad_end(12, "")
    )
  ),
  pad_end(4, line_number),
  content
)
'''

B. Change ID Format

Storage: 16 bytes (128 bits)
Display: 32 characters, reverse-hex using z-k alphabet
Short form: First 12 characters (default format_short_id)

Alphabet mapping:

0123456789abcdef  (hex)
zyxwvutsrqponmlk  (reverse-hex display)
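
The mapping above is a plain character substitution on the hex encoding, so converting in either direction takes only a translation table. A minimal sketch, with hypothetical function names:

```python
HEX = "0123456789abcdef"
REVERSE_HEX = "zyxwvutsrqponmlk"

_TO_REVERSE = str.maketrans(HEX, REVERSE_HEX)
_TO_HEX = str.maketrans(REVERSE_HEX, HEX)


def to_reverse_hex(change_id: bytes) -> str:
    """Render raw change ID bytes in the z-k display alphabet."""
    return change_id.hex().translate(_TO_REVERSE)


def from_reverse_hex(text: str) -> bytes:
    """Parse a full-length reverse-hex string back into raw bytes."""
    invalid = set(text) - set(REVERSE_HEX)
    if invalid:
        raise ValueError(f"not reverse-hex characters: {sorted(invalid)}")
    return bytes.fromhex(text.translate(_TO_HEX))


print(to_reverse_hex(bytes.fromhex("0123456789abcdef")))  # zyxwvutsrqponmlk
print(len(to_reverse_hex(bytes(16))))                     # 32
```

The explicit alphabet check matters because the two alphabets do not overlap: without it, ordinary hex input such as "ab" would be accepted silently.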

C. Git-ai v3.0.0 Required Fields

{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "<sha>",
  "prompts": {
    "<session_hash>": {
      "agent_id": { "tool": "<tool>", "model": "<model>", "id": "<conversation_id>" },
      "human_author": "<name> <email>",
      "messages": [],
      "total_additions": 0,
      "total_deletions": 0,
      "accepted_lines": 0,
      "overridden_lines": 0
    }
  }
}

D. Git-ai v4.0.0 Required Fields

{
  "schema_version": "authorship/4.0.0",
  "base_commit_sha": "<sha>",
  "prompts": {
    "<session_hash>": {
      "agent_id": { "tool": "<tool>", "model": "<model>", "id": "<conversation_id>" },
      "human_author": "<name> <email>",
      "parent_session_id": null,
      "messages": []
    }
  },
  "plans": {}
}

Note: v4 removes total_additions, total_deletions, accepted_lines, overridden_lines (computed on-demand).

E. References

F. Comparison with git-ai 3.0.0

This section compares jj-ai's approach to tooling/agent/editor integration against the git-ai 3.0.0 specification.

F.1 Core Architecture Difference

Aspect	git-ai 3.0.0	jj-ai 0.1.0
Key	Commit SHA	Change ID (stable)
Rewrite handling	Complex operations require remap	Change-id survives rewrites natively
Storage	`refs/notes/ai` directly	`.jj/ai/notes/` → materialized on export

F.2 Agent Integration Points

git-ai requires agents to:

Write to working state / staged attributions
Attach attribution before commit time
Re-track attributions through history rewrites

The spec recommends: "integrating with published implementation, not implementing this spec" and links to external integration docs.

jj-ai simplifies to:

# Pre-edit checkpoint (optional, for precise diffing)
jj-ai checkpoint --type=human -r @

# ... agent makes changes ...

# Post-edit: attach attribution
jj-ai attach -r @ \
  --tool=cursor \
  --model=claude-sonnet-4-20250514 \
  --conversation-id="$CONVERSATION_ID" \
  --transcript-stdin < transcript.json

F.3 Key Differences for Tool Integrators

Concern	git-ai 3.0.0	jj-ai 0.1.0
When to attach	Before commit (working state)	After change exists (any revision)
Rewrite survival	Tool must re-track through rewrites via "working state"	Auto-survives; sync on export
Working state	Central concept for `merge --squash`, `reset`, `stash`	Not needed - change-id stable
Transcript ingestion	Via `messages[]` array	`--transcript-stdin` flag
Session hash	`SHA-256("{tool}:{conversation_id}")[0:16]`	Same (compatible)
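
The session hash formula in the last row can be sketched directly; session_hash is a hypothetical name, and the UTF-8 encoding and lowercase hex output follow the clarification recorded in the changelog.

```python
import hashlib


def session_hash(tool: str, conversation_id: str) -> str:
    """First 16 lowercase hex characters of SHA-256("{tool}:{conversation_id}")."""
    payload = f"{tool}:{conversation_id}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


print(session_hash("cursor", "abc-123"))
```

Because the input is only the tool name and conversation ID, the same conversation yields the same key whether the note is written by git-ai or by jj-ai.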

F.4 git-ai's "Working State" Burden

git-ai spec §2.1-2.6 defines extensive rules for preserving attributions through:

Rebase (§2.1) - must track through reordering, squash, split, drop, edit
Merge (§2.2) - --squash requires moving to working state
Reset (§2.3) - soft/mixed must preserve in working state
Cherry-pick (§2.4) - --no-commit requires working state
Stash (§2.5) - stores under refs/notes/ai-stash
Amend (§2.6) - must combine original + working state

jj-ai sidesteps all of this by keying on change-id. Complex rewrite handling reduces to:

Squash → merge attributions (§6.1)
Split → distribute by content (§6.2)
Divergent → union merge with warnings (§6.3)

F.5 Integration Complexity

For a coding agent (Cursor, Claude Code, Copilot):

git-ai integration requires:

Track uncommitted state
Handle reattachment through history rewrites
Potentially integrate deeply with IDE's git operations
Implement working state persistence

jj-ai integration requires:

Call jj-ai attach after making changes
Done - jj-ai handles rewrite survival internally

F.6 CI/CD Comparison

Both have similar CI patterns, but jj-ai adds an explicit sync step:

# git-ai
- run: git push origin refs/notes/ai

# jj-ai
- run: |
    jj git export
    jj-ai sync --to-git --all-reachable --strict
    git push origin refs/notes/ai

F.7 Summary

	git-ai 3.0.0	jj-ai 0.1.0
Integration simplicity	Complex (working state management)	Simple (just `attach`)
Rewrite robustness	Requires careful implementation	Inherent via change-id
Sync step	None (direct to notes)	Required before push
Storage overhead	Single location	Dual (local + materialized)
Subcommand discovery	N/A (git extension)	Manual (`jj-ai`, not `jj ai`)

Verdict: jj-ai shifts complexity from tool integrators to a single sync step, making it significantly easier for coding agents to adopt while maintaining git-ai wire-format compatibility on export.

Changelog

0.1.1-draft (2026-02-03): Oracle review fixes
- Fixed session hash examples to use valid hex characters
- Fixed v4 attestation format to preserve v3 wire compatibility
- Fixed prune semantics: orphan-based, not scope-based
- Fixed split algorithm: hunk-based distribution, not file existence
- Fixed divergence model: per-commit-id variants
- Added merge semantics for notes conflicts (§5.4, §5.5)
- Added locking/atomicity requirements (§3.4)
- Namespaced extension fields under extensions.jj-ai
- Added security restrictions for $ref resolution (§11.3)
- Added checkpoint, template-gen, diff command specs
- Added canonical formatting/sorting rules for attestation
- Clarified session hash generation (lowercase hex, UTF-8)
0.1.0-draft (2026-02-03): Initial specification

dmmulroy/jj-ai-spec.md