@dmmulroy
Last active February 4, 2026 03:45
jj-ai: AI Attribution Tracking for Jujutsu VCS

Version: 0.1.1-draft
Status: Design Specification
Date: 2026-02-03


Executive Summary

jj-ai bridges Jujutsu's rewrite-heavy workflow with git-ai's commit-SHA-keyed attribution system. By storing attributions keyed by jj's stable change IDs and materializing to refs/notes/ai at export time, we achieve:

  • Attribution that survives rebases, amends, squashes, and splits
  • Full git-ai v3.0.0 and v4.0.0 compatibility for Git ecosystem interop
  • First-class support for agentic primitives: plans, subagents, forked threads
  • No dependency on hooks or jj internals

1. Problem Statement

1.1 The Fundamental Conflict

git-ai stores AI authorship attribution in Git notes keyed by commit SHA:

refs/notes/ai/<commit-sha> → attestation + metadata

Jujutsu rewrites commits constantly. A single logical change may cycle through dozens of SHAs during normal workflow:

jj new → jj describe → jj rebase → jj squash  # 4 different SHAs, same change

Every rewrite invalidates the SHA key, orphaning the attribution.

1.2 jj's Advantage: Stable Change IDs

jj assigns each change a change ID (16 random bytes, displayed in "reverse hex", where the letters z through k stand in for the hex digits 0 through f) that persists across all rewrites. This is exactly the stable identifier that attribution needs.
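
As a rough illustration of the display encoding, the sketch below maps raw change-ID bytes to the reverse-hex alphabet (hex digit 0 becomes z, f becomes k) and back. The function names are hypothetical and are not part of jj or jj-ai.

```rust
/// Encode raw change-ID bytes in the "reverse hex" alphabet:
/// hex digit 0 maps to 'z', hex digit f maps to 'k'.
fn to_reverse_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for &b in bytes {
        for nibble in [b >> 4, b & 0x0f] {
            out.push((b'z' - nibble) as char);
        }
    }
    out
}

/// Decode a reverse-hex string back to bytes.
/// Returns None for characters outside k..=z or an odd number of digits.
fn from_reverse_hex(s: &str) -> Option<Vec<u8>> {
    let digits: Vec<u8> = s
        .bytes()
        .map(|c| if (b'k'..=b'z').contains(&c) { Some(b'z' - c) } else { None })
        .collect::<Option<_>>()?;
    if digits.len() % 2 != 0 {
        return None;
    }
    Some(digits.chunks(2).map(|pair| (pair[0] << 4) | pair[1]).collect())
}
```

A full 16-byte ID encodes to 32 characters; shorter forms such as those shown elsewhere in this document are prefixes.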

1.3 Integration Constraints

| Constraint | Implication |
|---|---|
| jj ignores refs/notes/* during import/export | Must manually bridge notes to/from Git |
| jj has no hook system | Cannot intercept rewrites automatically |
| jj has no external subcommand discovery | jj-ai won't auto-invoke as jj ai |
| git-ai keys by commit SHA | Must resolve change-id → SHA at sync time |

2. Architecture

2.1 Data Flow

┌─────────────────────────────────────────────────────────────────┐
│                    Coding Agent (Cursor, etc.)                   │
│  jj-ai attach -r @ --tool=cursor --conversation-id=abc-123      │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                 jj-ai Local Store (canonical)                    │
│  .jj/ai/notes/<change-id>/<commit-id>.json                      │
│  • Keyed by change-id (stable across rewrites)                  │
│  • Contains attestation + git-ai v3.0.0 metadata                │
│  • Producer/version tracking for safe pruning                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │ jj-ai sync --to-git
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Git Notes (materialized)                      │
│  refs/notes/ai                                                  │
│  • Keyed by commit SHA (resolved at sync time)                  │
│  • git-ai v3.0.0 compatible format                              │
│  • Pushable to GitHub/GitLab                                    │
└─────────────────────────────────────────────────────────────────┘

2.2 Storage Location

.jj/
├── repo/
│   └── store/
│       └── extra/          # jj internals (DO NOT USE)
└── ai/                     # jj-ai storage (separate from jj internals)
    ├── config.toml         # jj-ai configuration
    ├── .lock                # advisory lock file
    ├── notes/              # attribution records (by change-id, then commit-id)
    │   └── <change-id>/
    │       ├── <commit-id-1>.json   # variant 1
    │       ├── <commit-id-2>.json   # variant 2 (for divergent changes)
    │       └── ...
    └── sessions/           # optional: cached session metadata
        └── <session-hash>.json

Storage model: Attribution is keyed by (change-id, commit-id) to support:

  • Normal case: one commit per change → one file
  • Divergent changes: multiple commits share change-id → multiple files
  • Each file represents attribution for a specific commit snapshot

Rationale: Storing under .jj/ai/ (not .jj/repo/store/extra/) avoids coupling to jj's internal storage format while remaining repo-local and gitignored.


3. Storage Format

3.1 Attribution Record

Each .jj/ai/notes/<change-id>/<commit-id>.json file contains:

{
  "schema_version": "jj-ai/0.1.0",
  "change_id": "kpqvuntorskozwnu",
  "recorded_at": "2026-02-03T10:30:00Z",
  "recorded_commit_id": "abc123def456...",
  
  "attestation": {
    "files": [
      {
        "path": "src/main.rs",
        "ranges": [
          { "session": "a1b2c3d4e5f67890", "lines": "1-10,15-20" },
          { "session": "0987654321fedcba", "lines": "25,30-35" }
        ]
      }
    ]
  },
  
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": {
        "tool": "cursor",
        "model": "claude-sonnet-4-20250514",
        "conversation_id": "6ef2299e-..."
      },
      "human_author": "Developer <[email protected]>",
      "messages": [],
      "total_additions": 50,
      "total_deletions": 10,
      "accepted_lines": 45,
      "overridden_lines": 5
    }
  },
  
  "content_anchors": [
    {
      "path": "src/main.rs",
      "snippet_hash": "sha256:...",
      "context_hash": "sha256:..."
    }
  ]
}

3.2 Session Hash Generation

Per git-ai v3.0.0 spec:

session_hash = lowercase_hex(SHA-256(utf8("{tool}:{conversation_id}")))[0:16]

Requirements:

  • tool: Exact string as provided (case-sensitive, no normalization)
  • conversation_id: Exact string as provided (UTF-8 encoded)
  • Compute SHA-256 over the UTF-8 byte string
  • Take first 16 characters of the lowercase hexadecimal digest
  • Result MUST contain only characters [0-9a-f]

Example (the digest shown is illustrative, not the actual SHA-256 of this input):

input: "cursor:6ef2299e-abc-123"
SHA-256(utf8(input)) = "a1b2c3d4e5f67890..."
session_hash = "a1b2c3d4e5f67890"
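
The derivation above can be sketched as follows. To stay dependency-free, the sketch carries its own compact SHA-256; a real implementation would more likely use a vetted hashing crate. The function names are hypothetical.

```rust
const K: [u32; 64] = [
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
];

/// Minimal SHA-256 (sketch only; prefer an audited crate in real code).
fn sha256(data: &[u8]) -> [u8; 32] {
    let mut h: [u32; 8] = [
        0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
        0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
    ];
    // Padding: 0x80, zeros up to 56 mod 64, then the bit length as big-endian u64.
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&((data.len() as u64) * 8).to_be_bytes());

    for block in msg.chunks(64) {
        let mut w = [0u32; 64];
        for i in 0..16 {
            w[i] = u32::from_be_bytes([block[4 * i], block[4 * i + 1], block[4 * i + 2], block[4 * i + 3]]);
        }
        for i in 16..64 {
            let s0 = w[i - 15].rotate_right(7) ^ w[i - 15].rotate_right(18) ^ (w[i - 15] >> 3);
            let s1 = w[i - 2].rotate_right(17) ^ w[i - 2].rotate_right(19) ^ (w[i - 2] >> 10);
            w[i] = w[i - 16].wrapping_add(s0).wrapping_add(w[i - 7]).wrapping_add(s1);
        }
        let mut v = h;
        for i in 0..64 {
            let s1 = v[4].rotate_right(6) ^ v[4].rotate_right(11) ^ v[4].rotate_right(25);
            let ch = (v[4] & v[5]) ^ (!v[4] & v[6]);
            let t1 = v[7].wrapping_add(s1).wrapping_add(ch).wrapping_add(K[i]).wrapping_add(w[i]);
            let s0 = v[0].rotate_right(2) ^ v[0].rotate_right(13) ^ v[0].rotate_right(22);
            let maj = (v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]);
            let t2 = s0.wrapping_add(maj);
            v = [t1.wrapping_add(t2), v[0], v[1], v[2], v[3].wrapping_add(t1), v[4], v[5], v[6]];
        }
        for i in 0..8 {
            h[i] = h[i].wrapping_add(v[i]);
        }
    }
    let mut out = [0u8; 32];
    for (i, word) in h.iter().enumerate() {
        out[4 * i..4 * i + 4].copy_from_slice(&word.to_be_bytes());
    }
    out
}

/// session_hash = first 16 lowercase hex chars of SHA-256("{tool}:{conversation_id}").
/// Inputs are used exactly as given: no trimming, no case folding.
fn session_hash(tool: &str, conversation_id: &str) -> String {
    let digest = sha256(format!("{tool}:{conversation_id}").as_bytes());
    let hex: String = digest.iter().map(|b| format!("{b:02x}")).collect();
    hex[..16].to_string()
}
```

Because the inputs are not normalized, `session_hash("Cursor", id)` and `session_hash("cursor", id)` yield different keys; callers are expected to pass the tool name consistently.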

3.3 Git Notes Export Format

When syncing to refs/notes/ai, render git-ai v3.0.0 compatible format:

src/main.rs
  a1b2c3d4e5f67890 1-10,15-20
  0987654321fedcba 25,30-35
---
{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "7734793b756b3921c88db5375a8c156e9532447b",
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "cursor", "model": "claude-sonnet-4-20250514", "id": "6ef2299e-..." },
      "human_author": "Developer <[email protected]>",
      "messages": [],
      "total_additions": 50,
      "total_deletions": 10,
      "accepted_lines": 45,
      "overridden_lines": 5
    }
  },
  "extensions": {
    "jj-ai": {
      "producer": "jj-ai/0.1.0",
      "jj_change_id": "kpqvuntorskozwnu",
      "stale": false
    }
  }
}

Attestation section formatting:

  • File paths sorted lexicographically
  • Paths with spaces/tabs/newlines MUST be double-quoted
  • Within each file: entries sorted by session hash, then by range start
  • Line ranges normalized: sorted, merged adjacent, no duplicates
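
The range normalization rule can be sketched like this. The helper names are hypothetical; ranges are treated as inclusive 1-based line spans, and a list such as "15-20,1-10,25" is parsed, normalized, and rendered back.

```rust
/// Parse "1-10,15-20,25" into inclusive (start, end) pairs.
/// Returns None on malformed input or a reversed range such as "10-1".
fn parse_ranges(spec: &str) -> Option<Vec<(u32, u32)>> {
    spec.split(',')
        .map(|part| -> Option<(u32, u32)> {
            let part = part.trim();
            let (start, end) = match part.split_once('-') {
                Some((a, b)) => (a.parse::<u32>().ok()?, b.parse::<u32>().ok()?),
                None => {
                    let n = part.parse::<u32>().ok()?;
                    (n, n)
                }
            };
            if start <= end { Some((start, end)) } else { None }
        })
        .collect()
}

/// Sort, merge overlapping or adjacent ranges, and drop duplicates.
fn normalize(mut ranges: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    ranges.sort();
    let mut out: Vec<(u32, u32)> = Vec::new();
    for (start, end) in ranges {
        match out.last_mut() {
            // Touching or overlapping the previous range: extend it.
            Some(last) if start <= last.1.saturating_add(1) => last.1 = last.1.max(end),
            _ => out.push((start, end)),
        }
    }
    out
}

/// Render ranges back to the wire form, e.g. "1-12,15-20,25".
fn render_ranges(ranges: &[(u32, u32)]) -> String {
    ranges
        .iter()
        .map(|&(s, e)| if s == e { s.to_string() } else { format!("{s}-{e}") })
        .collect::<Vec<_>>()
        .join(",")
}
```

Normalizing before rendering keeps the exported note byte-stable, so an unchanged attribution does not show up as a note diff.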

Extension fields (namespaced under extensions.jj-ai):

  • producer: Identifies jj-ai as the source (enables safe pruning)
  • jj_change_id: Back-reference for import
  • stale: Indicates attribution may be outdated

3.4 Storage Atomicity and Locking

Requirements:

  • All writes to .jj/ai/notes/ MUST be atomic (write to temp file → fsync → rename)
  • Implementations SHOULD use file locking to prevent concurrent attach races
  • Lock file: .jj/ai/.lock with advisory locking

Concurrent write behavior:

  • If lock cannot be acquired within 5 seconds, fail with error
  • On crash/timeout, stale locks older than 60 seconds MAY be broken
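
A minimal sketch of the atomic-write requirement (temp file, fsync, rename) using only the standard library. Lock acquisition is omitted, and the function name and temp-file naming scheme are assumptions made for illustration.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `path` so that readers see either the old file or the
/// complete new file, never a partial write.
fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let dir = path.parent().unwrap_or_else(|| Path::new("."));
    fs::create_dir_all(dir)?;

    // Keep the temp file in the same directory so the rename stays on one filesystem.
    let tmp = path.with_extension(format!("tmp.{}", std::process::id()));

    let mut file = File::create(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?; // fsync: flush data and metadata to disk before the rename
    drop(file);

    fs::rename(&tmp, path)?; // replaces the destination in one step
    Ok(())
}
```

The rename is the commit point: a crash before it leaves at most a stray temp file, which a later run could clean up.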

4. CLI Design

4.1 Binary Name

Standalone binary: jj-ai

Since jj doesn't support external subcommand discovery, users invoke directly:

jj-ai <command> [options]

Recommended shell alias:

alias jja='jj-ai'

4.2 Commands

jj-ai attach

Record AI attribution for a change.

jj-ai attach [OPTIONS]

Options:
  -r, --revision <REV>       Target revision [default: @]
  -t, --tool <TOOL>          AI tool name (cursor, copilot, claude-code, etc.)
  -m, --model <MODEL>        Model identifier
  -c, --conversation-id <ID> Conversation/session identifier
  -f, --file <PATH>          Specific file(s) to attribute [repeatable]
  -l, --lines <RANGE>        Line ranges (e.g., "1-10,15-20")
      --transcript-stdin     Read JSON transcript from stdin
      --human-author <SIG>   Override human author signature
      --from-checkpoint      Compute line ranges from the diff since the last checkpoint

Examples:

# Attach attribution to current change
jj-ai attach -r @ --tool=cursor --model=claude-sonnet-4-20250514 --conversation-id=abc-123

# Attribute specific file/lines
jj-ai attach -r @ --tool=cursor -c abc-123 --file src/main.rs --lines 1-50

# With transcript
echo '{"messages": [...]}' | jj-ai attach -r @ --tool=cursor -c abc-123 --transcript-stdin

jj-ai show

Display attribution for a change.

jj-ai show [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --format <FMT>      Output format: pretty, json, git-ai [default: pretty]

jj-ai blame

AI-enhanced blame output. Leverages jj file annotate internally.

jj-ai blame <PATH> [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --porcelain         Machine-readable output

Output:

src/main.rs:
  1-10   [AI: cursor/claude-sonnet-4-20250514] a1b2c3d4  "Implement auth flow"
  11-14  [Human]                         e5f67890
  15-20  [AI: cursor/claude-sonnet-4-20250514] a1b2c3d4  "Implement auth flow"

Implementation: Calls jj file annotate with a JSON-emitting template, then enriches each line's change-id with AI attribution from .jj/ai/notes/.

Native jj Template Integration

For users who prefer native jj file annotate, jj-ai provides a helper that pre-computes AI labels:

# Generate a template alias with AI data baked in
jj-ai template-gen --revset @ > /tmp/ai-annotate.toml

# Use with jj directly
jj file annotate src/main.rs --config-file /tmp/ai-annotate.toml \
  -T 'ai_annotate'

The generated template uses if() with change_id().short() comparisons:

[template-aliases]
'ai_annotate' = '''
if(commit.change_id().short(8) == "kpqvunto",
  label("ai cursor", line_number ++ " [AI:cursor] " ++ content),
  if(commit.change_id().short(8) == "xyzwmnop",
    label("ai copilot", line_number ++ " [AI:copilot] " ++ content),
    line_number ++ " " ++ content
  )
)
'''

Limitation: Generated template is static; re-run template-gen after new attributions.

jj-ai sync

Synchronize attributions between jj-ai store and Git notes.

jj-ai sync [OPTIONS]

Options:
      --to-git              Export to refs/notes/ai
      --from-git            Import from refs/notes/ai
      --notes-ref <REF>     Notes ref to read or write [default: refs/notes/ai]
      --revs <REVSET>       Scope [default: stack(@)]
      --all-reachable       Sync all reachable commits
      --strict              Fail on any stale attribution
      --recompute           Attempt to remap stale line ranges
      --prune               Remove orphaned notes
      --prune-dry-run       Show what would be pruned
      --merge               Merge with an existing note instead of refusing (see 5.4)
      --force               Allow overwriting non-jj-ai notes
      --dry-run             Show what would be synced
      --json                JSON output for scripting

Examples:

# Standard workflow
jj git export && jj-ai sync --to-git

# CI: strict mode, full repo
jj-ai sync --to-git --all-reachable --strict

# Preview sync
jj-ai sync --to-git --dry-run

# Import from collaborator
git fetch origin refs/notes/ai:refs/notes/ai
jj git import
jj-ai sync --from-git

jj-ai stats

Display AI contribution statistics.

jj-ai stats [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope [default: @]
      --by-tool           Group by AI tool
      --by-file           Group by file
      --by-author         Group by human author

jj-ai move

Transfer attribution between changes (for splits/squashes).

jj-ai move --from <REV> --to <REV> [OPTIONS]

Options:
      --file <PATH>       Only move attribution for specific file(s)
      --lines <RANGE>     Only move specific line ranges
      --merge             Merge with existing attribution on target
      --remap             Attempt to remap line numbers via diff

jj-ai checkpoint

Record a checkpoint for precise attribution diffing.

jj-ai checkpoint [OPTIONS]

Options:
  -r, --revision <REV>    Target revision [default: @]
      --type <TYPE>       Checkpoint type: human, ai-start, ai-end

Checkpoints enable precise attribution by marking boundaries:

jj-ai checkpoint --type=human     # Mark current state as human baseline
# ... AI makes changes ...
jj-ai checkpoint --type=ai-end    # Mark AI changes complete
jj-ai attach -r @ --from-checkpoint  # Auto-compute ranges from diff

jj-ai template-gen

Generate jj template aliases for AI-aware annotation.

jj-ai template-gen [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope for template data [default: ancestors(@, 20)]
  -o, --output <PATH>     Output file [default: stdout]
      --format <FMT>      Output format: toml, json [default: toml]

Example:

jj-ai template-gen -r 'trunk()..@' > /tmp/ai.toml
jj file annotate src/main.rs --config-file /tmp/ai.toml -T ai_annotate

jj-ai diff

Calculate AI contribution statistics on-demand (replaces stored stats in v4).

jj-ai diff [OPTIONS]

Options:
  -r, --revset <REVSET>   Scope [default: @]
      --by-session        Break down by session
      --by-file           Break down by file
      --by-tool           Break down by AI tool
      --json              JSON output

Output:

$ jj-ai diff -r 'trunk()..@'
AI Contribution Summary (15 commits):
  Total lines:     1,234
  AI-authored:       890 (72%)
  Human-authored:    344 (28%)
  
By tool:
  cursor:     650 lines (53%)
  copilot:    240 lines (19%)

5. Sync Behavior

5.1 Scope

| Flag | Revset | Use Case |
|---|---|---|
| (default) | stack(@) | Day-to-day development |
| --revs <R> | User-specified | Custom scope |
| --all-reachable | all() | CI, releases |

Definition of stack(@): All ancestors of @ until reaching immutable commits or trunk.

5.2 Stale Attribution Handling

Attribution becomes "stale" when the change's content has been modified since recording.

Detection: Compare recorded_commit_id with current commit ID for the change. If different, check content anchors.

Default behavior: Warn and sync with stale: true marker.

$ jj-ai sync --to-git
warning: attribution for kpqvuntor is stale (content changed since recording)
  → syncing with stale marker
Synced 3 attributions to refs/notes/ai

Strict mode (--strict): Fail if any attribution is stale.

$ jj-ai sync --to-git --strict
error: attribution for kpqvuntor is stale
  recorded against: abc123def456
  current commit:   789abc...
  hint: use --recompute to attempt range remapping, or re-attach attribution

Recompute mode (--recompute): Attempt to remap line ranges using content anchors.

Algorithm:

  1. For each stale file attribution, search for snippet_hash in current file
  2. If unique match found, compute line offset and adjust ranges
  3. If no match or ambiguous, keep stale marker
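
The three steps above might look like the sketch below. For brevity it compares the anchored snippet's lines directly instead of hashing each candidate window against snippet_hash; that substitution, the function names, and the `recorded_start` parameter are assumptions made for illustration.

```rust
/// 1-based line at which `snippet` occurs in `file`, only if it occurs exactly once.
fn find_unique(file: &[&str], snippet: &[&str]) -> Option<usize> {
    if snippet.is_empty() || snippet.len() > file.len() {
        return None;
    }
    let mut hits = file
        .windows(snippet.len())
        .enumerate()
        .filter(|(_, window)| *window == snippet)
        .map(|(i, _)| i + 1);
    let first = hits.next()?;
    // A second hit means the match is ambiguous, so the attribution stays stale.
    if hits.next().is_some() { None } else { Some(first) }
}

/// Shift inclusive line ranges by the distance the snippet moved.
/// `recorded_start` is the 1-based line the snippet started at when recorded.
/// Returns None when the attribution should keep its stale marker.
fn remap(
    ranges: &[(usize, usize)],
    recorded_start: usize,
    file: &[&str],
    snippet: &[&str],
) -> Option<Vec<(usize, usize)>> {
    let offset = find_unique(file, snippet)? as i64 - recorded_start as i64;
    ranges
        .iter()
        .map(|&(start, end)| {
            let (s, e) = (start as i64 + offset, end as i64 + offset);
            if s >= 1 { Some((s as usize, e as usize)) } else { None }
        })
        .collect()
}
```

A constant offset only covers the case where the attributed block moved as a whole; edits inside the block would still need the stale marker.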

5.3 Prune Behavior

Default: Never prune. Sync is non-destructive.

Prune semantics are orphan-based, NOT scope-based. This prevents accidental deletion of valid notes outside the current working set.

--prune: Remove Git notes where the annotated commit is unreachable from any Git ref (branches, tags, HEAD). This mirrors git notes prune semantics.

Additional requirements for pruning:

  1. Commit must be unreachable (not just outside sync scope)
  2. Note must have been produced by jj-ai (extensions.jj-ai.producer present)

--prune --force: Also remove notes without jj-ai producer marker (use with caution).

--prune-dry-run: List what would be removed without deleting.

$ jj-ai sync --to-git --prune-dry-run
Would prune 2 notes (commits unreachable from any ref):
  abc123... (unreachable, producer: jj-ai/0.1.0)
  def456... (unreachable, producer: jj-ai/0.1.0)
Skipping 1 note (not produced by jj-ai):
  789abc... (unreachable, producer: unknown)
Preserving 5 notes (commits still reachable)

WARNING: --prune based on revset scope (e.g., stack(@)) is explicitly NOT supported, as it would delete valid notes for commits on other branches or trunk.
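
The prune rules in this section reduce to a small decision function. The sketch below is one possible shape; the type and function names are hypothetical, and reachability is assumed to have been computed elsewhere.

```rust
#[derive(Debug, PartialEq)]
enum PruneAction {
    /// Commit is unreachable and the note is ours (or --force was given).
    Prune,
    /// Commit is unreachable but the note lacks the jj-ai producer marker.
    SkipForeign,
    /// Commit is still reachable from some Git ref.
    Preserve,
}

struct NoteInfo<'a> {
    /// True if the annotated commit is reachable from any branch, tag, or HEAD.
    reachable: bool,
    /// Value of extensions.jj-ai.producer, if present.
    producer: Option<&'a str>,
}

fn prune_action(note: &NoteInfo, force: bool) -> PruneAction {
    if note.reachable {
        // Reachability always wins; sync scope is never consulted.
        return PruneAction::Preserve;
    }
    let ours = note.producer.map_or(false, |p| p.starts_with("jj-ai/"));
    if ours || force {
        PruneAction::Prune
    } else {
        PruneAction::SkipForeign
    }
}
```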

5.4 Conflict Handling

When syncing to Git and a note already exists with different content:

Default: Refuse to overwrite, show diff.

$ jj-ai sync --to-git
error: note conflict for commit abc123...
  existing note differs from jj-ai attribution
  hint: use --merge to 3-way merge, --force to overwrite, or --from-git to import existing

--merge: Perform 3-way merge of notes content:

  1. Parse both notes (local jj-ai and existing git note)
  2. Merge prompts maps: Union by session hash (no conflict possible - session hashes are unique)
  3. Merge attestation ranges:
    • For non-overlapping ranges: union
    • For overlapping ranges with same session: keep as-is
    • For overlapping ranges with different sessions: later recorded_at wins, log warning
  4. Merge extensions: Local extensions.jj-ai overwrites remote

$ jj-ai sync --to-git --merge
Merged note for commit abc123...
  added 2 sessions from remote
  resolved 1 range conflict (local wins: newer timestamp)

--force: Overwrite existing note entirely (local wins, remote discarded).
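
One way to picture the range-merge rules for --merge is to expand ranges into a per-line map and merge line by line, as in the sketch below. The `Claim` type and function name are hypothetical, and recorded_at values are assumed to be RFC 3339 UTC strings in a uniform format, so that string comparison matches chronological order.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
struct Claim {
    session: String,
    /// Assumed uniform RFC 3339 UTC, e.g. "2026-02-03T10:30:00Z".
    recorded_at: String,
}

/// Merge two line → claim maps. Returns the merged map and the number of
/// conflicting lines (same line, different sessions).
fn merge_lines(
    local: &BTreeMap<u32, Claim>,
    remote: &BTreeMap<u32, Claim>,
) -> (BTreeMap<u32, Claim>, usize) {
    let mut merged = local.clone();
    let mut conflicts = 0;
    for (line, theirs) in remote {
        let take_theirs = match merged.get(line) {
            None => true,                                          // non-overlapping: union
            Some(ours) if ours.session == theirs.session => false, // same session: keep as-is
            Some(ours) => {
                conflicts += 1; // a caller would log a warning here
                theirs.recorded_at > ours.recorded_at // later recorded_at wins
            }
        };
        if take_theirs {
            merged.insert(*line, theirs.clone());
        }
    }
    (merged, conflicts)
}
```

The merged map can then be collapsed back into normalized ranges per session before rendering the note.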

5.5 Notes Collaboration Workflow

For teams sharing AI attributions via refs/notes/ai:

# Fetch remote notes
git fetch origin refs/notes/ai:refs/notes/ai-remote

# Import into jj-ai (creates local attribution files)
jj-ai sync --from-git --notes-ref=refs/notes/ai-remote

# ... do local work, attach attributions ...

# Export with merge
jj git export
jj-ai sync --to-git --merge

# Push notes
git push origin refs/notes/ai

Merge conflicts during git fetch: If refs/notes/ai has diverged, git will refuse to fast-forward. Use:

git fetch origin refs/notes/ai:refs/notes/ai-remote
git notes --ref=ai merge refs/notes/ai-remote
# Or let jj-ai handle it:
jj-ai sync --from-git --notes-ref=refs/notes/ai-remote --merge

6. Complex Rewrite Handling

6.1 Squash

When changes A and B squash into C:

  • C retains A's change-id (jj behavior)
  • Attribution from both A and B merge into C's record
  • Line ranges require remapping based on diff hunks

fn handle_squash(
    target_change: &ChangeId,   // C (keeps A's change-id)
    target_commit: &CommitId,   // New commit after squash
    absorbed_change: &ChangeId, // B (absorbed into C)
    absorbed_commit: &CommitId, // B's commit before squash
    repo: &Repository,
    store: &mut AttributionStore,
) -> Result<()> {
    // Clone B's record so the store is not borrowed while C's record is mutated
    let absorbed_attr = store.get(absorbed_change, absorbed_commit)?.clone();

    // Compute how B's content maps into C
    let diff = repo.diff(absorbed_commit, target_commit)?;

    let target_attr = store.get_or_create(target_change, target_commit)?;

    // Merge file attributions with line remapping
    for file_attr in absorbed_attr.attestation.files {
        let remapped = remap_ranges_via_diff(&file_attr.ranges, &diff, &file_attr.path)?;
        target_attr.merge_file(FileAttribution {
            path: file_attr.path,
            ranges: remapped,
        })?;
    }

    // Union prompt metadata
    target_attr.prompts.extend(absorbed_attr.prompts);

    // Remove absorbed record (target_attr is no longer used past this point)
    store.remove(absorbed_change, absorbed_commit)?;

    Ok(())
}

Range conflict resolution (when same lines attributed to different sessions):

  • Later recorded_at timestamp wins
  • Warning logged for conflicts
  • Original session preserved in prompts for audit trail

6.2 Split

When change A splits into B and C:

  • B (the first part) retains A's change-id; C gets a new change-id (jj split behavior)
  • Attribution distributes based on diff/hunk mapping, not file existence
  • Original A's record is removed after distribution

Distribution algorithm:

  1. Compute the diff from A's parent to each target (B, C)
  2. For each attributed range in A, determine which target's diff contains those lines
  3. Remap line numbers based on hunk offsets in each target
  4. If a range spans hunks going to different targets, split the range

fn handle_split(
    source: &ChangeId,
    source_commit: &CommitId,
    targets: &[(ChangeId, CommitId)],
    repo: &Repository,
    store: &mut AttributionStore,
) -> Result<()> {
    // Clone the source record so the store can be mutated while distributing it
    let source_attr = store.get(source, source_commit)?.clone();
    let parent = repo.get_parent(source_commit)?;

    for (target_change_id, target_commit_id) in targets {
        let mut attr = AttributionData::new(target_change_id.clone(), target_commit_id.clone());

        // Compute hunks that landed in this target
        let diff = repo.diff(&parent, target_commit_id)?;

        for file_attr in &source_attr.attestation.files {
            if let Some(file_diff) = diff.get(&file_attr.path) {
                // Intersect attributed ranges with hunks in this target
                let remapped = remap_ranges_to_hunks(&file_attr.ranges, file_diff);
                if !remapped.is_empty() {
                    attr.attestation.files.push(FileAttribution {
                        path: file_attr.path.clone(),
                        ranges: remapped,
                    });
                }
            }
        }

        // Include only referenced sessions
        let used_sessions = attr.referenced_sessions();
        attr.prompts = source_attr.prompts
            .iter()
            .filter(|(session, _)| used_sessions.contains(*session))
            // Iterating a map yields (&K, &V), so clone each half explicitly
            .map(|(session, prompt)| (session.clone(), prompt.clone()))
            .collect();

        if !attr.attestation.files.is_empty() {
            store.insert(attr)?;
        }
    }

    store.remove(source, source_commit)?;
    Ok(())
}

fn remap_ranges_to_hunks(
    ranges: &[Range],
    diff: &FileDiff,
) -> Vec<Range> {
    // For each range, find overlapping hunks and compute new line numbers
    // based on the hunk's position in the target file
    // Returns empty vec if no overlap
    ...
}

Fallback for complex splits: If hunk mapping cannot be reliably computed (e.g., significant refactoring), mark attribution as stale and emit a warning suggesting manual jj-ai move or re-attach.
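
The hunk intersection step of the distribution algorithm could be sketched as below. The `Hunk` shape (a run of source lines that lands at a given position in the target file) and the function name are assumptions made for illustration; a real diff representation would carry more detail.

```rust
/// A run of `len` lines starting at `source_start` in the original file (A)
/// that appears starting at `target_start` in the target file. Lines are 1-based.
#[derive(Clone, Copy)]
struct Hunk {
    source_start: u32,
    target_start: u32,
    len: u32,
}

/// Intersect inclusive attributed ranges with the hunks that landed in one
/// target and translate them to target line numbers.
/// Returns an empty Vec when nothing overlaps.
fn remap_to_target(ranges: &[(u32, u32)], hunks: &[Hunk]) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    for &(start, end) in ranges {
        for h in hunks {
            if h.len == 0 {
                continue;
            }
            let hunk_end = h.source_start + h.len - 1;
            let lo = start.max(h.source_start);
            let hi = end.min(hunk_end);
            if lo <= hi {
                // Shift the overlapping part by the hunk's displacement.
                let shift = |line: u32| line - h.source_start + h.target_start;
                out.push((shift(lo), shift(hi)));
            }
        }
    }
    out.sort();
    out
}
```

A range that overlaps several hunks comes back as several pieces, which is how a range spanning hunks bound for different targets ends up split.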

6.3 Divergent Changes

When a change diverges (concurrent edits creating multiple visible commits with the same change-id), each variant has its own attribution record.

Storage model: Each divergent commit gets its own file:

.jj/ai/notes/<change-id>/
  ├── <commit-id-variant-1>.json
  └── <commit-id-variant-2>.json

Behavior:

  • jj-ai attach -r @ writes to the specific commit's file
  • jj-ai show -r @ shows attribution for that specific commit
  • jj-ai sync --to-git exports each variant to its respective commit SHA in refs/notes/ai

Resolution when divergence resolves (user picks one variant or merges):

  • If one variant is abandoned, its attribution file is removed
  • If variants merge, attributions are merged using these rules:
    1. Session union: All prompt records from both variants are preserved
    2. Range conflict: For overlapping ranges with different sessions, the variant with later recorded_at wins
    3. Stale marking: Merged attribution is marked stale: true if content changed during merge

Sync behavior for divergent changes:

$ jj-ai sync --to-git
warning: change kpqvunto is divergent (2 variants)
  → syncing each variant to its commit SHA
  variant abc123... → refs/notes/ai
  variant def456... → refs/notes/ai

Error case: If jj-ai attach is called on a change-id without specifying which commit (and multiple visible commits exist), it MUST error:

$ jj-ai attach -r kpqvunto ...
error: change kpqvunto is divergent with 2 visible commits
  hint: specify exact commit with -r <commit-id> or resolve divergence first
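
The divergence check is essentially a match on the number of visible commits for the change. A small sketch, with a hypothetical function name and commit IDs passed as plain strings:

```rust
/// Resolve a change-id to exactly one commit, or explain why that is not possible.
fn resolve_single_commit<'a>(change_id: &str, visible: &[&'a str]) -> Result<&'a str, String> {
    match visible {
        [] => Err(format!("no visible commit for change {change_id}")),
        [only] => Ok(*only),
        many => Err(format!(
            "change {change_id} is divergent with {} visible commits",
            many.len()
        )),
    }
}
```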

7. Integration Points

7.1 Coding Agent Integration

Agents (Cursor, Claude Code, Copilot) should call jj-ai after making changes:

# Pre-edit checkpoint (optional, for precise diffing)
jj-ai checkpoint --type=human -r @

# ... agent makes changes ...

# Post-edit: attach attribution
jj-ai attach -r @ \
  --tool=cursor \
  --model=claude-sonnet-4-20250514 \
  --conversation-id="$CONVERSATION_ID" \
  --transcript-stdin < transcript.json

7.2 CI/CD Integration

# GitHub Actions example
- name: Sync AI attributions
  run: |
    jj git export
    jj-ai sync --to-git --all-reachable --strict
    git push origin refs/notes/ai

7.3 Pre-push Hook (Git side)

If using Git for pushing, add a pre-push hook:

#!/bin/bash
# .git/hooks/pre-push
set -e  # abort the push if the export fails
jj git export
jj-ai sync --to-git --strict || exit 1

8. Git-ai v4.0.0 Compatibility

This section describes jj-ai alignment with the git-ai v4.0.0 working proposal, which adds first-class support for agentic primitives: plans, subagents, forked threads, and content-addressable prompt storage.

8.1 Schema Version

When targeting v4 compatibility:

{
  "schema_version": "authorship/4.0.0",
  ...
}

jj-ai MUST support both v3 and v4 formats, auto-detecting via schema_version.

8.2 Enhanced Attestation Section

v4 adds message index tracking to pinpoint which message in a conversation caused a change.

IMPORTANT: To maintain v3 parser compatibility, the attestation section wire format MUST NOT change. v4 metadata (message indices, flags) is stored in the JSON metadata section only.

Wire format (unchanged from v3):

src/main.rs
  a1b2c3d4e5f67890 1-10,15-20,25-30
  0987654321fedcba 40-50

v4 metadata in JSON section:

{
  "schema_version": "authorship/4.0.0",
  "attestation_meta": {
    "src/main.rs": {
      "a1b2c3d4e5f67890": {
        "1-10,15-20": { "message_idx": 5 },
        "25-30": { "message_idx": 7 }
      },
      "0987654321fedcba": {
        "40-50": { "message_idx": 3, "overridden": true }
      }
    }
  }
}

jj-ai local storage format (richer than wire format):

{
  "attestation": {
    "files": [
      {
        "path": "src/main.rs",
        "ranges": [
          { "session": "a1b2c3d4e5f67890", "message_idx": 5, "lines": "1-10,15-20" },
          { "session": "a1b2c3d4e5f67890", "message_idx": 7, "lines": "25-30" },
          { "session": "0987654321fedcba", "message_idx": 3, "lines": "40-50", "overridden": true }
        ]
      }
    ]
  }
}

Flags (stored in attestation_meta, not attestation section):

  • overridden: true - lines originally AI-authored, subsequently modified by human
  • removed: true - lines deleted (tracked for audit purposes)

8.3 Enhanced Message Types

v4 expands message types beyond user/assistant/tool_use:

| Type | Description |
|---|---|
| human | Message from the human (replaces user) |
| ai | Response from the AI (replaces assistant) |
| thinking | AI reasoning/chain-of-thought (e.g., Claude's extended thinking) |
| subagent | Sub-agent invocation with nested session |
| edit | Normalized file edit operation |
| tool | Generic tool call |
| plan | AI-generated plan (may evolve across messages) |

Example message array:

{
  "messages": [
    { "type": "human", "text": "Add authentication to the API", "timestamp": "..." },
    { "type": "thinking", "text": "I need to consider JWT vs session-based...", "timestamp": "..." },
    { "type": "plan", "plan_id": "plan-001", "content": { "steps": [...] }, "timestamp": "..." },
    { "type": "ai", "text": "I'll implement JWT-based auth...", "timestamp": "..." },
    { "type": "subagent", "session_id": "c3d4e5f678901234", "task": "Research JWT libraries", "timestamp": "..." },
    { "type": "edit", "path": "src/auth.rs", "diff": "...", "timestamp": "..." },
    { "type": "tool", "name": "bash", "input": { "command": "cargo test" }, "timestamp": "..." }
  ]
}

8.4 Subagent Support

Subagents are nested AI sessions spawned by a parent session. v4 adds:

{
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "claude-code", "model": "claude-sonnet-4-20250514", "id": "..." },
      "parent_session_id": null,
      "messages": [...]
    },
    "c3d4e5f678901234": {
      "agent_id": { "tool": "claude-code", "model": "claude-sonnet-4-20250514", "id": "..." },
      "parent_session_id": "a1b2c3d4e5f67890",
      "task": "Research JWT libraries for Rust",
      "messages": [...]
    }
  }
}

jj-ai tracks subagent lineage via parent_session_id. Attribution flows to the root session for summary statistics.
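
Rolling attribution up to the root session amounts to following parent_session_id links until a session with no parent is found. A sketch, assuming the prompts map has been reduced to a session → parent lookup (the function name is hypothetical):

```rust
use std::collections::{HashMap, HashSet};

/// Follow parent links from `session` to its root.
/// Returns None if a session is missing from the map or the links form a cycle.
fn root_session<'a>(
    parents: &'a HashMap<String, Option<String>>,
    session: &'a str,
) -> Option<&'a str> {
    let mut seen = HashSet::new();
    let mut current = session;
    loop {
        if !seen.insert(current) {
            return None; // cycle guard: malformed lineage
        }
        match parents.get(current)? {
            Some(parent) => current = parent.as_str(),
            None => return Some(current), // parent_session_id is null: this is the root
        }
    }
}
```

Summary statistics can then group line counts by the returned root rather than by the subagent's own session hash.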

8.5 Plan Tracking

Plans are first-class objects that may evolve across messages:

{
  "plans": {
    "plan-001": {
      "created_at": "2026-02-03T10:30:00Z",
      "revisions": [
        {
          "revision": 1,
          "message_idx": 2,
          "content": {
            "goal": "Add JWT authentication",
            "steps": [
              { "id": "1", "description": "Add jsonwebtoken dependency", "status": "completed" },
              { "id": "2", "description": "Create auth middleware", "status": "in_progress" },
              { "id": "3", "description": "Add token refresh endpoint", "status": "pending" }
            ]
          }
        },
        {
          "revision": 2,
          "message_idx": 8,
          "content": {
            "goal": "Add JWT authentication",
            "steps": [
              { "id": "1", "description": "Add jsonwebtoken dependency", "status": "completed" },
              { "id": "2", "description": "Create auth middleware", "status": "completed" },
              { "id": "3", "description": "Add token refresh endpoint", "status": "completed" },
              { "id": "4", "description": "Add rate limiting", "status": "pending" }
            ]
          }
        }
      ]
    }
  }
}

8.6 Content-Addressable Prompt Store

For sensitive transcripts that shouldn't live in Git history, v4 defines a protocol for external storage:

{
  "prompts": {
    "a1b2c3d4e5f67890": {
      "agent_id": { "tool": "cursor", "model": "claude-sonnet-4-20250514", "id": "..." },
      "messages": { "$ref": "cas://sha256:abc123.../messages" },
      "summary": "User requested auth implementation. AI created JWT middleware."
    }
  }
}

The $ref field indicates content stored externally. Supported schemes:

  • cas://sha256:<hash>/<path> - Content-addressable store (implementation-defined)
  • file://<path> - Local file reference (RESTRICTED - see Security)
  • https://<url> - Remote HTTPS endpoint (RESTRICTED - see Security)

Security restrictions for $ref resolution (see also §11.3):

| Scheme | Default | Flag to enable |
|---|---|---|
| cas:// | Allowed (local CAS only) | N/A |
| file:// | BLOCKED | --allow-file-refs (restricted to repo) |
| https:// | BLOCKED | --allow-network-refs |
| http:// | BLOCKED | Not supported (no flag) |
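
The scheme policy above can be expressed as a single gate in front of any $ref resolution. This is a sketch only: the names are hypothetical, and the additional "restricted to repo" path check for file:// is not shown.

```rust
#[derive(Debug, PartialEq)]
enum RefDecision {
    Allowed,
    Blocked(&'static str),
}

/// Decide whether a $ref may be resolved, given the two opt-in flags.
fn check_ref(uri: &str, allow_file_refs: bool, allow_network_refs: bool) -> RefDecision {
    use RefDecision::{Allowed, Blocked};
    if uri.starts_with("cas://") {
        Allowed
    } else if uri.starts_with("file://") {
        // A real check would also confirm the path stays inside the repository.
        if allow_file_refs { Allowed } else { Blocked("file:// requires --allow-file-refs") }
    } else if uri.starts_with("https://") {
        if allow_network_refs { Allowed } else { Blocked("https:// requires --allow-network-refs") }
    } else if uri.starts_with("http://") {
        Blocked("http:// is never supported")
    } else {
        Blocked("unknown scheme")
    }
}
```

Defaulting to `Blocked` for anything unrecognized keeps new schemes opt-in.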

jj-ai stores the reference in .jj/ai/notes/ and resolves on read. For jj-ai sync --to-git:

  • Default: Inline messages if small (<10KB), else use $ref with summary
  • --inline-transcripts: Always inline (may bloat notes)
  • --strip-transcripts: Only include summary, drop messages entirely

8.7 Removed Fields

v4 removes per-session statistics that can be computed on-demand via jj-ai diff or git-ai diff:

  • total_additions
  • total_deletions
  • accepted_lines
  • overridden_lines

Computing these values on demand avoids stored counters that go stale as commits are rewritten.

8.8 Forked Threads

When a conversation branches (e.g., user backtracks and tries a different approach), v4 tracks thread lineage:

{
  "prompts": {
    "thread-main-abc": {
      "agent_id": { ... },
      "forked_from": null,
      "fork_point_message_idx": null,
      "messages": [...]
    },
    "thread-alt-xyz": {
      "agent_id": { ... },
      "forked_from": "thread-main-abc",
      "fork_point_message_idx": 5,
      "messages": [...]
    }
  }
}

This enables tools to visualize conversation evolution and understand why code changed direction.

8.9 Backwards Compatibility

Wire format compatibility is paramount. The attestation section grammar MUST remain v3-compatible.

| Reading | Writing |
|---|---|
| jj-ai MUST read v3 and v4 | jj-ai SHOULD write v3-compatible wire format by default |
| v3 user → v4 human | v4 extras stored in attestation_meta JSON only |
| v3 assistant → v4 ai | Stats fields omitted in v4 |
| v3 tool_use → v4 tool | Message index defaults to null if unknown |

Version negotiation:

  • --schema-version=3: Pure v3 output (no attestation_meta)
  • --schema-version=4: v3 attestation section + v4 JSON metadata (default when v4 data present)
  • Auto-detect on read via schema_version field
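
A reader that auto-detects the schema version and applies the role mapping from the table above might look like the following sketch. The helper names are hypothetical, and passing unknown roles through unchanged is an assumption rather than a rule stated in this spec.

```python
# Role names taken from the Reading column above.
V3_TO_V4_ROLE = {"user": "human", "assistant": "ai", "tool_use": "tool"}


def detect_major_version(note):
    """Parse the major version from a schema_version such as 'authorship/3.0.0'."""
    prefix, _, version = note["schema_version"].partition("/")
    if prefix != "authorship" or not version:
        raise ValueError(f"unrecognized schema_version: {note['schema_version']!r}")
    major = int(version.split(".")[0])
    if major not in (3, 4):
        raise ValueError(f"unsupported schema major version: {major}")
    return major


def to_v4_role(role):
    """Map a v3 role name to its v4 equivalent.

    Assumption: roles without a mapping are passed through unchanged.
    """
    return V3_TO_V4_ROLE.get(role, role)
```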

8.10 v4 CLI Additions

New flags for v4 features:

jj-ai attach [OPTIONS]
      --plan <JSON>           Attach plan JSON
      --parent-session <ID>   Mark as subagent of parent session
      --message-idx <N>       Associate with specific message index
      --cas-store <URL>       Store transcript in content-addressable store

jj-ai show [OPTIONS]
      --expand-refs           Resolve $ref pointers to inline content
      --plans                 Show plan evolution

jj-ai diff [OPTIONS]
      -r, --revset <REVSET>   Calculate AI contribution stats on-demand
      --by-session            Break down by session

9. Upstream jj Tracking

This section tracks relevant upstream jj issues and design work that may impact jj-ai.

9.1 Active Upstream Issues

| Issue | Title | Status | Impact on jj-ai |
|-------|-------|--------|-----------------|
| #8166 | design: Add support for per-revision metadata | 🟢 Open | High - Could replace .jj/ai/notes/ |
| #6664 | FR: key/value store per change and commit | 🟢 Open | High - Broader metadata use cases |
| #4766 | FR: Make jj annotate extendable with blame layers | 🟢 Open | Medium - Could simplify jj-ai blame |
| #8395 | FR: Add support for jj notes | 🔴 Closed | Duplicate of #8166 |
| #4706 | FR: Transfer change ids via Git remote | 🔴 Closed | Solved via git.write-change-id-header |

9.2 Per-Revision Metadata Design (Issue #8166)

A design document is in progress with these key points:

Semantics:

  • Metadata attaches to either change ID (survives rewrites) or commit hash (pinned)
  • Change-id attachment is exactly what jj-ai needs
  • Squash merges metadata; split follows change-id

Proposed CLI:

jj metaedit --set KEY=VALUE -r REVISION
jj metaedit --unset KEY -r REVISION

Scope explicitly excludes:

  • ❌ Syndication to remotes (no git notes export)
  • ❌ Topics formalization (separate effort)

Maintainer comments (martinvonz):

"We can probably still map the metadata to Git notes. We should be able to have a notes ref like refs/notes/jj/metadata..."

"Attaching notes to change ids may not work using Git notes because Git notes can only be attached to existing objects (and we don't model changes as Git objects)."

This confirms jj-ai's approach: store locally by change-id, materialize to git notes on sync.

9.3 Community Interest in AI Attribution

From @roninjin10 in #8166:

"I believe attaching prompts as metadata to changes is extremely powerful for JJ users doing agentic coding. Imagine being able to 'prompt rebase' when a model comes out rerunning all your prompts with a stronger model."

The jj community is already thinking about AI attribution use cases.

9.4 Historical Context

Issue #7 (2021) explains why jj moved away from git notes internally:

  • libgit2 doesn't do sharding → slow with many commits
  • Replaced with custom format for performance
  • Exchange via git notes was deprioritized

This validates jj-ai's design choice to use a separate storage layer.

9.5 Potential Migration Path

If upstream implements per-revision metadata (#8166):

Current jj-ai storage:

.jj/ai/notes/<change-id>.json

Potential future:

jj metaedit --set ai.attribution='<json>' -r @

jj-ai could migrate to use native metadata API while maintaining the sync layer for git-ai compatibility. The jj-ai sync --to-git command would still be necessary since upstream explicitly scoped out remote syndication.

9.6 Recommended Upstream Proposals

Features that would benefit jj-ai (not yet proposed upstream):

  1. Opt-in notes import/export

    jj git export --include-notes=refs/notes/ai
    jj git import --include-notes=refs/notes/ai
  2. Template external data access

    • Allow templates to read from external sources
    • Would enable native jj file annotate -T 'ai_attribution()'
  3. Operation triggers

    • Post-operation callbacks for automation

10. Future Considerations

10.1 Revset Extensions

If jj adds symbol resolver extension support for external tools:

# .jj/repo/config.toml
[revset-aliases]
'ai()' = 'jj-ai:attributed()'
'ai(tool)' = 'jj-ai:attributed(tool)'
'human_only()' = 'all() ~ ai()'

Enabling queries like:

jj log -r 'ai("cursor")'
jj log -r 'trunk()..@ & human_only()'

11. Security Considerations

11.1 Sensitive Data

  • Transcripts: May contain sensitive prompts/responses. Store hashes only by default; full transcripts opt-in.
  • API keys: Never store in attribution records.
  • User identities: Respect Git's user.email settings.

11.2 Tampering

Git notes are mutable and not cryptographically signed by default. For high-assurance environments:

  • Sign the notes ref out of band: git notes has no signing option, so sign a tag that points at the notes commit, e.g. git tag -s <tag-name> refs/notes/ai
  • Verify on import: jj-ai sync --from-git --verify-signatures

11.3 External Reference Resolution ($ref)

Threat model: A malicious collaborator could push notes containing $ref URIs that:

  • Exfiltrate local files: file:///etc/passwd
  • Probe internal networks (SSRF): https://internal-metadata-service/...
  • Leak information via DNS/request timing

Default restrictions:

| Operation | Behavior |
|-----------|----------|
| $ref with file:// | BLOCKED by default |
| $ref with https:// | BLOCKED by default |
| $ref with http:// | Always blocked (not supported) |
| $ref with cas:// | Allowed (resolves to local .jj/ai/cas/ only) |

Enabling external refs (explicit opt-in required):

# Allow file:// refs, restricted to repo directory
jj-ai show --expand-refs --allow-file-refs

# Allow network refs (use with caution)
jj-ai show --expand-refs --allow-network-refs
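
The default policy and the two opt-in flags can be expressed as a small scheme check. This is a sketch only: the function name and keyword arguments are hypothetical stand-ins for --allow-file-refs and --allow-network-refs, and a real resolver would still need the path and host checks described in this section.

```python
from urllib.parse import urlsplit


def ref_allowed(uri, allow_file_refs=False, allow_network_refs=False):
    """Decide whether a $ref URI may be resolved, based only on its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "cas":
        return True  # local CAS is always allowed
    if scheme == "file":
        return allow_file_refs
    if scheme == "https":
        return allow_network_refs
    # http:// has no enabling flag, and unknown schemes are rejected too.
    return False
```

Rejecting anything that is not explicitly recognized keeps the check fail-closed if new schemes appear in notes pushed by a collaborator.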

file:// restrictions when enabled:

  • Path must be under repository root or .jj/ai/
  • Symlinks are not followed outside allowed directories
  • Absolute paths outside repo are rejected
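
One way to enforce these restrictions is to resolve the path (which follows symlinks) and then check that the result still lies under the repository root. The sketch below assumes POSIX-style file:// URIs and Python 3.9+ for `Path.is_relative_to`; the function name is hypothetical. Since .jj/ai/ normally lives inside the repository, a single root check covers both allowed locations.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_file_ref(uri, repo_root):
    """Return the resolved path for a file:// $ref, or raise if it escapes the repo."""
    parts = urlsplit(uri)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError(f"not a local file:// URI: {uri!r}")

    root = Path(repo_root).resolve()
    # resolve() follows symlinks, so a link pointing outside the repo is
    # judged by its real target rather than by where the link itself sits.
    candidate = Path(unquote(parts.path)).resolve()

    if not candidate.is_relative_to(root):
        raise PermissionError(f"{candidate} is outside {root}")
    return candidate
```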

Config for persistent settings:

# .jj/ai/config.toml
[security]
allow_file_refs = false      # default
allow_network_refs = false   # default
allowed_cas_hosts = []       # for remote CAS, e.g. ["cas.example.com"]

11.4 Transcript and Context Storage

thinking messages (chain-of-thought):

  • High privacy risk: may contain sensitive reasoning, rejected approaches
  • Default: stripped from export unless --include-thinking
  • Warning emitted if thinking messages present and not explicitly handled

content_anchors:

  • Previous spec stored raw context_before/context_after strings
  • Risk: Leaks proprietary code into git notes
  • New default: Store only context_hash (SHA-256 of context)
  • --store-context-plaintext: Opt-in to store raw context (not recommended)
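
A context hash along these lines could be computed as shown below. The spec only says "SHA-256 of context"; how `context_before` and `context_after` are combined is not defined here, so the length-prefixed concatenation is an assumption chosen to avoid ambiguity between the two parts.

```python
import hashlib


def context_hash(context_before, context_after):
    """Hash the surrounding context so the plaintext never enters the note."""
    digest = hashlib.sha256()
    for part in (context_before, context_after):
        data = part.encode("utf-8")
        # Assumption: an 8-byte length prefix separates the two parts, so
        # ("ab", "c") and ("a", "bc") produce different hashes.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```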

Recommended flags for jj-ai sync --to-git:

--strip-transcripts      # Only summary, no full messages
--strip-thinking         # Remove chain-of-thought (default)
--hash-context           # Store context hashes, not plaintext (default)

12. Appendix

A. Example Output

A.1 Native jj file annotate (baseline)

$ jj file annotate src/auth.rs

kpqvunto <email> 2026-02-03 10:30  1: use std::collections::HashMap;
kpqvunto <email> 2026-02-03 10:30  2: use jsonwebtoken::{encode, Header};
kpqvunto <email> 2026-02-03 10:30  3: 
kpqvunto <email> 2026-02-03 10:30  4: pub struct AuthService {
kpqvunto <email> 2026-02-03 10:30  5:     secret: String,
kpqvunto <email> 2026-02-03 10:30  6:     tokens: HashMap<String, Token>,
kpqvunto <email> 2026-02-03 10:30  7: }
kpqvunto <email> 2026-02-03 10:30  8: 
xyzw1234 <email> 2026-02-03 11:15  9: impl AuthService {
xyzw1234 <email> 2026-02-03 11:15 10:     pub fn new(secret: &str) -> Self {
xyzw1234 <email> 2026-02-03 11:15 11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234 <email> 2026-02-03 11:15 12:     }
mlkn9876 <email> 2026-02-03 14:22 13: 
mlkn9876 <email> 2026-02-03 14:22 14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 <email> 2026-02-03 14:22 15:         self.tokens.contains_key(token)
mlkn9876 <email> 2026-02-03 14:22 16:     }
xyzw1234 <email> 2026-02-03 11:15 17: }

A.2 jj-ai blame (enriched)

$ jj-ai blame src/auth.rs

kpqvunto 2026-02-03 [AI cursor/sonnet-4]  1: use std::collections::HashMap;
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  2: use jsonwebtoken::{encode, Header};
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  3: 
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  4: pub struct AuthService {
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  5:     secret: String,
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  6:     tokens: HashMap<String, Token>,
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  7: }
kpqvunto 2026-02-03 [AI cursor/sonnet-4]  8: 
xyzw1234 2026-02-03 [Human]               9: impl AuthService {
xyzw1234 2026-02-03 [Human]              10:     pub fn new(secret: &str) -> Self {
xyzw1234 2026-02-03 [Human]              11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234 2026-02-03 [Human]              12:     }
mlkn9876 2026-02-03 [AI copilot/gpt-4]   13: 
mlkn9876 2026-02-03 [AI copilot/gpt-4]   14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 2026-02-03 [AI copilot/gpt-4]   15:         self.tokens.contains_key(token)
mlkn9876 2026-02-03 [AI copilot/gpt-4]   16:     }
xyzw1234 2026-02-03 [Human]              17: }

A.3 Native jj file annotate with generated template

$ jj-ai template-gen -r 'ancestors(@, 10)' > /tmp/ai.toml
$ jj file annotate src/auth.rs --config-file /tmp/ai.toml -T 'ai_annotate'

kpqvunto [AI:cursor]   1: use std::collections::HashMap;
kpqvunto [AI:cursor]   2: use jsonwebtoken::{encode, Header};
kpqvunto [AI:cursor]   3: 
kpqvunto [AI:cursor]   4: pub struct AuthService {
kpqvunto [AI:cursor]   5:     secret: String,
kpqvunto [AI:cursor]   6:     tokens: HashMap<String, Token>,
kpqvunto [AI:cursor]   7: }
kpqvunto [AI:cursor]   8: 
xyzw1234              9: impl AuthService {
xyzw1234             10:     pub fn new(secret: &str) -> Self {
xyzw1234             11:         Self { secret: secret.to_string(), tokens: HashMap::new() }
xyzw1234             12:     }
mlkn9876 [AI:copilot] 13: 
mlkn9876 [AI:copilot] 14:     pub fn validate(&self, token: &str) -> bool {
mlkn9876 [AI:copilot] 15:         self.tokens.contains_key(token)
mlkn9876 [AI:copilot] 16:     }
xyzw1234             17: }

Native template is less detailed (no model) but works with pure jj.

A.4 Porcelain output (for tooling)

$ jj-ai blame src/auth.rs --porcelain

{"line":1,"change_id":"kpqvunto","ai":{"tool":"cursor","model":"claude-sonnet-4-20250514","session":"a1b2c3d4e5f67890"}}
{"line":2,"change_id":"kpqvunto","ai":{"tool":"cursor","model":"claude-sonnet-4-20250514","session":"a1b2c3d4e5f67890"}}
...
{"line":9,"change_id":"xyzw1234","ai":null}
{"line":10,"change_id":"xyzw1234","ai":null}
...
{"line":14,"change_id":"mlkn9876","ai":{"tool":"copilot","model":"gpt-4","session":"0987654321fedcba"}}
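
Downstream tooling can consume this output one JSON object per line. The sketch below tallies lines per tool from porcelain records; the function name is hypothetical, and it assumes real output contains no elision lines (the `...` above only abbreviates the example).

```python
import json
from collections import Counter


def summarize_porcelain(lines):
    """Count attributed lines per tool; lines with "ai": null count as human."""
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between records
        record = json.loads(raw)
        ai = record["ai"]
        counts["human" if ai is None else ai["tool"]] += 1
    return counts
```

A typical use would be piping `jj-ai blame <file> --porcelain` into a script that calls `summarize_porcelain(sys.stdin)` and reports the share of AI-attributed lines.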

A.5 Generated template file

# Generated by jj-ai template-gen
# Scope: ancestors(@, 10)
# Generated: 2026-02-03T15:00:00Z

[template-aliases]
'ai_annotate' = '''
separate(" ",
  commit.change_id().shortest(8),
  if(commit.change_id().short(8) == "kpqvunto",
    label("ai", "[AI:cursor]"),
    if(commit.change_id().short(8) == "mlkn9876",
      label("ai", "[AI:copilot]"),
      pad_end(12, "")
    )
  ),
  pad_end(4, line_number),
  content
)
'''
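
The nested `if` chain in a generated file like this one can be produced mechanically from a mapping of change-ID prefix to tool. The following is a sketch of that step only, not the actual template-gen implementation; the function name is hypothetical, and the template expression that yields the change-ID prefix is passed in as `id_expr` rather than hard-coded.

```python
def nested_labels(tool_by_change_id, id_expr, fallback='pad_end(12, "")'):
    """Build a nested if(...) template expression, one level per attributed change."""
    expr = fallback
    # Build from the innermost branch outward so the first entry ends up outermost.
    for change_id, tool in reversed(list(tool_by_change_id.items())):
        for value in (change_id, tool):
            if '"' in value or "\\" in value:
                raise ValueError(f"refusing to embed unescaped value: {value!r}")
        expr = f'if({id_expr} == "{change_id}", label("ai", "[AI:{tool}]"), {expr})'
    return expr
```

Nesting depth grows linearly with the number of attributed changes, which is one reason to keep the revset scope passed to template-gen small.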

B. Change ID Format

  • Storage: 16 bytes (128 bits)
  • Display: 32 characters, reverse-hex using z-k alphabet
  • Short form: First 12 characters via .short(); the default format_short_id uses shortest(8), which is why the examples above show 8 characters

Alphabet mapping:

0123456789abcdef  (hex)
zyxwvutsrqponmlk  (reverse-hex display)
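
The mapping is a plain character substitution, so converting between the two forms takes a translation table. The helper names below are hypothetical; the alphabets are copied from the mapping above.

```python
HEX = "0123456789abcdef"
REVERSE_HEX = "zyxwvutsrqponmlk"

_TO_REVERSE = str.maketrans(HEX, REVERSE_HEX)
_TO_HEX = str.maketrans(REVERSE_HEX, HEX)


def to_reverse_hex(hex_id):
    """Convert a hex string to the z-k display alphabet."""
    lowered = hex_id.lower()
    if set(lowered) - set(HEX):
        raise ValueError(f"not a hex string: {hex_id!r}")
    return lowered.translate(_TO_REVERSE)


def from_reverse_hex(change_id):
    """Convert a z-k change ID (or prefix) back to hex."""
    if set(change_id) - set(REVERSE_HEX):
        raise ValueError(f"not a reverse-hex string: {change_id!r}")
    return change_id.translate(_TO_HEX)


def change_id_display(raw):
    """Render the 16 raw bytes of a change ID as 32 display characters."""
    if len(raw) != 16:
        raise ValueError("change IDs are 16 bytes")
    return to_reverse_hex(raw.hex())
```

The validation in `from_reverse_hex` also makes it easy to catch IDs that contain digits or letters outside k-z, which cannot be valid change IDs.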

C. Git-ai v3.0.0 Required Fields

{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "<sha>",
  "prompts": {
    "<session_hash>": {
      "agent_id": { "tool": "<tool>", "model": "<model>", "id": "<conversation_id>" },
      "human_author": "<name> <email>",
      "messages": [],
      "total_additions": 0,
      "total_deletions": 0,
      "accepted_lines": 0,
      "overridden_lines": 0
    }
  }
}

D. Git-ai v4.0.0 Required Fields

{
  "schema_version": "authorship/4.0.0",
  "base_commit_sha": "<sha>",
  "prompts": {
    "<session_hash>": {
      "agent_id": { "tool": "<tool>", "model": "<model>", "id": "<conversation_id>" },
      "human_author": "<name> <email>",
      "parent_session_id": null,
      "messages": []
    }
  },
  "plans": {}
}

Note: v4 removes total_additions, total_deletions, accepted_lines, overridden_lines (computed on-demand).
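
Comparing appendices C and D, an in-memory upgrade of a v3 record amounts to dropping the four counters and adding the new fields with empty defaults. The sketch below covers only those structural changes; the function name is hypothetical, and message role renaming is out of scope here.

```python
import copy

REMOVED_STATS = (
    "total_additions",
    "total_deletions",
    "accepted_lines",
    "overridden_lines",
)


def upgrade_v3_to_v4(note):
    """Return a v4-shaped copy of a v3 note without mutating the input."""
    upgraded = copy.deepcopy(note)
    upgraded["schema_version"] = "authorship/4.0.0"
    for session in upgraded.get("prompts", {}).values():
        for field in REMOVED_STATS:
            session.pop(field, None)  # stats are computed on demand in v4
        session.setdefault("parent_session_id", None)
    upgraded.setdefault("plans", {})
    return upgraded
```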

E. References

F. Comparison with git-ai 3.0.0

This section compares jj-ai's approach to tooling/agent/editor integration against the git-ai 3.0.0 specification.

F.1 Core Architecture Difference

| Aspect | git-ai 3.0.0 | jj-ai 0.1.0 |
|--------|--------------|-------------|
| Key | Commit SHA | Change ID (stable) |
| Rewrite handling | Complex operations require remap | Change-id survives rewrites natively |
| Storage | refs/notes/ai directly | .jj/ai/notes/ → materialized on export |

F.2 Agent Integration Points

git-ai requires agents to:

  • Write to working state / staged attributions
  • Attach attribution before commit time
  • Re-track attributions through history rewrites

The git-ai spec itself recommends "integrating with published implementation, not implementing this spec" and links to external integration docs.

jj-ai simplifies to:

# Pre-edit checkpoint (optional, for precise diffing)
jj-ai checkpoint --type=human -r @

# ... agent makes changes ...

# Post-edit: attach attribution
jj-ai attach -r @ \
  --tool=cursor \
  --model=claude-sonnet-4-20250514 \
  --conversation-id="$CONVERSATION_ID" \
  --transcript-stdin < transcript.json
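
An agent written in Python might wrap the attach call as follows. This is a sketch that uses only the flags shown above; the helper names are hypothetical, and it assumes jj-ai is on PATH and accepts the transcript as JSON on stdin.

```python
import json
import subprocess


def build_attach_argv(tool, model, conversation_id, revision="@"):
    """Build the argv for `jj-ai attach` using only the flags shown above."""
    return [
        "jj-ai", "attach",
        "-r", revision,
        f"--tool={tool}",
        f"--model={model}",
        f"--conversation-id={conversation_id}",
        "--transcript-stdin",
    ]


def attach(tool, model, conversation_id, transcript, revision="@"):
    """Run the attach command, piping the transcript to stdin as JSON."""
    argv = build_attach_argv(tool, model, conversation_id, revision)
    # check=True raises CalledProcessError if jj-ai exits non-zero.
    subprocess.run(argv, input=json.dumps(transcript), text=True, check=True)
```

Passing an argv list instead of a shell string avoids quoting problems when conversation IDs contain unusual characters.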

F.3 Key Differences for Tool Integrators

| Concern | git-ai 3.0.0 | jj-ai 0.1.0 |
|---------|--------------|-------------|
| When to attach | Before commit (working state) | After change exists (any revision) |
| Rewrite survival | Tool must re-track through rewrites via "working state" | Auto-survives; sync on export |
| Working state | Central concept for merge --squash, reset, stash | Not needed - change-id stable |
| Transcript ingestion | Via messages[] array | --transcript-stdin flag |
| Session hash | SHA-256("{tool}:{conversation_id}")[0:16] | Same (compatible) |
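
The session hash formula can be written directly with the standard library. The sketch follows the formula above together with the changelog's clarification (lowercase hex, UTF-8 input); the function name is hypothetical.

```python
import hashlib


def session_hash(tool, conversation_id):
    """First 16 lowercase hex characters of SHA-256("{tool}:{conversation_id}")."""
    payload = f"{tool}:{conversation_id}".encode("utf-8")
    # hexdigest() already returns lowercase hex.
    return hashlib.sha256(payload).hexdigest()[:16]
```

Because the hash depends only on the tool name and conversation ID, git-ai and jj-ai derive the same key for the same session.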

F.4 git-ai's "Working State" Burden

git-ai spec §2.1-2.6 defines extensive rules for preserving attributions through:

  • Rebase (§2.1) - must track through reordering, squash, split, drop, edit
  • Merge (§2.2) - --squash requires moving to working state
  • Reset (§2.3) - soft/mixed must preserve in working state
  • Cherry-pick (§2.4) - --no-commit requires working state
  • Stash (§2.5) - stores under refs/notes/ai-stash
  • Amend (§2.6) - must combine original + working state

jj-ai sidesteps all of this by keying on change-id. Complex rewrite handling reduces to:

  1. Squash → merge attributions (§6.1)
  2. Split → distribute by content (§6.2)
  3. Divergent → union merge with warnings (§6.3)

F.5 Integration Complexity

For a coding agent (Cursor, Claude Code, Copilot):

git-ai integration requires:

  1. Track uncommitted state
  2. Handle reattachment through history rewrites
  3. Potentially integrate deeply with IDE's git operations
  4. Implement working state persistence

jj-ai integration requires:

  1. Call jj-ai attach after making changes
  2. Done - jj-ai handles rewrite survival internally

F.6 CI/CD Comparison

Both have similar CI patterns, but jj-ai adds an explicit sync step:

# git-ai
- run: git push origin refs/notes/ai

# jj-ai
- run: |
    jj git export
    jj-ai sync --to-git --all-reachable --strict
    git push origin refs/notes/ai

F.7 Summary

| | git-ai 3.0.0 | jj-ai 0.1.0 |
|---|--------------|-------------|
| Integration simplicity | Complex (working state management) | Simple (just attach) |
| Rewrite robustness | Requires careful implementation | Inherent via change-id |
| Sync step | None (direct to notes) | Required before push |
| Storage overhead | Single location | Dual (local + materialized) |
| Subcommand discovery | N/A (git extension) | Manual (jj-ai, not jj ai) |

Verdict: jj-ai shifts complexity from tool integrators to a single sync step, making it significantly easier for coding agents to adopt while maintaining git-ai wire-format compatibility on export.


Changelog

  • 0.1.1-draft (2026-02-03): Oracle review fixes
    • Fixed session hash examples to use valid hex characters
    • Fixed v4 attestation format to preserve v3 wire compatibility
    • Fixed prune semantics: orphan-based, not scope-based
    • Fixed split algorithm: hunk-based distribution, not file existence
    • Fixed divergence model: per-commit-id variants
    • Added merge semantics for notes conflicts (§5.4, §5.5)
    • Added locking/atomicity requirements (§3.4)
    • Namespaced extension fields under extensions.jj-ai
    • Added security restrictions for $ref resolution (§11.3)
    • Added checkpoint, template-gen, diff command specs
    • Added canonical formatting/sorting rules for attestation
    • Clarified session hash generation (lowercase hex, UTF-8)
  • 0.1.0-draft (2026-02-03): Initial specification