Enable distribution of Pi session data across a network (including p2p) by content-addressing session files without modifying Pi itself.
Pi stores sessions as JSONL files at ~/.pi/agent/sessions/, organized by working directory:
~/.pi/agent/sessions/--<path>--/<timestamp>_<uuid>.jsonl
Each file is append-only. The first line is a header:
{"type":"session","version":3,"id":"<uuid>","timestamp":"<iso>","cwd":"/path/to/project"}
Forked sessions include a parentSession field in the header: an absolute local path to the original session file.
Subsequent lines are session entries (messages, tool results, compactions, model changes, etc.) linked into a tree via id/parentId fields. All branches coexist in a single file.
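To make the file layout concrete, here is a minimal sketch of reading a session file into its header and an id/parentId index. The entry shape is an assumption: only the id and parentId fields come from the description above. The "message" type value, the null parentId for root entries, and the helper names parse_session and leaves are hypothetical.

```python
import json
from collections import defaultdict

def parse_session(text):
    """Split session JSONL text into (header, parentId -> [child ids])."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = json.loads(lines[0])
    if header.get("type") != "session":
        raise ValueError("first line is not a session header")
    children = defaultdict(list)
    for ln in lines[1:]:
        entry = json.loads(ln)
        # Assumption: root entries carry parentId null (None after parsing).
        children[entry.get("parentId")].append(entry["id"])
    return header, dict(children)

def leaves(children):
    """Entry ids that no other entry names as parent, i.e. the branch tips."""
    all_ids = {i for ids in children.values() for i in ids}
    return sorted(all_ids - set(children))

# Hypothetical sample: one header, then a two-way branch under entry "a".
sample = "\n".join([
    '{"type":"session","version":3,"id":"u1","timestamp":"2024-01-01T00:00:00Z","cwd":"/p"}',
    '{"type":"message","id":"a","parentId":null}',
    '{"type":"message","id":"b","parentId":"a"}',
    '{"type":"message","id":"c","parentId":"a"}',
])
header, children = parse_session(sample)
```

Both branch tips ("b" and "c") live in the same file, which is why the hash later covers the whole file rather than a single branch.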
- Session files reference parent sessions by local filesystem path, which is meaningless on another machine.
- Session files have no content-based identity — they're identified by path and filename.
- We want to distribute sessions without modifying Pi's format or forking Pi.
A session blob is the raw Pi session .jsonl file, hashed verbatim with Blake3. No transformation, no normalization, no envelope. Pi's file is the blob.
blake3(raw_bytes_of_session.jsonl) → session hash
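As a rough sketch of the hashing step, the file can be streamed through the hasher in chunks so large sessions never need to fit in memory. Python's standard library has no Blake3, so this example uses the third-party blake3 package when it is installed and otherwise falls back to hashlib.blake2b purely as a stand-in so the code runs. The fallback does not produce Blake3 hashes. The name hash_file is hypothetical.

```python
try:
    from blake3 import blake3 as new_hasher  # third-party Blake3 binding, if installed
except ImportError:
    from hashlib import blake2b as new_hasher  # stand-in so the sketch runs; NOT Blake3

def hash_file(path, chunk_size=1 << 20):
    """Hash the file's raw bytes, unmodified, and return lowercase hex."""
    hasher = new_hasher()
    with open(path, "rb") as f:
        # Read fixed-size chunks until EOF; the bytes are fed exactly as stored.
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Opening the file in binary mode matters: any newline or encoding translation would change the bytes and therefore the hash.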
A branch sidecar is a small JSON file that gives a session blob its network identity and lineage:
{
  "type": "branch",
  "version": 1,
  "src": "<blake3 hash of session file>",
  "parent": "<blake3 hash of parent branch sidecar, or null>"
}
The branch sidecar is itself content-addressed:
blake3(raw_bytes_of_branch.json) → branch hash
The branch hash is the primary identifier shared on the network.
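A minimal sketch of building a branch sidecar and deriving its branch hash follows. The text does not specify how the sidecar JSON is serialized, so the compact form with fixed key order used here is an assumption. Because the hash covers the raw bytes, whatever bytes are produced are the ones that must be stored and shared. The function name make_branch_sidecar is hypothetical, and hashlib.blake2b is only a stand-in when no Blake3 binding is installed.

```python
import json

try:
    from blake3 import blake3 as new_hasher  # third-party Blake3 binding, if installed
except ImportError:
    from hashlib import blake2b as new_hasher  # stand-in so the sketch runs; NOT Blake3

def make_branch_sidecar(src_hash, parent_branch_hash=None):
    """Return (sidecar_bytes, branch_hash) for a session blob hash."""
    doc = {
        "type": "branch",
        "version": 1,
        "src": src_hash,
        "parent": parent_branch_hash,  # None serializes to JSON null for root sessions
    }
    # Assumption: compact separators and insertion-ordered keys.
    raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return raw, new_hasher(raw).hexdigest()

# A root session, then a fork whose sidecar points at the root's branch hash.
root_bytes, branch_x = make_branch_sidecar("<session blob hash A>")
fork_bytes, branch_y = make_branch_sidecar("<session blob hash B>", branch_x)
```

Re-serializing a fetched sidecar is not guaranteed to reproduce the same bytes, so a recipient would verify the hash against the bytes as received.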
branch sidecar (hash: X)            branch sidecar (hash: Y)
├── src: <session blob hash A>      ├── src: <session blob hash B>
└── parent: null                    └── parent: X
        │                                   │
        ▼                                   ▼
session.jsonl A                     session.jsonl B
(root session)                      (forked from A)
This mirrors git's object model:
| Git | This design |
|---|---|
| blob/tree | Session JSONL file (hashed verbatim) |
| commit | Branch sidecar (points to content + parent) |
To resolve a branch hash:
- Fetch the branch sidecar by its hash
- Read src → fetch the session JSONL by that hash
- Optionally read parent → fetch the parent branch sidecar, recurse
The path is uniform whether the session is a root or a fork.
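The resolution steps above can be sketched as a short loop. Network transport is not defined here, so fetch is a hypothetical callable that maps a hash to bytes. The example backs it with an in-memory dict and uses short placeholder keys instead of real hashes.

```python
import json

def resolve(branch_hash, fetch, follow_parents=True):
    """Yield (branch_hash, session_bytes) from the given branch back to the root."""
    while branch_hash is not None:
        sidecar = json.loads(fetch(branch_hash))   # step 1: fetch the sidecar
        yield branch_hash, fetch(sidecar["src"])   # step 2: fetch the session JSONL
        # Step 3 is optional: stop here unless the caller wants the lineage.
        branch_hash = sidecar["parent"] if follow_parents else None

# Hypothetical store: "X" is a root branch, "Y" is a fork of it.
store = {
    "A": b"session a bytes",
    "B": b"session b bytes",
    "X": json.dumps({"type": "branch", "version": 1, "src": "A", "parent": None}).encode(),
    "Y": json.dumps({"type": "branch", "version": 1, "src": "B", "parent": "X"}).encode(),
}
chain = list(resolve("Y", store.__getitem__))
```

A real fetch would also re-hash the returned bytes and compare against the requested hash before trusting them; that check is omitted to keep the sketch short.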
The recipient can reconstruct the expected local path from the session JSONL's first line (the header), which contains id, timestamp, and cwd. The filename convention <timestamp>_<uuid>.jsonl and directory convention --<path>--/ are derivable from these fields.
Note: cwd is the original machine's path and may not match the recipient's filesystem layout. This is accepted — we don't try to fix paths.
Reasons for choosing Blake3:
- Fast, parallelizable, streaming-friendly
- Supports incremental hashing via update()/finalize(), ideal for append-only files
- Well-supported across languages (Rust reference impl, JS/WASM bindings)
Since Pi sessions are append-only, we can avoid re-hashing the entire file on each update:
- Hash the session file using Blake3's streaming API
- Persist the hasher state to a sidecar (e.g., <session_file>.blake3state)
- When the session file grows, read only the new bytes (track the last-hashed byte offset)
- Restore the hasher state, feed the new bytes, finalize for the updated hash
Cost per update: O(new bytes), not O(file size).
This requires a Blake3 implementation that can serialize and deserialize hasher state. Support for this should be verified for each candidate implementation, including the Rust reference crate and the JS/WASM bindings, before committing to this approach.
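The offset-tracking part of this scheme can be sketched without state serialization by keeping the hasher alive in a long-running process. This is a simplification of the persisted variant described above: the state lives in memory only, so it is lost on restart. The class name IncrementalFileHash is hypothetical, and hashlib.blake2b is a stand-in when no Blake3 binding is installed.

```python
try:
    from blake3 import blake3 as new_hasher  # third-party Blake3 binding, if installed
except ImportError:
    from hashlib import blake2b as new_hasher  # stand-in so the sketch runs; NOT Blake3

class IncrementalFileHash:
    """Hash an append-only file, reading only bytes added since the last call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0             # last-hashed byte offset
        self.hasher = new_hasher()  # running state, kept in memory only

    def refresh(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)     # skip everything already absorbed
            while chunk := f.read(1 << 20):
                self.hasher.update(chunk)
                self.offset += len(chunk)
        # Reading the digest does not end the stream; later appends can still be fed.
        return self.hasher.hexdigest()
```

This only stays correct if the file is truly append-only. If earlier bytes are ever rewritten or the file shrinks, the running state is stale and the file has to be re-hashed from offset 0.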
Hashes are serialized as strings in the same format Iroh uses (lowercase hex).
A simple manifest.json mapping branch hashes to local session file paths:
{
  "abc123...": "/Users/gordon/.pi/agent/sessions/--some--path/1701234_abcd.jsonl",
  "def456...": "/Users/gordon/.pi/agent/sessions/--other--path/1701235_efgh.jsonl"
}
This is a local index for convenience. It is not content-addressed and is not distributed.
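One possible shape for the manifest manager is sketched below. The function names load_manifest and record are hypothetical, and writing through a temporary file followed by a rename is a design choice of this sketch, not something the text prescribes.

```python
import json
import os
from pathlib import Path

def load_manifest(manifest_path):
    """Return the hash -> path mapping, or an empty dict if no manifest exists yet."""
    p = Path(manifest_path)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}

def record(manifest_path, branch_hash, session_path):
    """Add or update one entry and write the manifest back to disk."""
    manifest = load_manifest(manifest_path)
    manifest[branch_hash] = str(session_path)
    tmp = Path(str(manifest_path) + ".tmp")
    tmp.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    # Rename over the old file so a crash never leaves a half-written manifest.
    os.replace(tmp, manifest_path)
    return manifest
```

Because a session file keeps growing, the same path can appear under several branch hashes over time, one per published snapshot.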
Content addressing happens on demand, not continuously. When the user wants to distribute a session:
- Hash the current session file → session blob hash
- If the session has a parentSession, look up (or compute) its branch hash
- Create the branch sidecar → hash it → branch hash
- Update the local manifest
- The branch hash is the distributable identifier
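The workflow above might be sketched end to end as follows. Several things here are assumptions: the function name publish, the sidecars dict that collects sidecar bytes (where they are written is not specified), the compact JSON serialization of the sidecar, and the choice to always recompute the parent's branch hash rather than look it up. It also assumes the parentSession path still exists on the local machine. hashlib.blake2b is a stand-in when no Blake3 binding is installed.

```python
import json

try:
    from blake3 import blake3 as new_hasher  # third-party Blake3 binding, if installed
except ImportError:
    from hashlib import blake2b as new_hasher  # stand-in so the sketch runs; NOT Blake3

def publish(session_path, manifest, sidecars):
    """Content-address one session on demand and return its branch hash."""
    with open(session_path, "rb") as f:
        raw = f.read()
    src = new_hasher(raw).hexdigest()                  # session blob hash

    header = json.loads(raw.split(b"\n", 1)[0])        # first line is the header
    parent_path = header.get("parentSession")
    # Recurse through the local parent chain; a root session has no parent.
    parent = publish(parent_path, manifest, sidecars) if parent_path else None

    doc = {"type": "branch", "version": 1, "src": src, "parent": parent}
    sidecar = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    branch = new_hasher(sidecar).hexdigest()           # branch hash

    sidecars[branch] = sidecar
    manifest[branch] = str(session_path)               # local index only
    return branch
```

Publishing a fork also publishes its ancestors, so the recipient can walk the parent chain without hitting a hash the sender never produced.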
In scope:
- Content-addressed data model (two object types: session blob + branch sidecar)
- Blake3 hashing with efficient incremental updates
- Local manifest (hash → path)
- Plan for implementation as a Pi extension or standalone CLI tool
Deferred for now:
- Network transport / p2p protocol
- Lazy fetching of parent chains
- Content-addressable store (CAS directory)
- Bao sidecar files for verified random access
- Per-entry hashing within session files
- Internal tree position (leaf/branch state)
- Path normalization across machines
- Mutable pointers ("latest version of session X")
These are all natural escalation points if the basic model proves useful.
Build as a Pi extension or standalone CLI tool (TBD). Either way, the core logic is:
- A function that takes a session file path and produces a branch sidecar + hashes
- A manifest manager that persists the hash → path mapping
- Blake3 hashing with optional state persistence for incremental updates
The implementation should be a small, focused module with no modifications to Pi.