Yorai Levi YoraiLevi

@YoraiLevi
YoraiLevi / operator-playbook.md
Created May 15, 2026 15:51
DGX Fleet & Control-Plane Operator Playbook — consolidated self-contained operator procedures across both repos (2026-05-15)

DGX Fleet & Control-Plane Operator Playbook

This is an operator playbook. It is NOT a tutorial, a postmortem, or an architecture document. It is the single self-contained reference an operator opens to execute every operator-only and operator-led procedure across both the dgx-fleet workload repo and the dgx-control-plane platform repo. Procedures are self-contained — you do NOT need to clone either repo to execute them. You DO need: Vault CLI, GitHub gh CLI, SSH access to relevant hosts, and the credentials each section calls for.

Last verified. 2026-05-15. Per-section staleness budgets noted inline; re-verify before executing a procedure if its budget has elapsed.

Provenance.

  • Findings (cited): assembled from the operator-only and operator-led runbooks in dgx-fleet/docs/runbooks/ and [dgx-control-plane/docs/runbooks/](https
@YoraiLevi
YoraiLevi / claude-code-in-containers-guide.md
Last active May 15, 2026 15:42
Running Claude Code Inside Isolated Containers with Local-Feeling Persistence — a catalog of approaches, solutions, security caveats, and a recommended recipe

Complete Step-by-Step Guide: Claude Code in an Isolated Container

What you'll have when done: Claude Code running fully inside a Docker container, using --dangerously-skip-permissions freely, with your sessions and credentials persisted on disk, your project files editable as if local, and a firewall preventing any exfiltration — all without ever risking your host system, ~/.ssh, cloud credentials, or other files you didn't explicitly share.


Before You Start: Two Paths

This guide covers two ways to do this. Pick one:

@YoraiLevi
YoraiLevi / decoupling-decision-record.md
Created May 15, 2026 15:10
Decision record: decoupling dgx-fleet from dgx-control-plane (2026-05-15)

Decision record: decoupling dgx-fleet from dgx-control-plane

This is a decision record (R-17). It is NOT a runbook, a tutorial, or a postmortem. It records why and how the two coupled Ansible repos dgx-fleet and dgx-control-plane were decoupled into independently-cloneable artifacts on 2026-05-15, and which trade-offs were taken.

Last verified. 2026-05-15. Staleness budget: 365 days (decision records age slowly).

Provenance.

  • Findings (cited): inventory of both repos on 2026-05-15 yielded ~233 cross-repo references across 32 files. Six coupling categories were identified (see §Context). The byte-equality versions.env test in test/run_tests.sh and the versions-env-sync pre-commit hook were the active coupling; the rest were path-style narrative references.
  • Synthesis: the unidirectional-interface-contract design with the producer/consumer flip for the runner-image workflow is origi
@YoraiLevi
YoraiLevi / interface-contract.md
Created May 15, 2026 14:57
Interface contract — dgx-fleet ↔ a conforming control plane (v1.0.0)

Interface contract — dgx-fleet ↔ a conforming control plane

This is an interface contract. It is NOT a runbook, a tutorial, or a postmortem. It defines the surface a control plane must expose for the dgx-fleet Ansible playbook to bootstrap and operate against it. The dgx-control-plane repo is one implementation; any other implementation that exposes the surfaces below is a conforming control plane.

Last verified. 2026-05-15.

Per-section staleness budgets (re-verify after this many days):

  • §Vault server — 180d
  • §ARC runner — 90d
  • §ESO — 90d
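The budgets listed above can be checked mechanically. The following is a minimal sketch, not part of the contract itself: the dictionary layout, the `stale_sections` helper name, and the reading of "after this many days" as strictly more than the budget are all assumptions made for illustration. Only the three budgets and the last-verified date come from the text above.

```python
from datetime import date

# Budgets in days, copied from the list above.
# The dict layout and names are hypothetical, chosen for this sketch.
STALENESS_BUDGETS = {
    "Vault server": 180,
    "ARC runner": 90,
    "ESO": 90,
}

# "Last verified" date stated in the contract header.
LAST_VERIFIED = date(2026, 5, 15)


def stale_sections(today, last_verified=LAST_VERIFIED, budgets=STALENESS_BUDGETS):
    """Return the sections whose staleness budget has elapsed as of `today`.

    Assumption: a section is stale once MORE than `budget` days have passed.
    """
    age_days = (today - last_verified).days
    return sorted(name for name, budget in budgets.items() if age_days > budget)
```

For example, `stale_sections(date(2026, 8, 14))` is 91 days after the last verification, so under these assumptions it returns `['ARC runner', 'ESO']` while the Vault server section is still within its 180-day budget.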
@YoraiLevi
YoraiLevi / project-map.md
Created May 15, 2026 13:21
Project map — dgx-fleet + dgx-control-plane infrastructure repos + six published gists + one external sub-project

Project Map — dgx-fleet + dgx-control-plane infrastructure

Genre. This is a cross-repo project map. It is NOT a tutorial, a runbook, a methodology spec, or a single-repo HANDOFF. It is the index that lets a person joining any of the involved repos discover the others.

Last verified. 2026-05-14. Per-section staleness budget: 90 days for gist URLs (verify they still resolve); 180 days for repo descriptions (refactor invalidates).

Provenance. Original synthesis produced during the May 2026 session that externalized the RFE methodology. Not derived from any prior project document.

What this is

@YoraiLevi
YoraiLevi / video-creation-prompts.md
Created May 15, 2026 13:20
Video-creation prompts for the Bootstrap series (NotebookLM + Veo + OBS) — Conundrum briefing + Handbook tutorial series + drill video

Video-Creation Prompts for the Bootstrap Series

Genre. This is a prompt library. It is NOT a video-production tutorial, a Veo/NotebookLM setup guide, or a guide to making videos in general. It is the prompts you paste into Gemini/Veo/NotebookLM/OBS to produce video versions of the published gists.

Last verified. 2026-05-14. Per-section staleness budget: 60 days for tool-specific syntax (NotebookLM/Veo evolve); 180 days for the structural arcs.

Provenance. Original synthesis produced during a single working session in May 2026 while planning video versions of the Bootstrap Conundrum book and the Bootstrap Handbook. Tool-fit notes adapted from Google's documentation on NotebookLM Video Overview and Veo 3.

What this is

@YoraiLevi
YoraiLevi / rfe-methodology-v2.1.md
Created May 15, 2026 13:10
RFE v2.1 — Research-Formalize-Educate methodology (canonical spec, 23 rules + 6 skills + lint hook + 5-part fresh-agent test)

RFE v2.1 — The Research-Formalize-Educate Methodology

Genre. This is a methodology specification. It is NOT a tutorial, a project review, or a research artifact in itself. It is the spec an operator installs into a repo so that future research / synthesis / publication runs follow a documented discipline.

Last verified. 2026-05-14. Per-section staleness budget: 180 days for rule definitions; 90 days for skill SKILL.md content (tools change); revisit cross-reference URLs every 60 days.

Provenance. Original synthesis. The methodology emerged organically over a single working session in May 2026 across ~20 substantive exchanges and was refined by running it on itself (the meta-application test in §9). The 23 rules are the load-bearing output of that work. R-1–R-6 are extracted from the original RFE v1 patterns; R-7–R-10 are formalizations of phase-discipline practices that were implicit; R-11–R-21 are RFE v2 additions documented in the [RFE v2 + notebooklm-py defensive-review gist](htt

@YoraiLevi
YoraiLevi / rfe-v2-and-notebooklm-py-review.md
Created May 13, 2026 21:19
RFE v2 + Defensive Review: notebooklm-py adoption decision and Research-Formalize-Educate methodology refined by running it on itself

RFE v2 + Defensive Review: notebooklm-py

What this is. A two-part document. Part I is a defensive adoption review of the Python package teng-lin/notebooklm-py — should you install it, under what constraints, with what blast radius. Part II is the v2 revision of our Research-Formalize-Educate methodology, refined by observing how it performed when applied to itself.

What this is not. It is not a tutorial on NotebookLM, an attack guide, or a complete supply-chain primer. The companion documents linked at the end carry the conceptual scaffolding.

Last verified. 2026-05-14. Per-section staleness budget: 90 days for tool-version specifics (uv, notebooklm-py); 180 days for the methodology rules; reverify external URLs on any modification of this document.

Provenance. Part I is original synthesis of 4 parallel research agents (1 refused). Part II is original synthesis of 1 meta-observation agent + the patterns we've used across [rookie handoff](https://gist.github.com/YoraiLevi/1

@YoraiLevi
YoraiLevi / dgx-bootstrap-handbook.md
Created May 13, 2026 20:19
The Bootstrap Handbook: A Rookie's Implementation Manual (Vault, k3s, ESO, backups, SOPS, Renovate, cosign, monitoring, drills)

The Bootstrap Handbook

A Rookie's Implementation Manual for Vault, k3s, ESO, and Friends


Front matter

Who this is for. You read the Bootstrap Conundrum book (https://gist.github.com/YoraiLevi/d788d3ecbc8545d40c41e0957683ca22), watched the video, and walked away thinking "OK, but what do I actually type into a terminal?" This is the answer. Ten implementation chapters, one tool per chapter, full file contents you can copy-paste, verification commands you can run after every step, and a literal day-by-day 30-day plan that ties it all together.

@YoraiLevi
YoraiLevi / dgx-bootstrap-conundrum-book.md
Created May 13, 2026 16:50
The Bootstrap Conundrum: A Rookie's Live Book on How Real Companies Solve the Trust Problem

The Bootstrap Conundrum

A Rookie's Live Book on How Real Companies Solve the Trust Problem


Front matter

Who this book is for. You've just been told that your team runs "infrastructure" (Vault, Kubernetes, GitHub Actions runners, whatever) and that there's a "Day 1 setup" you need to do, and then "Day 2 operations," and then "disaster recovery," and at every step you keep running into the same uncomfortable feeling: "how can I configure X if X is what I'm trying to set up?" You're not crazy. You've discovered the bootstrap problem. This book is everything our team learned from 10 weeks of reading the industry's accumulated answer to that question, distilled so you don't have to re-derive it.