@DMontgomery40
Last active February 10, 2026 05:05
Ralph audit loop: Codex CLI read-only code audit runner

Ralph Audit Loop (OpenAI Codex)

This is my Ralph "audit" loop wired up to the OpenAI Codex CLI. It runs a long-lived, autonomous, read-only code audit: the agent documents problems, but does not modify your repo.

You obviously need to tailor this to your codebase. The included prd.json is my example set of audit tasks for a specific stack/project; treat it as a template.

What’s Different Here (The Codex Part)

People know what a Ralph loop is; the part folks often miss is that you can run the loop with Codex and enforce read-only execution.

ralph.sh does this each iteration:

  • Picks the next story in prd.json where passes: false
  • Builds a prompt from that story plus CODEX.md
  • Runs codex exec in read-only mode (-s read-only)
  • (Optional) Enables web research with --search
  • Captures only the model’s final message via --output-last-message
  • Writes that markdown into the output file declared in the story acceptance criteria (the line starting with Created ...)
  • Marks the story passed and moves on
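
The selection and bookkeeping steps above can be sketched with jq. This is a minimal sketch under stated assumptions, not the actual ralph.sh: the function names are hypothetical, it assumes the lowest priority number runs first, and it assumes the output path is the second word of the acceptance criterion that starts with Created. The codex exec call itself is left out.

```shell
# Hypothetical helpers; the real ralph.sh may be structured differently.

# Print the id of the pending story with the lowest priority number.
# Prints nothing when every story already has passes: true.
next_story_id() {
  jq -r '[.userStories[] | select(.passes == false)]
         | sort_by(.priority)
         | (.[0].id // empty)' "$1"
}

# Print the report path taken from the "Created ..." acceptance criterion.
# Assumption: the path is the second space-separated word of that line.
story_output_path() {
  jq -r --arg id "$2" '
    .userStories[] | select(.id == $id)
    | .acceptanceCriteria[] | select(startswith("Created "))
    | (split(" ") | .[1])' "$1" | head -n 1
}

# Set passes to true for one story, going through a temp file so a failed
# jq run never truncates prd.json.
mark_story_passed() {
  local tmp
  tmp="$(mktemp)"
  jq --arg id "$2" \
     '(.userStories[] | select(.id == $id) | .passes) = true' "$1" > "$tmp" \
    && mv "$tmp" "$1"
}
```

A loop would call next_story_id prd.json, stop when it prints nothing, and call mark_story_passed only after the report has been written.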

Prereqs

  • codex CLI on your PATH and authenticated
  • jq (the runner uses it to read/update prd.json)
  • Bash

How to Run

These files are meant to live at .codex/ralph-audit/ in your repo (or adjust REPO_ROOT in ralph.sh).

cd .codex/ralph-audit
./ralph.sh 20

Web research is enabled by default. To disable it:

./ralph.sh 20 --no-search
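
For readers adapting the runner, the two invocations above imply a numeric iteration cap plus an optional --no-search switch. The following is a hedged sketch of how such arguments could be parsed; parse_args, MAX_ITERATIONS, SEARCH_ENABLED, and the default of 10 are all assumptions for illustration and are not taken from ralph.sh.

```shell
# Hypothetical argument parsing: a numeric argument is the iteration cap,
# and --no-search turns off the web research that is on by default.
parse_args() {
  MAX_ITERATIONS=10   # assumed default, not taken from ralph.sh
  SEARCH_ENABLED=1    # web research is enabled unless --no-search is given
  for arg in "$@"; do
    case "$arg" in
      --no-search) SEARCH_ENABLED=0 ;;
      # Anything empty or containing a non-digit is rejected.
      *[!0-9]*|'') echo "unknown argument: $arg" >&2; return 2 ;;
      *) MAX_ITERATIONS="$arg" ;;
    esac
  done
}
```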

Logs / Tailing

  • High-level progress: events.log
  • Full Codex output: run.log

tail -n 200 -f events.log
tail -n 200 -f run.log

Output

Reports land in audit/*.md (relative to .codex/ralph-audit/). The PRD’s acceptance criteria define the exact output filenames.
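
Once a few reports exist, a quick tally by severity can help decide what to read first. The sketch below assumes each finding heading follows the template in CODEX.md with the placeholder filled in, for example `### [HIGH] Finding #1: ...`; tally_findings is a hypothetical helper and is not part of the gist.

```shell
# Count finding headings per severity across every report in a directory.
# Assumes headings look like "### [HIGH] Finding #3: short description".
tally_findings() {
  local dir sev count
  dir="$1"
  for sev in CRITICAL HIGH MEDIUM LOW; do
    # grep -c prints 0 and exits non-zero when nothing matches, hence || true.
    count="$(cat "$dir"/*.md 2>/dev/null | grep -c "^### \[$sev\] Finding #" || true)"
    printf '%s: %s\n' "$sev" "$count"
  done
}
```

Run it as tally_findings audit from inside .codex/ralph-audit/.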

Customize

  • Edit prd.json to match your repo’s file layout and the audits you care about.
  • Edit CODEX.md to reflect your quality bar and safety rules.
  • Edit the model pin in ralph.sh (REQUESTED_MODEL, REASONING_EFFORT).
  • If you’re on Linux, the macOS/BSD-style sed -i '' calls in ralph.sh need adjusting: GNU sed takes sed -i with no separate empty argument.
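
One way to handle the sed difference without maintaining two copies of the script is a small wrapper. This is a sketch, assuming sed --version succeeds only on GNU-compatible sed; sed_inplace is a hypothetical name and does not appear in ralph.sh.

```shell
# Hypothetical helper: in-place sed that works with GNU sed (Linux) and
# BSD sed (macOS).
sed_inplace() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"       # GNU sed: -i takes no separate suffix argument
  else
    sed -i '' "$@"    # BSD sed: -i requires a suffix argument, here empty
  fi
}
```

Usage would look like sed_inplace 's/old/new/' some-file in place of a direct sed -i '' call.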

Ralph Audit Agent Instructions (OpenAI Codex)


Safety Notice (Customize)

If this codebase is production, handles money, or touches sensitive data: treat this audit loop as a high-risk operation. Run with least privilege, avoid exporting long-lived credentials in your shell, and keep the agent in read-only mode.


You are an autonomous CODE AUDITOR. Your ONLY job is to find problems and document them. You DO NOT fix anything.

Web Research Policy (Use When Appropriate)

This repo depends on fast-moving tools and specs. Use web research selectively to avoid outdated assumptions.

  1. Use web research when validating claims about:
  • Next.js / React / Tailwind / Vercel / Netlify behavior or deprecations (especially 2025-2026 changes)
  • MCP spec / OpenClaw / other agent frameworks (rapidly evolving)
  • 3rd-party integrations and webhooks (Stripe, Coinbase Commerce, ProxyPics, etc.)
  • Any library/API surface that likely changed since 2024
  2. Do not use web research for timeless basics (JSON, HTTP fundamentals, TypeScript syntax, etc.).
  3. Prefer primary sources (official docs, upstream GitHub repos/releases).
  4. When validating a framework/library behavior, first identify the version used in this repo (for example from package.json, lockfiles, or official config), and search against that version’s docs/release notes.
  5. When you rely on web research for a finding, include an External References section in the report with:
  • URL
  • Date accessed (today’s date is provided by the runner)

Critical Rules

  1. DO NOT FIX ANYTHING - No code changes, no edits, no patches. Documentation only.
  2. DO NOT PLAN FIXES - Don't suggest how to fix. Just document what's broken.
  3. DO NOT SKIP ANYTHING - Read every line of every file in scope. Be exhaustive.
  4. BE EXTREMELY DETAILED - Include file paths, line numbers, code snippets, severity.

Your Task

  1. Read the PRD at .codex/ralph-audit/prd.json
  2. Pick the highest priority audit task where passes: false (or use the story id provided by the runner)
  3. Read EVERY file in the scope defined for that task
  4. For each file, scan line by line looking for ALL problem types (see below)
  5. Output the full markdown report (the exact contents that should be written to the task’s target .codex/ralph-audit/audit/XX-name.md file) as your final response
  6. Do NOT modify any files (the runner persists your output and updates PRD state)
  7. End your turn (next iteration picks up next task)

Allowed Changes (Strict)

Do NOT modify any files in the repo (read-only audit). Output only.

What To Look For (EVERY TASK)

For EVERY audit task, regardless of its specific focus, look for ALL of these:

Comments and JSDoc (Use as Signal, Not Truth)

  • Pay attention to inline comments and JSDoc strings when judging intent and expected behavior.
  • Comments/JSDoc are not a source of truth (they can be stale or wrong). The code and runtime behavior are the source of truth.
  • If comments/JSDoc contradict the implementation, document the mismatch explicitly as a finding (often broken-logic, will-break, or unfinished).

Broken Logic

  • Code that doesn't do what it claims to do
  • Conditions that are always true or always false
  • Functions that return wrong values
  • Off-by-one errors
  • Null/undefined not handled
  • Race conditions
  • Infinite loops possible
  • Dead code paths that can never execute

Unfinished Features

  • TODO/FIXME/HACK/XXX comments
  • Functions that return early with placeholder
  • throw new Error('not implemented')
  • Empty function bodies
  • Commented out code blocks
  • console.log debugging left in
  • Features mentioned in comments but not coded

Code Slop

  • Copy-paste code (same code in multiple places)
  • Magic numbers without explanation
  • Unclear variable/function names
  • Functions that are way too long (>50 lines)
  • Deeply nested conditionals (>3 levels)
  • Mixed concerns in one function
  • Inconsistent patterns vs rest of codebase
  • Unused imports
  • Unused variables
  • Unused function parameters

Dead Ends

  • Functions defined but never called
  • Files that are never imported
  • Components never rendered
  • API routes that don't connect to anything
  • Types/interfaces never used
  • Exports that nothing imports

Stubs & Skeleton Code

  • Functions returning hardcoded/mock data
  • API routes returning fake responses
  • Components rendering placeholder content
  • Lorem ipsum text
  • Sample data that should be dynamic
  • // TODO: implement with empty body

Things That Will Break

  • Missing error handling on async operations
  • .single() without checking the returned error (by default Supabase returns an error, not a thrown exception, on 0 or >1 rows)
  • No try/catch around operations that can fail
  • No validation on user input
  • No auth check on protected routes
  • Promises without .catch()
  • useEffect without cleanup
  • Memory leak patterns
  • State that can get out of sync

Output Format

Output the report for the specified .codex/ralph-audit/audit/XX-name.md file (the runner writes the file; you do not) using this format:

# [Audit Name] Findings

Audit Date: [timestamp]
Files Examined: [count]
Total Findings: [count]

## Summary by Severity
- Critical: X
- High: X
- Medium: X
- Low: X

---

## Findings

### [SEVERITY] Finding #1: [Short description]

**File:** `path/to/file.ts`
**Lines:** 42-48
**Category:** [broken-logic | unfinished | slop | dead-end | stub | will-break]

**Description:**
[Detailed explanation of what's wrong]

**Code:**
```typescript
// The problematic code snippet
```

**Why this matters:** [Brief explanation of impact/risk]

### [SEVERITY] Finding #2: ...

[Continue for all findings]


Severity Levels

  • CRITICAL: Will definitely break in production. Data loss risk. Security issue.
  • HIGH: Likely to cause bugs. Major functionality broken. Poor UX.
  • MEDIUM: Could cause issues. Incomplete feature. Inconsistent behavior.
  • LOW: Code smell. Technical debt. Minor issues.

Stop Condition

After documenting ALL findings for one audit task:

  1. End your response (next iteration handles next task)
  2. The runner will persist your markdown into the target output file and mark the story as passed

If you are explicitly asked for a final completion signal (all tasks passed), output:

COMPLETE

Important Reminders

  • You are NOT here to fix code. Just document.
  • You are NOT here to suggest fixes. Just document what's broken.
  • Read EVERY FILE in scope. Don't skim.
  • Include CODE SNIPPETS for every finding.
  • Include LINE NUMBERS for every finding.
  • When in doubt, document it. Better too many findings than too few.
  • The goal is a comprehensive audit that a human can review later.
{
"project": "Deep Code Audit (Example PRD)",
"branchName": "main",
"description": "Pure audit loop - NO FIXES. Exhaustively document all broken logic, unfinished features, code slop, dead ends, stubs, skeleton code, and things that will break. Write findings to .codex/ralph-audit/audit/*.md files.",
"verificationCommands": {
"typecheck": "echo 'Audit mode - no verification needed'",
"lint": "echo 'Audit mode - no verification needed'",
"test": "echo 'Audit mode - no verification needed'"
},
"userStories": [
{
"id": "AUDIT-001",
"title": "API Routes Deep Audit",
"description": "Exhaustively audit apps/web/src/app/api/v1/ for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/01-api-routes.md with ALL findings",
"Every API route file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity (critical/high/medium/low), detailed description of the problem, code snippet showing the issue"
],
"priority": 1,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Read every file in apps/web/src/app/api/v1/**/*.ts. For each file, scan line by line looking for: TODO/FIXME/HACK comments, hardcoded values that should be config, missing error handling, incomplete implementations (functions that return early or have placeholder logic), dead code paths that can never execute, copy-paste code smell, inconsistent patterns vs other routes, missing validation, missing auth checks, race conditions, potential null/undefined crashes, promises without error handling, skeleton/stub responses, feature flags or conditionals that are always true/false, commented out code, magic numbers, unclear variable names, functions that are too long, missing edge case handling. Produce the full markdown report content for .codex/ralph-audit/audit/01-api-routes.md with extreme detail."
},
{
"id": "AUDIT-002",
"title": "Authentication & Authorization Deep Audit",
"description": "Exhaustively audit all auth-related code for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/02-auth.md with ALL findings",
"Every auth file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 2,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: middleware.ts, lib/api-auth.ts, lib/supabase/client.ts, lib/supabase/server.ts, lib/supabase/middleware.ts, app/auth/callback/route.ts, app/auth/signout/route.ts, app/(auth)/login/page.tsx, app/(auth)/signup/page.tsx, lib/session-owner-agent.ts. Look for: incomplete auth flows, missing redirect handling, race conditions in session creation, broken cookie handling, missing CSRF protection, auth checks that can be bypassed, inconsistent auth patterns, hardcoded redirects, missing error states, unclear failure modes, session handling edge cases, token refresh issues, logout that doesn't fully clear state, signup flow gaps, login flow gaps, OAuth callback edge cases. Produce the full markdown report content for .codex/ralph-audit/audit/02-auth.md."
},
{
"id": "AUDIT-003",
"title": "Database & Supabase Deep Audit",
"description": "Exhaustively audit all database interactions for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/03-database.md with ALL findings",
"Every Supabase query examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 3,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Search entire codebase for supabase.from(), .select(), .insert(), .update(), .delete(), .rpc(). Also examine packages/database/supabase/migrations/. Look for: queries that could fail silently, missing .single() error handling, N+1 query patterns, queries selecting too much data, missing indexes for common queries, RLS policies that might block legitimate access, RLS policies that might allow unauthorized access, inconsistent query patterns, hardcoded IDs, missing foreign key handling, cascade delete issues, transaction-like operations without actual transactions, race conditions in read-modify-write, stale data issues, missing optimistic updates, queries with no limit (could return huge datasets), type mismatches between code and schema, columns referenced that don't exist, tables referenced that don't exist. Produce the full markdown report content for .codex/ralph-audit/audit/03-database.md."
},
{
"id": "AUDIT-004",
"title": "React Components Deep Audit",
"description": "Exhaustively audit apps/web/src/components/ for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/04-components.md with ALL findings",
"Every component file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 4,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Read every file in apps/web/src/components/**/*.tsx. Look for: TODO/FIXME comments, placeholder text or Lorem ipsum, hardcoded strings that should be props, missing loading states, missing error states, missing empty states, useEffect without cleanup, useEffect with missing dependencies, useState that should be derived, props not being used, props with any type, event handlers that do nothing, broken onClick/onChange handlers, forms without validation, forms without error display, accessibility issues (missing aria labels, no keyboard nav), components that are too large, components with mixed concerns, CSS classes that don't exist, responsive breakpoints missing, dark mode issues, state that resets unexpectedly, infinite re-render risks, memory leak patterns, components importing things they don't use, dead code branches, commented out JSX. Produce the full markdown report content for .codex/ralph-audit/audit/04-components.md."
},
{
"id": "AUDIT-005",
"title": "Dashboard Pages Deep Audit",
"description": "Exhaustively audit apps/web/src/app/(dashboard)/ for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/05-dashboard-pages.md with ALL findings",
"Every dashboard page examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 5,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Read every file in apps/web/src/app/(dashboard)/**/*.tsx. Look for: pages that load but don't work, forms that don't submit, buttons that don't do anything, links to pages that don't exist, data that doesn't load, data that loads but doesn't display, filters that don't filter, search that doesn't search, pagination that doesn't work, tabs that don't switch, modals that don't open/close, dropdowns that don't work, broken navigation, missing breadcrumbs, inconsistent layouts, pages missing auth checks, pages showing wrong data for user, stale data after mutations, optimistic updates that don't rollback on error, missing loading skeletons, jarring layout shifts, broken responsive layouts, features half-implemented, features that look done but have broken logic underneath. Produce the full markdown report content for .codex/ralph-audit/audit/05-dashboard-pages.md."
},
{
"id": "AUDIT-006",
"title": "Public Pages Deep Audit",
"description": "Exhaustively audit public-facing pages (landing, browse, bounties, docs) for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/06-public-pages.md with ALL findings",
"Every public page examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 6,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: app/page.tsx (landing), app/browse/page.tsx, app/bounties/page.tsx, app/humans/[id]/page.tsx, app/mcp/page.tsx, app/api-docs/page.tsx. Look for: broken links, links to nonexistent pages, placeholder content, Lorem ipsum, hardcoded sample data that should be dynamic, features described but not implemented, CTAs that go nowhere, forms that don't work, search/filter that's broken, pagination issues, SEO problems (missing meta tags), social sharing issues, broken images, missing alt text, copy that's wrong or outdated, responsive layout breaks, dark mode contrast issues, animations that don't work, loading states missing, error states missing, empty states missing. Produce the full markdown report content for .codex/ralph-audit/audit/06-public-pages.md."
},
{
"id": "AUDIT-007",
"title": "Stripe Integration Deep Audit",
"description": "Exhaustively audit all Stripe/payment code for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/07-stripe.md with ALL findings",
"Every Stripe-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 7,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: lib/stripe.ts, api/v1/webhooks/stripe/route.ts, api/v1/bookings/[id]/fund-escrow/route.ts, api/v1/bookings/[id]/complete/route.ts, api/v1/humans/me/stripe-connect/route.ts, and any other files referencing stripe. Look for: webhook events not handled, webhook signature verification issues, payment states not tracked properly, refund logic missing or broken, dispute handling missing, Connect onboarding incomplete, transfer logic broken, fee calculation errors, currency handling issues, amount conversion errors (cents vs dollars), idempotency key issues, retry logic missing, error messages exposing internal details, test mode code in production paths, hardcoded Stripe IDs, missing null checks on Stripe responses, race conditions in payment status updates. Produce the full markdown report content for .codex/ralph-audit/audit/07-stripe.md."
},
{
"id": "AUDIT-008",
"title": "Booking Flow Deep Audit",
"description": "Exhaustively audit the entire booking lifecycle for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/08-booking-flow.md with ALL findings",
"Every booking-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 8,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Trace the ENTIRE booking flow: application -> acceptance -> booking creation -> escrow funding -> work in progress -> proof submission -> proof approval -> completion -> payout. Examine every API route and UI component involved. Look for: state transitions that can get stuck, states with no way out, invalid state transitions allowed, missing status checks, race conditions between status updates, proof upload that doesn't work, proof approval that doesn't release funds, auto-complete that's not scheduled, 72-hour timeout not implemented, transaction records not created, notifications not sent, conversations not created, booking detail page missing info, booking list missing bookings, status badges wrong, amounts calculated wrong, timestamps wrong. Produce the full markdown report content for .codex/ralph-audit/audit/08-booking-flow.md."
},
{
"id": "AUDIT-009",
"title": "Bounty Flow Deep Audit",
"description": "Exhaustively audit the entire bounty lifecycle for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/09-bounty-flow.md with ALL findings",
"Every bounty-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 9,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Trace the ENTIRE bounty flow: creation -> listing -> application -> acceptance/rejection -> status updates. Examine: api/v1/bounties/**, app/bounties/page.tsx, app/(dashboard)/dashboard/bounties/page.tsx. Look for: bounty creation that doesn't validate, bounties that don't appear in listing, filters that don't work, applications that don't submit, application status not updating, accept/reject not working, booking not created on accept, conversation not created, notification not sent, bounty status not updating, completed bounties still showing as open, minimum budget not enforced, skills matching broken, deadline handling broken, closed bounties accepting applications. Produce the full markdown report content for .codex/ralph-audit/audit/09-bounty-flow.md."
},
{
"id": "AUDIT-010",
"title": "Messaging & Conversations Deep Audit",
"description": "Exhaustively audit realtime messaging for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/10-messaging.md with ALL findings",
"Every messaging-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 10,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: api/v1/conversations/**, app/(dashboard)/dashboard/conversations/**, hooks or utilities for realtime. Look for: messages that don't send, messages that don't appear, realtime subscription not connecting, realtime subscription leaking (not cleaning up), messages appearing twice, messages in wrong order, unread count not updating, read status not syncing, conversation list not updating, conversation not found errors, messages from wrong sender showing, agent messages not appearing for human, human messages not appearing for agent, typing indicators broken, message timestamps wrong, long messages breaking layout, special characters breaking messages, images/attachments not working. Produce the full markdown report content for .codex/ralph-audit/audit/10-messaging.md."
},
{
"id": "AUDIT-011",
"title": "Notification System Deep Audit",
"description": "Exhaustively audit notifications for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/11-notifications.md with ALL findings",
"Every notification-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 11,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: lib/notifications.ts, lib/notifications-query.ts, api/v1/notifications/**, api/v1/agent/notifications/**, components/notifications/**. Look for: notifications not being created when they should, notification types not all implemented, notification bell not showing count, notifications not marking as read, click on notification not navigating, realtime notifications not working, old notifications not clearing, notification preferences not respected, agent notifications different from human notifications, notification content wrong or missing data, notification links broken, notification timestamps wrong. Produce the full markdown report content for .codex/ralph-audit/audit/11-notifications.md."
},
{
"id": "AUDIT-012",
"title": "Moderation System Deep Audit",
"description": "Exhaustively audit content moderation for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/12-moderation.md with ALL findings",
"Every moderation-related file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 12,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: lib/moderation/**, api/v1/moderation/**, api/v1/admin/moderation/**, app/(admin)/admin/moderation/**. Look for: moderation not running on content that needs it, moderation results not stored, moderation status not affecting visibility, admin dashboard not loading, rescan queue not working, appeal flow not implemented, banned content still visible, false positives with no override, classifier not calling external API, classifier errors not handled, moderation slowing down user actions, moderation running multiple times, admin auth not checking properly. Produce the full markdown report content for .codex/ralph-audit/audit/12-moderation.md."
},
{
"id": "AUDIT-013",
"title": "MCP Server Deep Audit",
"description": "Exhaustively audit the MCP server package for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/13-mcp-server.md with ALL findings",
"Every MCP file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 13,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: packages/analoglabor-mcp/src/index.ts, package.json, README.md, tsconfig.json. Look for: tools that don't match API endpoints, tools with wrong parameters, tools with wrong return types, error handling that swallows errors, API URL wrong or hardcoded wrong, API key not being sent, auth header format wrong, tools that are stubs (just return placeholder), tools missing entirely, README documenting tools that don't exist, README with wrong examples, package.json with wrong entry points, build not producing working output, shebang missing or wrong. Produce the full markdown report content for .codex/ralph-audit/audit/13-mcp-server.md."
},
{
"id": "AUDIT-014",
"title": "Lib/Utils Deep Audit",
"description": "Exhaustively audit apps/web/src/lib/ utilities for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/14-lib-utils.md with ALL findings",
"Every lib file examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 14,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Read every file in apps/web/src/lib/**/*.ts. Look for: functions that are stubs, functions that throw 'not implemented', functions with TODO comments, utility functions with edge case bugs, date/time handling issues, timezone bugs, currency formatting bugs, validation functions that miss cases, type conversion that can fail, parsing that can throw, regex that's wrong, functions that should be async but aren't, functions that are async but don't need to be, circular dependencies, utils that duplicate other utils, utils that are never used, utils with wrong exports. Produce the full markdown report content for .codex/ralph-audit/audit/14-lib-utils.md."
},
{
"id": "AUDIT-015",
"title": "Types & Interfaces Deep Audit",
"description": "Exhaustively audit TypeScript types for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/15-types.md with ALL findings",
"All type definitions examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 15,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: packages/database/src/types.ts (the generated Supabase types), apps/web/src/lib/**/*.ts for any type definitions. Search for: export type, export interface, type aliases. Look for: types that don't match actual data shapes, types that are any or unknown when they should be specific, types that are too narrow (literal types that should be wider), types that are too wide (any when specific shape known), types missing optional markers, types with wrong nullability, types duplicated across files, types that reference things that don't exist, types with TODO comments, types that are stubs, types that will cause runtime errors if used. Produce the full markdown report content for .codex/ralph-audit/audit/15-types.md."
},
{
"id": "AUDIT-016",
"title": "Environment & Config Deep Audit",
"description": "Exhaustively audit configuration and environment handling for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/16-config.md with ALL findings",
"All config files examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 16,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine: all process.env usage, .env.local.example, netlify.toml, next.config.js, tsconfig.json, turbo.json, package.json files, tailwind.config.ts. Look for: env vars used but not documented, env vars documented but not used, env vars with no fallback that crash on missing, NEXT_PUBLIC_ vars exposing secrets, hardcoded values that should be env vars, config that differs between dev and prod incorrectly, build config that's wrong, deploy config that's wrong, missing required config, deprecated config options. Produce the full markdown report content for .codex/ralph-audit/audit/16-config.md."
},
{
"id": "AUDIT-017",
"title": "Error Handling Deep Audit",
"description": "Exhaustively audit error handling patterns for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/17-error-handling.md with ALL findings",
"All error handling examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 17,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Search entire codebase for: try/catch blocks, .catch() calls, throw statements, console.error, error boundaries. Look for: catch blocks that swallow errors silently, catch blocks that just console.log, catch blocks that rethrow without adding context, async functions without try/catch, promises without .catch(), error messages that expose internal details, error messages that are unhelpful, errors not shown to user, errors shown to user in ugly way, network errors not handled, timeout errors not handled, validation errors not specific enough, error states in UI missing, error recovery not possible. Produce the full markdown report content for .codex/ralph-audit/audit/17-error-handling.md."
},
{
"id": "AUDIT-018",
"title": "Tests & Testing Deep Audit",
"description": "Exhaustively audit test files for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/18-tests.md with ALL findings",
"All test files examined for: broken logic, unfinished features, stubs, skeleton code, dead ends, code slop, things that will break",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 18,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Examine all files in tests/**. Look for: tests that are skipped, tests that always pass (no real assertions), tests that test implementation not behavior, tests with hardcoded values that will break, tests missing for critical paths, test files that exist but have no tests, mocks that don't match real implementation, snapshot tests that are meaningless, flaky tests, tests that depend on order, tests that depend on external services, tests that don't clean up, test utilities that are broken. Also identify critical code paths with NO test coverage. Produce the full markdown report content for .codex/ralph-audit/audit/18-tests.md."
},
{
"id": "AUDIT-019",
"title": "Dead Code & Unused Files Deep Audit",
"description": "Exhaustively find dead code and orphan files for ALL problems.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/19-dead-code.md with ALL findings",
"Entire codebase scanned for: unused exports, orphan files, commented code, unreachable code",
"Each finding has: file path, line numbers, severity, detailed description, code snippet"
],
"priority": 19,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Scan entire codebase for: files that are never imported, functions that are exported but never imported, components that are never used, types that are never used, constants that are never used, large blocks of commented out code, code after return statements, conditions that are always true or false, feature flags that are never toggled, imports that are never used, variables that are assigned but never read, parameters that are never used, catch blocks that catch but ignore. Produce the full markdown report content for .codex/ralph-audit/audit/19-dead-code.md."
},
{
"id": "AUDIT-020",
"title": "Final Consolidation & Index",
"description": "Create master index of all findings and summary statistics.",
"acceptanceCriteria": [
"Created .codex/ralph-audit/audit/00-INDEX.md with summary of all findings",
"Total counts by severity across all audit files",
"Top 10 most critical findings highlighted",
"List of all audit files with finding counts"
],
"priority": 20,
"passes": false,
"notes": "DO NOT FIX ANYTHING. Read all .codex/ralph-audit/audit/*.md files created in previous steps. Create 00-INDEX.md with: executive summary, total findings count, breakdown by severity (critical/high/medium/low), breakdown by category, list of the 10 most critical findings that need immediate attention, table of contents linking to each audit file. This is documentation only - no fixes, no plans, just consolidate what was found."
}
]
}
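The runner finds each report path by taking the second whitespace-separated word of the acceptance criterion that starts with `Created`. Here is a minimal sketch of that convention in plain Bash, assuming you also want to reject paths that would land outside the audit directory. The function name `output_path_from_criterion` and the path guard are hypothetical; neither exists in `ralph.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ralph.sh): extract the report path from a
# "Created <path> with ..." acceptance criterion and sanity-check it.
output_path_from_criterion() {
  local criterion="$1" first path rest
  # Split on whitespace: word 1 must be "Created", word 2 is the path.
  read -r first path rest <<< "$criterion"
  [[ "$first" == "Created" && -n "$path" ]] || return 1
  # Assumed policy: reports stay under the audit directory, no ".." segments.
  case "$path" in
    *..*) return 1 ;;
    .codex/ralph-audit/audit/*.md) printf '%s\n' "$path" ;;
    *) return 1 ;;
  esac
}

# Prints the path for a well-formed criterion taken from the PRD above.
output_path_from_criterion "Created .codex/ralph-audit/audit/17-error-handling.md with ALL findings"
```

If you rename the audit directory, adjust the `case` pattern to match; the guard is only as strict as that pattern.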
# Ralph Audit Progress Log (Codex)
Started: 2026-02-09
Purpose: Deep code audit - document all broken logic, unfinished features, code slop, dead ends, stubs, skeleton code, and things that will break.
Output: .codex/ralph-audit/audit/*.md files
---
## Audit Checklist
- [ ] AUDIT-001: API Routes
- [ ] AUDIT-002: Auth
- [ ] AUDIT-003: Database
- [ ] AUDIT-004: Components
- [ ] AUDIT-005: Dashboard Pages
- [ ] AUDIT-006: Public Pages
- [ ] AUDIT-007: Stripe
- [ ] AUDIT-008: Booking Flow
- [ ] AUDIT-009: Bounty Flow
- [ ] AUDIT-010: Messaging
- [ ] AUDIT-011: Notifications
- [ ] AUDIT-012: Moderation
- [ ] AUDIT-013: MCP Server
- [ ] AUDIT-014: Lib/Utils
- [ ] AUDIT-015: Types
- [ ] AUDIT-016: Config
- [ ] AUDIT-017: Error Handling
- [ ] AUDIT-018: Tests
- [ ] AUDIT-019: Dead Code
- [ ] AUDIT-020: Index/Summary
---
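When a story passes, the runner flips its checklist entry from `- [ ]` to `- [x]`. A rough sketch of doing that without `sed -i`, whose syntax differs between GNU and BSD sed. The helper name `check_off` and the throwaway demo file are illustrative assumptions, not part of the runner.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mark one checklist entry as done by writing through a
# temp file, which avoids the GNU/BSD difference in `sed -i`.
check_off() {
  local story_id="$1" file="$2" tmp
  tmp="$(mktemp)" || return 1
  # Rewrite "- [ ] <id>:" to "- [x] <id>:"; every other line passes through.
  sed "s|^- \[ ] ${story_id}:|- [x] ${story_id}:|" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demo on a throwaway file holding two checklist lines.
demo="$(mktemp)"
printf '%s\n' '- [ ] AUDIT-017: Error Handling' '- [ ] AUDIT-018: Tests' > "$demo"
check_off "AUDIT-017" "$demo"
cat "$demo"
```

Only the `AUDIT-017` line changes; the `AUDIT-018` line is left unchecked.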
#!/bin/bash
# Ralph Audit Loop (OpenAI Codex) - Long-running autonomous *read-only* audit loop.
# Usage: ./ralph.sh [max_iterations] [--skip-security-check] [--search|--no-search]
#
# Writes all artifacts under `.codex/ralph-audit/` (PRD state, logs, and audit reports).
set -euo pipefail
MAX_ITERATIONS=20
MAX_ATTEMPTS_PER_STORY="${MAX_ATTEMPTS_PER_STORY:-5}"
SKIP_SECURITY="${SKIP_SECURITY_CHECK:-false}"
ENABLE_SEARCH="true"
TAIL_N="${TAIL_N:-200}"
while [[ $# -gt 0 ]]; do
case $1 in
--skip-security-check)
SKIP_SECURITY="true"
shift
;;
--search)
ENABLE_SEARCH="true"
shift
;;
--no-search)
ENABLE_SEARCH="false"
shift
;;
*)
if [[ "$1" =~ ^[0-9]+$ ]]; then
MAX_ITERATIONS="$1"
else
echo "WARNING: ignoring unrecognized argument: $1" >&2
fi
shift
;;
esac
done
if [[ "$SKIP_SECURITY" != "true" ]]; then
echo ""
echo "==============================================================="
echo " Security Pre-Flight Check"
echo "==============================================================="
echo ""
SECURITY_WARNINGS=()
if [[ -n "${AWS_ACCESS_KEY_ID:-}" ]]; then
SECURITY_WARNINGS+=("AWS_ACCESS_KEY_ID is set - production credentials may be exposed")
fi
if [[ -n "${DATABASE_URL:-}" ]]; then
SECURITY_WARNINGS+=("DATABASE_URL is set - database credentials may be exposed")
fi
if [[ ${#SECURITY_WARNINGS[@]} -gt 0 ]]; then
echo "WARNING: Potential credential exposure detected:"
echo ""
for warning in "${SECURITY_WARNINGS[@]}"; do
echo " - $warning"
done
echo ""
echo "Running an autonomous agent with these credentials set could expose"
echo "them in logs, commit messages, or API calls."
echo ""
echo "See your repo's security docs for sandboxing guidance."
echo ""
read -p "Continue anyway? (y/N) " -n 1 -r
echo ""
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "Aborted. Unset credentials or use --skip-security-check to bypass."
exit 1
fi
else
echo "No credential exposure risks detected."
fi
echo ""
fi
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
PRD_FILE="$SCRIPT_DIR/prd.json"
RUN_LOG="$SCRIPT_DIR/run.log"
EVENT_LOG="$SCRIPT_DIR/events.log"
MODEL_CHECK_LOG="$SCRIPT_DIR/.model-check.log"
PROGRESS_FILE="$SCRIPT_DIR/progress.txt"
mkdir -p "$SCRIPT_DIR/audit"
ATTEMPTS_FILE="$SCRIPT_DIR/.story-attempts"
LAST_STORY_FILE="$SCRIPT_DIR/.last-story"
if [ ! -f "$ATTEMPTS_FILE" ]; then
echo "{}" > "$ATTEMPTS_FILE"
fi
get_current_story() {
if [ -f "$PRD_FILE" ]; then
jq -r '.userStories[] | select(.passes == false) | .id' "$PRD_FILE" 2>/dev/null | head -1
fi
}
get_story_attempts() {
local story_id="$1"
jq -r --arg id "$story_id" '.[$id] // 0' "$ATTEMPTS_FILE" 2>/dev/null || echo "0"
}
increment_story_attempts() {
local story_id="$1"
local current
current=$(get_story_attempts "$story_id")
local new_count=$((current + 1))
jq --arg id "$story_id" --argjson count "$new_count" '.[$id] = $count' "$ATTEMPTS_FILE" > "$ATTEMPTS_FILE.tmp" \
&& mv "$ATTEMPTS_FILE.tmp" "$ATTEMPTS_FILE"
echo "$new_count"
}
mark_story_skipped() {
local story_id="$1"
local max_attempts="$2"
local note="Skipped: exceeded $max_attempts attempts without passing"
jq --arg id "$story_id" --arg note "$note" '
.userStories = [
.userStories[]
| if .id == $id then
(.notes = $note) | (.passes = true) | (.skipped = true)
else
.
end
]
' "$PRD_FILE" > "$PRD_FILE.tmp" && mv "$PRD_FILE.tmp" "$PRD_FILE"
echo "Circuit breaker: Marked story $story_id as skipped after $max_attempts attempts"
}
check_circuit_breaker() {
local story_id="$1"
local attempts
attempts=$(get_story_attempts "$story_id")
if [ "$attempts" -ge "$MAX_ATTEMPTS_PER_STORY" ]; then
echo "Circuit breaker: Story $story_id has reached max attempts ($attempts/$MAX_ATTEMPTS_PER_STORY)"
mark_story_skipped "$story_id" "$MAX_ATTEMPTS_PER_STORY"
return 0
fi
return 1
}
ts() {
date '+%Y-%m-%dT%H:%M:%S%z'
}
log_event() {
echo "[$(ts)] $*" >> "$EVENT_LOG"
}
get_story_title() {
local story_id="$1"
jq -r --arg id "$story_id" '.userStories[] | select(.id == $id) | .title' "$PRD_FILE" 2>/dev/null || true
}
get_story_description() {
local story_id="$1"
jq -r --arg id "$story_id" '.userStories[] | select(.id == $id) | .description' "$PRD_FILE" 2>/dev/null || true
}
get_story_notes() {
local story_id="$1"
jq -r --arg id "$story_id" '.userStories[] | select(.id == $id) | (.notes // "")' "$PRD_FILE" 2>/dev/null || true
}
get_story_output_relpath() {
local story_id="$1"
# Extract the target output file from acceptance criteria, e.g.:
# "Created .codex/ralph-audit/audit/01-api-routes.md with ALL findings"
jq -r --arg id "$story_id" '
.userStories[]
| select(.id == $id)
| .acceptanceCriteria[]
| select(test("^Created "))
| split(" ")[1]
' "$PRD_FILE" 2>/dev/null | head -n 1
}
mark_story_passed() {
local story_id="$1"
jq --arg id "$story_id" '
.userStories = [
.userStories[]
| if .id == $id then
(.passes = true)
else
.
end
]
' "$PRD_FILE" > "$PRD_FILE.tmp" && mv "$PRD_FILE.tmp" "$PRD_FILE"
}
mark_progress_checked() {
local story_id="$1"
if [ ! -f "$PROGRESS_FILE" ]; then
return 0
fi
# Replace: - [ ] AUDIT-001: ... -> - [x] AUDIT-001: ...
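# Write through a temp file: `sed -i ''` is BSD/macOS syntax and fails under GNU sed.
sed "s|^- \\[ \\] ${story_id}:|- [x] ${story_id}:|g" "$PROGRESS_FILE" > "$PROGRESS_FILE.tmp" \
&& mv "$PROGRESS_FILE.tmp" "$PROGRESS_FILE" || true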
}
# Pinned by default. Adjust as needed for your Codex access and preference.
REQUESTED_MODEL="gpt-5.2"
REASONING_EFFORT="high"
if [[ -n "${CODEX_MODEL:-}" && "${CODEX_MODEL}" != "$REQUESTED_MODEL" ]]; then
echo "ERROR: This loop is pinned to CODEX_MODEL=$REQUESTED_MODEL. Unset CODEX_MODEL to continue."
exit 1
fi
if [[ -n "${CODEX_REASONING_EFFORT:-}" && "${CODEX_REASONING_EFFORT}" != "$REASONING_EFFORT" ]]; then
echo "ERROR: This loop is pinned to CODEX_REASONING_EFFORT=$REASONING_EFFORT. Unset CODEX_REASONING_EFFORT to continue."
exit 1
fi
touch "$RUN_LOG" "$EVENT_LOG"
echo "Starting Ralph Audit (OpenAI Codex)"
echo " Max iterations: $MAX_ITERATIONS"
echo " Max attempts per story: $MAX_ATTEMPTS_PER_STORY"
echo " Model: $REQUESTED_MODEL (reasoning_effort=$REASONING_EFFORT)"
echo " Logs:"
echo " - events: $EVENT_LOG"
echo " - full: $RUN_LOG"
echo " Tail:"
echo " tail -n $TAIL_N -f $EVENT_LOG"
echo " tail -n $TAIL_N -f $RUN_LOG"
log_event "RUN START max_iterations=$MAX_ITERATIONS max_attempts_per_story=$MAX_ATTEMPTS_PER_STORY search=$ENABLE_SEARCH model=$REQUESTED_MODEL reasoning_effort=$REASONING_EFFORT"
# Preflight: verify the requested model works for current Codex auth.
MODEL_CHECK_CMD=(
codex
-a never
exec
-C "$REPO_ROOT"
-m "$REQUESTED_MODEL"
-c "model_reasoning_effort=\"$REASONING_EFFORT\""
-s read-only
"Respond with exactly: OK"
)
if ! "${MODEL_CHECK_CMD[@]}" > "$MODEL_CHECK_LOG" 2>&1; then
echo "ERROR: Model preflight failed for '$REQUESTED_MODEL'. See: $MODEL_CHECK_LOG"
echo "To fix, re-authenticate with an API key that has access to this model:"
echo " printenv OPENAI_API_KEY | codex login --with-api-key"
exit 1
fi
CODEX_ARGS=(
-a never
)
if [[ "$ENABLE_SEARCH" == "true" ]]; then
CODEX_ARGS+=(--search)
fi
CODEX_ARGS+=(
exec
-C "$REPO_ROOT"
-m "$REQUESTED_MODEL"
-c "model_reasoning_effort=\"$REASONING_EFFORT\""
-s read-only
)
for i in $(seq 1 "$MAX_ITERATIONS"); do
echo ""
echo "==============================================================="
echo " Ralph Audit Iteration $i of $MAX_ITERATIONS"
echo "==============================================================="
echo "" >> "$RUN_LOG"
echo "===============================================================" >> "$RUN_LOG"
echo "Ralph Audit Iteration $i of $MAX_ITERATIONS - $(date)" >> "$RUN_LOG"
echo "===============================================================" >> "$RUN_LOG"
log_event "ITERATION START $i/$MAX_ITERATIONS"
CURRENT_STORY=$(get_current_story)
if [ -z "$CURRENT_STORY" ]; then
log_event "RUN COMPLETE (no incomplete stories)"
echo "No incomplete stories found."
echo ""
echo "Ralph audit completed all tasks!"
echo "<promise>COMPLETE</promise>"
exit 0
fi
LAST_STORY=""
if [ -f "$LAST_STORY_FILE" ]; then
LAST_STORY=$(cat "$LAST_STORY_FILE" 2>/dev/null || echo "")
fi
if [ "$CURRENT_STORY" == "$LAST_STORY" ]; then
echo "Consecutive attempt on story: $CURRENT_STORY"
ATTEMPTS=$(increment_story_attempts "$CURRENT_STORY")
echo "Attempts on $CURRENT_STORY: $ATTEMPTS/$MAX_ATTEMPTS_PER_STORY"
if check_circuit_breaker "$CURRENT_STORY"; then
echo "Skipping to next story..."
echo "$CURRENT_STORY" > "$LAST_STORY_FILE"
sleep 1
continue
fi
else
ATTEMPTS=$(increment_story_attempts "$CURRENT_STORY")
echo "Starting story: $CURRENT_STORY (attempt $ATTEMPTS/$MAX_ATTEMPTS_PER_STORY)"
fi
echo "$CURRENT_STORY" > "$LAST_STORY_FILE"
STORY_TITLE="$(get_story_title "$CURRENT_STORY")"
STORY_DESC="$(get_story_description "$CURRENT_STORY")"
STORY_NOTES="$(get_story_notes "$CURRENT_STORY")"
OUT_REL="$(get_story_output_relpath "$CURRENT_STORY")"
if [ -z "$OUT_REL" ] || [ "$OUT_REL" == "null" ]; then
log_event "ERROR story=$CURRENT_STORY could-not-determine-output-path"
echo "ERROR: Could not determine output path for story $CURRENT_STORY from prd.json acceptanceCriteria."
exit 1
fi
OUT_FILE="$REPO_ROOT/$OUT_REL"
mkdir -p "$(dirname "$OUT_FILE")"
log_event "STORY START id=$CURRENT_STORY attempt=$ATTEMPTS out=$OUT_REL title=$(printf '%s' "$STORY_TITLE" | tr '\n' ' ')"
PROMPT_FILE="$SCRIPT_DIR/.prompt.md"
LAST_MESSAGE_FILE="$SCRIPT_DIR/.last-message.md"
{
printf "# Ralph Audit (OpenAI Codex)\n\n"
printf "Today's date: %s\n\n" "$(date +%Y-%m-%d)"
printf "Current story: %s — %s\n" "$CURRENT_STORY" "$STORY_TITLE"
printf "Target output file (relative to repo root): %s\n\n" "$OUT_REL"
printf "Hard requirements:\n"
printf "- Do NOT modify any files in the repo.\n"
printf "- Your final response MUST be ONLY the markdown report contents for %s.\n" "$OUT_REL"
printf " Do not include any extra commentary.\n\n"
printf "Story description:\n%s\n\n" "$STORY_DESC"
printf "Story notes:\n%s\n\n" "$STORY_NOTES"
printf "---\n\n"
cat "$SCRIPT_DIR/CODEX.md"
} > "$PROMPT_FILE"
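# Run Codex read-only; persist the model's last message to a file we control.
# Remove any previous message first so a failed run cannot reuse the prior story's report.
rm -f "$LAST_MESSAGE_FILE"
codex "${CODEX_ARGS[@]}" --output-last-message "$LAST_MESSAGE_FILE" < "$PROMPT_FILE" 2>&1 | tee -a "$RUN_LOG" || true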
if [ ! -s "$LAST_MESSAGE_FILE" ]; then
log_event "ERROR story=$CURRENT_STORY codex-empty-last-message"
echo "ERROR: Codex did not produce a last message file (or it was empty). See: $RUN_LOG"
echo "Iteration $i complete (failed). Continuing..."
sleep 2
continue
fi
# Persist the audit report and mark story passed in PRD state.
cat "$LAST_MESSAGE_FILE" > "$OUT_FILE"
mark_story_passed "$CURRENT_STORY"
mark_progress_checked "$CURRENT_STORY"
log_event "STORY COMPLETE id=$CURRENT_STORY wrote=$OUT_REL bytes=$(wc -c < "$OUT_FILE" | tr -d ' ')"
REMAINING=$(jq -r '.userStories[] | select(.passes == false) | .id' "$PRD_FILE" 2>/dev/null | head -n 1 || true)
if [ -z "$REMAINING" ]; then
log_event "RUN COMPLETE (all stories passed)"
echo ""
echo "All audit tasks are marked passes:true."
echo "Ralph audit completed all tasks!"
echo "<promise>COMPLETE</promise>"
exit 0
fi
echo "Iteration $i complete. Continuing..."
sleep 2
done
echo ""
echo "Ralph reached max iterations ($MAX_ITERATIONS) without completing all tasks."
echo "Tail log: tail -f $RUN_LOG"
log_event "RUN STOPPED (reached max iterations without completing all tasks)"
exit 1
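After a long run it can help to tally outcomes from `events.log`. The sketch below counts the `STORY COMPLETE` and `ERROR story=` markers that `log_event` writes in the script above. The function name `summarize_events` and the two sample log lines are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count completed and failed stories in an events.log
# written by ralph.sh's log_event.
summarize_events() {
  local log="$1" completed failed
  [ -f "$log" ] || return 1
  # grep -c prints 0 but exits non-zero when nothing matches; "|| true" keeps
  # that from aborting callers that run under `set -e`.
  completed="$(grep -c ' STORY COMPLETE ' "$log" || true)"
  failed="$(grep -c ' ERROR story=' "$log" || true)"
  printf 'completed=%s failed=%s\n' "$completed" "$failed"
}

# Demo with two synthetic lines in the same shape log_event produces.
demo_log="$(mktemp)"
printf '%s\n' \
  '[2026-02-09T10:00:00+0000] STORY COMPLETE id=AUDIT-017 wrote=.codex/ralph-audit/audit/17-error-handling.md bytes=2048' \
  '[2026-02-09T10:05:00+0000] ERROR story=AUDIT-018 codex-empty-last-message' \
  > "$demo_log"
summarize_events "$demo_log"
```

A failed attempt that later succeeds still counts once under `failed`, so treat the numbers as attempt outcomes rather than final story status; `prd.json` remains the source of truth for that.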