Scripting Copilot: Piping Hot Intelligence

Enterprise CI/CD Edition - Microsoft AI Dev Days (15-20 min)

The Concept

"Piping Hot Intelligence" for the enterprise:

Piping = Unix pipes (|) chaining commands and AI
Hot = Fresh, production-ready, straight from the pipeline

Target Audience: DevOps engineers, platform teams, security-conscious enterprises

ACT 1: THE HOOK (1-2 minutes)

Opening: The Mystery File

Slide: Terminal with a simple command

Say this:

"I want to show you something. Watch carefully."

$ ma task.copilot.md

Pause. The terminal comes alive.

"That markdown file just ran an AI agent. It read my codebase. It analyzed my changes. It wrote a report. But here's the question..."

Reveal the punchline:

"What if that markdown file lived in your CI/CD pipeline? What if EVERY pull request got this treatment, automatically?"

Show the real magic:

$ git diff main..HEAD | copilot -p "Review this PR for security issues" -s

"Data in. Intelligence out. The fifty-year-old pipe operator now carries something new. Let me show you how to productionize it."

ACT 2: MAIN CONTENT (12-15 minutes)

Section 2.1: The Enterprise Equation (2 min)

The old way:

Developer writes code → Manual review → Hope for the best → Ship it

The new way:

Developer writes code → AI pre-review → Human review → AI changelog → Ship with confidence

The equation:

Production AI = Instructions + Tools + Permissions + Audit Trail

Key insight:

"In enterprise, you don't just run AI. You run AI you can explain, audit, and control."

Section 2.2: Copilot CLI - The Enterprise Workhorse (6-8 min)

The Core Flags for CI/CD

# Non-interactive mode (REQUIRED for pipelines)
copilot -p "prompt" -s

# Model selection for cost control
copilot -p "..." --model claude-haiku-4.5      # $0.25/M tokens - triage
copilot -p "..." --model claude-sonnet-4       # $3/M tokens - daily work
copilot -p "..." --model claude-opus-4.5       # $15/M tokens - deep analysis

# Permission boundaries (CRITICAL for security)
copilot -p "..." --allow-tool 'shell(git:*)'   # Git operations only
copilot -p "..." --allow-tool 'Read'           # Read-only access
copilot -p "..." --allow-tool 'shell(npm:run)' # Specific npm command

Why -s (silent) matters:

"Silent mode strips the interactive decorations. What's left is pure text output that can be captured, parsed, and piped. This is what makes Copilot CI/CD-ready."

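Because the silent output is plain text, a pipeline step can capture it and check it before anything downstream consumes it. The sketch below is one way to do that, not a feature of the CLI: `capture_review` is a hypothetical helper, and `cat` stands in for the real `copilot -p "..." -s` call so the example runs without the CLI installed.

```shell
#!/bin/sh
# capture_review CMD [ARGS...]
# Runs CMD with stdin passed through, captures its text output, and fails
# if the command fails or prints only whitespace.
# Hypothetical helper, not part of the Copilot CLI.
capture_review() {
  review_output=$("$@") || return 1
  if [ -z "$(printf '%s' "$review_output" | tr -d '[:space:]')" ]; then
    echo "error: review step produced no output" >&2
    return 1
  fi
  printf '%s\n' "$review_output"
}

# Stand-in for: git diff main..HEAD | capture_review copilot -p "..." -s
printf 'diff --git a/app.js b/app.js\n' | capture_review cat
```

Failing on empty output keeps a blank report from being posted as if the review had passed.
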
DEMO 1: PR Review in 30 Seconds

Slide: GitHub PR interface

# In your CI pipeline
git diff main..HEAD | copilot -p "Review for:
1. Security vulnerabilities
2. Breaking API changes
3. Missing error handling

Format as markdown checklist." -s > pr-review.md

Output:

## PR Review

### Security
- [ ] Line 47: SQL query uses string concatenation - potential injection
- [x] Auth tokens properly scoped

### Breaking Changes
- [ ] `getUserById` signature changed - update callers

### Error Handling
- [ ] `fetchData()` missing try/catch

"That runs in 3 seconds. Every PR. Automatically. Before a human even looks at it."

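A checklist in this shape can also gate the build. The following is a minimal sketch, assuming the review keeps the `### Section` headings and `- [ ]` checkbox format shown in the demo output; `count_open_items` and `sample_review` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# count_open_items SECTION
# Reads a markdown review on stdin and prints how many unchecked "- [ ]"
# items appear under the "### SECTION" heading. Hypothetical helper.
count_open_items() {
  awk -v section="$1" '
    /^### /                  { in_section = ($0 == "### " section); next }
    in_section && /^- \[ \]/ { n_open++ }
    END                      { print n_open + 0 }
  '
}

# Sample input copied from the demo output above.
sample_review() {
  cat <<'EOF'
## PR Review

### Security
- [ ] Line 47: SQL query uses string concatenation - potential injection
- [x] Auth tokens properly scoped

### Breaking Changes
- [ ] `getUserById` signature changed - update callers
EOF
}

open_security=$(sample_review | count_open_items "Security")
echo "open security items: $open_security"
if [ "$open_security" -gt 0 ]; then
  echo "would block merge until security items are resolved"
fi
```

In a real pipeline the last branch would `exit 1` instead of printing. Model output is not guaranteed to follow the format every time, so treating a missing section as a failure is probably the safer default.
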
DEMO 2: The Escalation Pattern

Slide: Traffic light: Green (Haiku), Yellow (Sonnet), Red (Opus)

#!/bin/bash
# escalate-review.sh - Smart model selection

DIFF=$(git diff main..HEAD)
# Avoid the name LINES: bash uses it for the terminal height
DIFF_LINES=$(printf '%s\n' "$DIFF" | wc -l)

if [ "$DIFF_LINES" -lt 50 ]; then
  # Small change: fast model
  printf '%s\n' "$DIFF" | copilot -p "Quick review" -s --model claude-haiku-4.5
elif [ "$DIFF_LINES" -lt 500 ]; then
  # Medium change: balanced model
  printf '%s\n' "$DIFF" | copilot -p "Thorough review" -s --model claude-sonnet-4
else
  # Large change: deep analysis
  printf '%s\n' "$DIFF" | copilot -p "Architecture review" -s --model claude-opus-4.5
fi

The math:

PR Size	Model	Cost	Time
< 50 lines	Haiku	~$0.01	~2s
50-500 lines	Sonnet	~$0.05	~5s
500+ lines	Opus	~$0.50	~15s

"Haiku handles 80% of PRs. Opus handles the 5% that matter. Your CI budget thanks you."

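The budget claim can be checked with a little arithmetic. This sketch takes the per-PR costs from the table and the 80% / 5% shares from the quote; the 15% Sonnet share is an assumption (the remainder), and `blended_cost` is a hypothetical helper.

```shell
#!/bin/sh
# blended_cost H_SHARE H_COST S_SHARE S_COST O_SHARE O_COST
# Prints the expected cost per PR: the sum of share * cost over the three tiers.
blended_cost() {
  LC_ALL=C awk -v hs="$1" -v hc="$2" -v ss="$3" -v sc="$4" -v os="$5" -v oc="$6" \
    'BEGIN { printf "%.4f\n", hs * hc + ss * sc + os * oc }'
}

# Haiku 80% at ~$0.01, Sonnet 15% (assumed remainder) at ~$0.05, Opus 5% at ~$0.50
blended_cost 0.80 0.01 0.15 0.05 0.05 0.50
```

Under those assumptions the blended cost comes to about $0.04 per PR, against roughly $0.50 if every PR went to Opus.
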
DEMO 3: Permission Boundaries - The Security Layer

Slide: Vault door with "PERMISSIONS" label

# LOCKED DOWN: Analysis only (no execution, no writes)
copilot -p "Analyze this code" -s \
  --allow-tool 'Read'

# CONTROLLED: Git operations only
copilot -p "Generate commit message" -s \
  --allow-tool 'shell(git:*)' \
  --allow-tool 'Read'

# SCOPED EXECUTION: Specific npm commands only
copilot -p "Run tests and fix failures" -s \
  --allow-tool 'shell(npm:test)' \
  --allow-tool 'shell(npm:run lint)' \
  --allow-tool 'Read' \
  --allow-tool 'Edit'

# NEVER IN CI: Full autonomy (interactive only!)
copilot -p "..." --allow-all-tools  # Developer machines only

The trust matrix:

Environment	Permission Level	`--allow-tool` Pattern
CI Read	Audit only	`Read`
CI Write	Controlled	`Read`, `Edit`, `shell(git:*)`
Local Dev	Extended	`shell(npm:*)`, `shell(pnpm:*)`
Supervised	Full	`--allow-all-tools` (with human)

"In production, permissions aren't optional. They're your audit trail. They're your compliance story."

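One way to keep the trust matrix from drifting is to encode it once and have every pipeline call through it. The wrapper below is a sketch under that assumption: `run_agent`, the profile names `ci-read` and `ci-write`, and the `AGENT_CMD` override are all hypothetical. Only the two CI rows are mapped, and `--allow-all-tools` is deliberately left out.

```shell
#!/bin/sh
# run_agent ENVIRONMENT [COPILOT ARGS...]
# Appends the --allow-tool flags for ENVIRONMENT and refuses unknown ones.
# AGENT_CMD is a hypothetical override used here for a dry run; it defaults
# to copilot.
run_agent() {
  agent_env=$1
  shift
  case "$agent_env" in
    ci-read)
      set -- "$@" --allow-tool 'Read' ;;
    ci-write)
      set -- "$@" --allow-tool 'Read' --allow-tool 'Edit' --allow-tool 'shell(git:*)' ;;
    *)
      echo "refusing to run: no permission profile for '$agent_env'" >&2
      return 1 ;;
  esac
  "${AGENT_CMD:-copilot}" "$@"
}

# Dry run: print the final argument list instead of calling the CLI.
print_args() { printf '%s\n' "$@"; }
AGENT_CMD=print_args
run_agent ci-write -p "Generate commit message" -s
```

Unknown environments fail closed, so a new pipeline cannot run an agent until someone adds a profile for it in a reviewed change.
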
Section 2.3: Enterprise Patterns (4-5 min)

Pattern 1: Automated Changelog Generation

# In CI/CD after merge to main
git log --oneline $(git describe --tags --abbrev=0)..HEAD | \
  copilot -p "Generate changelog:
  - Group by: Features, Fixes, Breaking Changes
  - Format: Keep a Changelog style
  - Highlight security-relevant changes" -s >> CHANGELOG.md

Output:

## [Unreleased]

### Added
- User profile API endpoint (#142)
- OAuth2 refresh token support (#147)

### Fixed
- Memory leak in connection pool (#144)

### Security
- Updated bcrypt to v5.1 (CVE-2024-xxxx) (#148)

### Breaking
- `getUser()` now returns Promise (#145)

Pattern 2: Test Failure Analysis

# In CI on test failure
npm test 2>&1 | \
  copilot -p "Analyze these test failures:
  1. Identify root cause
  2. Check if related to recent changes
  3. Suggest fix or workaround

Recent changes:
$(git log --oneline -5)" -s --model claude-sonnet-4

Why this matters:

"Junior devs get senior-level failure analysis. On every build. At 3 AM."

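One caveat when wiring this into CI: in a plain pipe, the step's exit status is that of the last command unless `pipefail` is set, so a failing test run can be reported as a success. A minimal sketch that keeps the original status and only calls the analysis on failure follows; `analyze_on_failure` is a hypothetical helper, and the demo uses stand-in commands rather than `npm test` and `copilot`.

```shell
#!/bin/sh
# analyze_on_failure TEST_CMD ANALYZE_CMD
# Runs TEST_CMD, and only if it fails feeds its combined output to ANALYZE_CMD.
# Always returns TEST_CMD's exit status so CI still sees the failure.
analyze_on_failure() {
  test_log=$(mktemp)
  sh -c "$1" >"$test_log" 2>&1
  test_status=$?
  if [ "$test_status" -ne 0 ]; then
    sh -c "$2" <"$test_log"
  fi
  rm -f "$test_log"
  return "$test_status"
}

# Real use might look like:
#   analyze_on_failure 'npm test' 'copilot -p "Analyze these test failures" -s'
# Offline demo with stand-in commands:
analyze_on_failure 'echo "1 failing"; exit 1' 'sed "s/^/to analyze: /"' \
  || echo "tests failed with status $?"
```

Skipping the analysis on green builds also avoids paying for a model call when there is nothing to explain.
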
Pattern 3: Security Scan Triage

# After security scanner runs
cat security-scan.json | \
  copilot -p "Triage these findings:
  - CRITICAL: Must fix before merge
  - HIGH: Fix within sprint
  - MEDIUM: Track in backlog
  - LOW/FALSE POSITIVE: Explain why

Context: This is a Node.js API service." -s \
  --model claude-sonnet-4 > security-triage.md

Pattern 4: The Multi-Stage Pipeline

# .github/workflows/ai-review.yml
name: AI-Assisted Review

on: [pull_request]

jobs:
  ai-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # needed by the Post Comment step
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Quick Triage (Haiku)
        run: |
          git diff origin/main..HEAD | \
          copilot -p "Classify this change as SIMPLE, MODERATE, or COMPLEX. Respond with ONLY one word." -s \
          --model claude-haiku-4.5 > complexity.txt

      - name: Full Review (Model based on complexity)
        run: |
          COMPLEXITY=$(tr -d '[:space:]' < complexity.txt)
          if [ "$COMPLEXITY" = "COMPLEX" ]; then
            MODEL="claude-opus-4.5"
          else
            MODEL="claude-sonnet-4"
          fi
          git diff origin/main..HEAD | \
          copilot -p "Security and architecture review" -s \
          --model "$MODEL" > review.md

      - name: Post Comment
        uses: actions/github-script@v7
        with:
          script: |
            const review = require('fs').readFileSync('review.md', 'utf8');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: review
            });
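
The routing step depends on the triage model answering with exactly one word, which a model may not always do. The sketch below normalizes the answer before the `[ "$COMPLEXITY" = "COMPLEX" ]` style comparison; `normalize_complexity` and `model_for` are hypothetical helpers, and falling back to MODERATE for unexpected answers is an assumption that mirrors the workflow's else branch.

```shell
#!/bin/sh
# normalize_complexity ANSWER
# Uppercases ANSWER, drops everything except letters, and prints one of
# SIMPLE, MODERATE, or COMPLEX. Anything unexpected becomes MODERATE.
normalize_complexity() {
  answer=$(printf '%s' "$1" | LC_ALL=C tr '[:lower:]' '[:upper:]' | LC_ALL=C tr -cd 'A-Z')
  case "$answer" in
    SIMPLE|MODERATE|COMPLEX) printf '%s\n' "$answer" ;;
    *) printf 'MODERATE\n' ;;
  esac
}

# model_for ANSWER
# Maps the normalized answer to the model names used in the workflow.
model_for() {
  case "$(normalize_complexity "$1")" in
    COMPLEX) printf 'claude-opus-4.5\n' ;;
    *)       printf 'claude-sonnet-4\n' ;;
  esac
}

model_for ' complex.
'
model_for 'I think this is simple'
```

The first call tolerates stray whitespace and punctuation. The second shows a full-sentence answer being treated as unexpected rather than guessed at.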

ACT 3: THE RESOLUTION (3-4 minutes)

The Problem: Pipelines as Code, But Not Prompts

"We version our infrastructure. We version our tests. But our AI prompts? They're scattered across scripts, undocumented, untested."

Pain points:

Complex flag combinations across different pipelines
No standardization across teams
Prompts duplicated and drifting
No way to test prompt changes before production

The Reveal: markdown-agent as Prompt Infrastructure

Slide: Architecture diagram showing .md files in repo

$ ma review.copilot.md

The file:

---
model: claude-sonnet-4
allow-tool:
  - 'shell(git:*)'
  - Read
silent: true
---

## PR Review Checklist

Review these changes for enterprise readiness:

### Security
- Authentication/authorization issues
- Input validation gaps
- Secrets or credentials in code

### API Stability
- Breaking changes to public interfaces
- Deprecation warnings needed

### Operations
- Missing observability (logs, metrics)
- Error handling completeness
- Resource cleanup

Changes to review:
!`git diff main..HEAD`

Why Markdown Files for Enterprise

1. Version Control

git log -- agents/review.copilot.md
# See exactly when prompts changed

2. Code Review for Prompts

# PR: "Update security review prompt"
git diff agents/review.copilot.md

3. Testing

# Test prompt changes on sample diffs
cat test-fixtures/security-vuln.diff | ma review.copilot.md
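
Since model output varies from run to run, fixture tests for prompts tend to be loose keyword checks rather than exact comparisons. A minimal sketch of such a check follows; `check_fixture` and the `REVIEW_CMD` override are hypothetical, and the demo uses a stub reviewer so it runs offline.

```shell
#!/bin/sh
# check_fixture FILE PATTERN
# Pipes FILE into the review command and checks the output against PATTERN
# (case-insensitive extended regex). REVIEW_CMD is a hypothetical override;
# it defaults to the ma invocation shown above.
check_fixture() {
  if sh -c "${REVIEW_CMD:-ma review.copilot.md}" <"$1" | grep -Eiq "$2"; then
    echo "PASS: $2"
  else
    echo "FAIL: expected review to mention /$2/"
    return 1
  fi
}

# Offline demo with a stub reviewer and a throwaway fixture.
fixture=$(mktemp)
printf '%s\n' '+ db.query("SELECT * FROM users WHERE id = " + id)' >"$fixture"
REVIEW_CMD='grep -q SELECT && echo "- [ ] Potential SQL injection"'
check_fixture "$fixture" 'sql injection'
rm -f "$fixture"
```

Running a handful of these against known-bad diffs before merging a prompt change gives a rough regression signal, not a guarantee.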

4. Standardization

repo/
  agents/
    review.copilot.md      # All teams use same review
    changelog.copilot.md   # Consistent changelogs
    triage.copilot.md      # Same triage logic

The CI/CD Recipe Book

# agents/ci/pr-review.copilot.md
---
model: claude-sonnet-4
allow-tool:
  - 'shell(git:*)'
  - Read
silent: true
---
Enterprise PR review. Security-first.

# agents/ci/changelog.copilot.md
---
model: claude-haiku-4.5
allow-tool:
  - 'shell(git:log)'
silent: true
---
Generate changelog from commits.

# agents/ci/test-analysis.copilot.md
---
model: claude-sonnet-4
allow-tool:
  - Read
silent: true
---
Analyze test failures and suggest fixes.

Usage in pipeline:

ma agents/ci/pr-review.copilot.md < changes.diff
ma agents/ci/changelog.copilot.md >> CHANGELOG.md
npm test 2>&1 | ma agents/ci/test-analysis.copilot.md

CLOSING (1 minute)

Call to Action: Your First Production Agent

This week:

Add one agent to your CI: pr-review.copilot.md
Set permissions explicitly: --allow-tool 'shell(git:*)'
Use escalation: Haiku for triage, Sonnet for work, Opus for complexity
Version your prompts: Commit them like code

The Enterprise Equation

Production AI = Instructions + Tools + Permissions + Audit Trail

Your markdown files are now:

Your AI runbooks
Your compliance documentation
Your team's institutional knowledge
Your competitive advantage

Resources

markdown-agent: npm install -g markdown-agent
GitHub Copilot CLI: npm install -g @github/copilot
This talk's agents: [gist link]

Final Thought

"We spent a decade on 'infrastructure as code.' Now we add 'intelligence as code.' Same rigor. Same version control. Same code review. New capability."

TIMING GUIDE

Section	Time	Cumulative
Hook: ma + pipe demo	2 min	2 min
Enterprise equation	2 min	4 min
Core flags for CI/CD	2 min	6 min
Demo 1: PR Review	2 min	8 min
Demo 2: Escalation pattern	2 min	10 min
Demo 3: Permissions	2 min	12 min
Enterprise patterns (4 examples)	4 min	16 min
Resolution: markdown-agent reveal	3 min	19 min
Closing	1 min	20 min

KEY TAKEAWAYS (for slides)

-s unlocks CI/CD - Silent mode makes output pipeable
Permissions are mandatory - --allow-tool is your audit trail
Escalate by complexity - Haiku triages, Sonnet works, Opus thinks
Version your prompts - Markdown files in git = prompt infrastructure
Production AI = Instructions + Tools + Permissions + Audit

DEMO CHECKLIST

Terminal with large font (24pt+)
Git repo with realistic PR diff
GitHub Actions workflow file ready
Pre-written .copilot.md agent files
Test all permission combinations before talk
Backup: screenshots of each demo
Copilot CLI installed and authenticated
Network connectivity verified

DEMO FILES

pr-review.copilot.md

---
model: claude-sonnet-4
allow-tool:
  - 'shell(git:*)'
  - Read
silent: true
---
Review this PR for enterprise readiness:
- Security vulnerabilities
- Breaking API changes
- Missing error handling
- Observability gaps

Format as actionable checklist.

triage.copilot.md

---
model: claude-haiku-4.5
silent: true
---
Classify this change as SIMPLE, MODERATE, or COMPLEX.
Respond with ONLY one word.

changelog.copilot.md

---
model: claude-haiku-4.5
allow-tool:
  - 'shell(git:log)'
silent: true
---
Generate a changelog entry from these commits.
Group by: Features, Fixes, Security, Breaking Changes.
Use Keep a Changelog format.

test-analysis.copilot.md

---
model: claude-sonnet-4
silent: true
---
Analyze these test failures:
1. Identify root cause
2. Suggest specific fix
3. Note if this is a flaky test pattern

QUOTABLE MOMENTS