Engineering Complexity Matrix: Easy/Difficult × Small/Large Radius

This framework helps teams classify, discuss, and de‑risk changes based on logical complexity (easy vs. difficult) and blast radius (small vs. large). Use it in grooming, architecture review, PR preparation, and onboarding.

1) Quadrant Matrix (Primary Framework)

Classify the work first; the rest of the guidance follows from the quadrant.

Quadrant	Effort (Logic)	Radius (Blast Surface)	Typical Examples	Risks	Recommended Strategy
Q1: Easy–Small	Low	Small	Localized UI tweak; copy change; private method refactor	Minimal	Standard PR; unit tests
Q2: Easy–Large	Low	Large	Date formatting change in shared helper; config key rename; design token tweak	Hidden regressions across many consumers	Introduce facade/compat layer; contract tests for top N consumers
Q3: Difficult–Small	High	Small	Complex algorithm refactor within an isolated module; tricky state machine	Local correctness issues	Pair review; property‑based tests; targeted docs
Q4: Difficult–Large	High	Large	Shared API response shape update; cross‑cutting auth flow; telemetry schema change	System‑wide break risk	Versioned contracts; migration plan; isolation boundary
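
The quadrant lookup can be expressed as a small table in code, which is handy for tooling such as PR bots or grooming scripts. The following is a minimal sketch, not a prescribed implementation: the `Effort` and `Radius` enums and the `classify` helper are hypothetical names, and the strategy strings are copied from the table above.

```python
from enum import Enum


class Effort(Enum):
    EASY = "Easy"
    DIFFICULT = "Difficult"


class Radius(Enum):
    SMALL = "Small"
    LARGE = "Large"


# Quadrant id and recommended strategy, keyed by (effort, radius).
# The text mirrors the quadrant table; the structure is illustrative.
RECOMMENDED_STRATEGY = {
    (Effort.EASY, Radius.SMALL): ("Q1", "Standard PR; unit tests"),
    (Effort.EASY, Radius.LARGE): (
        "Q2",
        "Introduce facade/compat layer; contract tests for top N consumers",
    ),
    (Effort.DIFFICULT, Radius.SMALL): (
        "Q3",
        "Pair review; property-based tests; targeted docs",
    ),
    (Effort.DIFFICULT, Radius.LARGE): (
        "Q4",
        "Versioned contracts; migration plan; isolation boundary",
    ),
}


def classify(effort, radius):
    """Return (quadrant label, recommended strategy) for a change."""
    quadrant, strategy = RECOMMENDED_STRATEGY[(effort, radius)]
    return f"{quadrant}: {effort.value}–{radius.value}", strategy


label, strategy = classify(Effort.EASY, Radius.LARGE)
print(label)
print(strategy)
```

Keying on the pair rather than on free-text labels keeps the classification exhaustive: every combination of effort and radius maps to exactly one quadrant.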

2) Key Dimensions Engineering Should Evaluate

Dimension	Scale	What to Check	Why it Matters	Control
Criticality	Low → High	Does it affect revenue, security, SLAs, or core journeys?	Higher stakes demand higher validation standards	Strengthen tests and reviews; enforce approvals
Coupling (Fan‑out)	Few → Many	How many modules/services depend on it?	More dependents = wider blast radius	Encapsulate behind a facade/accessor
Compatibility Window	Short → Long	Can old and new coexist? For how long?	Short windows force risky cutovers	Versioned contract or shim layer
Observability	Minimal → Full	Do we have logs/metrics/traces on the boundary?	You cannot fix what you cannot see	Instrument before change; create dashboards
Rollback Cost	Easy → Hard	How quickly can we revert or disable?	Hard rollbacks increase MTTR	Use a feature flag or configuration switch; keep changes isolated
Volatility	Stable → Volatile	Will this area change again soon?	Volatile areas invite rework	Add indirection; avoid lockstep coupling
Ownership	Clear → Diffuse	Do we know who owns each consumer?	Diffuse ownership increases coordination overhead	Notify owners; attach a migration guide

3) Risk Score (Simple Model)

Use this to tune test depth and coordination overhead. Score each factor from 1 to 5, so the result ranges from −2 to 14.

Risk = (Radius 1–5) + (Criticality 1–5) + (Volatility 1–5) − (Observability 1–5)

Guidance:
  ≤3    → Standard workflow
  4–6   → Extra tests + broader peer review
  7–9   → Structured rollout (no canary) + contract tests
  ≥10   → Versioned approach + explicit migration plan + senior review
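
As a worked example, the score and its guidance band can be computed with a few lines of code. This is a sketch under stated assumptions: the function names `risk_score` and `guidance` are hypothetical, and scores below zero (possible when observability is high and everything else is low) are assumed to fall into the lowest band.

```python
def risk_score(radius, criticality, volatility, observability):
    """Risk = Radius + Criticality + Volatility - Observability, each scored 1-5."""
    factors = {
        "radius": radius,
        "criticality": criticality,
        "volatility": volatility,
        "observability": observability,
    }
    for name, value in factors.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return radius + criticality + volatility - observability


def guidance(score):
    """Map a score to the guidance bands listed above."""
    if score <= 3:  # assumption: negative scores are treated as the lowest band
        return "Standard workflow"
    if score <= 6:
        return "Extra tests + broader peer review"
    if score <= 9:
        return "Structured rollout (no canary) + contract tests"
    return "Versioned approach + explicit migration plan + senior review"


# Example: a shared date helper with many consumers and thin monitoring.
score = risk_score(radius=4, criticality=3, volatility=2, observability=2)
print(score, guidance(score))
```

Because observability is subtracted, instrumenting the boundary before the change is the cheapest way to move a change into a lower band.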

4) Execution Matrix (What to Do by Quadrant)

Quadrant	Test Scope	Architecture Controls	Release Strategy	Operational Prep
Easy–Small	Unit tests	None	Standard	No additional ops needed
Easy–Large	Unit + Contract tests (top N consumers); screenshot tests if UI	Facade/Accessor + compatibility layer	Feature flag or config switch; document rollback steps	Basic dashboards and alerts for the affected boundary
Difficult–Small	Unit + property‑based + focused integration	Keep interface stable; verify encapsulation boundary	Standard or staged rollout by environment	Targeted metrics on the module
Difficult–Large	Unit + Contract + Integration + E2E smoke (critical flows only)	Versioned contract, isolation boundary, schema registry (if applicable)	Staged release by environment (e.g., dev → test → prod) with a clear revert path	SLO/error‑rate guardrails; runbook for reversion
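
The "contract tests (top N consumers)" entry above could look roughly like the following. It is a library-free sketch: the consumer names, the field lists, and `build_order_response` are all hypothetical stand-ins, and a real setup would more likely use a dedicated contract-testing tool.

```python
# Hypothetical contracts: the fields (and types) each top consumer relies on.
CONSUMER_CONTRACTS = {
    "billing-service": {"id": str, "created_at": str, "total_cents": int},
    "reporting-ui": {"id": str, "created_at": str},
}


def build_order_response():
    """Stand-in for the provider code that is being changed."""
    return {
        "id": "ord_1",
        "created_at": "2024-01-31T00:00:00Z",
        "total_cents": 1250,
        "currency": "USD",  # extra fields are allowed; consumers ignore them
    }


def contract_violations(response, contract):
    """List the ways a response breaks one consumer's contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


def check_all_consumers(response):
    """Return {consumer: [violations]} for every registered contract."""
    return {
        consumer: contract_violations(response, contract)
        for consumer, contract in CONSUMER_CONTRACTS.items()
    }


results = check_all_consumers(build_order_response())
for consumer, problems in results.items():
    print(consumer, "OK" if not problems else problems)
```

Running a check like this in CI on the provider side surfaces a breaking field rename before any consumer deploys against it.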

5) Change‑Type Heuristics (Map Your Change Quickly)

Change Type	Likely Quadrant	Anti‑Pattern to Avoid	Preferred Pattern
Formatting/util change (date, currency, number)	Easy–Large	Editing call sites directly across the codebase	Central formatter module; versioned API if semantics change
Common UI control tweak	Easy–Large	Per‑page overrides	Design system tokens + component library update
Shared config rename	Easy–Large	Direct env reads in each service	Typed config accessor with defaults and validation
Algorithm upgrade (isolated)	Difficult–Small	Leaking new concerns across interfaces	Keep interface stable; property‑based tests
API response shape change	Difficult–Large	Breaking changes without a grace period	Versioned API/contract + consumer‑driven contract tests
Telemetry schema change	Easy–Large → Difficult–Large	Renaming fields without mapping or doc	Telemetry facade with mapping and deprecation window
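
For the shared config rename row, a typed accessor with defaults, validation, and a temporary alias for the old key might be sketched as follows. The key names `REQUEST_TIMEOUT_SECONDS` and `TIMEOUT`, the default of 30, and the `AppConfig` type are assumptions made for illustration only.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    request_timeout_seconds: int


# New key -> old key still honoured during the compatibility window.
# Both names are hypothetical.
LEGACY_KEYS = {"REQUEST_TIMEOUT_SECONDS": "TIMEOUT"}


def _read(env, key, default):
    """Prefer the new key, fall back to the legacy key, then the default."""
    if key in env:
        return env[key]
    legacy = LEGACY_KEYS.get(key)
    if legacy is not None and legacy in env:
        return env[legacy]
    return default


def load_config(env=None):
    """Build a validated config; services call this instead of reading env."""
    env = os.environ if env is None else env
    raw = _read(env, "REQUEST_TIMEOUT_SECONDS", "30")
    try:
        timeout = int(raw)
    except ValueError:
        raise ValueError(
            f"REQUEST_TIMEOUT_SECONDS must be an integer, got {raw!r}"
        ) from None
    if timeout <= 0:
        raise ValueError("REQUEST_TIMEOUT_SECONDS must be positive")
    return AppConfig(request_timeout_seconds=timeout)


# A service that has not migrated yet still works through the alias.
print(load_config({"TIMEOUT": "5"}))
```

Once every consumer reads through the accessor, the next rename is a one-line change to `LEGACY_KEYS` rather than an edit in each service.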

6) PR Checklist (Paste Into Your PR Template)

- [ ] Quadrant classification included (Easy/Difficult × Small/Large)
- [ ] Risk score provided and rationale noted
- [ ] Architecture boundary identified (facade, contract, isolation)
- [ ] Test plan attached (unit, contract, integration; screenshot if UI)
- [ ] Rollback plan documented (flag/switch name; exact steps)
- [ ] Observability in place (dashboards/alerts for the boundary)
- [ ] Coordination: impacted code owners tagged; migration guide attached if needed

7) PR Description Template (Minimal)

### Classification
Quadrant: <Easy–Small | Easy–Large | Difficult–Small | Difficult–Large>
Risk Score: <value>  (Risk = Radius + Criticality + Volatility − Observability)

### Change Summary
- What: <one line>
- Why: <value/requirement>
- Scope: <repos/modules/pages>

### Controls
- Architecture: <facade/compat layer | versioned contract | isolation boundary>
- Tests: <unit, contract (top N consumers), integration, screenshot (if UI)>
- Release: <flag/switch name> with documented revert steps
- Rollback: <exact command/step to disable or revert>

### Observability
- Dashboards: <links>
- Alerts: <links>
- SLO Guardrails: <brief>

### Coordination
- Owners notified: <teams/users>
- Migration guide: <link>

8) Pragmatic Yes/No (for grooming and PR review)

- Can this change be routed through a single facade today?
- Do we have a feature flag or config switch and a tested rollback?
- Are top N consumers covered by contract tests in CI?
- Can old and new coexist via a versioned contract?
- Are dashboards/alerts in place before the change lands?
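
Two of these questions, coexistence via a versioned contract and rollback via a switch, can be illustrated together. The sketch below assumes a hypothetical user payload, a hypothetical flag name `user_response_v2_default`, and a plain dict standing in for a real flag service.

```python
def serialize_v1(user):
    """Original response shape: a single combined name field."""
    return {"name": f"{user['first']} {user['last']}"}


def serialize_v2(user):
    """New response shape: separate name fields."""
    return {"first_name": user["first"], "last_name": user["last"]}


SERIALIZERS = {1: serialize_v1, 2: serialize_v2}


def serialize(user, requested_version=None, flags=None):
    """Serve the version a consumer asks for; the flag only moves the default.

    Turning the flag off is the rollback: consumers that never opted in
    go back to v1 without a deploy.
    """
    flags = flags or {}
    default_version = 2 if flags.get("user_response_v2_default", False) else 1
    version = default_version if requested_version is None else requested_version
    if version not in SERIALIZERS:
        raise ValueError(f"unsupported contract version: {version}")
    return SERIALIZERS[version](user)


user = {"first": "Ada", "last": "Lovelace"}
print(serialize(user))                       # default stays on v1
print(serialize(user, requested_version=2))  # migrated consumer opts in
print(serialize(user, flags={"user_response_v2_default": True}))
```

Keeping both serializers behind one entry point means the old version can be deleted later in a single place, once the migration guide has been followed by every owner.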

9) Socratic Prompts (to reduce future radius)

- What is the smallest boundary behind which 100% of this behavior can live?
- Which consumer assumptions are most likely false in production?
- If this will change again within a quarter, what indirection prevents rewiring?
- How does this design lower the radius of the next similar change?
