Problem:
Agents produce lots of code. Divergent code, often without holistic understanding. Agents see a snapshot and solve within that snapshot.
Solution:
AI code reviews solve that by framing the work in a bigger context and ensuring the blast radius is contained. But it's an art. AI code review is also non-deterministic, and if used blindly it often adds more blast radius than it contains. Let's explore.
The first thing you should tell an AI code reviewer is to avoid nitpicks, pedantic opinions, and scope creep. AI code reviewers will often add more problems than they solve if you do not limit their input. They are eager to always help and always add; this is by design in their core LLM instructions: create more work, always. But that only leads to a bigger and bigger blast radius, with regression issues, scope creep, and pedantic opinions at every turn, producing an ever more divergent state. What you want is convergence toward good code health. So how do you do it?
- Always append these three: "no nit picks, no pedantic opinions, no scope creep".
- Always append: "If there isn't anything major just say so." (This avoids an endless back-and-forth loop.)
- Always append: "Only review the diff. Do not review the codebase as a whole." (This avoids scope creep.)
The kickoff prompt:
Review this and post your review as a comment in the PR: "link to pr". (No nit picks, no pedantic opinions, no scope creep. If there isn't anything major just say so. Only review the diff. Do not review the codebase as a whole.)
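As a concrete sketch, the kickoff prompt can be assembled in a couple of lines of shell. The PR URL below is a placeholder, not a real link:

```shell
# Build the kickoff prompt with the three guardrails appended.
# The PR URL is a placeholder -- substitute the real link to your PR.
pr_url="https://example.com/org/repo/pull/123"
guardrails="no nit picks, no pedantic opinions, no scope creep. \
If there isn't anything major just say so. \
Only review the diff. Do not review the codebase as a whole."
kickoff="Review this and post your review as a comment in the PR: $pr_url. ($guardrails)"
echo "$kickoff"
```

The point of making this a variable rather than retyping it is that every agent gets exactly the same guardrails, every time.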
- Send this kickoff prompt to one to three different agents.
- Then send each review comment link to the coder agent as feedback, with "fix what makes sense": link1, link2, link3
- Once the fixes land, run the review over again. Do this over and over until the review agents stop complaining, then merge the solution.
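The loop above can be sketched as a shell function. `agent` stands in for whatever CLI drives your coding and review agents (a placeholder, not a real tool); the `gh` calls are real GitHub CLI commands. The stop condition is a naive string check and you would adapt it to your own agents' phrasing:

```shell
# Sketch of the review loop. `agent` is a placeholder for your agent CLI;
# `gh` is the real GitHub CLI.
review_loop() {
  pr_url="$1"     # link to the PR under review
  kickoff="$2"    # the kickoff prompt, guardrails included
  while true; do
    # Fan the kickoff prompt out to several review agents.
    for reviewer in reviewer-1 reviewer-2 reviewer-3; do
      agent run "$reviewer" "$kickoff"
    done

    # Naive stop condition: the latest review reports nothing major.
    latest=$(gh pr view "$pr_url" --json comments --jq '.comments[-1].body')
    case "$latest" in
      *"nothing major"*) break ;;
    esac

    # Otherwise, hand the comment links to the coder agent and go again.
    links=$(gh pr view "$pr_url" --json comments --jq '.comments[].url')
    agent run coder "Review feedback. Fix what makes sense: $links"
  done
  gh pr merge "$pr_url" --squash
}
```

Whether you script it or drive it by hand, the shape is the same: review, fix, re-review, merge only when the reviewers go quiet.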
That's it. This should harden the code, reduce blast radius, and avoid regressions for any feature addition your agents might make.
Additionally:
- You could add CI systems that run on each commit in each PR. This gates every addition as the agents work through review fixes, continuously hardening the code toward the standards you have set. I use Vitest, Puppeteer UI tests, ESLint, SonarQube code health checks, etc., which force agents to converge on high standards at every turn, automatically.
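A minimal CI gate along these lines might look like the function below. It assumes the repo already has ESLint and Vitest configured; `npm run uitest` is a placeholder name for your Puppeteer UI-test script, and SonarQube analysis, which normally runs as a separate scanner step, is left out:

```shell
# Sketch of a CI gate run on every commit in every PR.
# Assumes ESLint and Vitest are configured in the repo;
# `npm run uitest` is a placeholder for your Puppeteer UI-test script.
ci_gate() {
  npx eslint . || return 1      # lint: enforce style and correctness rules
  npx vitest run || return 1    # unit tests in non-watch mode
  npm run uitest || return 1    # Puppeteer UI tests (placeholder script name)
}
```

Wire this into your CI provider so a red gate blocks the merge; the agents then have no choice but to keep fixing until everything is green.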
- For big, advanced feature additions, running parallel PRs is smart (divergence means all paths to Rome are explored). Do all of the above, and finish the feature end to end in its own PR for each path. Once they are ready to merge, have many agents judge which one is best, with pros and cons. Sometimes agents will suggest cherry-picking from one into the other.
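The judging step can be fanned out the same way as the reviews. `agent` again stands in for whatever CLI drives your agents (a placeholder, not a real tool):

```shell
# Sketch: have several agents judge two competing parallel PRs.
# `agent` is a placeholder for your agent CLI.
judge_prs() {
  pr_a="$1"
  pr_b="$2"
  for judge in judge-1 judge-2 judge-3; do
    agent run "$judge" "Compare these two PRs implementing the same feature: \
$pr_a vs $pr_b. List pros and cons of each, pick the best one, and note \
anything worth cherry-picking from one into the other."
  done
}
```

Asking several judges rather than one gives you the same convergence effect as multiple reviewers: agreement is a strong signal, disagreement tells you where to look closer.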