Yoav Goldberg, May 2026
A recurring claim around agentic vibe coding is "when compilers were introduced, we switched to writing in high-level languages without looking at the produced assembly. Now with coding agents, we can similarly switch to writing in English, without looking at the produced code."
A common objection to this claim is "but compilers are deterministic and coding agents are not!". I think this is a bad objection, as it misses the mark. First, compiler engineers will be very quick to point out that compilers are also non-deterministic. Well, ok, but this also misses the mark for me.
I argue that "AI coding agents" are fundamentally different from compilers, and that even if AI coding agents were perfect, always either:
- following their instructions fully, in the exact same way every time, and without making any mistakes, or
- failing a given instruction and letting you know that they failed
They will still be fundamentally different from compilers. And this difference is such that the cost of "not looking at the produced code" is much lower for compilers than it is for coding agents.[1]
I will explain why.
The brief answer is that a compiler translates a complete specification into a runnable program, while a coding agent translates instructions plus previously produced code into newly produced code. The compiler is stateless (each invocation is fully independent) and the agent is stateful (each invocation depends on the result of previous invocations). When using a compiler, your input to it is a sequence of independent, complete programs. When using a coding agent, your input to it is a sequence of modification instructions that rely on each other. I think this is a big difference. If on top of that the instructions are not about modifying code but about modifying the behavior of the produced program, then you are essentially giving a sequence of indirect code modification instructions, on code you do not observe. Another big difference.
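The stateless/stateful distinction can be made concrete with a minimal sketch. Everything here is hypothetical scaffolding, not a real compiler or agent API; the point is only the shape of the two invocation protocols.

```python
def compile_program(source: str) -> bytes:
    """Stateless: the output depends only on this one complete input."""
    return source.encode("utf-8")  # stand-in for actual code generation


class CodingAgent:
    """Stateful: each instruction is applied to the code produced so far."""

    def __init__(self) -> None:
        self.code = ""  # accumulated output of all previous invocations

    def instruct(self, instruction: str) -> None:
        # stand-in for the model editing its own previous output
        self.code += f"# change for: {instruction}\n"


# Compiler: every invocation is independent; the same complete input
# always stands on its own, regardless of what was compiled before.
assert compile_program("print(1)") == compile_program("print(1)")

# Agent: the meaning of the second instruction depends on the first
# one having already mutated the accumulated state.
agent = CodingAgent()
agent.instruct("add a login page")
agent.instruct("make the login page remember the user")
assert "login page" in agent.code
```

With the compiler you could delete the output and re-run from the source alone; with the agent, replaying your instruction stream against a different accumulated state can give a different program.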
If you understood the point, you can stop reading now. If not, continue reading. I hope my elaboration will at least plant some doubt in your mind, by showing that not reading compiler output and not reading agent output have very different tradeoffs.
Compiler invocations are isolated from each other. When you invoke a compiler, you are providing it with input (a program that someone wrote) and it produces an output (a translation of the input program to machine code). You trust the translation to be correct, so you don't have to look at it. But this is OK because you do understand the input. And if you observe bad behavior from the resulting program, you edit the input so that the behavior is as you expect, and then run the compiler again on the modified input.
Compare it to agentic coding. Coding-agent invocations are not isolated from each other. They take as input both your instructions and their own previous outputs (the code produced so far), and produce a new output. In a pure vibe-coding setup, you are not allowed to look at the output, only to execute it and observe the runtime behavior. If you observe bad behavior from the resulting program, you ask to fix it. The agent then takes the code and your instruction, and produces a new version of the code. In other words, your input to the agent is not a specification, but a sequence of behavior change instructions. And these instructions are executed by modifications of a hidden state that you are not allowed to observe. At some point, the agent will fail to accommodate your request: it could not change the behavior in the way you requested, given its current hidden state (the hidden spec, or program, it is maintaining and which you do not have access to). How do you proceed from here? Note that you are not allowed to look at the code, or to ask it about the code. Only to instruct it about behavior change. (If you can ask it about the code, then it means you do care about the structure of the code.)
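The pure vibe-coding interface described above can be sketched as an object with exactly two operations: run the program, and request a behavior change. The names and the failure condition are all hypothetical; the sketch only illustrates that the code is hidden state you cannot inspect or query.

```python
class VibeCodingAgent:
    """A toy model of the vibe-coding interface: you may execute the
    program and issue behavior-change requests, nothing else."""

    def __init__(self) -> None:
        self._code = ""  # hidden state: the vibe-coder never reads this

    def instruct(self, request: str) -> bool:
        """Apply a behavior-change request; return False on failure."""
        if "impossible" in request:  # stand-in for a blind spot or a
            return False             # request the hidden state can't absorb
        self._code += f"# implements: {request}\n"
        return True

    def run(self) -> str:
        """Observe runtime behavior (the only feedback channel you have)."""
        n = self._code.count("\n")
        return f"program exhibiting {n} behaviors"


agent = VibeCodingAgent()
assert agent.instruct("show a greeting")
assert agent.run() == "program exhibiting 1 behaviors"

# When a request fails, there is no read_code() or ask_about_code():
# your only recourse is to phrase yet another behavior-change request
# against a hidden state you cannot see.
assert not agent.instruct("do the impossible")
```

Note what the interface lacks: any way to observe `_code`. That missing operation is exactly what "how do you proceed from here?" is asking about.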
Coding agents don't translate instructions to code. They translate instructions and code to modifications of code. The instructions can be about the current and desired behavior of the program the current code translates to. Or they can be directly about the underlying code. Naturally, if you do not look at the code, you cannot provide instructions of the second kind. Are instructions of the second kind needed? If you are comfortable discussing behavioral requirements without considering implementation at all, and if the coding agent is perfect at executing your behavior change requests, then instructions of the first kind are sufficient. Do you trust the coding agent to be perfect? Do you trust yourself that talking only about behavior is sufficient?
Regardless of the nature of the instructions (about behavior or about code), their sequential and incremental nature makes the coding agent more like a co-worker than like a compiler. Programmers very often do read the code produced by their co-workers; this is called "code review" and is often mandated before code can enter a shared project. What is the purpose of human-to-human code review? It is rarely to check for correctness. We trust our co-workers to produce working and correct code, and we trust that it can be modified if the behavior turns out to be wrong.[2] We review our peers' code to verify that it passes some standards of readability and maintainability. We read it to make sure it is organized in a way that will allow changing it in the future with relative ease. To make sure components are well separated, that no un-needed dependencies are introduced, that future changes to behavior can be relatively isolated, and that it will be clear to the future editor of the code which parts are relevant for a particular change, and which can be ignored. Code review is not about function, it's about structure.
If we don't trust our peers enough to produce well-structured, readable and maintainable code---code that will be easy to modify in the future---why do we trust coding agents to do so? Well, we review our peers' code because we fear that some day we will be the ones who need to modify it. But if we trust all future edits to be done by coding agents, shouldn't the agent be the one to worry about the code being maintainable? Can't the coding agent review the code for maintainability by future agents? This is certainly a possible future. But there is no indication, nor any reason to believe, that current-day coding agents can assess what code will be easy or hard for themselves to maintain. I do not think anyone knows what agent-maintainable code looks like, or what qualities it should possess. It is too early. An interesting question to research, though. If, however, we do believe that practices that make code maintainable for humans will also be effective for agents, it may be worth making sure the agents follow these practices. More importantly, since your role as the agents' master is going to be modification-instruction author, maybe it's time to consider what structure would make future modifications-through-instructions easy. What does it mean for an agent-driven project to be maintainable? Does it require knowing the structure of the underlying code?
What happens if several people instruct agents to modify the code, each adding different functionality? Can they keep track of the changes effectively? Do they need to? Will it be easier for them if the code itself is also readable by humans? Perhaps it would be better for them to have incremental knowledge of the code, to know how it evolved through their instructions, and to know which of their instructions modified code that was produced or touched by the instructions of others? Will it be easier for such a team to synchronize by reading each other's stream of behavior-modification instructions to agents, or by having team members keep track of the shared code base being modified? (In the compiler case, the team of human developers is kept synchronized through editing (and knowing) a shared code base. The compiler gets the entire code base as input at each invocation, and is oblivious to all of this.) Maybe a conclusion here is that we should have smaller (human) teams, with only a single person in charge of a team of agents. This essentially translates to a different organizational structure, in which a single product manager commands a team of developers whom they fully trust. (Of course, in such teams there is usually at least one human who keeps a high-level view of the entire code base and its structure, and knows how different code design decisions relate to each other and to the different requirements of the product or project manager. Who takes this role with coding agents? Or what replaces the need for it?)
To summarize, a coding agent is not like a compiler. It is like a co-worker, or an employee. It doesn't translate specifications to programs; it incrementally modifies code based on instructions from you or from others. If you don't want to read the code it produces, that might be OK. But don't do it because you also don't read your compiler's output. Do it because the agent is a co-worker or employee that you trust enough to not review their code, and to not even look at it. I hope you agree with me that the set of considerations for not reading code produced by an employee or a co-worker is very different from those for not reading compiler output. And that the difference is not the non-determinism of your co-workers.
Footnotes
[1] Why is condition 2 needed? Can't we have a coding agent that never fails to follow an instruction? No, because some instructions are simply not possible. And, more realistically, agents are likely to have some blind spots, and to fail to see how to implement some functionality or fix some bug even when it is possible to do so.
[2] This is not fully the case. In some cases code reviews do catch correctness issues, as well as security-related issues, which are a special form of correctness. But these are not the majority, so let's set them aside for a minute. After all, we are making an analogy to coding agents here, and we already assume that the code of future coding agents will be perfectly correct.