TDD as the Agent Feedback Loop¶
matt-pocock's single-sentence thesis on agent quality: the ceiling on AI code output is the quality of your feedback loops, and TDD + type-check is the best such loop we have.
"These feedback loops are essential to AI, essential to get AI to produce anything reasonable. … If your code base doesn't have feedback loops, you're never ever ever going to get decent output out of AI. And often what you'll find is that the quality of your feedback loops influences how good your AI can code, essentially. That is the ceiling." — Pocock, ~1:09:51–1:10:30
Mechanics¶
- Red-green-refactor — the implementer skill forces the agent to write a single failing test first, confirm red, then implement until green. Shipped as a skill in Pocock's workshop repo.
- Why tests-first works for agents specifically. Agents that write the implementation first tend to cheat the tests afterwards — "it will do the entire implementation and then it will do the entire test layer just below it" (1:08:01). TDD instruments behaviour before the code exists, making cheating structurally harder.
- Combine with type-check. After tests pass, the harness runs
npm run type-check; in the demo a singletypeof level threshold numbererror surfaces and is auto-fixed. Types + tests together = dense self-correction signal.
Codebase prerequisites¶
Pocock spends the final third of the workshop on deep modules (Ousterhout, A Philosophy of Software Design). Shallow modules force the agent to either wrap every tiny function in its own test (and mock neighbours) or test blobs of related files together and "hope and pray." Deep modules give you a big test boundary per module with a small interface — feedback loops become cheap, and the agent becomes productive. This is the structural definition of agent-legible-software.
Related¶
- feedback-subagent — Florian Jüngermann's out-of-band feedback mechanism
- tracer-bullets — the per-slice unit these loops validate
- afk-implementation — the loop is what makes night-shift safe
- harness-engineering · agent-legible-software · claude-code-skills