Dark Factory Skepticism

Pushback from armin-ronacher and mario-zechner against the 2026 "dark factory" / "software factory with zero humans" fantasy — the narrative that tens, hundreds, or thousands of agents will organize themselves into roles (PM, engineer, QA, mayor), break down a spec, execute, and produce a finished product without human review.

The steelman (what they're arguing against)

"The dark factory — tens or hundreds or thousands of agents, you give them a spec, they go and they break it up, they organize themselves, they have the QA agent, they have the roles, you give them context, they ship."

The fantasy rests on three assumptions: (1) agent output scales linearly with agent count, (2) review can be fully automated by more agents, (3) specs are complete enough to eliminate ambiguity. Armin & Mario dispute all three.

Their counter-claims

  • Output scales, but the error rate doesn't shrink proportionally. Even at half a human's error rate, 10× the throughput means 5× more bugs shipped (0.5 × 10 = 5). See slow-the-f-down for the math.
  • Review throughput is the bottleneck, and it's human. Agents reviewing agents' code is a closed loop with no external grounding — "you need some way to review all of that code, but you can't as a human because you're used to spitting out 1.5k LOC a day."
  • Complete specs don't exist at the level of real products. As Peter's version has it, the "way to the mountain is never a straight line": the discovery work that happens mid-build is where taste lives.
  • Armin, closing line on it: "I'm not worried about all the dark factory and all the software is dead and SaaS is dead and all that. I generally believe this is just part of the hype machine and that will self-correct."
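
The arithmetic in the first counter-claim above can be checked with a minimal sketch. It assumes the simplest possible model, in which bugs shipped equal output volume times error rate per unit of output, so the two ratios multiply. The function names are hypothetical and exist only for illustration; the figures (half the error rate, 10× the throughput) are the ones quoted in the bullet.

```python
def relative_bugs_shipped(error_rate_ratio: float, throughput_ratio: float) -> float:
    """Bugs shipped relative to one human's baseline.

    Assumption of this sketch: bugs = output volume * error rate,
    so the relative bug count is the product of the two ratios.
    """
    return error_rate_ratio * throughput_ratio


def break_even_error_ratio(throughput_ratio: float) -> float:
    """Error-rate ratio at which bug volume stays flat under the same model."""
    return 1 / throughput_ratio


# Figures from the text: half a human's error rate, 10x the throughput.
ratio = relative_bugs_shipped(error_rate_ratio=0.5, throughput_ratio=10)
print(f"{ratio:g}x the bugs shipped")  # 5x the bugs shipped

# Under this model, the error rate would have to fall to a tenth of the
# human baseline just to keep the absolute number of bugs unchanged.
print(break_even_error_ratio(10))  # 0.1
```

Under this model, throughput gains have to be matched one-for-one by error-rate reductions before the absolute bug count stops growing.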

Where this sits relative to the wiki

The wiki's 2026 corpus runs one thread through nearly every agentic-engineering voice: Hashimoto's "human stays architect," Lopopolo's harness-engineering, Krentsel's sessions-as-processes, Horthy's do-not-outsource-thinking, Sanchez's agents-scale-not-fix, Zakariasson's engineer-as-director-of-agents. Armin and Mario land on the same island with a different approach vector: the human in the loop is load-bearing because review doesn't scale. See the cross-ingest synthesis in the query page.

Who disagrees

Anyone who makes a fully autonomous agent fleet a product claim (e.g. Symphony-era OpenAI framings where ryan-lopopolo talks about 0% human code/review). The actual Symphony talk has caveats (humans still architect and review harnesses, not code), but the headline number feeds the dark-factory narrative this concept is pushing back on. Tension, not contradiction.

Connects to