Ryan Lopopolo — Harness Engineering: How to Build Software When Humans Steer, Agents Execute

Source index. ~46-min conference talk. Raw at ryan-lopopolo-harness-engineering-2026.

Thesis

Nine months of building software exclusively through agents at OpenAI. Code is free; implementation is no longer scarce. The senior engineer's job is now harness-engineering: making the codebase, docs, and processes legible to agents so they can do the full job, while humans do the higher-leverage work of delegation, system design, and taste.

Structure

  1. "I am a token billionaire" — the AGI-pill framing
  2. The banned-editor experiment — team works only through the harness
  3. Three scarce resources: human time, human attention, model context window
  4. code-is-free axiom — consequences for refactor, migration, P3 work
  5. Non-functional requirements are the hard part — non-functional-requirements-as-prompts
  6. "Don't accept slop" — short-term velocity hit to install durable guardrails
  7. Observability-for-agents — DevTools via skill, local dev stack invocable by Codex
  8. The lint-bespoke-to-the-codebase pattern (fetch without retry/timeout example)
  9. Test-the-source-code (file ≤350 lines as a context-window invariant)
  10. PR as hub-and-spoke broadcast — throughput over blocking review
  11. Five-to-ten deep skills > wide shallow skill surface
  12. Q&A: working in the car, CarPlay voice mode not ready
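The lint-bespoke-to-the-codebase pattern (item 8) can be sketched concretely. The talk's example is a fetch call without retry/timeout; this is a hypothetical Python analogue that flags `requests.*` HTTP calls made without a `timeout=` keyword. The `requests` names and AST approach are my stand-ins, not from the talk.

```python
# Hypothetical codebase-bespoke lint: flag HTTP calls that omit a timeout.
# The talk's example is a JS fetch without retry/timeout; requests.* here
# is an assumed Python analogue.
import ast

HTTP_CALLS = {"get", "post", "put", "delete", "request"}

def missing_timeout_calls(source: str) -> list[int]:
    """Return line numbers of requests.* calls made without a timeout= kwarg."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute)
                and func.attr in HTTP_CALLS
                and isinstance(func.value, ast.Name)
                and func.value.id == "requests"
                and not any(kw.arg == "timeout" for kw in node.keywords)):
            offenders.append(node.lineno)
    return offenders

snippet = (
    "import requests\n"
    "ok = requests.get(url, timeout=5)\n"
    "bad = requests.get(url)\n"
)
print(missing_timeout_calls(snippet))  # → [3]
```

The point of such a rule is that it encodes a codebase-specific guardrail an agent would otherwise have to rediscover from context on every run.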
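The test-the-source-code idea (item 9) also lends itself to a sketch: a check that no source file exceeds 350 lines, so an agent can load any file whole into its context window. The 350-line figure is from the talk; the directory layout and self-contained demo below are my assumptions.

```python
# Hedged sketch of a "test the source code" invariant: every source file
# stays under 350 lines (the talk's context-window budget). The demo
# directory and file names are illustrative, not from the talk.
import tempfile
from pathlib import Path

MAX_LINES = 350

def oversized_files(root: str) -> list[str]:
    """Return paths of Python files exceeding the line budget."""
    return [
        str(p)
        for p in Path(root).rglob("*.py")
        if len(p.read_text().splitlines()) > MAX_LINES
    ]

# Self-contained demo: one compliant file, one oversized file.
with tempfile.TemporaryDirectory() as d:
    Path(d, "small.py").write_text("x = 1\n")
    Path(d, "big.py").write_text("x = 1\n" * 400)
    offenders = oversized_files(d)

print([Path(p).name for p in offenders])  # → ['big.py']
```

Dropped into a test suite, the assertion `oversized_files("src") == []` makes the context-window budget a CI-enforced invariant rather than a convention.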

Concepts introduced

Entities

Memorable moves

  • "I am a token billionaire and I believe that in order for us to get into our AGI future, we want everybody to be token billionaires."
  • "I've lived that experience by banning my team from even touching their editors."
  • "Code is free… it's free to produce, free to refactor, and it is not a thing to get hung up on anymore."
  • "Things are either P0s or P2s. Those P3s will never get done. However, in a world where code is free… all those P3s get kicked off immediately, maybe 4x in parallel."
  • "The important thing is not the code but the prompt and the guardrails that got you there."
  • "Don't produce slop. Don't accept slop. You won't get slop in your codebase."
  • "We need to make them legible to those agents that are driving the implementation."
  • "We don't go super wide on skills, preferring to mature them deeply."

Open questions

  • The "ban editors" policy is maximalist. What's the minimum-viable version for a team not at OpenAI scale? Which parts generalize?
  • 5–10 skills as the centralization target — what's the selection criterion? How do you retire a skill? Lopopolo doesn't say.
  • Tests-about-source-code (file length, naming conventions) vs. standard linters — when does this cross into over-engineering? Where's the break-even on the agent-context savings vs. author cognitive load?
  • Throughput-over-blocking PR review: what's the defect-escape rate? Lopopolo claims guardrails + tests catch it, but he doesn't cite numbers.
  • "Implementation agent can acknowledge, defer, or reject any feedback." Does this scale past a small, high-trust team, or is it an OpenAI-culture-specific move?
  • The agents-building-agents reflexive dimension: he builds internal agents to improve co-workers' productivity. What's the compounding / regression profile?

Synthesis note

Five AI Engineer 2026 ingests now form a layered picture of the agentic SDLC:

Lopopolo sits above the other four: the others each describe an individual trust boundary (code, eval, exec, identity), while Lopopolo describes the day-to-day practice of running a team where those trust boundaries are presumed and the work is shifting synchronous human time into higher-leverage guidance. My inference: the four edges without the operating discipline produce a hardened but idle system; the operating discipline without the edges produces throughput with leaks. Neither speaker says this explicitly; it's my synthesis across the five talks.