Jared Zoneraich — How Claude Code Works¶
Source index. ~1h05m closing workshop at AI Engineer 2025 (uploaded Dec 2025). Raw at jared-zoneraich-claude-code-works-2026. Unofficial reverse engineering of claude-code — not endorsed by Anthropic.
Structure¶
- Why coding agents suddenly work (simple architecture + better models)
- Internals tour: master loop, tools, to-dos, sub-agents, sandboxing, skills
- Comparative: Codex, Amp, Cursor Composer, Factory Droid, Devin
- Evaluation: agent smell, back-testing, rigorous tools
- Future: headless SDKs, agentic endpoints, fewer tool calls
Concepts introduced / minted¶
- claude-code-master-loop — nO, four-line while-tool-call loop
- claude-code-todo-tool — prompt-enforced planning scratchpad
- context-compaction — H2A async buffer, ~92% trigger, head+tail summarisation
- bash-as-universal-tool — "bash is all you need"
- claude-code-skills — extendable system prompts, auto-selection gap
- agent-smell — surface-level agent metrics (tool-call count, retries, latency)
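The four-line while-tool-call loop named above can be sketched as follows; this is a minimal illustration of the shape, with `model` and `run_tool` as hypothetical stand-ins, not Claude Code's actual internals:

```python
# Minimal sketch of a while-tool-call agent loop.
# `model` and `run_tool` are hypothetical stand-ins, not Claude Code internals.
def master_loop(model, run_tool, messages):
    while True:
        reply = model(messages)           # one LLM call per iteration
        messages.append(reply)
        if not reply.get("tool_calls"):   # no tool call -> agent is done
            return reply["content"]
        for call in reply["tool_calls"]:  # execute each requested tool
            messages.append({"role": "tool", "content": run_tool(call)})
```

Everything else (planning, compaction, sub-agents) hangs off this loop rather than replacing it, which is why model improvements compound directly.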
Reinforces existing: subagent-architecture, context-engineering, agentic-engineering, harness-engineering, contextual-prompt-engineering, skill-distillation, eval-lifecycle-pre-to-production.
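The context-compaction behaviour above (trigger near ~92% of the window, keep head and tail, summarise the middle) might look roughly like this; the 0.92 threshold mirrors the talk's figure, while the `summarise` helper and `keep` count are illustrative assumptions:

```python
# Illustrative sketch of head+tail context compaction.
# The 0.92 trigger mirrors the talk's "~92%"; `summarise` and keep=2 are stubs.
def maybe_compact(messages, token_count, window, summarise, keep=2):
    if token_count < 0.92 * window:      # only compact near the limit
        return messages
    head, tail = messages[:keep], messages[-keep:]
    middle = messages[keep:-keep]
    # Replace the middle of the transcript with a single summary message
    return head + [{"role": "system", "content": summarise(middle)}] + tail
```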
Entities¶
- jared-zoneraich — speaker, founder/CEO
- promptlayer — company, AI-engineering workbench
Notable claims¶
- "Less scaffolding, more model." Scaffolding to paper over current-model flaws is obsolete in 3–6 months; invest in the outer loop + rigorous tools instead.
- Master loop (nO) is ~4 lines. The architectural simplification is the story; model improvements then compound 1:1.
- Grep > RAG for general-purpose agents. Vector DBs were a workaround for weak long-context / weak tool use.
- To-dos are not deterministically enforced — purely system-prompt-level structure, only possible because 2025 models follow instructions.
- Handoff > compact (Amp) is probably the winning context-continuation pattern.
- "AI therapist problem" — no global maximum in agent design; Claude Code, Codex, Composer, Amp, Droid each own different use cases. Taste and domain experts (promptlayer's thesis) are the moat.
- Future: most LLM calls may be replaced by claude-code SDK invocations — the agent loop as the new completions API.
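The to-do claim above can be illustrated: a prompt-enforced scratchpad is just a tool that stores whatever list the model sends, with nothing checking that the model follows it. The item schema (task/status fields) is a hypothetical approximation, not Claude Code's real one:

```python
# Sketch of a prompt-enforced to-do scratchpad: the tool only stores the
# list the model writes; no code verifies the model actually executes it.
# The item schema (task / status) is an assumption, not Claude Code's real one.
class TodoTool:
    def __init__(self):
        self.items = []

    def write(self, items):
        # Overwrite the whole list; the "plan" lives purely in context
        self.items = items
        return "\n".join(
            f"[{'x' if i['status'] == 'done' else ' '}] {i['task']}"
            for i in items
        )
```

The only enforcement is the rendered checklist re-entering the context window, which is why this pattern only became viable with 2025-era instruction following.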
Cross-ingest links¶
- mitchell-hashimoto-zed-agentic-2025 — Hashimoto's live-session usage of claude-code (slash commands, jujutsu snapshots, Zig weakness) is the practitioner complement to Zoneraich's architectural read. Read together: Zoneraich explains why the loop tolerates Hashimoto's freewheeling exploration.
- peter-steinberger-state-of-the-claw-2026 — Peter's "State of the Claw" leans on Claude Code internals (dreaming/memory, sub-agents, sandboxing). Zoneraich's H2A/compaction + skills analysis is the missing architectural lower-layer that Peter's ecosystem-level commentary sits on top of.
- Adjacent: florian-juengermann-listen-agents-2026 (Claude Code SDK inside E2B for PowerPoint subagents), ryan-lopopolo-harness-engineering-2026 (same "rigorous tools, flexible loop" thesis from a Frontier/OpenAI angle).
Open questions¶
- Is skill auto-selection a post-training problem or an inherent limit of prompt-level dispatch?
- Does the "one mega tool call" vs "hundreds of tool calls" debate resolve via model capability, or via tool-ecosystem standards?
- How do you eval a maximally flexible master loop? agent-smell is a start — what's the right aggregation?
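agent-smell, as minted above, aggregates surface metrics from a run trace; a minimal sketch, where the trace schema (name / ok / ms fields) is a hypothetical format, not from the talk:

```python
# Sketch of "agent smell": surface-level metrics over a tool-call trace.
# The trace schema (name / ok / ms fields) is a hypothetical format.
def agent_smell(trace):
    return {
        "tool_calls": len(trace),
        # Count a retry when the same tool is re-invoked after a failure
        "retries": sum(1 for a, b in zip(trace, trace[1:])
                       if a["name"] == b["name"] and not a["ok"]),
        "p50_latency_ms": sorted(t["ms"] for t in trace)[len(trace) // 2],
    }
```

These numbers flag a bad-smelling run without judging the outcome, which is the open aggregation question: how to roll them up into something that correlates with task success.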