Jared Zoneraich — How Claude Code Works

Source index. ~1h05m closing workshop at AI Engineer 2025 (uploaded Dec 2025). Raw at jared-zoneraich-claude-code-works-2026. Unofficial reverse engineering of claude-code — not endorsed by Anthropic.

Structure

Why coding agents suddenly work (simple architecture + better models)
Internals tour: master loop, tools, to-dos, sub-agents, sandboxing, skills
Comparative: Codex, Amp, Cursor Composer, Factory Droid, Devin
Evaluation: agent smell, back-testing, rigorous tools
Future: headless SDKs, agentic endpoints, fewer tool calls

Concepts introduced / minted

claude-code-master-loop — nO, four-line while-tool-call loop
claude-code-todo-tool — prompt-enforced planning scratchpad
context-compaction — H2A async buffer, ~92% trigger, head+tail summarisation
bash-as-universal-tool — "bash is all you need"
claude-code-skills — extendable system prompts, auto-selection gap
agent-smell — surface-level agent metrics (tool-call count, retries, latency)
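The context-compaction entry can be illustrated with a minimal sketch. It assumes one reading of "head+tail summarisation": keep the first and last few messages verbatim and replace the middle with a summary once usage crosses the ~92% trigger. The names `maybe_compact`, `count_tokens`, `keep_head` and `keep_tail` are hypothetical, the word-count tokenizer is a crude stand-in, and none of this is taken from Claude Code itself.

```python
from typing import Callable, List

COMPACT_THRESHOLD = 0.92  # the "~92% trigger" from the notes above


def count_tokens(message: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(message.split())


def maybe_compact(
    messages: List[str],
    context_limit: int,
    summarise: Callable[[List[str]], str],
    keep_head: int = 1,
    keep_tail: int = 2,
) -> List[str]:
    """Return the history unchanged, or with its middle replaced by a summary.

    Assumption: head and tail survive verbatim and only the middle is summarised.
    """
    used = sum(count_tokens(m) for m in messages)
    if used < COMPACT_THRESHOLD * context_limit:
        return messages  # still under the trigger, nothing to do
    if len(messages) <= keep_head + keep_tail:
        return messages  # no middle section left to summarise
    tail_start = len(messages) - keep_tail
    head = messages[:keep_head]
    middle = messages[keep_head:tail_start]
    tail = messages[tail_start:]
    return head + [summarise(middle)] + tail


def toy_summary(middle: List[str]) -> str:
    # Hypothetical summariser; a real agent would call the model here.
    return f"[summary of {len(middle)} messages]"


history = ["system prompt", "a b c", "d e f", "g h i", "latest question"]
compacted = maybe_compact(history, context_limit=14, summarise=toy_summary)
untouched = maybe_compact(history, context_limit=100, summarise=toy_summary)
```

With a 14-token limit the 13-token history crosses the trigger and the two middle messages collapse into one summary line. With a 100-token limit the history is returned as-is.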

Reinforces existing: subagent-architecture, context-engineering, agentic-engineering, harness-engineering, contextual-prompt-engineering, skill-distillation, eval-lifecycle-pre-to-production.

Entities

jared-zoneraich — speaker, founder/CEO
promptlayer — company, AI-engineering workbench

Notable claims

"Less scaffolding, more model." Scaffolding built to paper over current-model flaws becomes obsolete within 3–6 months; invest in the outer loop and rigorous tools instead.
Master loop (nO) is ~4 lines. The architectural simplification is the story; model improvements then compound 1:1.
Grep > RAG for general-purpose agents. Vector DBs were a workaround for weak long-context / weak tool use.
To-dos are not deterministically enforced — purely system-prompt-level structure, only possible because 2025 models follow instructions.
Handoff > compact (Amp) is probably the winning context-continuation pattern.
"AI therapist problem" — no global maximum in agent design; Claude Code, Codex, Composer, Amp, Droid each own different use cases. Taste and domain experts (promptlayer's thesis) are the moat.
Future: most LLM calls may be replaced by claude-code SDK invocations — the agent loop as the new completions API.
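The "~4 lines" claim about the master loop can be made concrete with a sketch of a while-tool-call loop. This is an illustration of the pattern as the talk describes it, not Claude Code's actual source: `ModelReply`, `ToolCall`, `master_loop` and the scripted model are all hypothetical names, and messages are plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ToolCall:
    name: str
    argument: str


@dataclass
class ModelReply:
    text: str
    tool_call: Optional[ToolCall] = None  # None means the model is finished


def master_loop(
    model: Callable[[List[str]], ModelReply],
    tools: Dict[str, Callable[[str], str]],
    messages: List[str],
) -> str:
    """Keep calling the model until it stops requesting tools."""
    reply = model(messages)
    while reply.tool_call is not None:
        result = tools[reply.tool_call.name](reply.tool_call.argument)
        messages.append(f"tool:{reply.tool_call.name} -> {result}")
        reply = model(messages)
    return reply.text


def scripted_model(messages: List[str]) -> ModelReply:
    # Hypothetical stand-in for an LLM: request one tool call, then answer.
    if not any(m.startswith("tool:") for m in messages):
        return ModelReply(text="", tool_call=ToolCall("upper", "hello"))
    return ModelReply(text=f"done after seeing: {messages[-1]}")


answer = master_loop(scripted_model, {"upper": str.upper}, ["user: shout hello"])
```

The loop holds no planning logic of its own: every decision about what to do next sits in the model's reply, which is the sense in which less scaffolding lets model improvements show up directly.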

Cross-ingest links

mitchell-hashimoto-zed-agentic-2025 — Hashimoto's live-session usage of claude-code (slash commands, jujutsu snapshots, Zig weakness) is the practitioner complement to Zoneraich's architectural read. Read together: Zoneraich explains why the loop tolerates Hashimoto's freewheeling exploration.
peter-steinberger-state-of-the-claw-2026 — Steinberger's "State of the Claw" leans on Claude Code internals (dreaming/memory, sub-agents, sandboxing). Zoneraich's H2A/compaction and skills analysis is the architectural lower layer that Steinberger's ecosystem-level commentary sits on top of.
Adjacent: florian-juengermann-listen-agents-2026 (Claude Code SDK inside E2B for PowerPoint subagents), ryan-lopopolo-harness-engineering-2026 (same "rigorous tools, flexible loop" thesis from a Frontier/OpenAI angle).

Open questions

Is skill auto-selection a post-training problem or an inherent limit of prompt-level dispatch?
Does the "one mega tool call" vs "hundreds of tool calls" debate resolve via model capability, or via tool-ecosystem standards?
How do you eval a maximally flexible master loop? agent-smell is a start — what's the right aggregation?
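As a starting point for the aggregation question, here is a minimal sketch that rolls the three agent-smell signals named earlier (tool-call count, retries, latency) into per-batch medians and flags outlier runs. The `RunTrace` shape, the choice of medians, and the 3x outlier factor are assumptions made for illustration; the talk does not prescribe an aggregation.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Union


@dataclass
class RunTrace:
    tool_calls: int
    retries: int
    latency_s: float


def agent_smell(
    runs: List[RunTrace], outlier_factor: float = 3.0
) -> Dict[str, Union[float, List[int]]]:
    """Summarise surface-level metrics for a batch of agent runs.

    Assumption: medians are used because a few runaway runs would skew a mean.
    """
    if not runs:
        raise ValueError("need at least one run")
    median_calls = median([r.tool_calls for r in runs])
    return {
        "median_tool_calls": median_calls,
        "median_retries": median([r.retries for r in runs]),
        "median_latency_s": median([r.latency_s for r in runs]),
        # Indices of runs that used far more tool calls than is typical.
        "outlier_runs": [
            i for i, r in enumerate(runs)
            if r.tool_calls > outlier_factor * median_calls
        ],
    }


runs = [RunTrace(4, 0, 10.0), RunTrace(6, 1, 14.0), RunTrace(30, 5, 90.0)]
report = agent_smell(runs)
```

Here the third run (30 tool calls against a median of 6) is flagged. Whether tool-call count is the right signal to flag on, and how to weigh it against retries and latency, remains open.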