Context Compaction

The mechanism claude-code uses to keep the model's context below the "stupid threshold." Internally (per jared-zoneraich's read) it is referred to as H2A: an async I/O buffer that decouples terminal/reasoning output from what actually goes back into the model.

Behaviour

  • Triggers at roughly 92% context fill.
  • Strategy: summarise head and tail, drop the middle (classic "lost-in-the-middle"-aware compression), then continue the loop.
  • Long-term memory is offloaded to the file system via bash — the agent writes markdown notes and re-reads them, rather than keeping everything in-context. Zoneraich predicts every chat UI will ship a sandbox for exactly this reason.
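The strategy above can be sketched in a few lines. This is a hedged toy, not claude-code's internals: `Message`, `summarise`, the 4-chars-per-token heuristic, and the window sizes are all made-up stand-ins; only the shape (threshold trigger, summarise head and tail, drop the middle) comes from the note.

```python
from dataclasses import dataclass

CONTEXT_LIMIT = 8_000     # toy context window, in tokens (not the real limit)
COMPACT_AT = 0.92         # trigger at roughly 92% fill
HEAD, TAIL = 10, 20       # how many messages count as head / tail (arbitrary)

@dataclass
class Message:
    role: str
    text: str

def tokens(msgs):
    # crude ~4-chars-per-token heuristic in place of a real tokenizer
    return sum(len(m.text) // 4 for m in msgs)

def summarise(msgs, label):
    # placeholder: in a real agent this would be an LLM summarisation call
    return Message("system", f"[{label} summary of {len(msgs)} messages]")

def maybe_compact(history):
    """Summarise head and tail, drop the middle, then continue the loop."""
    if tokens(history) < COMPACT_AT * CONTEXT_LIMIT:
        return history                      # still under the threshold
    head, tail = history[:HEAD], history[-TAIL:]
    # everything between head and tail is dropped outright,
    # on the "lost-in-the-middle" logic that it contributes least
    return [summarise(head, "head"), summarise(tail, "tail")]
```

The point of the shape: the check runs every loop iteration, so the model never sees a history much past the threshold.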

Alternative designs mentioned

  • Amp's handoff — instead of compacting in place, spawn a fresh thread and hand over a hand-written briefing. "Faster than reloading, like switching weapons in Call of Duty." Zoneraich leans toward handoff-style as the winning pattern.
  • Cursor Composer — distilled fast model absorbs some of the compaction burden.
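For contrast, a minimal sketch of the handoff pattern: rather than compacting the current thread in place, spawn a fresh one seeded only with a briefing. The function names are illustrative, not Amp's actual API.

```python
def write_briefing(old_thread):
    # in practice the briefing is hand-written (or LLM-drafted):
    # the goal, decisions made so far, and open tasks
    return f"Briefing distilled from {len(old_thread)} messages."

def handoff(old_thread):
    briefing = write_briefing(old_thread)
    # the new thread starts near-empty: "faster than reloading"
    return [("system", briefing)]
```

The design difference from in-place compaction: the old thread is never mutated, and the new one carries only what the briefing chose to keep.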

Why it matters

Context is "the boogeyman" of agent design — the claude-code-master-loop only works because something is actively pruning. Compaction is the reason simple loops beat DAGs: DAGs pay their context cost per node; a well-compacted loop amortises it.
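The amortisation claim can be made concrete with toy arithmetic. The numbers below are illustrative, not measured: a DAG re-pays full context at every node, while a compacted loop pays it once and then a small residue per step.

```python
def dag_cost(nodes, ctx):
    # every node re-reads the full context
    return nodes * ctx

def loop_cost(steps, ctx, compacted):
    # first step pays full context; later steps see the pruned residue
    return ctx + (steps - 1) * compacted
```

With, say, 10 steps, a 100k-token context, and an 8k compacted residue, the loop pays 172k tokens where the DAG pays 1M.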