Agentic Post-Training Failure Modes¶
faye-zhang (Pinterest) listed four recurring ways that sub-agent post-training pipelines break in production:
- Spec drift — "throughout the iteration of training eval redo, the model main framework often forget what the success metrics looks like." The success rubric itself rots under iteration.
- Data distribution skew — "arguably this is the biggest bottleneck… you have very uneven data representation bias." Shows up as warped loss curves; a skew check is sketched after this list.
- Memory collapse — "Once you started to train your models after 20, 30 to 100 epochs, the model essentially forgot how to get the right memory config from previous time, and they started to hallucinate." Long runs overwrite the memory-retrieval behaviour learned earlier.
- Tool misuse — "still something to watch out for," though less acute in 2026 thanks to better tool-use conventions.
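Since Zhang calls the skew the biggest bottleneck, here is a minimal sketch of how a pipeline might watch for it, assuming each training example carries a `source` tag and the intended mix is declared up front (both are assumptions of this note, not details Zhang gave):

```python
from collections import Counter

def check_distribution_skew(examples, expected, tolerance=0.05):
    """Flag data sources whose share of the training mix drifts
    beyond `tolerance` from the intended proportion.

    `examples` is a list of dicts with a "source" tag; `expected`
    maps source name -> intended fraction of the mix.
    """
    counts = Counter(ex["source"] for ex in examples)
    total = sum(counts.values())
    skewed = {}
    for source, target in expected.items():
        actual = counts.get(source, 0) / total
        if abs(actual - target) > tolerance:
            skewed[source] = (target, round(actual, 3))
    return skewed  # empty dict means the mix is within tolerance

# Example: a mix meant to be 50/30/20 that has drifted.
batch = (
    [{"source": "search"}] * 70
    + [{"source": "ads"}] * 20
    + [{"source": "feed"}] * 10
)
print(check_distribution_skew(batch, {"search": 0.5, "ads": 0.3, "feed": 0.2}))
# {'search': (0.5, 0.7), 'ads': (0.3, 0.2), 'feed': (0.2, 0.1)}
```

A warped loss curve plus a non-empty result here is exactly the representation bias Zhang describes.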
Why this matters¶
Zhang's list is the post-training analogue to Konstanty's eval-drift taxonomy (eval-lifecycle-pre-to-production). Both make the same point: parallelising the work does not eliminate the failure surface; it shifts it from "engineer took too long" to "pipeline drifted silently." The remediation is orchestration discipline (Anthropic Agent SDK hooks, sketched below) plus the same error-analysis-as-detective-work stance Konstanty advocates on the eval side.
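As one concrete instance of that discipline, here is a hypothetical pre-iteration hook that pins a content hash of the success rubric at run start and blocks any iteration where the rubric has silently changed — spec drift, in Zhang's terms. The file path, rubric schema, and `decision` dict shape are illustrative assumptions, not the Anthropic Agent SDK's actual hook API:

```python
import hashlib
import json
from pathlib import Path

# Toy rubric so the sketch runs end to end; a real pipeline would point
# at the checked-in eval spec. Path and schema are assumptions.
RUBRIC_PATH = Path("success_rubric.json")
RUBRIC_PATH.write_text(json.dumps({"metric": "task_success_rate", "threshold": 0.9}))

def rubric_fingerprint() -> str:
    """Content hash of the success rubric file."""
    return hashlib.sha256(RUBRIC_PATH.read_bytes()).hexdigest()

def make_spec_drift_hook(pinned: str):
    """Build a hook that blocks the run if the rubric no longer matches
    the fingerprint pinned when training began."""
    def hook(event: dict) -> dict:
        if rubric_fingerprint() != pinned:
            return {"decision": "block",
                    "reason": "success rubric changed mid-run (spec drift)"}
        return {"decision": "allow"}
    return hook

# Pin once at run start, then call before every train/eval iteration.
spec_guard = make_spec_drift_hook(rubric_fingerprint())
print(spec_guard({"event": "pre_iteration"}))  # {'decision': 'allow'}

RUBRIC_PATH.write_text(json.dumps({"metric": "clicks"}))  # simulate drift
print(spec_guard({"event": "pre_iteration"}))  # {'decision': 'block', ...}
```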
Fixes cited¶
- Anthropic Agent SDK for explicit orchestration + hooks
- Dynamic sub-agent scaling to avoid the hot-celebrity orchestration problem
- Treat memory as a learned component (PPO/GRPO per cited papers); see the group-relative advantage sketch below
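On that last point, GRPO scores each sampled completion against the mean of its own group rather than against a learned critic, which is what makes it cheap to bolt onto a memory component. A minimal sketch of the group-relative advantage (the reward values are invented for illustration):

```python
import statistics

def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-relative advantages as used by GRPO: normalise each reward
    by its group's mean and standard deviation, so no value network is
    needed.

        A_i = (r_i - mean(r)) / (std(r) + eps)
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four completions of one prompt, scored by a memory-recall
# reward; completions that recalled the right config get pushed up.
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # ≈ [1.0, -1.0, 1.0, -1.0]
```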