Agents as Model Selectors

Ash Lewis (Fastino) flips the usual frame: instead of an engineer picking a model once, an agent continuously tests, swaps, and tunes models against production traffic.

"It's done experiments with Llama, GLiNER, DeepSeek, and it's giving you accuracy reports for each one of those. And more importantly, what's going to happen is as we're inferencing, we can use all that great inference data, cuz we know what your users are actually using the model for, to improve the model."

Structural claim

Model selection is not a one-time procurement event; it is a closed-loop control problem. The agent observes a shift in the traffic distribution, re-evaluates the candidate models, swaps routing to the best one, fine-tunes on the inference logs, and then repeats.
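
The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pioneer's implementation: `SelectorAgent`, `observe`, `step`, the `margin` threshold, and the idea that logged traffic arrives already labelled are all hypothetical, and the fine-tuning leg is left as a comment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a "model" is any callable from input text to a label.
Model = Callable[[str], str]
Example = Tuple[str, str]  # (input text, expected label)


def accuracy(model: Model, examples: List[Example]) -> float:
    """Fraction of examples the model labels correctly (0.0 if there are none)."""
    if not examples:
        return 0.0
    return sum(model(x) == y for x, y in examples) / len(examples)


@dataclass
class SelectorAgent:
    candidates: Dict[str, Model]
    active: str              # name of the model currently serving traffic
    margin: float = 0.02     # assumed: minimum accuracy gain required to swap
    log: List[Example] = field(default_factory=list)

    def observe(self, text: str, label: str) -> None:
        """Record one labelled production example (how labels arise is out of scope)."""
        self.log.append((text, label))

    def step(self, window: int = 100) -> Dict[str, float]:
        """One pass of the loop: re-evaluate on recent traffic, swap if warranted."""
        recent = self.log[-window:]
        report = {name: accuracy(m, recent) for name, m in self.candidates.items()}
        best = max(report, key=report.get)
        if report[best] - report[self.active] > self.margin:
            self.active = best  # swap routing to the better candidate
        # A fine-tuning step on `recent` would go here; omitted in this sketch.
        return report
```

Calling `step()` on a schedule, or whenever a drift detector fires, closes the loop. The `margin` guard is a design choice to stop routing from flapping between models whose accuracy differs only by noise.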

Why this matters

This is the productisation of Konstanty's eval discipline. If error analysis is detective work performed by a PM, Lewis's Pioneer automates the re-evaluation leg of that loop, so the detective's findings actually result in a swapped or retuned model rather than a Jira ticket.