---
title: "Trigger: Long-Running Agent vs Early Results"
created: 2026-05-02
updated: 2026-05-02
type: concept
tags: [domain/ai, agents, ux]
sources: [raw/transcripts/florian-juengermann-listen-agents-2026.md]
confidence: medium
---
# Trigger: Long-Running Agent vs Early Results
The design decision listen-labs makes at the first turn of every research conversation: is this a 30-second live answer or a 30-minute async deep-dive? The answer changes the entire orchestration. Same agent architecture, different parameters: context budget, tool depth, and whether the feedback-subagent blocks on review.
## Two modes
| Mode | Latency | User affordance | Feedback subagent role |
|---|---|---|---|
| Live chat | seconds | Streams partial results; user can redirect | Runs as async eval after the fact |
| Long async | ~30 min | User leaves, comes back; full report delivered | Runs inline, blocks final output until review passes |
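
The transcript doesn't show Listen's configuration, but "same agent architecture, different parameters" suggests per-mode parameter profiles over one agent. A minimal sketch under that assumption; every name and value below is hypothetical, not from the source:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunParams:
    """Per-mode knobs for the shared agent architecture (hypothetical)."""
    context_budget_tokens: int    # how much context the main agent may load
    max_tool_depth: int           # how deep tool/subagent call chains may go
    feedback_blocks_output: bool  # inline review gate vs. after-the-fact eval
    stream_partials: bool         # live mode streams; async delivers a report

# Illustrative values only; the source gives no real numbers.
LIVE_CHAT = RunParams(
    context_budget_tokens=16_000,
    max_tool_depth=1,             # single map-reduce-classification call
    feedback_blocks_output=False, # feedback-subagent runs as async eval
    stream_partials=True,
)

LONG_ASYNC = RunParams(
    context_budget_tokens=200_000,
    max_tool_depth=5,             # room for e.g. the powerpoint-subagent
    feedback_blocks_output=True,  # final report waits until review passes
    stream_partials=False,
)
```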
## How Listen decides
Not a user toggle: the main agent classifies the incoming question (see the routing sketch after this list):

- "How many respondents mentioned X?" → live, single map-reduce-classification call
- "Produce a full competitive landscape report with slide deck" → long async, spawns the powerpoint-subagent
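
A sketch of that first-turn triage, reusing the `RunParams` profiles from the sketch above. The `Mode` enum, the classifier prompt, and the `agent.run` / `agent.run_async` calls are all assumptions for illustration, not Listen's actual API:

```python
from enum import Enum

class Mode(Enum):
    LIVE = "live"
    LONG_ASYNC = "long_async"

def classify_question(question: str, llm) -> Mode:
    """First-turn triage: the main agent picks the mode, not the user.

    `llm` is any text-completion callable; the prompt is a plausible
    stand-in, not Listen's actual classifier prompt.
    """
    verdict = llm(
        "Does answering this need a full multi-artifact report (reply REPORT) "
        "or a quick aggregate over respondent data (reply QUICK)?\n"
        f"Question: {question}"
    )
    return Mode.LONG_ASYNC if "REPORT" in verdict.upper() else Mode.LIVE

def dispatch(question: str, llm, agent):
    # LIVE_CHAT / LONG_ASYNC are the hypothetical profiles defined above.
    mode = classify_question(question, llm)
    if mode is Mode.LIVE:
        # One map-reduce-classification pass, partial results streamed.
        return agent.run(question, params=LIVE_CHAT)
    # Heavier pipeline (e.g. the powerpoint-subagent); returns a job handle.
    return agent.run_async(question, params=LONG_ASYNC)
```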
Florian (source):
"We do run like one analysis run up front and that can take like 30 minutes… we use the same agent architecture for that as for the live interaction chat, there's some different parameters."
## Related
- sync-plan-async-execute — Zakariasson's variant; the same instinct, split differently
- feedback-subagent — behaves differently per mode
- silent-failure-dropoff — risk specific to live mode
My inference: the bimodal split is likely a transitional design — as latency improves and evals tighten, the "long async" path shortens and the boundary drifts. Not what Florian said explicitly.